PX-HER0 Board

The expert field guide to embedded ARM systems

Mar 18, 2020

Project update 4 of 16

The Joy of Tuning a Race Car

by Pieter C

Have you seen the movie Ford v Ferrari? For a balanced view, also see the documentary Shelby American. Interestingly, the replica cars used in the movie were built in the same South African city where I grew up (see HERE), but now I’m digressing WAY too much…

If you want to make a car really fast, you’ve got to shed dead weight. The same goes for firmware. I recently tried to figure out why the STM32 took so long to read the interrupt flags of an RF transceiver on the SPI bus. I was using the vendor’s library to communicate with their RF transceiver.

Here are the two relevant functions:

#define RF_TX_FIFO_SIZE   128
#define RF_BUF_SIZE       RF_TX_FIFO_SIZE

void RF_GPIO_IrqGetStatus(RFIrqs* pxIrqStatus)
{
  uint8_t tmp[4];
  uint8_t* pIrqPointer = (uint8_t*)pxIrqStatus;

  PX_DBG_PIN_HI(); // <- Start timing

  /* all the 4 bytes of irq status register is being read */
  g_xStatus = RF_ReadRegister(IRQ_STATUS3_ADDR, 4, tmp);

  /* Build the IRQ Status word */
  for(uint8_t i=0; i<4; i++) {
    *pIrqPointer = tmp[3-i];
    pIrqPointer++;
  }
}

StatusBytes RF_ReadRegister(uint8_t cRegAddress, uint8_t cNbBytes, uint8_t* pcBuffer )
{
    uint8_t tx_buff[(2 * RF_BUF_SIZE) - 1]={READ_HEADER,cRegAddress}; // <- Culprit
    uint8_t rx_buff[RF_CMD_SIZE];
    StatusBytes status;

    PX_DBG_PIN_LO(); // <- End timing

    RF_ENTER_CRITICAL();
    RF_RADIO_SPI_NSS_PIN_LOW();
    IO_func.WriteBuffer( tx_buff, rx_buff, 2 );
    IO_func.WriteBuffer( tx_buff, pcBuffer, cNbBytes );
    RF_RADIO_SPI_NSS_PIN_HIGH();
    RF_EXIT_CRITICAL();

    ((uint8_t*)&status)[1]=rx_buff[0];
    ((uint8_t*)&status)[0]=rx_buff[1]; 

    return status;
}

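I used a spare GPIO pin to measure the time it takes to get into the RF_ReadRegister() function: the pin is set high at the start of RF_GPIO_IrqGetStatus() and set low on entry to RF_ReadRegister(). Here is the measured timing: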

I then tried to figure out what caused the long 80.82 µs delay. I had to switch to disassembly and single-step:

It turns out that the function allocates a whopping 255 bytes on the stack for the tx_buff[] array, initializes the first 2 bytes, and clears the remaining 253 bytes using the memset() function, and it does this on every single call. That explains it! See THIS StackOverflow question.
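The zero fill is standard C behaviour: when an array has fewer initializers than elements, the remaining elements are set to zero. The sketch below shows this on a host compiler and then one possible way to avoid paying for it on every call. It is not necessarily the fix that was actually applied. The value of READ_HEADER and the names TX_BUF_LEN, count_implicit_zeros() and prepare_read_header() are assumptions made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define READ_HEADER 0x01u  /* assumed value, for illustration only */
#define TX_BUF_LEN  255u   /* (2 * 128) - 1, same size as in the vendor code */

/* Same declaration shape as the vendor code: two explicit initializers.
 * C zero-initializes the remaining 253 elements, and because the array is
 * a local variable this work is repeated on every call. */
size_t count_implicit_zeros(uint8_t reg_addr)
{
    uint8_t tx_buff[TX_BUF_LEN] = { READ_HEADER, reg_addr };
    size_t zeros = 0;

    assert(tx_buff[0] == READ_HEADER && tx_buff[1] == reg_addr);
    for (size_t i = 2; i < TX_BUF_LEN; i++) {
        if (tx_buff[i] == 0) {
            zeros++;
        }
    }
    return zeros; /* 253 */
}

/* Hypothetical leaner variant: a static buffer is zeroed once at startup,
 * so each call only writes the two header bytes. The rest of the buffer
 * can still serve as filler bytes for the second SPI transfer.
 * Caveat: a static buffer is shared, so this is not reentrant. */
const uint8_t *prepare_read_header(uint8_t reg_addr)
{
    static uint8_t tx_buff[TX_BUF_LEN];

    tx_buff[0] = READ_HEADER;
    tx_buff[1] = reg_addr;
    return tx_buff;
}
```

In the vendor function, the pointer returned by prepare_read_header() would take the place of the local tx_buff[]. Since the static buffer is shared between calls, the two header writes would have to move inside the existing critical section if the function can be called from more than one context.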

Here is the timing after I fixed it:

These are the embedded pro skills that I’m trying to cultivate, and the reason why the tutorials start at the deep end of the pool with 01 Flashing an LED in assembler.

KEEP CALM and DON’T PANIC :)

