I’m working on a project that involves an STM32 MCU (on the STM32303C-EVAL board, to be exact) that has to respond to an external interrupt. I want the reaction to the external interrupt to be as fast as possible. I have modified a standard peripheral library example from the ST web page, and the current program simply toggles an LED at each successive rising edge on PE6:
#include "stm32f30x.h"
#include "stm32303c_eval.h"
EXTI_InitTypeDef EXTI_InitStructure;
GPIO_InitTypeDef GPIO_InitStructure;
NVIC_InitTypeDef NVIC_InitStructure;
static void EXTI9_5_Config(void);
int main(void)
{
/* Initialize LEDs mounted on STM32303C-EVAL board */
STM_EVAL_LEDInit(LED1);
/* Configure PE6 in interrupt mode */
EXTI9_5_Config();
/* Infinite loop */
while (1)
{
}
}
// Configure PE6 in interrupt mode
static void EXTI9_5_Config(void)
{
/* Enable clocks */
RCC_AHBPeriphClockCmd(RCC_AHBPeriph_GPIOD | RCC_AHBPeriph_GPIOE, ENABLE);
RCC_APB2PeriphClockCmd(RCC_APB2Periph_SYSCFG, ENABLE);
/* Configure input */
GPIO_InitStructure.GPIO_Mode = GPIO_Mode_IN;
GPIO_InitStructure.GPIO_PuPd = GPIO_PuPd_DOWN;
GPIO_InitStructure.GPIO_Pin = GPIO_Pin_6;
GPIO_Init(GPIOE, &GPIO_InitStructure);
/* Connect EXTI6 Line to PE6 pin */
SYSCFG_EXTILineConfig(EXTI_PortSourceGPIOE, EXTI_PinSource6);
/* Configure Button EXTI line */
EXTI_InitStructure.EXTI_Line = EXTI_Line6;
EXTI_InitStructure.EXTI_Mode = EXTI_Mode_Interrupt;
EXTI_InitStructure.EXTI_Trigger = EXTI_Trigger_Rising;
EXTI_InitStructure.EXTI_LineCmd = ENABLE;
EXTI_Init(&EXTI_InitStructure);
/* Enable and set interrupt to the highest priority */
NVIC_InitStructure.NVIC_IRQChannel = EXTI9_5_IRQn;
NVIC_InitStructure.NVIC_IRQChannelPreemptionPriority = 0x00;
NVIC_InitStructure.NVIC_IRQChannelSubPriority = 0x00;
NVIC_InitStructure.NVIC_IRQChannelCmd = ENABLE;
NVIC_Init(&NVIC_InitStructure);
}
The interrupt handler looks like this:
void EXTI9_5_IRQHandler(void)
{
if((EXTI_GetITStatus(EXTI_Line6) != RESET))
{
/* Toggle LD1 */
STM_EVAL_LEDToggle(LED1);
/* Clear the EXTI line 6 pending bit */
EXTI_ClearITPendingBit(EXTI_Line6);
}
}
In this particular case, the interrupts are generated by an external programmable function generator running at 100 Hz. After examining the MCU response on an oscilloscope, I was rather surprised that it takes nearly 1.32 us for the MCU to begin processing the interrupt.
With the MCU running at 72 MHz (I’ve checked the SYSCLK output on the MCO pin beforehand) this amounts to nearly 89 clock cycles. Shouldn’t the MCU response to the interrupt be much faster?
P.S. The code was compiled with IAR Embedded Workbench and optimized for highest speed.
4 Answers
Problem
Well, you have to look at the functions you are using; you can't make assumptions about the speed of code you haven't looked at.
This is the EXTI_GetITStatus function:
ITStatus EXTI_GetITStatus ( uint32_t EXTI_Line )
{
ITStatus bitstatus = RESET;
uint32_t enablestatus = 0;
/* Check the parameters */
assert_param(IS_GET_EXTI_LINE(EXTI_Line));
enablestatus = *(__IO uint32_t *) (((uint32_t) &(EXTI->IMR)) + ((EXTI_Line) >> 5 ) * 0x20) & (uint32_t)(1 << (EXTI_Line & 0x1F));
if ( (((*(__IO uint32_t *) (((uint32_t) &(EXTI->PR)) + (((EXTI_Line) >> 5 ) * 0x20) )) & (uint32_t)(1 << (EXTI_Line & 0x1F))) != (uint32_t)RESET) && (enablestatus != (uint32_t)RESET))
{
bitstatus = SET;
}
else
{
bitstatus = RESET;
}
return bitstatus;
}
As you can see, this is not a simple thing requiring just a cycle or two.
Next is your LED toggle function:
void STM_EVAL_LEDToggle ( Led_TypeDef Led )
{
GPIO_PORT[Led]->ODR ^= GPIO_PIN[Led];
}
So here you have some array indexing and a read-modify-write to toggle the LED.
HALs often create a good amount of overhead because they must guard against wrong settings and wrong usage of their functions. The required parameter checking, and the translation from a simple parameter to a bit in a register, can take a serious amount of computing (for a time-critical interrupt, at least).
So in your case, you should implement your interrupt bare metal, directly on the registers, and not rely on any HAL.
Example solution
For example something like:
if (EXTI->PR & EXTI_PR_PR6)
{
GPIOE->BSRR = GPIO_BSRR_BS_8;
EXTI->PR = EXTI_PR_PR6;
}
Note: this will not toggle the LED but simply set it. There is no atomic toggle available on the STM GPIOs. I also don't like the if construct I used, but it generates faster assembly than my preferred if (EXTI_PR_PR6 == (EXTI->PR & EXTI_PR_PR6)).
A toggle variant could be something along these lines:
static bool LEDstate = false;
if (EXTI->PR & EXTI_PR_PR6)
{
if (!LEDstate)
{
GPIOE->BSRR = GPIO_BSRR_BS_8;
LEDstate = true;
}
else
{
GPIOE->BSRR = GPIO_BSRR_BR_8;
LEDstate = false;
}
EXTI->PR = EXTI_PR_PR6;
}
Using a variable residing in RAM instead of using the ODR register should be faster, especially when you run at 72 MHz, because access to the peripherals can be slower due to synchronization between different clock domains and peripheral clocks simply running at a lower frequency. Of course, you must not change the state of the LED outside of the interrupt for the toggle to work correctly. Otherwise the variable must be global (then you have to use the volatile keyword when declaring it) and you have to update it everywhere accordingly.
Also note that I'm using C++, hence the bool and not some uint8_t type or similar to implement a flag. Although if speed is your primary concern, you should probably opt for a uint32_t for the flag, as this will always be aligned correctly and not generate additional code when accessed.
The simplification is possible because you hopefully know what you are doing and always keep it that way. If you really just have a single interrupt enabled for the EXTI9_5 handler you can get rid of the pending register check altogether, reducing the number of cycles even further.
This leads to another potential optimization: use an EXTI line which has its own interrupt vector, like one of EXTI0 to EXTI4. There you don't have to check whether the correct line has triggered your interrupt.
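The RAM-flag toggle described above can be sketched in plain C with a word-sized flag, as suggested for speed. This is only a minimal sketch under stated assumptions: `LED_PIN`, `led_is_on` and `led_toggle` are hypothetical names, the pin number is illustrative, and the register address is passed in as a pointer so the logic can also be exercised against an ordinary variable. It relies on the usual STM32 BSRR layout (low half sets pins, high half resets them).

```c
#include <stdint.h>

/* Hypothetical pin number, used for illustration only. */
#define LED_PIN 8u

/* BSRR layout on STM32 GPIO: bits 0..15 set a pin, bits 16..31 reset it. */
#define LED_SET_MASK   (1u << LED_PIN)
#define LED_RESET_MASK (1u << (LED_PIN + 16u))

/* Word-sized flag kept in RAM; only led_toggle() may modify it. */
static uint32_t led_is_on = 0u;

/*
 * Toggle the LED with a single write to the given BSRR register.
 * On target this would presumably be called as led_toggle(&GPIOE->BSRR);
 * taking the address as a parameter is an assumption made for testability.
 */
static inline void led_toggle(volatile uint32_t *bsrr)
{
    if (led_is_on) {
        *bsrr = LED_RESET_MASK;   /* write-only access, no read-modify-write */
        led_is_on = 0u;
    } else {
        *bsrr = LED_SET_MASK;
        led_is_on = 1u;
    }
}
```

As noted above, this only stays correct if nothing else changes the LED state behind the flag's back.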
It's hard to tell from C code how many instructions it would take. I have seen bigger functions optimized to a couple of instructions which didn't even involve an actual call. – Dmitry Grigoryev, Apr 26, 2017 at 8:21
@DmitryGrigoryev as registers are declared as volatile, the compiler is not allowed to optimize much in the functions above, and if the functions are not implemented inline in the header, the call usually doesn't get optimized away either. – Arsenal, Apr 26, 2017 at 8:26
Following PeterJ's suggestion I've omitted the usage of SPL. The entirety of my code looks like this:
#include "stm32f30x.h"
void EXTI0_IRQHandler(void)
{
// I am simply toggling the pin within the interrupt, as I only want to check the response speed.
GPIOE->BSRR |= GPIO_BSRR_BS_10;
GPIOE->BRR |= GPIO_BRR_BR_10;
EXTI->PR |= EXTI_PR_PR0;
}
int main()
{
// Initialize the HSI:
RCC->CR |= RCC_CR_HSION;
while(!(RCC->CR&RCC_CR_HSIRDY));
// PLL configuration:
RCC->CFGR &= ~RCC_CFGR_PLLSRC; // HSI / 2 selected as the PLL input clock.
RCC->CFGR |= RCC_CFGR_PLLMULL16; // HSI / 2 * 16 = 64 MHz
RCC->CR |= RCC_CR_PLLON; // Enable PLL
while(!(RCC->CR&RCC_CR_PLLRDY)); // Wait until PLL is ready
// Flash configuration:
FLASH->ACR |= FLASH_ACR_PRFTBE;
FLASH->ACR |= FLASH_ACR_LATENCY_1;
// Main clock output (MCO):
RCC->AHBENR |= RCC_AHBENR_GPIOAEN;
GPIOA->MODER |= GPIO_MODER_MODER8_1;
GPIOA->OTYPER &= ~GPIO_OTYPER_OT_8;
GPIOA->PUPDR &= ~GPIO_PUPDR_PUPDR8;
GPIOA->OSPEEDR |= GPIO_OSPEEDER_OSPEEDR8;
GPIOA->AFR[0] &= ~GPIO_AFRL_AFRL0;
// Output on the MCO pin:
RCC->CFGR |= RCC_CFGR_MCO_SYSCLK;
// PLL as the system clock
RCC->CFGR &= ~RCC_CFGR_SW; // Clear the SW bits
RCC->CFGR |= RCC_CFGR_SW_PLL; //Select PLL as the system clock
while ((RCC->CFGR & RCC_CFGR_SWS_PLL) != RCC_CFGR_SWS_PLL); //Wait until PLL is used
// LED output:
RCC->AHBENR |= RCC_AHBENR_GPIOEEN;
GPIOE->MODER |= GPIO_MODER_MODER10_0;
GPIOE->OTYPER &= ~GPIO_OTYPER_OT_10;
GPIOE->PUPDR &= ~GPIO_PUPDR_PUPDR10;
GPIOE->OSPEEDR |= GPIO_OSPEEDER_OSPEEDR10;
// Interrupt on PA0:
RCC->AHBENR |= RCC_AHBENR_GPIOAEN;
GPIOA->MODER &= ~(GPIO_MODER_MODER0);
GPIOA->OSPEEDR |= (GPIO_OSPEEDER_OSPEEDR0);
GPIOA->PUPDR &= ~(GPIO_PUPDR_PUPDR0);
SYSCFG->EXTICR[0] &= SYSCFG_EXTICR1_EXTI0_PA;
EXTI->RTSR = EXTI_RTSR_TR0;
EXTI->IMR = EXTI_IMR_MR0;
NVIC_SetPriority(EXTI0_IRQn, 1);
NVIC_EnableIRQ(EXTI0_IRQn);
while(1)
{
}
}
and the generated assembly looks like this:
EXTI0_IRQHandler:
LDR.N R0,??DataTable1 ;; 0x48001018
LDR R1,[R0, #+0]
ORR R1,R1,#0x400
STR R1,[R0, #+0]
LDRH R2,[R0, #+16]
ORR R2,R2,#0x400
STRH R2,[R0, #+16]
LDR.N R0,??DataTable1_1 ;; 0x40010414
LDR R1,[R0, #+0]
ORR R1,R1,#0x1
STR R1,[R0, #+0]
BX LR ;; return
This improves matters quite a bit, as I've managed to get a response in ~440 ns @ 64 MHz (i.e., 28 clock cycles).
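The conversion from a measured latency to clock cycles is just latency times core frequency. A minimal sketch of that arithmetic follows; `latency_cycles` is a hypothetical helper name, and it rounds to the nearest whole cycle.

```c
#include <stdint.h>

/*
 * Convert a measured latency in nanoseconds to core clock cycles,
 * rounded to the nearest whole cycle. 64-bit math avoids overflow
 * in the intermediate product.
 */
static uint32_t latency_cycles(uint32_t latency_ns, uint32_t core_hz)
{
    const uint64_t ns_per_s = 1000000000u;
    uint64_t scaled = (uint64_t)latency_ns * core_hz;
    return (uint32_t)((scaled + ns_per_s / 2u) / ns_per_s);
}
```

With the figures above, `latency_cycles(440, 64000000)` gives 28. For comparison, ARM documents an interrupt latency of 12 cycles for the Cortex-M4 core with zero-wait-state memory, so part of the measured figure is presumably spent outside the core's own entry sequence.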
Change your BRR |= and BSRR |= to just BRR = and BSRR =. Those registers are write-only; your code is reading them, ORRing the value and then writing. That could be optimized to a single STR instruction. – Colin, Apr 26, 2017 at 11:41
Move your EXTI handler & vectors to CCMRAM. – 0___________, Apr 26, 2017 at 17:08
There is an error in your code: the BSRR register is write-only. Do not use the |= operator, just a plain "=". It will set or reset the selected pins; zeroes are ignored.
This will save you a couple of clock cycles. Another hint: move your vector table and interrupt routines to CCMRAM. You will save a few more cycles (flash wait states etc.).
PS: I can't comment as I do not have enough reputation :)
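To illustrate why a plain "=" is enough, here is a small software model of the register write semantics. It is not hardware access, and the function names are hypothetical. It assumes the documented STM32 behaviour that BSRR ignores zero bits (with set taking priority over reset), and that EXTI pending bits are cleared by writing 1.

```c
#include <stdint.h>

/*
 * Software model of a GPIO BSRR write (no real hardware access):
 * bits 0..15 set pins, bits 16..31 reset pins, zero bits do nothing.
 * If both are requested for the same pin, the set wins.
 */
static uint16_t bsrr_write_model(uint16_t odr, uint32_t value)
{
    uint16_t set_bits   = (uint16_t)(value & 0xFFFFu);
    uint16_t reset_bits = (uint16_t)(value >> 16);
    return (uint16_t)((odr & (uint16_t)~reset_bits) | set_bits);
}

/*
 * Software model of an EXTI PR write: writing 1 clears a pending bit,
 * writing 0 leaves it unchanged.
 */
static uint32_t exti_pr_write_model(uint32_t pending, uint32_t value)
{
    return pending & ~value;
}
```

In this model, writing `1u << 10` to BSRR changes only pin 10 and leaves every other output untouched, so no prior read is needed. The second function shows a side effect of `EXTI->PR |= EXTI_PR_PR0`: the read-modify-write writes back every pending bit that was set, which would clear all pending lines rather than only line 0.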
The answer is extremely easy: the "great" HAL (or SPL) library. If you do something time-sensitive, use the bare peripheral registers instead. Then you will get the correct latency. I can't understand the point of using this ridiculous library to toggle a pin or to check a status register!
if{} statement is needed because the interrupt routine does not know what the source of the interrupt is.