I'm trying to write atomic code. In the example below, I need to perform the simple operation a ^= 1; atomically:
static volatile int a = 0;
//-- a ^= 1;
__asm__ __volatile__( "xori %0, %1, 1"  /* %0 is the output operand, %1 the input operand */
: "=r"(a)
: "r"(a)
);
Generated code is not atomic:
9D0014E8 8F828018 LW V0, -32744(GP)
9D0014EC 38420001 XORI V0, V0, 1
9D0014F0 AF828018 SW V0, -32744(GP)
According to the docs, the LL and SC instructions provide an atomic read-modify-write sequence. How can I make the compiler generate code with LL and SC instead of LW and SW? I tried to write such code myself:
static volatile int a = 0;
__asm__ __volatile__( "ll $t1, 0(%0)": : "r"(a) );
__asm__ __volatile__( "xori $t1, $t1, 1" );
__asm__ __volatile__( "sc $t1, 0(%0)": : "r"(a) );
But this is wrong; the generated code is not what I need:
140: __asm__ __volatile__( "ll $t1, 0(%0)": : "r"(a) );
9D001454 8F828018 LW V0, -32744(GP) # WRONG! | I need LL T1, -32744(GP) instead of
9D001458 C0490000 LL T1, 0(V0) # WRONG! | these two instructions (LW, LL)
141: __asm__ __volatile__( "xori $t1, $t1, 1" );
9D00145C 39290001 XORI T1, T1, 1
142: __asm__ __volatile__( "sc $t1, 0(%0)": : "r"(a) );
9D001460 8F828018 LW V0, -32744(GP) # WRONG! | I need SC T1, -32744(GP) instead of
9D001464 E0490000 SC T1, 0(V0) # WRONG! | these two instructions (LW, SC)
How can I do that?
-
Which chip is this? – User.1, Jul 6, 2013 at 23:13
-
It's a PIC32MX440F512H. – Dmitry Frank, Jul 7, 2013 at 6:57
1 Answer
Well, this is one of those happy moments when I just need to ask someone, and the solution immediately comes to mind:
__asm__ __volatile__( "ll $t1, 0(%0)": : "r"(&a) );
__asm__ __volatile__( "xori $t1, $t1, 1" );
__asm__ __volatile__( "sc $t1, 0(%0)": : "r"(&a) );
I.e. I need to use &a instead of a. Now the generated code is:
104: __asm__ __volatile__( "ll $t1, 0(%0)": : "r"(&a) );
9D001434 27828018 ADDIU V0, GP, -32744
9D001438 C0490000 LL T1, 0(V0)
105: __asm__ __volatile__( "xori $t1, $t1, 1" );
9D00143C 39290001 XORI T1, T1, 1
106: __asm__ __volatile__( "sc $t1, 0(%0)": : "r"(&a) );
9D001440 E0490000 SC T1, 0(V0)
This seems to be what I need. Note: to make it robust, a "beqz" instruction is needed to loop back and retry if SC failed, since SC writes 0 into its register on failure (there's an example in the MIPS32 instruction quick reference). But that is another story.
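The retry loop mentioned in the note could look roughly like the sketch below. It is only an illustration, not a tested PIC32 implementation: the helper name atomic_xor_int and the locals old and tmp are hypothetical, and it assumes the toolchain accepts GCC extended asm. Everything is placed in a single asm statement with register operands, so the compiler cannot insert code between LL and SC and no fixed register such as $t1 is used behind its back. The non-MIPS branch exists only so the sketch can be compiled on a host machine.

```c
static volatile int a = 0;

/* Hypothetical helper: atomically XOR *p with mask and return the old value. */
static inline int atomic_xor_int(volatile int *p, int mask)
{
#if defined(__mips__)
    int old, tmp;
    __asm__ __volatile__(
        ".set push\n\t"
        ".set noreorder\n\t"
        "1:\n\t"
        "ll   %0, 0(%2)\n\t"   /* old = *p, start of the linked sequence */
        "xor  %1, %0, %3\n\t"  /* tmp = old ^ mask */
        "sc   %1, 0(%2)\n\t"   /* try to store tmp; tmp becomes 1 on success, 0 on failure */
        "beqz %1, 1b\n\t"      /* SC failed (e.g. an interrupt intervened): retry */
        "nop\n\t"              /* branch delay slot */
        ".set pop"
        : "=&r"(old), "=&r"(tmp)   /* early-clobber: written before inputs are last read */
        : "r"(p), "r"(mask)
        : "memory");
    return old;
#else
    /* Assumption: host build for experimentation only; uses the GCC builtin. */
    return __sync_fetch_and_xor(p, mask);
#endif
}
```

With this helper, the toggle from the question would be written as atomic_xor_int(&a, 1);.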
Also, on the Microchip forum, user andersm suggested using GCC's atomic builtins instead of reinventing the wheel. (However, these builtins add two sync instructions that are useless on PIC32, so it might make sense to write my own macro.)
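For comparison, here is a minimal sketch of the builtin approach, assuming the compiler provides GCC's __sync builtins for the target. The wrapper name toggle_a is hypothetical; only __sync_fetch_and_xor is a real GCC builtin.

```c
static volatile int a = 0;

/* Hypothetical wrapper: atomically performs a ^= 1 and returns the previous value of a. */
static inline int toggle_a(void)
{
    return __sync_fetch_and_xor(&a, 1);
}
```

Comparing the disassembly of toggle_a() with a hand-written LL/SC loop shows what the builtin costs on a given toolchain.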
-
Without researching it too deeply, it looks like this instruction pairing is only potentially atomic, i.e. it either works atomically or else it fails to write back and sets a flag. You don't seem to be checking for or handling the possibility of failure the way the example code at your instruction reference link does. It is permissible to accept your own answer if you are fully satisfied with it. – Chris Stratton, Jul 5, 2013 at 15:18
-
@ChrisStratton, I can't understand which instruction pairing you are talking about, or how instruction pairing might make code atomic in general. Atomicity in the code above is achieved by the instructions LL and SC. As for accepting my answer, I'll surely accept it when the system permits me to (I can accept it only after two days). – Dmitry Frank, Jul 6, 2013 at 9:27
-
This answer admittedly solves the issue of incorrect assembly generation, but what Chris meant was that three separate assembly instructions cannot ensure a single atomic operation. This code can be interrupted at any point between these instructions, so if you don't check whether SC failed, you don't gain any benefit over your initial code in terms of atomicity. The other forum user was right that it's better not to reinvent the wheel and simply use __sync_fetch_and_xor instead (if your compiler/MCU combo permits it). – vgru, May 10, 2017 at 8:03