Skip to main content
Code Review

Return to Answer

replaced http://stackoverflow.com/ with https://stackoverflow.com/
Source Link

This is largely a duplicate of http://stackoverflow.com/questions/6935442/x86-spinlock-using-cmpxchg https://stackoverflow.com/questions/6935442/x86-spinlock-using-cmpxchg, http://stackoverflow.com/questions/11959374/fastest-inline-assembly-spinlock https://stackoverflow.com/questions/11959374/fastest-inline-assembly-spinlock, etc.

(1) Your code works only on 32-bit x86. If you're using 64-bit (x86-64), the calling convention is totally wrong (you need to save and restore %rbp rather than %ebp, and you're looking for the argument in the wrong place). If you're implementing your own OS, I suppose you know what you're doing... but in general I think it's weird to see 32-bit x86 code these days.

(2) Useless trivia of the day: The two instructions test ecx, ecx; jz .acquired could be replaced with the single instruction jecxz .acquired. I don't know whether this is an optimization or a pessimization on your particular hardware. It would be a size optimization, at least.

(2.5) Really, consider using cmpxchg instead of xchg; test. The cmpxchg mnemonic maps more directly onto the "Compare And Swap" idiom that you're using. It will also help you when it comes time to implement "is the lock held by this thread"; you can set the word to current-thread-id instead of just 1.

(3) See 'About PAUSE' below: Unfortunately there is no "About PAUSE" below. However, it seems like you're doing the right thing there. You could definitely tighten up the code; for example I don't know why you're wasting time with that test dword [eax], 1. You should just retry the xchg as soon as the pause is over.

(4) Consider removing the stack frame; it's not doing anything useful (unless it's necessary for your debugger of choice), just slowing things down.

This is largely a duplicate of http://stackoverflow.com/questions/6935442/x86-spinlock-using-cmpxchg, http://stackoverflow.com/questions/11959374/fastest-inline-assembly-spinlock, etc.

(1) Your code works only on 32-bit x86. If you're using 64-bit (x86-64), the calling convention is totally wrong (you need to save and restore %rbp rather than %ebp, and you're looking for the argument in the wrong place). If you're implementing your own OS, I suppose you know what you're doing... but in general I think it's weird to see 32-bit x86 code these days.

(2) Useless trivia of the day: The two instructions test ecx, ecx; jz .acquired could be replaced with the single instruction jecxz .acquired. I don't know whether this is an optimization or a pessimization on your particular hardware. It would be a size optimization, at least.

(2.5) Really, consider using cmpxchg instead of xchg; test. The cmpxchg mnemonic maps more directly onto the "Compare And Swap" idiom that you're using. It will also help you when it comes time to implement "is the lock held by this thread"; you can set the word to current-thread-id instead of just 1.

(3) See 'About PAUSE' below: Unfortunately there is no "About PAUSE" below. However, it seems like you're doing the right thing there. You could definitely tighten up the code; for example I don't know why you're wasting time with that test dword [eax], 1. You should just retry the xchg as soon as the pause is over.

(4) Consider removing the stack frame; it's not doing anything useful (unless it's necessary for your debugger of choice), just slowing things down.

This is largely a duplicate of https://stackoverflow.com/questions/6935442/x86-spinlock-using-cmpxchg, https://stackoverflow.com/questions/11959374/fastest-inline-assembly-spinlock, etc.

(1) Your code works only on 32-bit x86. If you're using 64-bit (x86-64), the calling convention is totally wrong (you need to save and restore %rbp rather than %ebp, and you're looking for the argument in the wrong place). If you're implementing your own OS, I suppose you know what you're doing... but in general I think it's weird to see 32-bit x86 code these days.

(2) Useless trivia of the day: The two instructions test ecx, ecx; jz .acquired could be replaced with the single instruction jecxz .acquired. I don't know whether this is an optimization or a pessimization on your particular hardware. It would be a size optimization, at least.

(2.5) Really, consider using cmpxchg instead of xchg; test. The cmpxchg mnemonic maps more directly onto the "Compare And Swap" idiom that you're using. It will also help you when it comes time to implement "is the lock held by this thread"; you can set the word to current-thread-id instead of just 1.

(3) See 'About PAUSE' below: Unfortunately there is no "About PAUSE" below. However, it seems like you're doing the right thing there. You could definitely tighten up the code; for example I don't know why you're wasting time with that test dword [eax], 1. You should just retry the xchg as soon as the pause is over.

(4) Consider removing the stack frame; it's not doing anything useful (unless it's necessary for your debugger of choice), just slowing things down.

Source Link
Quuxplusone
  • 19.7k
  • 2
  • 44
  • 91

This is largely a duplicate of http://stackoverflow.com/questions/6935442/x86-spinlock-using-cmpxchg, http://stackoverflow.com/questions/11959374/fastest-inline-assembly-spinlock, etc.

(1) Your code works only on 32-bit x86. If you're using 64-bit (x86-64), the calling convention is totally wrong (you need to save and restore %rbp rather than %ebp, and you're looking for the argument in the wrong place). If you're implementing your own OS, I suppose you know what you're doing... but in general I think it's weird to see 32-bit x86 code these days.

(2) Useless trivia of the day: The two instructions test ecx, ecx; jz .acquired could be replaced with the single instruction jecxz .acquired. I don't know whether this is an optimization or a pessimization on your particular hardware. It would be a size optimization, at least.

(2.5) Really, consider using cmpxchg instead of xchg; test. The cmpxchg mnemonic maps more directly onto the "Compare And Swap" idiom that you're using. It will also help you when it comes time to implement "is the lock held by this thread"; you can set the word to current-thread-id instead of just 1.

(3) See 'About PAUSE' below: Unfortunately there is no "About PAUSE" below. However, it seems like you're doing the right thing there. You could definitely tighten up the code; for example I don't know why you're wasting time with that test dword [eax], 1. You should just retry the xchg as soon as the pause is over.

(4) Consider removing the stack frame; it's not doing anything useful (unless it's necessary for your debugger of choice), just slowing things down.

default

AltStyle によって変換されたページ (->オリジナル) /