I'm looking for a low-overhead method for my program to stall a few cycles on an Intel CPU, without causing memory accesses or side effects that could alter the CPU components' data (e.g. no usleep()).

What would be the best-fit instruction that has a consistent execution cycle-time and predictable behavior, so that I could use it once or numerous times, depending on how many cycles I'd like my program to stall (e.g. 5, 10, or 1000)? I can't trust nop as I've read it does not guarantee 1 cycle execution time and could be optimized away (0 cycles) throughout the pipeline's execution.

asked Mar 2, 2025 at 2:05
  • _mm_pause will stall (the front-end?) for 5 or 100 cycles depending on CPU model (before vs. after Skylake on Intel), or for a BIOS-configurable amount on Zen. _mm_lfence() will block the front-end until the back-end drains. Spinning on rdtsc can be viable if you want to wait for more than about 40 core clock cycles (see: How to calculate time for an asm delay loop on x86 linux?). Commented Mar 2, 2025 at 2:37
  • Thank you! I unfortunately can't rely on memory fence instructions since their execution time may vary based on what's on the processor's pipeline. Follow up Q: I am using a first gen Intel Xeon Scalable Skylake-SP processor, so would that mean pause would always take ≈100 cycles? Commented Mar 2, 2025 at 3:42
  • Yes, on Skylake it will always pause the front-end for 100 cycles while the back-end keeps running, if I understand it correctly. If there's already a cache-miss load in the back-end that will soon stall, then pause probably doesn't make things any slower (except maybe by delaying independent work that will also stall and could have been running in parallel, e.g. another load from a separate address). If it's not in the shadow of a stall that would happen anyway, it will cost the full delay. I can't think of a mechanism that would make it slower by more than 100 core cycles. Commented Mar 2, 2025 at 4:23
  • Thank you! Now how is pause different than rep; nop? I recently came across this combination of instructions in a code base. Commented Mar 5, 2025 at 15:29
  • What does "rep; nop;" mean in x86 assembly? Is it the same as the "pause" instruction? Commented Mar 5, 2025 at 21:12
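
For reference, the two approaches mentioned in the comments (spinning on rdtsc, and issuing pause) could be sketched in C roughly as follows. This is only a sketch under assumptions: the helper names spin_tsc and spin_pause are hypothetical, the header is the GCC/Clang one, and the code should be built with optimization so the loop variables stay in registers. Note that rdtsc counts reference ticks rather than core clock cycles, so the delay is a lower bound in TSC ticks, not an exact core-cycle count.

```c
#include <stdint.h>
#include <x86intrin.h>  /* __rdtsc() and _mm_pause() on GCC/Clang (assumed toolchain) */

/* Hypothetical helper: busy-wait until at least `ticks` TSC reference
 * ticks have elapsed. Returns the number of ticks actually observed,
 * which is always >= ticks. */
static inline uint64_t spin_tsc(uint64_t ticks)
{
    uint64_t start = __rdtsc();
    uint64_t now;
    do {
        now = __rdtsc();            /* re-read the time-stamp counter */
    } while (now - start < ticks);  /* unsigned subtraction handles wraparound */
    return now - start;
}

/* Hypothetical helper: issue `n` pause instructions back to back.
 * The cost of each pause depends on the CPU model, per the comments above. */
static inline void spin_pause(unsigned n)
{
    for (unsigned i = 0; i < n; i++)
        _mm_pause();
}
```

Something like spin_tsc(1000) would then wait for at least 1000 TSC ticks, and spin_pause(1) would issue a single pause. Neither helper is cycle-exact: the rdtsc loop overshoots by up to one iteration, and interrupts or frequency changes can stretch either delay.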
