Note that there is about a 20% slowdown when we run the same benchmark as an i386 binary compiled with GCC 3.2.3 (from 2002). SipHash is very fast on a modern 64-bit processor because its core algorithm keeps its entire state in just four 64-bit registers. On i386, each of those 64-bit values has to be split across two 32-bit registers, and the architecture has only eight general-purpose registers to begin with. I suspect the slowdown would be smaller on 32-bit ARM or another RISC architecture, because those do not suffer from the register starvation that i386 does.

No difference in speed.
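To make the "four 64-bit registers" point concrete, here is a minimal sketch of SipHash-2-4 in portable C. It is not the code that was benchmarked above. The names `siphash24`, `sipround`, `rotl64`, and `load_le64` are chosen here for illustration, and the constants and round structure follow the published SipHash description. The whole state is the four words `v[0]` through `v[3]`, and every step is a 64-bit add, rotate, or XOR on them.

```c
#include <stddef.h>
#include <stdint.h>

/* Rotate a 64-bit word left by b bits (0 < b < 64). */
static inline uint64_t rotl64(uint64_t x, int b) {
    return (x << b) | (x >> (64 - b));
}

/* One SipRound: only adds, rotates and XORs on the four state words. */
static void sipround(uint64_t v[4]) {
    v[0] += v[1]; v[1] = rotl64(v[1], 13); v[1] ^= v[0]; v[0] = rotl64(v[0], 32);
    v[2] += v[3]; v[3] = rotl64(v[3], 16); v[3] ^= v[2];
    v[0] += v[3]; v[3] = rotl64(v[3], 21); v[3] ^= v[0];
    v[2] += v[1]; v[1] = rotl64(v[1], 17); v[1] ^= v[2]; v[2] = rotl64(v[2], 32);
}

/* Read 8 bytes as a little-endian 64-bit word, independent of host order. */
static uint64_t load_le64(const uint8_t *p) {
    uint64_t x = 0;
    for (int i = 0; i < 8; i++)
        x |= (uint64_t)p[i] << (8 * i);
    return x;
}

/* SipHash-2-4: 2 rounds per 8-byte block, 4 finalization rounds. */
uint64_t siphash24(const uint8_t *msg, size_t len, const uint8_t key[16]) {
    uint64_t k0 = load_le64(key);
    uint64_t k1 = load_le64(key + 8);

    /* The entire state: four 64-bit words. */
    uint64_t v[4] = {
        k0 ^ 0x736f6d6570736575ULL,
        k1 ^ 0x646f72616e646f6dULL,
        k0 ^ 0x6c7967656e657261ULL,
        k1 ^ 0x7465646279746573ULL,
    };

    /* Absorb every complete 8-byte block. */
    size_t full = len & ~(size_t)7;
    for (size_t i = 0; i < full; i += 8) {
        uint64_t m = load_le64(msg + i);
        v[3] ^= m;
        sipround(v);
        sipround(v);
        v[0] ^= m;
    }

    /* Last block: leftover bytes plus the message length in the top byte. */
    uint64_t b = (uint64_t)len << 56;
    for (size_t i = full; i < len; i++)
        b |= (uint64_t)msg[i] << (8 * (i - full));
    v[3] ^= b;
    sipround(v);
    sipround(v);
    v[0] ^= b;

    /* Finalization. */
    v[2] ^= 0xff;
    for (int i = 0; i < 4; i++)
        sipround(v);

    return v[0] ^ v[1] ^ v[2] ^ v[3];
}
```

With the key bytes `00 01 ... 0f` and the 15-byte message `00 01 ... 0e`, this should produce `0xa129ca6149be45e5`, the test value given in the SipHash paper. On a 64-bit target a compiler can keep `v[0]` to `v[3]` in registers for the whole loop. On i386 the same four words need eight 32-bit halves, which is more than the register file can hold alongside loop counters and pointers.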