Re: Pentium 4 and misaligned doubles
- Subject: Re: Pentium 4 and misaligned doubles
- From: Rici Lake <lua@...>
- Date: 16 Aug 2005 00:13:43 -0500
Just to confirm the previous message: it occurred to me that in an 
array, every sixth value will cross a cache-line boundary. Indeed, 
empirical results demonstrate that there is an effect:
local a = {1,2,3,4,5,6,7,8}; for i = 1, 1e8 do a[1] = a[1] + 1 end
 13.51 real 13.44 user 0.02 sys
local a = {1,2,3,4,5,6,7,8}; for i = 1, 1e8 do a[2] = a[2] + 1 end
 13.54 real 13.49 user 0.01 sys
local a = {1,2,3,4,5,6,7,8}; for i = 1, 1e8 do a[3] = a[3] + 1 end
 13.51 real 13.44 user 0.02 sys
local a = {1,2,3,4,5,6,7,8}; for i = 1, 1e8 do a[4] = a[4] + 1 end
 13.52 real 13.47 user 0.01 sys
local a = {1,2,3,4,5,6,7,8}; for i = 1, 1e8 do a[5] = a[5] + 1 end
 13.51 real 13.45 user 0.02 sys
local a = {1,2,3,4,5,6,7,8}; for i = 1, 1e8 do a[6] = a[6] + 1 end
 16.05 real 15.97 user 0.03 sys
local a = {1,2,3,4,5,6,7,8}; for i = 1, 1e8 do a[7] = a[7] + 1 end
 13.61 real 13.55 user 0.02 sys
local a = {1,2,3,4,5,6,7,8}; for i = 1, 1e8 do a[8] = a[8] + 1 end
 13.63 real 13.58 user 0.00 sys
local a = {1,2,3,4,5,6,7,8}; for i = 1, 1e8 do a[1] = a[1] end
 10.91 real 10.85 user 0.02 sys
local a = {1,2,3,4,5,6,7,8}; for i = 1, 1e8 do a[2] = a[2] end
 10.88 real 10.83 user 0.01 sys
local a = {1,2,3,4,5,6,7,8}; for i = 1, 1e8 do a[3] = a[3] end
 10.88 real 10.83 user 0.01 sys
local a = {1,2,3,4,5,6,7,8}; for i = 1, 1e8 do a[4] = a[4] end
 10.91 real 10.85 user 0.02 sys
local a = {1,2,3,4,5,6,7,8}; for i = 1, 1e8 do a[5] = a[5] end
 10.88 real 10.83 user 0.01 sys
local a = {1,2,3,4,5,6,7,8}; for i = 1, 1e8 do a[6] = a[6] end
 13.41 real 13.35 user 0.01 sys
local a = {1,2,3,4,5,6,7,8}; for i = 1, 1e8 do a[7] = a[7] end
 10.91 real 10.86 user 0.01 sys
local a = {1,2,3,4,5,6,7,8}; for i = 1, 1e8 do a[8] = a[8] end
 10.86 real 10.82 user 0.01 sys
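For anyone who wants to repeat the measurement in a single process, here is a minimal sketch of a harness. The function name `time_slot` and the use of `os.clock` are my own assumptions for illustration; the figures above look like separate runs under `time`, so absolute numbers from this sketch will not be directly comparable.

```lua
-- Hypothetical harness: times the read-modify-write loop for one array slot.
-- Returns the elapsed CPU time in seconds and the final value of the slot.
local function time_slot(index, iterations)
  local a = {1, 2, 3, 4, 5, 6, 7, 8}
  local start = os.clock()
  for i = 1, iterations do
    a[index] = a[index] + 1
  end
  return os.clock() - start, a[index]
end

-- The runs above used 1e8 iterations; a smaller count keeps this sketch quick.
local ITERATIONS = 1e6
for index = 1, 8 do
  local elapsed = time_slot(index, ITERATIONS)
  print(string.format("a[%d]  %.2f s", index, elapsed))
end
```

If the effect is present, one slot should stand out from the rest. Which slot that is depends on where the allocator happens to place the array.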
Note that the constant 6 is probably also misaligned. Also, the layout 
differs between Lua 5.0.2 and Lua 5.1 (work6), so although the 
periodicity should be the same, the offset will probably differ.
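The offset argument can be checked with a little address arithmetic. The following is only a sketch under assumptions that are mine, not measured: 12-byte array elements (an 8-byte double plus a 4-byte type tag), 64-byte cache lines, and an array that starts on a line boundary. The helper name `straddling_slots` is hypothetical.

```lua
-- Returns the 1-based slots whose 8-byte double straddles a cache-line boundary.
-- elem_size:    bytes per array element (assumed, not measured)
-- value_offset: byte offset of the double inside the element
-- line_size:    cache-line size in bytes
local function straddling_slots(count, elem_size, value_offset, line_size)
  local slots = {}
  for slot = 1, count do
    local first = (slot - 1) * elem_size + value_offset  -- first byte of the double
    local last = first + 7                               -- last byte of the double
    if math.floor(first / line_size) ~= math.floor(last / line_size) then
      slots[#slots + 1] = slot
    end
  end
  return slots
end

-- Double stored first, tag after it: slot 6 straddles the boundary.
print(table.concat(straddling_slots(8, 12, 0, 64), ","))   --> 6
-- Tag stored first, double after it: no slot in 1..8 straddles.
print(table.concat(straddling_slots(8, 12, 4, 64), ","))   --> (empty line)
```

With these assumed sizes, the value-first layout flags slot 6 among the first eight, which agrees with the timings, while the tag-first layout moves the hit to slot 11. Note that in this model the pattern repeats every 16 slots (16 elements of 12 bytes fill exactly three 64-byte lines), so the spacing between slow slots is sensitive to the sizes assumed here.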
I should also add that I'm doing these tests on FreeBSD, whose 
allocator will not artificially cross a 64-byte boundary. (That wasn't a 
design goal, as far as I know -- the documentation only talks about 
page boundaries -- but it is a consequence of rounding all small 
allocations up to a power of two and then dedicating a whole page to 
objects of the same size.) I believe the Linux allocator is more 
memory-conservative, and may place a small object across a 64-byte 
boundary.
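To illustrate why power-of-two rounding has that consequence, here is a toy model. It is a sketch only: the function names are hypothetical, the "tightly packed" case is a stand-in for any allocator that does not round up, and it does not model the real FreeBSD or Linux allocators.

```lua
local LINE = 64  -- assumed cache-line size in bytes

-- Smallest power of two that is >= n.
local function round_up_pow2(n)
  local p = 1
  while p < n do p = p * 2 end
  return p
end

-- True if the byte range [offset, offset + size) touches more than one line.
local function crosses_line(offset, size)
  return math.floor(offset / LINE) ~= math.floor((offset + size - 1) / LINE)
end

-- Lays out `count` objects of `size` bytes back to back, `stride` bytes apart,
-- starting on a line boundary, and counts how many of them cross a line.
local function count_crossings(size, stride, count)
  local crossings = 0
  for k = 0, count - 1 do
    if crosses_line(k * stride, size) then
      crossings = crossings + 1
    end
  end
  return crossings
end

local size = 24
print(count_crossings(size, round_up_pow2(size), 16))  --> 0  (32-byte slots)
print(count_crossings(size, size, 16))                 --> 4  (24-byte stride)
```

A slot size that is a power of two no larger than 64 always divides the line evenly, so no object in such a page can straddle a line. Packing the same objects at their natural size pushes some of them across.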