Hello Jairo, On Sat, Dec 05, 2020 at 03:47:05AM -0500, Jairo A. del Rio wrote: > ... Thank you for this opportunity to indulge in prime hunting, which, as we all know, is one of the most healthy and useful activities. The Xuedong Luo method and your implementation are markedly tight, with priority to performance in the details. In contrast, I have attempted a more relaxed implementation of Eratosthenes' sieve (a module Sieve1.lua) to make it easier to understand what goes on and experiment with variations. That implementation, in spite of this more relaxed approach, is, in some cases, faster than yours. Working on a description of Sieve1.lua, I found, however, that simpler implementations, developed to support the description, also had merit and actually led to some significant performance improvements of your implementation of the Xuedong Luo method. So that's where I'll start. Before doing that, however, I'll briefly summarize the adjustments to your luo.lua in the shape of a diff between luo2.lua (an instrumented version of your luo.lua) and luo4.lua that includes the adjustments: $ diff luo2.lua luo4.lua 24c24 < step = step == 2 and 4 or 2 --- > step = 6 - step 33c33 < j = c --- > local j = c 36,40c36,42 < while j <= M do < S[j] = 0 < j = j + ij < ij = t - ij < count = count + 1 --- > if S[i] ~= 0 then > while j <= M do > S[j] = 0 > j = j + ij > ij = t - ij > count = count + 1 > end 46,48c48,50 < for _, v in next, S do < if v > 0 then < result[#result + 1] = v --- > for i = 1, #S do > if S[i] > 0 then > result[#result + 1] = S[i] $ These adjustments cut roughly a third of the run time required for calculating the primes below 100000. Details follow. The basic mechanism, attributed to Eratosthenes (about -276 to -195), for finding primes is the following (see https://en.wikipedia.org/wiki/Sieve_of_Eratosthenes): Start by listing all positive integers to some desired limit, here 25: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 The number 1 is not considered prime (and if used for sieving would create havoc), so take the next number, 2, in the list, that's the first prime. Now, since any number divisible by a prime is composite (unless it is the prime itself), we can remove as composite all multiples of 2 from the list, excepting 2, leaving us with: 1 2 3 5 7 9 11 13 15 17 19 21 23 25 The next number following 2 in the list is 3, so that's the second prime. As before, we remove multiples of that prime from the list, excepting the prime itself, to get rid of additional composites, leaving us with: 1 2 3 5 7 11 13 17 19 23 25 The next number in the list is 5, the third prime. We haven't said that before, but it actually suffices, when removing multiples of some prime p, to start at p^2 (= p*p, p squared). The reason is that composite numbers below p^2 are divisible by a prime strictly less than p and so, would have been removed earlier. For example, for p=5, 10=2*5 were removed when removing multiples of 2. And 15=3*5 were removed when removing multiples of 3. The only multiple of 5 in the list, excluding 5 itself, is 25. After removing 25 from the list, we are left with: 1 2 3 5 7 11 13 17 19 23 Argueing as before, 7 is the fourth prime and 7*7 = 49 is the next number targeted for removal. But 49 is beyond the list, so we are done and all numbers in the list (except 1) are prime. In Lua, this becomes Eratos1.lua: $ cat Eratos1.lua -- Eratos1.lua: Inspired by "Jairo A. del Rio" <jairoadelrio6@gmail.com>: Tips and questions concerning a prime sieve in Lua, simple sieve -- 2021-Apr-01 18.56 / TN print( "Eratos1.lua: 2021-Apr-28 21.21 / TN" ) -- for k, v in pairs( arg ) do -- print( "arg[" .. k .. "] = <" .. v .. ">" ) -- end local print = print local tonumber = tonumber local arg = arg _ENV = nil local function Eratos1( l ) local list = {} for i = 1, l do list[i] = i end local x = 1 local count = 0 while true do local p repeat x = x + 1 p = list[x] until p ~= 0 if not p or p*p > l then break end -- print( p ) for i = p*p, l, p do list[i] = 0 -- count = count + 1 end end local prime = {} -- do return prime end for i = 2, l do if list[i] ~= 0 then -- print( "prime[" .. i .. "] = " .. list[i] ) prime[#prime+1] = list[i] end end return prime, count end local l = tonumber( arg[1] ) local c = tonumber( arg[2] or 1 ) local result local count for i = 1, c do result, count = Eratos1( l ) end print( "pi(" .. l .. ") = " .. #result .. ", count=" .. count .. " (calculated " .. c .. " times)" ) $ (Much code quoted is work in progress, so please apologize for occasional rough edges, such as remains of debugging activities.) Briefly, list[i] represents the number i and is initialized with i itself for a start. Subsequently, when sieving establishes that some i is composite, list[i] is zeroed. Finally, the list is scanned for non-zero numbers: These are all the primes and hence transferred to the result prime list. Establishing a suitable baseline: $ uname -a CYGWIN_NT-6.1-WOW Annette-Pc2 3.1.7(0.340/5/3) 2020年08月22日 19:03 i686 Cygwin $ time lua-5.4.3 Eratos1.lua 1000000 100 Eratos1.lua: 2021-Apr-28 21.21 / TN pi(1000000) = 78498, count=0 (calculated 100 times) real 0m27.616s user 0m24.585s sys 0m2.870s $ The calculation is summarized by printing pi(1000000), the number of primes below the limit 1000000. Specific values of the prime-counting function pi(x) can be found at https://en.wikipedia.org/wiki/Prime-counting_function. Eratos1.lua with l=1000000 and c=100 performs the same calculation as your (Jairo's) luo.lua: $ time lua-5.4.3 luo.lua real 0m21.828s user 0m20.373s sys 0m1.388s $ The situation is thus not entirely hopeless: Your program is faster, yes, but not dramatically so. So how can we improve this? A common idea used in the Xuedong Luo method is making a special case of the smallest primes. To illustrate this, let's make a special case of 2. This means that we can avoid handling numbers divisible by 2, so some operations need only be carried out half the number of times. This is Eratos2.lua: $ cat Eratos2.lua -- Eratos2.lua: Inspired by "Jairo A. del Rio" <jairoadelrio6@gmail.com>: Tips and questions concerning a prime sieve in Lua, special case 2 sieve -- 2021-Apr-01 18.58 / TN print( "Eratos2.lua: 2021-Apr-09 19.50 / TN" ) local print = print local tonumber = tonumber local arg = arg local os = os _ENV = nil local function Eratos2( l, clockAccum ) local clockStart local clockEnd clockStart = os.clock() local l0 = ( l + 1 ) // 2 local list = {} for i = 1, l0 do list[i] = 2*i - 1 end local x = 1 local count = 0 clockEnd = os.clock() clockAccum.Init = (clockAccum.Init or 0) + clockEnd - clockStart clockStart = clockEnd while true do local p repeat x = x + 1 p = list[x] until p ~= 0 if not p or p*p > l then break end -- print( p ) for i = ( p*p + 1 ) // 2, l0, p do list[i] = 0 count = count + 1 end end clockEnd = os.clock() clockAccum.Sieve = (clockAccum.Sieve or 0) + clockEnd - clockStart clockStart = clockEnd local prime = {} -- do return prime end if 2 <= l then prime[#prime+1] = 2 end for i = 2, l0 do if list[i] ~= 0 then -- print( "prime[" .. i .. "] = " .. list[i] ) prime[#prime+1] = list[i] end end clockEnd = os.clock() clockAccum.Pack = (clockAccum.Pack or 0) + clockEnd - clockStart return prime, count end local l = tonumber( arg[1] ) local c = tonumber( arg[2] or 1 ) local result local count local clockStart = os.clock() local clockAccum = {} for i = 1, c do result, count = Eratos2( l, clockAccum ) end local clockEnd = os.clock() print( "pi(" .. l .. ") = " .. #result .. ", count=" .. count .. " (calculated " .. c .. " times)" ) print( "clockAccum.Init=" .. clockAccum.Init ) print( "clockAccum.Sieve=" .. clockAccum.Sieve ) print( "clockAccum.Pack=" .. clockAccum.Pack ) print( "clockTotal=" .. clockEnd - clockStart ) $ In Eratos2.lua, the list only holds the odd numbers, so list[1]=1, list[2]=3, list[3]=5, and so on. The number of elements of the list is calculated as l0 = (l+1)//2 for the limit l. This ensures that the list is able to hold the largest odd number less than or equal to l. The i'th odd number is easily calculated as 2*i-1. Selecting the next prime, p, for sieving can be done in the same manner as in Eratos1.lua, since the list still contains actual numbers. To remove (set to zero) the numbers p*p, p*p+p, p*p+p+p, ... from the list, we need to find the indexes of these numbers. Or some of them, at least, since we are only dealing with odd numbers now. In general, to find the index of an odd number, k, we calculate (k+1)//2. And since p is odd, p*p is likewise odd and (p*p+1)//2 is the index of p*p in the list. This explains the low limit of the loop that iterates over the multiples of p. It is clear that every second number in p*p, p*p+p, p*p+p+p, ... is even and should not be used. And a bit of consideration shows that the indexes of two neighboring odd numbers in the list p*p, p*p+p, p*p+p+p, ... differ by exactly p. So p is used as the step in the sieve loop. And running this: $ time lua-5.4.3 Eratos2.lua 1000000 100 Eratos2.lua: 2021-Apr-09 19.50 / TN pi(1000000) = 78498, count=811068 (calculated 100 times) clockAccum.Init=5.577 clockAccum.Sieve=6.092 clockAccum.Pack=3.089 clockTotal=14.758 real 0m14.846s user 0m13.104s sys 0m1.684s $ This is really interesting, since it beats luo.lua by a significant margin. I'll deal with this unexpected fact later. There is also some instrumentation (the count and clock output) intended to help analyze performance. Before doing that, however, let's take matters a step further and specialize with respect to the prime 3, as well as the prime 2. This is the same level of specialization used in the Xuedong Luo method. So here is Eratos3.lua doing that: $ cat Eratos3.lua -- Eratos3.lua: Inspired by "Jairo A. del Rio" <jairoadelrio6@gmail.com>: Tips and questions concerning a prime sieve in Lua, special case 2, 3 sieve -- 2021-Apr-01 18.59 / TN print( "Eratos3.lua: 2021-Apr-18 19.25 / TN" ) local print = print local tonumber = tonumber local arg = arg local os = os local error = error _ENV = nil local function Eratos3( l, clockAccum ) local clockStart local clockEnd clockStart = os.clock() local l0 = ( l + l%2 + 1 ) // 3 local list = {} local u = 1 local v = 4 for i = 1, l0 do -- list[i] = 3*(i-1) + (i-1)%2 + 1 list[i] = u u = u + v v = 6 - v -- print( "list[" .. i .. "] = " .. list[i] ) end local x = 1 local count = 0 clockEnd = os.clock() clockAccum.Init = (clockAccum.Init or 0) + clockEnd - clockStart clockStart = clockEnd while true do local p repeat x = x + 1 p = list[x] until p ~= 0 if not p or p*p > l then break end local i = p*p // 3 + 1 local q = p // 6 local d if p % 6 == 1 then d = 8 * q + 1 else d = 4 * q + 3 end -- print( p ) local z = 2 * p -- local c = 0 while i <= l0 do list[i] = 0 i = i + d d = z - d -- c = c + 1 count = count + 1 end -- print( "p=" .. p .. ", c=" .. c ) end clockEnd = os.clock() clockAccum.Sieve = (clockAccum.Sieve or 0) + clockEnd - clockStart clockStart = clockEnd local prime = {} -- do return prime end for i = 2, 3 do if i <= l then prime[#prime+1] = i end end for i = 2, l0 do if list[i] ~= 0 then -- print( "prime[" .. i .. "] = " .. list[i] ) prime[#prime+1] = list[i] end end clockEnd = os.clock() clockAccum.Pack = (clockAccum.Pack or 0) + clockEnd - clockStart return prime, count end local l = tonumber( arg[1] ) local c = tonumber( arg[2] or 1 ) local result local count local clockStart = os.clock() local clockAccum = {} for i = 1, c do result, count = Eratos3( l, clockAccum ) end local clockEnd = os.clock() print( "pi(" .. l .. ") = " .. #result .. ", count=" .. count .. " (calculated " .. c .. " times)" ) print( "clockAccum.Init=" .. clockAccum.Init ) print( "clockAccum.Sieve=" .. clockAccum.Sieve ) print( "clockAccum.Pack=" .. clockAccum.Pack ) print( "clockTotal=" .. clockEnd - clockStart ) $ The list should hold the numbers that are neither divisible by 2 nor 3, so list[1]=1, list[2]=5, list[3]=7, list[4]=11, list[5]=13, list[6]=17, list[7]=19, and so on. These are all the numbers of the form 6*k+1 and 6*k+5. And note that the difference between neighboring numbers alternates between 4 and 2. The number of elements in the list is calculated as l0 = ( l + l%2 + 1 ) // 3. A bit of thought and experimenting serve to convince ourselves that the list thereby becomes able to hold the largest number of the form 6*k+1 or 6*k+5 that are less than or equal to l. The list itself is initialized by the values of u, initially 1, that are formed by repeatedly adding v which alternates between 4 and 2. This is similar to what happens in luo.lua and generates 1, 5, 7, 11, 13 ... The sieve loop is more intricate: We need to map from the values that should be excluded (p*p, p*p+p, ...) to their indexes in order to zero the correct numbers in the list. Observe initially that for the numbers in the list of the form 6*k+1, the index is 2*k+1. And for the numbers of the form 6*k+5, the index is 2*k+2. For example, the index of 7=6*1+1 where k=1 is 2*k+1=3. And the index of 17=6*2+5 with k=2 is 2*k+2=6. If a number, n, is of one of the forms 6*k+1 or 6*k+5, we can convince ourselves that its index in the list is n//3+1: If n = 6*k+1, n//3 = 2*k and n//3+1 is the desired index value 2*k+1. And if n = 6*k+5, n//3 = 2*k+1 and again n//3+1 is the desired index value which is 2*k+2. The index of p*p, the initial value of i, is therefore calculated as p*p//3+1. Now we have to zero the numbers with indexes that correspond to the subset of the numbers p*p, p*p+p, p*p+2*p, ... which are not divisible by 2 or 3. Let's begin by considering an example, p=5: For this p, the numbers we wish to exclude are 25 30 35 40 45 50 55 60 65 70 75 80 85 ... >From this list, we should exclude numbers divisible by 2 or 3, that is, numbers that are not of the form 6*k+1 or 6*k+5. These numbers are already excluded from consideration because we have specialized with respect to 2 and 3. Excluding numbers divisible by 2 or 3 leaves us with 25 35 55 65 85 ... The corresponding list of indexes (n//3+1 for each n in the list) is 9 12 19 22 29 ... So adding 3 and 7 alternatingly generates these indexes. Another example, p=7: Excluding numbers divisible by 2 or 3 from 49 56 63 70 77 84 91 98 105 112 119 126 133 ... leaves 49 77 91 119 133 ... so the list of indexes to be zeroed is 17 26 31 40 45 ... Similar pattern as for p=5: Adding 9 and 5 alternatingly generates these indexes. In general, for a p of the form 6*k+1, adding d = 8*(p//6)+1 and 2*p-d alternatingly generates the desired indexes. And for a p of the form 6*k+5, adding d = 4*(p//6)+3 and 2*p-d alternatingly does the job. That's what happens in Eratos3.lua. Running Eratos3.lua: $ time lua-5.4.3 Eratos3.lua 1000000 100 Eratos3.lua: 2021-Apr-18 19.25 / TN pi(1000000) = 78498, count=429626 (calculated 100 times) clockAccum.Init=4.588 clockAccum.Sieve=5.815 clockAccum.Pack=2.561 clockTotal=12.964 real 0m13.024s user 0m11.575s sys 0m1.435s $ So Eratos3.lua is slightly faster than Eratos2.lua: The clockTotal for Eratos2.lua reported above is 14.586. Getting back to your (Jairo's) luo.lua, to enable a more detailed comparison with Eratos3.lua, I have instrumented luo.lua in the same way as Eratos3.lua. The difference between your original luo.lua and the instrumented luo2.lua is: $ diff luo.lua luo2.lua 4c4,7 < function primesieve(N) --- > function primesieve(N,clockAccum) > local clockStart > local clockEnd > clockStart = os.clock() 22a26,29 > local count = 0 > clockEnd = os.clock() > clockAccum.Init = (clockAccum.Init or 0) + clockEnd - clockStart > clockStart = clockEnd 32a40 > count = count + 1 34a43,45 > clockEnd = os.clock() > clockAccum.Sieve = (clockAccum.Sieve or 0) + clockEnd - clockStart > clockStart = clockEnd 40c51,53 < return result --- > clockEnd = os.clock() > clockAccum.Pack = (clockAccum.Pack or 0) + clockEnd - clockStart > return result, count 42a56,58 > local count > local clockStart = os.clock() > local clockAccum = {} 44c60 < X = #primesieve(1000000) --- > X, count = primesieve(1000000,clockAccum) 45a62,67 > local clockEnd = os.clock() > print( "pi(" .. 1000000 .. ") = " .. #X .. ", count=" .. count .. " (calculated " .. 100 .. " times)" ) > print( "clockAccum.Init=" .. clockAccum.Init ) > print( "clockAccum.Sieve=" .. clockAccum.Sieve ) > print( "clockAccum.Pack=" .. clockAccum.Pack ) > print( "clockTotal=" .. clockEnd - clockStart ) $ And running luo2.lua: $ time lua-5.4.3 luo2.lua pi(1000000) = 78498, count=580993 (calculated 100 times) clockAccum.Init=6.386 clockAccum.Sieve=10.53 clockAccum.Pack=4.955 clockTotal=21.871 real 0m22.000s user 0m20.326s sys 0m1.575s $ The first thing to notice here is that the figures reported by the time command are not significantly larger than the corresponding figures reported for the original luo.lua, repeated here: $ time lua-5.4.3 luo.lua real 0m21.828s user 0m20.373s sys 0m1.388s $ If there is a small difference, this is most likely caused primarily by the count = count + 1 which has been added to the inner sieving loop. Before continuing, some words about the instrumentation used: Resource usage is measured by the Lua os.clock() function which is based on the standard C library clock() function. To measure resource usage of some section of the program, take the difference between the values of os.clock() before and after that program section executes. It is usually important to avoid too many calls to os.clock(), since those calls and the accompanying calculations take time and may disturb the measurements significantly. In any case, it is preferable to measure fairly long-running program sections, since the clock() may "tick" with quite large intervals, that is, be relatively inaccurate. For both Eratos3.lua and luo2.lua, three os.clock() based measurements are taken, one for each of the main sections: Initialization, sieving, and packing of the result. And because the base test consists of 100 repetitions of the calculation of the primes below 1000000, we need to accumulate the measurements for each section. This is handled, a bit cumbersome, using the clockAccum table parameter which has been added to the main sieve function. In addition to the os.clock() measurements, the number of times the inner sieve loop is executed is counted. There is no need for accumulation here, so that value, the count, is simply returned as an additional result of the main sieve function. As a final check, the executions are performed under the control of the time command which provides a simple sanity check for the figures reported. Now compare the measurements for luo2.lua and Eratos3.lua: $ time lua-5.4.3 luo2.lua pi(1000000) = 78498, count=580993 (calculated 100 times) clockAccum.Init=6.386 clockAccum.Sieve=10.53 clockAccum.Pack=4.955 clockTotal=21.871 real 0m22.000s user 0m20.326s sys 0m1.575s $ $ time lua-5.4.3 Eratos3.lua 1000000 100 Eratos3.lua: 2021-Apr-18 19.25 / TN pi(1000000) = 78498, count=429626 (calculated 100 times) clockAccum.Init=4.588 clockAccum.Sieve=5.815 clockAccum.Pack=2.561 clockTotal=12.964 real 0m13.024s user 0m11.575s sys 0m1.435s $ We notice several things: The initialization by luo2.lua is a couple of seconds slower than the initialization by Eratos3.lua. Comparing the code, luo2.lua has local step = 2 local first = 5 while first <= N do S[#S+1] = first first = first + step step = step == 2 and 4 or 2 end whereas Eratos3.lua uses: local u = 1 local v = 4 for i = 1, l0 do list[i] = u u = u + v v = 6 - v end Adjusting luo2.lua to use step = 6-step should save a little. Also the packing of final result, the list of primes, is slower in luo2.lua that uses: for _, v in next, S do if v > 0 then result[#result + 1] = v end end Where Eratos3.lua uses: for i = 2, l0 do if list[i] ~= 0 then -- print( "prime[" .. i .. "] = " .. list[i] ) prime[#prime+1] = list[i] end end Actually, the luo2.lua code is not only slower, but actually risks generating a list of primes out of order: There is no guarantee that the table keys are processed in numerical order when using next. More mysterious initially is the fact that luo2.lua uses 580993 iterations of the inner sieve loop where Eratos3.lua can make do with only 429626. The reason turns out to be that luo2.lua spends time eliminating multiples of composite numbers: For example, multiples of 25 are eliminated, but this is unnecessary, since these multiples have all been eliminated when multiples of 5 were eliminated. Figuring out what to do about this takes some thinking, but fortunately the fix is easy: Simply refrain from executing the inner sieve loop if S[i] == 0: S[i] is the value whose multiples are being zeroed, so if S[i] is zero, it is composite and, hence, its multiples have already been excluded. (This improvement is relevant also for the original Xuedon Luo method.) Similar logic is found in Eratos3.lua in the form of the following loop: repeat x = x + 1 p = list[x] until p ~= 0 Accordingly, the following adjustments to luo2.lua are made in luo3.lua to improve performance: $ diff luo2.lua luo3.lua 24c24 < step = step == 2 and 4 or 2 --- > step = 6 - step 36,40c36,42 < while j <= M do < S[j] = 0 < j = j + ij < ij = t - ij < count = count + 1 --- > if S[i] ~= 0 then > while j <= M do > S[j] = 0 > j = j + ij > ij = t - ij > count = count + 1 > end 46,48c48,50 < for _, v in next, S do < if v > 0 then < result[#result + 1] = v --- > for i = 1, #S do > if S[i] > 0 then > result[#result + 1] = S[i] $ Then: $ time lua-5.4.3 luo3.lua pi(1000000) = 78498, count=429627 (calculated 100 times) clockAccum.Init=5.828 clockAccum.Sieve=9.941 clockAccum.Pack=2.405 clockTotal=18.174 real 0m18.227s user 0m16.629s sys 0m1.575s $ We notice that all sections have been speeded up: The luo3.lua initialization is still a bit slower than the Eratos3.lua initialization: Presumably, this is explained by the different loop contructs used. And the packing is performing similarly in luo3.lua and Eroatos3.lua and understandably: The packing loops are equivalent. The sieving section has been speeded up, but there is still some way to go, which is a bit strange, considering that the luo3.lua sieve loop while j <= M do S[j] = 0 j = j + ij ij = t - ij count = count + 1 end is statement for statement equivalent to the Eratos3.lua sieve loop: while i <= l0 do list[i] = 0 i = i + d d = z - d -- c = c + 1 count = count + 1 end The reason turns out to be that the luo.lua variable j is global. Hence, we declare j local like this in luo4.lua: $ diff luo3.lua luo4.lua 33c33 < j = c --- > local j = c $ Resulting in: $ time lua-5.4.3 luo4.lua pi(1000000) = 78498, count=429627 (calculated 100 times) clockAccum.Init=6.005 clockAccum.Sieve=5.818 clockAccum.Pack=2.513 clockTotal=14.336 real 0m14.471s user 0m12.667s sys 0m1.715s $ Overall, this essentially eliminates the difference between the performance of luo2.lua and Eratos3.lua, except for the initialization, as noted. Finally, one detail stands out: The count reported by both luo3.lua and luo4.lua is one greater than the count reported by Eratos3.lua. Looking into this, it turns out that the original Xuedong Luo method presupposes that the N parameter is of the form 3*M+2 with odd M. So the odd M is = 2*k+1 for some k, so N = 3*(2*k+1)+2 = 6*k+5, and the N = 1000000 used in the test is not of this form. As a result, the inner sieving loop of luo.lua in rare cases sets a S[j] = 0 outside the intended range, explaining the difference in the counts. So, with these initial considerations out of the way, the original plan was to present the very different module that I mentioned earlier, Sieve1.lua. However, this mail is already too long, so I'll limit myself to simply attaching the code (Sieve1_20210507_1322.zip containing Sieve1.lua and required utility modules Integer1.lua and List1.lua) and just comment briefly. To run the code, unpack the files into some directory and: $ time lua-5.4.3 -e'local s=require "Sieve1"; for n=1,100 do ps=s.primesTo( 1000000 ) end print( "moduleCount=" .. s.moduleCount .. ", #ps=" .. #ps )' moduleCount=3, #ps=78498 real 0m11.119s user 0m10.872s sys 0m0.249s $ This mimics the luo.lua calculation by calculating the primes to 1000000 a hundred times. (And note: This is slightly faster than both luo4.lua and Eratos3.lua.) Sieve1.lua is different from Eratos3.lua in (at least) two significant ways: 1. The number of primes (called modules) handled specially (like 2 and 3 in luo.lua and Eratos3.lua) can be varied. This allows easy experimentation with this parameter. The module count is hardcoded in Sieve1.lua, presently Sieve1.moduleCount = 3, so that the primes handled specially are 2, 3, and 5. This particular value of the module count gives the best performance for the primes below 1000000 in that both using 2 and 4 modules is slower. This minimum value may change with the number of primes desired, however, so using 3 is perhaps not the end of the story. 2. The Sieve1.lua module uses a single bit (with 0 meaning potentially prime and 1 denoting composite) to represent an entry in the list of numbers on which exclusion is performed. This means that (under ordinary circumstances) 64 numbers (bits of a Lua integer) are handled in parallel. But beware that for larger primes, there are far between the exclusion values, reducing this advantage, perhaps even turning it into a disadvantage. Enough for now. Have fun. I did. Best Thorkil Naur
Attachment:
Sieve1_20210507_1322.zip
Description: Zip compressed data