This is the problem I ran into long time ago. I thought I may ask your for your ideas. assume I have very small list of numbers (integers), 4 or 8 elements, that need to be sorted, fast. what would be the best approach/algorithm?
my approach was to use the max/min functions (10 functions to sort 4 numbers, no branches, iirc).
// s(i,j) == max(i,j), min(i,j)
i,j = s(i,j)
k,l = s(k,l)
i,k = s(i,k) // i on top
j,l = s(j,l) // l on bottom
j,k = s(j,k)
I guess my question pertains more to implementation, rather than type of algorithm.
At this point it becomes somewhat hardware dependent , so let us assume Intel 64-bit processor with SSE3 .
Thanks
-
I asked basically the same question but with a more specific context (C implementation, arrays of 6 ints) and used cycle count register to evaluation performance. You can see results here : stackoverflow.com/questions/2786899/…kriss– kriss2010年05月10日 11:33:52 +00:00Commented May 10, 2010 at 11:33
-
1related: Fastest sort of fixed length 6 int arrayJanus Troelsen– Janus Troelsen2013年05月12日 19:10:22 +00:00Commented May 12, 2013 at 19:10
6 Answers 6
For small arrays like this, you should probably look into sorting networks. As you can see on that page, insertion sort can be expressed as a sorting network. However, if you know the size of the array beforehand, you can devise an optimal network. Take a look at this site that can help you to find optimal sorting networks for a given size of array (though optimal is only guaranteed up to a size of 16 I believe). The comparators are even grouped together in operations that can be done in parallel. The comparators are essentially the same as your s(x,y) function though if you really want this to be fast, you shouldn't be using min and max because then you're doing twice the number of comparisons that are necessary.
If you need this sorting algorithm to work on a wide range of sizes, then you should probably just go with insertion sort as others have suggested.
-
1So, ancient answer here, but I misinterpreted the concept of an "optimal sorting network" when I first read this answer, so to clarify: an optimal sorting network does not actually use a minimal number of comparators! Sorting networks are restricted in that they always use the same fixed arrangement of comparators; i.e, that there's no branching in deciding which elements to compare. That's great for hardware implementations and GPGU, but it does mean you'll need more comparisons than strictly necessary, even at fairly small array sizes (I believe from 5 and up).Eamon Nerbonne– Eamon Nerbonne2018年10月12日 20:02:21 +00:00Commented Oct 12, 2018 at 20:02
-
1The site you provided for finding optimal sorting networks doesn't work for me, but I found this website users.telenet.be/bertdobbelaere/SorterHunter/…M.J. Rayburn– M.J. Rayburn2022年05月25日 19:59:06 +00:00Commented May 25, 2022 at 19:59
I see you already have a solution that uses 5 comparisons (assuming that s(i,j) compares the two numbers once, and either swaps them or not). If you stick to comparison-based sorting, then you can't do it with any fewer than 5 comparisons.
This can be proven because there are 4! = 24 possible ways to order 4 numbers. Each comparison can only cut the possibilities in half, so with 4 comparisons you could only distinguish between 2^4 = 16 possible orderings.
To sort small amounts of numbers you want a simple algorithm as complexity adds more overhead.
The most efficient way to sort for example four items would be to unravel the sorting algorithm to linear comparisons, thus elliminating all overhead:
function sort(i,j,k,l) {
if (i < j) {
if (j < k) {
if (k < l) return [i,j,k,l];
if (j < l) return [i,j,l,k];
if (i < l) return [i,l,j,k];
return [l,i,j,k];
} else if (i < k) {
if (j < l) return [i,k,j,l];
if (k < l) return [i,k,l,j];
if (i < l) return [i,l,k,j];
return [l,i,k,j];
} else {
if (j < l) return [k,i,j,l];
if (i < l) return [k,i,l,j];
if (k < l) return [k,l,i,j];
return [l,k,i,j];
}
} else {
if (i < k) {
if (k < l) return [j,i,k,l];
if (i < l) return [j,i,l,k];
if (j < l) return [j,l,i,k];
return [l,j,i,k];
} else if (j < k) {
if (i < l) return [j,k,i,l];
if (k < l) return [j,k,l,i];
if (j < l) return [j,l,k,i];
return [l,j,k,i];
} else {
if (i < l) return [k,j,i,l];
if (j < l) return [k,j,l,i];
if (k < l) return [k,l,j,i];
return [l,k,j,i];
}
}
}
However, the code grows a lot for each extra item you add. Adding a fifth item makes the code roughly four times larger. At eight items it would be roughly 30000 lines, so although it's still the most efficient, it's a lot of code, and you would have to write a program that writes the code to get it correct.
-
2the original program used some something like this, but performance was pretty low, my guess due to branch issuesAnycorn– Anycorn2010年05月01日 04:00:55 +00:00Commented May 1, 2010 at 4:00
-
@aaa: I see... Well, to elliminate all branching you could do all the comparisons needed and combine the results into a key, and use that to get an index array from a precalculated dictionary of all possible results.Guffa– Guffa2010年05月01日 04:12:28 +00:00Commented May 1, 2010 at 4:12
-
6This algorithm is good but it is not optimal: it can perform up to 6 comparisons while an optimal algorithm shouldn't perform more than 5 comparisons.Morwenn– Morwenn2015年08月18日 18:28:55 +00:00Commented Aug 18, 2015 at 18:28
Insertion sort is considered best for small arrays. See Fast stable sort for small arrays (under 32 or 64 elements)
-
hi. I am interested more in implementation approachAnycorn– Anycorn2010年05月01日 03:35:23 +00:00Commented May 1, 2010 at 3:35
-
I'm not sure what you mean by "implementation approach". Are you looking for a discussion of assembly code?mathmike– mathmike2010年05月01日 03:39:08 +00:00Commented May 1, 2010 at 3:39
-
not quite that low, but something that would show branches/instructionsAnycorn– Anycorn2010年05月01日 03:40:28 +00:00Commented May 1, 2010 at 3:40
-
I implemented insertion sort in C++ for embedded platforms. It turned out to be optimal in compiled binary size in all cases, probably because the optimizer was able to unravel the sort into sorting networks in those cases that was beneficial (when you have a list of known size at compile time).Emil Vikström– Emil Vikström2017年06月20日 06:29:54 +00:00Commented Jun 20, 2017 at 6:29
For such a small data set, you want as simple of algorithm as possible. More likely than not, a basic Insertion Sort will do as well as you could want.
Would need to know more about the system this is running on, how many times you need to do this sort a second, etc... but the general rule in small sorts is to keep it simple. Quicksort and the like are not beneficial.
-
hello, I wrote some clarifications. I look more for implementation ideasAnycorn– Anycorn2010年05月01日 03:21:40 +00:00Commented May 1, 2010 at 3:21
Sorting networks can be easily implemented in SIMD, although it starts to get ugly at around N = 16. For N = 4 or N = 8 though this would be a good choice. Ideally you need lots of small data sets to sort concurrently, i.e. if you are sorting 8 bit values then you want at least 16 data sets to sort - it's much harder to do this kind of thing across SIMD vectors.
See also: Fastest sort of fixed length 6 int array
Explore related questions
See similar questions with these tags.