Re: natural order comparison?
[
Date Prev][
Date Next][
Thread Prev][
Thread Next]
[
Date Index]
[
Thread Index]
- Subject: Re: natural order comparison?
- From: Rici Lake <lua@...>
- Date: 2005年12月21日 20:58:05 -0500
On 21-Dec-05, at 6:43 PM, PA wrote:
Hello,
I'm trying to implement a "natural order" comparator as described here:
http://www.naturalordersort.org/
http://sourcefrog.net/projects/natsort/
Something along these lines:
local aList = { "rfc1000.txt", "rfc2086.txt", "rfc822.txt" }
table.sort( aList, compare )
The main reason the code you present is going to be slow is that you
split up every string on every comparison (both strings in fact).
Quicksort (the algorithm used by table.sort) does quite a few
comparisons, so each string is going to be split up quite a few times.
You'd do better by doing it only once, and then sorting an array of
indices into the string/canonicalised string array.
In particular, is there a way the construct the regular expression
"(%d*%.?%d*)(%D*)" so it returns only one capture, either one made
only of digits or characters?
Not easily. However, except for cases where you have version-number
stuff like this:
lua-5.0.2.tar.gz
you're going to have strict alternation between strings and numbers. So
it shouldn't be a problem that you capture a number/string pair. (You
would need to have two cases, one for strings starting with a digit and
one for strings not starting with a digit. You can then normalize all
the vectors so that they start with a text string, by inserting "" at
the beginning of the translation of a string which starts with a
digit.)
I think dotted numbers are a difficult case. Should 5.9 come before or
after 5.10 ? What about 192.168.0.9 and 192.168.0.10 ? Or 2005年12月9日 and
2005年12月10日 ? I don't have a solution, though. If you're going to take
the '.' as always starting a decimal fraction, you can treat .%d* as a
text string, whereas if you take the '.' as always being a separator
you can treat '.' as a text string.
In any event,