Sometimes lexicographical sorting of strings may cut it, but when browsing the filesystem, you really need better sorting criteria than just ASCII indices all the time.
For this challenge, given a list of strings representing filenames and folders in a directory, do a filesystem sort, explained below:
- Split every string into an array of alternating digits (
0123456789
) and non-digits. If the filename begins with digits, insert an empty string at the beginning of the array. - For each element in the array, if the element is a string, lexicographically sort all the strings by that element in all lowercase or uppercase. If the element is a number, sort by numerical order. If any files are tied, continue to the next element for more specific sorting criteria.
Here is an example, starting with strings that are initially sorted lexicographically:
Abcd.txt
Cat05.jpg
Meme00.jpeg
abc.txt
cat.png
cat02.jpg
cat1.jpeg
meme03-02.jpg
meme03-1.jpeg
meme04.jpg
After step 1, you'll get this:
"Abcd.txt"
"Cat", 5, ".jpg"
"Meme", 0, ".jpeg"
"abc.txt"
"cat.png"
"cat", 2, ".jpg"
"cat", 1, ".jpeg"
"meme", 3, "-", 2, ".jpg"
"meme", 3, "-", 1, ".jpeg"
"meme", 4, ".jpg"
After step 2, you'll get this:
abc.txt
Abcd.txt
cat1.jpeg
cat02.jpg
Cat05.jpg
cat.png
Meme00.jpeg
meme03-1.jpeg
meme03-02.jpg
meme04.jpg
Here are the rules:
- You may not assume the list of input strings is sorted in any order
- You may assume that the input strings only contain printable ASCII characters
- You may assume that when the filenames are converted to all lowercase or all uppercase, there will be no duplicates (i.e., a case-insensitive filesystem)
- Input and output may be specified via any approved default method
- Standard loopholes apply
Here is an example implementation of the sorting algorithm if there is any confusion:
function sort(array) {
function split(file) {
return file.split(/(\D*)(\d*)/);
}
return array.sort((a, b) => {
var c = split(a);
var d = split(b);
for (var i = 0; i < Math.min(c.length, d.length); i++) {
var e = c[i].toLowerCase();
var f = d[i].toLowerCase();
if (e-f) {
return e-f;
}
if (e.localeCompare(f) !== 0) {
return e.localeCompare(f);
}
}
return c.length - d.length;
});
}
This is code-golf, so shortest answer wins!
5 Answers 5
Bean, (削除) 116 (削除ここまで) (削除) 121 (削除ここまで) 118 bytes
Hexdump:
00000000 26 55 cd a0 62 80 40 a0 6f 53 d0 80 d3 d0 80 a0 &UÍ b.@ oSÐ.ÓÐ.
00000010 6f 20 80 0d 21 80 81 00 20 80 92 20 6f 53 d0 80 o ..!... .. oSÐ.
00000020 a0 78 20 80 83 40 a0 5d a0 5e 53 d0 80 d3 a0 62 x ..@ ] ^SÐ.Ó b
00000030 20 5d 20 80 8b 40 a0 5f a0 60 a0 65 a0 6e dd a0 ] ..@ _ ` e nÝ
00000040 61 50 84 d3 a0 62 20 5e 20 65 4e ce a0 5f 80 4c aP.Ó b ^ eNÎ _.L
00000050 a0 60 8c 20 61 80 53 d0 80 d3 50 80 a0 60 20 80 `. a.SÐ.ÓP. ` .
00000060 b9 20 80 b3 53 50 80 a0 61 20 80 b9 a8 dc c4 aa 1 .3SP. a .1 ̈ÜÄa
00000070 a9 a8 dc e4 aa 29 © ̈Üäa)
00000076
Equivalent JavaScript (140 bytes, ignoring added whitespace for formatting):
f=s=>s.split(/(\D*)(\d*)/).concat(s),
_.sort(
(a,b)=>f(a).reduce(
(c,d,i,r,e=f(b)[i])=>
c||d-e||d.toLowerCase().localeCompare(e.toLowerCase())
)
)
Basically a golfed version of the example implementation given in the question.
-
3\$\begingroup\$ Nice to see a new language! ^^ \$\endgroup\$Arnauld– Arnauld2017年01月29日 23:34:30 +00:00Commented Jan 29, 2017 at 23:34
JavaScript (ES6), (削除) 105 (削除ここまで) (削除) 110 (削除ここまで) (削除) 101 (削除ここまで) (削除) 97 (削除ここまで) 82 bytes
Assumes that each numeric element is up to 99 characters long.
a=>a.sort((a,b)=>(F=s=>s.toLowerCase().replace(/\d+/g,n=>n.padStart(99)))(a)>F(b))
Test
let f =
a=>a.sort((a,b)=>(F=s=>s.toLowerCase().replace(/\d+/g,n=>n.padStart(99)))(a)>F(b))
console.log(f([
'Abcd.txt',
'Cat05.jpg',
'Meme00.jpeg',
'abc.txt',
'cat.png',
'cat1.jpeg',
'cat02.jpg',
'meme03-1.jpeg',
'meme03-02.jpg',
'meme04.jpg'
]).join('\n'))
-
\$\begingroup\$ Weird... I wouldn't have thought of a solution like this. I know I never stated to assume strings would have less than a certain amount of characters, but I guess this is valid. Nice job \$\endgroup\$Patrick Roberts– Patrick Roberts2017年01月29日 23:18:51 +00:00Commented Jan 29, 2017 at 23:18
-
\$\begingroup\$ I'm confused why you're changing it when the previous (smaller) solution was acceptable \$\endgroup\$Patrick Roberts– Patrick Roberts2017年01月30日 00:13:05 +00:00Commented Jan 30, 2017 at 0:13
-
1\$\begingroup\$ @PatrickRoberts The previous version failed to compare files such as
[ 'abc', 'cat', 'abcd' ]
. Only numeric elements should be padded with zeros. \$\endgroup\$Arnauld– Arnauld2017年01月30日 00:15:59 +00:00Commented Jan 30, 2017 at 0:15 -
\$\begingroup\$ Good catch, I'll add that to the test case \$\endgroup\$Patrick Roberts– Patrick Roberts2017年01月30日 00:20:30 +00:00Commented Jan 30, 2017 at 0:20
-
\$\begingroup\$ Nit:
padStart
, although widely supported (and therefore legal), isn't part of ES6, is it? \$\endgroup\$Neil– Neil2017年02月04日 15:12:11 +00:00Commented Feb 4, 2017 at 15:12
Jelly, 26 bytes
VμŒlμœ&ØDμ?
e€ØDI0;œṗ8Ç€μÞ
Try it online! (The footer code, ÇY
, just splits the resulting list of strings with line feeds for legibility)
How?
VμŒlμœ&ØDμ? - Link 1, evaluate a substring: substring
μ μ μ? - ... then / else / if ...
ØD - if: decimal digit characters '0123456789'
œ& - multi-set intersection, effectively "all digits"?
V - then: eval, cast the digit strings to integers
Œl - else: convert to lowercase, cast the char strings to lowercase
e€ØDI0;œṗ8Ç€μÞ - Main link: list of filenames
μ - monadic chain separation
Þ - sort the filenames by the key function, given a single filename:
ØD - decimal digit character yield, '0123456789'
e€ - exists in for €ach -> a list of 0s at chars and 1s at digits
I - increments -> 1s and -1s at changeover indexes - 1, 0s elsewhere, 0s
0; - prepend a zero -> 1s and -1s at changeover indexes, 0s elsewhere
8 - left argument, the filename
œṗ - partition at the truthy indexes (the 1s and -1s)
Ç€ - call the last link (1) as a monad for €ach substring
Perl 6, (削除) 50 45 (削除ここまで) 44 bytes
*.sort:{.lc~~/(\D*)*%%0ドル=(\d+)/;map {+$_//~$_},@0}
*.sort:{/(\D*)*%%0ドル=(\d+)/;map {+$_//.lc},@0}
*.sort:{/(\D*)*%0ドル=(\d+)/;map {+$_//.lc},@0}
Expanded:
* # Whatever lambda ( this is the argument )
.sort: # sort by doing the following to each value
{ # bare block lambda with implicit parameter 「$_」
/
( # stored in 0ドル
\D* # any number of non-digits
)* # any number of times
% # separated by
0ドル = ( # stored in 0ドル
\d+ # at least one digit
)
/;
map {
+$_ # try to turn the element into a number
// # if that fails
.lc # lowercase it
},
@0 # for all the elements
}
JavaScript (ES2020), 54 bytes
a=>a.sort(new Intl.Collator('en',{numeric:1}).compare)
Uses the Intl.Collator
builtin to provide the natural sort order.
Test
const f =
a=>a.sort(new Intl.Collator('en',{numeric:1}).compare)
console.log(f([
'Abcd.txt',
'Cat05.jpg',
'Meme00.jpeg',
'abc.txt',
'cat.png',
'cat1.jpeg',
'cat02.jpg',
'meme03-1.jpeg',
'meme03-02.jpg',
'meme04.jpg'
]))