While trying to reverse engineer the ls command, I came across an interesting behavior. When I create three files, foo.png, foopa.png, and fooqa.png, ls lists them in the order foopa.png, foo.png, fooqa.png. I also tried the .gif extension, and the same thing seems to happen when p and q are replaced by the first letter of the extension and the next letter in the alphabet; for .gif that would be g and h (fooga.gif, then foo.gif, then fooha.gif).
Why does it order the output this way?
That's quite interesting. Is there any explanation for this ordering, or a way to configure it beyond (or with more granularity than) the LANG variable? – Mokubai – 2019-08-27T11:30:47.833
Actually you can use LC_COLLATE instead of LANG. See also this – xenoid – 2019-08-27T11:40:38.707
Yikes, that was a terrible collation decision. – chrylis -on strike- – 2019-08-27T15:40:20.493
@chrylis Not necessarily. UTF-8 file names are meant to be used in local languages and to abide by their sorting rules. For instance, in French, "de Gaulle" and "Degaulle" are sorted next to each other (the space doesn't count, and in other names neither does the apostrophe or the dash), and we would expect files named after them to be sorted the same way. The problem here is that the dot has its own meaning in file names, and the expected sort is closer to alphabetic (but IMHO alphabetic isn't perfect for file names either). The extension sort (`-X`) in `ls` is a step in the right direction. – xenoid – 2019-08-27T16:20:18.487
`ls -v` is much more of a step in the right direction – Roman Odaisky – 2019-08-27T23:59:28.273
Don't forget the rather broken implementation of character ranges (en_US.UTF-8, in bash): `touch a A b B c C`, then `ls [a-c]`. Not one person in a thousand will guess that output, even though it's perfectly understandable how it was done! – ubfan1 – 2019-08-28T21:22:30.497
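The ordering in the question is consistent with a collation that does not count punctuation, in the same way xenoid describes the space not counting in French names. Below is a minimal Python sketch of that idea. It is only a simplified model, not how ls or any real locale table is implemented: the assumption is that the dot is skipped when names are first compared, and the function names `byte_order` and `punctuation_blind_order` are hypothetical, made up for this illustration.

```python
def byte_order(names):
    """Sort by raw code points, so '.' (0x2E) sorts before any letter."""
    return sorted(names)


def punctuation_blind_order(names):
    """Sort by letters and digits only (assumed model of the locale's behavior).

    The full name is used as a tie-breaker so the result is deterministic.
    """
    def key(name):
        stripped = "".join(ch for ch in name.lower() if ch.isalnum())
        return (stripped, name)

    return sorted(names, key=key)


png = ["foo.png", "foopa.png", "fooqa.png"]
gif = ["foo.gif", "fooga.gif", "fooha.gif"]

print(byte_order(png))               # ['foo.png', 'foopa.png', 'fooqa.png']
print(punctuation_blind_order(png))  # ['foopa.png', 'foo.png', 'fooqa.png']
print(punctuation_blind_order(gif))  # ['fooga.gif', 'foo.gif', 'fooha.gif']
```

With the dot skipped, foo.png is compared as "foopng" and foopa.png as "foopapng", and "a" sorts before "n", which is why the effect shows up exactly at the first letter of the extension. On a real system, the thing to experiment with is the LC_COLLATE setting mentioned in the comments; setting it to C should give the plain byte order, provided LC_ALL is not overriding it.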