I thought that sort would group lines with a common prefix together, but that doesn't always happen. Take this input, for example:
AT0S*eightieths
AT0S*eyetooth's
AT*ad
AT*Ad
AT*AD
AT*Eydie
AT*eyed
ATF*adv
ATF*ATV
ATF*edify
ATF*Ediva
ATFKT*advocate
ATFKTNK*advocating
ATFKT*outfought
ATFKTS*advocates
ATHT*whitehead
ATHT*Whitehead
AT*id
AT*I'd
AT*Ito
AT*IUD
ATJ*adage
ATNXNS*attention's
ATNXNS*attenuation's
ATNXNS*autoignition's
AT*oat
AT*OD
AT*outweigh
AT*owed
ATP0K*idiopathic
ATP*adobe
ATT*wighted
ATT*witted
ATT*wooded
AT*UT
AT*Uta
AT*wowed
AT*Wyatt
ATX*atishoo
After sorting, I'd expect all the AT* lines to end up in one chunk, but when I run this data through sort, the output is identical to the input. Why is that? I'm not specifying any option to ignore non-alphabetic characters or anything; the command is just sort dict > out.
My version of sort comes from coreutils 8.5-1ubuntu3.
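One way to narrow this down, assuming the cause is locale-aware collation (the question does not say which locale is active, so that is a guess), is to compare the default run with a run forced into the C locale, where sort compares raw bytes. The three sample lines and the helper name sample below are made up for illustration; they only mirror the shape of the real data.

```shell
# Hypothetical three-line sample shaped like the dictionary in the question.
sample() { printf '%s\n' 'ATF*adv' 'AT*ad' 'AT0S*eightieths'; }

# Show which locale governs collation: LC_ALL wins over LC_COLLATE,
# which wins over LANG.
echo "collation locale: ${LC_ALL:-${LC_COLLATE:-${LANG:-unset}}}"

# Default run: uses the locale above, so the order may differ by system.
sample | sort

# Byte-order run: in the C locale '*' (0x2A) sorts before digits and
# letters, so the AT* line comes first, then AT0S*, then ATF*.
c_sorted=$(sample | LC_ALL=C sort)
printf '%s\n' "$c_sorted"
```

If the two runs print different orders, the locale's collation rules are the likely culprit, and LC_ALL=C sort dict > out should group the AT* lines together.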
I can confirm I'm having the exact same problem under Debian, but with commas, and it's driving me crazy. How can you sort CSVs when it behaves like this by default? – Owl – 2019-12-25T18:24:25.233
@Owl Use the proper tool for the job: xsv or csvkit. – Aaron Digulla – 2020-01-07T14:32:52.200
@Aaron Digulla sort is the proper tool for the job; it's just that its default behaviour is non-standard on some distributions – Owl – 2020-01-07T17:34:35.957
Works for me. Maybe an alias somewhere? – Matthieu Cartier – 2010-12-28T12:18:54.430