Before 1994, Spanish dictionaries used alphabetical order with a peculiarity: digraphs ll and ch were considered as if they were single letters. ch immediately followed c, and ll immediately followed l. Adding the letter ñ, which follows n in Spanish, the order was then:
a, b, c, ch, d, e, f, g, h, i, j, k, l, ll, m, n, ñ, o, p, q, r, s, t, u, v, w, x, y, z
Since 1994 ll and ch are considered as groups of two letters (l,l and c,h respectively), and thus alphabetical order is the same as in English, with the exception of the letter ñ.
The old order was definitely more interesting.
The challenge
Input a list of zero or more words and output the list sorted according to the old Spanish alphabetical order. Sorting is between words (not between letters within a word). That is, words are atomic, and the output will contain the same words in a possibly different order.
To simplify, we will not consider letter ñ, or accented vowels á, é, í, ó, ú, or uppercase letters. Each word will be a sequence of one or more characters taken from the inclusive range from ASCII 97 (a) through ASCII 122 (z).
If there are more than two l letters in a row, they should be grouped left to right. That is, lll is ll and then l (not l and then ll).
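The left-to-right grouping rule can be illustrated with a small tokenizer. This is only a sketch in Python, not a golfed answer; the helper name old_letters is hypothetical, and it assumes input restricted to a-z as stated above. It relies on the regex engine scanning left to right and trying the digraph alternatives before a single letter.

```python
import re

# Hypothetical helper (not part of the challenge): split a lowercase word
# into the "letters" of the old Spanish alphabet. The alternation tries
# the digraphs ch and ll first, and the scan goes left to right, so a run
# like lll is consumed as ll followed by l.
_OLD_LETTER = re.compile(r"ch|ll|[a-z]")

def old_letters(word):
    """Return the old-alphabet letters of word, in order."""
    return _OLD_LETTER.findall(word)

print(old_letters("lll"))    # ['ll', 'l']
print(old_letters("coche"))  # ['c', 'o', 'ch', 'e']
```

Comparing words letter by letter on these tokens, with ch placed right after c and ll right after l, gives the required order.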
Input format can be: words separated by spaces, by newlines, or any convenient character. Words may be surrounded by quotation marks or not, at your choice. A list or array of words is also acceptable. Any reasonable format is valid; just state it in your answer.
In a similar way, output will be any reasonable format (not necessarily the same as the input).
Code golf, shortest wins.
Test cases
In the following examples words are separated by spaces. First line is input, second is output:
llama coche luego cocina caldo callar calma
caldo calma callar cocina coche luego llama
cuchara cuchillo cubiertos cuco cueva
cubiertos cuco cuchara cuchillo cueva
"Words" can be single letters too:
b c a ch ll m l n
a b c ch l ll m n
or unlikely combinations (remember the rule that l's are grouped left to right):
lll llc llz llll lllz
llc lll lllz llll llz
An empty input should give an empty output:
Of course, this order can be applied to other languages as well:
chiaro diventare cucchiaio
cucchiaio chiaro diventare
all alternative almond at ally a amber
a almond alternative all ally amber at
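For checking answers against the test cases, here is an ungolfed reference sketch in Python. It is one possible approach, not an official solution: the function names are hypothetical, and it assumes words contain only a-z. Each digraph is rewritten so that its second character becomes "{" (ASCII 123, just after z), which makes ch sort after every other c-word prefix and before d, and likewise for ll.

```python
def old_spanish_key(word):
    # Hypothetical sort key: "ch" -> "c{" and "ll" -> "l{".
    # "{" is ASCII 123, one past "z", so the digraph sorts after
    # c + any letter (or l + any letter) but before the next letter.
    # str.replace scans left to right without overlapping matches,
    # so "lll" becomes "l{l", i.e. ll followed by l.
    return word.replace("ch", "c{").replace("ll", "l{")

def old_spanish_sort(words):
    """Return the words sorted in the pre-1994 Spanish order."""
    return sorted(words, key=old_spanish_key)

print(" ".join(old_spanish_sort("lll llc llz llll lllz".split())))
# llc lll lllz llll llz
```

An empty list is returned unchanged, matching the empty-input case.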
It's too late to correct the question now, because it has an answer, but actually rr was a single letter too. I believe it lost its status as a single letter later than ll and ch, so the explanation in Wikipedia is not so much wrong as partial. – Peter Taylor – 2016-03-10T15:13:29.503
"tweo"? filler+ – CalculatorFeline – 2016-03-10T15:19:27.083
@CatsAreFluffy Thanks! corrected – Luis Mendo – 2016-03-10T15:20:28.757
@PeterTaylor Was rr really considered a single letter? I had never heard that http://www.rae.es/consultas/exclusion-de-ch-y-ll-del-abecedario – Luis Mendo – 2016-03-10T15:22:38.290
It was a single letter in the first Spanish dictionary I owned. – Peter Taylor – 2016-03-10T16:03:45.033
@PeterTaylor The official academy (RAE) didn't consider rr a single letter; at least not since 1803. But it's true that apparently it was considered a single letter in the Americas – Luis Mendo – 2016-03-10T16:27:01.647
The Hungarian language has even more such peculiarities, and it didn't abandon them. cs, dz, dzs, gy, ly, ny, sz, ty and zs are all considered single letters. – vsz – 2016-03-10T17:35:31.570
@vsz hmm, with dzs it sounds like it would make for an even more interesting challenge because a single letter replacement wouldn't be sufficient. – p.s.w.g – 2016-03-10T18:31:22.320
@vsz: Hungarian also has ccs for <cs><cs>, which makes things more interesting: https://sourceware.org/bugzilla/show_bug.cgi?id=13547 – ninjalj – 2016-03-10T23:33:56.143
Looks like Hungarian deserves a separate, much more difficult challenge :-) – Luis Mendo – 2016-03-10T23:34:50.517
@ninjalj : indeed, and technically also for ddz, ddzs, ggy, lly, nny, ssz, tty, and zzs, although a few of them are very rarely used. – vsz – 2016-03-11T04:38:49.730
I was not completely right, as the double digraphs (like ccs=cs+cs) have a different rule. And there are other exceptions besides digraphs. And there are a lot of accented vowels. I'll post a new challenge soon. :) – vsz – 2016-03-11T07:17:24.447
The Welsh alphabet has loads of them, and is probably interesting since they're not in (English) alphabetical order, or include all latin characters: a, b, c, ch, d, dd, e, f, ff, g, ng, h, i, j, l, ll, m, n, o, p, ph, r, rh, s, t, th, u, w, y – Algy Taylor – 2016-03-11T09:29:38.067
@DonMuesli: rr was definitely considered a single letter by my teachers and textbooks in Argentina in the 70s. – Martin Argerami – 2016-03-11T12:16:08.983
@DonMuesli : Done: http://codegolf.stackexchange.com/questions/75370/hungarian-alphabetical-order – vsz – 2016-03-11T17:56:41.813