Here is a list of some common ligatures in Unicode (the ones I could create with my Compose key on Debian):
Orig Ascii Lig
ae [ae] æ
AE [AE] Æ
oe [oe] œ
OE [OE] Œ
ij [ij] ĳ
IJ [IJ] Ĳ
ff [ff] ﬀ
fi [fi] ﬁ
fl [fl] ﬂ
ffi [ffi] ﬃ
ffl [ffl] ﬄ
You have two options in this challenge: use the actual UTF-8 ligatures, or use the ASCII-only variant. If you use the actual UTF-8 ligature variants, you gain a 20% bonus. If you use the ASCII-only variant, you may assume square brackets will never be involved except to signify a ligature.
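For reference, the table above can be written down as data. This is only a sketch: the names `LIGATURES` and `to_ascii_variant` are made up for illustration and are not part of the challenge. The ligatures are given as code point escapes so the listing survives tools that normalize them.

```python
# Expanded letters -> Unicode ligature, copied from the table above.
LIGATURES = {
    "ae": "\u00e6",   # LATIN SMALL LETTER AE
    "AE": "\u00c6",   # LATIN CAPITAL LETTER AE
    "oe": "\u0153",   # LATIN SMALL LIGATURE OE
    "OE": "\u0152",   # LATIN CAPITAL LIGATURE OE
    "ij": "\u0133",   # LATIN SMALL LIGATURE IJ
    "IJ": "\u0132",   # LATIN CAPITAL LIGATURE IJ
    "ff": "\ufb00",   # LATIN SMALL LIGATURE FF
    "fi": "\ufb01",   # LATIN SMALL LIGATURE FI
    "fl": "\ufb02",   # LATIN SMALL LIGATURE FL
    "ffi": "\ufb03",  # LATIN SMALL LIGATURE FFI
    "ffl": "\ufb04",  # LATIN SMALL LIGATURE FFL
}


def to_ascii_variant(text: str) -> str:
    """Rewrite every Unicode ligature in text as its bracketed ASCII form."""
    for letters, ligature in LIGATURES.items():
        text = text.replace(ligature, f"[{letters}]")
    return text
```

With this, `to_ascii_variant("a\ufb03b")` gives `a[ffi]b`, which is how the examples in this post show both variants side by side.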
The challenge: given a string as input, output the same string
- with all original ligatures replaced by their expanded counterparts.
  - match greedily: affib becomes aﬃb (a[ffi]b), not aﬀib (a[ff]ib) or afﬁb (af[fi]b).
- with all "expanded" letter sequences replaced by ligatures.
  - for example, æOEfoo ([ae]OEfoo) becomes aeŒfoo (ae[OE]foo).
Do this completely independently: ﬀi ([ff]i) becomes ffi (ffi), not ﬃ ([ffi]).
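A minimal, ungolfed sketch of the rules stated so far (greedy matching, and both directions applied independently), using the UTF-8 variant. It assumes a single left-to-right regex pass is an acceptable reading of "independently", and it deliberately covers nothing beyond these rules. The names `LIGATURES`, `EXPANSIONS` and `swap` are hypothetical.

```python
import re

# Expanded letters -> Unicode ligature, copied from the list in the question.
LIGATURES = {
    "ae": "\u00e6", "AE": "\u00c6", "oe": "\u0153", "OE": "\u0152",
    "ij": "\u0133", "IJ": "\u0132", "ff": "\ufb00", "fi": "\ufb01",
    "fl": "\ufb02", "ffi": "\ufb03", "ffl": "\ufb04",
}
# Reverse direction: Unicode ligature -> expanded letters.
EXPANSIONS = {ligature: letters for letters, ligature in LIGATURES.items()}

# Longest alternatives first, so "ffi" is tried before "ff" and "fi".
_TOKENS = sorted([*LIGATURES, *EXPANSIONS], key=len, reverse=True)
_PATTERN = re.compile("|".join(re.escape(token) for token in _TOKENS))


def swap(text: str) -> str:
    """Expand ligatures and contract letter sequences in one pass."""
    def replace(match: re.Match) -> str:
        token = match.group(0)
        return LIGATURES[token] if token in LIGATURES else EXPANSIONS[token]

    # re.sub never rescans its own output, so the "ff" produced by
    # expanding U+FB00 is not merged with a following "i".
    return _PATTERN.sub(replace, text)
```

For instance, `swap("\ufb00i")` returns the plain letters `ffi`, not the single ligature U+FB03.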
Sound simple enough? There's a catch: every time two non-ligatures (expanded letter sequences) overlap by exactly one character, both of the ligatures must be inserted into the string. Here are a few test cases to demonstrate:
Input Ascii-output Output
fij [fi][ij] ﬁĳ
fIJ f[IJ] fĲ * remember, capitalization matters!
fffi [ff][ffi] ﬀﬃ
fff [ff][ff] ﬀﬀ
ffffi [ff][ff][ffi] ﬀﬀﬃ
ffffij [ff][ff][ffi][ij] ﬀﬀﬃĳ
Be careful: the same greedy matching applies (note especially the last few test cases).
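One possible reading of the overlap rule, as an ungolfed sketch: scan left to right, take the longest sequence starting at each position, and emit it if it starts at or after the last character of the previously emitted sequence. This reproduces the six test cases above; how it behaves on inputs the test cases do not cover is an assumption. It handles only the letters-to-ligatures direction, and `compose`, `LIGATURES` and `ascii_only` are hypothetical names.

```python
# Expanded letters -> Unicode ligature, copied from the list in the question.
LIGATURES = {
    "ae": "\u00e6", "AE": "\u00c6", "oe": "\u0153", "OE": "\u0152",
    "ij": "\u0133", "IJ": "\u0132", "ff": "\ufb00", "fi": "\ufb01",
    "fl": "\ufb02", "ffi": "\ufb03", "ffl": "\ufb04",
}
# Longest first, so the greedy choice at each position is the first hit.
SEQUENCES = sorted(LIGATURES, key=len, reverse=True)


def compose(text: str, ascii_only: bool = True) -> str:
    """Replace letter sequences by ligatures, keeping one-character overlaps."""
    out = []
    covered_to = 0  # index just past the last emitted sequence
    for i in range(len(text)):
        match = next((s for s in SEQUENCES if text.startswith(s, i)), None)
        if match is not None and i >= covered_to - 1:
            # Starts on the last covered character or later: emit it too.
            out.append(f"[{match}]" if ascii_only else LIGATURES[match])
            covered_to = i + len(match)
        elif i >= covered_to:
            # Ordinary character that no emitted sequence has consumed.
            out.append(text[i])
    return "".join(out)
```

For example, `compose("ffffij")` returns `[ff][ff][ffi][ij]`: the `fi` starting at index 3 is skipped because it lies wholly inside the `ffi` already emitted, while `ij` shares only the `i` and is kept.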
This is code-golf, so the shortest code in bytes wins.
@Mego What's the big deal? If your language of choice cannot handle æ natively, just print 0xc3 0xa6, its UTF-8 encoding. – Dennis – 2015-12-14T03:11:39.687
If a language can't facilitate a given task, don't use that language for that task. That shouldn't be a big deal. – Alex A. – 2015-12-14T03:24:24.310