Case-fold German

9

1

Given a German string and an indication of a case (lower/upper/title), fold the string to that case.

Specifications

  1. Input will consist only of a–z plus äöüß-,.;:!?'" in uppercase and/or lowercase.
  2. The target case may be taken as any three unique values (please specify what they are) of a consistent type; either three characters, or three numbers, or three bit patterns. (Other formats are currently not allowed to prevent "outsourcing" the answer to the case specification. Comment if you think that an additional format should be allowed.)
  3. Titlecase means uppercase everything except letters that follow a letter (letters are a–z plus äöüß).

Gotchas

  1. When ß needs to be uppercase, it must become ẞ. Some case-folding built-ins and libraries do not do this.

  2. When ß needs to be titlecase, it must become Ss. Some case-folding built-ins and libraries do not do this.

  3. ss may occur in the text, and should never be converted to ß or ẞ.
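
A minimal Python 3 sketch of gotchas 1 and 3, assuming CPython's built-in `str.upper` (which applies Unicode full case mapping). The variable names are illustrative only. It shows why the substitution has to happen before uppercasing rather than as a clean-up afterwards:

```python
SAMPLE = "ßß ss"

# The built-in expands each ß to the two letters SS (gotcha 1).
builtin_upper = SAMPLE.upper()                 # "SSSS SS"

# Patching the result afterwards cannot tell an expanded ß from a genuine
# "ss", so the real "ss" is damaged as well (gotcha 3).
naive_fix = builtin_upper.replace("SS", "ẞ")   # "ẞẞ ẞ"

# Swapping ß for ẞ first keeps the two cases apart; ẞ is already uppercase,
# so upper() leaves it alone.
safe_upper = SAMPLE.replace("ß", "ẞ").upper()  # "ẞẞ SS"

print(builtin_upper, naive_fix, safe_upper, sep="\n")
```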

Examples

Upper case die Räder sagen "ßß ss" für dich, wegen des Öls!
is DIE RÄDER SAGEN "ẞẞ SS" FÜR DICH, WEGEN DES ÖLS!

Lower case die Räder sagen "ßß ss" für dich, wegen des Öls!
is die räder sagen "ßß ss" für dich, wegen des öls!

Title case die Räder sagen "ßß ss" für dich, wegen des Öls!
is Die Räder Sagen "Ssß Ss" Für Dich, Wegen Des Öls!
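
An ungolfed reference sketch of the three foldings in Python 3, checked against the examples above. The function name `fold_german` and the case codes `"l"`, `"u"`, `"t"` are assumptions made for this illustration; the challenge allows any three characters, numbers or bit patterns.

```python
import re

# Letters as defined in the specification, in both cases.
LETTER_RUN = re.compile(r"[a-zäöüßA-ZÄÖÜẞ]+")


def fold_german(text: str, case: str) -> str:
    """Fold text to lower ("l"), upper ("u") or title ("t") case."""
    if case == "l":
        # The built-in already maps ẞ to ß.
        return text.lower()
    if case == "u":
        # Replace ß first so that upper() does not expand it to SS.
        return text.replace("ß", "ẞ").upper()
    if case == "t":
        def title_run(match: re.Match) -> str:
            run = match.group(0)
            head, tail = run[0], run[1:]
            # A leading ß (or ẞ) becomes Ss; later letters are lowercased.
            head = "Ss" if head in "ßẞ" else head.upper()
            return head + tail.lower()
        return LETTER_RUN.sub(title_run, text)
    raise ValueError(f"unknown case code: {case!r}")


sample = 'die Räder sagen "ßß ss" für dich, wegen des Öls!'
print(fold_german(sample, "u"))
print(fold_german(sample, "l"))
print(fold_german(sample, "t"))
```

Treating each run of letters as one unit means an apostrophe or hyphen starts a new run, which matches specification 3 (only letters that follow a letter stay lowercase).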

Adám

Posted 8 years ago

Reputation: 37 779

What would be the outputs for Ss? Also, the example input is missing a ss – Rod – 8 years ago

@Rod SS Ss ss. Can you tell me why that's unclear? – Adám – 8 years ago

Related – Poke – 8 years ago

Am I allowed to make the three unique values Python functions? (see my answer) – HyperNeutrino – 8 years ago

No, that's exactly what I intended to prevent by specifying that the three unique values must be either characters, numbers or bit patterns. – Adám – 8 years ago

May we assume there are no capitals after the first letter in a word? – darrylyeo – 8 years ago

May we take input and give output in an ANSI codepage containing these symbols instead of UTF-8 etc.? – Οurous – 8 years ago

@Οurous Does such a codepage exist? If so, then I guess that's valid by default. At least as long as your program can fit in the codepage and your interpreter/compiler will run/compile such files. – Adám – 8 years ago

@HyperNeutrino I think that's a standard loophole actually, taking functions as inputs. I don't remember where the meta is tho – Conor O'Brien – 8 years ago

@ConorO'Brien Oh huh. That would make sense I guess. – HyperNeutrino – 8 years ago

@HyperNeutrino Relevant meta post ConorO'Brien was most likely talking about. EDIT: You were actually the first one to comment on that meta post, I now noticed. ;) – Kevin Cruijssen – 8 years ago

@KevinCruijssen Thanks. And huh that's interesting. I do remember reading and commenting on that meta post but I guess that the time I was trying to use that as a solution I didn't consider that :P – HyperNeutrino – 8 years ago

But... gotcha #1 isn’t making sense, uppercase ß is just SS, a friend of mine has a 4 letter lowercase name and a 5 letter uppercase...? – Stan Strum – 8 years ago

@StanStrum Encoding has nothing to do with names. ß is a two-letter ligature. – Adám – 8 years ago

@Adám not quite what I meant, but i see where you’re going. Just never really see an upper ß – Stan Strum – 8 years ago

@StanStrum It has been in use for over half a century. – Adám – 8 years ago

Answers

6

Japt, 42 40 bytes

Saved 2 bytes thanks to @Oliver

r'ßQ=7838d)u mV,@W¦vW=X ?Xv :X¥Q?"Ss":Xu

Whew, that took quite some effort. Input is the string to convert, and a single character: u for uppercase, v for lowercase, m for title case.

Test it online!

ETHproductions

Posted 8 years ago

Reputation: 47 880

Do you need the }0? – Oliver – 8 years ago

@Oliver Yeah, otherwise it'll... wait, maybe not... – ETHproductions – 8 years ago

4

Python 3, 92 bytes

lambda s,c:[str.lower,str.upper,str.title][c](s.replace("ẞ","ß").replace("ß"*c,"ẞ"*c))

Try it online!

HyperNeutrino

Posted 8 years ago

Reputation: 26 575

Oh no! Unfortunately, that's a significant part of the challenge. – Erik the Outgolfer – 8 years ago

@EriktheOutgolfer fixed, thanks – HyperNeutrino – 8 years ago

3

Jelly, 50 bytes

⁽ñWỌ”ß;y⁸Œu
Ñ⁾SsÇ⁼?€1¦”ß
Œl
Çe€“Ġẹṇṣ‘ỌÇ;Øa¤Œg⁸ṁ⁹Ŀ€

Try it online!

Full program.

Phew, this took much time to golf...

Argument 1: String (may need to be escaped)
Argument 2: 1 for uppercase, 2 for title case, 3 for lowercase.

Erik the Outgolfer

Posted 8 years ago

Reputation: 38 134

3

05AB1E, 23 bytes

Uses 0 = lower, 1 = upper, 2 = title

•^ŠX•4ôçIiR}`:"lu™"¹è.V

Try it online!

Emigna

Posted 8 years ago

Reputation: 50 798

1

Clean, 649 279 275 274 246 bytes

Yes, that's 123 122 94 bytes of imports, which is already longer than every other answer.

from StdList import++,map,flatten
import StdLib,StdInt,StdBool,Text.Unicode,Text.Unicode.UChar
$ =fromInt
? =isAlpha
^ =toUpper
@0s=map^s
@1s=map toLower s
@2s=flatten(map(\[h:t]=if($223==h||h> $999)[$83,$115][^h]++ @1t)(groupBy(\a b= ?a== ?b)s))

Try it online!

Defines the function @, taking an Int and a UString, and returning a UString.
Conveniently, UString (Clean's default way of handling Unicode) is just a type alias for [Int], that is, a list of Int holding the Unicode code points of the characters in the string.
Inconveniently, Text.Unicode.UChar is really long, and I can't import StdEnv because the definitions in StdChar conflict with the definitions in Text.Unicode.UChar (as they are not intended for use together).

The three values are 0, 1, and 2 for Upper, Lower, and Title case.
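
For readers who don't know Clean, the title-case branch can be paraphrased in Python 3 roughly as follows. This is a sketch of the same idea (split the code-point list into runs of letters and non-letters, then treat the head of each run specially), not a translation of the golfed code. The name `title_by_groups` is made up for the illustration, and the ß test is written out explicitly instead of using the `> 999` shortcut.

```python
from itertools import groupby


def title_by_groups(code_points):
    """Title-case a list of Unicode code points, returning a new list."""
    out = []
    # Consecutive code points are grouped by "is a letter" vs "is not".
    for _, run in groupby(code_points, key=lambda cp: chr(cp).isalpha()):
        head, *tail = run
        if head in (0xDF, 0x1E9E):          # ß (223) or ẞ (7838)
            out += [ord("S"), ord("s")]
        else:
            out += [ord(c) for c in chr(head).upper()]
        # The rest of the run is lowercased; punctuation is unaffected.
        out += [ord(c) for c in "".join(map(chr, tail)).lower()]
    return out


sample = 'die Räder sagen "ßß ss" für dich, wegen des Öls!'
print("".join(map(chr, title_by_groups([ord(c) for c in sample]))))
```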

Οurous

Posted 8 years ago

Reputation: 7 916