Parse my Esperanto!

21

The famous constructed language Esperanto uses the Latin alphabet (mostly; see the linked Wikipedia page for details). However, there are some characters with accents: ĉ, ĝ, ĥ, ĵ, ŝ, and ŭ. (C-circumflex, g-circumflex, h-circumflex, j-circumflex, s-circumflex, and u-breve.) Naturally, these characters are very hard to type. Even for this question, I had to search in the Unicode selector for the characters. Due to this, a convention using the letter "x" has been developed for electronic use. For example, "cxu" is used for "ĉu". (Note: the letter "x" is not normally used in the Esperanto alphabet.)

However, I am a language purist! This *air quote* x nonsense is killing me! I need a program to fix this, preferably as short as possible so I can type it into my terminal as fast as possible!

Challenge

Your mission is to take a string of Esperanto using x-convention and convert it to real Esperanto.

In effect, you have to map:

cx: ĉ
gx: ĝ
hx: ĥ
jx: ĵ
sx: ŝ
ux: ŭ
Cx: Ĉ
Gx: Ĝ
Hx: Ĥ
Jx: Ĵ
Sx: Ŝ
Ux: Ŭ

All other printable ASCII characters should be accepted and not changed. Unicode would be nice, but not necessary.

Input and output can be in any format reasonable to your language. Good luck!
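For reference, here is a non-golfed Python sketch of the mapping above. It is only one possible reading of the task; the names X_MAP and from_x_convention are made up for this illustration and are not part of the challenge.

```python
import re

# Hypothetical names for illustration; the table itself is the mapping from the challenge.
X_MAP = dict(zip("cghjsuCGHJSU", "ĉĝĥĵŝŭĈĜĤĴŜŬ"))

def from_x_convention(text):
    # Replace one of the twelve base letters followed by "x" with its accented form.
    # Everything else, including a lone "x", is left untouched.
    return re.sub(r"([cghjsuCGHJSU])x", lambda m: X_MAP[m.group(1)], text)

# A few of the test cases from this post.
CASES = {
    "gxi estas varma": "ĝi estas varma",
    "Uxcxsxabcd(hxSx)efg{};": "Ŭĉŝabcd(ĥŜ)efg{};",
    "xcx": "xĉ",
    "cxx": "ĉx",
}
for given, expected in CASES.items():
    assert from_x_convention(given) == expected
```

Scanning left to right and consuming the letter together with its "x" is what makes "cxx" come out as "ĉx" rather than touching the second "x".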

Testcases

"input" : "output"
_____________
"gxi estas varma" : "ĝi estas varma"
"Cxu sxi sxatas katojn aux hundojn?" : "Ĉu ŝi ŝatas katojn aŭ hundojn?"
"Uxcxsxabcd(hxSx)efg{};" : "Ŭĉŝabcd(ĥŜ)efg{};"
"qwertyuiop" : "qwertyuiop"
" " : " "
"" : ""
"x" : "x"
"xc" : "xc"
"xcx" : "xĉ"
"cxx" : "ĉx"

Scoring

This is code-golf. Answers are scored by smallest byte count in the language's default encoding.

Here is a Stack Snippet to generate both a regular leaderboard and an overview of winners by language.

To make sure that your answer shows up, please start your answer with a headline, using the following Markdown template:

# Language Name, N bytes

where N is the size of your submission. If you improve your score, you can keep old scores in the headline, by striking them through. For instance:

# Ruby, <s>104</s> <s>101</s> 96 bytes

If you want to include multiple numbers in your header (e.g. because your score is the sum of two files or you want to list interpreter flag penalties separately), make sure that the actual score is the last number in the header:

# Perl, 43 + 2 (-p flag) = 45 bytes

You can also make the language name a link which will then show up in the leaderboard snippet:

# [><>](http://esolangs.org/wiki/Fish), 121 bytes


Good luck, have fun, and feel free to suggest improvements!

Clarifications:

  • You only need to worry about printable ASCII characters.

  • You only need to output a character that looks like the correct output. Yes, this means you can tack the accent onto the standard character.

OldBunny2800

Posted 2017-11-28T01:08:20.037

Reputation: 1 379

ASCII here means 20-7E printable characters, 00-7F, or what? – user202729 – 2017-11-28T02:06:24.693

All the printable ones. – OldBunny2800 – 2017-11-28T02:07:11.237

Note: I added a clarification that you can use the letter and the modifier accent. – OldBunny2800 – 2017-11-28T02:15:23.767

Combining circumflex is at 0302 ̂, and combining breve is at 0306 ̆. – user202729 – 2017-11-28T02:23:39.487

^ Each one takes 2 bytes in UTF-8, as TIO counts.

– user202729 – 2017-11-28T02:28:53.730

@user202729 A language purist would most probably hate combining chars, but those are actually easy to type with compose key. – Erik the Outgolfer – 2017-11-28T12:25:25.943

@EriktheOutgolfer what do you mean “compose key”? – OldBunny2800 – 2017-11-28T12:26:54.347

I have to point out that your second test sentence, although grammatically correct, should more likely end with an "n" ("hundojn") – etuardu – 2017-11-28T14:48:33.800

My bad, thanks. Mi pardonpetas. – OldBunny2800 – 2017-11-28T14:51:26.317

Try feeding it the input "Linux" – Arturo Torres Sánchez – 2017-11-28T21:00:29.497

@EriktheOutgolfer Why would a language purist care for different representations of the same grapheme? – Arturo Torres Sánchez – 2017-11-28T21:01:50.230

Answers

9

QuadR, 65 bytes

.x
3::⍵M⋄'ĉĝĥĵŝŭĈĜĤĴŜŬ'['cghjsuCGHJSU'⍳⊃⍵M]

Try it online!

.x replace any char followed by "x" with

3::⍵M upon indexing error, return the match unmodified
⋄ now try:
'ĉĝĥĵŝŭĈĜĤĴŜŬ'[…] index into this string with
  ⍵M the match's
  ⊃ first letter's
  ⍳ index
  'cghjsuCGHJSU' in this string

This is equivalent to the Dyalog APL tacit function:

'.x'⎕R{3::⍵.Match⋄'ĉĝĥĵŝŭĈĜĤĴŜŬ'['cghjsuCGHJSU'⍳⊃⍵.Match]}
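The same "match anything followed by x, look it up, fall back on failure" idea can be sketched in Python. This is a loose analogue and not a translation of the QuadR code; replace_match and convert are hypothetical names, and Python's ValueError stands in for the APL index error.

```python
import re

PLAIN = "cghjsuCGHJSU"
ACCENTED = "ĉĝĥĵŝŭĈĜĤĴŜŬ"

def replace_match(match):
    pair = match.group(0)
    try:
        # str.index raises ValueError when the first letter is not in PLAIN;
        # that plays the role of the trapped index error.
        return ACCENTED[PLAIN.index(pair[0])]
    except ValueError:
        return pair  # return the match unmodified

def convert(text):
    # Any single character followed by "x" is a candidate for replacement.
    return re.sub(r".x", replace_match, text)
```

Matching any character keeps the pattern short; the lookup decides afterwards whether the pair really was one of the twelve digraphs.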

Adám

Posted 2017-11-28T01:08:20.037

Reputation: 37 779

Nice answer! +1 – OldBunny2800 – 2017-11-28T01:28:12.380

I'm not sure how bytes are counted here. Isn't the straightforward use of ⎕R shorter? ('cghjsuCGHJSU',¨'x')⎕r(,¨'ĉĝĥĵŝŭĈĜĤĴŜŬ') – ngn – 2017-11-28T07:26:27.827

@ngn It is, but my battery ran out before I had a chance to post that. – Adám – 2017-11-28T07:35:25.953

6

Retina, 27 bytes

iT`x`̂`[cghjs]x
iT`x`̆`ux

Try it online!

This program is composed of two transliterations. Because the code contains combining characters, it doesn't render too well; the first line should actually look similar to iT`x`^`[cghjs]x, where ^ stands for the circumflex accent combining character. It says to Transliterate (ignoring case) every x in the input into a ^ whenever it follows any letter in [cghjs].


Note: TIO incorrectly measures this code as 25 bytes. Actually, this Retina program uses UTF-8 encoding (other programs can use UTF-32 or ISO 8859-1) and the two combining characters present cost 2 bytes each.
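The combining-character idea can be sketched in Python as follows. This is a sketch under assumptions and not what Retina does internally: the function names are hypothetical, and the NFC step is an optional extra that the answer itself does not perform.

```python
import re
import unicodedata

CIRCUMFLEX = "\u0302"  # combining circumflex accent
BREVE = "\u0306"       # combining breve

def add_combining_marks(text):
    # Keep the base letter and turn the following x into a combining mark.
    text = re.sub(r"([cghjsCGHJS])x", lambda m: m.group(1) + CIRCUMFLEX, text)
    return re.sub(r"([uU])x", lambda m: m.group(1) + BREVE, text)

def precomposed(text):
    # NFC normalization merges each letter with the combining mark that follows it.
    return unicodedata.normalize("NFC", text)
```

Each of the two marks encodes to two bytes in UTF-8, which is why 25 characters of code come to 27 bytes.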

Leo

Posted 2017-11-28T01:08:20.037

Reputation: 8 482

5

C,  173  154 bytes

Thanks to @Colera Su for saving 17 bytes!

p,c,i;f(char*s){for(char*l="cghjsuCGHJSU";p=*s;~c&&putchar(p))for(c=*++s,i=0;c=='x'&&l[i];++i)l[i]-p||write(1,"ĉĝĥĵŝŭĈĜĤĴŜŬ"+i*2,2,c=-1,++s);}

Try it online!

Explanation:

p,c,i;
f(char*s)
{
    // The outer loop and an array of characters that are modified by a trailing 'x'.
    // The array/string is used for getting the index for the accented character later.
    for (char*l="cghjsuCGHJSU";

                                // Store the current character of the input string in 'p'.
                                // If it is '\0', the loop terminates.
                                p=*s;

                                      // The last statement in the loop.
                                      // Unless 'c==-1', it outputs the char stored in 'p'.
                                      ~c&&putchar(p))

        // Store the character following 'p' in 'c' and increment the string pointer.
        for(c=*++s, i=0;

                        // If 'c' is not the letter 'x', the inner loop terminates
                        // immediately. Otherwise it loops through the characters of
                        // string 'l'.
                        c=='x'&&l[i]; ++i)

            // If the character stored in 'p' is found inside the string 'l'...
            l[i]-p ||

                      // ...then print the accented character corresponding to 'p'.
                      // 'i' is the index of 'p' in 'l', and, because the characters
                      // with accents are two bytes each, the index is multiplied by 2.
                      write(1,"ĉĝĥĵŝŭĈĜĤĴŜŬ"+i*2,2,

                      // Finally set 'c' to -1 so that the non-accented character doesn't
                      // get printed too, and increment the string pointer so that the
                      // letter 'x' doesn't get printed either.
                                                    c=-1, ++s);
}
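For readers who find the nested loops hard to follow, the same single pass with one character of look-ahead might look like this in Python. It is an illustrative rewrite, not the author's code; convert is a hypothetical name.

```python
PLAIN = "cghjsuCGHJSU"
ACCENTED = "ĉĝĥĵŝŭĈĜĤĴŜŬ"

def convert(text):
    out = []
    i = 0
    while i < len(text):
        current = text[i]
        # Look one character ahead; past the end there is nothing to compare.
        following = text[i + 1] if i + 1 < len(text) else ""
        if following == "x" and current in PLAIN:
            out.append(ACCENTED[PLAIN.index(current)])
            i += 2  # skip both the letter and the x
        else:
            out.append(current)
            i += 1
    return "".join(out)
```

Here the index into ACCENTED is not doubled, because Python strings are indexed by character, whereas the C version indexes a UTF-8 byte string.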

Steadybox

Posted 2017-11-28T01:08:20.037

Reputation: 15 798

Nice! Can I have an explanation please? – OldBunny2800 – 2017-11-28T02:10:26.543

Probably you can use literal null byte instead of \0? – user202729 – 2017-11-28T02:16:03.580

(but that unfortunately doesn't work on TIO) – user202729 – 2017-11-28T02:17:08.600

You can use write(1,"..."+i*2,2) to save 17 bytes. Try it online!

– Colera Su – 2017-11-28T05:52:12.877

5

Python 3, 81 bytes

lambda s,T="cĉgĝhĥjĵsŝuŭ":eval("s"+".replace('%sx',%r)"*12%(*T+T.upper(),))

Try it online!

Generates and evaluates the string:

s.replace('cx','ĉ').replace('gx','ĝ').replace('hx','ĥ').replace('jx','ĵ').replace('sx','ŝ').replace('ux','ŭ').replace('Cx','Ĉ').replace('Gx','Ĝ').replace('Hx','Ĥ').replace('Jx','Ĵ').replace('Sx','Ŝ').replace('Ux','Ŭ')
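An ungolfed sketch of how that string is built and evaluated, assuming the same table as the lambda above. The helper names build_expression and convert are made up for this illustration.

```python
TABLE = "cĉgĝhĥjĵsŝuŭ"

def build_expression(table=TABLE):
    # Lowercase pairs followed by their uppercase counterparts: 24 characters.
    pairs = table + table.upper()
    # One ".replace('?x','?')" per pair; %s takes the plain letter, %r the quoted accented one.
    template = ".replace('%sx',%r)" * (len(pairs) // 2)
    return "s" + template % tuple(pairs)

def convert(s):
    # Evaluate the generated chain with the input bound to the name s.
    return eval(build_expression(), {"s": s})
```

The uppercase half of the table comes for free from str.upper(), which is where the golfed version saves bytes over spelling out all 24 letters.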

Erik the Outgolfer saved a byte.

xnor

Posted 2017-11-28T01:08:20.037

Reputation: 115 687

@EriktheOutgolfer Nice one, thanks! – xnor – 2017-11-28T19:05:59.820

3

///, 75 bytes

/,/\/\///>/x\,/c>ĉ,g>ĝ,h>ĥ,j>ĵ,s>ŝ,u>ŭ,C>Ĉ,G>Ĝ,H>Ĥ,J>Ĵ,S>Ŝ,U>Ŭ/

Note: Because the OP requests that all printable characters be processed, the "special characters" I choose must not be printable. So I chose tab and newline instead of > and , which does not change my byte count or the code's functionality. The code would look like:

/
/\/\/// /x\
/c  ĉ
g   ĝ
h   ĥ
j   ĵ
s   ŝ
u   ŭ
C   Ĉ
G   Ĝ
H   Ĥ
J   Ĵ
S   Ŝ
U   Ŭ/

However, that requires that the input contain no tabs or newlines.

Try it online!

Because /// can't take input, you should put the input after the code.

Pretty straightforward. I guess it can't be shorter because /// needs special handling of each character.

Explanation:

/,/\/\//       Replace all `,` in the code by `//`
               (two slashes are represented as two backslash-ed slashes)
/>/x\,         (in original code) becomes
/>/x\//        (because `,` is replaced by `//`) - replace all occurrences of
               `>` by `x/`.
/cx/ĉ//gx/ĝ//hx/ĥ//jx/ĵ//sx/ŝ//ux/ŭ//Cx/Ĉ//Gx/Ĝ//Hx/Ĥ//Jx/Ĵ//Sx/Ŝ//Ux/Ŭ/
               ^ The remaining part of the code should look like this.
               Straightforward replacement.
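The two expansion steps can be replayed with plain string replacement in Python. This is only a simulation of the rewriting described above, not a /// interpreter; the variable names are hypothetical.

```python
# The program text that follows the leading /,/\/\// substitution.
golfed_rest = r"/>/x\,/c>ĉ,g>ĝ,h>ĥ,j>ĵ,s>ŝ,u>ŭ,C>Ĉ,G>Ĝ,H>Ĥ,J>Ĵ,S>Ŝ,U>Ŭ/"

# Step 1: the first substitution turns every "," into "//".
after_commas = golfed_rest.replace(",", "//")

# Step 2: the program now starts with />/x\//, which means "replace > by x/".
head = r"/>/x\//"
assert after_commas.startswith(head)
expanded = after_commas[len(head):].replace(">", "x/")

print(expanded)
```

The printed result is the chain of twelve /?x/?/ substitutions shown in the explanation.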

user202729

Posted 2017-11-28T01:08:20.037

Reputation: 14 620

3

Python 3, 95 bytes

f=lambda x,v="cĉgĝhĥjĵsŝuŭCĈGĜHĤJĴSŜUŬ":v and f(x.replace(v[0]+"x",v[1]),v[2:])or x

Try it online!

-10 bytes thanks to WhatToDo
-1 byte thanks to Colera Su

HyperNeutrino

Posted 2017-11-28T01:08:20.037

Reputation: 26 575

96 bytes Try it online!

– WhatToDo – 2017-11-28T02:22:18.303

@user507295 oh smart idea. thanks! – HyperNeutrino – 2017-11-28T02:23:47.170

Use and-or trick to save one byte: Try it online!

– Colera Su – 2017-11-28T02:37:44.013

@ColeraSu oh cool, thanks. not sure why that trick vanished D: – HyperNeutrino – 2017-11-28T03:19:52.203

@HyperNeutrino Because I didn't know about that trick. Sorry! – WhatToDo – 2017-11-28T03:30:22.767

@WhatToDo Ah okay I see :P Don't worry about it! I do that all the time even when I do know the trick :P – HyperNeutrino – 2017-11-28T12:57:08.020

2

Retina, 55 bytes

iT`CG\HJSUcg\hjsux`ĈĜĤĴŜŬĉĝĥĵŝŭ_`[cghjsux]x

Try it online! Non-combining approach. Bytes could be saved if not for the standalone x test cases.

Neil

Posted 2017-11-28T01:08:20.037

Reputation: 95 035

1

Perl 5, 101 + 1 (-p) = 102 bytes

%k=qw/c ĉ g ĝ h ĥ j ĵ s ŝ u ŭ C Ĉ G Ĝ H Ĥ J Ĵ S Ŝ U Ŭ/;$"=join"|",keys%k;s/($")x/$k{$1}/g

Try it online!

Xcali

Posted 2017-11-28T01:08:20.037

Reputation: 7 671

1

JavaScript (ES6), 92 bytes

s=>[..."cghjsuCGHJSU"].reduce((a,v,i)=>a.split(v+"x").join("ĉĝĥĵŝŭĈĜĤĴŜŬ"[i]),s)

Try it online!

Used the split-join method recommended here to reduce the byte count, because the new RegExp(/*blah*/) constructor took up too many bytes.

Comparison:

Original: a.replace(new RegExp(v+"x", "g"), "ĉĝĥĵŝŭĈĜĤĴŜŬ"[i])
New     : a.split(v+"x").join("ĉĝĥĵŝŭĈĜĤĴŜŬ"[i])

Shorter, combining accent approach (63 bytes), but with some artifacts visible.

s=>s.replace(/([cghjs])x/gi," ̂$1").replace(/(u)x/gi," ̌$1");

Footnote: I'm claiming 92 bytes for my answer because the 63-byte solution has artifacts that may affect the output.

Shieru Asakoto

Posted 2017-11-28T01:08:20.037

Reputation: 4 445

1

C, 145 144 bytes

Another C approach. It returns by overwriting the input in place, using the fact that each circumflex / breve letter takes 2 bytes, the same as the letter plus x that it replaces.

-1 byte thanks to Steadybox.

i,t;f(char*s){for(t=1;*s;s++)if(*s^'x')for(i=12,t=1;i--;)t="cghjsuCGHJSU"[i]-*s?t:i*2;else t^1&&memcpy(s-1,"ĉĝĥĵŝŭĈĜĤĴŜŬ"+t,2),t=1;}

Try it online!
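The in-place trick relies on a byte-level coincidence: a letter plus "x" is two bytes, and so is each accented letter in UTF-8. A rough Python analogue on a bytearray, with hypothetical names and not the author's code, could be:

```python
PLAIN = b"cghjsuCGHJSU"
ACCENTED = "ĉĝĥĵŝŭĈĜĤĴŜŬ".encode("utf-8")  # 12 letters, 2 bytes each

def convert_in_place(buf):
    # buf is a bytearray of printable ASCII; its length never changes.
    for i in range(1, len(buf)):
        if buf[i] == ord("x"):
            k = PLAIN.find(buf[i - 1:i])  # -1 if the previous byte is not a base letter
            if k >= 0:
                # Overwrite the letter and the x with the 2-byte accented letter.
                buf[i - 1:i + 1] = ACCENTED[2 * k:2 * k + 2]
    return buf
```

For example, convert_in_place(bytearray(b"cxu")).decode("utf-8") gives "ĉu". Bytes written by an earlier replacement are never ASCII, so they cannot be mistaken for a base letter on a later step.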

Colera Su

Posted 2017-11-28T01:08:20.037

Reputation: 2 291

1

Using t^1&&memcpy(s-1,"ĉĝĥĵŝŭĈĜĤĴŜŬ"+t,2),t=1; instead of t^1?memcpy(s-1,"ĉĝĥĵŝŭĈĜĤĴŜŬ"+t,2),t=1:0; saves one byte. Try it online!

– Steadybox – 2017-11-28T15:59:37.907

1

QuadR, 25 bytes

Combining diacritics edition.

ux
([cghjs])x
 ̆&
 ̂\1

i flag

Try it online!

Replace…

(u)x         u followed by x and
([cghjs])x   any of these letters followed by x …
 ̆\1          by a breve followed by the first group (the u) and
 ̂\1          a circumflex followed by the first group (the letter)

case insensitively

Equivalent to the following Dyalog APL code:

'(u)x' '([cghjs])x'⎕R' ̆\1' ' ̂\1'

Adám

Posted 2017-11-28T01:08:20.037

Reputation: 37 779

Why is this 28 and not 24 bytes? – Erik the Outgolfer – 2017-11-28T21:07:18.550

@EriktheOutgolfer TIO's SBCS counter confused me. Fixed. Thanks. Wait, does that mean I win? – Adám – 2017-11-28T21:36:45.930

Huh, now it looks like it's 27 bytes (copied from TIO), but 24 bytes when copied from here. What is QuadR's encoding, and which is correct?

– Erik the Outgolfer – 2017-11-28T21:40:53.090

@EriktheOutgolfer Both links report 24 on my FFQ/Win10. QuadR uses Dyalog Classic or any Unicode. – Adám – 2017-11-28T23:27:02.097

So it's 24 bytes or what? – Erik the Outgolfer – 2017-11-29T12:22:30.257

1

APL (Dyalog Unicode), 57 bytes

Anonymous tacit function. Usages:

  1. Prefix function to string. This transliterates the string.

  2. Prefix function to list of strings. This transliterates the strings.

  3. Infix function with input file tie number as right argument and output file tie number as left argument. This populates the output file with the transliterated content of the input file.

('cghjsuCGHJSU',¨'x')⎕R(,¨'ĉĝĥĵŝŭĈĜĤĴŜŬ')

()⎕R() PCRE Replace

'cghjsuCGHJSU' these letters

,¨'x' each followed by an x

 … with…

,¨'ĉĝĥĵŝŭĈĜĤĴŜŬ' each of these letters as strings

Try it online!

Adám

Posted 2017-11-28T01:08:20.037

Reputation: 37 779

1

Perl 5, 49 + 2 (-p -C) = 61 51 bytes

s/[CGHJScghjs]\Kx/\x{0302}/g;s/[Uu]\Kx/\x{0306}/g

Try it online!

Saved 10 bytes thanks to Nahuel Fouilleul

DarkHeart

Posted 2017-11-28T01:08:20.037

Reputation: 171

could save 7 bytes: s/[CGHJScghjs]\Kx/\x{0302}/g;s/[Uu]\Kx/\x{0306}/g – Nahuel Fouilleul – 2017-11-28T11:04:42.220

seems it works also just with -C and without -C with warning (Wide character in print) – Nahuel Fouilleul – 2017-11-28T11:11:24.390

From perlrun: -C on its own (not followed by any number or option list), or the empty string "" for the PERL_UNICODE environment variable, has the same effect as -CSDL. – Nahuel Fouilleul – 2017-11-28T11:15:25.693

1

J, 64 63 bytes

rplc((_2]\'ĉĝĥĵŝŭĈĜĤĴŜŬ');~"1'cghjsuCGHJSU',.'x')"0

How it works:

With _2]\ I rearrange the string 'ĉĝĥĵŝŭĈĜĤĴŜŬ' into a 12-row column in order to fit the shape of the other string.

,. adds 'x' to each character of the 'cghjsuCGHJSU' string and makes an array of 12 rows by 2 columns

;~"1 makes a list of boxed pairs of the above; "1 (rank 1) applies it to each row.

┌──┬──┐
│cx│ĉ │
├──┼──┤
│gx│ĝ │
├──┼──┤
│hx│ĥ │
├──┼──┤
│jx│ĵ │
├──┼──┤
│sx│ŝ │
├──┼──┤
│ux│ŭ │
├──┼──┤
│Cx│Ĉ │
├──┼──┤
│Gx│Ĝ │
├──┼──┤
│Hx│Ĥ │
├──┼──┤
│Jx│Ĵ │
├──┼──┤
│Sx│Ŝ │
├──┼──┤
│Ux│Ŭ │
└──┴──┘

rplc uses these boxed items to replace each occurrence of the left boxed item from a pair with the right one.

Try it online!

Galen Ivanov

Posted 2017-11-28T01:08:20.037

Reputation: 13 815

1

R, 75 70 bytes

function(s)gsub('([cghjs])x','\\1\U302',gsub('(u)x','\\1\U306',s,T),T)

Try it online!

-5 bytes thanks to Giuseppe

Explanation

  • gsub('(u)x','\\1\U306',s,T): replace in s every occurrence of an uppercase or lowercase "u" (by using ignore.case=TRUE via the fourth argument T) followed by an "x" with the "u" followed by the Unicode combining breve
  • gsub('([cghjs])x','\\1\U302',gsub('(u)x','\\1\U306',s,T),T): take the result of that and replace every occurrence of an uppercase or lowercase (by using ignore.case=TRUE via the fourth argument T) "c", "g", "h", "j", or "s" followed by an "x" with the letter followed by the Unicode combining circumflex

duckmayr

Posted 2017-11-28T01:08:20.037

Reputation: 441

using argument order rather than naming saves 3 bytes, and another two getting rid of the leading zero in \U0302 and \U0306: Try it online!

– Giuseppe – 2017-11-28T19:18:01.750

@Giuseppe -- great idea, thanks! – duckmayr – 2017-11-28T20:34:10.750

1

Befunge, 2x48 +1 = 99 bytes

>~:1+!#@_:"x"-v>$ 11p0"cghjsuCGHJSU"1\ >\31p11g-v
^ # #, : ++$\ _^#1"x"0*4!-"u"g11*"ʊ"!\_^#!:\*g13<

Try It Out (TIO is super weird about Befunge and I couldn't get any of my solutions to work on it)

How it works

>~:1+!#@_

Gets input and checks if it's the end. End program if it is.

          "x"-v>
^ # #, : ++$\ _^

Checks if the character is an "x". If not, keep a copy of the character and print it.

               >$ 11p0"cghjsuCGHJSU"1\

Store the last character at (1,1). Puts all the characters to check into the stack.

                                       >\31p11g-v
                                      _^#!:\*g13<

Compare the last character against all the values in the stack.

                 1"x"0*4!-"u"g11*"ʊ"!\

Multiply the check (0 or 1) by ʊ (Unicode value 650). Check whether the character was a u (for the breve) and add 4 to the stack if so. Finally, add the ASCII value of x (120) as well. The total adds up to the correct accent if needed or just an "x" if not.

>~:1+!#@_  
^ # #, : ++$\ _^#

Add all the values in the stack together, print it and keep a duplicate. Go back up for the next input.
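The arithmetic behind that sum can be checked with a few lines of Python. This only reproduces the addition described in the explanation; accent_or_x is a hypothetical name, and the two flags stand for the results of the checks done on the stack.

```python
X = ord("x")        # 120
OFFSET = ord("ʊ")   # 650

def accent_or_x(matched, is_u):
    # matched: 1 if the previous character is one of the accentable letters, else 0.
    # is_u: 1 if the previous character was a u, else 0.
    return chr(X + OFFSET * matched + 4 * is_u)
```

120 + 650 = 770 is U+0302 (combining circumflex), 774 is U+0306 (combining breve), and with no match the sum stays at 120, a plain "x".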

Jo King

Posted 2017-11-28T01:08:20.037

Reputation: 38 234

1

Mathematica, 81 bytes or 57 bytes

StringReplace[RemoveDiacritics@#<>"x"->#&/@Characters@"ĉĝĥĵŝŭĈĜĤĴŜŬ"]

It applies a replacement rule for each accented letter: the letter without the hat, together with an "x", is replaced by the accented letter.

Here is an alternative using the added accents character: StringReplace[{"ux"->"ŭ","Ux"->"Ŭ",c_~~"x":>c<>"̂"}]
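The trick of deriving the search patterns from the accented letters themselves can be sketched in Python with Unicode decomposition. This assumes NFD normalization as a stand-in for RemoveDiacritics; strip_diacritics, RULES and convert are hypothetical names.

```python
import unicodedata

ACCENTED = "ĉĝĥĵŝŭĈĜĤĴŜŬ"

def strip_diacritics(letter):
    # NFD splits an accented letter into its base letter plus combining marks,
    # and unicodedata.combining() is nonzero exactly for those marks.
    decomposed = unicodedata.normalize("NFD", letter)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

# One rule per accented letter, e.g. "cx" -> "ĉ".
RULES = {strip_diacritics(letter) + "x": letter for letter in ACCENTED}

def convert(text):
    for pattern, replacement in RULES.items():
        text = text.replace(pattern, replacement)
    return text
```

Only the twelve accented letters need to be written out; the plain letters are computed from them.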

M. Stern

Posted 2017-11-28T01:08:20.037

Reputation: 111

0

CJam, 51 bytes

q"ĉĝĥĵŝŭĈĜĤĴŜŬ""cghjsuCGHJSU".{'x+@\/*}

Try it online!

Explanation:

q                   Read input
"ĉĝĥĵŝŭĈĜĤĴŜŬ"      String literal
"cghjsuCGHJSU"      Another string literal
.{                  Iterate over the strings in parallel
  'x+                 Add an 'x to the normal character
  @                   Rotate to bring the input to the top of stack
  \                   Swap to bring the "cx" to the top
  /                   Split the input on instances of "cx"
  *                   Join the input on instances of the accented character
}
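The split-then-join loop maps almost line for line onto Python. A minimal sketch, with convert as a hypothetical name:

```python
def convert(text):
    # Walk both strings in parallel, like .{ } does.
    for accented, plain in zip("ĉĝĥĵŝŭĈĜĤĴŜŬ", "cghjsuCGHJSU"):
        pieces = text.split(plain + "x")   # split the input on "cx", "gx", ...
        text = accented.join(pieces)       # join the pieces with the accented letter
    return text
```

Splitting on a digraph and joining with its replacement has the same effect as a global replace, which is why this works in languages without a short replace builtin.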

Esolanging Fruit

Posted 2017-11-28T01:08:20.037

Reputation: 13 542

Is this really 39 bytes? I count 39 characters and I don't think CJam have a special encoding. – user202729 – 2017-11-28T02:20:57.483

@user202729 Changed (TIO counted bytes as characters for some reason) – Esolanging Fruit – 2017-11-28T02:23:53.533

Because TIO believes that all golfing languages have a special character code page, and it doesn't bother to check whether all characters are in that code page. – user202729 – 2017-11-28T02:25:10.880

0

sed, 108 bytes

s/cx/ĉ/g
s/gx/ĝ/g
s/hx/ĥ/g
s/jx/ĵ/g
s/sx/ŝ/g
s/ux/ŭ/g
s/Cx/Ĉ/g
s/Gx/Ĝ/g
s/Hx/Ĥ/g
s/Jx/Ĵ/g
s/Sx/Ŝ/g
s/Ux/Ŭ/g

iBug

Posted 2017-11-28T01:08:20.037

Reputation: 2 477

You should format the code as code with `...`, <pre><code>...</code></pre>, or 4 spaces of indentation. – user202729 – 2017-11-28T14:04:23.187

@user202729 I obviously knew that. I was submitting from my Android phone so I didn't format it correctly. – iBug – 2017-11-28T15:20:22.227

This looks like it's 119 bytes long. – Erik the Outgolfer – 2017-11-28T19:11:24.070

0

PowerShell, 58 bytes

It's 54 characters and saving it in PowerShell ISE makes it UTF-8 + BOM for 58 bytes. It doesn't render as nicely in a browser:

$args-replace'(?<=u)x','̆'-replace'(?<=[cghjs])x','̂'

regex replaces the x with the combining Unicode characters from @user202729's comment.

e.g.

PS C:\> .\eo.ps1 "Cxu vi sxatas la cxapelliterojn? Mi ankaux."
Ĉu vi ŝatas la ĉapelliterojn? Mi ankaŭ.

TessellatingHeckler

Posted 2017-11-28T01:08:20.037

Reputation: 2 412

0

Clojure, 126 115 bytes

-11 bytes by changing the replacement map to a partition of a string.

#(reduce(fn[a[f r]](clojure.string/replace a(str f\x)(str r)))%(partition 2"cĉgĝhĥjĵsŝuŭCĈGĜHĤJĴSŜUŬ")) 

A reduction over a map of replacements to look for, and what to replace them with.

Still working on a way to compress the replacement map.

(defn translate [^String esperanto]
  (reduce (fn [acc [f r]] (clojure.string/replace
                            acc ; In the translation so far,
                            (str f \x) ; find the plain character followed by an x
                            (str r))) ; and replace it with the stringified accented char

          esperanto ; Before the reduction happens, the accumulator is the original string

          ; A list of [char-to-find what-to-replace-with] pairs
          (partition 2 "cĉgĝhĥjĵsŝuŭCĈGĜHĤJĴSŜUŬ")))
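For comparison, the same reduction over two-character chunks could be written in Python roughly like this. It is an analogue for illustration only; pairs is a hypothetical helper standing in for partition.

```python
from functools import reduce

TABLE = "cĉgĝhĥjĵsŝuŭCĈGĜHĤJĴSŜUŬ"

def pairs(table):
    # Rough equivalent of (partition 2 ...): consecutive two-character chunks.
    return [table[i:i + 2] for i in range(0, len(table), 2)]

def translate(esperanto):
    # The accumulator starts as the original string; each step applies one replacement.
    return reduce(lambda acc, pair: acc.replace(pair[0] + "x", pair[1]),
                  pairs(TABLE), esperanto)
```

Interleaving the plain and accented letters in one string is what lets a single chunking step produce all twelve replacement pairs.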

Carcigenicate

Posted 2017-11-28T01:08:20.037

Reputation: 3 295

0

JavaScript (ES6), 91 bytes

(i,s='cĉgĝhĥjĵsŝuŭCĈGĜHĤJĴSŜUŬ')=>i.replace(/.x/g,m=>s[1+s.search(m[0])||s]||m)

Try it online!

Patrick Stephansen

Posted 2017-11-28T01:08:20.037

Reputation: 103

0

Scala, 110 bytes

Boring regex solution:

def?(s:String)="(.)x".r.replaceAllIn(s,m=>m.group(0)(0)+(if(m.group(0)(0).toUpper=='U')"\u0306"else"\u0302"))

Old scala solution (116 bytes)

def?(s:String)=s.foldLeft("")((r,c)=>if(c=='x')r.init+r.last+(if(r.last.toUpper=='U')"\u0306"else"\u0302")else r+c)

Ungolfed

def?(s:String)=
  s.foldLeft("")((r,c)=>  // 'Fold' string with empty string as first result
    if(c=='x')            // If current character is x
      r.init+             // Take every character from result but the last
        r.last+           // The last character from result and add
          (if(r.last.toUpper=='U')
            "\u0306"      // combining breve if 'u' or 'U'
          else"\u0302")   // combining circumflex in any other case
 else r+c                 // Otherwise return result + character
)

AmazingDreams

Posted 2017-11-28T01:08:20.037

Reputation: 281

0

JavaScript, 35 chars, 36 bytes

s=>s.replace(/([cghjsu])x/gi,"$1̂")

ericw31415

Posted 2017-11-28T01:08:20.037

Reputation: 2 229

0

sed, 40 bytes (38 chars)

s/([cghjsCGHJS])x/\1̂/g
s/(u|U)x/\1̆/g

Try it online!

I believe this is different enough from iBug's answer.

pizzapants184

Posted 2017-11-28T01:08:20.037

Reputation: 3 174