Uncollapse digits

72

5

Task

Given a string of English names of digits “collapsed” together, like this:

zeronineoneoneeighttwoseventhreesixfourtwofive

Split the string back into digits:

zero nine one one eight two seven three six four two five

Rules

  • The input is always a string. It always consists of one or more lowercase English digit names, collapsed together, and nothing else.

    • The English digit names are zero one two three four five six seven eight nine.
  • The output may be a list of strings, or a new string where the digits are delimited by non-alphabetic, non-empty strings. (Your output may also optionally have such strings at the beginning or end, and the delimiters need not be consistent. So even something like {{ zero0one$$two ); is a valid (if absurd) answer for zeroonetwo.)

  • The shortest answer in bytes wins.

Test cases

three -> three
eightsix -> eight six
fivefourseven -> five four seven
ninethreesixthree -> nine three six three
foursixeighttwofive -> four six eight two five
fivethreefivesixthreenineonesevenoneeight -> five three five six three nine one seven one eight
threesevensevensixninenineninefiveeighttwofiveeightsixthreeeight -> three seven seven six nine nine nine five eight two five eight six three eight
zeroonetwothreefourfivesixseveneightnine -> zero one two three four five six seven eight nine
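
For readers who want a baseline to test golfed answers against, here is an ungolfed reference sketch in Python. It is not part of the challenge; the names `DIGIT_NAMES`, `PATTERN` and `uncollapse` are made up for illustration. It relies on the fact that no digit name is a prefix of another, so a left-to-right scan is unambiguous.

```python
import re

# The ten digit names from the rules. No name is a prefix of another,
# so at every word boundary exactly one alternative can match.
DIGIT_NAMES = "zero one two three four five six seven eight nine".split()
PATTERN = re.compile("|".join(DIGIT_NAMES))


def uncollapse(s):
    """Split a collapsed string of digit names into a list of names."""
    words = PATTERN.findall(s)
    # Sanity check: the matches must cover the whole input.
    if "".join(words) != s:
        raise ValueError("input is not a concatenation of digit names")
    return words


print(uncollapse("eightsix"))  # ['eight', 'six']
```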

Lynn

Posted 2017-11-20T17:00:02.693

Reputation: 55 648

28

This is an excellent challenge! The task is extremely easy to understand and verify, but the right approach to use isn't very obvious. And choosing the right approach could make a huge difference in score. +1 :) – James – 2017-11-20T17:27:22.523

1

After thinking this up, I remembered a similar, but more simplistic challenge on anarchy golf: yesno! It sparked some amazing C answers. I hope to see one of those soon :)

– Lynn – 2017-11-20T17:41:43.243

I do not think my C answer qualifies as such, but hopefully it's a starting point for others with a more twisted sense of humor than myself. – Michael Dorgan – 2017-11-20T20:29:46.017

I'm pretty sure I've seen this same challenge, but where you're supposed to print the actual number. I'm almost certain it was also posted by, you, Lynn; but I've lost the link, hook me up with it? – Magic Octopus Urn – 2017-11-20T22:21:44.950

That wouldn’t have been mine, sorry. Do you mean the challenge was to turn one two three into one hundred and twenty-three?

– Lynn – 2017-11-20T22:37:08.893

Can the result be a List of Match objects? (Which saves stringifying the result) – Brad Gilbert b2gills – 2017-11-20T23:13:14.023

@BradGilbertb2gills What’s the string representation of such a list in your language? – Lynn – 2017-11-20T23:15:51.570

If you turn a List of Matches into a Str, it will space separate the strings. code('eightsix').Str → 'eight six' – Brad Gilbert b2gills – 2017-11-20T23:55:12.160

@BradGilbertb2gills I’d say leave the .Str in, then. – Lynn – 2017-11-21T00:55:49.113

3

@MichaelDorgan (or any other C coders), you may want to have a look at the algorithm I used in my Befunge answer. A straight conversion of that to C got me a 104 byte solution, which I think beats all of the existing C answers. I'm willing to bet that could be improved upon by someone with more C golfing skills. – James Holderness – 2017-11-21T03:06:05.093

My straight conversion is around 120 bytes and doesn't handle the \0 char. You are right though, this will be smaller. – Michael Dorgan – 2017-11-21T19:00:20.677

Answers

29

Retina, 20 bytes

!`..[eox]|[tse]?....

Try it online!

Uriel

Posted 2017-11-20T17:00:02.693

Reputation: 11 708

17

C (gcc), 89 80 76 75 72 71 70 69 bytes

f(char*s){*s&&f(s+printf(" %.*s",""[(*s^s[2])%12],s)-1);}

Try it online!

(89) Credit to gastropner for the XOR hash.
(76) Credit to Toby Speight for the idea of using 1st and 3rd.
(75) Credit to Michael Dorgan for '0' → 48.
(72) Credit to Michael Dorgan and Lynn for literals with control characters.
(69) Credit to Lynn for x?y:0 → x&&y

f (char *s) {        /* K&R style implicit return type. s is the input. */
    *s&&f(           /* Recurse while there is input. */
        s+printf(    /* printf returns the number of characters emitted. */
            " %.*s", /* Prefix each digit string with a space. Limit
                      * how many bytes from the string to print out. */
            "\04\01\05\03\04\05\05\04\04\01\03\03"
                     /* Magic hash table, where the value represents
                      * the length of the digit string. The golfed
                      * version stores these twelve bytes as raw
                      * control characters instead of octal escapes. */
            [(*s^s[2])%12],
                     /* The XOR hash (mod 12) */
            s)       /* The current digit. */
            -1);}    /* Subtract 1 for the space. */
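
The XOR hash can be checked outside C as well. The following Python sketch is only an illustration (the names `LENGTHS` and `uncollapse_xor` are invented here); it copies the twelve table values from the comment above and looks up each word length from the first and third characters.

```python
# Word lengths indexed by (first_char XOR third_char) % 12, copied from
# the table in the annotated C code. Slots 1 and 9 are never hit by a
# valid digit name; they are just filler.
LENGTHS = [4, 1, 5, 3, 4, 5, 5, 4, 4, 1, 3, 3]


def uncollapse_xor(s):
    """Split collapsed digit names using the XOR hash (mod 12)."""
    data = s.encode("ascii")
    out = []
    i = 0
    while i < len(data):
        # Every digit name has at least three letters, so i + 2 is valid.
        n = LENGTHS[(data[i] ^ data[i + 2]) % 12]
        out.append(s[i:i + n])
        i += n
    return out


print(uncollapse_xor("fivefourseven"))  # ['five', 'four', 'seven']
```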

jxh

Posted 2017-11-20T17:00:02.693

Reputation: 331

11

Python 2, 50 bytes

import re
re.compile('..[eox]|[tse]?....').findall

Try it online!

-3 thanks to Lynn.
-4 thanks to Uriel's answer's regex.
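
Why the regex works is not spelled out, so here is a short sketch of the reasoning in Python 3 (the names `SPLITTER` and `split_digits` are illustrative only). The first alternative `..[eox]` matches exactly the three-letter names, since one, two and six are the only names whose third letter is e, o or x. Otherwise `[tse]?....` takes five characters when the word starts with t, s or e (three, seven, eight) and four characters for zero, four, five and nine.

```python
import re

SPLITTER = re.compile(r"..[eox]|[tse]?....")


def split_digits(s):
    """Return the digit names found by the length-based regex."""
    return SPLITTER.findall(s)


# Each name on its own must come back as a single match.
for word in "zero one two three four five six seven eight nine".split():
    assert split_digits(word) == [word]

print(split_digits("zeronineoneoneeighttwo"))
# ['zero', 'nine', 'one', 'one', 'eight', 'two']
```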

Erik the Outgolfer

Posted 2017-11-20T17:00:02.693

Reputation: 38 134

3

Nice! import re;re.compile('…').findall should save a couple of bytes. I did expect this to turn into regex golf :) – Lynn – 2017-11-20T17:11:16.940

@Lynn Hold on, wait until I'm done! :-P EDIT: It's 3 bytes, actually. – Erik the Outgolfer – 2017-11-20T17:12:06.237

@Lynn Also, you should've turned this to [tag:code-golf] [tag:regular-expression] instead. ;) – Erik the Outgolfer – 2017-11-20T17:24:59.673

I’m holding out for a C answer, which will be very interesting! – Lynn – 2017-11-20T17:33:03.310

9

Befunge, 87 85 81 76 bytes

<*"h"%*:"h"$_02g-v1$,*<v%*93,:_@#`0:~
"@{&ruX;\"00^ !: _>_48^>+:"yp!"*+%02p0

Try it online!

Befunge doesn't have any string manipulation instructions, so what we do is create a kind of hash of the last three characters encountered, as we're processing them.

This hash is essentially a three digit, base-104 number. Every time a new character is read, we mod the hash with 104² (10816) to get rid of the oldest character, multiply it by 104 to make space for the new character, then add the ASCII value of the new character mod 27 (to make sure it doesn't overflow).

For comparison purposes, we take this value mod 3817, write it into memory (thus truncating it to 8 bits), which results in smaller numbers that are easier for Befunge to handle. The hashes we then have to compare against are 0, 38, 59, 64, 88, 92, 114, 117, and 123. If it matches any of those, we know we've encountered a character sequence that marks the end of a number, so we output an additional space and reset the hash to zero.

If you're wondering why base 104, or why mod 3817, those values were carefully chosen so that the hash list we needed to compare against could be represented in as few bytes as possible.
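
The hashing scheme described above can be sketched in Python as follows. This is an illustration of the description, not a translation of the Befunge source; the names `END_HASHES` and `uncollapse_rolling` are invented, and the 8-bit truncation is modelled with a bit mask.

```python
# Truncated hash values that mark the last character of a digit name,
# taken from the list in the explanation above.
END_HASHES = {0, 38, 59, 64, 88, 92, 114, 117, 123}


def uncollapse_rolling(s):
    """Split collapsed digit names with the base-104 rolling hash."""
    words, current, h = [], "", 0
    for ch in s:
        current += ch
        # Drop the oldest character, shift, and add the new one (mod 27).
        h = (h % 104**2) * 104 + ord(ch) % 27
        # Reduce mod 3817, then keep only the low 8 bits.
        if ((h % 3817) & 0xFF) in END_HASHES:
            words.append(current)
            current, h = "", 0  # reset the hash for the next word
    return words


print(uncollapse_rolling("ninethreesixthree"))
# ['nine', 'three', 'six', 'three']
```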

James Holderness

Posted 2017-11-20T17:00:02.693

Reputation: 8 298

Quite honestly, this looks like bakemoji (ばけもじ) to me. Wow. The algorithm description is nice though and I will contemplate it. – Michael Dorgan – 2017-11-21T18:27:42.103

^, I remember seeing the term as mojibake (もじばけ). How did you find those numbers (base 104, mod 3817), @JamesHolderness ? – Zacharý – 2017-11-22T22:31:57.097

@Zacharý I wrote a little Python script that tested different base and mod combinations to find the ones that would produce the correct results when run against all the expected inputs. Once I knew which combinations worked, I ran the resulting hash outputs through a Befunge number generator to find out which produced the shortest code. – James Holderness – 2017-11-23T00:32:00.970

6

C (gcc), 179 159 146 139 137 116 107 103 102 bytes

Edit 1: (Added suggestions from Mr. Xcoder - thanks! - My macro version was same size as yours, but I like yours better.)

Edit 2: Changed char individual compares to calls to strchr()

Edit 3: K&R's the var declarations (Eww!)

Edit 4: When 1 macro is not enough...

Edit 5: Redone with new algorithm suggested above. Thanks to James Holderness for this great idea!

Edit 6: Removed 0 set as it seems to go there automatically - Master level code golf techniques used (commas, printf trick, etc.) - thanks gastropner!

Edit 7: Use memchr and fixed a bug pointed out by James Holderness.

Edit 8: Use && on final check to replace ? - thanks jxh.

c,h;f(char*s){while(c=*s++)putchar(c),h=h%10816*104+c%27,memchr("&;@X\\ru{",h%3817,9)&&putchar(h=32);}

Try it online!

Non-golfed (Which is still very golfy honestly...)


int c;
int h;
void f(char*s)
{
    while(c=*s++)
        putchar(c),
        h=h%10816*104+c%27,
        memchr("&;@X\\ru{",h%3817,9)?putchar(h=32):1;
}

Old, straightforward grep-esque solution:

#define p putchar
#define q c=*s++
c,x;f(char*s){while(q){p(c);x=strchr("tse",c);p(q);p(q);if(!strchr("eox",c)){p(q);if(x)p(q);}p(' ');}}

Old, cleaner version.

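// Above code makes a macro of putchar() call.

#include <stdio.h>
#include <string.h>

void f(char *s)
{
    char c;
    while((c = *s++))
    {
        putchar(c);
        /* Non-NULL when the word starts with 't', 's' or 'e'. */
        char *x = strchr("tse", c);

        putchar(*s++);
        putchar(c=*s++);

        /* Third letter 'e', 'o' or 'x' marks a three-letter word. */
        if(!strchr("eox", c))
        {
            putchar(*s++);
            if(x)
            {
                putchar(*s++);
            }
        }
        putchar(' ');
    }
}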

Try it online!

Michael Dorgan

Posted 2017-11-20T17:00:02.693

Reputation: 221

We can macro the putchar and such for a few bytes, but in general, still thinking about a better algorithm if possible. – Michael Dorgan – 2017-11-20T20:17:48.813

159 bytes by #defineing putchar and removing a pair of unnecessary brackets. – Mr. Xcoder – 2017-11-20T20:47:45.570

2

A bit ugly, but 136 bytes by using #define p putchar( instead (note the open parenthesis).

– Tom Carpenter – 2017-11-21T09:11:26.957

1

109 bytes: c,h=0;f(char*s){while(c=*s++)putchar(c),h=h%10816*104+c%27,c=h%3817,printf(" "+!(c&&strchr("&;@X\\ru{",c)));} – gastropner – 2017-11-21T22:21:51.217

Ah, the printf trick I saw below plus removal of a couple parenthesis and braces. Master level code golf enabled :) – Michael Dorgan – 2017-11-21T22:47:07.350

Changing the hash to one that doesn't give \ or \0 as values would save another 3 or 4 bytes too due to removal of 0 check and escape, at cost of up to 2 more chars to check against. – Michael Dorgan – 2017-11-22T00:08:16.337

This is why I had thought of changing the hash - to get rid of that assignment and extra check. I'll make another go of it shortly. – Michael Dorgan – 2017-11-22T17:03:06.403

Lynn showed me: x?y:0 → x&&y. Even though you evaluate to 1, you are just throwing it away, so you can save a byte. – jxh – 2017-11-30T20:17:08.877

Thanks. Mine doesn't touch the xor version above, but I'll take it. – Michael Dorgan – 2017-11-30T21:43:27.603

@MichaelDorgan: Well, you helped quite a bit with it. – jxh – 2017-11-30T21:54:56.210

:) Kinda fun to see that list of numbers shrink over time. Addictive even. Still, you get the crown for the superior algorithm. – Michael Dorgan – 2017-11-30T21:59:37.037

6

Java (OpenJDK 8), 55 46 43 bytes

Saving 9 bytes thanks to Forty3/FrownyFrog

Saving 3 bytes thanks to Titus

s->s.replaceAll("one|tw|th|f|z|s|.i"," $0")

Try it online!

edit: Thank you for the welcome and explanation of lambdas!

Luca H

Posted 2017-11-20T17:00:02.693

Reputation: 163

3

Hi, welcome to PPCG! Great first answer, and it indeed works. Here is the TIO link for it. Lambdas can be created in multiple ways. Here is another TIO with some lambdas with added comments so you can see how to create them yourself. (I suggest copying it to Eclipse so you can see the highlighting of the code.) Also, Tips for golfing in Java and Tips for golfing in all languages might be interesting to read. Enjoy your stay! :)

– Kevin Cruijssen – 2017-11-21T10:55:18.130

@KevinCruijssen thank you! I am honestly surprised that Java is shorter than JavaScript. Usually when I am reading challenges, JS is a lot shorter. – Luca H – 2017-11-21T11:22:54.173

JavaScript should be 2 bytes shorter (g regex suffix instead of All). – Neil – 2017-11-21T13:29:30.883

@Neil it is longer here because it is using f=(s)=> instead of s->, which is 4 bytes shorter. – Luca H – 2017-11-21T13:42:41.160

1

@LucaH - per FrownyFrog's suggestion, you can reduce a few of your two-letter strings to single characters: z|f|s instead of ze|fo|fi|si|se/ – Forty3 – 2017-11-21T14:02:39.613

@Forty3 thanks for the advice, good call. I wanted to have everything unique, but each pattern simply must not occur anywhere else. – Luca H – 2017-11-21T14:20:01.843

on|t[wh]|.i|[fsz] (-4 bytes) – Titus – 2017-11-21T15:06:48.647

@Titus on won't work because it will trigger on "zeronine". t[wh] does not save any bytes, but .i does, so I used it. Thank you for that. – Luca H – 2017-11-21T22:13:45.517

5

Retina, 24 23 bytes

!`..[eox]|[fnz]...|.{5}

Try it online! Edit: Saved 1 byte thanks to @FrownyFrog.

Neil

Posted 2017-11-20T17:00:02.693

Reputation: 95 035

1

would ..... -> .{5} work? – FrownyFrog – 2017-11-22T22:11:25.263

5

JavaScript, 66 57 52 44 41 bytes

s=>s.replace(/one|t[wh]|.i|[fsz]/g," $&")

Pretty naive, but it works.

Nice catch by FrownyFrog to use 2 chars, except for "one", where a pure 2 char check would mess up zeronine. Edit: the single f and s were good catches by FrownyFrog that I overlooked in my first two golfs.

Thanks, Neil, for the suggestion of an unnamed lambda and being able to use a single char for z gets down to 52.

Titus comes up with a smaller RegEx. I feel we are heading toward Uriel's regex eventually.

Forty3

Posted 2017-11-20T17:00:02.693

Reputation: 341

Does it break if you use two characters and push 'on' to the end? – FrownyFrog – 2017-11-20T23:58:50.343

I'm thinking z|tw|th|f|s|ei|ni|on – FrownyFrog – 2017-11-21T00:03:01.823

@FrownyFrog will fail for "twonine" ("on" in middle) – Uriel – 2017-11-21T00:08:55.740

@Uriel so 'ni' doesn't take priority if you put it on the left of 'on'? – FrownyFrog – 2017-11-21T00:11:21.027

1

@FrownyFrog o comes first so it is recognized first. – Uriel – 2017-11-21T00:21:22.717

@Uriel I see, thanks. – FrownyFrog – 2017-11-21T00:22:00.623

The f=() around the s is unnecessary (unnamed lambda is acceptable). – Neil – 2017-11-21T13:28:40.643

1

on|t[wh]|.i|[fsz] (-4 bytes) – Titus – 2017-11-21T15:06:55.213

2

@Titus - Unfortunately, the on| will match zeronine rendering zer onine – Forty3 – 2017-11-21T15:13:21.793

s=>s.replace(/zero|one|two|three|four|five|six|seven|eight|nine/g, "$& ") this is naive, but straightforward. – Alan Dong – 2017-11-29T22:29:40.653

Try s=>s.replace(/..[eox]|[est]?.{4}/g,"$& ") instead, still 41 B though – Ephellon Dantzler – 2017-11-30T22:56:09.143

5

C, 103 99 bytes

char*r="f.tzuonresn.xgv";f(char*s){*s&&f(s+printf("%.*s ",(strrchr(r,s[2])-strchr(r,*s))%10,s)-1);}

This works for any character encoding (including awkward ones like EBCDIC), because it doesn't use the numeric value of the input characters. Instead, it locates the first and third letters in a magic string. The distance between these indicates how many letters to advance with each print.
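
To make the "distance in a magic string" idea concrete, here is a small Python sketch. It is an illustration only (the names `MAGIC`, `word_length` and `uncollapse_magic` are invented); it mirrors `strchr` with `str.index` (first occurrence) and `strrchr` with `str.rindex` (last occurrence).

```python
MAGIC = "f.tzuonresn.xgv"


def word_length(first, third):
    """Distance from the first letter's first occurrence to the third
    letter's last occurrence in MAGIC, reduced mod 10."""
    return (MAGIC.rindex(third) - MAGIC.index(first)) % 10


def uncollapse_magic(s):
    """Split collapsed digit names without using character codes."""
    out, i = [], 0
    while i < len(s):
        n = word_length(s[i], s[i + 2])
        out.append(s[i:i + n])
        i += n
    return out


print(uncollapse_magic("foursixeighttwofive"))
# ['four', 'six', 'eight', 'two', 'five']
```

For example, five gives index 14 (v) minus index 0 (f), and 14 mod 10 is 4; nine uses the first n (index 6) and the last n (index 10), giving 4.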

Test program

#include <stdio.h>
int main(int argc, char **argv)
{
    for (int i = 1;  i < argc;  ++i) {
        f(argv[i]);
        puts("");
    }
}

Toby Speight

Posted 2017-11-20T17:00:02.693

Reputation: 5 058

4

J, 37 35 bytes

rplc'twthsiseeinionzef'(;LF&,)\~_2:

Try it online!

FrownyFrog

Posted 2017-11-20T17:00:02.693

Reputation: 3 112

2

Cool alternative solution! I tried f=:[:>'..[eox]|[tse]?....'&rxall and it worked in the interpreter, but doesn't work in TIO. – Galen Ivanov – 2017-11-21T10:21:44.990

this is really clever, well done – Jonah – 2017-11-21T14:13:02.533

@GalenIvanov TIO has the latest release, it could be a regression in J. – FrownyFrog – 2017-11-21T15:50:36.537

4

C (gcc), 106 104 102 bytes

-2 bytes thanks to @jxh
-2 bytes thanks to ceilingcat

c;f(char*s){for(char*t=" $&=B*,29/?";*s;)for(c=4+(index(t,(*s^s[1])+35)-t)/4;c--;)putchar(c?*s++:32);}

Try it online!

XOR is truly our greatest ally.

gastropner

Posted 2017-11-20T17:00:02.693

Reputation: 3 264

Like the s++ trick. Nice hash. – Michael Dorgan – 2017-11-22T00:11:16.927

1

s[1] will be shorter. – jxh – 2017-11-22T08:28:47.230

@jxh Nice one! Updated. – gastropner – 2017-11-22T08:44:29.233

3

Retina, 28 bytes

t[ewh]|[zfs]|(ni|o)ne|ei
 $&

Try it online!

ovs

Posted 2017-11-20T17:00:02.693

Reputation: 21 408

3

Pyth, 35 27 23 bytes

Saved a lot of bytes by porting Uriel's approach.

:Q"..[eox]|[tse]?...."1

Try it here! Initial approach.

Mr. Xcoder

Posted 2017-11-20T17:00:02.693

Reputation: 39 774

3

Pip, 27 bytes

aR`[zfs]|one|[ent][iwh]`s._

Takes input as a command-line argument. Try it online!

Simple regex replacement, inserts a space before each match of [zfs]|one|[ent][iwh].
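
The same replacement can be tried in Python as a quick check. This is a sketch, not the Pip semantics in detail; the function name `uncollapse_sub` is invented here. Each match is a prefix that only occurs at the start of a digit name, so putting a space in front of every match separates the words.

```python
import re


def uncollapse_sub(s):
    """Insert a space before each match, then split on whitespace."""
    spaced = re.sub(r"[zfs]|one|[ent][iwh]", r" \g<0>", s)
    return spaced.split()


print(uncollapse_sub("zeronine"))  # ['zero', 'nine']
```

Note that `one` has to be spelled out in full: a bare `on` would also match inside zeronine.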


Jumping on the bandwagon of stealing borrowing Uriel's regex gives 23 bytes (with -s flag):

a@`..[eox]|[tse]?....`

DLosc

Posted 2017-11-20T17:00:02.693

Reputation: 21 213

You can use a shorter regex.

– Erik the Outgolfer – 2017-11-20T20:40:27.210

3

C, 168 145 144 141 bytes

EDIT: Tried init 'i' to 1 like so

a,b;main(i)

To get rid of leading whitespace,
but it breaks on input starting with three, seven or eight

141

#define s|a%1000==
a,i;main(b){for(;~scanf("%c",&b);printf(" %c"+!!i,b),a|=b%32<<5*i++)if(i>4|a%100==83 s 138 s 116 s 814 s 662 s 478)a=i=0;}

Try it online

144

a,i;main(b){for(;~(b=getchar());printf(" %c"+!!i,b),a=a*21+b-100,++i)if(i>4|a==204488|a==5062|a==7466|a==23744|a==21106|a==6740|a==95026)a=i=0;}

Try it online

168

i,a;main(b){for(;~scanf("%c",&b);printf(" %c"+!!i,b),a|=b<<8*i++)if(i>4|a==1869768058|a==6647407|a==7305076|a==1920298854|a==1702259046|a==7891315|a==1701734766)a=i=0;}

Try it online!

Ungolfed

i,a;main(b){
for(;~scanf("%c",&b); // for every char of input
printf(" %c"+!!i,b), // print whitespace if i==0 , + char
a|=b<<8*i++ // add char to a for test
)
if(
i>4| // three seven eight
a==1869768058|      // zero
a==6647407|        // one
a==7305076|       // two
a==1920298854|   //four
a==1702259046|  //five
a==7891315|    //six
a==1701734766 //nine
) a=i=0; //reset i and a
}

The int constants get unnecessarily large by shifting a<<8,
but if you could compare to strings somehow, this would be the most natural approach.

146 Using string comparison

#define s|a==*(int*)
a,b;main(i){for(;~(b=getchar());printf(" %c"+!!i,b),a|=b<<8*i++)if(i>4 s"zero"s"one"s"two"s"four"s"five"s"six"s"nine")a=i=0;}

Using String comparison

Obfuscated

#define F(x)if(scanf(#x+B,&A)>0){printf(#x,&A);continue;}
B;A;main(i){for(;i;){B=1;F(\40e%4s)F(\40th%3s)F(\40se%3s)F(\40o%2s)B=2;F(\40tw%1s)F(\40si%1s)B=1;F(\40%4s)i=0;}}

PrincePolka

Posted 2017-11-20T17:00:02.693

Reputation: 653

3

Jelly,  23  21 bytes

ḣ3OP%953%7%3+3ɓḣṄȧṫḊÇ

A full program printing line-feed separated output. Note: once it's done it repeatedly prints empty lines "forever" (until a huge recursion limit or a seg-fault)

Try it online! (TIO output is accumulated, a local implementation will print line by line)

How?

Starting with a list of characters, the program repeatedly:

  1. finds the length of the first word of the list of characters using some ordinal mathematics;
  2. prints the word plus a linefeed; and
  3. removes the word from the head of the list of characters

The length of the first word is decided by inspecting the first three characters of the current list of characters (necessarily part of the first word). The program converts these to ordinals, multiplies them together, modulos the result by 953, modulos that by seven, modulos that by three and adds three:

word   head3  ordinals       product  %953  %7  %3  +3 (=len(word))
zero   zer    [122,101,114]  1404708   939   1   1   4
one    one    [111,110,101]  1233210    28   0   0   3
two    two    [116,119,111]  1532244   773   3   0   3
three  thr    [116,104,114]  1375296   117   5   2   5
four   fou    [102,111,117]  1324674     4   4   1   4
five   fiv    [102,105,118]  1263780   102   4   1   4
six    six    [115,105,120]  1449000   440   6   0   3
seven  sev    [115,101,118]  1370570   156   2   2   5
eight  eig    [101,105,103]  1092315   177   2   2   5
nine   nin    [110,105,110]  1270500   151   4   1   4

ḣ3OP%953%7%3+3ɓḣṄȧṫḊÇ - Main link, list of characters           e.g. "fiveeight..."
ḣ3              - head to index three                                "fiv"
  O             - ordinals                                           [102,105,118]
   P            - product                                            1263780
    %953        - modulo by 953                                      102
        %7      - modulo by seven                                    4
          %3    - modulo by three                                    1
            +3  - add three                                          4

              ɓ - dyadic chain separation swapping arguments...
... ḣṄȧṫḊÇ ...
    ḣ         - head to index                                        "five"
     Ṅ        - print the result plus a line-feed and yield the result
       ṫ      - tail from index                                      "eeight..."
      ȧ       - and (non-vectorising)                                "eeight..."
        Ḋ     - dequeue                                               "eight..."
         Ç    - call the last link (Main*) as a monad with this as input
              -       * since it's the only link and link indexing is modular.
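
The ordinal arithmetic from the table can be reproduced with a few lines of Python. This is only a sketch of the length calculation plus a terminating loop (the Jelly program recurses instead); the names `first_word_length` and `uncollapse_product` are invented, and `math.prod` assumes Python 3.8 or later.

```python
from math import prod


def first_word_length(s):
    """Length of the first digit name, from its first three letters."""
    return prod(map(ord, s[:3])) % 953 % 7 % 3 + 3


def uncollapse_product(s):
    """Repeatedly peel the first word off the front of the string."""
    out = []
    while s:
        n = first_word_length(s)
        out.append(s[:n])
        s = s[n:]
    return out


print(first_word_length("fiveeight"))   # 4
print(uncollapse_product("fiveeight"))  # ['five', 'eight']
```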

Jonathan Allan

Posted 2017-11-20T17:00:02.693

Reputation: 67 804

1

I am not sure whether this is allowed. (Seriously, what do you do when two greatly upvoted meta-answers say the opposite of each other?) – Ørjan Johansen – 2017-11-25T18:59:08.873

The OP explicitly states "Your output may also optionally have such strings at the beginning or end" and this program actually prints as it goes, so the output is produced prior to any forced termination anyway. – Jonathan Allan – 2017-11-25T19:01:23.773

Sure, but I don't think OP considered an infinite end string. And the meta-question is explicitly about the case where the output is printed first. – Ørjan Johansen – 2017-11-25T19:02:42.840

I think it fulfils the spirit of the requirement (if it, for example, printed infinite empty strings and then the words I might argue it did not) – Jonathan Allan – 2017-11-25T19:04:35.400

So, I guess that puts me into Martin's camp of "if it's a program and can justify..." :) – Jonathan Allan – 2017-11-25T19:08:15.047

I certainly didn’t have infinite strings in mind when I wrote the question! I’m of two minds about it. It’s a creative bending of the rules, and I don’t wanna be a spoilsport, so I think I’ll allow it. How much more work would it be to make the program terminate? – Lynn – 2017-11-26T13:16:55.773

@Lynn Could certainly force termination with three bytes

– Jonathan Allan – 2017-11-26T19:22:40.163

2

Jelly, 44 bytes

Ṛ¹Ƥz⁶ZUwЀ“¢¤Ƙƒ⁺6j¹;Ċ-ḶṃżṃgɼṘƑUẏ{»Ḳ¤$€Ẏḟ1Ṭœṗ

Try it online!

Erik the Outgolfer

Posted 2017-11-20T17:00:02.693

Reputation: 38 134

3

Sorry man, you've been outgolfed twice. – Zacharý – 2017-11-20T21:53:49.717

2

Quite long one. You are welcome to golf it down.

R, 109 bytes

function(x)for(i in utf8ToInt(x)){F=F+i;cat(intToUtf8(i),if(F%in%c(322,340,346,426,444,448,529,536,545))F=0)}

Try it online!

djhurio

Posted 2017-11-20T17:00:02.693

Reputation: 1 113

Any way to use unicode characters instead of digits? – Michael Dorgan – 2017-11-21T21:37:45.910

Nice application of intToUtf8! 90 bytes would be possible by using a different approach using regexp: function(x,p=paste,z=p("(",p(c("zero",broman::numbers),collapse="|"),")"))gsub(z,"\\1 ",x) – Michael M – 2018-04-09T11:49:41.700

2

Z80 Assembly, 46 45 bytes

; HL is the address of a zero-terminated input string
; DE is the address of the output buffer

Match5: ldi                                 ; copy remaining characters
Match4: ldi
Match3: ld a,32 : ld (de),a : inc de        ; and add space after a matched word.

Uncollapse:

        ld a,(hl) : ldi : or a : ret z      ; copy first byte (finish if it was zero)
        ex af,af'                           ; and save its value for later.

        ldi : ld a,(hl) : ldi               ; copy second and third bytes

        cp 'e' : jr z,Match3                ; is the third letter 'e' or 'o' or 'x'?
        cp 'o' : jr z,Match3
        cp 'x' : jr z,Match3

        ex af,af'                           ; now look at the first letter

        cp 'e' : jr z,Match5                ; is it 't' or 's' or 'e'?
        sub 's' : jr z,Match5
        dec a : jr z,Match5
        jr Match4

(It was fun to adapt Uriel's cool regex to a regex-unfriendly environment).

introspec

Posted 2017-11-20T17:00:02.693

Reputation: 121

2

Haskell, 81 bytes

f[c]=[c]
f(h:t)=[' '|s<-words"z one tw th f s ei ni",and$zipWith(==)s$h:t]++h:f t

Try it online!

Explanation:

f(h:t)=                      h:f t -- recurse over input string
   [' '|s<-               ]++      -- and add a space for each string s
      words"z one tw th f s ei ni" -- from the list ["z","one","tw","th","f","s","ei","ni"]
      ,and$zipWith(==)s$h:t        -- which is a prefix of the current string
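
For comparison, the same prefix idea can be written in Python roughly as follows. This is a sketch under one stated difference: it uses a strict `str.startswith` test, whereas the Haskell version compares only as many characters as `zipWith` pairs up. The names `PREFIXES` and `uncollapse_prefix` are invented for illustration.

```python
# Prefixes that can only occur at the start of a digit name.
PREFIXES = "z one tw th f s ei ni".split()


def uncollapse_prefix(s):
    """Emit a space before every position where a prefix starts."""
    out = []
    for i, ch in enumerate(s):
        if any(s.startswith(p, i) for p in PREFIXES):
            out.append(" ")
        out.append(ch)
    # The result starts with a space, which split() discards.
    return "".join(out).split()


print(uncollapse_prefix("eightsix"))  # ['eight', 'six']
```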

Laikoni

Posted 2017-11-20T17:00:02.693

Reputation: 23 676

2

Python 3 (no regex), 85 bytes

i=3
while i<len(s):
	if s[i-3:i]in'ineiveroneghtwoureesixven':s=s[:i]+' '+s[i:]
	i+=1

Try it online!

Alex

Posted 2017-11-20T17:00:02.693

Reputation: 121

2

Welcome to PPCG! – Laikoni – 2017-11-23T19:10:58.843

It's nice, but a full program must include the code to take input. – Jonathan Allan – 2017-11-25T15:24:58.847

So, as a full program 104 bytes. However you can save 4 by using while s[i:] and then you can get that down to 93 bytes by submitting a recursive lambda (functions only need to return the output rather than print it themselves).

– Jonathan Allan – 2017-11-25T19:13:42.577

2

Excel, 181 bytes

=SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(A1,"z"," z"),"on"," on"),"tw"," tw"),"th"," th"),"f"," f"),"s"," s"),"ei"," ei"),"ni"," ni")

Places a space in front of: z, on, tw, th, f, s, ei, ni

Wernisch

Posted 2017-11-20T17:00:02.693

Reputation: 2 534

1

Jelly, 40 39 bytes

“¢¤Ƙƒ⁺6j¹;Ċ-ḶṃżṃgɼṘƑUẏ{»Ḳe€€@ŒṖẠ€TḢịŒṖK

Try it online!

How it works

“¢¤Ƙƒ⁺6j¹;Ċ-ḶṃżṃgɼṘƑUẏ{»Ḳe€€@ŒṖẠ€TḢịŒṖK
“¢¤Ƙƒ⁺6j¹;Ċ-ḶṃżṃgɼṘƑUẏ{»                 = the compressed string of the digit names
                        Ḳ                = split at spaces
                         e€€@ŒṖ          = check whether each member of each partition of the argument is a digit.
                               Ạ€        = A function that checks whether all values of an array are true, applied to each element.
                                 T       = Finds the index of each truthy element 
                                  Ḣ      = Grab the first element, since we have a singleton array
                                    ịŒṖ  = The previous command gives us the index, partition that splits the input into digits. This undoes it and gives us the partition.
                                       K = Join the array of digits with spaces                

Zacharý

Posted 2017-11-20T17:00:02.693

Reputation: 5 710

1

QuadS, 21 20 bytes

..[eox]|[tse]?....
&

Try it online!

This is a port of my retina answer.

Uriel

Posted 2017-11-20T17:00:02.693

Reputation: 11 708

1

Python 3, no regex,  83 68 65  63 bytes

-15 thanks to Lynn (refactor into a single function)
-3 more thanks to Lynn (avoid indexing into a list with more arithmetic)
...leading to another save of 2 bytes (avoiding parenthesis with negative modulos) :)

def f(s):h=ord(s[0])*ord(s[1])%83%-7%-3+5;print(s[:h]);f(s[h:])

A function which prints the words separated by newlines and then raises an IndexError.
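
For anyone who wants to check the arithmetic without triggering the IndexError, here is an ungolfed, iterative sketch of the same hash (the names `word_length` and `uncollapse_hash` are invented). It depends on Python's rule that the result of `%` takes the sign of the divisor, so `% -7` and `% -3` yield values that are zero or negative.

```python
def word_length(s):
    """Length of the first digit name, from its first two letters."""
    return ord(s[0]) * ord(s[1]) % 83 % -7 % -3 + 5


def uncollapse_hash(s):
    """Collect the words in a list instead of printing and recursing."""
    out = []
    while s:
        h = word_length(s)
        out.append(s[:h])
        s = s[h:]
    return out


print(word_length("zero"))  # 4: 12322 % 83 = 38, 38 % -7 = -4, -4 % -3 = -1
print(uncollapse_hash("fivefourseven"))  # ['five', 'four', 'seven']
```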

Try it online! (suppresses the exceptions to allow multiple runs within the test-suite)

Jonathan Allan

Posted 2017-11-20T17:00:02.693

Reputation: 67 804

I'm revisiting this a lot later and realizing this could be 68 bytes: def f(s):h=[4,5,3][ord(s[0])*ord(s[1])%83%7%3];print(s[:h]);f(s[h:]) – Lynn – 2018-04-08T06:34:41.497

Oh wow, h(s) and h(s) how did I not notice?! Thanks Lynn! – Jonathan Allan – 2018-04-09T07:11:26.890

I'm not sure how I keep coming back to this question and noticing new things, but h=(ord(s[0])*ord(s[1])%83%7+1)%3+3 is 65 bytes! :) – Lynn – 2018-06-22T16:59:49.827

Heh, thanks Lynn, that allowed two more bytes to be golfed off too! – Jonathan Allan – 2018-06-22T17:31:15.417

1

APL (Dyalog Unicode), 25 bytes

'..[eox]|[tse]?....'⎕S'&'

Try it online!

Erik the Outgolfer

Posted 2017-11-20T17:00:02.693

Reputation: 38 134

0

Jelly, 36 bytes

œṣj⁶;$}
W;“€ɗİẒmṫṃ¦¦ạỊɦ⁼Fḷeṭḷa»s2¤ç/

Try it online!

Algorithm:

for x in ['ze', 'ni', 'on', 'tw', 'th', ...]:
    replace x in input by space+x

I bet we can do even better.

Lynn

Posted 2017-11-20T17:00:02.693

Reputation: 55 648

0

Mathematica, 125 bytes

(s=#;While[StringLength@s>2,t=1;a="";While[FreeQ[IntegerName/@0~Range~9,a],a=s~StringTake~t++];Print@a;s=StringDrop[s,t-1]])&


Try it online!

TIO outputs an error message about "CountryData"(???)
I don't know why this happens, but everything works fine on Mathematica

J42161217

Posted 2017-11-20T17:00:02.693

Reputation: 15 931

0

Perl 6,  42  30 bytes

*.comb(/<{(0..9).Str.uninames.lc.words}>/)

Test it

{m:g/..<[eox]>||<[tse]>?..../}

Test it
(Translated from other answers)

Brad Gilbert b2gills

Posted 2017-11-20T17:00:02.693

Reputation: 12 713

0

q/kdb+, 59 51 bytes

Solution:

{asc[raze x ss/:string`z`one`tw`th`f`s`ei`ni]cut x}

Example:

q){asc[raze x ss/:string`z`one`tw`th`f`s`ei`ni]cut x}"threesevensevensixninenineninefiveeighttwofiveeightsixthreeeight"
"three"
"seven"
"seven"
"six"
"nine"
"nine"
"nine"
"five"
"eight"
"two"
"five"
"eight"
"six"
"three"
"eight"

Explanation:

Quick solution, probably better and more golfable approaches.

{asc[raze x ss/:string`z`one`tw`th`f`s`ei`ni]cut x} / ungolfed solution
{                                                 } / lambda with implicit x as input
                                             cut x  / cut x at indices given by left
 asc[                                       ]       / sort ascending
                string`z`one`tw`th`f`s`ei`ni        / string list ("z","one",...)
          x ss/:                                    / string-search left with each right
     raze                                           / reduce down list
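
The "search, sort, cut" steps above translate fairly directly to Python. The sketch below is an approximation of the q semantics, not a port; `find_all` stands in for the string search and the slicing at the end stands in for `cut`. All names are invented for illustration.

```python
PREFIXES = ["z", "one", "tw", "th", "f", "s", "ei", "ni"]


def find_all(text, pattern):
    """Return every start index of pattern in text."""
    hits, start = [], text.find(pattern)
    while start != -1:
        hits.append(start)
        start = text.find(pattern, start + 1)
    return hits


def uncollapse_cut(s):
    """Collect all prefix positions, sort them, and cut s there."""
    starts = sorted(i for p in PREFIXES for i in find_all(s, p))
    ends = starts[1:] + [len(s)]
    return [s[a:b] for a, b in zip(starts, ends)]


print(uncollapse_cut("ninethreesixthree"))
# ['nine', 'three', 'six', 'three']
```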

Notes:

46 bytes with some simple golfing, replacing q calls with k ones, but still a hefty solution.

asc[(,/)x ss/:($)`z`one`tw`th`f`s`ei`ni]cut x:

streetster

Posted 2017-11-20T17:00:02.693

Reputation: 3 635

0

GNU sed, 35 bytes

(including +1 for the -r flag)

s/([ots]|[zfn].|(se|th|ei).)../ &/g

Just a simple regexp replacement.

Toby Speight

Posted 2017-11-20T17:00:02.693

Reputation: 5 058

0

Ruby, 33 bytes

->s{s.scan(/..[eox]|[tse]?..../)}

Try it online!

(Everybody else is doing it, so why can't we?)

G B

Posted 2017-11-20T17:00:02.693

Reputation: 11 099

0

Perl5, 26 bytes

echo zeronineoneoneeighttwoseventhreesixfourtwofive \
| perl -ple 's/..[eox]|[tse]?..../$& /g'

Or just the program:

s/..[eox]|[tse]?..../$& /g

Kjetil S.

Posted 2017-11-20T17:00:02.693

Reputation: 1 049

0

Deorst, 22 bytes

'..[eox]|[tse]?....'gf

Try it online!

Of course, this uses the regex found by Uriel. Although, it’s great when Deorst beats Pyth and Jelly :P

caird coinheringaahing

Posted 2017-11-20T17:00:02.693

Reputation: 13 702

Sorry not any more.

– Jonathan Allan – 2017-11-25T17:05:40.483

0

Matlab, 432 bytes

A long attempt.

s=char(input('','s'));o=[];while length(s)>0;switch s(1:2);case'ze';o=[o,' ',s(1:4)];s(1:4)=[];case'on';o=[o,' ',s(1:3)];s(1:3)=[];case'tw';o=[o,' ',s(1:3)];s(1:3)=[];case 'th';o=[o,' ',s(1:5)];s(1:5)=[];case'fo';o=[o,' ',s(1:4)];s(1:4)=[];case'fi';o=[o,' ',s(1:4)];s(1:4)=[];case 'si';o=[o,' ',s(1:3)];s(1:3)=[];case'se';o=[o,' ',s(1:5)];s(1:5)=[];case 'ei';o=[o,' ',s(1:5)];s(1:5)=[];otherwise;o=[o,' ',s(1:4)];s(1:4)=[];end;end;o

ungolfed:

s=char(input('','s'))
o=[]
while length(s)>0
switch s(1:2)
    case 'ze'
        o=[o,' ',s(1:4)]
        s(1:4)=[]
    case 'on'
        o=[o,' ',s(1:3)]
        s(1:3)=[]
    case 'tw'
        o=[o,' ',s(1:3)]
        s(1:3)=[]
    case 'th'
        o=[o,' ',s(1:5)]
        s(1:5)=[]
    case 'fo'
        o=[o,' ',s(1:4)]
        s(1:4)=[]
    case 'fi'
        o=[o,' ',s(1:4)]
        s(1:4)=[]
    case 'si'
        o=[o,' ',s(1:3)]
        s(1:3)=[]
    case 'se'
        o=[o,' ',s(1:5)]
        s(1:5)=[]
    case 'ei'
        o=[o,' ',s(1:5)]
        s(1:5)=[]
    otherwise
        o=[o,' ',s(1:4)]
        s(1:4)=[]
end
end
o

Jeremiah Peek

Posted 2017-11-20T17:00:02.693

Reputation: 11

2

Welcome to PPCG! – H.PWiz – 2017-11-26T21:08:21.000

Thanks! First golf attempt, so not that great. But, I didn't see any other Matlab! – Jeremiah Peek – 2017-11-26T21:20:59.217

0

Ruby, 77 bytes

puts gets.gsub(/(one|two|three|four|five|six|seven|eight|nine)/,' \1 ').strip

Try it online!

Alex Allen

Posted 2017-11-20T17:00:02.693

Reputation: 91