Look-and-say sequence: roman numerals edition

20

Challenge description

We've had a few challenges involving the Look-and-say sequence. Quick reminder:

  • The sequence starts with 1,
  • Subsequent terms of this sequence are generated by enumerating each group of repeating digits in the previous term,

So the first few terms are:

1        "one"
11       "one one" (we look at the previous term)
21       "two ones"
1211     "one two, one one"
111221   "one one, one two, two ones"
312211   "three ones, two twos, one one"

Now let's do the same thing, but use Roman Numerals instead. We start with I and follow the same rules (we apply the digit-counting rule to characters instead, so we read IVX as one one, one five, one ten instead of one four, one ten or some other way):

I           "one"
II          "one one"
III         "two ones" = "II" + "I"
IIII        "three ones" = "III" + "I"
IVI         "four ones" = "IV" + "I"
IIIVII      "one one, one five, one one"
IIIIIVIII   "three ones, one five, two ones" = ("III" + "I") + ("I" + "V") + ("II" + "I")

Given a positive integer N, either:

  • Output first N numerals of this sequence (any reasonable separator is fine, as well as ["I", "II", "III", ...]
  • Output Nth term of this sequence (it may be 0-indexed).

Remember to make your code as short as possible, since this is a challenge!

EDIT: I believe that there is always one standard/preferred way of expressing integers as roman numerals, (like 95 -> XCV instead of VC). Couple of Roman numeral converters I found online corroborate my opinion. If in doubt, use an online converter, as listing all the possible edge-cases and specific rules of writing Roman numerals is not the point of this challenge.

EDIT2: @PeterTaylor and @GregMartin pointed out that only numbers less or equal to 5 appear in the sequence, so you don't have to worry about the ambiguity of Roman numerals (numbers 1 - 8 are I, II, III, IV, V, VI, VII, and VIII)

shooqie

Posted 2016-08-30T15:09:29.237

Reputation: 5 032

There isn't a unique Roman numeral expression for each integer. Which numbers might it be necessary to express, and which expressions of those numbers are valid? – Peter Taylor – 2016-08-30T15:13:47.410

What do you mean by "there isn't a unique Roman numeral expression for each integer"? Like 4 / IV / IIII ? Or 95 / XCV / VC ? There might not always be a unique way to express an integer, but I'm pretty sure there's always a preferred (standard) one - correct me if I'm wrong. – shooqie – 2016-08-30T15:17:52.813

1how far do we have to go with our roman numberals? – Maltysen – 2016-08-30T15:19:30.157

Yes, both of those cases. In the second case, I think it's very much a matter of opinion which is preferable. – Peter Taylor – 2016-08-30T15:21:59.777

@Maltysen I think the number of equal digits that can appear in succession is limited to 7. – Martin Ender – 2016-08-30T15:22:21.457

@shooqie I agree with you on this one, but you might just want to solidify the rules for roman numerals in OP. – Maltysen – 2016-08-30T15:26:17.573

@Maltysen: Like, this is something that bothers me the most about this site. Make a challenge a tiny bit more complicated and you get questions like what counts as X? or what about the edge-case #98752314? – shooqie – 2016-08-30T15:35:35.460

9@shooqie if these details weren't clarified, how would you compare answers? If there are certain edge cases left up to interpretation the actual scores become meaningless because they might make a bigger difference than any golfing tricks you could come up with. – Martin Ender – 2016-08-30T16:41:14.420

@MartinEnder: Well, I did my best at clearing up the confusion about the Roman numerals, but I feel like I shouldn't have to say how to write them - we've had challenges with Roman numerals before and there didn't seem to be any confusion whatsoever. – shooqie – 2016-08-30T17:00:28.490

I don't understand the output format - are we converting to roman numerals, grouping by character type, counting the length of each group and turning the whole thing into roman numerals again? – Blue – 2016-08-30T17:30:15.797

@muddyfish: Yeah, pretty much. – shooqie – 2016-08-30T17:32:30.953

I think one can mathematically prove that numbers greater than 8 never appear in the sequence starting at I. If so, then programs need only consider Is and Vs. – Greg Martin – 2016-08-30T18:46:46.700

@GregMartin, it's possible to prove the stronger statement that no run-length greater than 5 appears in the sequence whose first generation is I. – Peter Taylor – 2016-08-30T21:56:00.033

1k rep! Congrats :) – ThreeFx – 2016-08-30T23:51:18.757

@GregMartin: Well, that certainly makes things easier. – shooqie – 2016-08-31T07:57:21.007

Surprisingly roman numerals are in fact a decimal notation where a number is split into powers of 10. So 999 = 900+90+9 = CM + XC + IX = CMXCIX and not IM (This is the modern standard form. The romans themselves were not completely consistent). But as noted this is not needed here. – Ton Hospel – 2016-09-01T06:02:06.530

Answers

17

Perl, 49 bytes

Includes +1 for -p

Run with the 0-based index on STDIN, e.g.

ecce.pl <<< 14

ecce.pl:

#!/usr/bin/perl -p
s,(.)\1*,$&/$1%182 .$1,eg for($_=/$/)x$`;y;19;IV

Magic formulas are so magic.

Normally I would use ($_=//)x$' to make the loop control one byte shorter, but scoring on this site gives that a handicap of 2 so it ends up 1 byte longer. On older perls you can drop the space before for. Some versions of perl force you to add a final ; to close the transliteration. But what is given above is the code that works on my system.

Explanation

Working backwards from solution to code:

The string transformations we need:

I     -> II
II    -> III
III   -> IIII
IIII  -> IVI
IIIII -> VI

V     -> IV
VV    -> IIV

Each replacement ends with the repeated character. I will get a sequence of the same characters using regex /(.)\1*/, so this can be done by appending $1. The part before the -> is in $&. With that I still need:

I     -> I
II    -> II
III   -> III
IIII  -> IV
IIIII -> V

V     -> I
VV    -> II

Write I as 1 and V as 9:

1     -> 1
11    -> 11
111   -> 111
1111  -> 19
11111 -> 9

9     -> 1
99    -> 11

By dividing the part before -> by the repeated digit this becomes:

1     -> 1
11    -> 11
111   -> 111
1111  -> 19
11111 -> 9

1     -> 1
11    -> 11

So now the original repeated V is not an exception anymore. So I want an expression that makes this happen:

1     -> 1
11    -> 11
111   -> 111
1111  -> 19
11111 -> 9

And this can be done by a simple modulo 182:

1     % 182 = 1
11    % 182 = 11
111   % 182 = 111
1111  % 182 = 19
11111 % 182 = 9

(this even gets IIIIII to VI right though it isn't needed here)

All that is left is initializing the working variable to 1 for index 0, repeat this transformation in a loop and at the end replace 1 by I and 9 by V

1, 9 and 182 is the only parameter combination for which this simple formula works.

Ton Hospel

Posted 2016-08-30T15:09:29.237

Reputation: 14 114

2This is genius! :) – Lynn – 2016-09-02T11:15:59.837

10

Mathematica, 113 90 83 bytes

Thanks to Martin Ender for suggestions that reduced the length by over 25%!

Showing off the high-level commands in Mathematica.

Nest[Flatten[Characters@{RomanNumeral@#,#2}&@@@Reverse@@@Tally/@Split@#]&,{"I"},#]&

A pure function, taking an argument N and outputting the Nth element of this (0-indexed) sequence, as a list of characters. Spread out a bit:

Nest[
  Flatten[
    Characters @ {RomanNumeral@#,#2}& @@@
      Reverse @@@ Tally /@ Split@ #
    ]& ,
  {"I"}, #]&

The outer Nest iterates the middle four-line function, starting on {"I"}, N times. Line 4 splits the character list of the input Roman numeral into runs of like characters, counts each run with Tally, and puts the counts before the characters they're counting. Line 3 renders the counts as Roman numerals, then splits those Roman numerals up into lists of characters. The Flatten command reduces the whole list-of-lists to a one-dimensional list.

Here's the initial version:

Nest[
  "" <> Flatten[{RomanNumeral@#[[1]], #[[2]]} & /@
    (Reverse@#[[1]] & /@ 
      Tally /@
        Split@Characters@#)] &,
  "I", #] &

Greg Martin

Posted 2016-08-30T15:09:29.237

Reputation: 13 940

3Grrr Mathematica ;) – Beta Decay – 2016-08-30T18:50:27.283

If you use @@@ instead of /@ you can use # and #2 instead of #[[1]] and #[[2]]. Also, lists of characters are acceptable string types, so you can work with those and avoid using Characters@. – Martin Ender – 2016-08-30T21:11:27.567

@MartinEnder Aha, I knew there must have been a @@@-like shortcut! As for lists of characters being acceptable string types (which I agree would shorten the code): is there a post on this site you can point me to that describes the community standard(s)? – Greg Martin – 2016-08-30T22:22:54.370

2http://meta.codegolf.stackexchange.com/a/2216/8478 – Martin Ender – 2016-08-30T22:24:09.907

1A few more savings: Characters threads automatically so you can use @, Reverse@#& is of course the same as plain Reverse, in which case you also don't need those parentheses. And prefix notation (in the case of Flatten) doesn't save anything if you need to add parentheses to make it work. Combining all of those: Nest[Flatten[Characters@{RomanNumeral@#,#2}&@@@Reverse@@@Tally/@Split@#]&,{"I"},#]& – Martin Ender – 2016-08-31T07:58:49.063

8

CJam (33 30 bytes)

"I"{e`{(4md1$^'I*\'V*@}%e_}ri*

Online demo

Key to the correctness of the implementation is the following theorem:

If the first generation is I, no run length is ever greater than five

Lemma: if the first generation is I, no string ever contains VVV. Proof is by contradiction. Suppose that there is a first index n for which the nth generation contains VVV. If that VVV breaks down as (a)V VV then the conversion from the previous generation is bad: it should have been (a+5)V. So it must be VV V(d), and the previous generation contained VVVVV, contradicting the choice of n.

Now, suppose there is a first index m for which the mth generation contains ...IIIIII.... Note that there can be no digits other than I and V in the string, because no previous generation has had a run of nine Is or nine Vs. At most four of the Is come from a run of Is in the previous string, so the corresponding section of the previous string must be ...IIIVV... giving ... IIII IIV .... Since the VV in generation m-1 doesn't come from VVVVV (see lemma), the second V must be a run-length of digit I, so in generation m-1 we have ...IIIVVI.... And since we want the initial Is to give IIII and not IVI or VI, it is preceded either by the start of the string or by a V.

If we have (...V)?IIIVVI... in generation m-1, what do we have in generation m-2? We've already observed that the VV of gen. m-1 must be parsed as (a)V V(I).

Suppose we take a=2: (...V)?I IIV VI... Actually it must be ...VI IIV VI..., although that leading V might be part of IV; so in the previous generation we have either (...V)? IIII VV IIIII... or (...V)? IIIII VV IIIII. Either way we run into trouble with VVIIIII: the second V must be a run-length, but then ...VI IIII... requires a following (run-length, digit) pair with the same digit.

So it must be a=1: (...V)?II IV VI.... Since generation m is the first with a run of six Is, that must be (...V)? II IV VI..., so that generation m-2 is (...V)? I V IIIII.... ...VIVIIIII... is impossible: however we choose to interpret the second V we end up with two consecutive (run-length, digit) pairs with the same digit.

Therefore generation m-2 must be ^IVIIIII..., parsed as ^IV IIII I(V)... or ^IV III II(V).... These give respectively generation m-3 as ^V III V ... or ^V II VV....

But if we look at the start of the strings beginning with the first one that starts with V, we get a cycle:

    VI IV I...
    IV III IV ...
    II IV IVI ...
    IIII IV II IV ...

and so no generation ever starts with either VIIIV or VIIVV. We must conclude that there is no such m.

Dissection

"I"          e# Initial generation
{            e# Loop...
  e`         e#   Run-length encode
  {          e#   Foreach [run-length char] pair...
    (        e#     Extract the run-length r
    4md1$^   e#     Get the number of Vs and the number of Is
             e#     The number of Vs is r/4 ; the number of Is is (r%4)^(r/4)
    'I*\'V*@ e#     Repeat strings the appropriate number of times and reorder
  }%
  e_         e#  Flatten to a simple string
}ri*         e# ... n times, where n is taken from stdin

Peter Taylor

Posted 2016-08-30T15:09:29.237

Reputation: 41 901

6

Python 3, 195 bytes

There's a lot of bytes wasted on the roman numerals, so there's likely some golfing to be done there.

Thanks to @El'endiaStarman, @Sherlock9 and @Shooqie

import re
def f(x,r=""):
 for v,i in(5,"V"),(4,"IV"),(1,"I"):a,x=divmod(x,v);r+=i*a
 return r
s="I"
for i in[0]*int(input()):print(s);s=re.sub(r'(.)\1*',lambda m:f(len(m.group()))+m.group()[0],s)

Ideone it!

Beta Decay

Posted 2016-08-30T15:09:29.237

Reputation: 21 478

You can omit square brackets: for v,i in(5,"V"),(4,"IV"),(1,"I") – shooqie – 2016-08-30T17:40:34.977

@shooqie I had no idea that you could do that :D – Beta Decay – 2016-08-30T17:53:00.883

for v,i in(5,"V"),(4,"IV"),(1,"I"):a,x=divmod(x,v);r+=i*a saves a byte. – Sherlock9 – 2016-08-30T18:12:15.947

@βετѧΛєҫαγ: Also, you're not seem to be using i (as in for i in range(...)). I tried dabbling with exec but this escaped 1 in the 'sub' method seems to be messing up the code, I haven't been able to find a workaround. – shooqie – 2016-08-30T18:22:01.213

@shooqie I shortened it a bit by getting rid of range – Beta Decay – 2016-08-30T18:26:31.453

Hi, next time you might want to cross-out the previous byte-counts when improvements are made. I.e. Python 3, <s>314</s> <s>210</s> <s>202</s> <s>200</s> <s>199</s> 196 bytes. A.f.a.i.k. it's kinda standard to do so here on PPCG. – Kevin Cruijssen – 2016-09-01T09:23:41.783

@KevinCruijssen Nah, there isn't a standard on PPCG, and I prefer to keep it neater – Beta Decay – 2016-09-01T09:33:06.877

4

R, 110 107 Bytes

as.roman combined with rle makes this easy. Scoping abuse and built in cat behavior of <<- saves a few bytes.

x="I"
replicate(scan(),{r=rle(strsplit(x,"")[[1]])
x<<-paste(rbind(paste(as.roman(r$l)),r$v),collapse="")})

Takes in N from console. Outputs first 2 to N terms of sequence (which I believe is within spec...)

 [1] "II"                                                                                                                                                                                                                                     
 [2] "III"                                                                                                                                                                                                                                    
 [3] "IIII"                                                                                                                                                                                                                                   
 [4] "IVI"                                                                                                                                                                                                                                    
 [5] "IIIVII"                                                                                                                                                                                                                                 
 [6] "IIIIIVIII"                                                                                                                                                                                                                              
 [7] "VIIVIIII"                                                                                                                                                                                                                               
 [8] "IVIIIIVIVI"                                                                                                                                                                                                                             
 [9] "IIIVIVIIVIIIVII"                                                                                                                                                                                                                        
[10] "IIIIIVIIIVIIIIVIIIIIVIII"                                                                                                                                                                                                               
[11] "VIIVIIIIIVIVIIVVIIVIIII"                                                                                                                                                                                                                
[12] "IVIIIIVVIIVIIIVIIIIIVIIIIVIVI"                                                                                                                                                                                                          
[13] "IIIVIVIIIVIIIIVIIIIIVVIIVIVIIVIIIVII"                                                                                                                                                                                                   
[14] "IIIIIVIIIVIIIIIVIVIIVVIIIVIIIIVIIIVIIIIVIIIIIVIII"                                                                                                                                                                                      
[15] "VIIVIIIIIVVIIVIIIVIIIIIVIIIIIVIVIIVIIIIIVIVIIVVIIVIIII"                                                                                                                                                                                 
[16] "IVIIIIVVIIIVIIIIVIIIIIVVIIVVIIVIIIVIIIIVVIIVIIIVIIIIIVIIIIVIVI"                                                                                                                                                                         
[17] "IIIVIVIIIVIIIIIVIVIIVVIIIVIIIIIVIIIIVIIIIIVIVIIIVIIIIVIIIIIVVIIVIVIIVIIIVII"                                                                                                                                                            
[18] "IIIIIVIIIVIIIIIVVIIVIIIVIIIIIVIIIIIVVIIVIVIIVVIIVIIIVIIIIIVIVIIVVIIIVIIIIVIIIVIIIIVIIIIIVIII"                                                                                                                                           
[19] "VIIVIIIIIVVIIIVIIIIVIIIIIVVIIVVIIIVIIIIVIIIVIIIIIVIIIIVIIIIIVVIIVIIIVIIIIIVIIIIIVIVIIVIIIIIVIVIIVVIIVIIII"                                                                                                                              
[20] "IVIIIIVVIIIVIIIIIVIVIIVVIIIVIIIIIVIIIIIVIVIIVIIIIIVVIIVIVIIVVIIIVIIIIVIIIIIVVIIVVIIVIIIVIIIIVVIIVIIIVIIIIIVIIIIVIVI"                                                                                                                    
[21] "IIIVIVIIIVIIIIIVVIIVIIIVIIIIIVIIIIIVVIIVVIIVIIIVIIIIVVIIIVIIIIVIIIVIIIIIVIIIIIVIVIIVVIIIVIIIIIVIIIIVIIIIIVIVIIIVIIIIVIIIIIVVIIVIVIIVIIIVII"                                                                                             
[22] "IIIIIVIIIVIIIIIVVIIIVIIIIVIIIIIVVIIVVIIIVIIIIIVIIIIVIIIIIVIVIIIVIIIIIVIVIIVIIIIIVVIIVVIIVIIIVIIIIIVIIIIIVVIIVIVIIVVIIVIIIVIIIIIVIVIIVVIIIVIIIIVIIIVIIIIVIIIIIVIII"                                                                      
[23] "VIIVIIIIIVVIIIVIIIIIVIVIIVVIIIVIIIIIVIIIIIVVIIVIVIIVVIIVIIIVIIIIIVVIIVIIIVIIIIVVIIIVIIIIIVIIIIVIIIIIVVIIVVIIIVIIIIVIIIVIIIIIVIIIIVIIIIIVVIIVIIIVIIIIIVIIIIIVIVIIVIIIIIVIVIIVVIIVIIII"                                                   
[24] "IVIIIIVVIIIVIIIIIVVIIVIIIVIIIIIVIIIIIVVIIVVIIIVIIIIVIIIVIIIIIVIIIIVIIIIIVVIIIVIIIIVIIIIIVIVIIIVIIIIIVVIIVIVIIVVIIIVIIIIIVIIIIIVIVIIVIIIIIVVIIVIVIIVVIIIVIIIIVIIIIIVVIIVVIIVIIIVIIIIVVIIVIIIVIIIIIVIIIIVIVI"                             
[25] "IIIVIVIIIVIIIIIVVIIIVIIIIVIIIIIVVIIVVIIIVIIIIIVIIIIIVIVIIVIIIIIVVIIVIVIIVVIIIVIIIIIVIVIIVVIIVIIIVIIIIIVVIIIVIIIIVIIIVIIIIIVIIIIIVVIIVVIIVIIIVIIIIVVIIIVIIIIVIIIVIIIIIVIIIIIVIVIIVVIIIVIIIIIVIIIIVIIIIIVIVIIIVIIIIVIIIIIVVIIVIVIIVIIIVII"

Vlo

Posted 2016-08-30T15:09:29.237

Reputation: 806

1

Perl 6, 62 bytes

{("I",{S:g/(.)$0*/{<I II III IV V>[$/.chars-1]~$0}/}...*)[$_]}

Anonymous function that accepts a zero-based index.

Makes use of the fact that roman numbers higher than 5 aren't needed, because the only groups of repeating digits that can occur, are:

I     -> II
II    -> III
III   -> IIII
IIII  -> IVI
IIIII -> VI

V     -> IV
VV    -> IIV

(try it online)

smls

Posted 2016-08-30T15:09:29.237

Reputation: 4 352

1

JavaScript (ES6), 107

Recursive function returning the Nth term 0 based

f=(n,r='I')=>n?f(n-1,r.match(/I+|V+/g).map(x=>((n=x.length)-4?'VIII'.slice(n<5,1+n%5):'IV')+x[0]).join``):r

Test

f=(n,r='I')=>n?f(n-1,r.match(/I+|V+/g).map(x=>((n=x.length)-4?'VIII'.slice(n<5,1+n%5):'IV')+x[0]).join``):r

function update() {
  O.textContent=f(I.value)
}

update()
<input id=I value=25 type=number oninput='update()'><pre id=O></pre>

edc65

Posted 2016-08-30T15:09:29.237

Reputation: 31 086