Match Roman Numerals

19

2

Challenge

Given some input string, return a truthy value if it represents a correct roman numeral between 1 (=I) and 3999 (=MMMCMXCIX), and a falsey value otherwise.

Details

  • The input is a non-empty string that only comprises the characters IVXLCDM.
  • The roman numerals (that we use here in this challenge) are defined as follows:

We use only the following symbols:

Symbol  I   V   X   L   C   D    M
Value   1   5  10  50 100 500 1000

To define which strings are actually valid roman numerals, it is probably easiest to provide the rule of conversion: To write a decimal number a3 a2 a1 a0 (where each ai represents one digit; for example, to represent 792 we have a3=0, a2=7, a1=9, a0=2) as a roman numeral, we decompose it into powers of ten. The different powers of ten can be written as follows:

      1-9: I, II, III, IV, V, VI, VII, VIII, IX
    10-90: X, XX, XXX, XL, L, LX, LXX, LXXX, XC
  100-900: C, CC, CCC, CD, D, DC, DCC, DCCC, CM
1000-3000: M, MM, MMM

Beginning at the left side with the most significant digit, we can convert the number that each digit represents separately and concatenate the results. For the example from above, this looks like so:

Digit        a3    a2   a1   a0
Decimal       0     7    9    2
Roman             DCC   XC   II

Therefore the roman numeral for 792 is DCCXCII. Here is a full list of all roman numerals that are relevant for this challenge: OEIS a006968.txt
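The conversion rule above can be sketched in Python as follows. This is only an illustration of the definition, not one of the submitted answers; the names `to_roman`, `ALL_NUMERALS` and `is_roman` are made up for this sketch.

```python
# Digit tables copied from the challenge text. Index 0 is the empty string,
# so a zero digit contributes nothing to the result.
ONES = ["", "I", "II", "III", "IV", "V", "VI", "VII", "VIII", "IX"]
TENS = ["", "X", "XX", "XXX", "XL", "L", "LX", "LXX", "LXXX", "XC"]
HUNDREDS = ["", "C", "CC", "CCC", "CD", "D", "DC", "DCC", "DCCC", "CM"]
THOUSANDS = ["", "M", "MM", "MMM"]


def to_roman(n):
    """Convert 1 <= n <= 3999 digit by digit, most significant digit first."""
    if not 1 <= n <= 3999:
        raise ValueError("n must be between 1 and 3999")
    a3, a2, a1, a0 = n // 1000, n // 100 % 10, n // 10 % 10, n % 10
    return THOUSANDS[a3] + HUNDREDS[a2] + TENS[a1] + ONES[a0]


# Every string that counts as a valid numeral in this challenge.
ALL_NUMERALS = {to_roman(n) for n in range(1, 4000)}


def is_roman(s):
    """Truthy exactly when s is the numeral of some number in 1..3999."""
    return s in ALL_NUMERALS
```

A string is valid precisely when it is produced by this conversion, so a plain membership test is enough to decide the challenge.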

Examples

Truthy

MCCXXXIV (1234)
CMLXXXVIII (988)
DXIV (514)
CI (101)

Falsey

MMIXVIII
IVX
IXV
MMMM
XXXVX
IVI
VIV

flawr

Posted 2019-04-11T12:43:14.013

Reputation: 40 560

Subset of this conversion challenge.

– Shaggy – 2019-04-11T13:03:20.647

I still don't think this qualifies as a "subset" as the set of invalid inputs is larger. This challenge here only refers to the "well"-defined numbers that are used in OEIS A006968 – flawr – 2019-04-11T13:59:15.383

Why is MMMM invalid? Is there a letter for 5000 that should be used instead for M<letter>? – Skyler – 2019-04-12T14:14:45.073

Check out the specs, there is no such letter. The only symbols used are I,V,X,L,C,D,M. – flawr – 2019-04-12T15:20:16.410

Answers

17

Verbose, 1362 bytes

GET A ROMAN NUMERAL TYPED IN BY THE CURRENT PERSON USING THIS PROGRAM AND PUT IT ONTO THE TOP OF THE PROGRAM STACK
PUT THE NUMBER MMMM ONTO THE TOP OF THE PROGRAM STACK
MOVE THE FIRST ELEMENT OF THE PROGRAM STACK TO THE SECOND ELEMENT'S PLACE AND THE SECOND ELEMENT OF THE STACK TO THE FIRST ELEMENT'S PLACE
DIVIDE THE FIRST ELEMENT OF THE PROGRAM STACK BY THE SECOND ELEMENT OF THE PROGRAM STACK AND PUT THE RESULT ONTO THE TOP OF THE PROGRAM STACK
PUT THE NUMBER V ONTO THE TOP OF THE PROGRAM STACK
GET THE FIRST ELEMENT OF THE PROGRAM STACK AND THE SECOND ELEMENT OF THE PROGRAM STACK AND IF THE SECOND ELEMENT OF THE PROGRAM STACK IS NOT ZERO JUMP TO THE INSTRUCTION THAT IS THE CURRENT INSTRUCTION NUMBER AND THE FIRST ELEMENT ADDED TOGETHER'S RESULT
PUT THE NUMBER I ONTO THE TOP OF THE PROGRAM STACK
GET THE TOP ELEMENT OF THE STACK AND OUTPUT IT FOR THE CURRENT PERSON USING THIS PROGRAM TO SEE
PUT THE NUMBER III ONTO THE TOP OF THE PROGRAM STACK
GET THE FIRST ELEMENT OF THE PROGRAM STACK AND THE SECOND ELEMENT OF THE PROGRAM STACK AND IF THE SECOND ELEMENT OF THE PROGRAM STACK IS NOT ZERO JUMP TO THE INSTRUCTION THAT IS THE CURRENT INSTRUCTION NUMBER AND THE FIRST ELEMENT ADDED TOGETHER'S RESULT
PUT THE NUMBER NULLA ONTO THE TOP OF THE PROGRAM STACK
GET THE TOP ELEMENT OF THE STACK AND OUTPUT IT FOR THE CURRENT PERSON USING THIS PROGRAM TO SEE

Outputs I for valid roman numerals in the range I-MMMCMXCIX; otherwise it either outputs NULLA (0) or informs the user that the input is not a valid roman numeral.

Expired Data

Posted 2019-04-11T12:43:14.013

Reputation: 3 129

I can't decide if this is the right tool for the job or not. – Vaelus – 2019-04-11T17:57:53.907

Is this the right tool for any job? – omzrs – 2019-04-11T19:39:49.707

8

C# (Visual C# Interactive Compiler), 79 109 bytes

This seems like a Regex challenge, I'm sure a shorter solution can be found...

s=>System.Text.RegularExpressions.Regex.IsMatch(s,"^M{0,3}(C[MD]|D?C{0,3})(X[CL]|L?X{0,3})(I[XV]|V?I{0,3})$")

Try it online!

Innat3

Posted 2019-04-11T12:43:14.013

Reputation: 791

Couldn't you shorten {0,3} to {,3}? – flawr – 2019-04-11T13:44:34.943

@flawr doesn't seem to capture anything then – Innat3 – 2019-04-11T13:55:06.987

Ah sorry, only things like {5,} work, but not {,5}. – flawr – 2019-04-11T14:01:12.200

Should actually be 109 bytes

– Expired Data – 2019-04-11T15:35:55.250

@ExpiredData well this still confuses me since I see some answers not include dependencies while others do. I will keep it in mind for future answers. – Innat3 – 2019-04-11T17:44:11.753

2

You can add it as compiler flag instead, so it's 72 bytes and the language should be changed to C# (Visual C# Interactive Compiler) with flag /u:System.Text.RegularExpressions.Regex, like this answer :)

– Kevin Cruijssen – 2019-04-11T20:07:34.170

In general however, required imports should always be counted in the byte-count. In the C# (Visual C# Interactive Compiler) however it's possible to add most imports as compiler flags. The compiler flags used to count towards the byte-count in the past, but this is no longer the case. They're now counted as a different language instead.

– Kevin Cruijssen – 2019-04-11T20:16:31.673

@KevinCruijssen And this, this, this, and this. :)

– Embodiment of Ignorance – 2019-04-11T20:21:00.103

@EmbodimentofIgnorance Yeah, I know you use them more often, but was looking for an example with the regex in particular. :) But I see your last this-link is with regex as well. :) – Kevin Cruijssen – 2019-04-11T20:24:44.127

Alternate regex: ^M?M?M?(C[MD]|D?C?C?C?)(X[CL]|L?X?X?X?)(I[XV]|V?I?I?I?)$. Same length, but looks weirder (which is the goal, right?) – Embodiment of Ignorance – 2019-04-11T20:30:58.503

@KevinCruijssen thank you for the clarification :) – Innat3 – 2019-04-12T07:27:35.120

8

Wolfram Language (Mathematica), 35 bytes

Check[FromRomanNumeral@#<4000,1<0]&

Try it online!

5 bytes saved, thanks to @attinat

The limitation to [1,3999] unfortunately costs 7 bytes...
Here is the code for any roman numeral:

Wolfram Language (Mathematica), 28 bytes

Check[FromRomanNumeral@#,F]&

Try it online!

the above code works for any number, not just [1,3999]

J42161217

Posted 2019-04-11T12:43:14.013

Reputation: 15 931

@ExpiredData "The input is a non-empty string that only comprises the characters IVXLCDM." – mathmandan – 2019-04-11T17:31:53.557

35 bytes. Boole is also shorter (by one byte) than using If in that way. – attinat – 2019-04-11T19:28:46.917

8

CP-1610 assembly (Intellivision),  52 ... 48  47 DECLEs¹ = 59 bytes

Let's try this on a system that predates Perl by a good 7 years. :-)

Takes a pointer to a null-terminated string in R4. Sets the Zero flag if the input is a valid Roman numeral, or clears it otherwise.

                ROMW    10              ; use 10-bit ROM width
                ORG     $4800           ; map this program at $4800

                ;; ------------------------------------------------------------- ;;
                ;;  test code                                                    ;;
                ;; ------------------------------------------------------------- ;;
4800            EIS                     ; enable interrupts

4801            SDBD                    ; R5 = pointer into test case index
4802            MVII    #ndx,     R5
4805            MVII    #$214,    R3    ; R3 = backtab pointer
4807            MVII    #11,      R0    ; R0 = number of test cases

4809  loop      SDBD                    ; R4 = pointer to next test case
480A            MVI@    R5,       R4
480B            PSHR    R0              ; save R0, R3, R5 onto the stack
480C            PSHR    R3
480D            PSHR    R5
480E            CALL    isRoman         ; invoke our routine
4811            PULR    R5              ; restore R5 and R3
4812            PULR    R3

4813            MVII    #$1A7,    R0    ; use a white 'T' by default
4815            BEQ     disp

4817            MVII    #$137,    R0    ; or a white 'F' if the Z flag was cleared

4819  disp      MVO@    R0,       R3    ; draw it
481A            INCR    R3              ; increment the backtab pointer

481B            PULR    R0              ; restore R0
481C            DECR    R0              ; and advance to the next test case, if any
481D            BNEQ    loop

481F            DECR    R7              ; loop forever

                ;; ------------------------------------------------------------- ;;
                ;;  test cases                                                   ;;
                ;; ------------------------------------------------------------- ;;
4820  ndx       BIDECLE test0, test1, test2, test3
4828            BIDECLE test4, test5, test6, test7, test8, test9, test10

                ; truthy
4836  test0     STRING  "MCCXXXIV", 0
483F  test1     STRING  "CMLXXXVIII", 0
484A  test2     STRING  "DXIV", 0
484F  test3     STRING  "CI", 0

                ; falsy
4852  test4     STRING  "MMIXVIII", 0
485B  test5     STRING  "IVX", 0
485F  test6     STRING  "IXV", 0
4863  test7     STRING  "MMMM", 0
4868  test8     STRING  "XXXVX", 0
486E  test9     STRING  "IVI", 0
4872  test10    STRING  "VIV", 0

                ;; ------------------------------------------------------------- ;;
                ;;  routine                                                      ;;
                ;; ------------------------------------------------------------- ;;
      isRoman   PROC

4876            PSHR    R5              ; push the return address

4877            MOVR    R7,       R2    ; R2 = dummy 1st suffix
4878            MOVR    R2,       R5    ; R5 = pointer into table
4879            ADDI    #@tbl-$+1,R5

487B  @loop     MVI@    R5,       R1    ; R1 = main digit (M, C, X, I)
487C            MVI@    R5,       R3    ; R3 = prefix or 2nd suffix (-, D, L, V)

487D            MVI@    R4,       R0    ; R0 = next digit

487E            CMPR    R0,       R3    ; if this is the prefix ...
487F            BNEQ    @main

4881            COMR    R2              ; ... disable the suffixes
4882            COMR    R3              ; by setting them to invalid values
4883            MVI@    R4,       R0    ; and read R0 again

4884  @main     CMPR    R0,       R1    ; if R0 is not equal to the main digit,
4885            BNEQ    @back           ; assume that this part is over

4887            MVI@    R4,       R0    ; R0 = next digit
4888            CMPR    R0,       R1    ; if this is a 2nd occurrence
4889            BNEQ    @suffix         ; of the main digit ...

488B            CMP@    R4,       R1    ; ... it may be followed by a 3rd occurrence
488C            BNEQ    @back

488E            MOVR    R2,       R0    ; if so, force the test below to succeed

488F  @suffix   CMPR    R0,       R2    ; otherwise, it may be either the 1st suffix
4890            BEQ     @next
4892            CMPR    R0,       R3    ; or the 2nd suffix (these tests always fail
4893            BEQ     @next           ; if the suffixes were disabled above)

4895  @back     DECR    R4              ; the last digit either belongs to the next
                                        ; iteration or is invalid

4896  @next     MOVR    R1,       R2    ; use the current main digit
                                        ; as the next 1st suffix

4897            SUBI    #'I',     R1    ; was it the last iteration? ...
4899            BNEQ    @loop

489B            CMP@    R4,       R1    ; ... yes: make sure that we've also reached
                                        ; the end of the input

489C            PULR    R7              ; return

489D  @tbl      DECLE   'M', '-'        ; table format: main digit, 2nd suffix
489F            DECLE   'C', 'D'
48A1            DECLE   'X', 'L'
48A3            DECLE   'I', 'V'

                ENDP

How?

The regular expression can be rewritten as 4 groups with the same structure, provided that # is any invalid character that is guaranteed not to be present in the input string.

                 +-------+---> main digit
                 |       |
(M[##]|#?M{0,3})(C[MD]|D?C{0,3})(X[CL]|L?X{0,3})(I[XV]|V?I{0,3})
                   ||  |
                   |+--+-----> prefix or second suffix
                   |
                   +---------> first suffix

The first suffix of the group \$N\$ is the main digit of the group \$N-1\$. Therefore, we can store the patterns with the pair \$(\text{main_digit}, \text{second_suffix})\$ alone.

Our routine attempts to parse the input string character by character according to these patterns and eventually checks whether the end of the string is reached.
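For readers who don't speak CP-1610, here is a rough Python sketch of the same idea: a table of (main digit, second suffix) pairs, with the first suffix taken from the previous group. It is not a line-by-line translation of the routine above; the names `TABLE`, `peek` and `take_main` are hypothetical, and rejecting the empty string is an assumption of this sketch (the challenge guarantees non-empty input).

```python
# (main digit, second suffix) pairs, as in the routine's table. None stands
# in for the missing second suffix of the M group.
TABLE = [("M", None), ("C", "D"), ("X", "L"), ("I", "V")]


def is_roman(s):
    pos = 0

    def peek():
        # Current character, or "" once the end of the input is reached.
        return s[pos] if pos < len(s) else ""

    def take_main(main, limit):
        # Consume up to `limit` repetitions of the main digit.
        nonlocal pos
        taken = 0
        while taken < limit and peek() == main:
            pos += 1
            taken += 1

    first_suffix = None  # the M group has no first suffix either
    for main, second_suffix in TABLE:
        if second_suffix is not None and peek() == second_suffix:
            pos += 1              # prefix form: D, L or V ...
            take_main(main, 3)    # ... followed by up to three main digits
        elif peek() == main:
            pos += 1
            if peek() in (first_suffix, second_suffix):
                pos += 1          # subtractive form such as CM or CD
            else:
                take_main(main, 2)  # optional 2nd and 3rd repetition
        first_suffix = main       # main digit of the previous group
    # Valid only if all four groups together consumed the whole input.
    return len(s) > 0 and pos == len(s)
```

No backtracking is needed here, because the first character of each group already decides which of the two alternatives applies.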

Output

(screenshot of jzIntv showing the output)


1. A CP-1610 opcode is encoded with a 10-bit value, known as a 'DECLE'. This routine is 47 DECLEs long, starting at $4876 and ending at $48A4 (included).

Arnauld

Posted 2019-04-11T12:43:14.013

Reputation: 111 334

wouldn't this be one of the few places where fractional bytes are valid – ASCII-only – 2019-04-12T07:34:11.183

@ASCII-only I used to think so, but I don't know for sure. See the comments of this answer for some insight about this.

– Arnauld – 2019-04-12T07:36:38.337

@ASCII-only Also, I've just found this post in meta that tends to confirm it's probably best to round to whole bytes.

– Arnauld – 2019-04-12T07:52:12.593

ah, so it's only 10 bits when it's in RAM? – ASCII-only – 2019-04-12T07:54:35.293

The program is never stored in RAM, only in ROM. So it depends on the memory chips used in the cartridge. The CPU is designed to access either 10-bit or 16-bit ROM. The "ROMW 10" directive forces the compiler to generate code in 10-bit format. – Arnauld – 2019-04-12T07:57:12.193

7

Java 8, 70 bytes

s->s.matches("M{0,3}(C[MD]|D?C{0,3})(X[CL]|L?X{0,3})(I[XV]|V?I{0,3})")

Port of @Innat3's C# answer, so make sure to upvote him!

Try it online.

Explanation:

s->                // Method with String parameter and boolean return-type
  s.matches("...") //  Check if the string matches the regex fully
                   //  (which implicitly adds a leading "^" and trailing "$")

M{0,3}             // No, 1, 2, or 3 adjacent "M"
(     |        )   // Followed by either:
 C[MD]             //  A "C" with an "M" or "D" after it
      |            // or:
       D?          //  An optional "D"
         C{0,3}    //  Followed by no, 1, 2, or 3 adjacent "C"
(     |        )   // Followed by either:
 X[CL]             //  An "X" with a "C" or "L" after it
      |            // or:
       L?          //  An optional "L"
         X{0,3}    //  Followed by no, 1, 2, or 3 adjacent "X"
(     |        )   // Followed by either:
 I[XV]             //  An "I" with an "X" or "V" after it
      |            // or:
       V?          //  An optional "V"
         I{0,3}    //  Followed by no, 1, 2, or 3 adjacent "I"
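The same regex can be tried out in Python. This is only a sketch for experimenting with the pattern, not a golfed answer; `ROMAN` and `is_roman` are names chosen here, and `re.fullmatch` plays the role of Java's `matches`.

```python
import re

# Same pattern as in the Java answer; fullmatch supplies the anchoring.
ROMAN = re.compile(r"M{0,3}(C[MD]|D?C{0,3})(X[CL]|L?X{0,3})(I[XV]|V?I{0,3})")


def is_roman(s):
    # Every part of the pattern is optional, so the regex alone also accepts
    # the empty string. The challenge rules that input out; the explicit
    # check here is just a safeguard added for this sketch.
    return bool(s) and ROMAN.fullmatch(s) is not None
```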

Kevin Cruijssen

Posted 2019-04-11T12:43:14.013

Reputation: 67 575

5

R, 74 71 56 bytes

Thanks to @RobinRyder, @Giuseppe, & @MickyT for their suggestions on how to use grep effectively with R's built-in as.roman.

sub("^M(.+)","\\1",scan(,""))%in%paste(as.roman(1:2999))

Try it online!
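The trick (strip one leading M if something follows it, then look the rest up among the numerals for 1 to 2999) can be sketched in Python. Python has no `as.roman`, so the helper `to_roman` below is a hypothetical stand-in written for this sketch; only the strip-and-look-up step mirrors the R code.

```python
import re

# Value/symbol pairs for a standard greedy conversion (stand-in for as.roman).
PAIRS = [(1000, "M"), (900, "CM"), (500, "D"), (400, "CD"), (100, "C"),
         (90, "XC"), (50, "L"), (40, "XL"), (10, "X"), (9, "IX"),
         (5, "V"), (4, "IV"), (1, "I")]


def to_roman(n):
    out = []
    for value, symbol in PAIRS:
        count, n = divmod(n, value)
        out.append(symbol * count)
    return "".join(out)


UP_TO_2999 = {to_roman(n) for n in range(1, 3000)}


def is_roman(s):
    # Drop one leading M, but only if at least one character follows it,
    # so that the input "M" itself is left alone.
    stripped = re.sub(r"^M(.+)", r"\1", s)
    return stripped in UP_TO_2999
```

With this, MMMM is stripped to MMM (3000), which is not in the table, while MMMCMXCIX is stripped to MMCMXCIX (2999), which is.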

CT Hall

Posted 2019-04-11T12:43:14.013

Reputation: 591

as.roman won't work anyway, since it only works up to 3899 for some reason. – Giuseppe – 2019-04-11T15:56:14.180

I really should read the documentation better, Probably because 4000 doesn't have a definite representation in Roman, so how'd one do 3900. This is similar to 390 and now I just found an issue with my grep where I'd have to anchor the pattern. – CT Hall – 2019-04-11T16:01:51.743

@Giuseppe, addressed, using the same regex as the other answers. – CT Hall – 2019-04-11T16:26:23.410

I wonder if there's a way to use .romans here...probably not. – Giuseppe – 2019-04-11T16:30:14.427

@Giuseppe I thought about it but didn't explore very far. – CT Hall – 2019-04-11T16:33:01.110

66 bytes using as.roman: first strip the initial M if there is one, then check whether the result is in as.roman(1:2999). This requires special handling of the case where the input is M. – Robin Ryder – 2019-04-11T17:49:32.847

@RobinRyder 56 bytes using paste to coerce to character and tweaking the regex.

– Giuseppe – 2019-04-11T18:20:11.213

@Giuseppe change the regex to ^M(.+) to handle the special M case – MickyT – 2019-04-11T18:54:18.873

@MickyT ah, of course, I never get * versus + right on the first try... – Giuseppe – 2019-04-11T19:16:07.677

@Giuseppe for me it is trying to remember if it's ? or + – MickyT – 2019-04-11T19:48:21.927

My last question is, who the heck decided that romans would be a useful thing to put into R??? It was added in 2.5.0 (April 2007)... – Giuseppe – 2019-04-11T20:11:18.420

Is the + necessary at all? I think you can just as well use ^M(.) and save a byte, the rest of the string then doesn't get matched and subsequently isn't replaced. – Aaron Hayman – 2019-04-12T11:50:35.190

of course + could be used elsewhere for 53 bytes Try it online!

– Aaron Hayman – 2019-04-12T12:23:18.707

or different approach for 50 (this one might need some checking but seems to work alright) Try it online!

– Aaron Hayman – 2019-04-12T12:41:29.947

If anyone is interested I asked StackOverflow a question about the mismatched regex from an earlier edit. You can view it here https://stackoverflow.com/questions/55642649/why-do-upper-bounded-braces-in-r-grepl-function-find-one-more-of-the-repeated-to/55643992#comment97999239_55643992

– CT Hall – 2019-04-12T23:48:44.317

4

Wolfram Language (Mathematica), 32 bytes

RomanNumeral@Range@3999~Count~#&

Try it online!

Expired Data

Posted 2019-04-11T12:43:14.013

Reputation: 3 129

2

Jelly,  48 47 46  44 bytes

-1 thanks to Nick Kennedy

5Żo7;“ÆæC‘ð“IVXLCDM”ṃ@3Ƥm2”MẋⱮ3¤ṭŻ€ṚŒpF€ḟ€0ċ

A monadic Link accepting a non-empty list of characters consisting only of IVXLCDM which yields either 1 (when it's a valid Roman numeral between \$1\$ and \$3999\$) or 0 (if not).

Try it online! Or see the test-suite.

How?

5Żo7;“ÆæC‘ð“IVXLCDM”ṃ@3Ƥm2”MẋⱮ3¤ṭŻ€ṚŒpF€ḟ€0ċ  - Main Link: list of characters S

5Żo7;“ÆæC‘  - chain 1: f(S) -> X
5Ż          - zero range of five = [0,1,2,3,4,5]
  o7        - OR seven             [7,1,2,3,4,5]
     “ÆæC‘  - list of code-page indices        [13,22,67]
    ;       - concatenate          [7,1,2,3,4,5,13,22,67]

          ð - start a new dyadic chain...

“IVXLCDM”ṃ@3Ƥm2”MẋⱮ3¤ṭŻ€ṚŒpF€ḟ€0ċ - chain 2: f(X,S) -> isValid
“IVXLCDM”                         - list of characters, IVXLCDM
           3Ƥ                     - for infixes of length three:
                                  - (i.e. IVX VXL XLC LCD CDM)
         ṃ@                       -   base decompression with swapped arguments
                                  -   (i.e. use characters as base-3 digits of X's values)
                                  -   (e.g. IVX -> VI I V IX II IV III VII VIII)
             m2                   - modulo two slice (results for IVX XLC and CDM only)
                    ¤             - nilad followed by link(s) as a nilad:
               ”M                 -   character 'M'
                  Ɱ3              -   map across [1,2,3] with:
                 ẋ                -     repeat -> M MM MMM
                     ṭ            - tack
                      Ż€          - prepend a zero to each
                        Ṛ         - reverse
                                  -   -- now we have the table: 
                                  -    0 M MM MMM
                                  -    0 DC C D CM CC CD CCC DCC DCCC
                                  -    0 LX X L XC XX XL XXX LXX LXXX
                                  -    0 VI I V IX II IV III VII VIII
                         Œp       - Cartesian product   [[0,0,0,0],...,["M","CM",0,"IV"],...]
                           F€     - flatten €ach  [[0,0,0,0],...,['M','C','M',0,'I','V'],...]
                             ḟ€0  - filter out the zeros from €ach       ["",...,"MCMIV",...]
                                ċ - count occurrences of S
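The table-plus-Cartesian-product idea reads roughly like this in Python. It is a loose sketch rather than a translation of the Jelly link: the rows are derived by swapping letters with `str.translate` instead of base decompression, and `shift`, `ALL` and `count_roman` are names invented for the illustration.

```python
from itertools import product

# Row for the units, with "" playing the role of the prepended zero.
ONES = ["", "I", "II", "III", "IV", "V", "VI", "VII", "VIII", "IX"]


def shift(row, trio):
    # Reuse the units row for another letter trio, e.g. IVX -> XLC.
    table = str.maketrans("IVX", trio)
    return [digit.translate(table) for digit in row]


TENS = shift(ONES, "XLC")
HUNDREDS = shift(ONES, "CDM")
THOUSANDS = ["", "M", "MM", "MMM"]

# Cartesian product of the four rows, each combination joined into a string.
ALL = {"".join(parts) for parts in product(THOUSANDS, HUNDREDS, TENS, ONES)}


def count_roman(s):
    # ALL also contains "" (all four parts empty), which never matches a
    # non-empty input, so counting occurrences gives 1 or 0.
    return 1 if s and s in ALL else 0
```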

Jonathan Allan

Posted 2019-04-11T12:43:14.013

Reputation: 67 804

There seems to be a redundant space on the first line. Another byte. Another byte can be saved by using a simpler first line. Try it online!

– Nick Kennedy – 2019-04-12T06:27:39.253

Thanks, I've saved one more from it. – Jonathan Allan – 2019-04-12T07:38:26.920

1

Perl 5 (-p), 57 bytes

$_=/^M*(C[MD]|D?C*)(X[CL]|L?X*)(I[XV]|V?I*)$/&!/(.)\1{3}/

TIO

  • uses almost the same regular expression, except that the {0,3} quantifier was replaced by *
  • &!/(.)\1{3}/ ensures that the same character can't occur 4 times in a row.
  • can't be golfed with -/(.)\1{3}/ because it would give -1 for IIIIVI, for example
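The two-step check described in the bullets above (a looser pattern with * plus a separate "no four in a row" test) can be sketched in Python. `LOOSE`, `FOUR_IN_A_ROW` and `is_roman` are names picked for this sketch, and the empty-string check is an extra safeguard that the Perl one-liner does not need under the challenge rules.

```python
import re

# The relaxed pattern: * instead of {0,3}, so runs of any length pass here.
LOOSE = re.compile(r"M*(C[MD]|D?C*)(X[CL]|L?X*)(I[XV]|V?I*)")

# Any character repeated four times in a row.
FOUR_IN_A_ROW = re.compile(r"(.)\1{3}")


def is_roman(s):
    return (bool(s)
            and LOOSE.fullmatch(s) is not None
            and FOUR_IN_A_ROW.search(s) is None)
```

The second test is what rules out inputs such as MMMM or IIII, which the relaxed pattern alone would accept.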

Nahuel Fouilleul

Posted 2019-04-11T12:43:14.013

Reputation: 5 582

1

05AB1E, 61 9 8 bytes

ŽF¯L.XIå

Whopping \$\color{green}{\textrm{-52 bytes}}\$ thanks to @Adnan, because apparently 05AB1E's Roman Number builtin wasn't documented, haha.. xD

Try it online or verify all test cases.

Explanation:

ŽF¯       # Push compressed integer 3999
   L      # Create a list in the range [1,3999]
    .X    # Convert each integer in this list to a roman number string
      Iå  # Check if the input is in this list
          # (and output the result implicitly)

See this 05AB1E tip of mine (section How to compress large integers?) to understand why ŽF¯ is 3999.


Original 61 bytes answer:

•1∞Γ'иÛnuÞ\₂…•Ž8вв€SÐ)v.•6#&‘нδ•u3ôNèyè}'M3L×)Rεõš}`3Fâ}€˜JIå

Try it online or verify all test cases.

Explanation:

•1∞Γ'иÛnuÞ\₂…•             '# Push compressed integer 397940501547566186191992778
              Ž8в           # Push compressed integer 2112
                 в          # Convert the integer to Base-2112 as list:
                            #  [1,11,111,12,2,21,211,2111,10]
€S                          # Convert each number to a list of digits
  Ð                         # Triplicate this list
   )                        # And wrap it into a list of lists (of lists)
    v                       # Loop `y` over each of these three lists:
     .•6#&‘нδ•              #  Push compressed string "xivcxlmcd"
              u             #  Uppercased
               3ô           #  And split into parts of size 3: ["XIV","CXL","MCD"]
     Nè                     #  Use the loop index to get the current part
       yè                   #  And index the list of lists of digits into this string
    }'M                    '# After the loop: push "M"
       3L                   # Push list [1,2,3]
         ×                  # Repeat the "M" that many times: ["M","MM","MMM"]
          )                 # Wrap all lists on the stack into a list:
                            # [[["I"],["I","I"],["I","I","I"],["I","V"],["V"],["V","I"],["V","I","I"],["V","I","I","I"],["I","X"]],[["X"],["X","X"],["X","X","X"],["X","L"],["L"],["L","X"],["L","X","X"],["L","X","X","X"],["X","C"]],[["C"],["C","C"],["C","C","C"],["C","D"],["D"],["D","C"],["D","C","C"],["D","C","C","C"],["C","M"]],["M","MM","MMM"]]
           R                # Reverse this list
            εõš}            # Prepend an empty string "" before each inner list
                `           # Push the four lists onto the stack
                 3F         # Loop 3 times:
                   â        #  Take the cartesian product of the two top lists
                    }€˜     # After the loop: flatten each inner list
                       J    # Join each inner list together to a single string
                        Iå  # And check if the input is in this list
                            # (after which the result is output implicitly)

See this 05AB1E tip of mine (sections How to compress strings not part of the dictionary?, How to compress large integers?, and How to compress integer lists?) to understand why:

  • •1∞Γ'иÛnuÞ\₂…• is 397940501547566186191992778
  • Ž8в is 2112
  • •1∞Γ'иÛnuÞ\₂…•Ž8вв is [1,11,111,12,2,21,211,2111,10]
  • .•6#&‘нδ• is "xivcxlmcd"

Kevin Cruijssen

Posted 2019-04-11T12:43:14.013

Reputation: 67 575

1

I'm not sure why .X is not documented, but I think this should work: 3999L.XQO

– Adnan – 2019-04-13T20:56:57.470

@Adnan Haha, -52 bytes right there. Completely forgot you indeed told us about adding a Roman Number builtin. Will ask @Mr.Xcoder in chat to add it to the docs. Are any other commands missing? ;) PS: Saved another byte by compressing 3999. :) – Kevin Cruijssen – 2019-04-13T21:24:58.927

1

Python 2, 81 bytes

import re
re.compile('M{,3}(D?C{,3}|C[DM])(L?X{,3}|X[LC])(V?I{,3}|I[VX])$').match

Try it online!

Let's look at the last part of the regex, which matches the Roman numerals up to 9 (including the empty string)

V?I{,3}|I[VX]

This has two alternatives separated by |:

  • V?I{,3}: An optional V followed by up to 3 I's. This matches the empty string, I, II, III, V, VI, VII, VIII.
  • I[VX]: An I followed by a V or X. This matches IV and IX.

The same thing with X,L,C matches the tens, with C,D,M matches the hundreds, and finally M{,3} allows up to 3 M's (thousands) at the start.

I tried generating the template for each trio of characters rather than writing it 3 times, but this was a lot longer.

xnor

Posted 2019-04-11T12:43:14.013

Reputation: 115 687

No need for the ^ anchor at the beginning; match already implies it matches at the beginning of the string. – ShadowRanger – 2019-04-12T01:28:03.770

@ShadowRanger Thanks, I removed the ^. – xnor – 2019-04-12T01:29:27.493

Although I think you messed up the count in the edit; should be 83, not 81. – ShadowRanger – 2019-04-12T01:40:16.133

@ShadowRanger The count is 81 because the f= isn't included in the code since anonymous functions are allowed. It's just for TIO. – xnor – 2019-04-12T01:42:22.833

Ah, makes sense. Annoying there's no way to organize it to hide that in the header or footer, but yeah, unassigned lambdas are legal, so unassigned bound methods of compiled regex should be good too. – ShadowRanger – 2019-04-12T01:44:47.173

well, unassigned lambdas are legal, but only because they're a valid expression that can be used as a function, but imports are used all the time. it's either just something the community has implicitly (?) agreed on, or i must have missed the meta post – ASCII-only – 2019-04-12T07:36:58.357

1

Retina, 56 51 bytes

(.)\1{3}
0
^M*(C[MD]|D?C*)(X[CL]|L?X*)(I[XV]|V?I*)$

Port of @NahuelFouilleul's Perl 5 answer, so make sure to upvote him!

Try it online or verify all test cases.

Explanation:

(.)\1{3}        # If four adjacent characters can be found which are the same
0               # Replace it with a 0

^...$           # Then check if the string matches the following fully:
 M*             #  No or any amount of adjacent "M"
 (     |    )   #  Followed by either:
  C[MD]         #   A "C" with an "M" or "D" after it
       |        #  or:
        D?      #   An optional "D"
          C*    #   Followed by no or any amount of adjacent "C"
 (     |    )   #  Followed by either:
  X[CL]         #   An "X" with a "C" or "L" after it
       |        #  or:
        L?      #   An optional "L"
          X*    #   Followed by no or any amount of adjacent "X"
 (     |    )   #  Followed by either:
  I[XV]         #   An "I" with an "X" or "V" after it
       |        #  or:
        V?      #   An optional "V"
          I*    #   Followed by no or any amount of adjacent "I"

Kevin Cruijssen

Posted 2019-04-11T12:43:14.013

Reputation: 67 575

0

Python 3, 116 113 109 107 105 106 bytes

import re
lambda n:re.match(r'(M{,3}(C(M|CC?|D)?|DC{,3}))(X(C|XX?|L)?|(LX{,3}))?(I(X|II?|V)?|VI{,3})?$',n)

Try it online!

-1 byte thanks to ShadowRanger

Noodle9

Posted 2019-04-11T12:43:14.013

Reputation: 2 776

As I mentioned on the Py2 answer, the leading ^ is unnecessary since match only matches at the beginning of a string already. – ShadowRanger – 2019-04-12T01:35:30.680

@ShadowRanger added anchors while debugging and then didn't try again without them. I'll remember that now - thanks! :) – Noodle9 – 2019-04-12T12:57:06.687

Well, just to be clear, the trailing $ is necessary (only fullmatch implies anchors on both ends, and obviously that would cost more than a $). – ShadowRanger – 2019-04-12T13:53:44.743

@ShadowRanger Ah! That explains why I needed anchors! Didn't realize I only needed to anchor the end. Thanks again. – Noodle9 – 2019-04-12T14:12:24.320

0

perl -MRegexp::Common -pe, 34 bytes

$_=/^$RE{num}{roman}$/&!/(.)\1{3}/

The &!/(.)\1{3}/ part is necessary, because Regexp::Common allows four (but not five) of the same characters in a row. That way, it matches roman numbers used on clock faces, where IIII is often used for 4.

user73921

Posted 2019-04-11T12:43:14.013

Reputation:

0

Ruby, (-n) 56 bytes

p~/^M{,3}(D?C{,3}|CM|CD)(L?X{,3}|XC|XL)(V?I{,3}|IV|IX)$/

Try it online!

Outputs 0 (truthy) or nil (falsy).

Reinstate Monica -- notmaynard

Posted 2019-04-11T12:43:14.013

Reputation: 1 053