Count the Shantens

9

1

Introduction to Mahjong tiles

Mahjong (麻雀) is a board game that originates from China. Mahjong tiles used in this challenge are in Unicode points U+1F000 – U+1F021:


They are categorized as:

  • Winds (風牌): (East Wind), (South Wind), (West Wind), (North Wind)

  • Dragons (三元牌): (Red Dragon), (Green Dragon), (White Dragon)

  • Ten Thousands (萬子): (One of the Ten Thousands) through (Nine of the Ten Thousands)

  • Bamboos (索子): (One of the Bamboos; note that it depicts a bird) through (Nine of the Bamboos)

  • Circles (筒子): (One of the Circles) through (Nine of the Circles)

  • Winds and Dragons are together called Honors (字牌).

  • Ten Thousands, Bamboos and Circles are each called a suit and together called Numbers (数牌). They have ranks from 1 to 9.

  • Among Numbers, the Ones and the Nines are called Terminals (老頭牌). The Twos through the Eights are called Simples (中張牌).

  • Honors and Terminals are together called Orphans (幺九牌).

Every Mahjong tiles have 4 copies of themselves.

Mahjong Hand

A legal Hand consists of:

  • A Head (対子), which is 2 identical tiles.

  • 4 Bodies (面子), each which is either:

    • A Sequence (順子), which is 3 consecutive Numbers with same suit.

    • A Triplet (刻子), which is 3 identical tiles.

Hence, a Hand consists of 14 tiles.

There are exceptional valid Hands. Seven Heads (七対子), which is 7 distinct Heads, and Thirteen Orphans (国士無双), which collects all 13 Orphans with an additional Orphan.

Tenpais and Shantens

13 Mahjong tiles will be given as the input. If an additional tile would make a valid Hand, this is called Tenpai (聽牌). Tenpai is zero Shantens.

If a tile must be exchanged to be Tenpai, this is 1-Shanten (1向聴).

If a tile must be exchanged to be 1-Shanten, this is 2-Shantens, and so on.

The objective is to output the number of Shantens.

Examples

: Tenpai waiting for or to make a Sequence. An example of Two-sided Wait (兩面待機).

: Tenpai waiting for or to make a Triplet. An example of Wait for a Triplet (双碰待機).

: Tenpai waiting for to make a Sequence. An example of One-sided Wait (邊張待機).

: Tenpai waiting for to make a Sequence. An example of Wait for the Middle (坎張待機).

: Tenpai waiting for to make a Head. An example of Wait for a Head (單騎待機).

: Tenpai waiting for , , or . This example demonstrates how multiple ways of waiting can be combined.

: Tenpai waiting for any of Ten Thousands. This is called Nine Gates (九蓮宝燈), and has the highest possible number of tiles to wait for to be a "normal" Hand.

: Tenpai waiting for to make a Head. An example of Seven Heads.

: Tenpai waiting for . An example of Thirteen Orphans.

: Tenpai waiting for any of Orphans. Another example of Thirteen Orphans.

: 1-Shanten. Note that this is not Tenpai waiting for because it would be a fifth .

: 1-Shanten. Note that this is not Tenpai for Seven Heads because getting will result in a duplicated Head.

: 2-Shantens.

: 6-Shantens. This is the highest possible number of Shantens.

You can demonstrate arbitrary examples here. (for 13 or 14 tiles)

Rules about code golf

  • Input type and format doesn't matter, but it must consist of the Unicode characters above. In C++, valid examples include std::u8string (sorted or not) , u32char_t[13], and std::multiset<u32char_t>.

  • Output type and format doesn't matter either.

  • Invalid input (not exactly 13 tiles, contains 5 copies of the same tile, contains a character not in U+1F000 – U+1F021, etc) falls into don't care situation.

Dannyu NDos

Posted 2020-02-10T06:27:16.637

Reputation: 2 087

1@KevinCruijssen Thanks. Since I want to see how golfing languages handle Unicode, I removed that rule. – Dannyu NDos – 2020-02-10T09:01:18.700

FYI: for some reason, the Mahjong tile characters are hard to read on (my version of) Firefox which alternates between small monochrome characters and larger color characters. Weirdly enough, the monochrome and color versions of the same character are both used depending on its position in the post. No problem on Chrome on the same laptop. – Arnauld – 2020-02-10T10:23:23.833

@Arnauld Really? I'm using Firefox on Ubuntu, and to me, only is displayed as emoji, and all other tiles are displayed as monochrome characters. I'm afraid I don't know a solution. – Dannyu NDos – 2020-02-10T10:26:55.733

It is quite painful to convert from Unicode to the mpsz notation used by the tool linked in the challenge, or the other way around. So ...

– Arnauld – 2020-02-11T02:21:36.220

... here are two scripts to perform both conversions.

– Arnauld – 2020-02-11T02:21:46.907

1Also, I think we need more non-tenpai test cases, especially some tricky ones where it's not obvious to decide which groups should be removed or completed to get the optimal N-shantens. – Arnauld – 2020-02-11T02:27:53.683

Man, seeing this challenge brought me back to when I played some games for fun (no actual monetary stakes). I was a beginner and wanted to be super sure I would have the yaku needed to win with the hand I had, and somehow ended up on tenpai for a closed chin'itsu-tan'yao-ryanpeikou, which (if I won with it) has some ridiculous score value... I didn't win the round, but when I showed what I had after the game the table basically flipped out with how insane my hand was lmao – Value Ink – 2020-02-14T08:17:05.833

Answers

4

05AB1E, 80 77 73 72 77 74 bytes

₂Ý8+3ä€ü3€`4иML3δи©«4㮀¦©âʒ˜D¢à5‹}®7.Æ«8L•1«þï•₆в«D¸â«ε˜•1óñ•+çJœIœδ.L}W<

-5 bytes thanks to @Grimmy, and also for teaching me about a hidden builtin
+5 bytes to fix two bugs

Brute-force approach that's extremely slow, so obviously no TIO with test cases.

Although TIO is too slow for the entire solution, I've added a TIO after each section so you can verify the individual parts work as intended. Note that the final TIO-link only uses a portion of each hand (including the input).

General explanation:

  1. Generate a list of all valid 14-tile hands
  2. Get every possible permutation of every 14-tile hand
  3. Get the input-hand and get all its permutations as well
  4. Get the Levenshtein distance between every input-hand permutation and every valid 14-tile hand permutation
  5. Keep the lowest Levenshtein distance across all these checks, and decrease it by 1 to get the number of Shantens

Code explanation:

Create a list of all overlapping consecutive triplets of the Numbers:

₂Ý                # Push a list in the range [0,26]
  8+              # Add 8 to each to make the range [8,34]
    3ä            # Split it into 3 equal-sized parts: [[8,16],[17,25],[26,34]]
      €           # For each inner list:
       ü3         #  Create all overlapping triplets of this list
                  #   [a,b,c,d,e,...] → [[a,b,c],[b,c,d],[c,d,e],...]
         €`       # Then flatten one level down to have a list of triplets
           4и     # And repeat each consecutive triplet 4 times in the list

Try it online.

Create a list of all possible triplets of all tiles:

M                 # Push the largest value on the stack thus far (including inner lists),
                  # which is 34
 L                # Pop and push a list in the range [1,34]
   δ              # For each value in this list,
  3 и             #  Repeat it 3 times as list
     ©            # Store the list of identical triplets in variable `®` (without popping)

Try it online.

Then:

«                 # Merge the two lists together
 4ã               # And take the cartesian product of 4 with this list

Which results in all possible combinations of 4 Bodies: Try it online.

Then we'll generate all possible Heads:

®                 # Push the list of identical triplets from variable `®`
 €¦               # Remove one value from each so we have identical pairs (Heads) instead
   ©              # Store that as new variable `®` (without popping)

Try it online.

And create every possible combination of these 1 Heads + 4 Bodies:

â                 # By taking the cartesian product of the two lists
 ʒ      }         # And then filtering it so we won't use more than 4 of the same tile:
  ˜               #  Flatten the pairs/triplets to a single list
   D¢             #  Duplicate it, and count for each item how much it occurs
     à            #  Get the maximum for the tile that occurs the most
      5‹          #  And check that this it smaller than 5

Now we have all possible regular 14-tile hands: Try it online.

Next we'll also create the special type of 14-tile heads, starting with the 7 Heads:

®                 # Push the list of identical pairs (Heads) from variable `®` again
 7.Æ              # Get all 7-element combinations of these distinct heads
    «             # And merge it to the earlier created valid 14-tile hands

Try it online.

And the final valid 14-tile hands are the 13+1 Orphans:

8L                # Push a list in the range [1,8] (4 Winds, 3 Dragons, and One of the Ten Thousands)
  •1«þï•          # Push compressed integer 27700378
        ₆в        # Convert it to base-36 as list: [16,17,25,26,34] (Nine of the Ten Thousands,
                  #  One & Nine of the Bamboos, One & Nine of the Circles)
          «       # Merge those together
           D¸     # Duplicate this list of 13 Orphans, and wrap it into a list
             â    # And take the cartesian product of the two
              «   # And also merge those to the earlier created valid 14-tile hands

Try it online.

Now that we finally have all our valid 14-tile hands, we're going to get all their permutations as well as the permutations of the input, and check the Levenshtein-distance between each possible combination. After which the lowest Levenshtein-distance across all permutation-comparisons minus 1 is our number of Shatens:

ε                 # Map over each valid 14-tile hand:
 ˜                #  Flatten the pairs/triplets to a single 14-length list
  •1óñ•           #  Push compressed integer 126975
       +          #  Add it to each value in the list
        ç         #  Convert each from integer to their unicode character
         J        #  And join it together to a single string
          œ       #  Get all permutations of this hand
 Iœ               #  Then push the input, and get all its permutations as well
 δ                #  And then check for every possible combination of the permutations:
  .L              #   What the Levenshtein-distance is between the two

And finally get the (flattened) lowest one to calculate the number of Shantens:

}W                # After the map: push the flattened lowest Levenshtein-distance
  <               # And subtract it by 1 to get the number of Shantens
                  # (after which this is output implicitly as result)

Try it online with just a portion of the valid hands, each cut into a smaller hand, and with added debug-lines.

See this 05AB1E tip of mine (sections How to compress large integers? and How to compress integer lists?) to understand why •1«þï• is 27700378; •1«þï•₆в is [16,17,25,26,34]; and •1óñ• is 126975.

How slow?

To give you an idea of how slow this approach is, here are some numbers:

Regular hands: \$34×118^4-X\$

  • There are \$118\$ valid Bodies (including duplicated subsequent triplets for the Numbers, which each occur four time), and thus \$118^4\$ combinations of 4 Bodies (yes, this includes duplicated combinations due to the duplicated subsequent Number triplets).
  • There are \$34\$ valid Heads, and thus \$34×118^4\$ combinations of 1 Head + 4 Bodies.
  • After which \$X\$ (TODO: do the math..) combinations are removed, because they contain a certain tile more than four times.

Special hands: \$\binom{34}{7} + 13\$

  • There are again \$34\$ valid Heads, and thus \$\binom{34}{7}\$ combinations of 7 distinct Heads.
  • There are \$13\$ Orphans (\$4\$ Winds + \$3\$ Dragons + \$3\$ Ones + \$3\$ Nines), and thus \$13\$ valid Orphan hands.

There are therefore a total of \$34×118^4-X + \binom{34}{7} + 13 = 6597224013-X\$ valid 14-tile hands.

Comparing the input-hand and a single valid-hand: \$14!×13!\$

  • Getting all permutations of the 14-tile hands means there are \$14!\$ permutations.
  • Getting all permutations of the 13-tile input-hand means there are \$13!\$ permutations.
  • Checking each permutation of a valid 14-tile hand with each permutation of the input-hand are therefore \$14!×13! = 542861032610856960000\$ valid permutation-pairs.

So in conclusion:

Assuming a standard Levenshtein algorithm is used in the .L builtin, the complexity is in general about \$O(n^2)\$ (\$n\$ being the length of the string). Which in this case would be complexity \$O(14^2) = O(196)\$ for each of the \$14!×13! × (34×118^4-X + \binom{34}{7} + 13) = \$\$3581375840062321621020180480000 - 542861032610856960000×X\$ permutation-combinations across all valid hands (after which we still have to get the flattened \$minimum-1\$).
Safe to say: way too slow for a TIO link for even a single test case, let alone a test suite of all test cases. ;)

Kevin Cruijssen

Posted 2020-02-10T06:27:16.637

Reputation: 67 575

2ü‚ü«€Ù is the built-in ü3: TIO. – Grimmy – 2020-02-10T13:17:44.663

@Grimmy Huh, since when is that a thing?! o.Ô – Kevin Cruijssen – 2020-02-10T13:18:41.743

Probably since non-legacy 05AB1E. Also, that literal 34 can be M. – Grimmy – 2020-02-10T13:20:14.357

If anyone knows the math for the $X$, let me know. – Kevin Cruijssen – 2020-02-11T12:09:49.140

Winning hands => permutations of each => drop last from each => levenshtein distance of each with the current hand would also work, saving a 13! execution time factor and probably a few bytes. – Grimmy – 2020-02-11T12:44:51.270

@Grimmy It would definitely increase the speed by a factor $13!$, but unfortunately no bytes (I think). The trailing œIœδ.L}W< I have now, would become œε¨I.L]W< in that case. – Kevin Cruijssen – 2020-02-11T13:03:03.967