Convert to Suzhou numerals

27

3

Suzhou numerals (蘇州碼子; also 花碼) are Chinese decimal numerals:

0 〇
1 〡 一
2 〢 二
3 〣 三
4 〤
5 〥
6 〦
7 〧
8 〨
9 〩

They pretty much work like Arabic numerals, except that when there are consecutive digits belonging to the set {1, 2, 3}, the digits alternate between vertical stroke notation {〡,〢,〣} and horizontal stroke notation {一,二,三} to avoid ambiguity. The first digit of such a consecutive group is always written with vertical stroke notation.

The task is to convert a positive integer into Suzhou numerals.
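For reference, here is an ungolfed sketch of the rule described above. It is only one straightforward reading of the spec, and the names `VERTICAL`, `HORIZONTAL` and `to_suzhou` are made up for this illustration.

```python
# Ungolfed reference sketch; all names here are hypothetical.
VERTICAL = "〇〡〢〣〤〥〦〧〨〩"            # default glyph for each digit 0-9
HORIZONTAL = {1: "一", 2: "二", 3: "三"}  # alternative glyphs for 1, 2, 3


def to_suzhou(n: int) -> str:
    out = []
    prev_vertical_small = False  # previous digit was 1-3 written vertically
    for ch in str(n):
        d = int(ch)
        if d in HORIZONTAL and prev_vertical_small:
            # Second of two consecutive small digits: use horizontal strokes.
            out.append(HORIZONTAL[d])
            prev_vertical_small = False
        else:
            out.append(VERTICAL[d])
            prev_vertical_small = d in HORIZONTAL
    return "".join(out)


print(to_suzhou(2018))
```

The single flag is enough because a horizontal digit always resets the alternation, so the next 1, 2 or 3 starts again with vertical strokes.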

Test cases

1          〡
11         〡一
25         〢〥
50         〥〇
99         〩〩
111        〡一〡
511        〥〡一
2018       〢〇〡〨
123321     〡二〣三〢一
1234321    〡二〣〤〣二〡
9876543210 〩〨〧〦〥〤〣二〡〇

Shortest code in bytes wins.

for Monica

Posted 2018-12-13T00:35:08.100

Reputation: 1 172

I've been in Suzhou 3 times for longer periods of time (quite a nice city), but didn't know about Suzhou numerals. You have my +1 – Thomas Weller – 2018-12-13T12:19:40.257

@ThomasWeller For me it's the opposite: before writing this task I knew what the numerals were, but not that they were named "Suzhou numerals". In fact I've never heard them called by this name (or any name at all). I've seen them in markets and on handwritten Chinese medicine prescriptions. – for Monica – 2018-12-13T13:51:53.637

Can you take input in the form of a char array? – Embodiment of Ignorance – 2018-12-13T16:46:23.530

@EmbodimentofIgnorance Yes. Well, enough people are taking string input anyway. – for Monica – 2018-12-15T07:26:34.987

Answers

2

Jelly, 35 bytes

9Ḷ;-26ż“/Ẉ8‘+⁽ȷc¤ṃ@ɓD_2ỊŒgÄFị"+⁽-FỌ

Dennis

Posted 2018-12-13T00:35:08.100

Reputation: 196 637

9

JavaScript, 81 bytes

s=>s.replace(/./g,c=>(p=14>>c&!p)|c>3?eval(`"\\u302${c}"`):'〇一二三'[c],p=0)

Using 14>>c saves 3 bytes. Thanks to Arnauld.
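For readers who don't speak golfed JavaScript, here is a rough Python transcription of the `14>>c` idea. The function name `to_suzhou_bitmask` is hypothetical, and the code assumes the input is a string of ASCII digits.

```python
# Sketch of the bitmask trick; to_suzhou_bitmask is a made-up name.
def to_suzhou_bitmask(s: str) -> str:
    horizontal = "〇一二三"
    out = []
    p = 0  # 1 if the previous digit was a 1, 2 or 3 written vertically
    for ch in s:
        c = int(ch)
        # 14 is 0b1110, so the low bit of (14 >> c) is 1 only for c in {1, 2, 3}.
        # ANDing with (1 - p) forces 0 right after a vertical 1/2/3.
        p = (14 >> c) & (1 - p)
        if p or c > 3:
            out.append(chr(0x3020 + c))  # U+3021..U+3029 are 〡..〩
        else:
            out.append(horizontal[c])    # 〇, or the horizontal form of 1-3
    return "".join(out)


print(to_suzhou_bitmask("123321"))
```

Note that 0 falls into the `horizontal` lookup as well, which is why that string starts with 〇.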

tsh

Posted 2018-12-13T00:35:08.100

Reputation: 13 072

9

R, 138 bytes

I'll bet there's an easier way to do this. Use gsub to get the alternating numeric positions.

function(x,r=-48+~x)Reduce(paste0,ifelse(58<~gsub("[123]{2}","0a",x),"123"["一二三",r],'0-9'["〇〡-〩",r]))
"~"=utf8ToInt
"["=chartr

J.Doe

Posted 2018-12-13T00:35:08.100

Reputation: 2 379

8

Retina, 46 bytes

/[1-3]{2}|./_T`d`〇〡-〩`^.
T`123`一二三

Try it online! Link includes test cases. Explanation:

/[1-3]{2}|./

Match either two digits 1-3 or any other digit.

_T`d`〇〡-〩`^.

Replace the first character of each match with its Suzhou.

T`123`一二三

Replace any remaining digits with horizontal Suzhou.

51 bytes in Retina 0.8.2:

M!`[1-3]{2}|.
mT`d`〇〡-〩`^.
T`¶123`_一二三

Try it online! Link includes test cases. Explanation:

M!`[1-3]{2}|.

Split the input into individual digits or pairs of digits if they are both 1-3.

mT`d`〇〡-〩`^.

Replace the first character of each line with its Suzhou.

T`¶123`_一二三

Join the lines back together and replace any remaining digits with horizontal Suzhou.
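The same match-then-translate idea can be sketched in Python with `re.sub`. This is only an illustration of the approach, not a port of the Retina program; `to_suzhou_regex` and the two translation tables are names invented here.

```python
import re

# Hypothetical names; a sketch of the "pair of 1-3 or single digit" matching.
VERTICAL_TABLE = str.maketrans("0123456789", "〇〡〢〣〤〥〦〧〨〩")
HORIZONTAL_TABLE = str.maketrans("123", "一二三")


def to_suzhou_regex(s: str) -> str:
    def convert(match: re.Match) -> str:
        chunk = match.group()
        # First character of each match gets the default glyph,
        # a second character (only present for pairs of 1-3) the horizontal one.
        return chunk[0].translate(VERTICAL_TABLE) + chunk[1:].translate(HORIZONTAL_TABLE)

    return re.sub(r"[1-3]{2}|.", convert, s)


print(to_suzhou_regex("1234321"))
```

Because matches never overlap, a run such as 111 is consumed as the pair 11 followed by a lone 1, which gives the required alternation.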

Neil

Posted 2018-12-13T00:35:08.100

Reputation: 95 035

7

Perl 5 `-pl -Mutf8`, 53 46 bytes

-7 bytes thanks to Grimy

s/[123]{2}|./OS&$&/ge;y//〇〡-〰一二三/c

Explanation

# Binary AND two consecutive digits 1-3 (ASCII 0x31-0x33)
# or any other single digit (ASCII 0x30-0x39) with string "OS"
# (ASCII 0x4F 0x53). This converts the first digit to 0x00-0x09
# and the second digit, if present, to 0x11-0x13.
s/[123]{2}|./OS&$&/ge;
# Translate empty complemented searchlist (0x00-0x13) to
# respective Unicode characters.
y//〇〡-〰一二三/c
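To make the masking step easier to follow, here is a Python sketch of the same arithmetic. It assumes `zip` as a stand-in for the truncating string AND, and the names `CODE_TO_GLYPH` and `to_suzhou_mask` are hypothetical.

```python
import re

# Hypothetical lookup table: masked code -> glyph.
CODE_TO_GLYPH = dict(enumerate("〇〡〢〣〤〥〦〧〨〩"))          # 0x00-0x09
CODE_TO_GLYPH.update({0x11: "一", 0x12: "二", 0x13: "三"})  # second digit of a pair


def to_suzhou_mask(s: str) -> str:
    def convert(match: re.Match) -> str:
        # 'O' (0x4F) & '0'..'9' gives 0x00..0x09;
        # 'S' (0x53) & '1'..'3' gives 0x11..0x13.
        # zip stops at the shorter string, so a single digit only meets 'O'.
        return "".join(CODE_TO_GLYPH[ord(m) & ord(d)] for m, d in zip("OS", match.group()))

    return re.sub(r"[123]{2}|.", convert, s)


print(to_suzhou_mask("123321"))
```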

nwellnhof

Posted 2018-12-13T00:35:08.100

Reputation: 10 037

-3 bytes with s/[123]\K[123]/$&^$;/ge;y/--</一二三〇〡-〩/ (TIO)

– Grimmy – 2018-12-13T14:17:30.350

49: s/[123]{2}/$&^v0.28/ge;y/--</一二三〇〡-〩/ (TIO).

48: s/[123]{2}/$&^"\0\34"/ge;y/--</一二三〇〡-〩/ (requires using literal control characters instead of \0\34, idk how to do this on TIO)

– Grimmy – 2018-12-13T14:28:57.473

46: s/[123]{2}|./OS&$&/ge;y//〇〡-〰一二三/c (TIO)

– Grimmy – 2018-12-13T15:01:31.953

6

Java (JDK), 120 bytes

s->{for(int i=0,p=0,c;i<s.length;)s[i]+=(p>0&p<4&(c=s[i++]-48)>0&c<4)?"A䷏乚䷖".charAt(c+(p=0)):(p=c)<1?12247:12272;}

Credits

-3 bytes thanks to Kevin Cruijssen

Olivier Grégoire

Posted 2018-12-13T00:35:08.100

Reputation: 10 647

c=s[i]-48;if(p>0&p<4&c>0&c<4) can be if(p>0&p<4&(c=s[i]-48)>0&c<4), and then you can also drop the brackets around the loop. Also, else{p=c;s[i]+=c<1?12247:12272;} can be else s[i]+=(p=c)<1?12247:12272; – Kevin Cruijssen – 2018-12-13T10:27:45.520

@KevinCruijssen Thank you! I was still golfing this answer, but it helped me nonetheless ^^ Now I think I'm done golfing it. – Olivier Grégoire – 2018-12-13T10:50:25.950

5

JavaScript (ES6), 95 89 88 bytes

Saved 6 bytes thanks to @ShieruAsakoto

Takes input as a string.

s=>s.replace(i=/./g,c=>'三二一〇〡〢〣〤〥〦〧〨〩'[i=112>>i&c<4?3-c:+c+3])

Arnauld

Posted 2018-12-13T00:35:08.100

Reputation: 111 334

89 bytes – Shieru Asakoto – 2018-12-13T02:34:44.367

@ShieruAsakoto That's much better! Thanks a lot! – Arnauld – 2018-12-13T02:56:25.257

5

Python 3, 102 bytes

f=0
for i in input():f=i in'123'and 9-f;print(end='〇一二三〤〥〦〧〨〩〡〢〣'[int(i)+f])

mypetlion reminded me of a trivial golf. -4 bytes.

Erik the Outgolfer

Posted 2018-12-13T00:35:08.100

Reputation: 38 134

3

Clean, 181 165 bytes

All octal escapes can be replaced by the equivalent single-byte characters (and are counted as one byte each), but they are used here for readability and because the raw bytes would otherwise break TIO and SE with invalid UTF-8.

import StdEnv
u=map\c={'\343','\200',c}
?s=((!!)["〇":s++u['\244\245\246\247\250']])o digitToInt
$[]=[]
$[h:t]=[?(u['\241\242\243'])h:if(h-'1'<'\003')f$t]
f[]=[]
f[h:t]=[?["一","二","三"]h: $t]

An encoding-unaware compiler is both a blessing and a curse.

Οurous

Posted 2018-12-13T00:35:08.100

Reputation: 7 916

2

Perl 6 `-p`, 85 61 bytes

-13 bytes thanks to Jo King

s:g[(1|2|3)<((1|2|3)]=chr $/+57;tr/0..</〇〡..〩一二三/

nwellnhof

Posted 2018-12-13T00:35:08.100

Reputation: 10 037

2

Red, 198 171 bytes

func[n][s: charset"〡〢〣"forall n[n/1: either n/1 >#"0"[to-char 12272 + n/1][#"〇"]]parse
n[any[[s change copy t s(pick"一二三"do(to-char t)- 12320)fail]| skip]]n]

Galen Ivanov

Posted 2018-12-13T00:35:08.100

Reputation: 13 815

2

Jelly, 38 bytes

9Rż“øƓ“œ%“øƈ’;-25+⁽-EỌœị@DżD<4«Ɗ‘×¥\ƊƊ

Erik the Outgolfer

Posted 2018-12-13T00:35:08.100

Reputation: 38 134

2

C, 131 bytes

f(char*n){char*s="〇〡〢〣〤〥〦〧〨〩一二三",i=0,f=0,c,d;do{c=n[i++]-48;d=n[i]-48;printf("%.3s",s+c*3+f);f=c*d&&(c|d)<4&&!f?27:0;}while(n[i]);}

Explanation: First of all - I'm using char for all variables to make it short.

Array s holds all needed Suzhou characters.

The rest is pretty much iterating over the provided number, which is expressed as a string.

When writing to the terminal, I'm using the input number value (so the character - 48 in ASCII), multiplied by 3, because all these characters are 3 bytes long in UTF-8. The 'string' being printed is always 3 bytes long - so one real character.

Variables c and d are just 'shortcuts' to current and next input character(number).

Variable f holds 0 or 27 - it says if the next 1/2/3 character should be shifted to alternative one - 27 is the offset between regular and alternative character in the array.

f=c*d&&(c|d)<4&&!f?27:0 - write 27 to f if c*d != 0, both c and d are < 4, and f is currently 0; otherwise write 0.

Could be rewritten as:

if (c && d && c < 4 && d < 4 && f == 0)
    f = 27;
else
    f = 0;

Maybe there are some bytes to shave off, but I'm no longer able to find anything obvious.
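The byte-offset bookkeeping may be easier to see in a higher-level sketch. The Python below mirrors the variables c, d and f from the explanation, but it is an illustration rather than a translation: `to_suzhou_bytes` is a made-up name, and the "no next digit" case is treated as 0 here instead of reading the string terminator.

```python
# 13 glyphs, each 3 bytes in UTF-8; 一 starts 27 bytes after 〡.
TABLE = "〇〡〢〣〤〥〦〧〨〩一二三".encode("utf-8")


def to_suzhou_bytes(s: str) -> str:
    out = bytearray()
    f = 0  # 0 or 27: byte offset to the alternative glyph for the current digit
    for i, ch in enumerate(s):
        c = int(ch)                                  # current digit
        d = int(s[i + 1]) if i + 1 < len(s) else 0   # next digit (0 at the end: assumption)
        start = 3 * c + f
        out += TABLE[start:start + 3]                # copy exactly one 3-byte glyph
        # Shift the next digit only if both are in 1-3 and this one was not shifted.
        f = 27 if (1 <= c <= 3 and 1 <= d <= 3 and f == 0) else 0
    return out.decode("utf-8")


print(to_suzhou_bytes("123321"))
```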

Michał Stoń

Posted 2018-12-13T00:35:08.100

Reputation: 21

120 bytes. – Jonathan Frech – 2018-12-14T17:58:40.823

1

Ruby `-p`, 71 bytes

$_=gsub(/[1-3]\K[1-3]/){|x|(x.ord+9).chr}.tr"0-<","〇〡-〩一二三"

Kirill L.

Posted 2018-12-13T00:35:08.100

Reputation: 6 693

1

K (ngn/k), 67 bytes

{,/(0N 3#"〇一二三〤〥〦〧〨〩〡〢〣")x+9*<\x&x<4}@10\

10\ get list of decimal digits

{ }@ apply the following function

x&x<4 boolean (0/1) list of where the argument is less than 4 and non-zero

<\ scan with less-than. this turns runs of consecutive 1s into alternating 1s and 0s

x+9* multiply by 9 and add x

juxtaposition is indexing, so use this as indices in...

0N 3#"〇一二三〤〥〦〧〨〩〡〢〣" the given string, split into a list of 3-byte strings. k is not unicode aware, so it sees only bytes

,/ concatenate
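The less-than scan is the interesting part, so here is a hedged Python sketch of it using `itertools.accumulate`. The names `GLYPHS` and `to_suzhou_scan` are invented for this example, and the string is indexed by character rather than by 3-byte chunks.

```python
from itertools import accumulate

GLYPHS = "〇一二三〤〥〦〧〨〩〡〢〣"  # indices 10-12 hold the vertical 1, 2, 3


def to_suzhou_scan(n: int) -> str:
    digits = [int(ch) for ch in str(n)]
    # 1 where the digit is 1, 2 or 3, else 0 (the role of x&x<4).
    small = [int(0 < d < 4) for d in digits]
    # Scan with less-than: each result is (previous result < current flag),
    # which turns runs of 1s into 1, 0, 1, 0, ...
    vertical = list(accumulate(small, lambda acc, x: int(acc < x)))
    return "".join(GLYPHS[d + 9 * v] for d, v in zip(digits, vertical))


print(to_suzhou_scan(1234321))
```

For 1234321 the flags are 1 1 1 0 1 1 1 and the scan yields 1 0 1 0 1 0 1, so every other small digit is pushed 9 places along the string to its vertical form.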

ngn

Posted 2018-12-13T00:35:08.100

Reputation: 11 449

1

Wolfram Language (Mathematica), 117 bytes

FromCharacterCode[12320+(IntegerDigits@#/. 0->-25//.MapIndexed[{a___,c=#2[[1]],c,b___}->{a,c,#,b}&,{0,140,9}+7648])]&

Note that on TIO this outputs the result in escaped form. In the normal Wolfram front end, the characters are displayed directly.

Kelly Lowder

Posted 2018-12-13T00:35:08.100

Reputation: 3 225

Can you implement horizontal stroke notation for twos and threes? E.g. f[123] should return 〡二〣. – for Monica – 2018-12-15T07:35:53.320

1

Japt, 55 bytes

s"〇〡〢〣〤〥〦〧〨〩"
ð"[〡〢〣]" óÈ¥YÉÃ®ë2,1Ãc
£VøY ?Xd"〡一〢二〣三":X

It's worth noting that TIO gives a different byte count than my preferred interpreter, but I see no reason not to trust the one that gives me a lower score.

Explanation:

    Step 1:
s"〇〡〢〣〤〥〦〧〨〩"        Convert the input number to a string using these characters for digits

    Step 2:
ð                            Find all indexes which match this regex:
 "[〡〢〣]"                    A 1, 2, or 3 character
           ó    Ã            Split the list between:
            È¥YÉ              Non-consecutive numbers
                  ®    Ã     For each group of consecutive [1,2,3] characters:
                   ë2,1      Get every-other one starting with the second
                        c    Flatten

    Step 3:
£                              For each character from step 1:
 VøY                           Check if its index is in the list from step 2
     ?                         If it is:
      Xd"〡一〢二〣三"            Replace it with the horizontal version
                     :X        Otherwise leave it as-is
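The three steps above can be sketched in Python roughly as follows. This is a loose illustration of the index-grouping idea, not a transliteration of the Japt program; `to_suzhou_groups` and the helper names are hypothetical.

```python
def to_suzhou_groups(n: int) -> str:
    vertical = "〇〡〢〣〤〥〦〧〨〩"
    swap = {"〡": "一", "〢": "二", "〣": "三"}

    # Step 1: write every digit with its default glyph.
    chars = [vertical[int(ch)] for ch in str(n)]

    # Step 2: collect indexes of 1/2/3 glyphs and split them into consecutive runs.
    groups = []
    for i, ch in enumerate(chars):
        if ch in swap:
            if groups and groups[-1][-1] == i - 1:
                groups[-1].append(i)
            else:
                groups.append([i])
    # Every other index of each run, starting with the second one.
    flip = {i for group in groups for i in group[1::2]}

    # Step 3: replace the selected positions with the horizontal version.
    return "".join(swap[ch] if i in flip else ch for i, ch in enumerate(chars))


print(to_suzhou_groups(123321))
```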

Kamil Drakari

Posted 2018-12-13T00:35:08.100

Reputation: 3 461

1

C# (.NET Core), 107 bytes, 81 chars

n=>{var t="〇一二三〤〥〦〧〨〩〡〢〣";var b=0;return n.Select(k=>t[k+(b+=k>0&k<4?1:b)%2*9]);}

Saved 17 bytes thanks to @Jo King

Old Answer

C# (.NET Core), 124 bytes, 98 chars

n=>{var t="〇一二三〤〥〦〧〨〩〡〢〣";var b=0<1;return n.Select(k=>{b=k>0&k<4?!b:0<1;return b?t[k]:t[k+9];});}

Takes input in the form of a List, and returns an IEnumerable. I don't know if this input/output is ok, so just let me know if it isn't.

Explanation

Each integer k is used as an index into t, whose first ten entries hold 〇, the horizontal forms of 1-3, and the numerals for 4-9, and whose last three entries hold the vertical forms of 1-3. b is inverted whenever we meet an integer that is one, two, or three, and set to true otherwise. If b is true, we take t[k]; if b is false, we take t[k+9], which turns the integer into one of the vertical numerals.
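As a rough illustration of the flag logic in the old answer, here is a Python sketch. It assumes the input is a list of integer digits, as in the C# version, and `to_suzhou_toggle` is a name made up for this example.

```python
def to_suzhou_toggle(digits):
    t = "〇一二三〤〥〦〧〨〩〡〢〣"
    b = True  # plays the role of the C# variable b
    out = []
    for k in digits:
        # Invert b on 1, 2 or 3; reset it to True on any other digit.
        b = (not b) if 0 < k < 4 else True
        # b true: plain lookup; b false: 9 places further, i.e. the vertical form.
        out.append(t[k] if b else t[k + 9])
    return "".join(out)


print(to_suzhou_toggle([1, 1, 1]))
```

Since b starts out true, the first 1, 2 or 3 of a run flips it to false and is therefore written vertically.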

Embodiment of Ignorance

Posted 2018-12-13T00:35:08.100

Reputation: 7 014

0

C#, 153 bytes

n=>Regex.Replace(n+"",@"[4-90]|[1-3]{1,2}",x=>"〇〡〢〣〤〥〦〧〨〩"[x.Value[0]-'0']+""+(x.Value.Length>1?"一二三"[x.Value[1]-'0'-1]+"":""))

zruF

Posted 2018-12-13T00:35:08.100

Reputation: 59

This is 153 bytes, by the way, characters don't always mean bytes. Some characters are worth multiple bytes. – Embodiment of Ignorance – 2018-12-15T04:21:45.930

Oh well, I edited my answer. Thanks for the information :) – zruF – 2018-12-17T08:02:05.930

0

R, 104 bytes

function(x,`[`=chartr)"a-jBCD"["〇〡-〩一二三",gsub("[bcd]\\K([bcd])","\\U\\1","0-9"["a-j",x],,T)]

An alternative approach in R. Makes use of some Perl-style Regex features (the last T param in substitution function stands for perl=TRUE).

First, we translate the digits to the alphabetic characters a-j, then use Regex substitution to convert the second letter of each consecutive pair of b, c, d (formerly 1, 2, 3) to uppercase, and finally translate the characters to Suzhou numerals, with different handling of lowercase and uppercase letters.

Credit to J.Doe for the preparation of test cases, as these were taken from his answer.

Kirill L.

Posted 2018-12-13T00:35:08.100

Reputation: 6 693

Asked: 2018-12-13T00:35:08.100

Viewed: 2 375 times

Active: 2018-12-17T08:01:33.643