Pretty soon it's going to be 50 years since IBM unveiled its System/360 family of computers. These were the first to use the EBCDIC character set.
To mark the occasion, let's see who can write the shortest program capable of converting "ordinary" text to and from EBCDIC code page 037. We'll be using a translation table from Wikipedia that maps CP037 to a superset of ISO-8859-1:
EBCDIC037_to_Latin1 = [
0x00,0x01,0x02,0x03,0x9c,0x09,0x86,0x7f,0x97,0x8d,0x8e,0x0b,0x0c,0x0d,0x0e,0x0f,
0x10,0x11,0x12,0x13,0x9d,0x85,0x08,0x87,0x18,0x19,0x92,0x8f,0x1c,0x1d,0x1e,0x1f,
0x80,0x81,0x82,0x83,0x84,0x0a,0x17,0x1b,0x88,0x89,0x8a,0x8b,0x8c,0x05,0x06,0x07,
0x90,0x91,0x16,0x93,0x94,0x95,0x96,0x04,0x98,0x99,0x9a,0x9b,0x14,0x15,0x9e,0x1a,
0x20,0xa0,0xe2,0xe4,0xe0,0xe1,0xe3,0xe5,0xe7,0xf1,0xa2,0x2e,0x3c,0x28,0x2b,0x7c,
0x26,0xe9,0xea,0xeb,0xe8,0xed,0xee,0xef,0xec,0xdf,0x21,0x24,0x2a,0x29,0x3b,0xac,
0x2d,0x2f,0xc2,0xc4,0xc0,0xc1,0xc3,0xc5,0xc7,0xd1,0xa6,0x2c,0x25,0x5f,0x3e,0x3f,
0xf8,0xc9,0xca,0xcb,0xc8,0xcd,0xce,0xcf,0xcc,0x60,0x3a,0x23,0x40,0x27,0x3d,0x22,
0xd8,0x61,0x62,0x63,0x64,0x65,0x66,0x67,0x68,0x69,0xab,0xbb,0xf0,0xfd,0xfe,0xb1,
0xb0,0x6a,0x6b,0x6c,0x6d,0x6e,0x6f,0x70,0x71,0x72,0xaa,0xba,0xe6,0xb8,0xc6,0xa4,
0xb5,0x7e,0x73,0x74,0x75,0x76,0x77,0x78,0x79,0x7a,0xa1,0xbf,0xd0,0xdd,0xde,0xae,
0x5e,0xa3,0xa5,0xb7,0xa9,0xa7,0xb6,0xbc,0xbd,0xbe,0x5b,0x5d,0xaf,0xa8,0xb4,0xd7,
0x7b,0x41,0x42,0x43,0x44,0x45,0x46,0x47,0x48,0x49,0xad,0xf4,0xf6,0xf2,0xf3,0xf5,
0x7d,0x4a,0x4b,0x4c,0x4d,0x4e,0x4f,0x50,0x51,0x52,0xb9,0xfb,0xfc,0xf9,0xfa,0xff,
0x5c,0xf7,0x53,0x54,0x55,0x56,0x57,0x58,0x59,0x5a,0xb2,0xd4,0xd6,0xd2,0xd3,0xd5,
0x30,0x31,0x32,0x33,0x34,0x35,0x36,0x37,0x38,0x39,0xb3,0xdb,0xdc,0xd9,0xda,0x9f];
Rules:
Your program should take two inputs: (a) a text string, and (b) a flag indicating the operation to be performed.
Based on this flag, your program should either convert each byte of text into the corresponding EBCDIC character, or vice versa.
Input can be obtained from any sensible sources (e.g., command line arguments, stdin, keyboard input), but must not be hard-coded into your program.
Output should be displayed on the screen (e.g., stdout, document.write) or written to a file/pipeline.
Don't use any built-in or external encoding conversion functions (iconv, etc.).
This is a code-golf challenge, so the shortest answer (fewest bytes) will win.
Examples:
(Note: These examples were produced in a terminal configured to use UTF-8 encoding. You may see different results depending on how your system is configured. Hex equivalents are shown for reference only, and don't have to be generated by your code.)
Input: "HELLO WORLD", convert to EBCDIC
Output: "ÈÅÓÓÖ@æÖÙÓÄ" (0xc8c5d3d3d640e6d6d9d3c4)
Input: "ÈÅÓÓÖ@æÖÙÓÄ", convert from EBCDIC
Output: "HELLO WORLD"
Input: "lower case mostly ends up as gremlins", convert to EBCDIC
Output: "" <-- unprintable in UTF-8
(0x9396a68599408381a285409496a2a393a840859584a240a4974081a24087998594938995a2)
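For reference, here is an ungolfed sketch of the conversion the rules describe. It is not a competitive entry, just one way to read the spec. It assumes the text is handled as raw bytes (as clarified in the comments) and that the flag is a boolean; the names `convert`, `EBCDIC037_TO_LATIN1` and `LATIN1_TO_EBCDIC037` are made up for illustration. The reverse table is derived from the table in the question, which works because that table is a permutation of 0..255.

```python
# CP037 -> Latin-1 table, copied from the question (index = EBCDIC byte).
EBCDIC037_TO_LATIN1 = bytes([
    0x00,0x01,0x02,0x03,0x9c,0x09,0x86,0x7f,0x97,0x8d,0x8e,0x0b,0x0c,0x0d,0x0e,0x0f,
    0x10,0x11,0x12,0x13,0x9d,0x85,0x08,0x87,0x18,0x19,0x92,0x8f,0x1c,0x1d,0x1e,0x1f,
    0x80,0x81,0x82,0x83,0x84,0x0a,0x17,0x1b,0x88,0x89,0x8a,0x8b,0x8c,0x05,0x06,0x07,
    0x90,0x91,0x16,0x93,0x94,0x95,0x96,0x04,0x98,0x99,0x9a,0x9b,0x14,0x15,0x9e,0x1a,
    0x20,0xa0,0xe2,0xe4,0xe0,0xe1,0xe3,0xe5,0xe7,0xf1,0xa2,0x2e,0x3c,0x28,0x2b,0x7c,
    0x26,0xe9,0xea,0xeb,0xe8,0xed,0xee,0xef,0xec,0xdf,0x21,0x24,0x2a,0x29,0x3b,0xac,
    0x2d,0x2f,0xc2,0xc4,0xc0,0xc1,0xc3,0xc5,0xc7,0xd1,0xa6,0x2c,0x25,0x5f,0x3e,0x3f,
    0xf8,0xc9,0xca,0xcb,0xc8,0xcd,0xce,0xcf,0xcc,0x60,0x3a,0x23,0x40,0x27,0x3d,0x22,
    0xd8,0x61,0x62,0x63,0x64,0x65,0x66,0x67,0x68,0x69,0xab,0xbb,0xf0,0xfd,0xfe,0xb1,
    0xb0,0x6a,0x6b,0x6c,0x6d,0x6e,0x6f,0x70,0x71,0x72,0xaa,0xba,0xe6,0xb8,0xc6,0xa4,
    0xb5,0x7e,0x73,0x74,0x75,0x76,0x77,0x78,0x79,0x7a,0xa1,0xbf,0xd0,0xdd,0xde,0xae,
    0x5e,0xa3,0xa5,0xb7,0xa9,0xa7,0xb6,0xbc,0xbd,0xbe,0x5b,0x5d,0xaf,0xa8,0xb4,0xd7,
    0x7b,0x41,0x42,0x43,0x44,0x45,0x46,0x47,0x48,0x49,0xad,0xf4,0xf6,0xf2,0xf3,0xf5,
    0x7d,0x4a,0x4b,0x4c,0x4d,0x4e,0x4f,0x50,0x51,0x52,0xb9,0xfb,0xfc,0xf9,0xfa,0xff,
    0x5c,0xf7,0x53,0x54,0x55,0x56,0x57,0x58,0x59,0x5a,0xb2,0xd4,0xd6,0xd2,0xd3,0xd5,
    0x30,0x31,0x32,0x33,0x34,0x35,0x36,0x37,0x38,0x39,0xb3,0xdb,0xdc,0xd9,0xda,0x9f,
])

# Reverse table: for each Latin-1 byte value, find its position in the
# forward table. Every value 0..255 occurs exactly once, so index() is safe.
LATIN1_TO_EBCDIC037 = bytes(EBCDIC037_TO_LATIN1.index(b) for b in range(256))


def convert(data: bytes, to_ebcdic: bool) -> bytes:
    """Translate each byte through the table selected by the flag."""
    table = LATIN1_TO_EBCDIC037 if to_ebcdic else EBCDIC037_TO_LATIN1
    return bytes(table[b] for b in data)


if __name__ == "__main__":
    encoded = convert(b"HELLO WORLD", to_ebcdic=True)
    print(encoded.hex())                      # c8c5d3d3d640e6d6d9d3c4
    print(convert(encoded, to_ebcdic=False))  # b'HELLO WORLD'
```

Working on bytes rather than decoded strings sidesteps the terminal-encoding caveat in the note above: the hex values match regardless of how the console renders them.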
#5 means for example I can't have a base64-encoded string and do s.decode('base64') to get my look-up table? – Claudiu – 2014-03-26T21:59:37.467
What is "ordinary" text? ASCII? UTF-8? A native String type? – intx13 – 2014-03-26T22:01:30.357
Are we converting control codes as well? Or just printable characters? If so, by what rules? – intx13 – 2014-03-26T22:04:30.757
@intx13, the translation table is in the question. – Peter Taylor – 2014-03-26T22:07:16.207
@Claudiu That would be absolutely fine – r3mainer – 2014-03-26T22:12:40.697
@intx13 Yes, convert control codes too. (By "ordinary" I was referring to character sets that have the alphanumeric characters in the same places as ASCII, but just stick to the translation table given and treat the input and output as binary data.) – r3mainer – 2014-03-26T22:15:19.827
@Peter, Then can we explicitly represent the input and output as byte arrays? Or do they have to be strings in the native environment (UTF-8 for bash, UTF-16 for C#'s console, etc.) – intx13 – 2014-03-26T22:16:15.713