9 10 12 13 16 18 21 23 languages, 1000 bytes
- Bash (prints "duizend", Dutch)
- Zsh (prints "een Dausend", Luxembourgish)
- Dash (and other true POSIX shells) (prints "nje mije", Albanian)
- C (prints "et tusind", Danish)
- C++ (prints "ib txhiab", Hmong)
- brainfuck (prints "mille", Italian)
- boolfuck (prints "afe", Samoan)
- Brainlove (prints "mil", Spanish)
- 2DFuck (prints "sewu", Javanese)
- Python 1 (prints "one thousand", English)
- Python 2 (prints "eintausend", German)
- Python 3 (prints "hal kun", Somali)
- Asar (prints "tuhat", Estonian)
- ><> (prints "jedna hiljada", Bosnian)
- Gol><> (prints "eenduisend", Afrikaans)
- Befunge-98 (prints "o mie", Romanian)
- Befunge-96 (prints "otu puku", Igbo)
- Labyrinth (prints "elf", Maltese)
- Hexagony (prints "ett tusen", Swedish)
- GNU Make (prints "isang libo", Filipino)
- A Pear Tree (prints "hezar", Kurdish)
- Octave (prints "tūkstantis", Lithuanian)
- Whispers (prints "seribu", Malay)
#ifdef warnings //[[^
#_ if 0
#{
#(________ )
'''''echo' -n<<"'''+r'''":
@echo "isang libo"
define N #
1_801_201_
0
1...@
endif
print "tuhat"
macro _()
#}
disp("t\xc5\xabkstantis")
quit
#{
> "seribu"
>> Output 1
'''+r''':
[ -n "$ZSH_VERSION" ]&&echo een Dausend||([ -n "$BASH_VERSION" ]&&echo duizend||echo nje mije)
:<<"endmacro;\"^*///]\"\"\"#/t#"
#endif
#include<stdio.h>
int main(){puts(sizeof('1')-1?"et tusind":"ib txhiab");}
/*
print"hezar";exit;<<'/*';#NJ@G@GMCI
1234567890123456789012345678901234567890123|u;ts;6ie$;;tn;e@$<
]+[+++[[<+>->+++++>+<<]+>]<<<.<.+++.(.<.)>>+;+;;;;+;;+;;+;;+;;+;;+;+;+;+;+;;+;;+;[-]]![!.!...!..!..!.!..!..!.!.!.!.!...!.!...!.!...!.!.!.!.!][
'''
print(["hal kun","eintausend","one thousand"][(type(1==1)==type(1))+(3/2==1)])
r""""
01234567891234567890123456789012 1
>"ukup uto">:#,_@
H"eenduisend"\
>o<"jedna hiljada"~\ u
#}
endef # >"eim o"4k,@ /"`"/"/"\
endmacro;"^*///]"""#/t#
There should be a hard tab at the start of line 6.
Try it online! (use the switch languages button to try them all)
Length is exactly 1000 ASCII characters, and also 1000 bytes.
Asar is a SNES assembler that has quite sloppy input parsing and thus accepts a lot of inputs that one may consider invalid. I used version 1.71 for this, downloadable here, and depended on quite a few bugs, so there is a chance this will break in later versions.
All of the other languages are well-known enough to be on TIO.
I used Google Translate for the translations, so they may not be perfectly grammatically accurate. If anyone knows some of the languages better, please suggest improvements, I've got some wiggle room here :)
Credits: @JoKing for some minor optimizations regarding brainfuck, Befunge, (Gol)><>, Hexagony, and Python. @flawr for suggesting and helping with Octave. @caird coinheringaahing for suggesting Whispers.
Fun fact: I made most of this polyglot months ago. It included the first 10 languages mentioned here. (Edit: I've since shuffled the language list around a bit so it makes more sense, but check the 2nd oldest version of this polyglot in the edit history and add Hexagony to get that list.)
Explanation (slightly outdated)
This explanation was made for an older revision. It's hard enough keeping track of all the natural language shuffling (to make more difficult programming languages use shorter natural language words) and Hexagony's constant grid shifting, let alone updating the explanation every single revision. I'm leaving this here since it still gives a good overview of what path most of these languages take.
Asar
Asar is a SNES assembler written by Alcaro in 2011, intended to be backwards-compatible with xkas (written by byuu in around 2006). It doesn't have the nicest code. The first line begins with a special kind of label definition; the label is called ifdef. Then follows a warnings command, which is supposed to control when warnings are thrown. Usually it's followed by either push or pull, but it doesn't reject other words following it. So that's the first line: a label definition followed by a no-op. The 2nd line is a similar label definition (this time the label is called simply _) followed by an if statement. The condition is 0, so the contents of the if are ignored. This means we can pretty much go wild inside it; the only rule we have to follow is that all double quotes must be paired. The end of that if statement is at the endif on line 10. After that comes a simple print statement which prints the output. Then comes a macro declaration. Declaring a macro that isn't used is pretty much equivalent to an if 0. The end of the macro declaration is on the very last line. In Asar, comments are started with ;, so the rest of the last line is ignored.
Bash, Zsh, Dash
The first 3 lines are just regular comments. The 4th line is equivalent to echo -n <<"'''+r''':". This is an echo without a linebreak following it. Then we have a heredoc that ends at the line '''+r''': (the trailing colon is part of the delimiter). The contents of the heredoc are piped to the stdin of that echo command, but that doesn't matter since echo ignores its stdin. After the end of the heredoc on line 14, we have regular shell code that checks whether the variables for either zsh or bash are defined and chooses what to print based on that. After that, we have the null command :, which does nothing, into which we pipe the rest of the script using another heredoc. The heredoc ends with endmacro;"^*///]"""#/t#, which just happens to be the very last line of the script.
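The branch selection on the shell line can be mimicked outside a shell. This is only a rough Python sketch, not the author's code: the function name shell_output and the dict standing in for the environment are made up for illustration; only the variable names ZSH_VERSION and BASH_VERSION and the three output strings come from the script.

```python
def shell_output(env):
    """Mirror the &&/|| chain: zsh is tested first, then bash, else a plain POSIX shell."""
    if env.get("ZSH_VERSION"):       # [ -n "$ZSH_VERSION" ]
        return "een Dausend"
    if env.get("BASH_VERSION"):      # [ -n "$BASH_VERSION" ]
        return "duizend"
    return "nje mije"                # neither variable is set, e.g. dash


# The version strings below are placeholders; only their non-emptiness matters.
print(shell_output({"BASH_VERSION": "x"}))
print(shell_output({"ZSH_VERSION": "x"}))
print(shell_output({}))
```

An empty string counts as "not set" here, which matches what the -n test does in the script.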
C, C++
The first line is a preprocessor directive which checks whether the macro warnings is defined. There's also a comment after the directive. warnings isn't defined, so the contents of the block are skipped. There are a bunch more garbage preprocessor directives inside the skipped branch, but they are ignored too. The #endif is on line 18. After that we have a standard C program. It checks whether the size of a character literal is greater than 1 or not. If it's greater, we are in C, because in C character literals are actually of int type, which has a size of at least 2 bytes on literally every platform in use today. In C++, however, the type of a character literal is char, which is defined to have a size of 1, so a different branch is taken.
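To make the size test concrete without a C compiler, here is a small Python sketch that borrows the platform's type sizes from ctypes. The helper name pick is invented for this example, and using ctypes.c_int and ctypes.c_char as stand-ins for "the type of '1' in C" and "the type of '1' in C++" is an assumption of the sketch, not something the polyglot itself does.

```python
import ctypes


def pick(literal_size):
    # Same shape as the C expression: sizeof('1')-1 ? "et tusind" : "ib txhiab"
    return "et tusind" if literal_size - 1 else "ib txhiab"


c_size = ctypes.sizeof(ctypes.c_int)      # a character literal has type int in C
cpp_size = ctypes.sizeof(ctypes.c_char)   # a character literal has type char in C++

print("C   ->", pick(c_size))
print("C++ ->", pick(cpp_size))
```

Subtracting 1 turns "size is exactly 1" into a zero, so the ternary needs no explicit comparison, which is presumably why the golfed code is written that way.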
The rest of the code is a single comment, started with /* at line 21. It's terminated in the middle of the last line, but followed immediately by a single line comment.
Brainfuck
On the first line, we have 2 open brackets. This is because one of them is eaten by the heredoc that ends the shell part, which is before the brainfuck part. There are some balanced brackets within the skipped part too, but they don't matter because they are balanced. The next closing bracket is on line 21, which is also the start of the brainfuck code. I used this text-to-brainfuck converter, because somehow it generated even better output than BF-Crunch. Then follows the boolfuck code, which is mostly ignored due to ; not being defined. The increments don't matter either; we just switch to the next cell, which has never been written to, so the next open bracket is a guaranteed jump. The next closing bracket after that is on the very last line, and there are no brainfuck instructions after it.
Boolfuck
The boolfuck control flow is quite similar. However, once the brainfuck code is reached, most of it is ignored as - isn't defined in boolfuck. The area with + and ; is where the output is actually printed. It just prints the string "mil" bit by bit, with + negating the current bit and ; outputting it. The rest of the control flow is identical to brainfuck.
Python 1, 2, 3
The first 3 lines are regular comments. The 4th line starts with a triple-quoted string literal, delimited by single quotes. This string literal is often closed and reopened because I needed to get a backslash in somewhere (I've forgotten where), so I needed the string literal to be raw, but I couldn't put the r before the first quotes. Anyways, the first place where the string literal is closed and not immediately followed by another string literal is on line 23, which is the start of the Python code. This code uses a few simple tests to determine which version of Python it's running on: in Python 1.6, there was no separate boolean type, so the results of comparisons were just ints. The first condition checks that. The second condition checks whether division rounds down or not. In Python 2, division on integers rounded the result down, whereas in Python 3, it results in a floating point answer. After the Python code, we have another raw string literal that lasts until the last line, where it is immediately followed by a comment.
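The two tests can be pulled apart to see how they add up to a list index. The sketch below uses the same expressions as the polyglot; the variable names are mine, and the comments describe what I would expect on each major version, with only the Python 3 row being something this snippet can confirm when run.

```python
WORDS = ["hal kun", "eintausend", "one thousand"]

# True only where comparisons return a plain int (no separate bool type).
no_bool_type = type(1 == 1) == type(1)
# True only where / on two ints rounds down.
floor_division = 3 / 2 == 1

# Booleans add as 0/1, so the sum is 0, 1 or 2:
#   Python 3: False + False = 0
#   Python 2: False + True  = 1
#   Python 1: True  + True  = 2
index = no_bool_type + floor_division
print(index, WORDS[index])
```

Ordering the list from newest to oldest version is what lets a plain sum serve as the index, since each older version passes one more test than the next.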
><>, Gol><>
Now we get to the 2D languages. Gol><> is almost perfectly backwards compatible with fish, so they start execution the same way. In fish, execution starts from the north west ("top left") corner of the program, with the instruction pointer pointing east. In fish, # is a mirror which reflects the instruction pointer back in the direction it came from, so the # causes the IP to point west instead. In fish, program space is topologically a torus, so going off the western edge results in reappearing at the eastern edge of the program. There it encounters ^, which changes the instruction pointer's direction to north. Execution then continues from the very bottom of the script. The IP goes right past the very last character of the program and hits the / on line 31. The / reflects the IP like a mirror, so it'll go east again. Here is where we determine if we are running fish or golfish. In golfish, quotes inside string literals can be escaped with `. So in golfish, the string literal is only terminated on the 3rd quote on that line. In regular fish, however, the backtick escape doesn't exist, so the string literal ends after the 2nd quote. (The 4th quote is needed to keep Asar happy about paired quotes.) Then, both IPs are reflected upwards. The fish one goes west on line 30 and follows a pretty simple text printing program. The golfish one goes west on line 29, after switching to a different stack with the u in its path on line 30. There, it follows a very simple string printing code too (just push the string on the stack and call H, which exits and dumps the whole stack as ASCII text).
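The fish/golfish split comes down to where the string literal closes. A minimal Python sketch of that scan follows; the function name closing_quote and the backtick_escapes flag are invented here, and the segment is the run of characters that follows the reflecting / in the current listing.

```python
def closing_quote(segment, backtick_escapes):
    """Return the index of the quote that closes the literal opened at segment[0]."""
    i = 1
    while i < len(segment):
        if backtick_escapes and segment[i] == "`":
            i += 2               # skip the backtick and the character it escapes
            continue
        if segment[i] == '"':
            return i
        i += 1
    return -1                    # no closing quote found


SEGMENT = '"`"/"/"\\'            # the characters  " ` " / " / " \

fish_end = closing_quote(SEGMENT, backtick_escapes=False)
golfish_end = closing_quote(SEGMENT, backtick_escapes=True)
print(fish_end, golfish_end)     # 2 4
print(SEGMENT[fish_end + 1], SEGMENT[golfish_end + 1])
```

In both cases the character right after the closing quote is a / mirror, which fits the description of both IPs being reflected upwards, just two columns apart.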
Befunge-98
In Befunge-98, execution starts out in the north west ("top left") corner of program space, pointing east. # is a trampoline command, it skips the next instruction. The letters a-f just push their hex value on the stack. Spaces are no-ops. w pops 2 items from the stack, compares them, and turns left or right depending on which one is larger. Here, f is larger than e (15 is larger than 14), so we turn left, with the IP pointing north. Now we jump to the bottom of the program, like in fish, and follow the path of arrows to line 31. This uses a simple string printing code with the k operator, which repeats the next instruction (print character) some number of times, 5 in this case.
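The turn taken at w can be traced by hand or with a short sketch. The Python below assumes the Funge-98 rule as I understand it (pop b, pop a, turn left if a < b, right if a > b, otherwise go straight) and assumes the stack holds the values pushed by the letters f, d, e, f that follow the skipped character on the first line; the names compare_turn, LEFT_OF and RIGHT_OF are made up.

```python
# Heading after a 90-degree left turn, with north meaning "up the page".
LEFT_OF = {"east": "north", "north": "west", "west": "south", "south": "east"}
RIGHT_OF = {new: old for old, new in LEFT_OF.items()}


def compare_turn(stack, heading):
    """Model of the 'w' instruction: pop b, then a, and turn based on a versus b."""
    b = stack.pop()
    a = stack.pop()
    if a < b:
        return LEFT_OF[heading]
    if a > b:
        return RIGHT_OF[heading]
    return heading


stack = [0xF, 0xD, 0xE, 0xF]          # pushed by f, d, e, f
print(compare_turn(stack, "east"))    # north
```

Only the top two values matter, so the earlier f and d stay on the stack untouched.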
Hexagony
This was the one language that probably cost me the most time while writing this thing. This explanation is a bit oversimplified; go look at Hexagony's README or wiki page if you want to learn more. Hexagony reformats your program into a hexagon. Then execution starts at the north-west corner. The first command, #, switches to the IP with the index of the current memory cell (there are 6 IPs, each one starting at a different corner of the hexagon). Memory is initialized to all 0's though, so this switches to the first IP, which is the one that is already executing. All alphabetic characters simply set the current memory cell to their value, so most of the rest of the 1st line is just no-ops. Then there's a mirror, /, which makes the IP point north-west (as if it got reflected off the mirror). This means the IP will wrap around to the very bottom of the hexagon. There, it happens to hit the / at the very end of the script. This is also a mirror and makes the IP go east again. Then we set the current memory cell to t and use the "switch IP" command again. Since there are only 6 IPs, the current memory cell is taken modulo 6 when switching, so it goes to the 3rd IP, which starts at the eastern corner of the hexagon. This IP starts out moving southwest and it executes some useless operations before reaching the end of line 23 in the original code. There it uses the $ operator to skip over some characters the first time around, then encounters | which turns it around and prints some more characters.
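The IP switch is plain modular arithmetic, which a two-line check makes explicit. The function name switch_ip and the constant IP_COUNT are made up for this sketch; the only facts taken from the text are that there are 6 IPs and that the cell value is reduced modulo 6.

```python
IP_COUNT = 6


def switch_ip(cell_value):
    """Index of the IP selected by '#' for a given memory cell value."""
    return cell_value % IP_COUNT


print(switch_ip(0))          # 0: the first IP, which is already running
print(switch_ip(ord("t")))   # 2: the third IP (indices start at 0)
```

The letter t has ASCII value 116, and 116 = 19 * 6 + 2, which is where the third IP comes from.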
Labyrinth
In Labyrinth, control flow is determined by the program's layout. All spaces and alphabetic characters are considered walls. The # command pushes the number of items on the stack, which is 0, to the stack. Then execution continues in the only possible direction, south. Here, the number of items on the stack is pushed again, this time pushing a 1. Now there's 2 directions to go: south again or east. Since the topmost value on the stack is positive, execution would try to go west, but that path is blocked, so it goes east instead. _ pushes a 0. Now there's again 2 directions to go, but we came from one direction and unless nothing else is possible, the IP will never flip around and go back the direction it came from, so we continue south. Here we have a (, which decrements the top of the stack. This means the topmost item is now -1. Excluding the direction we came from, there's 3 directions to go, but since the top of the stack is negative, we go east. Here we push 0 a bunch of times, which means we continue east for as long as possible. In the end, we go south because that's the only other place to go. There we find a ', which is a no-op if debugging output isn't enabled. Now, Labyrinth treats tabs as a single space, so the character south of the ' is actually " from the start of the echo's parameters. Execution continues there. Below that, there is only one path to follow, which leads to the 4th character of line 9. There, we simply construct the ASCII values corresponding to "elf" and print them.
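The junction decisions in this walk-through follow a small rule set that can be sketched in Python. This is a simplified model written for illustration, not the interpreter's code: next_heading and the direction tables are invented names, and the case where going straight is wanted but blocked is left arbitrary because the walk-through never hits it.

```python
# Heading after a 90-degree left turn, with north meaning "up the page".
LEFT_OF = {"north": "west", "west": "south", "south": "east", "east": "north"}
RIGHT_OF = {new: old for old, new in LEFT_OF.items()}
BACK_OF = {"north": "south", "south": "north", "east": "west", "west": "east"}


def next_heading(heading, top, open_dirs):
    """Pick the next heading from the sign of the stack top and the open neighbours."""
    options = set(open_dirs) - {BACK_OF[heading]}
    if not options:
        return BACK_OF[heading]           # dead end: turning around is the only move
    if len(options) == 1:
        return next(iter(options))        # corridor or bend: just follow it
    if top == 0:
        wanted = heading                  # zero: go straight
    elif top < 0:
        wanted = LEFT_OF[heading]         # negative: turn left
    else:
        wanted = RIGHT_OF[heading]        # positive: turn right
    if wanted in options:
        return wanted
    if BACK_OF[wanted] in options:
        return BACK_OF[wanted]            # wanted side is a wall: take the other side
    return sorted(options)[0]             # not modelled; arbitrary in this sketch


# Positive top while heading south: west is wanted but walled off, so east.
print(next_heading("south", 1, {"north", "south", "east"}))
# Negative top while heading south: left of south is east.
print(next_heading("south", -1, {"north", "south", "east", "west"}))
```

Both calls print east, matching the two junctions described above.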
Make
The first 3 lines are comments. The 4th line is somehow a valid target name. (To be honest, I have no idea either about why this works.) Since this is the first target in the makefile, it's what gets executed when no argument is given to make. The rule just prints the character count. Then we define a variable called N which contains the rest of the script. It ends on the 2nd-to-last line with the endef command. Following it is a comment that ends with a backslash. This causes make to interpret the next line as part of the comment too.
A Pear Tree
A Pear Tree starts execution by finding the longest substring of the program whose CRC32 is 0. Then it shifts the whole program so that said substring is at the start of the program. This long substring with a CRC32 of 0 is print"hezar";exit;<<'/*';#NJ@G@GMCI, which just happens to be valid Perl code. No, actually I used a tool called crchack to generate some letters to add to a comment so that the string's CRC is 0. So this Perl program is shifted to the very beginning of the program. It contains a print statement, an exit statement and a heredoc. The heredoc is terminated by the line that was right before the Perl script originally, and thus is at the very end of the program now. So the heredoc skips the entire rest of the script and all of the syntax errors that would arise from not doing so.
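The shift step is just a rotation of the program text, which can be sketched in Python. Everything here is illustrative: has_zero_crc and rotate_to are made-up helpers, the demo bytes are a toy program rather than the polyglot, and the sketch does not verify the CRC of the real line because the text does not say exactly which bytes (for example, whether a trailing newline) the interpreter hashes.

```python
import zlib


def has_zero_crc(chunk: bytes) -> bool:
    """The acceptance test described above: the chunk's CRC32 must be 0."""
    return zlib.crc32(chunk) == 0


def rotate_to(program: bytes, start: int) -> bytes:
    """Rotate the program so that the byte at 'start' becomes the first byte."""
    return program[start:] + program[:start]


# Toy example: pretend the zero-CRC chunk was found at the 'print' line.
demo = b"garbage\nprint 1;exit;\nmore garbage\n"
start = demo.index(b"print")
rotated = rotate_to(demo, start)
print(rotated)
```

After the rotation, whatever used to sit before the chosen chunk ends up at the very end, which is why the line that originally preceded the Perl code can serve as the heredoc terminator.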
Do the same rules apply as in your previous challenge regarding output, or are we allowed to choose which format to output in this time? So is it mandatory to output hundred thirty four instead of one hundred and thirty-four, or are both allowed? And does this also apply to other languages (i.e. in Dutch: honderd dertig vier - which literally translates as the three words hundred thirty four, instead of honderdvierendertig - which would be the grammatically correct translation for one hundred and thirty-four)? I would allow both variations, or else the grammatically correct one. – Kevin Cruijssen – 2019-10-22T14:19:58.903
It might also be a good idea to link to your previous challenge, and copy some of its rules to here, since challenges should be self-contained. – Kevin Cruijssen – 2019-10-22T14:30:32.920
I only notice this now, but what is the win condition of the challenge? I don't see a code-challenge tag, nor any indication on the scoring? Reading the challenge I assumed the more languages the better the score, and byte-count if the amount of languages are equal (since that's usually the case), but I don't see this in the challenge description.. Right now this would be closed as "doesn't have a clear winning condition".. – Kevin Cruijssen – 2019-10-22T14:57:28.313
If a number is spelled the same in multiple languages, is that considered different for this challenge? For example, "six" in English and French. – 79037662 – 2019-10-22T15:25:39.707
In case anyone's interested, 100 is "백" in Korean, "百" in Chinese, "ひゃく" in Japanese (Hiragana, because Kanji is all the same in CJK). 1000 is "천" in Korean, "千" in Chinese, "せん" in Japanese. – Bubbler – 2019-10-23T06:37:28.327
[tag:rosetta-stone] is not a scoring criterion, though it feels like it should be. Please use [tag:code-challenge] – Jo King – 2019-10-23T10:13:29.673
Can we output trailing whitespace? – None – 2019-10-23T10:20:45.533
Is it generally accepted that different C compilers are different languages for polyglotting? – None – 2019-10-25T10:16:37.507
Do dialects count as separate natural languages? (I am not referring to programming languages in this case.) – None – 2019-10-25T10:17:57.063
Here's the rule in short: if Ethnologue says it's separate, then it's a separate language, unless the two languages use the same numerals system. If two separate numbers have at least one number spelled the same and you are using it, one point to you. – Andrew – 2019-10-25T22:34:05.037