How do I find out which language some Unicode characters belong to?

5

1

On Facebook, there are currently messages floating around with these strange chars:

ก็็็็็็็็็็็็็็็็็็็็ กิิิิิิิิิิิิิิิิิิิิ ก้้้้้้้้้้้้้้้้้้้

They are used to confuse the reader, because they break out of the designated text areas.

Do they really belong to a language? If so, which one?

magnattic

Posted 2012-02-13T13:27:13.737

Reputation: 1 176

I was tempted to put them in the title, but didn't want to break the SE design. Would it work anyway? – magnattic – 2012-02-13T13:29:54.683

They also break the preview for this post. – Ramhound – 2012-02-13T13:56:16.960

They belong to Thai. None of them are actual words. – boehj – 2012-02-18T23:10:18.743

Answers

9

They are Thai characters: each base character is followed by a long string of combining diacritic marks. You can do much the same with Latin letters, e.g. â̂̂̂̂̂̂̂̂̂̂̂̂̂̂̂̂̂̂̂̂̂̂̂̂̂̂̂̂̂̂̂̂̂̂̂̂̂̂̂̂̂̂̂̂̂̂̂̂̂̂̂̂ (which is a with circumflex followed by several combining circumflexes). Or you could use a sequence of combining horns: ư̛̛̛̛̛̛̛̛̛̛̛̛̛̛̛̛̛̛ or cedillas: ç̧̧̧̧̧̧̧̧̧̧̧̧̧̧̧̧̧̧̧̧̧̧̧̧̧̧̧̧̧̧̧. How such contrived sequences are displayed depends on the rendering software.
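To make the "base character plus a pile of combining marks" structure visible, here is a minimal Python sketch. It is not part of the original answer; the function name `count_marks` is made up for illustration. It treats any code point whose Unicode general category starts with `M` (Mn, Mc, Me) as a mark attached to the preceding character, which is a simplification of real grapheme-cluster rules.

```python
import unicodedata

def count_marks(text):
    """Return [base_char, number_of_following_marks] pairs (hypothetical helper)."""
    clusters = []
    for ch in text:
        # General categories Mn/Mc/Me are the Unicode "mark" categories.
        is_mark = unicodedata.category(ch).startswith("M")
        if is_mark and clusters:
            clusters[-1][1] += 1      # attach the mark to the previous base
        else:
            clusters.append([ch, 0])  # start a new base character
    return clusters

# KO KAI followed by 20 MAITAIKHU marks, then "a" with 3 combining circumflexes
sample = "\u0e01" + "\u0e47" * 20 + " a" + "\u0302" * 3
for base, marks in count_marks(sample):
    print(f"U+{ord(base):04X} {unicodedata.name(base, '?')}: {marks} combining mark(s)")
```

The general category is used rather than `unicodedata.combining()`, because some Thai marks (for example U+0E47) have a canonical combining class of 0 even though they are nonspacing marks.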

Jukka K. Korpela

Posted 2012-02-13T13:27:13.737

Reputation: 4 475

4

Okay, what the … did you just do to this post of yours? :P

– slhck – 2012-02-13T14:37:40.023

1

If you're on Linux, you can try the Perl script utfinfo.pl (see also Program to check/look up UTF-8/Unicode characters in string on command line?); the output I get is:

$ echo ก็็็็็็็็็็ กิิิิิิิิิิ ก้้้้้้้้้้  | perl utfinfo.pl
Got 64 uchars
Char: 'ก' u: 3585 [0x0E01] b: 224,184,129 [0xE0,0xB8,0x81] n: THAI CHARACTER KO KAI [Thai]
Char: '็' u: 3655 [0x0E47] b: 224,185,135 [0xE0,0xB9,0x87] n: THAI CHARACTER MAITAIKHU [Thai]
Char: '็' u: 3655 [0x0E47] b: 224,185,135 [0xE0,0xB9,0x87] n: THAI CHARACTER MAITAIKHU [Thai]
Char: '็' u: 3655 [0x0E47] b: 224,185,135 [0xE0,0xB9,0x87] n: THAI CHARACTER MAITAIKHU [Thai]
Char: '็' u: 3655 [0x0E47] b: 224,185,135 [0xE0,0xB9,0x87] n: THAI CHARACTER MAITAIKHU [Thai]
Char: '็' u: 3655 [0x0E47] b: 224,185,135 [0xE0,0xB9,0x87] n: THAI CHARACTER MAITAIKHU [Thai]
Char: '็' u: 3655 [0x0E47] b: 224,185,135 [0xE0,0xB9,0x87] n: THAI CHARACTER MAITAIKHU [Thai]
Char: '็' u: 3655 [0x0E47] b: 224,185,135 [0xE0,0xB9,0x87] n: THAI CHARACTER MAITAIKHU [Thai]
Char: '็' u: 3655 [0x0E47] b: 224,185,135 [0xE0,0xB9,0x87] n: THAI CHARACTER MAITAIKHU [Thai]
Char: '็' u: 3655 [0x0E47] b: 224,185,135 [0xE0,0xB9,0x87] n: THAI CHARACTER MAITAIKHU [Thai]
Char: '็' u: 3655 [0x0E47] b: 224,185,135 [0xE0,0xB9,0x87] n: THAI CHARACTER MAITAIKHU [Thai]
Char: '็' u: 3655 [0x0E47] b: 224,185,135 [0xE0,0xB9,0x87] n: THAI CHARACTER MAITAIKHU [Thai]
Char: '็' u: 3655 [0x0E47] b: 224,185,135 [0xE0,0xB9,0x87] n: THAI CHARACTER MAITAIKHU [Thai]
Char: '็' u: 3655 [0x0E47] b: 224,185,135 [0xE0,0xB9,0x87] n: THAI CHARACTER MAITAIKHU [Thai]
Char: '็' u: 3655 [0x0E47] b: 224,185,135 [0xE0,0xB9,0x87] n: THAI CHARACTER MAITAIKHU [Thai]
Char: '็' u: 3655 [0x0E47] b: 224,185,135 [0xE0,0xB9,0x87] n: THAI CHARACTER MAITAIKHU [Thai]
Char: '็' u: 3655 [0x0E47] b: 224,185,135 [0xE0,0xB9,0x87] n: THAI CHARACTER MAITAIKHU [Thai]
Char: '็' u: 3655 [0x0E47] b: 224,185,135 [0xE0,0xB9,0x87] n: THAI CHARACTER MAITAIKHU [Thai]
Char: '็' u: 3655 [0x0E47] b: 224,185,135 [0xE0,0xB9,0x87] n: THAI CHARACTER MAITAIKHU [Thai]
Char: '็' u: 3655 [0x0E47] b: 224,185,135 [0xE0,0xB9,0x87] n: THAI CHARACTER MAITAIKHU [Thai]
Char: '็' u: 3655 [0x0E47] b: 224,185,135 [0xE0,0xB9,0x87] n: THAI CHARACTER MAITAIKHU [Thai]
Char: ' ' u: 32 [0x0020] b: 32 [0x20] n: SPACE [Basic Latin]
Char: 'ก' u: 3585 [0x0E01] b: 224,184,129 [0xE0,0xB8,0x81] n: THAI CHARACTER KO KAI [Thai]
Char: 'ิ' u: 3636 [0x0E34] b: 224,184,180 [0xE0,0xB8,0xB4] n: THAI CHARACTER SARA I [Thai]
Char: 'ิ' u: 3636 [0x0E34] b: 224,184,180 [0xE0,0xB8,0xB4] n: THAI CHARACTER SARA I [Thai]
Char: 'ิ' u: 3636 [0x0E34] b: 224,184,180 [0xE0,0xB8,0xB4] n: THAI CHARACTER SARA I [Thai]
Char: 'ิ' u: 3636 [0x0E34] b: 224,184,180 [0xE0,0xB8,0xB4] n: THAI CHARACTER SARA I [Thai]
Char: 'ิ' u: 3636 [0x0E34] b: 224,184,180 [0xE0,0xB8,0xB4] n: THAI CHARACTER SARA I [Thai]
Char: 'ิ' u: 3636 [0x0E34] b: 224,184,180 [0xE0,0xB8,0xB4] n: THAI CHARACTER SARA I [Thai]
Char: 'ิ' u: 3636 [0x0E34] b: 224,184,180 [0xE0,0xB8,0xB4] n: THAI CHARACTER SARA I [Thai]
Char: 'ิ' u: 3636 [0x0E34] b: 224,184,180 [0xE0,0xB8,0xB4] n: THAI CHARACTER SARA I [Thai]
Char: 'ิ' u: 3636 [0x0E34] b: 224,184,180 [0xE0,0xB8,0xB4] n: THAI CHARACTER SARA I [Thai]
Char: 'ิ' u: 3636 [0x0E34] b: 224,184,180 [0xE0,0xB8,0xB4] n: THAI CHARACTER SARA I [Thai]
Char: 'ิ' u: 3636 [0x0E34] b: 224,184,180 [0xE0,0xB8,0xB4] n: THAI CHARACTER SARA I [Thai]
Char: 'ิ' u: 3636 [0x0E34] b: 224,184,180 [0xE0,0xB8,0xB4] n: THAI CHARACTER SARA I [Thai]
Char: 'ิ' u: 3636 [0x0E34] b: 224,184,180 [0xE0,0xB8,0xB4] n: THAI CHARACTER SARA I [Thai]
Char: 'ิ' u: 3636 [0x0E34] b: 224,184,180 [0xE0,0xB8,0xB4] n: THAI CHARACTER SARA I [Thai]
Char: 'ิ' u: 3636 [0x0E34] b: 224,184,180 [0xE0,0xB8,0xB4] n: THAI CHARACTER SARA I [Thai]
Char: 'ิ' u: 3636 [0x0E34] b: 224,184,180 [0xE0,0xB8,0xB4] n: THAI CHARACTER SARA I [Thai]
Char: 'ิ' u: 3636 [0x0E34] b: 224,184,180 [0xE0,0xB8,0xB4] n: THAI CHARACTER SARA I [Thai]
Char: 'ิ' u: 3636 [0x0E34] b: 224,184,180 [0xE0,0xB8,0xB4] n: THAI CHARACTER SARA I [Thai]
Char: 'ิ' u: 3636 [0x0E34] b: 224,184,180 [0xE0,0xB8,0xB4] n: THAI CHARACTER SARA I [Thai]
Char: 'ิ' u: 3636 [0x0E34] b: 224,184,180 [0xE0,0xB8,0xB4] n: THAI CHARACTER SARA I [Thai]
Char: ' ' u: 32 [0x0020] b: 32 [0x20] n: SPACE [Basic Latin]
Char: 'ก' u: 3585 [0x0E01] b: 224,184,129 [0xE0,0xB8,0x81] n: THAI CHARACTER KO KAI [Thai]
Char: '้' u: 3657 [0x0E49] b: 224,185,137 [0xE0,0xB9,0x89] n: THAI CHARACTER MAI THO [Thai]
Char: '้' u: 3657 [0x0E49] b: 224,185,137 [0xE0,0xB9,0x89] n: THAI CHARACTER MAI THO [Thai]
Char: '้' u: 3657 [0x0E49] b: 224,185,137 [0xE0,0xB9,0x89] n: THAI CHARACTER MAI THO [Thai]
Char: '้' u: 3657 [0x0E49] b: 224,185,137 [0xE0,0xB9,0x89] n: THAI CHARACTER MAI THO [Thai]
Char: '้' u: 3657 [0x0E49] b: 224,185,137 [0xE0,0xB9,0x89] n: THAI CHARACTER MAI THO [Thai]
Char: '้' u: 3657 [0x0E49] b: 224,185,137 [0xE0,0xB9,0x89] n: THAI CHARACTER MAI THO [Thai]
Char: '้' u: 3657 [0x0E49] b: 224,185,137 [0xE0,0xB9,0x89] n: THAI CHARACTER MAI THO [Thai]
Char: '้' u: 3657 [0x0E49] b: 224,185,137 [0xE0,0xB9,0x89] n: THAI CHARACTER MAI THO [Thai]
Char: '้' u: 3657 [0x0E49] b: 224,185,137 [0xE0,0xB9,0x89] n: THAI CHARACTER MAI THO [Thai]
Char: '้' u: 3657 [0x0E49] b: 224,185,137 [0xE0,0xB9,0x89] n: THAI CHARACTER MAI THO [Thai]
Char: '้' u: 3657 [0x0E49] b: 224,185,137 [0xE0,0xB9,0x89] n: THAI CHARACTER MAI THO [Thai]
Char: '้' u: 3657 [0x0E49] b: 224,185,137 [0xE0,0xB9,0x89] n: THAI CHARACTER MAI THO [Thai]
Char: '้' u: 3657 [0x0E49] b: 224,185,137 [0xE0,0xB9,0x89] n: THAI CHARACTER MAI THO [Thai]
Char: '้' u: 3657 [0x0E49] b: 224,185,137 [0xE0,0xB9,0x89] n: THAI CHARACTER MAI THO [Thai]
Char: '้' u: 3657 [0x0E49] b: 224,185,137 [0xE0,0xB9,0x89] n: THAI CHARACTER MAI THO [Thai]
Char: '้' u: 3657 [0x0E49] b: 224,185,137 [0xE0,0xB9,0x89] n: THAI CHARACTER MAI THO [Thai]
Char: '้' u: 3657 [0x0E49] b: 224,185,137 [0xE0,0xB9,0x89] n: THAI CHARACTER MAI THO [Thai]
Char: '้' u: 3657 [0x0E49] b: 224,185,137 [0xE0,0xB9,0x89] n: THAI CHARACTER MAI THO [Thai]
Char: '้' u: 3657 [0x0E49] b: 224,185,137 [0xE0,0xB9,0x89] n: THAI CHARACTER MAI THO [Thai]
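If Perl or utfinfo.pl is not at hand, roughly the same information can be obtained with Python's standard `unicodedata` module. The following is only a sketch under that assumption, not a reimplementation of utfinfo.pl; the helper name `describe` is hypothetical. The bracketed block name (e.g. `[Thai]`) is left out because `unicodedata` does not expose it, but the character name itself already starts with the script name.

```python
import unicodedata

def describe(ch):
    """Collect code point, UTF-8 bytes and Unicode name for one character (hypothetical helper)."""
    cp = ord(ch)
    return {
        "char": ch,
        "decimal": cp,
        "codepoint": f"U+{cp:04X}",
        "utf8": [f"0x{b:02X}" for b in ch.encode("utf-8")],
        "name": unicodedata.name(ch, "<unnamed>"),
    }

# One base character plus one of each mark seen in the output above
text = "\u0e01\u0e47\u0e34\u0e49"
for ch in text:
    info = describe(ch)
    print(info["codepoint"], info["decimal"], ",".join(info["utf8"]), info["name"])
```

Repeated marks can be collapsed before printing (for instance with `dict.fromkeys(text)`) if only the distinct characters matter.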

sdaau

Posted 2012-02-13T13:27:13.737

Reputation: 3 758

0

Just search for them on Google. E.g. ก is Thai.

MSalters

Posted 2012-02-13T13:27:13.737

Reputation: 7 587

0

After a quick check, Google Translate identifies them as Thai characters, which looks about right judging by the style.

Turix

Posted 2012-02-13T13:27:13.737

Reputation: 691

0

Mathias Bynens

Posted 2012-02-13T13:27:13.737

Reputation: 2 171