26

Grimes and Elon Musk named their baby: X Æ A-12.

What are the risks of non-ASCII names?

For example, does the COBOL unemployment platform support non-ASCII names? Would it be possible to get a social security number (SSN) for the baby? Does it make their baby an easier target for impersonation attacks using Unicode manipulations? Could the baby's name trigger a code injection attack?

Clearly, pure-ASCII names might cause issues too: \0, ^Z, NULL, etc.

0x90
  • 90
    You can have problems with ASCII names too. My cousin has an ASCII name and always has a lot of trouble because it contains a null character and two line feeds. – reed May 08 '20 at 09:49
  • 27
    @reed My legal name starts with `0x04 EoT` and it just keeps screwing me over. I'm glad to see I'm not the only one, but I do feel sorry for your cousin. –  May 08 '20 at 12:19
  • 7
    @reed - might be more annoying for him - and the people around him - if it had ^G as well! – davidbak May 08 '20 at 17:37
  • 5
    I'll speak for all the people having `+++ATH0` in their name as they are still trying to get here with their modems affected by CVE-1999-1228. – Esa Jokinen May 08 '20 at 17:59
  • 18
    Even sticking with just the Latin alphabet poses problems. Imagine someone named Null. Those people already exist and experience regular headaches when dealing with computer systems. – Engineer Toast May 08 '20 at 19:07
  • 13
    They did not name their baby X Æ A-12. They named him (or her, I do not know) Kyle. This was just some supposedly funny or clever way to announce the name. – WoJ May 08 '20 at 20:23
  • 5
    Another problem of this kind is apparently with the name `Null` (https://www.wired.com/2015/11/null/) – WoJ May 08 '20 at 20:27
  • 6
    @WoJ : You are repeating false memes. https://www.insider.com/x-ae-a-12-elon-musk-grimes-baby-pronounced-kyle-2020-5 – Eric Towers May 08 '20 at 21:23
  • 8
    @EricTowers I stand corrected, you are right (https://www.snopes.com/fact-check/elon-musk-grimes-baby-x-ae-a12/). They are clearly idiots, the kind of people who should not have children because, well, they are idiots. Sorry for the wrong information (this is truly horrible, they could have changed THEIR names to IAmADumbPrick instead of making the kid's life miserable) – WoJ May 08 '20 at 23:13
  • 3
    With ill-written software, you are not safe with ASCII-only names either: https://www.bbc.com/future/article/20160325-the-names-that-break-computer-systems But, being a son of Elon Musk (or any other rich and/or famous parent), I think the name has only little to add to the gross attention on your personality, most of it malicious. – fraxinus May 09 '20 at 08:30
  • 1
    Sometimes even a name made of "regular" characters creates an issue because of its length: [Hawaiian Woman Gets IDs That Fit Her 36-Character Last Name](https://www.npr.org/sections/thetwo-way/2013/12/31/258673819/hawaiian-woman-gets-ids-that-fit-her-36-character-last-name) – chux - Reinstate Monica May 09 '20 at 09:41
  • 15
    A couple of years ago I encountered a payment platform that would not accept the name "Grant" (it was blocking other SQL keywords too, so "Bobby Tables" would also be denied) – Jasen May 09 '20 at 12:10
  • 2
    Better question: is it a good idea to assume everybody uses ASCII names in the USA (or anywhere else)? Answer: [**no**](https://www.kalzumeus.com/2010/06/17/falsehoods-programmers-believe-about-names/). – Asteroids With Wings May 09 '20 at 16:35
  • Unfortunately, this version of the question is mostly off-topic here. "Does system X accept non-ASCII?" "What does the SSN system accept?" And what vulnerabilities arbitrary other systems might have is purely speculative. Unicode manipulations affect ASCII names too, so I'm not sure how this would make things more vulnerable. Add to that the fact that there are millions of people in the US who have non-ASCII names, and the title question looks odd. – schroeder May 11 '20 at 07:32
  • 1
    As all the examples in chat and the comments show, various systems can have design failures that can result in unexpected behaviours for any name/input. – schroeder May 11 '20 at 07:33
  • I’m voting to close this question because the evolving nature of the question has made it difficult to answer and this version is not an answerable security question. The question really needs to be: "what risks are there to non-ASCII names?" and you have covered that in the question itself. Asking which systems those issues might apply to is too broad to answer. – schroeder May 11 '20 at 08:25

6 Answers

33

There are even Chinese people in the US. They name their children 李某. Would that be a problem? No. Some systems support these names, some use versions converted to ASCII through romanization (李某 → pinyin Lǐ Mǒu → Li Mou). The only non-ASCII character in X Æ A-12, Æ, is used e.g. in Danish names like Ægidius, converted to ASCII Aegidius.
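The conversion described above (tone marks dropped, Æ written as Ae) can be sketched in a few lines of Python. This is only an illustration: the function name `to_ascii` and the tiny `EXPLICIT` table are made up for this example, and real systems follow their own transliteration rules.

```python
import unicodedata

# Illustrative table only: Æ has no Unicode decomposition, so it needs an
# explicit replacement (Ægidius -> Aegidius, as in the answer above).
EXPLICIT = {"Æ": "Ae", "æ": "ae"}

def to_ascii(name: str) -> str:
    """Hypothetical helper: replace known letters, then drop accents."""
    replaced = "".join(EXPLICIT.get(ch, ch) for ch in name)
    decomposed = unicodedata.normalize("NFKD", replaced)
    # Combining marks (e.g. pinyin tone marks) are removed here.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Anything still outside ASCII becomes "?" so the loss stays visible.
    return stripped.encode("ascii", "replace").decode("ascii")

print(to_ascii("Lǐ Mǒu"))    # Li Mou
print(to_ascii("Ægidius"))   # Aegidius
print(to_ascii("李某"))       # ??
```

Note that the Chinese characters themselves cannot be converted this way; going from 李某 to pinyin needs a dictionary-based romanization step that this sketch does not attempt.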

A good example of such conversion is machine-readable passports: the last two lines on every passport contain only the characters A–Z, 0–9 and the filler character <. For passports, every name in the world gets converted to ASCII.
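As a rough sketch of what that looks like for the name field, assuming the 39-character name field of a passport-sized machine-readable zone and a deliberately simplified transliteration: the helper names and the one-entry table are hypothetical, and digits or apostrophes are simply dropped here, which is a simplification rather than the official rule set (that lives in ICAO Doc 9303).

```python
import unicodedata

NAME_FIELD_LENGTH = 39  # assumed length of the name field on a passport MRZ line

# Illustrative, incomplete table: only the letter discussed in this thread.
MRZ_MAP = {"Æ": "AE"}

def mrz_component(text: str) -> str:
    """Reduce one name part to A-Z, using "<" for spaces and hyphens."""
    out = []
    for ch in text.upper():
        ch = MRZ_MAP.get(ch, ch)
        for c in unicodedata.normalize("NFKD", ch):
            if "A" <= c <= "Z":
                out.append(c)
            elif c in " -":
                out.append("<")
            # Everything else (accents, digits, apostrophes) is dropped
            # in this sketch.
    return "".join(out)

def mrz_name_field(surname: str, given_names: str) -> str:
    """Surname, "<<", given names, padded with "<" to the field length."""
    field = mrz_component(surname) + "<<" + mrz_component(given_names)
    return field[:NAME_FIELD_LENGTH].ljust(NAME_FIELD_LENGTH, "<")

# "Hansen" is a made-up surname for the example.
print(mrz_name_field("Hansen", "Ægidius"))
```

The point is only that the output alphabet is tiny and fixed, so the conversion is lossy by design.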

Sometimes using mere ASCII may be more problematic than non-ASCII characters that are easily encoded in UTF-8:

  • Christopher Null is a real person who has reported some problems with his ASCII name.

  • My name is Logger, Startkeylogger.

  • Little Bobby Tables has a name as unique as the little baby Musk. This Robert doesn't have a social security number as he's a fictional character, but for real people the name isn't an obstacle.

    Exploits of a Mom
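Whether a name like these causes trouble depends on how the software handles input, not on the name. A minimal sketch, using Python's built-in `sqlite3` module and a made-up `students` table, of storing such names with bound parameters so they are treated purely as data:

```python
import sqlite3

def store_names(names):
    """Insert names with bound parameters and read them back unchanged."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE students (name TEXT)")
    # The "?" placeholder passes each name as a value, never as SQL text,
    # so quotes and keywords inside the name are not interpreted.
    conn.executemany("INSERT INTO students (name) VALUES (?)",
                     [(n,) for n in names])
    rows = conn.execute("SELECT name FROM students ORDER BY rowid").fetchall()
    conn.close()
    return [row[0] for row in rows]

names = ["Robert'); DROP TABLE students;--", "Null", "X Æ A-12", "李某"]
print(store_names(names) == names)   # True
```

With this approach the string "Null" stays a four-letter string and is never confused with a missing value, and the non-ASCII names round-trip like any other text.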

Esa Jokinen
  • 6
    The child may be named "李某", but for everyday purposes, an appropriate transliteration into the Latin alphabet (probably Pinyin) will be used. – Mark May 08 '20 at 21:58
  • 7
    For everyday purposes, the child will be called whatever their parents use (be that the official pronunciation of 李某 or not). It is when writing the name into a computer system that it will often be adapted to whatever is suitable for the system and the available input method. – Ángel May 09 '20 at 03:28
  • 5
    The ascii compatible romanization (already mentioned in the answer) is **pinyin without tone marks** (Li Mou), as the pinyin itself doesn't really help with the original problem of having non-ascii characters (Lǐ Mǒu). Neither does *International Phonetic Alphabet* IPA (lʲǐː mə̌u). – Esa Jokinen May 09 '20 at 04:36
  • 1
    Good point, there is however a huge difference between being Chinese and giving your child a Chinese name (which romanizes just fine, too), and giving it a name which is a strong indicator that your full contractual capability (or sanity, in the widest sense) is to be doubted. A name for which the child probably could sue you for mental cruelty, once grown up. That's even more so the case if, as the main shareholder of a company, you make public statements such as "that company is way overvalued". – Damon May 10 '20 at 19:00
  • The question is about whether it's problematic or not, and this answer is to demonstrate how it's not that unique after all, regarding information systems. Whether it's sane or not is not about information security, and should not be discussed here. – Esa Jokinen May 10 '20 at 19:09
  • @Mark I think for "everyday purposes" most people that have such a name will use systems that can deal with such a name. The transliteration is needed when dealing with systems that did not consider such names, e.g., when the person moved into the USA. – allo May 10 '20 at 21:55
20

I don't live in the US, but I live in a street that has (if written correctly) 2 German Umlauts in its name.

I never had a problem with my passport or with US authorities, visa waiver programs or anything that concerns the US government. They will accept it either the way it is or a "translation" into non-Umlauts (which even exists in Germany as a legal way to write the same word without Umlauts).

I have no experience with SSN or unemployment systems in the US, but I have no reason to believe they will behave differently from the civil administration I did come in contact with that handled it without blinking.

Now private businesses are a whole other can of worms. It has become a lot better in the last 3-5 years, but I had to fit a lot of square pegs through round holes to use my address in the US. For example, I own a credit card from a major US brand. And one would assume you can use that to pay online, right? My credit card is no different than your credit card. Well, yes and no. One needs to give the address to the merchant, who in turn sends it with the card data to the card processor to gain higher confidence that it was actually me making the purchase, not someone who scammed just my card number. So the crappy merchant website would not accept umlauts. No problem, use the other accepted spelling. Then the merchant would accept it, but the card processor would barf and say "no sir, that's not their correct address, as written down here.". I think I dropped 25% of my purchases online because the merchants were literally too stupid to draw my money from a major US brand credit card.
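The mismatch described above comes from comparing two legal spellings of the same address character by character. A hedged sketch of a more forgiving comparison follows; the function names and the street name are invented for illustration, and this is not how any particular card processor works.

```python
import unicodedata

# The accepted umlaut-free spellings mentioned above (ü -> ue, and so on).
DIGRAPHS = {"ä": "ae", "ö": "oe", "ü": "ue", "ß": "ss"}

def spellings(text: str) -> set:
    """Hypothetical helper: plausible spellings of one address line."""
    base = unicodedata.normalize("NFC", text).lower()
    digraph = "".join(DIGRAPHS.get(ch, ch) for ch in base)
    # Third variant: umlaut dots simply dropped (ü -> u).
    decomposed = unicodedata.normalize("NFD", base.replace("ß", "ss"))
    bare = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return {base, digraph, bare}

def same_address(a: str, b: str) -> bool:
    """True if the two lines share at least one spelling variant."""
    return not spellings(a).isdisjoint(spellings(b))

print(same_address("Müllerstraße 5", "MUELLERSTRASSE 5"))   # True
print(same_address("Müllerstraße 5", "Mullerstrasse 5"))    # True
print(same_address("Müllerstraße 5", "Hauptstrasse 5"))     # False
```

The comparison is only as good as the variant list, so two different ASCII fallbacks ("Muellerstrasse" vs. "Mullerstrasse") still do not match each other in this sketch.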

Another example: bring an App into the Apple store. That was only 3 years ago. Apple is a major international company, surely they would not screw it up, right? Well, they did not. Directly. I could create an account. But they have a partner where you need to be registered to be accepted as a company. Guess who had never heard of Umlauts? I needed to talk to third level support (actually a developer that had database access) to get that done because even their own internal support interface would not let them.

So... I think you will be fine with all official, administrative tasks, but don't expect an easy life. Life is not just death and taxes, if you want to do something in between, maybe something that's fun or makes money, it will be a lot easier if your name is "Jake Brown" than "Jörg Oßten".

nvoigt
  • 6
    Using street names in authentication sounds horrible, as even ASCII street names could be written in several variations. There are also cultural differences other than the character sets, e.g. not every country has first names and surnames. Authentication schemes should not rely on these kinds of things, but unfortunately some do. – Esa Jokinen May 08 '20 at 07:51
  • @EsaJokinen It can be pretty bad. I had a problem once because Visa and my bank disagreed about the zip code of my debit card. But, what else can they do? – jpaugh May 08 '20 at 16:11
  • 4
    If credit cards were to be invented today, everybody would laugh about the security. – Esa Jokinen May 08 '20 at 16:26
  • About the US part: they accepted all of `ü`, `ue` and `u`? – WoJ May 09 '20 at 07:33
  • I know a phone company which refuses to allow apostrophes in names or addresses. *In Ireland!* – TRiG May 10 '20 at 18:03
  • @EsaJokinen This system is called AVS (https://de.wikipedia.org/wiki/Address_Verification_System) and I also had problems with it because non-US cards of major issuers do not have it. Obviously `null != adress` ;-) – kap May 10 '20 at 21:03
  • @WoJ The computer systems accepted the original spelling and the paper work was accepted by the civil administration's clerk as well, although they looked at my passport first, so they kinda knew what was coming on the form. But obviously I did not fully tour the US systems. I had contact with travelling, Visa (not the credit card) matters and some civil administration forms. – nvoigt May 11 '20 at 06:47
  • Not just street names, but city names! My home town of about 160000 has a name which can be written in two ways, depending on whether a spelling reform from 30 years ago also applies to proper names or not. The government itself doesn't know it, because the police station and the town hall use different spellings. The spelling on the airport differs from the spelling on the train station. Letters from one government institution use one spelling, letters from another use the other one. The difference is in the second character, so foreigners searching alphabetically have a hard time finding it. – vsz May 11 '20 at 08:36
9

Posting this relevant link, because (to my surprise) no-one's mentioned it yet:

Falsehoods Programmers Believe About Names

I think it should be compulsory reading for anyone who works on systems handling personal data!  (Though I suspect it would be even more useful if it included examples of each case.)

As comments indicate, even plain printable ASCII can cause problems: far too many systems have trouble with names which include apostrophes (such as O'Connor and D'Artagnan), hyphens (Day-Lewis, Zeta-Jones), or embedded spaces (Lloyd Webber, Bonham Carter, de Vries), or embedded capitals (McDonald, FitzGerald).  So it shouldn't be a surprise that non-ASCII chars are even less widely supported… let alone mononyms, digits, extremely long names, names which include terms which are offensive in some language, all caps, all lower case, or change!
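To make the failure mode concrete, here is a small sketch comparing a typical letters-only check with a more permissive one that only rejects characters that cannot sensibly appear in a name (control characters such as NUL or line feeds, surrogates, private-use and unassigned code points). Both functions are illustrative assumptions, not a recommended validation policy.

```python
import re
import unicodedata

# A common but too-strict pattern: ASCII letters separated by single spaces.
NAIVE = re.compile(r"[A-Za-z]+( [A-Za-z]+)*")

def naive_ok(name: str) -> bool:
    return NAIVE.fullmatch(name) is not None

# Cc = control, Cs = surrogate, Co = private use, Cn = unassigned.
REJECTED_CATEGORIES = {"Cc", "Cs", "Co", "Cn"}

def permissive_ok(name: str) -> bool:
    """Accept any non-empty name without control or unassigned characters."""
    name = unicodedata.normalize("NFC", name)
    if not name.strip():
        return False
    return all(unicodedata.category(ch) not in REJECTED_CATEGORIES
               for ch in name)

for name in ["O'Connor", "Day-Lewis", "X Æ A-12", "李某", "Bob\x00"]:
    print(repr(name), naive_ok(name), permissive_ok(name))
```

The naive check rejects the apostrophe, the hyphen, the digits and every non-ASCII letter, while the permissive one only rejects the name containing a NUL byte.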

gidds
  • 1
    I'm not sure how this answers the question asked. This answer talks about what systems *should do*, not the current impact. – schroeder May 09 '20 at 13:41
7

There are two answers to this question. One involves talking about "Is it possible?" the other involves talking about "What are the costs & concerns?".

The first will greatly enlighten the second.

The answer to the first question is somewhere from maybe to yes, but with workarounds.

The answer to the second depends on what you are trying to do and who you are. For someone like Musk, being a White Male Billionaire will smooth over many of the problems and costs. For many others, you may discover that having an ASCII name makes things much simpler.

How would you spell Æ over the phone? What about the IRS website that only allowed UPPERCASE letters in street addresses? If a site doesn't even do case conversion, what hope is there that it will work with Unicode? Many old COBOL systems on IBM mainframe platforms use EBCDIC, which stands for Extended Binary Coded Decimal Interchange Code. A-Z, a-z, and 0-9 exist in EBCDIC (but {} and [] do not have consistent code points across EBCDIC code pages).
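For a feel of what an EBCDIC code page can and cannot hold, Python ships a codec for one of them, `cp037` (EBCDIC US/Canada). Whether any given mainframe application actually uses that code page is an assumption, and `fits` is a made-up helper name.

```python
def fits(name: str, codec: str = "cp037") -> bool:
    """True if the name survives a round trip through the given code page."""
    try:
        return name.encode(codec).decode(codec) == name
    except UnicodeEncodeError:
        return False

# Same letters, different bytes: EBCDIC is not a superset of ASCII.
print("A".encode("ascii"), "A".encode("cp037"))   # b'A' b'\xc1'

print(fits("X Æ A-12"))   # True: cp037 covers the Latin-1 repertoire
print(fits("李某"))        # False: no Chinese characters in this code page
```

Being encodable is only the first hurdle; fixed-width fields, uppercase-only input forms and validation rules in the application can still reject the name.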

I'd say that unless you are both very rich and a bit weird, don't burden your children with strange names. School is tough enough without that extra burden.

Walter
5

Speaking as someone who has a last name that, properly formed, would be "Ælwyn", and who happens to have a day job for a software company that writes, among other things, stuff for tracking "Land and Vital Records" ("Vital" being birth and death certificates, along with name changes)… in most places the kid would have problems, although there is pending and/or recent legislation (I can't recall which, offhand) that would require systems to cope with such names more effectively.

But in practice even then a lot of systems will transliterate it as a simple "AE" because most things treat it as a ligature rather than a distinct character, and decomposing a ligature is a typographical (presentation) change rather than a change of content.

But I wouldn't expect them to be appreciably more (or less) easily targeted by Unicode manipulations, really. Or at least if they are… it is going to be so far behind all the other headaches from systems that can't cope with them that they're not going to notice.
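As an illustration of the kind of "Unicode manipulation" in question, and of why plain ASCII letters are just as exposed, here is a sketch that flags names mixing several scripts. Deriving the script from the first word of the official character name is a rough heuristic chosen for this example, not a standard API for script detection.

```python
import unicodedata

def scripts(name: str) -> set:
    """Rough heuristic: first word of each letter's Unicode character name."""
    found = set()
    for ch in name:
        if ch.isalpha():
            found.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return found

genuine = "X Æ A-12"
# Same appearance, but the "A" is U+0410 CYRILLIC CAPITAL LETTER A.
lookalike = "X Æ \u0410-12"

print(genuine == lookalike)           # False
print(scripts(genuine))               # {'LATIN'}
print(sorted(scripts(lookalike)))     # ['CYRILLIC', 'LATIN']
```

The spoofed character here replaces the ASCII "A", not the Æ, which fits the point above: the non-ASCII letter does not make the name noticeably easier to imitate.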

Edit:

Yes, having three non-familiar names will also cause at least some systems to struggle with it, though probably far fewer — several systems at least allow for multiple "middle" names. Before changing mine (yes, the Æ is self-inflicted, and yes, I knew it wouldn't translate well, which is why legally it is "Ae") I had two middle names, or a single middle name and a hyphenated last name, depending on which way you interpreted it, and several times it did cause "interesting" situations — but hyphenates are at least fairly well supported almost everywhere in the US now.

Edit 2:

To address the entirely valid point raised about Æ not strictly being a ligature: partially (or even mostly) correct. The character "ash" is not a ligature at all, and is in fact the proper form for words deriving from languages which used it (including my own last name). As for lexical vs. typographical ligature, all I can say is that even the Unicode standard can't seem to decide if it should be or not (see https://stackoverflow.com/questions/9376621/folding-normalizing-ligatures-e-g-%C3%86-to-ae-using-corefoundation for some discussion of this), but my actual point stands: in practice, a large amount of record-keeping software, if it can deal with it at all, will be prone to decomposing it (especially if anything like OCR software comes into play). To make it even more fun, unlike ẞ or the "Turkish I", there is a complete round-trip mapping between Æ←→æ.
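Both claims are easy to check with Python's standard Unicode support. In this sketch (the two helper names are made up), Unicode compatibility normalization leaves Æ alone while it does split a purely typographic ligature such as ﬁ, and Æ survives a lowercase/uppercase round trip where ẞ and the dotted Turkish İ do not.

```python
import unicodedata

def decomposes(ch: str) -> bool:
    """True if compatibility normalization (NFKD) rewrites the character."""
    return unicodedata.normalize("NFKD", ch) != ch

def case_round_trips(upper_ch: str) -> bool:
    """True if lowercasing and then uppercasing gives the same string back."""
    return upper_ch.lower().upper() == upper_ch

print(decomposes("Æ"))         # False: treated as a letter, not split into A+E
print(decomposes("\ufb01"))    # True: the "fi" ligature becomes "f" + "i"

print(case_round_trips("Æ"))   # True:  Æ -> æ -> Æ
print(case_round_trips("ẞ"))   # False: ẞ -> ß -> SS
print(case_round_trips("İ"))   # False: İ -> i + combining dot -> I + combining dot
```

So any "AE" a record system produces comes from its own transliteration table, not from Unicode normalization.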

Joel Aelwyn
  • I think there will be a problem with the spaces though? Won't the three elements probably be recognized as three first names? And out of curiosity, do you normally type your name as Aelwyn, or the other way? (sorry, on mobile so I don't have the right glyph) – WoJ May 09 '20 at 07:35
  • 6
    My friend, Mr. Hämäläinen, was once converted into ASCII as Mr. Homoloinen (Finnish for "*gay parasite*") on a hotel management system. Having a name with umlauts can sometimes be embarrassing! – Esa Jokinen May 09 '20 at 07:58
  • Æ may be a ligature for Ae, or may be a single character in its own right, depending on the language. In Old English, Æ was thought of as a letter, not a ligature. Even in languages which do consider it a ligature, it's a lexical ligature, not a typographic one, so it really shouldn't be decomposed. – TRiG May 10 '20 at 18:07
5

The Musk child was born in California. California does not allow non-ascii characters in names and the child's name as submitted may be rejected, according to this link.

Although it is not completely clear, the "Æ" may be the issue, as California requires the birth information to be entered into an electronic record. California adheres to the United States Standard Certificate of Live Birth (Section 102425 (f) (1) ) in that link.

The standard certificate is here.

These may be the rules for the standard certificate, although the whole matter seems to be far more elaborate than one would suspect, with the mother's cigarette smoking history included, etc. etc.

Wastrel
  • What do cigarettes have to do with anything? – TRiG May 10 '20 at 18:07
  • Exactly. In all the government regulations about what can or must be included on a birth certificate, I was unable to find anything that actually answered the question. So, the news article was the best I could come up with. – Wastrel May 11 '20 at 13:59