Adding an answer because this is too long for a comment, but it addresses the specific point of why everything was reserved if xn-- is enough.
In one of the first iterations of the IDNA standard ("Internationalizing Domain Names in Applications"), a draft from November 2001 (draft-ietf-idn-idna-04), there was this:
- ACE prefix
The ACE prefix, used in the conversion operations (section 4), will
be specified in a future revision of this document. It will be two
alphanumeric ASCII characters followed by two hyphen-minuses. It MUST
be recognized in a case-insensitive manner.
This scheme allowed interoperability tests while multiple encodings were being proposed. So in fact it seems there were at least bl--, bq--, dq--, lq--, mq--, ra--, wq-- and zq-- (and when things solidified, xn was chosen at random so that no one had a head start and there were no collisions with actual existing names). If you are old enough, you may remember that Network Solutions/Verisign was then selling bq-- domain names, as an IDN testbed.
In February 2003:
An eligible subset of that list of 42 entries will be determined
by eliminating the following codes due to their use, in one or more
top-level domain zone files that have been reviewed, as the first two
characters of second-level domain labels that have hyphens in their
third and fourth character positions:
AA, QM to QZ, XA, XZ, and ZZ.
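The exclusion list in that quote mixes single codes with a range ("QM to QZ"). As a rough illustration, here is a minimal Python sketch that expands the list and filters candidates against it. The function names and the sample candidates are made up for the example, since the actual list of 42 entries is not reproduced here.

```python
import string

def expand_range(first: str, last: str) -> list[str]:
    """Expand a range such as QM..QZ, where only the second letter varies."""
    if first[0] != last[0]:
        raise ValueError("this sketch only handles ranges sharing a first letter")
    letters = string.ascii_uppercase
    start, end = letters.index(first[1]), letters.index(last[1])
    return [first[0] + c for c in letters[start:end + 1]]

# Codes eliminated in the February 2003 text: AA, QM to QZ, XA, XZ, and ZZ.
EXCLUDED = {"AA", "XA", "XZ", "ZZ", *expand_range("QM", "QZ")}

def eligible(candidates: list[str]) -> list[str]:
    """Keep the candidates that are not excluded (compared case-insensitively)."""
    return [c for c in candidates if c.upper() not in EXCLUDED]

# Hypothetical sample, NOT the real list of 42 entries.
sample = ["AA", "QN", "XN", "XZ"]
print(eligible(sample))
```

With this sample only XN survives, which is at least consistent with xn being the prefix that was eventually picked.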
Going back to December 2000, the IETF meeting in San Diego has these notes:
ACE identifier candidates
- prefixes: AA--, AB--, ..., 99--
- suffixes: --AA, --AB, ..., --99
Relevant domain names: aa--a.com, aa-b.org, ..99--zzzz.net, aa--x.co.jp, etc.
a-aa.com, b--aa.org, ..., zzzzz--99.or.kr, etc.
Proposal
step 1: tentative suspension of registering relevant domain names for ACE identifier candidates
step 2: conduct a survey of relevant domain names already registered
step 3: select about 10 to 20 identifiers one of which is for test and
others for real use, based on the survey
step 4: permanent blocking of
registrations of domain names relevant to the selected identifiers
(except for registrations compliant to MDN semantics).
In November 2000, in draft-ietf-idn-aceid-00, we have:
All strings starting with a combination of two alpha-numericals,
followed by two hyphens, are defined to be ACE prefix identifier
candidates. All strings starting with one hyphen followed by three
alpha-numericals, and strings starting with two hyphens followed by
two alpha-numericals are defined as ACE suffix identifier candidates.
ACE prefix identifier candidates and ACE suffix identifier candidates
are collectively called ACE identifier candidates.
which got simplified the following June to just:
All strings starting with a combination of two alpha-numericals,
followed by two hyphens, are defined to be ACE prefix identifier
candidates. All strings starting with two hyphens followed by two
alpha-numericals are defined as ACE suffix identifier candidates.
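To make the simplified definition concrete, here is a small Python sketch that classifies a single domain label. It assumes, following the San Diego examples (aa--a.com, b--aa.org), that a prefix candidate sits at the start of the label and a suffix candidate at its end, and that matching is case-insensitive. The function name is mine, not from any draft.

```python
import re

# Two ASCII alphanumerics followed by two hyphens, at the start of the label.
PREFIX = re.compile(r"^[a-z0-9]{2}--", re.IGNORECASE | re.ASCII)
# Two hyphens followed by two ASCII alphanumerics, at the very end of the label.
SUFFIX = re.compile(r"--[a-z0-9]{2}\Z", re.IGNORECASE | re.ASCII)

def ace_candidate_kinds(label: str) -> set[str]:
    """Return which kinds of ACE identifier candidate the label matches (assumed reading)."""
    kinds = set()
    if PREFIX.search(label):
        kinds.add("prefix")
    if SUFFIX.search(label):
        kinds.add("suffix")
    return kinds

for label in ["xn--bcher-kva", "BQ--abc", "b--aa", "example"]:
    print(label, sorted(ace_candidate_kinds(label)))
```

Under this reading, any label with hyphens in its third and fourth positions counts as a prefix candidate, whatever the first two characters are.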
And the mailing list archives from before 2001-01 seem to be lost forever, so there is no way to find out more about that, I fear.