Your definition of safe seems to be "not everyone can register those", which is quite strange.
What does this protect you against exactly, and where do you plan to do this filtering?
In the strictest sense, in almost all TLDs "anyone" can register domains. Sometimes that "anyone" is restricted or has to pay a special price, but for a determined attacker who wants to circumvent your filter these are low barriers.
Also, your regexp then makes no sense, since what were TLDs before are suddenly 2nd level domains/labels. You would need to fix your question to remove the ambiguity about what you are really trying to achieve here. Filtering TLDs? Filtering domains based on some specific structure (3rd level domains) and specific patterns in the 2nd level domain? Etc.
And [a-z]{2} is wrong as a TLD pattern. First, only ccTLDs are 2 characters long; all others (gTLDs) are longer. Also, haven't you heard about IDNs? Or the fact that a TLD is a domain name like any other and is governed at the registration layer by LDH rules, so at least hyphens and digits need to be accepted, and they will be needed for IDNs.
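To make the point about label syntax concrete, here is a minimal sketch comparing the two-letter pattern with an LDH-style check. The helper names and the `LDH_LABEL` pattern are my own illustration, not something from the question, and the check is purely syntactic: it says nothing about whether a TLD actually exists or who may register under it.

```python
import re

# The two-letter pattern from the question, applied to the TLD label alone.
NAIVE_TLD = re.compile(r"[a-z]{2}")

# Hypothetical LDH-style check: 1 to 63 characters made of letters, digits
# and hyphens, with no leading or trailing hyphen. Syntax only.
LDH_LABEL = re.compile(r"(?!-)[a-z0-9-]{1,63}(?<!-)")


def matches_naive(tld: str) -> bool:
    """True if the label is exactly two ASCII letters."""
    return NAIVE_TLD.fullmatch(tld.lower()) is not None


def is_ldh_label(tld: str) -> bool:
    """True if the label is syntactically a valid LDH label."""
    return LDH_LABEL.fullmatch(tld.lower()) is not None


if __name__ == "__main__":
    # "xn--p1ai" is the ASCII (A-label) form of an IDN ccTLD.
    for tld in ["uk", "com", "museum", "xn--p1ai"]:
        print(f"{tld}: naive={matches_naive(tld)} ldh={is_ldh_label(tld)}")
```

Only the first label passes the two-letter pattern, while all four are syntactically valid labels, digits and hyphens included.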
But even outside of that, with /\.(edu|gov|mil)\.[a-z]{2}$/ you immediately trust every country in the world (at least those that have decided to structure their TLD this way) to attach the same meaning to edu, gov or mil as you seem to attach to them, and to apply the same level of control/verification that you seem to expect. Can you vouch for that for every country in the world? (At the very least, this should show you that trying to use a regexp to validate things like "can everyone register those domains" will never work.)
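A short sketch of what that pattern actually accepts, translated to Python. The hostnames are made up for illustration, and I am assuming that "zz" and "qq" are not delegated ccTLDs; the pattern cannot tell either way, which is the point.

```python
import re

# The pattern from the question, written as a Python regular expression.
PATTERN = re.compile(r"\.(edu|gov|mil)\.[a-z]{2}$")


def accepted(host: str) -> bool:
    """True if the host name ends in .edu/.gov/.mil plus any two letters."""
    return PATTERN.search(host.lower()) is not None


if __name__ == "__main__":
    hosts = [
        "example.gov.uk",  # a real ccTLD that uses a gov label
        "example.edu.zz",  # arbitrary two letters (assumed not delegated)
        "example.mil.qq",  # same
        "example.ac.uk",   # academic label used under .uk, not "edu"
    ]
    for host in hosts:
        print(f"{host}: {accepted(host)}")
```

The regexp treats any two trailing letters the same way, so it knows nothing about each registry's policy, and it also rejects registries that chose a different label (such as ac) for the same purpose.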
As for:
I know there were some .edu domains registered before the sharpened rules introduced in 2001. But are there many domains or just a handful? Is this TLD to be considered safe?
Do you mean that your definition of safe is no longer binary but something like a percentage, based on how many such "domains" exist in the TLD? How will you count them anyway, given that domains come and go?
Are .aero, .museum and .post to be considered safe for whitelisting?
Did you ever see any .aero or .post domain being used anywhere lately? And .museum is only slightly better.