Does Tor have any protection against an adversary simply running a very large number of nodes?
Someone with the necessary resources could just run thousands of relay nodes (including exit nodes). If they were an organization like the NSA, they could also make the major hosting companies running nodes turn over the private keys, or install backdoors, without the "owner" of the node noticing.
I know Tor employs entry guards as a protection - a client chooses a small set of entry guards at random, and only ever connects to those as entry nodes. If the entry guards are uncompromised, the user is safe from an adversary who relies on controlling both ends of a circuit. This gives the user at least a chance of never being profiled; without entry guards, a user who keeps building new circuits would eventually pick a compromised entry and exit together.
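To make that reasoning concrete, here is a rough sketch of the arithmetic as I understand it. It is a simplified model, not Tor's actual path selection: I assume the adversary controls a fraction `c` of the selection weight for both the entry and the exit position, that circuits are chosen independently, and that the client has a single guard. The function names are my own.

```python
def p_caught_without_guards(c: float, n_circuits: int) -> float:
    """Chance that at least one of n circuits has a hostile entry AND a hostile exit,
    when the entry node is picked fresh for every circuit (simplified model)."""
    return 1.0 - (1.0 - c * c) ** n_circuits


def p_caught_with_one_guard(c: float, n_circuits: int) -> float:
    """Same question with one fixed guard: the user is exposed only if the guard
    is hostile (probability c) and at least one circuit also gets a hostile exit."""
    p_hostile_exit_eventually = 1.0 - (1.0 - c) ** n_circuits
    return c * p_hostile_exit_eventually


if __name__ == "__main__":
    c = 0.10  # hypothetical adversary share, chosen only for illustration
    for n in (1, 100, 10_000):
        print(n, p_caught_without_guards(c, n), p_caught_with_one_guard(c, n))
```

In this model the no-guard probability tends to 1 as the number of circuits grows, while the guard version can never exceed `c`. That is the "at least a chance" I mean above, and it is also why a random `c` portion of users still looks exposed to me.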
However, what if the adversary is not interested in identifying all users that access a certain site, or in targeting a specific user? If they just want to identify some random portion of the users that access that site, couldn't they do this by running a few thousand nodes and waiting?
I can imagine they could even target specific users, and force them to use only compromised nodes. Compromise one guard node of the user (wiretap his line, observe what server he connects to and send them a court order or some thugs, or just be lucky and control the right nodes by chance). Then run thousands of modified clients. Once the targeted user goes online, flood the network momentarily. In cooperation with your compromised nodes, keep the compromised paths free, so that the client will eventually build a circuit only on your nodes. Voila, you can eavesdrop on the user.
Are there any protections against this in Tor? Can you give an estimate of how many nodes the attacker would have to run? Are there any non-technical countermeasures, e.g. would someone intervene if 3000 new suspicious nodes popped up on AWS?
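To show the kind of estimate I am after, here is a back-of-the-envelope sketch. As far as I understand, Tor picks relays roughly in proportion to their bandwidth weight rather than uniformly, so I assume what matters is the adversary's share of total weight, not the raw node count. The inputs (`honest_weight`, `weight_per_relay`) are hypothetical placeholders, not measured values, and the function names are my own.

```python
import math


def attacker_share(n_relays: int, weight_per_relay: float, honest_weight: float) -> float:
    """Fraction of total selection weight held by the attacker after adding
    n_relays relays of equal weight to a network with honest_weight in total."""
    added = n_relays * weight_per_relay
    return added / (honest_weight + added)


def relays_needed(target_share: float, honest_weight: float, weight_per_relay: float) -> int:
    """Smallest number of equal-weight relays that reaches target_share.
    Solves k*w / (H + k*w) >= c for k, i.e. k >= c*H / ((1 - c) * w)."""
    if not 0.0 <= target_share < 1.0:
        raise ValueError("target_share must be in [0, 1)")
    if honest_weight <= 0 or weight_per_relay <= 0:
        raise ValueError("weights must be positive")
    return math.ceil(target_share * honest_weight / ((1.0 - target_share) * weight_per_relay))
```

For example, `relays_needed(0.25, 3000, 1)` asks how many unit-weight relays would be needed to hold a quarter of the weight in a network whose honest relays sum to 3000 units. What I cannot judge is which real-world numbers to plug in, and whether the network would let that many new relays gain weight unnoticed.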
(Note this is different from other questions on this site. For example my previous question asks about the case where the attacker can completely control your line; he fakes the whole network. Tor guards against this by using a list of known good nodes, and using signatures.)