1

We have email validation in our registration form (an Ajax call to a REST endpoint that validates an email address when the user enters it). Let's say it is a normal form: first name, last name, email, password, address... There are two actions in this form.

  1. When the user enters an email, we call the backend to check whether this email is already taken by somebody.
  2. Form submit.

We do have a CSRF token (one per request, in a hidden field) for the whole form, so we get a new token after each email validation call.

But somebody can write a script that calls the email validation endpoint with the token and collects the email addresses we have in the system.

Are there any good practices to secure this kind of scenario? We also plan to implement throttling.

Anders
Ntwobike

2 Answers

2

Edit to answer actual question

Sorry for misunderstanding your question. That being said, here is an answer. If your goal is that no one can find out which email addresses are registered on your system (which is a good goal), then the simple answer is that you can't check beforehand whether the email address is taken and then tell the user the result. Regardless of what security measures you put in place, you will still have a place where people can come and ask "is this person on your site?". Even rate limiting doesn't really solve it, as a malicious attacker might be targeting a specific user. If they already know the person's email address, they only have to go to your registration form and see if the email is taken. Instead, you are going to have to adjust your registration step:

  1. You can always validate that it is an actual email address (telling them that the email address isn't an email address will protect against typos and obviously won't reveal anything sensitive).
  2. Make them type the email address twice: this will, again, help prevent typos, which will be a big aid in the next step:
  3. Force the user through an email validation process, like I outline below: send them an email with a key that they must use to activate their account. This provides a couple layers of protection. If someone is using your registration form to sniff for used email addresses, they won't find out anything. In all cases your registration form returns a generic message "Please check your email for instructions on how to activate your account". If the email address already exists then, rather than sending activation instructions, you can send a "You just tried to register but already exist in our system. Click here to recover your password. If you didn't try to register, please click this link and then disregard this message" (or something like that). Otherwise, do the normal email validation message.

Since you don't want to tell people whether the email they are registering with already exists, three kinds of people will submit your registration form:

  1. New people who are trying to register (these people will have no problem at all)
  2. Existing users who are already registered but forgot: your alternate email will help them get in very quickly.
  3. People who are trying to scrape your email list. They will get nothing at all.

As a result, malicious actors will be shut out and your legitimate users will get through without problem. A good mix between usability and security (IMO).
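
The flow above could be sketched roughly as follows. This is only an illustration under stated assumptions: the in-memory `registered`, `pending`, and `outbox` structures and the `register` function are hypothetical stand-ins for a real database, mailer, and request handler. The point is that the value returned to the browser is identical in both branches; only the email differs.

```python
import secrets

GENERIC_REPLY = (
    "Please check your email for instructions on how to activate your account."
)

# Hypothetical stand-ins for a real database and mail service.
registered = {"alice@example.com"}
pending = {}   # activation key -> email awaiting activation
outbox = []    # (recipient, body) pairs instead of real outgoing mail


def register(email: str) -> str:
    """Handle a registration attempt without revealing whether the
    address is already known: the return value is always the same."""
    email = email.strip()
    if email in registered:
        # Existing account: tell the owner by email, not the requester.
        body = ("You just tried to register but already exist in our system. "
                "Use the password recovery page if you forgot your password.")
    else:
        # New account: send the normal activation link.
        key = secrets.token_urlsafe(32)
        pending[key] = email
        body = f"Activate: https://example.com/verify_registration?key={key}"
    outbox.append((email, body))
    return GENERIC_REPLY
```

Calling `register("alice@example.com")` and `register("bob@example.com")` gives the caller the same text, so the form itself leaks nothing. A real implementation would also want both branches to take a similar amount of time.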

previous answer: mildly applicable

Proper email validation shouldn't result in someone being able to check for the existence of email addresses in your system. You may need to include more details about what exactly you are doing. That being said, it's always worth explaining what the answer should be:

Email validation should be done with a cryptographically secure random key that is emailed to the user. Validation can then be as simple as clicking a link that you send to the user, which looks something like:

https://example.com/verify_registration?key=LONGANDRANDOMSTRING

Things to keep in mind:

  1. Make sure the validation key is generated with a proper CSPRNG: if the user can guess the key from their input, it is useless.
  2. The user's email address is not part of the input and is not needed at any step in the validation process. The fact that the user knows the random key, which is only handed out via email, is what verifies that the user owns the email address. As a result, there should be no opportunity for people to scrape a list of email addresses.
  3. Make sure that your key has enough random bytes that it can't feasibly be guessed.
  4. Normally you would only use POST requests for making changes, and therefore you want to make sure that this endpoint has CSRF protection. However, that is not necessary here. The reason is that the key itself is sufficiently random that no one but the person with the email should have it anyway. So it's better to make the validation process happen via a simple link and GET request. Even if the user just has to copy and paste the key from their email to a web form, someone will get it wrong. The way this works is simple enough and secure enough that (IMO) a simpler UI is perfectly reasonable.
  5. Worth mentioning because I see this all the time: hashes don't add randomness to a string. Use an actual CSPRNG to make the random string. Don't just take some otherwise simple data and hash it. Hashes don't add entropy: they just make it look random.
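
As a minimal sketch of the points above, here is one way the key could be issued and checked with Python's standard `secrets` module, which draws from the operating system's CSPRNG. The `pending` dictionary and the `issue_key` and `verify` names are hypothetical; a real system would use a database and would probably add an expiry time.

```python
import secrets
from typing import Optional

# Hypothetical store: activation key -> email awaiting activation.
pending = {}


def issue_key(email: str) -> str:
    """Create an unguessable key and return the link to email to the user."""
    # 32 bytes from the OS CSPRNG, encoded as URL-safe text.
    key = secrets.token_urlsafe(32)
    pending[key] = email
    return f"https://example.com/verify_registration?key={key}"


def verify(key: str) -> Optional[str]:
    """Return the email that was verified, or None for an unknown key.

    The email address is never part of the input: knowing the key is the
    proof of ownership. pop() makes each key single-use.
    """
    return pending.pop(key, None)
```

Because the key is derived from nothing the user typed, there is no input an attacker can vary to learn which addresses exist.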
Conor Mancone
1

It’s not clear what threat you are trying to handle.

There are some options, which I’ll outline off the top of my head:

  • information leakage for a single email

    You will have to deny registration at some point, so there is nothing you can do against it except deliver that information via email, i.e. let the first step be an input field that triggers an email with a link to proceed.

    This makes registration even more tedious than it already is, so this is a horrible approach, UX-wise.

  • information leakage with brute force

    Iterating through (rather long) email addresses is tedious, but rate limiting only keeps in check those attackers who do not use the cloud to proxy their requests through various IPs.

    You could however make the check computationally expensive by making the requesting party use slow hash functions multiple times before submitting to your API (JavaScript seems to be necessary anyway).

    This would require you to make those computations as well for every user on every registration, but that would be acceptable, probably.

    Additionally, this would be a waste of computing power if there isn’t a list of valid email addresses and it’s purely random iteration of RFC-valid email addresses.

  • advanced threats where the email is compromised or can be intercepted

    You can do nothing about that. This applies to the first scenario as well if you have no valid S/MIME or PGP key for the address when sending the link.
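
For the brute-force item above, the "slow hash" idea could look roughly like the sketch below. It is only one possible reading: PBKDF2 is my choice of slow function (the text does not name one), and the iteration count and the `new_challenge`, `work`, and `server_accepts` names are assumptions. In practice the `work` step would run as JavaScript in the browser; it is shown in Python here so that both sides fit in one runnable example.

```python
import hashlib
import hmac
import secrets

# Assumed cost parameter; tune it so one check is noticeably slow
# for a script but still tolerable for a single human registration.
ITERATIONS = 100_000


def new_challenge() -> bytes:
    """Server side: a fresh random value handed out with the form."""
    return secrets.token_bytes(16)


def work(email: str, challenge: bytes) -> bytes:
    """Requesting side: the deliberately slow computation."""
    return hashlib.pbkdf2_hmac(
        "sha256", email.encode("utf-8"), challenge, ITERATIONS
    )


def server_accepts(email: str, challenge: bytes, proof: bytes) -> bool:
    """Server side: repeat the same computation and compare the results."""
    expected = work(email, challenge)
    return hmac.compare_digest(expected, proof)
```

As noted in that item, the server pays the same cost as the client for every check, so this raises the price of bulk enumeration without removing the underlying leak.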

Generally speaking, you might want to consider using, for example, OAuth providers for authentication so that you do not have to worry about all of this, or not using email addresses as IDs:

If you allow users to pick a pseudonym on registration that has to be unique, you can just allow a second registration for an email address; and why not*?

With email opt-in, you can delete the pending registration if it is not finished within x hours. And given you have collected a key from the user, you might also tell them they already got an account (and the pseudonym) in that email.

(*) Even users that do not have access to multiple email addresses can use the alias-feature from the RFC and create multiple accounts unless you test for equality such that a@b.c == a+b@b.c. So users might create multiple accounts anyway.
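
The equality test mentioned in the footnote could be sketched as below. This assumes the mail provider treats everything after the first `+` in the local part as an alias tag, which is a common convention but not something every provider does; `canonical_email` is a hypothetical helper name, and the input is assumed to be an already validated address.

```python
def canonical_email(address: str) -> str:
    """Strip a '+tag' alias so that a+b@b.c compares equal to a@b.c.

    Assumes `address` has already been validated and contains an '@'.
    """
    # Split on the last '@' so the domain is everything after it.
    local, _, domain = address.rpartition("@")
    # Drop the alias tag, if any (assumed provider convention).
    local = local.split("+", 1)[0]
    # Domain names are case-insensitive, so normalise their case.
    return f"{local}@{domain.lower()}"
```

Comparing `canonical_email(x) == canonical_email(y)` then implements the `a@b.c == a+b@b.c` check from the text.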

Tobi Nary