I recently had a look at implementing the Swedish BankID identification system for an app.
A prerequisite is that a client certificate is installed on our backend, which authenticates requests from our backend server.
Briefly, the flow looks like this when signing in to our app (the "client app") on a device and using another device (phone/tablet) to authenticate the user.
- The client app sends an `auth` request to our backend.
- Our backend sends an `auth` request to BankID.
- We get a reply containing, among other things, a `qrStartToken` and `qrStartSecret`, which are used to generate QR codes that the user must scan with the BankID app on their phone/tablet.
- The response also includes `autoStartToken`, which identifies the "session" in subsequent calls to the BankID API (a session lasts 30 seconds).
- The client app keeps making `collect` requests (via our backend) every 2 seconds, and when they have scanned the QR code the response will contain the status `"complete"`, which means they are logged in.
- The QR codes must be regenerated on our backend using the `qrStartToken`, `qrStartSecret` and number of seconds since the response from the `auth` request. Hence the phrase "animated QR codes".
Until recently, the QR code was static and was retrieved by making a request to the BankID API and supplying the `autoStartToken` (the session identifier). This has been deprecated, and apps should now use the "animated" QR codes, which are regenerated every second.
The reason BankID introduced the QR code system in the first place was that attackers would call unsuspecting victims, claiming to be from the bank or some other "authority", and say something along the lines of "We noticed that someone tried to hack into your bank account. Could you please log in using your BankID and check whether there are any suspicious transactions?". The victim would then start logging in to their account. What they didn't know was that the attacker had started logging in to the victim's account on the attacker's own computer a few seconds earlier, so the "session" that the victim authorized in the BankID app on their phone was actually the attacker's session! The QR code solves this by requiring visual access to the "session" (i.e. scanning the QR code in your desktop browser with the BankID app on your phone).
So, what I'm curious about is what the "animated" QR code system actually improves. The static QR codes require only the `autoStartToken`, while the animated ones require `qrStartToken`, `qrStartSecret` and the time of the `auth` response. However, these are all received by the backend in the same response. So it seems to me that if an attacker could somehow get their hands on an `autoStartToken` (within its 30-second TTL), they could also get their hands on the other values. In any case, getting access to the response from the BankID API would require a far more advanced attack than the "phone scam" explained above.
Can someone explain what type of attack the animated QR codes actually prevent, that the static QR codes are vulnerable to?