[This is my view, I'm not claiming that it represents the view of the industry]
I totally agree that some piece of secret data stored on your main device blurs the line between "something you know" and "something you have". Which side of the line it falls on, I think, depends on the specifics of what that data is and how the authentication protocol works, and what perspective you're looking at it from.
Technically, yubikeys, smartcards, and even OTP fobs are also just a piece of secret data stored in software, albeit in a way that's difficult to extract even for an attacker with physical access. I will argue that the thing you are proving possession of is not the device, but the secret data. With hardware tokens these are effectively the same thing, but people apply the same thinking to phones and other kinds of secrets, and I'm not sure that's the correct way to think about it.
Definitions of "Know" vs "Have"
What kind of secret it is, and how your device stores and accesses it, gives a sliding scale of security (plain text file --> yubikey). Somewhere in there lies the boundary between "know" and "have". Where you draw that line, I think, depends heavily on whose perspective you take. Some examples:
- End-user perspective: you probably draw the line at "came from my memory" vs "is stored on a device".
- System Administrator perspective: you probably draw the line at whether you can ask for the device back and be confident that the user no longer has the secret.
- Authentication Server's perspective: in most cases, the server has no way to tell whether the secret came from a secure device, or was derived from a password that the user typed in. So the relevant distinction from its perspective is whether it is a shared secret that both client and server know, or whether it's some sort of public-key scheme where you prove possession of the private key without revealing it. The practical litmus test is that "knows" tend to be vulnerable to record-and-replay attacks, while "haves" tend not to be.
There are some cases that everybody agrees on: a password that the user types into a textbox is clearly a "know", and a smartcard with an RSA keypair is clearly a "have". But no matter how you define "know" vs "have", I think there will always be edge-cases that one of the above perspectives considers a "know" while another considers a "have".
Thought experiment
Say the sysadmin generates you a new random password, stores it encrypted on a yubikey such that the key will release the encrypted password upon request, and only your VPN client has the key to decrypt and actually use the plaintext password (no idea if this is realistic, but hey, it's a thought experiment). Is that a know or a have? From the end-user's perspective it's certainly a have: they can't get into their account without the fob. From the admin's perspective it's (mostly) a have because, unless the user went out of their way to hack the fob and the VPN client, the user can't learn the password, so you can ask for the fob back and give it to another employee. But from the server's perspective it's certainly a know, because all it sees is a plaintext password; it has no way to tell whether it came off a secure device or was typed in.
Server perspective
As an application developer, the theoretical distinction that matters to me is:
With "something you know" (i.e. passwords) you are sending the secret itself over the network to the server. With "something you have" (usually cryptographic keys or seeds) you never send the secret itself, only a one-time value or challenge response that proves you have possession of the secret.
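To make that distinction concrete, here is a minimal sketch in Python. It is not any real product's protocol: the function names are made up for illustration, and for simplicity it uses a symmetric HMAC key that the server also holds (a public-key signature over the challenge would play the same role). The point it tries to show is only that a sniffed password can be replayed as-is, while a sniffed challenge response is useless against the next challenge.

```python
import hashlib
import hmac
import secrets

# All names here are hypothetical; this is an illustration, not a real protocol.

def password_login(stored_password: str, sent_password: str) -> bool:
    """'Know': the secret itself is what travels over the network."""
    return hmac.compare_digest(stored_password.encode(), sent_password.encode())

def new_challenge() -> bytes:
    """Server side: a fresh random nonce for every login attempt."""
    return secrets.token_bytes(16)

def respond(device_key: bytes, challenge: bytes) -> bytes:
    """'Have': the device answers the challenge; the key never leaves it."""
    return hmac.new(device_key, challenge, hashlib.sha256).digest()

def verify(device_key: bytes, challenge: bytes, response: bytes) -> bool:
    """Server side: recompute the expected answer and compare in constant time."""
    expected = hmac.new(device_key, challenge, hashlib.sha256).digest()
    return hmac.compare_digest(expected, response)

# Password case: whatever the attacker sniffed works again, unchanged.
sniffed_password = "hunter2"
replayed_password_ok = password_login("hunter2", sniffed_password)

# Challenge-response case: the attacker sniffs one valid exchange...
device_key = secrets.token_bytes(32)
challenge_1 = new_challenge()
response_1 = respond(device_key, challenge_1)
first_login_ok = verify(device_key, challenge_1, response_1)

# ...but the server issues a different challenge next time,
# so replaying the old response fails.
challenge_2 = new_challenge()
replayed_response_ok = verify(device_key, challenge_2, response_1)

print("replayed password accepted:", replayed_password_ok)
print("legitimate response accepted:", first_login_ok)
print("replayed response accepted:", replayed_response_ok)
```

Note that nothing in `verify` can tell whether `device_key` lives in a secure element or in a text file, which is the server-side blind spot described above.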
Consider a man-in-the-middle sniffing your web traffic. They can steal your username / password. With OTP / yubikey / etc, the secret data is a cryptographic key or an RNG seed. They can sniff all the messages they want, but they will never recover the "something you have" secret from them.
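For the OTP case, one concrete example of "seed stays on the device, only one-time values go over the wire" is the counter-based HOTP algorithm from RFC 4226. The sketch below is my own illustration of that algorithm, not a claim about what any particular fob runs internally; the seed used is the published test seed from the RFC, not something you would use in practice.

```python
import hashlib
import hmac
import struct

def hotp(seed: bytes, counter: int, digits: int = 6) -> str:
    """Derive a one-time code from a secret seed and a counter (RFC 4226)."""
    # HMAC-SHA1 over the 8-byte big-endian counter, keyed with the seed.
    mac = hmac.new(seed, struct.pack(">Q", counter), hashlib.sha1).digest()
    # Dynamic truncation: the low nibble of the last byte picks a 4-byte window.
    offset = mac[-1] & 0x0F
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    # Keep only the last `digits` decimal digits, zero-padded.
    return str(code % 10 ** digits).zfill(digits)

# Test seed from RFC 4226, Appendix D.
seed = b"12345678901234567890"

# What an eavesdropper would see over three logins: three unrelated-looking
# codes, each valid once. The seed itself is never transmitted.
codes = [hotp(seed, counter) for counter in range(3)]
print(codes)  # ['755224', '287082', '359152']
```

An attacker who records these codes learns only short truncated HMAC outputs, which is why sniffing them does not give a practical route back to the seed.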
I'm arguing that if retrieving the second factor requires the attacker to have access to your device (physical access or rootkit) or to another account of yours, then it meets my definition of "something you have".
Resistance to cloning
Resistance to cloning once the attacker already has access to the device is clearly a bonus, but (to me) not necessary to meet the definition. After all, to do a clone, the attacker already needs to "have" the device. The difference at that point between being able to use the device to impersonate you, and being able to clone the device to impersonate you, is theoretically meaningless because, either way, they already have the ability to impersonate you.