
I am learning the basics of the SSH protocol. I am confused about the contents of the following two files:

  1. ~/.ssh/authorized_keys: Holds a list of authorized public keys for servers. When the client connects to a server, the server authenticates the client by checking its signed public key stored within this file

  2. ~/.ssh/known_hosts: Contains DSA host keys of SSH servers accessed by the user. This file is very important for ensuring that the SSH client is connecting to the correct SSH server.

I am not sure what this means. Please help.

galath
Ankit
  • Similar question on [unix.se]: [SSH key-based authentication: known_hosts vs authorized_keys](http://unix.stackexchange.com/questions/42643/i-have-some-questions-about-ssh-key-based-authentication) – Gilles 'SO- stop being evil' Sep 26 '12 at 12:32

4 Answers


The known_hosts file lets the client authenticate the server, to check that it isn't connecting to an impersonator. The authorized_keys file lets the server authenticate the user.

Server authentication

One of the first things that happens when the SSH connection is being established is that the server sends its public key to the client, and proves (thanks to public-key cryptography) to the client that it knows the associated private key. This authenticates the server: if this part of the protocol is successful, the client knows that the server is who it claims it is.

The client may check that the server is a known one, and not some rogue server trying to pass off as the right one. SSH provides only a simple mechanism to verify the server's legitimacy: it remembers servers you've already connected to, in the ~/.ssh/known_hosts file on the client machine (there's also a system-wide file /etc/ssh/known_hosts). The first time you connect to a server, you need to check by some other means that the public key presented by the server is really the public key of the server you wanted to connect to. If you have the public key of the server you're about to connect to, you can add it to ~/.ssh/known_hosts on the client manually.

By the way, known_hosts can contain any type of public key supported by the SSH implementation, not just DSA (also RSA and ECDSA).

Authenticating the server has to be done before you send any confidential data to it. In particular, if the user authentication involves a password, the password must not be sent to an unauthenticated server.

User authentication

The server only lets a remote user log in if that user can prove that they have the right to access that account. Depending on the server's configuration and the user's choice, the user may present one of several forms of credentials (the list below is not exhaustive).

  • The user may present the password for the account that they are trying to log into; the server then verifies that the password is correct.
  • The user may present a public key and prove that they possess the private key associated with that public key. This is exactly the same method that is used to authenticate the server, but now the user is trying to prove their identity and the server is verifying it. The login attempt is accepted if the user proves that they know the private key and the public key is in the account's authorization list (~/.ssh/authorized_keys on the server).
  • Another type of method involves delegating part of the work of authenticating the user to the client machine. This happens in controlled environments such as enterprises, when many machines share the same accounts. The server authenticates the client machine by the same mechanism that is used the other way round, then relies on the client to authenticate the user.
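
The "authorization list" part of the public-key method can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names (`listed_keys`, `is_key_listed`) and the sample keys are made up, the parsing ignores quoted options that contain spaces, and a real server additionally requires the client to prove possession of the private key.

```python
# Key-type tokens this sketch recognises (not an exhaustive list).
KEY_TYPES = {"ssh-rsa", "ssh-dss", "ssh-ed25519", "ecdsa-sha2-nistp256"}

def listed_keys(authorized_keys_text):
    """Yield (key_type, key_base64) for each entry in an authorized_keys file."""
    for line in authorized_keys_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        # An entry may begin with options, so search for the key-type token.
        for i, field in enumerate(fields[:-1]):
            if field in KEY_TYPES:
                yield field, fields[i + 1]
                break

def is_key_listed(authorized_keys_text, key_type, key_base64):
    """True if the presented public key appears in the account's list."""
    return (key_type, key_base64) in set(listed_keys(authorized_keys_text))

# Placeholder entries; the key strings are NOT real keys.
SAMPLE = """\
# keys allowed to log in to this account
ssh-ed25519 AAAAexampleKeyOne alice@laptop
no-pty ssh-rsa AAAAexampleKeyTwo bob@desktop
"""

print(is_key_listed(SAMPLE, "ssh-rsa", "AAAAexampleKeyTwo"))
print(is_key_listed(SAMPLE, "ssh-rsa", "AAAAsomeOtherKey"))
```

Being listed is necessary but not sufficient: the login only succeeds if the client also proves that it holds the matching private key.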
Gilles 'SO- stop being evil'
  • thanks this is very helpful. Would it be right to say that the known_hosts file is maintained on the client whereas the authorized_keys file is maintained on the server? – Ankit Sep 28 '12 at 07:25
  • @Ankit Yes, that is the case. – Gilles 'SO- stop being evil' Sep 28 '12 at 14:38
  • I have both files on the server and ssh to it to test it. But the 2 files have different contents. So the keys are different in these files? – Timo Apr 11 '18 at 08:19
  • @Timo The keys are completely unrelated. One is the key of a machine, the other is the key of a user. – Gilles 'SO- stop being evil' Apr 11 '18 at 14:06
  • @Gilles So once the entry for a server's public key is added to the *known_hosts* file in the client's machine, any subsequent ssh session between the two doesn't require the server to prove that it has the right private key? – Geek Aug 30 '19 at 11:53
  • @Geek No, the server needs to prove that each time, otherwise there'd be no way to know that it's the same server. `known_hosts` lets the client know what the server's public key should be. – Gilles 'SO- stop being evil' Aug 30 '19 at 18:08
  • @Geek -- see my explanation "About Secure Files Containing Public Keys," below. – IAM_AL_X Aug 31 '19 at 16:52

Those two files are both used by SSH but for completely different purposes, which could easily explain your confusion.

Authorized Keys

By default SSH uses user accounts and passwords that are managed by the host OS. (Well, actually managed by PAM but that distinction probably isn't too useful here.) What this means is that when you attempt to connect to SSH with the username 'bob' and some password the SSH server program will ask the OS "I got this guy named 'bob' who's telling me his password is 'wonka'. Can I let him in?" If the answer is yes, then SSH allows you to authenticate and you go on your merry way.

In addition to passwords SSH will also let you use what's called public-key cryptography to identify you. The specific encryption algorithm can vary, but is usually RSA or DSA, or more recently ECDSA. In any case when you set up your keys, using the ssh-keygen program, you create two files. One that is your private key and one that is your public key. The names are fairly self-explanatory. By design the public key can be strewn about like dandelion seeds in the wind without compromising you. The private key should always be kept in the strictest of confidence.

So what you do is place your public key in the authorized_keys file. Then when you attempt to connect to SSH with username 'bob' and your private key, it will ask the OS "I got this guy named 'bob', can he be here?" If the answer is yes, then SSH will have your client prove that it holds the private key (the private key itself never leaves your machine) and verify that it pairs with a public key in the authorized_keys file. If both answers are yes, then you are allowed in.

Known Hosts

Much like how the authorized_keys file is used to authenticate users, the known_hosts file is used to authenticate servers. Whenever SSH is configured on a new server it always generates a public and private key for the server, just like you did for your user. Every time you connect to an SSH server, it shows you its public key, together with a proof that it possesses the corresponding private key. If you do not have its public key yet, then your client will ask whether you trust it and, if you accept, add it to the known_hosts file. If you have the key, and it matches, then you connect straight away. If the keys do not match, then you get a big nasty warning. This is where things get interesting. The three situations in which a key mismatch typically happens are:

  1. The key changed on the server. This could be from reinstalling the OS or on some OSes the key gets recreated when updating SSH.
  2. The hostname or IP address you are connecting to used to belong to a different server. This could be address reassignment, DHCP, or something similar.
  3. A malicious man-in-the-middle attack is happening. This is the biggest thing that key checking is trying to protect you from.
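
The three outcomes described above (unknown host, matching key, mismatching key) can be sketched as a small lookup. This is a simplified illustration, not how any particular client is implemented: the function name `check_host_key` and the sample entry are hypothetical, and hashed or marker-prefixed entries are skipped.

```python
def check_host_key(known_hosts_text, host, key_type, key_base64):
    """Return 'match', 'mismatch' or 'unknown' for a key presented by host."""
    seen_same_type = False
    for line in known_hosts_text.splitlines():
        fields = line.split()
        if len(fields) < 3 or fields[0].startswith(("#", "@", "|")):
            continue  # skip blanks, comments, markers and hashed entries
        names, entry_type, entry_key = fields[0].split(","), fields[1], fields[2]
        if host not in names or entry_type != key_type:
            continue
        if entry_key == key_base64:
            return "match"      # connect straight away
        seen_same_type = True   # a different key is on record for this host
    return "mismatch" if seen_same_type else "unknown"

# Placeholder entry; the key string is NOT a real host key.
KNOWN = "example.org,192.0.2.10 ssh-ed25519 AAAAhostKeyOne\n"

print(check_host_key(KNOWN, "example.org", "ssh-ed25519", "AAAAhostKeyOne"))
print(check_host_key(KNOWN, "example.org", "ssh-ed25519", "AAAAdifferent"))
print(check_host_key(KNOWN, "other.example", "ssh-ed25519", "AAAAhostKeyOne"))
```

A "mismatch" result cannot tell you which of the three situations applies; that has to be decided by checking the server's key through some other channel.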

In both cases, known_hosts and authorized_keys, the SSH program is using public key cryptography in order to identify either the client or the server.

Scott Pack
  • "Every time you connect to an SSH server it presents its private key in order to prove its identity." I certainly hope not! I assume you meant _its public key_. If a server presented me, the client, with its private key - it (A) wouldn't work for me to authenticate it and (B) is an indication that the server is so badly configured that I should stop doing business with it immediately. Private keys should only be accessible on the machine of origin by designated users. That's kinda the point. ;-) – underscore_d Sep 22 '15 at 20:33
  • This answer helped me more than the accepted one (: – chaosguru Aug 29 '18 at 09:07
  • If I ssh to a local server (local IP), and then later from the same computer but now remotely connect to the same server (public IP) will it trigger mismatching keys? How can you mitigate this? – deanresin Aug 08 '19 at 03:55

About Secure Files Containing Public Keys

To help you understand how "known_hosts" and "authorized_keys" are different, here is some context explaining how those files fit into "ssh". This is an over-simplification; there are lots more capabilities and complications to "ssh" than are mentioned here.

Associations are in Trusted Sources

While it has been said that public-key values "can be safely strewn about like seeds in the wind," keep in mind that it's the gardener, not the seed-pod, who decides which seeds get established in the garden. Although a public-key is not secret, fierce protection is required to preserve the trusted association of the key with the thing that the key is authenticating. The places entrusted to make this association include "known_hosts", "authorized_keys", and "Certificate Authority" listings.

The Trusted Sources Used by "ssh"

For a public-key to be relevant to "ssh," the key must be registered ahead of time, and stored in the appropriate secure file. (This general truth has one important exception, which will be discussed later.) The server and client each have their own, securely stored list of public-keys; a login will succeed only if each side is registered with the other.

  • "known_hosts" resides on the client
  • "authorized_keys" resides on the server

The client's secure file is called "known_hosts", and the server's secure file is called "authorized_keys". These files are similar in that each has text with one public-key per line, but they have subtle differences in format and usage.

Key-pairs are Used for Authentication

A public-private key pair is used to perform "asymmetric cryptography." The "ssh" program can use asymmetric cryptography for authentication, where an entity has to answer a challenge to prove its identity. In simplified terms, the challenge is answered using one key and checked using the other: the entity being authenticated signs data with its private key, and the other side verifies the signature with the matching public-key. (Note that asymmetric cryptography is used only during the connection setup and login phase; then "ssh" switches to symmetric encryption to handle the data stream. SSH is a separate protocol from TLS/SSL.)
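
To make the sign-and-verify idea concrete, here is a toy sketch using textbook RSA with tiny numbers. It is deliberately insecure and is not what "ssh" actually runs (real implementations use large keys and standardised signature schemes); the function names and the session value are made up for illustration.

```python
import hashlib

# Toy RSA key pair from tiny textbook primes (61 and 53). NOT secure.
N = 61 * 53   # public modulus (3233)
E = 17        # public exponent
D = 2753      # private exponent: (E * D) % ((61 - 1) * (53 - 1)) == 1

def digest(message: bytes) -> int:
    """Hash the message and reduce it so it fits under the toy modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N

def sign(message: bytes) -> int:
    """Done by the key owner: requires the private exponent D."""
    return pow(digest(message), D, N)

def verify(message: bytes, signature: int) -> bool:
    """Done by the other side: requires only the public values E and N."""
    return pow(signature, E, N) == digest(message)

session_data = b"hypothetical-session-identifier"
signature = sign(session_data)
print(verify(session_data, signature))            # genuine signature
print(verify(session_data, (signature + 1) % N))  # tampered signature
```

The point of the sketch is the asymmetry: anyone holding the public values can check the signature, but only the holder of the private exponent could have produced it.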

One Key-pair for Server, Another for Client

In "ssh", both sides (client and server) are suspicious of the other; this is an improvement over the predecessor to "ssh," which was "telnet". With "telnet", the client was required to provide a password, but the server was not vetted. The lack of vetting allowed "man-in-the-middle" attacks to occur, with catastrophic consequences to security. By contrast, in the "ssh" process, the client surrenders no information until the server first answers a challenge.

The Steps in "ssh" Authentication

Before sharing any login information, the "ssh" client first eliminates the opportunity for a man-in-the-middle attack by challenging the server to prove "Are you really who I think you are?" To make this challenge, the client needs to know the public-key that is associated with the target server. The client must find the server's name in the "known_hosts" file; the associated public-key is on the same line, after the server name. The association between server-name and public-key must be kept inviolate; therefore the "known_hosts" file must not be writable by anyone else. Permissions of 600 achieve this and also prevent others from reading it.

Once the server has been authenticated, it gets a chance to challenge the client. The authentication will involve one of the public-keys found in "authorized_keys". (When none of those keys works, the "sshd" process can fall back on password-style authentication, if the server is configured to allow it.)

The File Formats

So for "ssh", as with any login process, there are lists of "friends", and only those on the list are allowed to attempt to pass a challenge. For the client, the "known_hosts" file is a list of friends who can act as servers (hosts); these are listed by name. For the server, the equivalent list of friends is the "authorized_keys" file; but there are no names in that file, since the public-keys themselves act like identifiers. (The server doesn't care where the login is coming from, but only where it's going. The client is attempting to access a particular account; the account name was specified as a parameter when "ssh" was invoked. Remember that the "authorized_keys" file is specific to that account, since the file is under that account's home directory.)

Although there are many capabilities that can be expressed in a configuration entry, the basic, most common usage has the following parameters. Note that parameters are separated by space characters.

For "known_hosts":

{server-id} ssh-rsa {public-key-string} {comment}

For "authorized_keys":

ssh-rsa {public-key-string} {comment}

Note that the token ssh-rsa indicates that the algorithm used for encoding is "rsa". Other valid algorithms include "dsa" and "ecdsa". Therefore, a different token might take the place of the ssh-rsa shown here.
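
The algorithm token is also embedded at the start of the {public-key-string} itself, which is base64 text wrapping a binary blob whose first field is a length-prefixed algorithm name. The sketch below builds a dummy entry and reads the name back out; the helper names are hypothetical and the blob is laid out like an "ssh-rsa" key but is not a usable key.

```python
import base64
import struct

def ssh_string(data: bytes) -> bytes:
    """Encode bytes as a 4-byte big-endian length followed by the data."""
    return struct.pack(">I", len(data)) + data

def key_type_from_blob(key_base64: str) -> str:
    """Read the algorithm name stored at the start of the decoded key blob."""
    blob = base64.b64decode(key_base64)
    (length,) = struct.unpack(">I", blob[:4])
    return blob[4:4 + length].decode("ascii")

# Dummy blob with the ssh-rsa layout (name, exponent, modulus); NOT a real key.
dummy_blob = (ssh_string(b"ssh-rsa")
              + ssh_string(b"\x01\x00\x01")
              + ssh_string(b"\x00" + b"\xab" * 8))
dummy_entry = "ssh-rsa " + base64.b64encode(dummy_blob).decode("ascii") + " demo@example"

# Split the entry into the three space-separated parameters described above.
token, key_base64, comment = dummy_entry.split()
print(token, key_type_from_blob(key_base64), comment)
```

This is why real "ssh-rsa" key strings tend to start with the same few base64 characters: they all encode the same algorithm name.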

Let "ssh" Auto-Configure the "known_hosts" Entry

In both cases, if the public key is not found within a secure file, then asymmetric encryption does not happen. As mentioned earlier, there is one exception to this rule. A user is allowed to knowingly choose to risk the possibility of a man-in-the-middle attack by logging into a server that is not listed in the user's "known_hosts" file. The "ssh" program warns the user, but if the user chooses to go forward, the "ssh" client allows it "just this once." To assure it happens just once, the "ssh" process automatically configures the "known_hosts" file with the required information by asking the server for the public-key, and then writing that into the "known_hosts" file. This exception totally subverts security by allowing the adversary to provide the association of a server-name with a public-key. This security risk is allowed because it makes things so much easier for so many people. Of course, the correct and secure method would have been for the user to manually insert a line with server-name and public-key into the "known_hosts" file before ever attempting to log in to the server. (But for low-risk situations, the extra work might be pointless.)
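
For the manual route, the line to insert follows the "known_hosts" format shown earlier. A minimal sketch, working on the file contents as text and assuming the key was obtained over a trusted channel; the function name `add_known_host` and the sample values are hypothetical.

```python
def add_known_host(known_hosts_text, host, key_type, key_base64):
    """Return the file contents with an entry for host added, if not present."""
    entry = f"{host} {key_type} {key_base64}"
    lines = known_hosts_text.splitlines()
    if entry in lines:
        return known_hosts_text  # already registered, nothing to do
    return "\n".join(lines + [entry]) + "\n"

# Placeholder values; the key string is NOT a real host key.
contents = add_known_host("", "example.org", "ssh-rsa", "AAAAplaceholderKey")
contents = add_known_host(contents, "example.org", "ssh-rsa", "AAAAplaceholderKey")
print(contents, end="")
```

Calling the helper twice with the same values leaves a single entry, so it is safe to rerun.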

The One-to-Many Relationships

An entry in the client's "known_hosts" file has the name of a server and a public-key that is applicable to the server machine. The server has a single private-key that is used to answer all challenges, and the client's "known_hosts" entry must have the matching public-key. Therefore, all clients that ever access a particular server will have the identical public-key entry in their "known_hosts" file. The 1:N relation is that a server's public-key can appear in many clients' "known_hosts" files.

An entry in the "authorized_keys" file identifies that a friendly client is allowed to access the account. The friend might use the same public-private key pair to access multiple, different servers. This allows a single key-pair to authenticate to all servers ever contacted. Each of the targeted server accounts would have the identical public-key entry in their "authorized_keys" files. The 1:N relation is that one client's public-key can appear in the "authorized_keys" files for multiple accounts on multiple servers.

Sometimes, users who work from multiple client machines will replicate the same key pair; typically this is done when a user works on a desktop and a laptop. Because the client machines authenticate with identical keys, they will match the same entry in the server's "authorized_keys".

Location of Private Keys

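For the server side, a system process, or daemon, handles all incoming "ssh" login requests. The daemon is named "sshd". The location of the server's private key depends upon the SSH installation; OpenSSH commonly keeps host keys in /etc/ssh, in files named like ssh_host_rsa_key (with the public half in ssh_host_rsa_key.pub), and the HostKey option in sshd_config can point elsewhere. (Directories such as /System/Library/OpenSSL or /opt/local/etc/openssl belong to OpenSSL's TLS/SSL configuration rather than to "ssh".)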

For the client side, you invoke "ssh" (or "scp") when you need it. Your command line will include various parameters, one of which may optionally specify which private key to use. By default, the client side key-pair is often called $HOME/.ssh/id_rsa and $HOME/.ssh/id_rsa.pub.

Summary

Bottom line is that both "known_hosts" and "authorized_keys" contain public keys, but ...

  • known_hosts -- the client checks if host is genuine
  • authorized_keys -- the host checks whether client login is allowed
IAM_AL_X

Not true at all.

The known_hosts file contains the fingerprint of the host. It is not the public or private key of the remote host.

It is generated from their key - but it is emphatically NOT the key itself.

If you SFTP to an address that might resolve to several (varying) hosts (load balanced etc) you must add the fingerprints from all the possible end points, or it will work initially and then fail when it is routed to the second (or subsequent) host.
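
As a rough illustration of how a fingerprint relates to a key: the SHA256-style fingerprint that OpenSSH displays is a hash of the decoded key blob, so it can be computed from a base64 key string but not turned back into one. The sketch below assumes that scheme; the function name is hypothetical and the inputs are dummy bytes, not real host keys.

```python
import base64
import hashlib

def sha256_fingerprint(key_base64: str) -> str:
    """Hash the decoded key blob and format it in the 'SHA256:...' style."""
    blob = base64.b64decode(key_base64)
    digest = hashlib.sha256(blob).digest()
    # The padding '=' is dropped in the displayed form.
    return "SHA256:" + base64.b64encode(digest).decode("ascii").rstrip("=")

# Dummy inputs standing in for the base64 key field of two host entries.
key_a = base64.b64encode(b"not a real host key A").decode("ascii")
key_b = base64.b64encode(b"not a real host key B").decode("ascii")

print(sha256_fingerprint(key_a))
print(sha256_fingerprint(key_b))
```

Each distinct host key yields its own fingerprint, which is consistent with the load-balancing point above: every end point with a different key needs its own entry.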

Peter
  • erm look at your known_hosts file and compare it to a host fingerprint when you connect.... That should clear it up a bit. Furthermore, your example would be exactly the same, regardless if it is fingerprints or public keys in the known_hosts file. – Njomsky Feb 16 '15 at 22:54
  • This answer would be better if it explained that ["the Debian openssh-client package sets several options as standard in `/etc/ssh/ssh_config` which are not the default"](https://manpages.debian.org/stretch/openssh-client/ssh_config.5.en.html) including `HashKnownHosts yes`. – Robin A. Meade Mar 26 '21 at 23:41
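
For readers curious about the `HashKnownHosts yes` option mentioned in the comment above: with it, the host name field is stored in hashed form while the key itself is still stored in full. The sketch below assumes the commonly described layout `|1|base64(salt)|base64(HMAC-SHA1(salt, hostname))`; treat that layout, and the helper names, as assumptions to check against the OpenSSH documentation rather than as a reference implementation.

```python
import base64
import hashlib
import hmac

def hash_host(host: str, salt: bytes) -> str:
    """Build a hashed host field from a host name and a salt (assumed layout)."""
    mac = hmac.new(salt, host.encode("utf-8"), hashlib.sha1).digest()
    encode = lambda raw: base64.b64encode(raw).decode("ascii")
    return "|1|{}|{}".format(encode(salt), encode(mac))

def hashed_field_matches(hashed_field: str, host: str) -> bool:
    """Check whether a hashed host field was produced from the given host name."""
    _, version, salt_b64, mac_b64 = hashed_field.split("|")
    if version != "1":
        return False
    salt = base64.b64decode(salt_b64)
    expected = base64.b64decode(mac_b64)
    actual = hmac.new(salt, host.encode("utf-8"), hashlib.sha1).digest()
    return hmac.compare_digest(actual, expected)

# Fixed salt for a reproducible demo; a real client would use random bytes.
field = hash_host("example.org", bytes(range(20)))
print(field)
print(hashed_field_matches(field, "example.org"))
print(hashed_field_matches(field, "other.example"))
```

The hashing hides which hosts you have connected to from someone reading the file; it does not change what is stored after the host field.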