If I'm using something like gitolite to handle access control, how well does authorized_keys scale? That is, if I have, say, 50,000 users, what will the performance be like? (I'm guessing not very good.) What are the alternatives?

Update: I decided to do some testing myself (which I should have done in the first place). I wrote a simple script to generate SSH keys and add them to an authorized_keys file. My computer isn't that fast, so I only generated 8,061 keys and then added my own to the end; the file ended up being 3.1MB. I then added a git repository with one file and ran git clone three times:

With 8,061 keys (mine is at the end of the file):
real    0m0.442s
real    0m0.447s
real    0m0.458s

With just a single key:
real    0m0.248s
real    0m0.264s
real    0m0.255s

The performance is much better than I thought it would be. I'm still very interested in any alternatives that may be faster or more efficient for a large group of keys (50,000+).

Jeremy
  • Are you expecting to have 50,000 _simultaneous_ users on a single server, or 50,000 total users with sporadic access? – Mxx Jul 09 '13 at 19:57
  • Sporadic access. I'm asking about the performance of the SSH server having to search an authorized_keys file with 50,000 keys on every login. – Jeremy Jul 09 '13 at 20:03
  • 3
    An `authorized_keys` file with 50k keys is only around 25MB. Surely that will be completely cached into filesystem buffers. I'd imagine the time to find the key in the file would be dwarfed by the time to actually use that key to authenticate the user. – cjc Jul 09 '13 at 20:27
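
The file-size estimate can be checked against the numbers measured in the question. This is only a back-of-the-envelope sketch: it assumes "3.1MB" means 3.1 × 1024 × 1024 bytes and that all keys are about the same size as the ones the test script generated.

```python
# Figures taken from the question's test run.
measured_bytes = 3.1 * 1024 * 1024   # assumption: "3.1MB" read as MiB
measured_keys = 8061 + 1             # generated keys plus the asker's own key

# Average size of one authorized_keys line in that test.
bytes_per_key = measured_bytes / measured_keys

# Linear projection to 50,000 keys of the same type and size.
projected_mib = bytes_per_key * 50_000 / (1024 * 1024)

print(f"{bytes_per_key:.0f} bytes per key, {projected_mib:.1f} MiB for 50,000 keys")
```

That works out to roughly 400 bytes per key and about 19 MB for 50,000 keys, the same order of magnitude as the estimate above and small enough to stay in the page cache.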

1 Answer

GitHub is a visible example of how fast key lookup can be at that scale. You are not going to cause a significant bottleneck with that many keys.

However, as documented in a 2009 post on their blog, GitHub changed how SSH keys are retrieved: they are now looked up in a database. Hat tip: @Jeremy

Since you already generated over 8k keys, you can run the same test again with 50k keys.

Those keys don't need to be valid keys; just write a generator, write the file, and then append your own key to the end.
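
A generator along those lines might look like the sketch below. It is not the script used in the question. The function names `fake_rsa_key_line` and `build_authorized_keys` are made up for illustration, and the filler lines are random bytes wrapped in the `ssh-rsa` public key wire format, so they are structurally plausible but do not correspond to any real key pair. Whether a given sshd version treats such lines exactly like real keys is an assumption worth checking on your own setup.

```python
import base64
import os
import struct


def _ssh_string(data: bytes) -> bytes:
    """Encode bytes as an SSH wire-format string: 4-byte big-endian length + data."""
    return struct.pack(">I", len(data)) + data


def fake_rsa_key_line(index: int, bits: int = 2048) -> str:
    """Return one authorized_keys line holding a random, unusable ssh-rsa blob."""
    exponent = b"\x01\x00\x01"  # 65537, the usual public exponent
    # Leading zero byte keeps the mpint positive; setting the high bit of the
    # next byte makes the modulus a full `bits` bits long.
    modulus = b"\x00" + bytes([0x80 | os.urandom(1)[0]]) + os.urandom(bits // 8 - 1)
    blob = _ssh_string(b"ssh-rsa") + _ssh_string(exponent) + _ssh_string(modulus)
    return f"ssh-rsa {base64.b64encode(blob).decode('ascii')} filler-{index}"


def build_authorized_keys(count: int, real_key_line: str) -> str:
    """Return file contents: `count` filler keys followed by the real key last."""
    lines = [fake_rsa_key_line(i) for i in range(count)]
    lines.append(real_key_line.strip())
    return "\n".join(lines) + "\n"
```

Writing the result of `build_authorized_keys(50_000, your_public_key_line)` to a scratch file and timing `git clone` against it repeats the test from the question at the larger size. Putting the real key last keeps the worst case for a linear scan.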

Moshe Katz
vgoff
  • 2
    Agreed that the OP should just test out a 50K keys file with his hardware. I'm not sure if GitHub is a great example: who knows what they do on the backend? For all we know, they have a customized sshd that stores the authorized_keys in Redis. – cjc Jul 09 '13 at 20:29
  • Perhaps, but they are pretty vocal about what they are working on, and other larger sites servicing public git repositories using gitolab have not mentioned it. Not worth much weight, of course, but I have not seen any mention of it. – vgoff Jul 09 '13 at 21:04
  • Gerrit boasts of an optimized SSH key lookup, though, stating that it should be faster than Gitosis. I did not know about the Gerrit project. – vgoff Jul 09 '13 at 21:20
  • 1
    I found an article on how GitHub works: https://github.com/blog/530-how-we-made-github-fast. They use a patched sshd that gets the keys from a MySQL server. – Jeremy Jul 10 '13 at 16:04
  • 2
    Take a look at the `AuthorizedKeysCommand` option in sshd_config. – Lluís Jul 10 '13 at 22:20
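
With `AuthorizedKeysCommand`, sshd runs a helper program and reads authorized_keys-style lines from its standard output instead of (or in addition to) scanning a file. The sketch below shows what such a helper could look like in Python. Everything specific in it is an assumption for illustration: the SQLite database, the `ssh_keys` table with `fingerprint` and `key_line` columns, the path `/var/lib/git-keys/keys.db`, and the script path in the comments. It also assumes an OpenSSH version recent enough to pass the offered key's fingerprint to the command through the `%f` token.

```python
import sqlite3
from typing import List

# Hypothetical sshd_config lines that would invoke a script built on this code:
#   AuthorizedKeysCommand /usr/local/bin/lookup-key %f
#   AuthorizedKeysCommandUser nobody


def lookup_key_lines(conn: sqlite3.Connection, fingerprint: str) -> List[str]:
    """Return the authorized_keys lines stored for one key fingerprint.

    Assumes a table ssh_keys(fingerprint, key_line) with an index on
    fingerprint, so the lookup does not scan every key.
    """
    rows = conn.execute(
        "SELECT key_line FROM ssh_keys WHERE fingerprint = ?", (fingerprint,)
    ).fetchall()
    return [row[0] for row in rows]


def main(argv: List[str], db_path: str = "/var/lib/git-keys/keys.db") -> int:
    """Print matching lines for the fingerprint in argv[1]; sshd reads stdout."""
    if len(argv) != 2:
        return 1
    conn = sqlite3.connect(db_path)
    try:
        for line in lookup_key_lines(conn, argv[1]):
            print(line)
    finally:
        conn.close()
    return 0
```

A real helper would end with `sys.exit(main(sys.argv))`. For a gitolite-style setup, each stored `key_line` would carry the same `command="..."` options that the generated authorized_keys file normally holds, so only the lookup changes, not what runs after authentication.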