I use a redundant pair of OpenLDAP servers for PAM auth and directory services via NSS. It's been 100% reliable so far, but nothing runs flawlessly forever.
What steps should I take now so I have a fighting chance of recovering from failure of the LDAP server(s)? In my informal testing, it appears that even already authenticated shells are largely useless as all username/uid lookups hang until the directory server comes back.
So far I've come up with only two things:
- Do not use NSS-LDAP and PAM-LDAP on the LDAP servers themselves.
- Create a root-level account on all boxes that only accepts publickey authentication from our local subnet and protect that key well. I'm not sure how much good this would do me as once I'm logged in, I suspect I wouldn't be able to accomplish anything since all the userid lookups would be hanging.
Any other suggestions?