
I use libnss-pgsql2 in order to have virtual system users stored in a PostgreSQL database. The virtual users in the database work just fine. They can log in. I can see their uid, gid, groups via the 'id' command. Example:

# id backup001
uid=10001(backup001) gid=10001(backup001) groups=10001(backup001)
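
The same lookup can be repeated from a script, which is handy when testing as different users. This is a minimal sketch, assuming Python is available on the server; `lookup_user` is a hypothetical helper name, and `backup001` is the example user from above. The `pwd` and `grp` modules go through the same NSS sources that `id` does.

```python
import grp
import pwd


def lookup_user(name):
    """Resolve a user through NSS; return None if no configured source knows it."""
    try:
        entry = pwd.getpwnam(name)
    except KeyError:
        return None
    try:
        group = grp.getgrgid(entry.pw_gid).gr_name
    except KeyError:
        # The primary group id has no matching group entry.
        group = None
    return {"uid": entry.pw_uid, "gid": entry.pw_gid, "group": group}


if __name__ == "__main__":
    # "backup001" is the virtual user from the example above.
    print(lookup_user("backup001"))
```

A result of `None` would mean that neither /etc/passwd nor the pgsql source returned the user.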

However, on systems where I use libnss-pgsql2, I frequently get this error:

Could not connect to database

It often happens with cron jobs, for instance. I have a cron job that runs every hour and dumps the PostgreSQL databases to a backup. The crontab entry is this:

04 *  *   *   *     postgres umask 077 && /usr/bin/pg_dumpall | gzip > ~postgres/backup/postgresql-complete-dump-$(date +\%H).sql.gz

This job always produces the error, so I receive an e-mail about it every hour.

My setup is pretty simple: The table layout I use to store the users is available here: http://p.adora.dk/P2486.html

I use Debian Squeeze on the server.

Relevant config files are: nsswitch.conf : http://p.adora.dk/P2489.html

(description: use "normal" system users in /etc/passwd and /etc/shadow, however, if the user is NOT found, then proceed with a lookup via pgsql)

nss-pgsql.conf : http://p.adora.dk/P2487.html

(description: contains the SQL queries that are used to look up various information that normally is found in /etc/passwd and /etc/group)

nss-pgsql-root.conf : http://p.adora.dk/P2488.html

(description: contains the SQL queries that are used to look up confidential info that is normally found in /etc/shadow)

Things that I have done to debug this:

  • Verified that the connection strings in both nss-pgsql.conf and nss-pgsql-root.conf work as intended.
  • Verified that this is not a timeout: the error is printed immediately, not after 300 seconds. It also happens on an otherwise idle server, where the connection should be established without delay, and I have verified that it is.

I really hope you can help me fix this error.

Update 2012-08-22:

I ran strace on psql. The relevant part of the trace is at the bottom of this paste: http://paste.adora.dk/P2492.html

I notice that it tries to open /etc/nss-pgsql-root.conf and gets EACCES. However, I do not believe this should be a problem. This file should be readable by root only, as it corresponds to /etc/shadow, which is also readable only by root.

25341 access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
25341 open("/usr/lib/libgpg-error.so.0", O_RDONLY) = 4
25341 read(4, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0 \6\0\0004\0\0\0"..., 512) = 512
25341 fstat64(4, {st_mode=S_IFREG|0644, st_size=11540, ...}) = 0
25341 mmap2(NULL, 14512, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 4, 0) = 0xb6f6c000
25341 mmap2(0xb6f6f000, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 4, 0x2) = 0xb6f6f000
25341 close(4)                          = 0
25341 mprotect(0xb70bf000, 4096, PROT_READ) = 0
25341 mprotect(0xb73d8000, 4096, PROT_READ) = 0
25341 munmap(0xb7414000, 40018)         = 0
25341 open("/etc/nss-pgsql-root.conf", O_RDONLY) = -1 EACCES (Permission denied)
25341 write(2, "\nCould not connect to database\n", 31) = 31

It is possible that this is a bug in libnss-pgsql.... What do you think?
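
In the trace excerpt, the error message is written directly after the failed open(), so it helps to check which of the two config files the current user can actually read. This is a minimal sketch, assuming Python is available; `classify_open` is a hypothetical helper, and the paths are the ones named in this question.

```python
import errno
import os

# Paths as named in the question; adjust if your installation differs.
CONFIG_FILES = ("/etc/nss-pgsql.conf", "/etc/nss-pgsql-root.conf")


def classify_open(path):
    """Return "OK" if path opens read-only, else the errno name (e.g. "EACCES")."""
    try:
        fd = os.open(path, os.O_RDONLY)
    except OSError as exc:
        return errno.errorcode.get(exc.errno, "errno %d" % exc.errno)
    os.close(fd)
    return "OK"


if __name__ == "__main__":
    for path in CONFIG_FILES:
        print("%s: %s (euid %d)" % (path, classify_open(path), os.geteuid()))
```

Running it once as root and once as the postgres user should show whether the EACCES from the trace is specific to non-root users.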

Update 2012-08-22:

OK. I dug up this five-year-old bug report: http://pgfoundry.org/tracker/index.php?func=detail&aid=1010197&group_id=1000039&atid=234

It seems that this behavior is actually a bug. A patch has been provided, however, there is no activity on the bug report. Maybe the project is abandoned. I certainly hope not :(

ervingsb
  • I would like to add that I also get the error message when I run other commands that are not database related. I am aware that the particular cron job I pasted uses PostgreSQL, but that is a coincidence. I also see the error if I log in as my normal user and then start or resume a screen session (http://www.gnu.org/software/screen/) – ervingsb Aug 21 '12 at 12:51
  • Does postgresql's log have any relevant error messages? – DerfK Aug 21 '12 at 13:16
  • Do you get the same error if you replace localhost with 127.0.0.1 in your config files? – Jenny D Aug 21 '12 at 14:07
  • ervingsb: If you would like to add something to your post, you can use the "edit" link to update it. That will be easier to follow for later viewers than reading the question, and then the amendment in a comment. – Mark Stosberg Aug 21 '12 at 15:50
  • DerfK: No errors in PG log. Jenny: No difference if I put in 127.0.0.1 or localhost. I get the error either way. Mark: Thanks. – ervingsb Aug 22 '12 at 18:21

1 Answer

I think the answer to your problem is in this line:

open("/etc/nss-pgsql-root.conf", O_RDONLY) = -1 EACCES (Permission denied)

Try relaxing permissions on this file to be readable by "group" and "other" and see if that solves the problem.

You are wrong that the file corresponds to /etc/shadow. It corresponds to /etc/passwd, which is readable by "group" and "other". Your PostgreSQL database and tables used for authentication correspond to /etc/shadow.

It can't connect to the database because it can't read the database access credentials from this file.

Mark Stosberg
  • Thanks for your reply. The PostgreSQL logs do not show anything when I receive this error. I have enabled all kinds of logging: http://paste.adora.dk/P2491.html . Max_connections is set to 100. Of course the PG logs show lots of activity when I do 'id USER', etc. There is almost never more than 1 connection. This DB server is not used by any internet-facing web sites that can receive spikes in visitors. Also, the error is 100 % reproducible. It never happens when I do 'id USER', but it happens every time I start a screen or the psql client, for instance. – ervingsb Aug 22 '12 at 18:19
  • I completely replaced my answer. Try the new one. – Mark Stosberg Aug 22 '12 at 18:37
  • My pg_hba.conf is available here: http://p.adora.dk/P2493.html How would you suggest that I relax them? Please note that I get the error even when running the 'psql' command with the postgres user (which has full access to postgres without password via ident). I am worried that the error is actually spurious and that it is not actually the database connection that fails -- I am not sure it even tries to actually connect to the database. What do you get from the trace above? – ervingsb Aug 22 '12 at 18:38
  • Please note that the EACCES error above is a file system access error and *not* a database connection error. This is what leads me to think that this is either a bug in libnss-pgsql2 or that something happens between this open() attempt and the "Could not connect to database" error. – ervingsb Aug 22 '12 at 18:39
  • I updated my answer to clarify further. Let me know what happens when you make the file in question readable by "group" and "other". – Mark Stosberg Aug 22 '12 at 18:44
  • @ervingsb Your Postgres logs would not show anything. You are not getting to the point where you would try to connect to the database server. – voretaq7 Aug 22 '12 at 18:49
  • Mark, making nss-pgsql-root.conf world readable gets rid of the error message. However, this is *not* a solution to the problem, as it leaves the system insecure, and the official documentation explicitly states that this file must be readable by root only. – ervingsb Aug 22 '12 at 18:55
  • What happens if you restrict the permissions again, but try psql with "-h localhost" instead of a local socket connection? – Mark Stosberg Aug 22 '12 at 19:07
  • Mark: I then get "Could not connect to database", however, 'psql' does connect. The error is entirely spurious. – ervingsb Aug 23 '12 at 20:02