5

I'm getting ready to start using PGBouncer, but I'm unsure of whether it should be used on my database server or on the app servers. If it's on the app servers, there will necessarily be multiple pools of connections, vs one central pool of connections for the app servers to share, but then the TCP connections have to be recreated for each new query instead of also being pooled, presumably. Which is the "proper" way to use a connection looker like PGBouncer, and are the points I make about each even valid?

Note:

For those who stumble into this question, see PgBouncer FAQ (the last question).

orokusaki
  • 2,693
  • 4
  • 28
  • 42
  • @mdpc - this is more of a server question than a database question, since it involves questions about TCP overhead, connection pooling, as well as separation of packages on app servers vs database severs. – orokusaki Dec 22 '12 at 22:31

1 Answers1

3

Personally, I'd put it on the app server. Here's why.

PGBouncer basically implements connection pooling, which is a useful (although sometimes annoyingly troublesome) thing that reduces overall application latency by removing the cost of setting up a new connection to the database. In many cases, this is done by the database connection driver itself - see the ADO.NET MSSQL driver on Windows, PDO on PHP, etc. etc. The principal aim being to minimize the time cost between the code saying "I need to talk to the database" and then being able to actually start throwing SQL commands around.

The drivers mentioned above implement the connection pooling, so the code has to do very little before it gets to the point that the database can receive SQL commands.

PGBouncer is an odd case, because you've still got to open a connection to the PGBouncer daemon, wherever it may be. Because you're trying to minimize connection time, it therefore makes sense to put the daemon as close to the application code as possible. Ideally, you'd connect over a socket on the same box, as then you don't have to go through the TCP rigmarole just to get to the connection pooler.

As with everything though, YMMV. Test, test, and then test some more.

growse
  • 7,830
  • 11
  • 72
  • 114
  • 1
    There's a caveat to this: at the number of client pools increases, the benefits of having the server-side pool increases since each pool must keep some minimum number of persistent connections open or risk "thrashing" new connections on the server when exceeding average normal client pool size. This, in turn, results in a very large number of open connections at all times to the database server which are not free to maintain (even if idle). A server-side pool doesn't care home many clients there are an shares a smaller pool of connections across all of them... – Matthew Wood Jan 08 '13 at 20:10
  • ...I'm implementing this shift right now in a very large project and expect to see impressive performance improvements over the client-side pools, even with the overhead for each new connection to PgBouncer. – Matthew Wood Jan 08 '13 at 20:12
  • What's the performance difference between connecting to PgBouncer running on the pg server, vs just connecting to the pg server? – growse Jan 09 '13 at 15:39
  • 1
    Massive. The cost of building up a new PostgreSQL connection is at least an order of magnitude higher than the designed-to-be-lightweight PgBouncer. For example: PostgreSQL validates and limits connections by IP/database/role using pg_hba.conf, whereas PgBouncer does not--it only validates via a local user/pass file (see system table pg_shadow for what is stored). It starts to add up. I've been doing PG for about 8 1/2 years now and when there's a performance issue, the first place I look is new connection churn. It's usually the culprit. – Matthew Wood Jan 09 '13 at 16:49