
I've got:

  • an E5-2407 CPU (quad-core Xeon, 2.2GHz)
  • 16GB of DDR3 memory
  • 240GB Intel 520 SSDs in RAID1

My homemade applications use a permanent PG connection on 7 pages (one called by customers as an API, the others called by a cURL bash loop on a 10-second timer), plus the website. The volume will go up to 1 million calls per day on the API page (currently it's around 1,000 per day). The API page's only role is to insert data into a table, which is then handled by a loop process.

In maybe a month I'll replace the bash loop with a C program, so the API page will be the only one called through nginx (apart from the website, but its volume will be very small compared to the API).

What would you recommend as settings for processes/child processes/cache/buffers for nginx/PostgreSQL/PHP-FPM?

Thanks :)

KuroNeko
  • No it's not :) I'm asking a precise question, especially about optimization with memory/disk. – KuroNeko Aug 16 '13 at 12:10
  • @KuroNeko There is no possible way we can give you recommendations for the settings you're asking about without knowing more about your environment than you can reasonably tell us. It is like asking us how far the accelerator pedal should be depressed in a 2011 Porsche 911 Turbo to reach 100MPH -- every car (environment) is different, and you need to test and get a feel for *your specific environment*. That said, consider [Craig's advice](http://serverfault.com/a/531470/32986) carefully. – voretaq7 Aug 16 '13 at 15:32
  • FWIW I think it's far more complex than voretaq7 suggests - the speed at which data is processed, network characteristics, and usage profile are (IME) far more relevant to capacity and performance than the details of the hardware. It is a capacity planning question. – symcbean Aug 16 '13 at 16:43

1 Answer


I can't speak for the other layers of the stack, but for Pg, put PgBouncer in transaction pooling mode in front of PostgreSQL. That means that as you spawn more persistent workers in your client code you won't massively blow out the number of PostgreSQL backends, and it also reduces the startup/teardown cost of lots of short-lived Pg connections.
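A minimal pgbouncer.ini sketch of that setup (the database name, auth file path, and pool sizes below are placeholders to adjust for your environment):

```
; pgbouncer.ini - minimal sketch; names, paths and sizes are placeholders
[databases]
; "appdb" stands in for the real database; PgBouncer forwards to the local PostgreSQL
appdb = host=127.0.0.1 port=5432 dbname=appdb

[pgbouncer]
listen_addr = 127.0.0.1
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
; transaction pooling: a server connection is held only for the duration of a
; transaction, so many persistent client connections share a few backends
pool_mode = transaction
; keep the real PostgreSQL backend count small; tune against your cores and I/O
default_pool_size = 20
max_client_conn = 500
```

Your PHP pages would then open their persistent connections to 127.0.0.1:6432 instead of to PostgreSQL directly.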

PostgreSQL max_connections slots aren't free, even when unused, and performance scales non-linearly with the number of connections and the number of actively working sessions. See this wiki page for the details.
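With PgBouncer multiplexing clients, max_connections can stay modest. Purely as an illustration of the kind of starting point to benchmark from on a 16GB machine, not tuned recommendations:

```
# postgresql.conf - illustrative starting values only; verify against your workload
max_connections = 100              # PgBouncer keeps the client count away from this limit
shared_buffers = 2GB               # a common starting point is roughly 1/8 to 1/4 of RAM
effective_cache_size = 8GB         # planner hint: roughly what the OS page cache can hold
work_mem = 16MB                    # per sort/hash, per backend - keep it conservative
checkpoint_completion_target = 0.9 # spread checkpoint I/O out over the interval
```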

Additionally, if the app has a lot of read-only, slow-changing data, consider caching it in something like Redis or Memcached. You'll find that PostgreSQL's LISTEN and NOTIFY features make fine-grained, timely cache invalidation easy when you do this. It's probably only worth pursuing once you know what the real-world load is like and what gets hit most often.
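If you do go that way, the pattern looks roughly like this; a sketch in Python using psycopg2 and redis-py, where the channel name, key scheme, and DSN are made-up examples, and the NOTIFY would typically be fired by a trigger on the table that changes:

```python
# Sketch: drop a stale Redis entry when PostgreSQL fires a NOTIFY.
# "customer_changed", the key scheme and the DSN are hypothetical examples.
import select

import psycopg2
import redis

pg = psycopg2.connect("dbname=appdb")      # placeholder DSN
pg.set_session(autocommit=True)            # let LISTEN take effect immediately
cache = redis.Redis()

cur = pg.cursor()
cur.execute("LISTEN customer_changed;")    # e.g. raised by a trigger on the customers table

while True:
    # Block until the PostgreSQL socket becomes readable (or time out and loop)
    if select.select([pg], [], [], 60) == ([], [], []):
        continue
    pg.poll()
    while pg.notifies:
        note = pg.notifies.pop(0)
        # The NOTIFY payload carries the customer id; invalidate that cache entry
        cache.delete("customer:" + note.payload)
```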

Beyond that: benchmark it with your workload. Then benchmark it some more. Then some more. There is no substitute for benchmarking with a simulated workload - except a real-world workload, which will always throw some surprises at you.
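For the PostgreSQL side, pgbench (which ships with PostgreSQL) can replay a custom script against the pooled port; the script name, client count, and duration below are illustrative:

```
# Illustrative only: api_insert.sql would hold an INSERT shaped like the API page's,
# replayed here by 50 clients for 5 minutes against PgBouncer on port 6432
pgbench -h 127.0.0.1 -p 6432 -n -f api_insert.sql -c 50 -j 4 -T 300 appdb
```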

Craig Ringer
  • The API page only inserts into the big table (it also selects some customer data from other tables, but they are small). That's why I think I don't really need much cache. My main concern is max_connections, which should always be higher than nginx's max processes/children to avoid rejected connections. Does having SSDs and 16GB of memory let me raise max_connections a lot, or is that only useful for caching? – KuroNeko Aug 16 '13 at 12:07
  • @KuroNeko Each connection slot takes up a certain amount of resources (see http://www.postgresql.org/docs/current/static/kernel-resources.html#SYSVIPC). Disk (and whether that disk is SSD or rotating metal) has no effect on this, but memory (and specifically kernel parameters) *does* -- review the documentation for full details. – voretaq7 Aug 16 '13 at 15:34
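For context on the kernel side, the SysV settings that comment refers to live in sysctl; a rough sketch only, with values that must be sized to your shared_buffers (and note that PostgreSQL releases from 9.3 onward barely use SysV shared memory at all):

```
# /etc/sysctl.conf - illustrative values only; size shmmax/shmall to shared_buffers
kernel.shmmax = 4294967296   # largest single shared memory segment, in bytes (4GB here)
kernel.shmall = 1048576      # total shared memory, in pages (4GB with 4kB pages)
```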