10

The strange and extremely slow IO pattern I'm seeing is this (output of `iostat -dxk 1 /dev/xvdb1`):

Device:         rrqm/s   wrqm/s   r/s   w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await  svctm  %util
xvdb1             0.00     0.00  0.99  0.99     7.92     3.96    12.00     1.96 2206.00 502.00  99.41

Device:         rrqm/s   wrqm/s   r/s   w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await  svctm  %util
xvdb1             0.00     0.00  0.00  0.00     0.00     0.00     0.00     1.00    0.00   0.00 100.40

Device:         rrqm/s   wrqm/s   r/s   w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await  svctm  %util
xvdb1             0.00     0.00  0.00  0.00     0.00     0.00     0.00     1.00    0.00   0.00 100.40

Device:         rrqm/s   wrqm/s   r/s   w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await  svctm  %util
xvdb1             0.00     0.00  0.99  0.00     3.96     0.00     8.00     0.99 2220.00 1004.00  99.41

Device:         rrqm/s   wrqm/s   r/s   w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await  svctm  %util
xvdb1             0.00     0.00  0.00  0.00     0.00     0.00     0.00     1.00    0.00   0.00 100.40

Device:         rrqm/s   wrqm/s   r/s   w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await  svctm  %util
xvdb1             0.00     0.99  0.99  0.00     7.92     0.00    16.00     1.14 2148.00 1004.00  99.41

Device:         rrqm/s   wrqm/s   r/s   w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await  svctm  %util
xvdb1             0.00     0.00  0.00  0.00     0.00     0.00     0.00     2.01    0.00   0.00 100.40

Device:         rrqm/s   wrqm/s   r/s   w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await  svctm  %util
xvdb1             0.00     0.00  1.00  1.00     4.00     8.00    12.00     2.01 1874.00 502.00 100.40

I don't know why disk utilization and await are so high while the read/write rates are so low. What could the reason for this be?

The table being queried contains nothing but several varchar columns; one of them, last_name, is indexed (the index is actually on lower(last_name)). The query itself is simple:

SELECT * FROM consumer_m WHERE lower(last_name) = 'hoque';

Here's the explain output:

                                           QUERY PLAN                                            
-------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on consumer_m  (cost=2243.90..274163.41 rows=113152 width=164)
   Recheck Cond: (lower((last_name)::text) = 'hoque'::text)
   ->  Bitmap Index Scan on consumer_m_last_name_index  (cost=0.00..2215.61 rows=113152 width=0)
         Index Cond: (lower((last_name)::text) = 'hoque'::text)

Also note that autovacuum is enabled on the database, so no explicit vacuum/analyze was performed.

ehsanul
  • 427
  • 1
  • 8
  • 19
  • Did you customise your postgresql.conf? If CentOS has the same defaults as RHEL 5.x you'll have little memory for postgres, which could force a lot of disk IO. How big are the rows on this table? – Thiago Figueiro Dec 04 '10 at 07:37
  • The table fits into memory, as obviously does the index; it was partitioned that way. And postgresql.conf has been appropriately customized (shared_buffers, effective_cache_size etc). Even if this was not the case, I would not expect such degenerate performance. – ehsanul Dec 06 '10 at 23:55

3 Answers

5

The fact that your device is /dev/xvdb1 implies that you're running under Xen. How is your storage configured? Is there contention for the underlying device, and how does iostat look on that?

Unless you can rule that out as a likely cause, that's where I'm going to point the whirling spinner of poor-performance blame.

Basically, the overall approach to untangling a performance problem like this is to think about all the layers where a bottleneck could occur, and then devise tests to eliminate each one until you isolate the problem.
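For example, a rough sketch of how to compare the two levels (the dom0 device name below is an assumption; substitute whatever actually backs this guest's volume):

    # from dom0: watch the physical device backing the guest's volume
    iostat -dxk 1 /dev/sda

    # raw sequential read from that device, bypassing the page cache
    dd if=/dev/sda of=/dev/null bs=1M count=1024 iflag=direct

    # the same read inside the guest, for comparison
    dd if=/dev/xvdb1 of=/dev/null bs=1M count=1024 iflag=direct

If dom0 shows the same saturation the guest does, the bottleneck is below Xen; if dom0 looks healthy, the next place to look is how the volume is exported to the domU.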

mattdm
  • 6,550
  • 1
  • 25
  • 48
  • No contention. Though you're right that this is a virtual server, the hard disk has been fully dedicated to this server, and I'm only running one database query at a time with no other intensive server operations. Storage is just a single spinning SATA disk. Note that I have a couple other (separate) servers/databases with pretty much the same setup, but which work fast with low IO, as expected, given similar queries/indexing. – ehsanul Dec 04 '10 at 03:34
  • Can you run `iostat` on the disk from dom0 just to see if the picture is similar? Can you do some other basic disk benchmarks from both levels? That'll at least help narrow down where to look next. – mattdm Dec 04 '10 at 03:50
  • Sure. Why do you expect a discrepancy based on where `iostat` is run from? Should it matter? I don't actually have direct access to dom0 now, though I could get it. I'll try `fio` for benchmarking in the meantime. – ehsanul Dec 04 '10 at 03:58
  • 3
    for one thing: snapshots can create such a situation – Hubert Kario Dec 04 '10 at 12:41
  • 3
    You were right mattdm, there was contention, showing up on dom0. It was a problem of communication, my boss had given part of the hard disk to another server under the management of someone else, without my knowledge. I was under the impression that it was dedicated, because that's how we always set it up. I guess that's why it's always important to double check your assumptions. Thanks! – ehsanul Dec 07 '10 at 00:13
1

Here are some suggestions, in more or less random order:

  1. Autovacuum is not turned on by default in CentOS. There are multiple settings you have to change to enable it, and it's easy to miss one of them, so double-check that the vacuum process actually runs (see the sketch after this list for a quick way to verify).

  2. Note that you have to do a second filter step for that query, which can be expensive depending on what you get back. I would consider an index such as:

    CREATE INDEX consumer_m_lower_last ON consumer_m (lower(last_name));

    This will match against your query and remove the Recheck.

  3. Also, as mattdm points out, you can't trust iostat in virtualized environments.

  4. You should probably check http://lonesysadmin.net/2008/02/21/elevatornoop/ if you have IO problems in a Xen environment. Elevator settings can have an impact, though probably not one this large.

  5. Is the underlying disk using LVM snapshots? While this is very useful from a management perspective, it can murder IO performance. This is true both if the block device you are using is a snapshot and if a snapshot has been taken of the block device.
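A few quick checks for points 1, 4 and 5 (a sketch only; the database name is an assumption on my part):

    # 1: has autovacuum actually touched the table?
    psql -d yourdb -c "SELECT relname, last_autovacuum, last_autoanalyze
                       FROM pg_stat_user_tables WHERE relname = 'consumer_m';"

    # 4: which IO scheduler is the guest using for the device?
    cat /sys/block/xvdb/queue/scheduler

    # 5: are any of the logical volumes snapshots (check the Origin column)?
    lvs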

pehrs
  • 8,749
  • 29
  • 46
  • Thanks for the suggestions. The index is actually on lower(last_name), even though I left out "lower" from the name of the index. So I don't know why there's a recheck going on there. The disk mounted on `/` is using LVM snapshots in fact, but not the one on which the database is stored. So I don't think that is it. I will look into your other suggestions though! – ehsanul Dec 06 '10 at 23:43
1

I doubt this is a problem with PostgreSQL; it's more likely just an issue with disk IO. As the comments on another answer mention, if it is a disk IO issue, you really should measure from dom0 so you get a picture of everything that is happening.

I had a very similar problem a while back and it turned out to be an issue with the disk controller. Very slow disk access was causing the system to bottleneck while waiting for disk IO (which showed up as very high load averages and wait times, but also caused processes waiting for disk to consume more CPU than they would otherwise). It turned out that the kernel wasn't recognizing the controller properly and was falling back to the old-school IDE driver instead of a fast SATA one.
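If you suspect something similar, a couple of quick checks (a sketch, not output from the machine in question):

    # did the disk get claimed by the legacy IDE driver?
    dmesg | grep -i 'hda\|ide'

    # crude throughput check; the legacy fallback path is typically painfully slow
    hdparm -tT /dev/hda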

The fix was to boot with

hda=noprobe hda=none 

at the end of the kernel line in /etc/grub.conf. (Of course, add entries for all the disks you have, e.g. hdc=noprobe, hdc=none, hdd=...)
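For reference, the end result looks something like this in /etc/grub.conf (the kernel version and root device below are placeholders, not taken from any real machine):

    title CentOS
        root (hd0,0)
        kernel /vmlinuz-2.6.18-194.el5 ro root=/dev/VolGroup00/LogVol00 hda=noprobe hda=none
        initrd /initrd-2.6.18-194.el5.img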

Jed Daniels
  • 7,172
  • 2
  • 33
  • 41