4

While diagnosing performance issues with vendor software that runs off of SQL Anywhere (9.0.2), I stumbled upon some interesting data in regards to I/O Bandwidth. According to the 9.0.2 manual, the database property "CurrIO" shows "The current number of file I/Os that were issued by the server but have not yet completed.". However, it's unclear what this number should be, given a hardware configuration and/or database utilization.

After a bit of searching, I found that the SQL Anywhere 10.0.0 manual does go into this setting in a bit more detail in their chapter on performance:

To detect whether I/O bandwidth is a limiting factor, check the CurrIO database statistic. If this statistic is not present on the graph, click the Add Statistics button and select CurrIO. Look for the largest sustained number for this statistic. For example, look for a high plateau on the graph; the wider it is, the more significant the impact. If the graph has sustained values equal to, or greater than 3 + the number of physical disks used by database server, it may indicate that the disk system cannot keep up with the level of database server activity.

Is this saying that, for example, if I have 5 disks in the server, this number should ideally be below 8? Is the meaning behind this value the same for version 9.0.2 as 10.0.0? The reason I find this hard to believe is the results of the following command are a bit off in my particular case:

SELECT db_property ( 'CurrIO' ), db_property ( 'MaxIO' ) 

The above command returns over 900 for the CurrIO and 1150 for the MaxIO. I have been monitoring this number for a few hours and the average is approximately 950 (Thanks to the Foxhound monitor from RisingRoad). These readings have been taken under normal database load.

Is my I/O bandwidth truly as inadequate as it looks, or am I misinterpreting these numbers?

Here is the current server configuration:

OS: Windows Server 2003 R2 32-bit

Database Version: SQL Anywhere (Adaptive Server Anywhere) 9.0.2.3381

CPU: 4x Intel Xeon Dual Core 3.00GHz

RAM: 26GB (22GB Allocated to SQL Anywhere cache)

HDD (C:/): OS + Temporary File Location

RAID 1

2x 36GB SCSI-320 (15k RPM)

HDD (D:/): DB File Location

RAID 5

4x 73GB SCSI-320 (15k RPM)

HDD (E:/): OS Pagefile + Log File Location (There is no Mirror Log)

RAID 5

4x 73GB SCSI-320 (15k RPM)

Notes: The RAID1 and first RAID5 (D:/) are on the same RAID controller. We were planning on upgrading both RAID5 with 146GB (15k RPM) drives in RAID10. Would that change help our apparent I/O Bandwidth issue?

3 Answers3

2

When dealing with RAIDs the traditional Disk counters in perfmon may give back misleading results. They will show cache I/O rather than disk I/O. So make sure you also look at the % Idle Time counter. This will probably the most accurate result, but it will be inverted (lower percentage equals busier disks)

Joseph Kern
  • 9,809
  • 3
  • 31
  • 55
  • Thanks, it looks like it averages around 20%, so perhaps disk I/O is not an issue. – Ralph Wissing Jun 18 '09 at 20:52
  • If the value you are reading is 20%, that means it's only idle 20% of the time. Meaning 80% of the time it's writing. You are getting very close to having I/O problems. – Joseph Kern Jun 22 '09 at 15:59
  • My mistake- I actually did the "conversion" when I posted. The graph read around 80% idle, so I/O load was about 20%. That being said, I think the only way to be sure would be to get the information from the RAID controller itself. – Ralph Wissing Jun 24 '09 at 13:45
1

The CurrIO statistic is not SMP safe in SA. You'd be better to look at the "PhysicalDisk" counters provided by Windows perfmon. In particular: "Current Disk Queue Length", "Avg. Disk Queue Length", "Avg Disk Write Queue Length" and "Avg. Disk Read Queue Length".

I'm not sure where the "3+#disks" value came from. If you are expecting lots of IO to be done on a drive, it is very reasonable to have several IOs outstanding on that drive.

Graeme Perrow
  • 545
  • 1
  • 4
  • 16
  • I also thought the "3 + # disks" was a little strange- Perfmon doesn't show anything outrageous, so the performance issues may not be due to disk I/O. Should "CurrRead" and "CurrWrite" be SMP safe? Looks to be around 100/s average, with spikes just over 1,000/s. – Ralph Wissing Jun 18 '09 at 21:02
1

Another way to see how much I/O is being performed by the database is by looking at the cache statistics. If the database is reading from cache, it isn't doing as much disk I/O. The two db properties that can be viewed are "CacheRead" and "CacheHits", like so:

SELECT db_property ( 'CacheRead' ), db_property ( 'CacheHits' )

The SQL Anywhere 10.0.0 manual recommends at least a 70% cache hit percentage. If it's below that, you may need to allocate more cache to the server. You can get the percentage directly like this:

SELECT STRING(((db_property ( 'CacheHits' ) / db_property ( 'CacheRead' )) * 100), '%')

In my particular case, when the database had a 22GB cache, the hit percentage was about 58%. After setting the cache to 55GB, the hit percentage went up to 97%. Although the "CurrIO" and "MaxIO" property exact numbers may be incorrect, the relative drop was drastic after this change as well.