
We are running Elasticsearch on several Azure D12v2 instances on CentOS 7.

While indexing data, the machines' I/O throughput is quite poor, fluctuating between 3 MB/s and 15 MB/s, which is blatantly slow for SSD storage.

                               -- Taken from iotop --
Total DISK READ :       0.00 B/s | Total DISK WRITE :       3.01 M/s
Actual DISK READ:       0.00 B/s | Actual DISK WRITE:       2.68 M/s

                               -- Taken from iotop --
Total DISK READ :       0.00 B/s | Total DISK WRITE :       6.93 M/s
Actual DISK READ:       0.00 B/s | Actual DISK WRITE:       8.54 M/s
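For reference, the raw sequential write throughput of the same disk can be checked with `dd`, independent of Elasticsearch (a minimal sketch; `/mnt/resource` is the default mount point for the Azure temp disk, so adjust the path to wherever your data lives):

    # Sequential 1 GB write, bypassing the page cache via O_DIRECT
    dd if=/dev/zero of=/mnt/resource/ddtest bs=1M count=1024 oflag=direct
    rm /mnt/resource/ddtest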

Is anyone running this type of VM facing the same problem? If so, how can it be fixed? Or am I missing something and these read/write speeds are "normal"?

Running the same tests on physical machines with mechanical drives gave better results.

=====EDIT=====

I changed the instances to DS12v2, yet similar performance is observed (with occasional random spikes between 30 and 50 MB/s).

Navarro
    D12v2 doesn't use an SSD OS disk or data disks. Only the temp disk is SSD. Which disk are you using? Can you try spinning up a DS12v2 which would have all SSD disks? – GregGalloway Jan 26 '17 at 18:14
  • I am using the temp SSD disks. – Navarro Jan 27 '17 at 07:52
  • Did you configure `swap`? You can use `free` to check. For a Linux VM, you need to configure swap partitions; see the sketch after these comments. For more information, refer to this [link](https://docs.microsoft.com/en-us/azure/virtual-machines/virtual-machines-linux-optimization#linux-swap-file). – Shui shengbao Jan 30 '17 at 06:53
  • These machines have, in fact, no swap space. This seems to be the default behavior for Azure Linux machines. Thank you for sharing that link. – Navarro Jan 30 '17 at 08:47
  • In my experience, swap partitions can improve I/O. Maybe you could test it. – Shui shengbao Jan 30 '17 at 12:40
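For reference, the usual way to enable swap on an Azure Linux VM is through the Azure Linux agent, which can place a swap file on the temp (resource) disk. A minimal sketch, assuming the agent's config lives at `/etc/waagent.conf`; the 2 GB size is an arbitrary example:

    # /etc/waagent.conf -- let the Azure agent create swap on the resource disk
    ResourceDisk.Format=y
    ResourceDisk.EnableSwap=y
    ResourceDisk.SwapSizeMB=2048

    # Restart the agent to apply (the service is named waagent on CentOS 7)
    systemctl restart waagent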

1 Answer


You might have to take this up with Azure support. However, you should also check whether you're using an appropriate I/O scheduler. On a virtualized platform like this, you should be using the "noop" or "deadline" scheduler. To see which scheduler is in use, run the following command:

`cat /sys/block/sda/queue/scheduler` (your disk may not be sda, so adjust this for your setup)

This will output the available schedulers, with the one currently in use shown in brackets. You can change it temporarily by echoing the desired scheduler into that file:

`echo noop > /sys/block/sda/queue/scheduler`

You can also pass `elevator=noop` as a boot-time kernel argument if the change needs to persist beyond testing. On CentOS 7, this is done by appending it to `GRUB_CMDLINE_LINUX` in `/etc/default/grub` and regenerating the GRUB configuration, as sketched below.
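A minimal sketch of the persistent change (assuming a BIOS-booted CentOS 7 machine; on UEFI systems the generated config lives at `/boot/efi/EFI/centos/grub.cfg` instead):

    # /etc/default/grub -- append the scheduler argument to the kernel command line
    GRUB_CMDLINE_LINUX="<existing arguments> elevator=noop"

    # Regenerate the GRUB config and reboot for it to take effect
    grub2-mkconfig -o /boot/grub2/grub.cfg
    reboot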

Spooler
  • Thank you for the detailed answer. Unfortunately, the nodes are already using deadline: `cat /sys/block/sda/queue/scheduler` outputs `noop [deadline] cfq`. Will I see some improvement using noop? – Navarro Jan 27 '17 at 13:28
  • Switching to noop shouldn't help, but it won't hurt to try either. This is likely not your issue. – Spooler Jan 27 '17 at 16:58