
We are running a file-sharing website currently serving 3-5 million page views. The site hosts about 250,000 files and a few thousand symbolic links.

The hard disk is a 1500GB SATA disk.

Using `hdparm` we found that our disk's read speed has dropped to 15-20 MB/s; it used to be 80 MB/s.

So now we want to run fsck to fix the disk problem.

  1. Will fsck solve this issue?
  2. How long will fsck take to complete? (We want to estimate the downtime we are going to have.)
– khizar ansari

4 Answers


The speed degradation is to be expected as the number of files being accessed simultaneously increases. Hard disk drives do not handle parallel access well: every time the read/write head has to switch cylinders, you lose several milliseconds to the seek. Even if two files are on the same cylinder, or even the same track, you may still have to wait up to a full rotation to move from one to the other. If you measure drive throughput in megabits per second, expect it to drop sharply as parallel access increases.
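To put a rough number on the rotation cost, here is a back-of-envelope calculation; 7200 rpm is an assumed, typical speed for a consumer SATA drive, not something stated in the question:

```shell
# One full platter rotation at an assumed 7200 rpm, in milliseconds.
awk 'BEGIN {
  rpm = 7200
  full_ms = 60000 / rpm          # 60,000 ms per minute / rotations per minute
  printf "full rotation: %.1f ms, average rotational latency: %.1f ms\n", full_ms, full_ms / 2
}'
```

At roughly 4 ms average rotational latency plus a seek, a few hundred scattered reads per second is all a single spindle can deliver, regardless of its sequential bandwidth.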

fsck will not help with this: it only repairs damage to the directory structure, it does not perform any optimization.
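You can see what fsck actually does (and gauge its duration) safely with a read-only check. A sketch on a scratch ext4 image follows; the path and 64 MB size are illustrative only, and a real 1.5 TB filesystem with 250,000 files will take vastly longer:

```shell
# Build a small throwaway ext4 image -- no root needed.
truncate -s 64M /tmp/scratch.img
mkfs.ext4 -q -F /tmp/scratch.img

# -n: open read-only and answer "no" to all repair prompts;
# -f: force a full check even though the filesystem is marked clean.
fsck.ext4 -n -f /tmp/scratch.img
```

On a healthy filesystem this reports pass statistics and changes nothing, which is exactly the point: there is no "optimize" pass for it to run.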

The ideal solution would be switching to solid-state storage since that does not have any of the physical limitations of spinning platters. But that's probably cost-prohibitive.

The next best option would be a RAID array optimized for parallel access. Keep in mind that RAID can be configured for many different performance profiles, so you will need to take some time to learn the settings of your particular RAID hardware and drivers.

You may be able to mitigate the problem with aggressive filesystem caching. If your system has sufficient RAM, Linux should already be doing this fairly well. Run a program like `top` to see how much free RAM there is. But if the most commonly used files do not fit in RAM (or in any amount of RAM you are likely to add), this will not help much.
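A quick way to check the cache situation, assuming a reasonably modern Linux with procps installed:

```shell
# "buff/cache" is RAM the kernel is already using for caching;
# "available" is roughly what it could still reclaim for hot files.
free -m
free -m | awk '/^Mem:/ { printf "cached: %s MB, available: %s MB\n", $6, $7 }'
```

If "available" is large while the disk is still thrashing, the working set of hot files is bigger than RAM and caching alone will not save you.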

A poor man's workaround would be to split your files across several physical hard drives (not just different partitions on the same drive). That is not a scalable long-term solution and would end up costing more than a decent RAID, but it can be a quick fix if you have spare drives lying around.
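Since the site already uses symbolic links, the split can be invisible to the web server. A minimal sketch with throwaway paths; in production `/tmp/disk2` would be the mount point of a second physical drive and `/tmp/site` your document root:

```shell
# Throwaway directories standing in for the document root and a second disk.
rm -rf /tmp/site /tmp/disk2
mkdir -p /tmp/site/files /tmp/disk2

# Move a hot directory to the second spindle, then link it back in place
# so URLs keep working.
mv /tmp/site/files /tmp/disk2/files
ln -s /tmp/disk2/files /tmp/site/files

ls -ld /tmp/site/files
```

In a real migration you would do this per hot subdirectory, ideally while the site is quiesced, so reads end up spread over independent spindles.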

For any solution involving hard disk drives, make sure they have a fast rotation speed and low seek latency.

I have written an article with some general background on hard-drive performance here:

UNIX Tips - Filesystems

– Seth Noble
  • I don't see his `hdparm` benchmark having much to do with "parallel access". It sounds more, to me, like he's got a failing disk. It was faster in the past and now it's not. Probably because it's relocating sectors. – Evan Anderson Feb 21 '12 at 16:44
  • That is certainly a possibility, although I would think relocation on that scale would produce some I/O errors. Based on the very slow baseline of 80 megabits per second, I was assuming the test was run on an active system. So... are there I/O errors in the system log, how were the `hdparm` tests performed, and were the results in "megabits" or "megabytes" per second? – Seth Noble Feb 21 '12 at 17:51
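The units question in the comment above is worth settling, because the factor of 8 changes the diagnosis entirely. A tiny conversion helper (the function name is mine, for illustration):

```shell
# Megabits per second -> megabytes per second (8 bits per byte).
mbps_to_MBps() { awk -v v="$1" 'BEGIN { printf "%.1f\n", v / 8 }'; }

mbps_to_MBps 80    # prints 10.0 -- 80 Mb/s would be only 10 MB/s
```

`hdparm -t` reports MB/sec (megabytes), so a drop from 80 to 15-20 MB/s is a genuine 4-5x regression, not a units mix-up.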

I would expect the fsck to take around 5 hours to complete.

I would instead consider (and by that I mean: test, test, and test again) a migration to ReiserFS.

– marcoc
  1. No (fsck can fix corrupted filesystem metadata, not a broken disk, nor is it a defragmentation tool).
  2. Depends on the filesystem. With ext3, excruciatingly long, I'd reserve several hours. More modern filesystems such as ext4 or xfs can easily be an order of magnitude faster.
– janneb
  • You mention xfs -- I wanted to note that you wouldn't use fsck for that. When I tried "fsck.xfs" it told me, "If you wish to check the consistency of an XFS filesystem or repair a damaged filesystem, see `xfs_repair`." – Noumenon Oct 21 '20 at 04:03

`hdparm` performs a sequential read. Your file server's disk is probably doing a lot of seeking, as the others have said.

If you are getting hard-disk errors, they should appear somewhere under /var/log/.
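A sketch of what to grep for; the log file paths vary by distribution (these are assumptions), and no output simply means nothing matching was logged:

```shell
# Common kernel messages for a failing SATA drive.
grep -iE 'i/o error|ata[0-9].*(error|failed)|unrecovered read' \
    /var/log/syslog /var/log/messages 2>/dev/null || true

# The kernel ring buffer is another place to look (may need root on
# some systems, hence the error suppression).
dmesg 2>/dev/null | grep -iE 'i/o error|ata[0-9].*(error|failed)' || true
```

Repeated `I/O error` or `ata … failed command` lines, especially with growing sector numbers, point at a dying disk rather than a load problem.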

Why not try `smartctl -t short /dev/sda` and then `smartctl -t long /dev/sda`? With most modern drives you can issue these commands even while the disk is in use. SMART will record the results, and you can read the drive's health with `smartctl --all /dev/sda`.

If you are running `hdparm` against a disk that is mounted and under concurrent access, that alone could explain why your results are much lower than before.

I would move your data to a RAID setup as soon as possible.