
On my web server (Apache running on Linux CentOS), there is a very large log file (50 GB). This web server hosts several web services in production.

When I tried to delete the log file, the web server stopped responding for about 10 seconds (a service outage):

rm -f monthly.log

Is there any way to delete this large file without Apache freezing?

Jinbom Heo

8 Answers


Rotate it first via logrotate, using a config like this:

/path/to/the/log {
    missingok
    notifempty
    sharedscripts
    daily   
    rotate 7
    postrotate
        /sbin/service httpd reload > /dev/null 2>/dev/null || true
    endscript
    compress
}

then create a cron job to delete the rotated file during off-peak hours (02:30 in this example):

30 2 * * * nice -n 19 ionice -c2 -n7 rm -f /path/to/the/log/file.1
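Before relying on the rotation, the config can be checked with logrotate's debug mode, which prints what would happen without touching the log. The /tmp path below is just for illustration; on CentOS the config would normally live under /etc/logrotate.d/:

```shell
# write a minimal version of the config to a scratch location (hypothetical path)
cat > /tmp/apache-monthly.conf <<'EOF'
/path/to/the/log {
    missingok
    notifempty
    daily
    rotate 7
    compress
}
EOF

# -d (debug) is a dry run: logrotate reports its plan but changes nothing
command -v logrotate >/dev/null 2>&1 && logrotate -d /tmp/apache-monthly.conf || true
```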
Karma Fusebox
quanta
  • Can you explain what this means / does? – mowwwalker Feb 21 '13 at 06:13
  • You are 'nicing' and 'ionicing' the deletion. nice arguably prevents CPU overuse, but the important part here is ionice, where you are actually telling the I/O scheduler to delete the file with a lower priority. -c selects the class, where 1 is real-time, 2 normal (best-effort), and 3 idle. Within class 2 you have levels 0 to 7 (IIRC), where 7 is the lowest. If that still causes problems, run it with 'ionice -c3' and it should be fine. – golan Feb 21 '13 at 19:06
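The effect of those flags can be observed directly: a child process inherits the I/O scheduling class set by ionice and can report it back. This assumes ionice from util-linux is installed, so the check is guarded:

```shell
# run a child under best-effort class 2, priority 7, and ask it to
# report its own I/O scheduling settings
if command -v ionice >/dev/null 2>&1; then
    prio=$(ionice -c2 -n7 sh -c 'ionice -p $$')
else
    prio="ionice unavailable"
fi
echo "$prio"
```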

For faster deletion of big files, you can use the truncate command to shrink the file to zero size first and then delete it:

truncate -s 0 monthly.log && rm -f monthly.log

As quanta recommended, though, you should still set up logrotate for it first.

Daniel t.

If you don't need the data, truncate it using /dev/null:

cat /dev/null > monthly.log

The webserver will continue to write data to the file after the truncation, which avoids any need to restart the webserver (unlike rm monthly.log, which removes the file).
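The in-place truncation can be sketched with an ordinary file, using /tmp/demo.log as a stand-in for the real log:

```shell
# simulate a log file with existing data
printf 'old log data\n' > /tmp/demo.log

# truncate in place; the file keeps its name and inode
cat /dev/null > /tmp/demo.log

# a writer (here: the shell) can keep appending afterwards
printf 'new entry\n' >> /tmp/demo.log

wc -c < /tmp/demo.log   # only the 10 bytes of "new entry\n" remain
```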

After solving the immediate crisis, consider log rotation as quanta suggested. You don't want this happening again. Note that the Apache log files are already rotated by default on CentOS.

Also consider sending the web logs through syslog (using /usr/bin/logger, for example). Logs that are created using syslog also usually have logrotation set up already.

Stefan Lasiewski
echo "0" > monthly.log && rm -f monthly.log
Amit Biyani

I would truncate/zero the file with : > /path/to/monthly.log. Then, if necessary, restart the Apache process and set up log rotation to prevent this from happening in the future...

This comes up often, though:

See: Is there a way to delete 100GB file on Linux without thrashing IO / load?

In unix, what's the best way to reduce the size of a massive log file that is actively being written to?

Linux server out of space

ewwhite

If you're using the ext3 filesystem, consider switching to ext4.

Ext3 can be slow at deleting large files because it stores the location of every individual 4k block: a 50GiB file (50*1024^3 bytes) occupies 13107200 blocks, each of which is recorded in the inode table as a 32-bit block number, for a total of 50MiB of bookkeeping data just to keep track of where the file's contents are located on the disk. That large block list may be scattered across many indirect blocks, all of which have to be updated when the file is deleted. Disk seeking to access all those indirect blocks is probably what's causing the delay.

Ext4, on the other hand, allocates files in "extents" of up to 128MiB. That 50GiB file can be recorded in the inode table using just 400 extent records, rather than 13107200 individual block numbers, which dramatically reduces the amount of disk I/O needed when deleting the file.
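The bookkeeping numbers above can be verified with a little shell arithmetic:

```shell
file_bytes=$((50 * 1024 * 1024 * 1024))   # 50 GiB

# ext3: one 4-byte (32-bit) block number per 4 KiB block
blocks=$((file_bytes / 4096))
echo "blocks: $blocks"                                  # 13107200
echo "pointer MiB: $(( blocks * 4 / 1024 / 1024 ))"     # 50

# ext4: one extent record can cover up to 128 MiB
extents=$((file_bytes / (128 * 1024 * 1024)))
echo "extents: $extents"                                # 400
```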

Note that if you convert an existing ext3 filesystem in-place into ext4, new files will be allocated using extents, but existing files will still use block lists. You can use the chattr +e command to reallocate an existing file using extents; performance-wise, this is comparable to making a copy of the file and then deleting the original.

Wyzard

This boils down to a filesystem performance issue. There's an interesting answer to this at this SO question but this does rather depend on what filesystem you're using. I used XFS when creating a filesystem to store hundreds of multi-gigabyte MPEG2 files for MythTV because at the time the delete performance of XFS was far superior to ext3. Things may have changed considerably in the intervening years.

I do like @quanta's answer though. Splitting up the file into smaller parts will lead to faster deletion.

Tim Potter

The problem comes, I suppose, from the fact that you are deleting the file as a privileged user, whose disk operations have higher priority than those of the Apache web server user. No matter which way you choose to delete the log file (rm -f or truncating with >), you should lower the priority of its disk operations to a minimum:

  ionice -c3 rm -f filename.log
Andrei Mikhaltsov