
I'm working on a program that needs to run on a Linux distro with an ext2 filesystem. This program will write files that may become very large. I notice that ext2 has a maximum file size of 16GB to 64GB. However, one thing on Wikipedia's page that scared me somewhat is the following line:

There are also many userspace programs that can't handle files larger than 2 GB.

...when it's talking about ext2's limitations. Does this mean that I should be careful about letting a file grow larger than 2 GB?

Jason Baker
    He is working on a PROGRAM that is handling files. Why on earth do you think this is about file systems or administering a server? Shouldn't a programmer tell him what happens when a file crosses the 2 GB boundary? – Hassan Syed Dec 08 '09 at 15:24

4 Answers


What you'll find is that some programs use 'fseek' to move around in a file.

int fseek(FILE *stream, long int offset, int origin);

If they seek relative to the start of the file (SEEK_SET for the origin parameter), they only have a signed 32-bit integer as the offset parameter (long is 32 bits on 32-bit systems), so they can only get 2 GB into the file.

For programs that don't use fseek/ftell (for instance, a program that just reads through the entire file in a linear fashion), and for programs that only use fseek to jump back and forth slightly from the current position (SEEK_CUR with offsets < 2 GB), there's no problem: everything will work just fine, no matter how big the file is. It's only programs that randomly access the file data that are going to have a problem.

Note that some environments have 'fseek64' and 'ftell64' functions, which give the caller a 64-bit signed offset, and thus access to any position in the file.
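
For illustration, here's a minimal sketch of seeking past 2 GB with the POSIX fseeko/ftello functions, whose offset type is off_t; defining _FILE_OFFSET_BITS=64 makes off_t 64 bits wide even on 32-bit systems (the file name big.dat is made up):

/* Minimal sketch: seek past 2 GB with fseeko/ftello (offset type off_t).
 * "big.dat" is a made-up file name. */
#define _POSIX_C_SOURCE 200112L  /* declare fseeko/ftello */
#define _FILE_OFFSET_BITS 64     /* 64-bit off_t even on 32-bit systems */
#include <stdio.h>
#include <sys/types.h>

int main(void)
{
    FILE *f = fopen("big.dat", "rb");
    if (!f) {
        perror("fopen");
        return 1;
    }

    /* A 3 GiB offset overflows a signed 32-bit long,
     * but fits comfortably in a 64-bit off_t. */
    off_t pos = (off_t)3 * 1024 * 1024 * 1024;

    if (fseeko(f, pos, SEEK_SET) != 0)
        perror("fseeko");
    else
        printf("now at offset %lld\n", (long long)ftello(f));

    fclose(f);
    return 0;
}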

Michael Kohne
    POSIX defines lseek and fseeko, which take an off_t offset; the _FILE_OFFSET_BITS macro is usually needed to make sure that off_t is a 64-bit integer. – Luca Tettamanti Dec 08 '09 at 17:38

I've never had problems, and my system logs are routinely larger than 2 gigs on some of my servers with external IPs (the logs rotate weekly, not by size). I also run a couple of massive feeds that produce files that are 3-6 gigs in size, and I haven't had problems with those either.

I'd say it's completely dependent on which user-land programs you need: if one of them can't handle large files and that's a deal breaker, you may need to re-evaluate.

Satanicpuppy

The file size limit is very dependent on the block size of your file system: the single-file limit is 16GB if you have a 1K block size, 256GB for 2K, and 4TB for 4K. You can check your block size using:

mojo-jojo david% sudo tune2fs -l /dev/sda1 | grep "Block size"
Block size:               4096

This is on an ext3 partition, but ext2 has the same limits. I would be very surprised if you have a 1K block size partition, so you most likely don't need to worry about the file system.
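
If you'd rather check from inside a program than shell out to tune2fs, statvfs(3) reports the block size; a minimal sketch (the mount point "/" is just an example path):

/* Sketch: query the file system block size programmatically.
 * The mount point "/" is only an example path. */
#include <stdio.h>
#include <sys/statvfs.h>

int main(void)
{
    struct statvfs vfs;

    if (statvfs("/", &vfs) != 0) {
        perror("statvfs");
        return 1;
    }

    /* f_bsize is the file system block size in bytes. */
    printf("Block size: %lu\n", (unsigned long)vfs.f_bsize);
    return 0;
}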

Having said that, some programs do lack large file support (files larger than 2GB), but I've not seen one in a very long time. The last one I saw was Apache commons-daemon's jsvc, which fell over when its log file got larger than 2GB. Pretty much anything written in the last 6 years will work unless someone went out of their way to do something weird.
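
If you want to sanity-check your own toolchain and file system, one quick test is to create a sparse file just past the 2 GB mark and stat it; a sketch, with test.bin as a made-up file name:

/* Sketch: create a sparse file just past 2 GB to sanity-check
 * large file support. "test.bin" is a made-up name; delete it afterwards. */
#define _FILE_OFFSET_BITS 64    /* 64-bit off_t even on 32-bit systems */
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>

int main(void)
{
    int fd = open("test.bin", O_CREAT | O_WRONLY, 0644);
    if (fd < 0) {
        perror("open");
        return 1;
    }

    /* 2 GiB + 1 byte: one past the signed 32-bit limit. */
    off_t size = ((off_t)1 << 31) + 1;
    if (ftruncate(fd, size) != 0) {  /* sparse, so little real disk use */
        perror("ftruncate");
        close(fd);
        return 1;
    }

    struct stat st;
    if (fstat(fd, &st) == 0)
        printf("file size: %lld bytes\n", (long long)st.st_size);

    close(fd);
    return 0;
}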

David Pashley

The 2 GB limit has its origin in the 32-bit size of ssize_t/size_t/off_t on older systems. That is part of the POSIX spec and is not especially related to ext2.

As mentioned in a comment above, you can compile your app with -D_FILE_OFFSET_BITS=64 so that these types are 64 bits wide.
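
A quick way to confirm the flag took effect is a compile-time check on sizeof(off_t); a sketch (the file name check_lfs.c is made up):

/* Sketch: verify at compile time that off_t is 64 bits.
 * Build: gcc -D_FILE_OFFSET_BITS=64 check_lfs.c   (file name is made up) */
#include <stdio.h>
#include <sys/types.h>

_Static_assert(sizeof(off_t) == 8,
               "off_t is not 64-bit; compile with -D_FILE_OFFSET_BITS=64");

int main(void)
{
    printf("sizeof(off_t) = %zu bytes\n", sizeof(off_t));
    return 0;
}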

Here is an article about the state of large file support in Linux.

dmeister