Best folder configuration for an application that opens around 1 million files, and ulimit problems

I have an application that creates 16^5 (1,048,576) files. I can configure it to create them all in the same folder, or to spread them across 1, 2, 3 or 4 levels of nested subfolders, named by hex digit. For example:

/*

or

/a/*
.
.
.
/f/*

or

/a/a/*
/a/b/*
.
.
.
/f/f/*

or

/a/a/a/*
.
.
.
/f/f/f/*

or

/a/a/a/a/*
/a/a/a/b/*
.
.
.
/f/f/f/f/*

All the files are smaller than 4 KB. I am using 64-bit Ubuntu 12.10 and an ext4 partition to store them. Which folder structure would be best for this case? Would some other filesystem be better suited? Any ideas?

Anyway, when I run this algorithm, I should be able to open 9999999 files:

user@pc$ ulimit
unlimited

user@pc$ cat /proc/sys/fs/file-max
9999999

user@pc$ cat /etc/sysctl.conf
fs.file-max = 9999999

However, when I run it saving everything in a single folder, the fopen call starts failing at around 999999 files:

user@pc$ ls database/ | wc -l
999958

Strangely, 999999 was the previous file-max value on this system. I did of course reboot the machine after updating the value; maybe the new value is too big and the old one is still in effect. What could be wrong?

Frederico Schardong

Whoever didn't like my question, please comment so I can fix it – Frederico Schardong – 2013-03-02T06:52:18.970

Why in the world would you need to create so many files? Can't you group some of the information into a single large file? – terdon – 2013-03-02T11:10:57.833

Try dividing so that you don't have more than 10^4-10^5 files per directory. Use reiserfs for small files. – Ярослав Рахматуллин – 2013-03-05T06:53:22.837

Answers

If you look at proc(5), /proc/sys/fs/file-max "defines a system-wide limit on the number of open files for all processes". In particular, it doesn't say that a single process can open that many files.

You may want to refer to sysconf(3), which describes OPEN_MAX as "The maximum number of files that a process can have open at any time". You can retrieve this value by running getconf OPEN_MAX.

I actually don't know offhand how large you can make OPEN_MAX, and I'm not inclined to investigate further at this hour, but feel free to experiment and report back to us.

Incidentally, I would also run ulimit -a to show all the limits. Running ulimit alone in bash implies ulimit -f, which only shows the maximum size of files written by the shell and its children.

P.S. If your application needs to hold a million files open at the same time, I would highly suggest re-evaluating your design.

jjlin

Yes, I am going to change it. Thanks for your answer. – Frederico Schardong – 2013-03-03T21:20:04.533