We have a number of filesystems for our computational cluster, with many users storing many very large files. We'd like to monitor these filesystems so we can help users optimize their usage and plan for expansion.
To do this, we need some way to monitor how the filesystems are used. Essentially, I'd like to know various statistics about the files:
- Age
- Frequency of access
- Last access time
- Type
- Size
Ideally, this information would be available in aggregate form for any directory, so that we could monitor usage by project or by user.
Short of writing something myself in Python, I haven't been able to find any tools that do this. Any recommendations?
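
To clarify what I'm after, here's a minimal sketch of the kind of script I'd otherwise end up writing myself. It's standard-library only, and the bucketing scheme (grouping by top-level subdirectory as a stand-in for project/user) is just a placeholder, not a real tool:

```python
#!/usr/bin/env python3
"""Rough sketch: walk a tree and aggregate size, age, and last-access
times per top-level subdirectory (hypothetical, illustrative only)."""

import os
import sys
import time
from collections import defaultdict

def aggregate(root):
    now = time.time()
    stats = defaultdict(lambda: {"files": 0, "bytes": 0,
                                 "oldest_mtime": now, "newest_atime": 0.0})
    for dirpath, _dirnames, filenames in os.walk(root):
        # Bucket everything under its top-level (project/user) directory.
        rel = os.path.relpath(dirpath, root)
        bucket = rel.split(os.sep)[0] if rel != "." else "."
        for name in filenames:
            try:
                st = os.stat(os.path.join(dirpath, name))
            except OSError:
                continue  # broken symlink, permission error, etc.
            entry = stats[bucket]
            entry["files"] += 1
            entry["bytes"] += st.st_size
            entry["oldest_mtime"] = min(entry["oldest_mtime"], st.st_mtime)
            entry["newest_atime"] = max(entry["newest_atime"], st.st_atime)
    return stats

if __name__ == "__main__":
    root = sys.argv[1] if len(sys.argv) > 1 else "."
    for bucket, s in sorted(aggregate(root).items()):
        age_days = (time.time() - s["oldest_mtime"]) / 86400
        print(f"{bucket}: {s['files']} files, {s['bytes']} bytes, "
              f"oldest file ~{age_days:.0f} days old")
```

That covers size, age, and last access (atime, assuming the filesystems aren't mounted `noatime`), but not frequency of access, and a full scan of a large filesystem is slow, which is why I'm hoping a purpose-built tool already exists.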