1

I'm trying to track down and root cause a 'too many file open' issue on RHEL 7.

From googling around, if I run this command it give me the open file count in the first column and the PID in the second column.

$ lsof | awk '{print $2}' | sort | uniq -c | sort -n
   ...
   6300 20779
  31417 21703
  32319 21399
*1439165 21459*

But then if I try to count the open files for PID 21459, I get a much smaller number (1.4M vs 5.8k) - Why is there a difference and which is correct?

$ lsof -p 21459 | wc -l
5876

I can then get more info about PID 21459, which is Apache NiFi

$ ps -Flww -p 21459
F S UID         PID   PPID  C PRI  NI ADDR SZ WCHAN    RSS PSR STIME TTY          TIME CMD
0 S root      21459  20779 99  80   0 - 22255308 futex_ 61812044 26 10:37 ?   1-08:40:36 java -classpath /opt/nifi/nifi-1.9.2/./conf:/opt/nifi/nifi-1.9.2/./lib/jul-to-slf4j-1.7.25.jar:/opt/nifi/nifi-1.9.2/./lib/jcl-over-slf4j-1.7.25.jar:/opt/nifi/nifi-1.9.2/./lib/jetty-schemas-3.1.jar:/opt/nifi/nifi-1.9.2/./lib/javax.servlet-api-3.1.0.jar:/opt/nifi/nifi-1.9.2/./lib/slf4j-api-1.7.25.jar:/opt/nifi/nifi-1.9.2/./lib/logback-classic-1.2.3.jar:/opt/nifi/nifi-1.9.2/./lib/nifi-properties-1.9.2.jar:/opt/nifi/nifi-1.9.2/./lib/nifi-runtime-1.9.2.jar:/opt/nifi/nifi-1.9.2/./lib/nifi-framework-api-1.9.2.jar:/opt/nifi/nifi-1.9.2/./lib/logback-core-1.2.3.jar:/opt/nifi/nifi-1.9.2/./lib/nifi-nar-utils-1.9.2.jar:/opt/nifi/nifi-1.9.2/./lib/nifi-api-1.9.2.jar:/opt/nifi/nifi-1.9.2/./lib/log4j-over-slf4j-1.7.25.jar -Dorg.apache.jasper.compiler.disablejsr199=true -Xmx64g -Xms16g -Djavax.security.auth.useSubjectCredsOnly=true -Djava.security.egd=file:/dev/urandom -Dsun.net.http.allowRestrictedHeaders=true -Djava.net.preferIPv4Stack=true -Djava.awt.headless=true -XX:+UseG1GC -Djava.protocol.handler.pkgs=sun.net.www.protocol -Dnifi.properties.file.path=/opt/nifi/nifi-1.9.2/./conf/nifi.properties -Dnifi.bootstrap.listen.port=39005 -Dapp=NiFi -Dorg.apache.nifi.bootstrap.config.log.dir=/opt/nifi/nifi-1.9.2/logs org.apache.nifi.NiFi

If I run the command below, it shows 3079 file descriptors.

$ ll /proc/21459/fd | wc -l
3079
$ ll /proc/21459/fd
lr-x------ 1 root root 64 Jul 26 15:27 609 -> /opt/nifi/nifi-1.9.2/work/nar/extensions/nifi-email-nar-1.9.2.nar-unpacked/NAR-INF/bundled-dependencies/poi
lrwx------ 1 root root 64 Jul 26 15:27 61 -> socket:[98493]
lr-x------ 1 root root 64 Jul 26 15:27 610 -> /opt/nifi/nifi-1.9.2/work/nar/extensions/nifi-email-nar-1.9.2.nar-unpacked/NAR-INF/bundled-dependencies/poi
lr-x------ 1 root root 64 Jul 26 15:27 611 -> /opt/nifi/nifi-1.9.2/work/nar/extensions/nifi-email-nar-1.9.2.nar-unpacked/NAR-INF/bundled-dependencies/spr
lr-x------ 1 root root 64 Jul 26 15:27 612 -> /opt/nifi/nifi-1.9.2/work/nar/extensions/nifi-email-nar-1.9.2.nar-unpacked/NAR-INF/bundled-dependencies/spr
lr-x------ 1 root root 64 Jul 26 15:27 613 -> /opt/nifi/nifi-1.9.2/work/nar/extensions/nifi-email-nar-1.9.2.nar-unpacked/NAR-INF/bundled-dependencies/spr
lr-x------ 1 root root 64 Jul 26 15:27 614 -> /opt/nifi/nifi-1.9.2/work/nar/extensions/nifi-email-nar-1.9.2.nar-unpacked/NAR-INF/bundled-dependencies/spr

Note the colors on this output appear to indicate a problem?

58 (red text)   -> socket:[111898] (white text, red bg)
610 (blue text) -> /opt/nifi/somefile.jar (red text, black bg)
...etc...

output of ll proc/pid/fd

What else can I do to track down the cause of all the open files? How can I actively monitor the number of open files for a process? (just keep running lsof?) How do I tell if it is an application problem or server configuration problem?

  • This seems like a application setting. So I would check the recommendations provided by the Nifi docs. But before do check the current system limits (limits.conf or via ullmit) and increase those numbers if necessary. The sockets are also opening a file descriptor. To see more on that check/read on TCP TIME_WAIT. Monitoring lsof is not a bad idea. So something like this might do the job: `lsof -p [PID] -r [interval in seconds] > mylsof.out`. Also check `netstat | grep 'TIME_WAIT' |wc -l` – Tux_DEV_NULL Jul 29 '19 at 10:30
  • Appreciate the ideas. Will try that. Current limit is set to 80,000 (NiFi recommends at least 50,000). – lifebythedrop Jul 29 '19 at 23:09

0 Answers0