
The situation is strange: I have two identical servers running the same application, but on one server the custom application fails with "ulimit error: too many open files", while on the other it works as expected.

I have made sure the configuration is the same on both servers, but I cannot figure out why this is happening.

FACTS

/etc/systemd/system.conf

DefaultLimitNOFILE=100000000:100000000

/etc/systemd/user.conf

DefaultLimitNOFILE=10000000

/etc/security/limits.conf

arserver         soft     nproc          10000000
arserver         hard     nproc          10000000
arserver        soft     nofile         10000000
arserver        hard     nofile         10000000
root soft     nproc          10000000
root hard     nproc          10000000
root soft     nofile         10000000
root hard     nofile         10000000

cat /etc/sysctl.conf

net.core.rmem_default = 65536
net.core.wmem_default = 65536
net.core.rmem_max = 8388608
net.core.wmem_max = 8388608
net.ipv4.tcp_max_orphans = 4096
net.ipv4.tcp_slow_start_after_idle = 0
net.ipv4.tcp_synack_retries = 3
net.ipv4.tcp_window_scaling = 1
net.ipv4.tcp_reordering = 3
net.ipv4.tcp_fastopen = 1
net.ipv4.tcp_tw_reuse = 1
net.ipv4.ip_local_port_range = 32768 65535
vm.nr_hugepages = 1250
fs.file-max = 10000000

cat tracelog | grep pam_limits

arserver@arserver03:/carmicli/carmi$ cat testlog1 | grep limits
openat(AT_FDCWD, "/lib/x86_64-linux-gnu/security/pam_limits.so", O_RDONLY|O_CLOEXEC) = 7
openat(AT_FDCWD, "/proc/1/limits", O_RDONLY) = 7
openat(AT_FDCWD, "/etc/security/limits.conf", O_RDONLY) = 7
read(7, "# /etc/security/limits.conf\n#\n#E"..., 4096) = 2345
openat(AT_FDCWD, "/etc/security/limits.d", O_RDONLY|O_NONBLOCK|O_CLOEXEC|O_DIRECTORY) = 7
openat(AT_FDCWD, "/proc/1/limits", O_RDONLY) = 7
openat(AT_FDCWD, "/etc/security/limits.conf", O_RDONLY) = 7
read(7, "# /etc/security/limits.conf\n#\n#E"..., 4096) = 2345
openat(AT_FDCWD, "/etc/security/limits.d", O_RDONLY|O_NONBLOCK|O_CLOEXEC|O_DIRECTORY) = 7

ulimit -a shows the updated limits after a reboot, but the application still will not start

arserver@arserver03:/carmicli/carmi$ ulimit -a
real-time non-blocking time  (microseconds, -R) unlimited
core file size              (blocks, -c) 0
data seg size               (kbytes, -d) unlimited
scheduling priority                 (-e) 0
file size                   (blocks, -f) unlimited
pending signals                     (-i) 1030919
max locked memory           (kbytes, -l) 32998380
max memory size             (kbytes, -m) unlimited
open files                          (-n) 1048576
pipe size                (512 bytes, -p) 8
POSIX message queues         (bytes, -q) 819200
real-time priority                  (-r) 0
stack size                  (kbytes, -s) 8192
cpu time                   (seconds, -t) unlimited
max user processes                  (-u) 10000000
virtual memory              (kbytes, -v) unlimited
file locks                          (-x) unlimited

I also created a systemd service to see if I can override the global limits, like this:

[Unit]
Description=Carmi Miner
After=network.target

[Service]
User=root
WorkingDirectory=/app/carmi/
ExecStart=/app/carmi/app.elf
Restart=on-abnormal

LimitNOFILE=1000000000
LimitNOFILESoft=1000000000

[Install]
WantedBy=multi-user.target

But it still fails:

Apr 08 14:19:45 arserver app.elf[3553]: ulimit error:  too many open files, possibly.
Apr 08 14:19:45 arserver systemd[1]: app.service: Main process exited, code=exited, status=19/n/a
Apr 08 14:19:45 arserver systemd[1]: app.service: Failed with result 'exit-code'.
Apr 08 14:19:45 arserver systemd[1]: app.service: Consumed 21.420s CPU time.
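
To see which limit a process really received, rather than what the unit file asks for, the "Max open files" row of /proc/<pid>/limits can be read. A minimal sketch, assuming the standard layout of that file; the helper name nofile_from_limits is hypothetical:

```shell
# nofile_from_limits (hypothetical helper): read text in /proc/<pid>/limits
# format on stdin and print the soft and hard "Max open files" values.
nofile_from_limits() {
    # Row layout: Max open files  <soft>  <hard>  files
    awk '/^Max open files/ { print $4, $5 }'
}
```

For example, `nofile_from_limits < /proc/1/limits` prints the limits of PID 1 (the file pam_limits opens in the trace above), and `systemctl show app.service -p LimitNOFILE` shows what systemd parsed from the unit. As far as I know, the unit-file syntax for separate soft and hard values is `LimitNOFILE=soft:hard`; `LimitNOFILESoft` appears only as a property name in `systemctl show` output, so that line in the unit is probably ignored.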

I also added pam_limits.so to the pam.d configs, as I read that in non-LSB releases it can be missing from some of them. I added it as required to common-session, sudo and sshd:


arserver@arserver03:/app/app$ grep -r "pam_limit" /etc/pam.d/
/etc/pam.d/cron:session    required   pam_limits.so
/etc/pam.d/login:session    required   pam_limits.so
/etc/pam.d/sshd:session    required     pam_limits.so
/etc/pam.d/sudo:session    required   pam_limits.so
/etc/pam.d/su:session    required   pam_limits.so
/etc/pam.d/common-session:session required        pam_limits.so
/etc/pam.d/common-session-noninteractive:session    required   pam_limits.so
/etc/pam.d/runuser:session      required    pam_limits.so

I've been pulling my hair out over this problem for the past week; if anyone is able to help, it would be much appreciated.

Moving to Ubuntu 20.04 is an option, but it would take me a long time to move the data, so I would prefer to figure out the solution if possible.

UPDATE

When I do sudo su to root, I get the same problem, but with this error in the auth log:

Apr  8 14:45:31 arserver05 su: pam_limits(su:session): Could not set limit for 'nofile' to soft=10000000, hard=10000000: Operation not permitted; uid=0,euid=0
Apr  8 14:45:31 arserver05 su: pam_limits(su:session): Could not set limit for 'nofile' to soft=10000000, hard=10000000: Operation not permitted; uid=0,euid=0

UPDATE 2

I can't set the ulimit above 1048576:

root@arserver03:/home/arserver# ulimit -n 1048576
root@arserver03:/home/arserver# ulimit -n 10485767
bash: ulimit: open files: cannot modify limit: Operation not permitted
root@arserver03:/home/arserver# ulimit -n 1048576
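
The ceiling of exactly 1048576 matches the default of the fs.nr_open sysctl, which caps the nofile limit of a single process; the kernel rejects anything higher with "Operation not permitted", even for root. fs.file-max, set above, is a separate system-wide limit. A minimal sketch for checking a requested value against that ceiling; check_nofile is a hypothetical helper, and whether fs.nr_open is the cause here is an assumption:

```shell
# check_nofile REQUESTED [CEILING] (hypothetical helper)
# Reports whether a requested nofile limit fits under the per-process ceiling.
# CEILING defaults to the live value of fs.nr_open.
check_nofile() {
    requested=$1
    ceiling=${2:-$(cat /proc/sys/fs/nr_open)}
    if [ "$requested" -le "$ceiling" ]; then
        echo "ok: $requested <= fs.nr_open ($ceiling)"
    else
        echo "too high: $requested > fs.nr_open ($ceiling)"
        return 1
    fi
}
```

Calling `check_nofile 10000000` with no second argument reads the live ceiling from /proc/sys/fs/nr_open. If that turns out to be the cause, raising it with `sysctl -w fs.nr_open=10000000` (and adding the same key to /etc/sysctl.conf) before raising the nofile limits should let the larger value through; comparing the value on both servers would confirm whether they really differ.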
Arturski
  • Does this answer your question? [Practical maximum open file descriptors (ulimit -n) for a high volume system](https://serverfault.com/questions/48717/practical-maximum-open-file-descriptors-ulimit-n-for-a-high-volume-system) – djdomi Apr 08 '22 at 16:54
  • No, it doesn't help, because the settings are in place, including for the application, but the ulimit error still happens, whereas on the other server, configured with the same settings, the same application runs OK – Arturski Apr 09 '22 at 09:39
  • Since you basically want to remove the limit, have you tried `ulimit -n unlimited`? Also, please give the output of `cat /proc/sys/fs/file-max` – djdomi Apr 09 '22 at 10:54
