2

We run rsnapshot (rsync) and use ionice & nice. Towards the end of the backup memory usage explodes and the below iSCSI errors start appearing. This is a Dell PowerEdge 1850 running Fedora 25 with 4.9 kernel, 8GB RAM, 8GB swap. This is connected to a hardware RAID called a VessRAID 1840i. I've seen the suggestion to disable TCP window scaling.

Another seemingly related thread has suggestions to:

sysctl -w net.ipv4.tcp_tw_recycle=0

as well as disabling TCP timestamps:

sysctl -w net.ipv4.tcp_timestamps=0
To make this change permanent, make an entry in /etc/sysctl.conf.

A similar sounding error, "iscsid: Kernel reported iSCSI connection 1:0 error (1011) state (3)", I found a suggestion to:

turn off iSCSI pings by setting:

node.conn[0].timeo.noop_out_interval = 0
node.conn[0].timeo.noop_out_timeout = 0

And you can turn the replacement_timer to a very long value:

node.session.timeo.replacement_timeout = 86400

Here are the logs from /var/log/messages when memory usage starts to spiral out of control:

iscsid: Kernel reported iSCSI connection 1:0 error (1022 - Invalid or unknown error code) state (3)
iscsid: connection1:0 is operational after recovery (1 attempts)
chronyd[761]: Selected source 199.102.46.75
systemd: Starting dnf makecache...
dnf: Metadata cache refreshed recently.’systemd: Started dnf makecache.
systemd-logind: New session 10 of user root.
systemd: Started Session 10 of user root.
systemd-logind: Removed session 10.
kernel: connection1:0: ping timeout of 5 secs expired, recv timeout 5, last rx 76580624, last ping 76585814, now 76591204
kernel: connection1:0: detected conn error (1022)
iscsid: Kernel reported iSCSI connection 1:0 error (1022 - Invalid or unknown error code) state (3)
iscsid: connection1:0 is operational after recovery (1 attempts)
systemd: Starting dnf makecache...
kernel: connection1:0: ping timeout of 5 secs expired, recv timeout 5, last rx 81687108, last ping 81692267, now 81700110

And some logs showing the memory usage and kernel logs:

automount: page allocation stalls for 25608ms, order:1, mode:0x27000c0(GFP_KERNEL_ACCOUNT|__GFP_NOTRACK)
CPU: 3 PID: 1163 Comm: automount Not tainted 4.9.5-200.fc25.i686+PAE #1
Hardware name: Dell Computer Corporation PowerEdge 1850/0D8266, BIOS A07 04/25/2008
ee55fdb8 d6f6c6c0 d7578510 00000000 ee55fde8 d6d854a6 d757cc59 027000c0
ee55fdf0 ee55fdf8 d7578510 ee55fdcc 1cbb0e3b 00000001 d76fa158 00000001
ee55feb0 d6d86040 027000c0 d7578510 00006408 00000001 ee5a6480 00000013
Call Trace:
[<d6f6c6c0>] dump_stack+0x58/0x78
[<d6d854a6>] warn_alloc+0xf6/0x110
[<d6d86040>] __alloc_pages_nodemask+0xb10/0xce0
[<d6dd5437>] ? kmem_cache_alloc+0xf7/0x1c0
[<d6c6aa89>] ? copy_process.part.44+0x549/0x14e0
[<d6c6a648>] copy_process.part.44+0x108/0x14e0
[<d6c94bf6>] ? check_preempt_curr+0x76/0x80
[<d6c6bbe4>] _do_fork+0xd4/0x370
[<d6c6bf6c>] SyS_clone+0x2c/0x30
[<d6c0369c>] do_int80_syscall_32+0x5c/0xc0
[<d73737e9>] entry_INT80_32+0x31/0x31
Mem-Info:
active_anon:10713 inactive_anon:8264 isolated_anon:0#012 active_file:249801 inactive_file:1557433 isolated_file:0#012 unevictable:1968 dirty:2 writeback:0 unstable:0#012 slab_reclaimable:188422 slab_unreclaimable:5832#012 mapped:8048 shmem:245 pagetables:1236 bounce:0#012 free:42226 free_pcp:0 free_cma:0
Node 0 active_anon:42852kB inactive_anon:33056kB active_file:999204kB inactive_file:6229732kB unevictable:7872kB isolated(anon):0kB isolated(file):0kB mapped:32192kB dirty:8kB writeback:0kB shmem:980kB writeback_tmp:0kB unstable:0kB pages_scanned:0 all_unreclaimable? no
DMA free:3040kB min:68kB low:84kB high:100kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15996kB managed:15916kB mlocked:0kB slab_reclaimable:12720kB slab_unreclaimable:92kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
lowmem_reserve[]: 0 751 8055 8055
Normal free:5440kB min:3472kB low:4340kB high:5208kB active_anon:0kB inactive_anon:0kB active_file:4276kB inactive_file:4088kB unevictable:0kB writepending:0kB present:892920kB managed:791232kB mlocked:0kB slab_reclaimable:740968kB slab_unreclaimable:23236kB kernel_stack:1408kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
lowmem_reserve[]: 0 0 58430 58430
HighMem free:160424kB min:512kB low:8944kB high:17376kB active_anon:42852kB inactive_anon:33056kB active_file:994928kB inactive_file:6225644kB unevictable:7872kB writepending:8kB present:7479040kB managed:7479040kB mlocked:7872kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:4944kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
lowmem_reserve[]: 0 0 0 0
DMA: 14*4kB (UME) 9*8kB (UME) 4*16kB (UM) 5*32kB (UME) 8*64kB (UME) 7*128kB (UME) 3*256kB (UE) 1*512kB (M) 0*1024kB 0*2048kB 0*4096kB = 3040kB
Normal: 1032*4kB (UMEH) 126*8kB (UME) 20*16kB (UME) 6*32kB (UME) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 5648kB
HighMem: 1476*4kB (UM) 3131*8kB (UM) 1817*16kB (UM) 730*32kB (UM) 497*64kB (UM) 161*128kB (M) 22*256kB (M) 38*512kB (M) 0*1024kB 0*2048kB 0*4096kB = 160888kB
Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
1809281 total pagecache pages
228 pages in swap cache
Swap cache stats: add 5032, delete 4804, find 7386/7796
Free swap  = 8342136kB
Total swap = 8355836kB
2096989 pages RAM
1869760 pages HighMem/MovableOnly
25442 pages reserved
0 pages hwpoisoned

Are those discussions I've linked above related? Which setting change(s) should I try?

Also seeing error like this:

kernel: perf: interrupt took too long (9721 > 9697), lowering kernel.perf_event_max_sample_rate to 20000
kernel:  [<d7374857>] error_code+0x67/0x6c
RobbieTheK
  • 390
  • 5
  • 15
  • I'm still getting some details from the vendor but before I commit an answer, this is possibly a controller issue. The unit has a system report and developer debug reporting mechanism and so far they are seeing that the controller received a watch dog timer reset multiple times, and ctrl detected dirty cache of 19% in the unit, which may be normal for a large rsync job like this is. Also they saw the port received multiple login logout errors in the unit which could point to an issue with the controller. – RobbieTheK Jan 30 '17 at 21:42

2 Answers2

1

I got a couple of replies from one of the Linux kernel maintainers. This does seem to be an issue with 32-bit kernels and low memory pressure. Here are some comments (after I provided some more logs and tracepoints):

all those are from the kswapd (background memory reclaim). Which means that it doesn't catch any allocation which can stall for too long. Anyway the above tracepoint show that we are able to make some progress during the reclaim (nr_reclaimed > 0). So I suspect that this is indeed a large lowmem pressure and I do not see what we can do about that.

as well as:

and this one is hitting the min watermark while there is not really much to reclaim. Only the page cache which might be pinned and not reclaimable from this context because this is GFP_NOFS request. It is not all that surprising the reclaim context fights to get some memory. There is a huge amount of the reclaimable slab which probably just makes a slow progress.

That is not something completely surprising on 32b system I am afraid.

It seems it's not something that will get fixed any time soon, if ever. FWIW I have these settings in /etc/sysctl.conf

# Disable TCP window scaling
net.ipv4.tcp_window_scaling=0
# tip from http://serverfault.com/questions/235965/why-would-a-server-not-send-a-syn-ack-packet-in-response-to-a-syn-packet
net.ipv4.tcp_timestamps=0
net.ipv4.tcp_tw_recycle=0
# vm.zone_reclaim_mode=1 CONFIG_NUMA needs to be enabled in kernel configuration for this setting
vm.min_free_kbytes=131072

#tip from https://www.spinics.net/lists/kernel/msg2403670.html
# write-cache, foreground/background flushing
vm.dirty_ratio = 3

# vm.dirty_background_ratio = 5 (% of RAM)
vm.dirty_background_ratio = 1

# make it 10 sec
vm.dirty_writeback_centisecs = 1000

# http://serverfault.com/questions/696156/kswapd-often-uses-100-cpu-when-swap-is-in-use
vm.swappiness=25
vm.vfs_cache_pressure=1000
RobbieTheK
  • 390
  • 5
  • 15
0

This was a different issue for us. We had our MTU on the interfaces on the server and switch side set to 9000, but we forgot to set the MTU on the switch uplinks to the switch where our storage devices was connected. So it was trying to connect with bigger packets than allowed and failing. Once we set the MTU to 9000 on the uplinks, the error went away and connections started to work.

apple4ever
  • 51
  • 4