I keep getting random dropouts on my VPN tunnel. It only happens rarely (about twice a week), and if I run "service ipsec restart" the tunnel immediately starts working again. It's really annoying, as I'm trying to replicate a large VM to our DR site and every time the tunnel drops I have to start again!

Config as below. Any ideas guys?

esp-group DR {
    compression disable
    lifetime 3600
    mode tunnel
    pfs enable
    proposal 1 {
        encryption aes128
        hash sha1
    }
}

ike-group DR {
    dead-peer-detection {
        action restart
        interval 15
        timeout 30
    }
    lifetime 28800
    proposal 1 {
        dh-group 2
        encryption aes128
        hash sha1
    }
}

peer *.*.*.* {
    authentication {
        mode pre-shared-secret
        pre-shared-secret ***
    }
    connection-type initiate
    description "DR Site"
    ike-group DR
    local-address *.*.*.*
    tunnel 2 {
        allow-nat-networks disable
        allow-public-networks disable
        esp-group DR
        local {
            prefix 192.168.*.0/24
        }
        remote {
            prefix 10.*.0.0/24
        }
    }
}

When I checked the logs, they were full of messages like these:

Apr  3 13:23:37 *.*.*.* pluto[20789]: packet from *.*.*.*:500: received Vendor ID payload [Dead Peer Detection]
Apr  3 13:23:37 *.*.*.* pluto[20789]: packet from *.*.*.*:500: ignoring Vendor ID payload [RFC 3947]
Apr  3 13:23:37 *.*.*.* pluto[20789]: packet from *.*.*.*:500: ignoring Vendor ID payload [draft-ietf-ipsec-nat-t-ike-03]
Apr  3 13:23:37 *.*.*.* pluto[20789]: packet from *.*.*.*:500: ignoring Vendor ID payload [draft-ietf-ipsec-nat-t-ike-02_n]
Apr  3 13:23:37 *.*.*.* pluto[20789]: packet from *.*.*.*:500: ignoring Vendor ID payload [draft-ietf-ipsec-nat-t-ike-02]
Apr  3 13:23:37 *.*.*.* pluto[20789]: packet from *.*.*.*:500: ignoring Vendor ID payload [draft-ietf-ipsec-nat-t-ike-00]
Apr  3 13:23:37 *.*.*.* pluto[20789]: packet from *.*.*.*:500: initial Main Mode message received on *.*.*.*:500 but no connection has been authorized with policy=PSK
Apr  3 13:23:39 *.*.*.* heartbeat: [3397]: ERROR: process_status_message: bad node [****] in message
Apr  3 13:23:39 *.*.*.* heartbeat: [3397]: ERROR: MSG: Dumping message with 12 fields
Apr  3 13:23:39 *.*.*.* heartbeat: [3397]: ERROR: MSG[0] : [t=status]
Apr  3 13:23:39 *.*.*.* heartbeat: [3397]: ERROR: MSG[1] : [st=active]
Apr  3 13:23:39 *.*.*.* heartbeat: [3397]: ERROR: MSG[2] : [dt=2710]
Apr  3 13:23:39 *.*.*.* heartbeat: [3397]: ERROR: MSG[3] : [protocol=1]
Apr  3 13:23:39 *.*.*.* heartbeat: [3397]: ERROR: MSG[4] : [src=****]
Apr  3 13:23:39 *.*.*.* heartbeat: [3397]: ERROR: MSG[5] : [(1)srcuuid=0x201c570(36 27)]
Apr  3 13:23:39 *.*.*.* heartbeat: [3397]: ERROR: MSG[6] : [seq=28077b]
Apr  3 13:23:39 *.*.*.* heartbeat: [3397]: ERROR: MSG[7] : [hg=50b63627]
Apr  3 13:23:39 *.*.*.* heartbeat: [3397]: ERROR: MSG[8] : [ts=533d60db]
Apr  3 13:23:39 *.*.*.* heartbeat: [3397]: ERROR: MSG[9] : [ld=0.00 0.01 0.05 1/87 31182]
Apr  3 13:23:39 *.*.*.* heartbeat: [3397]: ERROR: MSG[10] : [ttl=3]
Apr  3 13:23:39 *.*.*.* heartbeat: [3397]: ERROR: MSG[11] : [auth=1 96fa591a077c1bd3941d450c9c8973d8f0a9440f]
Mark Gifford

2 Answers

The setting I found that helped tunnel stability a lot was:

set vpn ipsec auto-update '60'

My dead peer detection interval and timeout were longer than yours (30 and 120 seconds, respectively), and I used VTIs, but otherwise your configuration is almost identical to mine. I was able to sustain 400 Mbps through the tunnel inside a VyOS VM without problems.
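
Applied to the configuration in the question, those settings might look like the sketch below. It assumes the ike-group is named DR, as in the question, and uses the interval and timeout values quoted above; whether those values suit your link is something to test rather than take as given.

```
configure
# Value taken from the setting quoted above
set vpn ipsec auto-update 60
# Longer DPD interval and timeout (30 s and 120 s) than the 15 s and 30 s in the question
set vpn ipsec ike-group DR dead-peer-detection interval 30
set vpn ipsec ike-group DR dead-peer-detection timeout 120
commit
save
```

The dead-peer-detection action is left untouched here, so it stays at "restart" as in the question.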

Paul Gear

Sorry, I can't comment, so this isn't really an answer:

  • Did you try increasing the ike-group DR dead-peer-detection timeout value?
  • Is it possible that, roughly twice a week, a peak in network usage saturates all the available bandwidth?
Eddie C.