We use Cisco ASA for our IPSEC VPNs, using the EZVPN method. From time to time we encounter problems where an ISP has made a change to their network and our VPN stops working. Nine times out of ten the ISP denies that their change could have stopped this working - I suspect because they don't understand exactly what might have caused the problem. Rather than just bashing heads with them I want to try and point them in a direction that might get a speedier resolution.

In my current incident, I can ssh onto the external interface of the ASA and do a little poking around:

 sh crypto isakmp sa

   Active SA: 1
    Rekey SA: 0 (A tunnel will report 1 Active and 1 Rekey SA during rekey)
Total IKE SA: 1

1   IKE Peer: {Public IP address of London ASA}
    Type    : user            Role    : initiator
    Rekey   : no              State   : AM_TM_INIT_XAUTH_V6C

At the other end of the link I see the following:

Active SA: 26
<snip>
25  IKE Peer: {public IP address of Port-Au-Prince-ASA}
    Type    : user            Role    : responder
    Rekey   : no              State   : AM_TM_INIT_MODECFG_V6H

I can't find any documentation for what AM_TM_INIT_XAUTH_V6C or AM_TM_INIT_MODECFG_V6H mean, but I'm pretty sure they indicate that the IKE handshake has failed for some reason.

Can anyone suggest any likely things that might be preventing IKE from succeeding, or specific details of what AM_TM_INIT_XAUTH_V6C means?

Update: We connected the ASA at the site of a customer of another ISP. The VPN connection came up immediately. This confirms that the problem is not configuration related. The ISP is now accepting responsibility and investigating further.

Update: The connection suddenly came back online last week. I have notified the ISP to see if they changed anything, but not heard back yet. Frustratingly I am now seeing a similar issue on another site. I found a Cisco doc on the effects of fragmentation on VPN. I am starting to think that this may be the cause of the issues I am seeing.

dunxd
  • I've got a bunch of output from `debug crypto isakmp 255` - too much to drop in here. Can anyone give me any pointers for what in that might be relevant for troubleshooting? I can then add it to the question. – dunxd Jun 24 '11 at 08:52
  • Pastebin it, then link to the pastebin ;) – Tom O'Connor Jun 24 '11 at 09:12
  • Is anyone seriously going to trawl through a debug dump? That would be kind. However, how much of the info in there would I need to redact before I pastebin it? I would prefer it if someone kind enough to look through it would specify what sort of things they would be looking for, so I can paste that specifically. – dunxd Jun 24 '11 at 09:47
  • I'm at pains to say this, after having re-read the question, but I'm not sure I'd want to deal with an ISP who don't know the effect of their changes on their own damn network. Find an ISP who don't block ESP, IKE or IPSEC, then use them for the VPN connection. – Tom O'Connor Jun 24 '11 at 10:31
  • Do you have some less-verbose logs that would be easier to redact? Even having the informational-level logs should give a pretty good idea of where a tunnel build is failing. – Shane Madden Jun 24 '11 at 17:01
  • @Tom wouldn't that be nice, but the site is in Haiti and I work for an NGO with limited budget, so I don't have that kind of luxury. – dunxd Jun 26 '11 at 18:58

3 Answers

With a little assistance from Cisco I did some deeper analysis of what was happening, and figured out the things that I needed to be checking for. The useful things that Cisco told me:

  • debug crypto isakmp 5 gives enough detail to see whether problems are occurring with ISAKMP traffic
  • clear crypto isakmp sa clears out any stale security associations.
  • clear crypto isakmp {client_ip_address} can be used on the HQ to clear out a specific security association (you don't necessarily want to clear all your security associations if it is only one device that is having trouble!)
  • packet captures at both ends are really useful to figure out what is going on

Reading up a little on the IPsec suite, and ISAKMP more specifically, showed that the following need to be allowed through any firewalls in the path:

  • ISAKMP traffic on UDP port 500
  • ISAKMP traffic on UDP port 4500 (used for NAT Traversal, NAT-T)
  • ESP traffic (IP Protocol 50)
  • AH traffic (IP Protocol 51)

It seems a lot of people out there don't realise the important difference between IP protocols and TCP/UDP ports.
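To make the distinction concrete, here is a minimal Python sketch. The helper name `classify_vpn_packet` and its return labels are hypothetical, made up for illustration; only the protocol and port numbers come from the list above. It assumes the IP protocol field and UDP destination port have already been parsed out of a packet.

```python
# IANA-assigned IP protocol numbers. These live in the IP header's
# "protocol" field and are not TCP/UDP port numbers.
IP_PROTO_UDP = 17
IP_PROTO_ESP = 50
IP_PROTO_AH = 51

# UDP destination ports used by ISAKMP and NAT-T.
UDP_PORT_ISAKMP = 500
UDP_PORT_NAT_T = 4500


def classify_vpn_packet(ip_protocol, udp_dst_port=None):
    """Return a label for IPsec-related traffic, or None otherwise.

    Hypothetical helper for illustration only.
    """
    if ip_protocol == IP_PROTO_ESP:
        return "ESP (IP protocol 50, no port number)"
    if ip_protocol == IP_PROTO_AH:
        return "AH (IP protocol 51, no port number)"
    if ip_protocol == IP_PROTO_UDP:
        if udp_dst_port == UDP_PORT_ISAKMP:
            return "ISAKMP (UDP port 500)"
        if udp_dst_port == UDP_PORT_NAT_T:
            return "NAT-T (UDP port 4500)"
    return None


print(classify_vpn_packet(IP_PROTO_ESP))       # ESP is matched by protocol
print(classify_vpn_packet(IP_PROTO_UDP, 500))  # ISAKMP is matched by port
print(classify_vpn_packet(IP_PROTO_UDP, 50))   # UDP port 50 is not ESP
```

The last call is the common mistake: a firewall rule that permits "port 50" matches UDP or TCP traffic to port 50, which has nothing to do with ESP.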

The following packet captures focussed on the above types of traffic. These were set up on both the remote and HQ ASAs:

object service isakmp-nat-t 
    service udp destination eq 4500 
    description 4500
object-group service ISAKMP-Services
    description Traffic required for ISAKMP
    service-object esp 
    service-object ah 
    service-object object isakmp-nat-t 
    service-object udp destination eq isakmp
access-list ISAKMP extended permit object-group ISAKMP-Services host {hq_ip_address} host {remote_ip_address}
access-list ISAKMP extended permit object-group ISAKMP-Services host {remote_ip_address} host {hq_ip_address}
capture ISAKMP access-list ISAKMP interface outside

You can then download the captures from each device at https://{device_ip_address}/capture/ISAKMP/pcap and analyse it in Wireshark.

My packet captures showed that the ISAKMP traffic outlined above was getting fragmented. Since those packets are encrypted, once they are fragmented it is hard to put them back together, and things break.

Giving this information to the ISP meant they could do their own focussed checking, and resulted in them making some changes to a firewall. Turns out the ISP was blocking all ICMP traffic on their edge router, which meant that Path MTU Discovery was broken, resulting in fragmented ISAKMP packets. Once they stopped blanket blocking ICMP the VPN came up (and I expect all their customers started getting better service in general).
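As a rough illustration of when a UDP datagram such as an ISAKMP message ends up fragmented, the sketch below counts IPv4 fragments for a given path MTU. It assumes IPv4 with a 20-byte header and no IP options; the function name `ipv4_fragment_count` and the sample sizes are hypothetical and are not taken from my captures.

```python
import math

IPV4_HEADER_LEN = 20  # assumption: no IP options
UDP_HEADER_LEN = 8


def ipv4_fragment_count(udp_payload_len, path_mtu):
    """Return how many IPv4 packets are needed to carry one UDP datagram."""
    # Bytes that must travel inside IP: UDP header plus UDP payload.
    ip_payload = UDP_HEADER_LEN + udp_payload_len
    # Most data a single packet can carry at this MTU.
    max_payload = path_mtu - IPV4_HEADER_LEN
    if ip_payload <= max_payload:
        return 1  # fits in one packet, no fragmentation
    # Every fragment except the last must carry a multiple of 8 bytes,
    # because the fragment offset field counts in 8-byte units.
    per_fragment = (max_payload // 8) * 8
    overflow = ip_payload - max_payload
    return 1 + math.ceil(overflow / per_fragment)


print(ipv4_fragment_count(1472, 1500))  # exactly fills a 1500-byte MTU
print(ipv4_fragment_count(1473, 1500))  # one byte more, so it fragments
print(ipv4_fragment_count(1472, 1400))  # same datagram over a smaller MTU
```

The third call is the situation that matters here: a datagram that fits the local link can still be fragmented further along the path if a hop has a smaller MTU and the sender never learns about it.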

dunxd

It's quite possible your ISP is misinterpreting your traffic as P2P filesharing or something nefarious. Take a look at M-Lab to find out if that's what could be happening.

An AM_TM_INIT_XAUTH error likely means your pre-shared keys don't match. (source www.cisco.com/warp/public/471/easyvpn-nem.pdf)

All that needs to work to establish an IPsec session is for UDP traffic destined to port 500 (for IKE) and ESP traffic (or UDP 4500 for NAT-T) to be permitted. This seems like a configuration issue rather than an ISP-caused problem. Feel free to post your relevant configuration if you'd like some help verifying.

JakePaulus
  • We tested the ASA on a connection going through another ISP. It worked. The ISP has admitted the problem is likely at their end. – dunxd Aug 01 '11 at 08:40