2
5
Update 3: 22 January 2017
I've narrowed down the update that causes this problem: KB3201845 OS Build 14393.479 from December 9 2016. I went through all these updates step by step.
Looking at the changes files manifest, these files have been changed:
- agilevpn.sys
- vpnike.dll
Other topics I've created:
Microsoft Technet Forums, 2 other people have confirmed the same problem there.
Update
Windows 10 pre-anniversary (version 10.0.10586) works fine
Windows 10 post-anniversary (version 10.0.14393) drops, so the problem is with the anniversary update.
Update 2
I can connect using two pre-anniversary Windows hosts fine
If I connect 1 anniversary Windows host to the VPN and keep it alive artificially by pinging, I cannot connect a second host to the VPN. In the router's "debug crypto ikev2", this happens:
Jan 17 19:20:35.811: IKEv2:(SESSION ID = 25,SA ID = 2):Session present in ID PAIR TREE, but absent in TUPLE TREE
Jan 17 19:20:35.811: IKEv2:(SESSION ID = 25,SA ID = 2):: Failed to add new SA into session DB
Jan 17 19:20:35.811: IKEv2:(SESSION ID = 25,SA ID = 2):Queuing IKE SA delete request reason: unknown
Jan 17 19:20:35.811: IKEv2:(SESSION ID = 25,SA ID = 2):Sending DELETE INFO message for IPsec SA [SPI: 0x3DA95352]
Jan 17 19:20:35.811: IKEv2:(SESSION ID = 25,SA ID = 2):Check for existing IPSEC SA
Jan 17 19:20:35.811: IKEv2:(SESSION ID = 25,SA ID = 2):Delete all IKE SAs
Jan 17 19:20:35.811: IKEv2:(SESSION ID = 25,SA ID = 2):Sending DELETE INFO message for IKEv2 SA [ISPI: 0x3B52B7EBA8A353B1 RSPI: 0x16F36D79AAE675A5]
I have a Cisco IOS router, 892 model, which I'm setting up IKEv2 with EAP-MSCHAPv2 as remote authentication (backed by a Windows 2012 Server Network Policy Server) and local certificate authentication. Everything works, I can connect to the VPN and ping a loopback address on the router. Windows 7 and 8.1 work fine, Android with Strongswan too.
However, on Windows 10 (10.0.14393 - fully up to date 16 jan 2017), exactly 60 seconds after the last data exchange (like a ping), Windows drops the connection. So:
- T+0 VPN connection opened
- T+60 VPN connection drops
- T+0 VPN connection opened
- T+20 1 Ping to 172.16.0.5, reply received
- T+80 VPN connection drops
The VPN is dropped with the following message in the System event log:
Source: RasClient
Event ID: 20226
CoId={43121588-861C-447A-A510-C44C2BA86639}: The user LAPTOP-GLENN\Glenn dialed a connection named ikev2-test which has terminated. The reason code returned on termination is 829.
So I started digging and enabled RAS debugging on the client:
netsh ras diagnostics * state=enabled
The only relevant stuff I could find was in C:\Windows\tracing\rasman.log (20:11:51 is when the disconnect happens - interesting bits prefixed with "***"):
[1384] 01-16 20:11:51:216: FreeInterfaceLuidIndex: Luid = 0
[1384] 01-16 20:11:51:216: RasUpdateVpnLuidCache: Luid: 17000000000000, fAddLuid:0
[1384] 01-16 20:11:51:216: RasUpdateVpnLuidCache: Removed Luid 17000000000000 from cache
[1384] 01-16 20:11:51:216: FreeInterfaceLuidIndex: done 0
[1384] 01-16 20:11:51:216: DeallocateRouteRequestCommon: pBundle=0xa50adde0, type=0x800
[1384] 01-16 20:11:51:232: DeActivated Route , bundlehandle 0x4, prottype = 2048
[1384] 01-16 20:11:51:232: DeAllocateRoute: PI_Type=0x800, PI_AdapterName=\DEVICE\{93A76D72-2010-45BB-9096-244B06735879}, PI_Allocated=-1
[3524] 01-16 20:11:51:248: SendProtocolResultToRasman: msgid=1, hPort: 6.
*** [3524] 01-16 20:11:51:248: Setting last error for port VPN2-1 to ppp error 0x3635
[3524] 01-16 20:11:51:248: SetProtocolResultAvailableEvent: Notification handle event for port 6 is not registered.
[2640] 01-16 20:11:51:248: WorkerThread: Disconnect event signaled on port: VPN2-1
[2640] 01-16 20:11:51:248: OVEVT_DEV_STATECHANGE. pOverlapped = 0xa4611940
[2640] 01-16 20:11:51:248: onecoreuap\net\rras\ras\rasman\rasman\worker.c, 2031: Disconnecting port 6, connection 0xa6af47e0, reason 1
[2640] 01-16 20:11:51:248: Disconnecting Port 0xVPN2-1, reason 1
[2640] 01-16 20:11:51:248: DisconnectPort: Saving Bundle stats for port VPN2-1
[2640] 01-16 20:11:51:263: RevertPostConnectionActions
[2640] 01-16 20:11:51:263: RasImpersonateUser. 0x0
[2640] 01-16 20:11:51:263: DeleteCredentialsFromCredMan
[2640] 01-16 20:11:51:263: DeleteCredentialsFromCredMan Done: 0
[2640] 01-16 20:11:51:263: RasRevertToSelf. 0x0
[2640] 01-16 20:11:51:263: QueueCloseConnections: no dependent connections
[2640] 01-16 20:11:51:263: 10. Throwing away handle 0x0!
[2640] 01-16 20:11:51:263: onecoreuap\net\rras\ras\rasman\rasman\util.c, 2315:Setting port 6 for autoclosure...
[2640] 01-16 20:11:51:263: onecoreuap\net\rras\ras\rasman\rasman\util.c 2327: Disconnected Port 6, reason 1. rc=0x0
[2640] 01-16 20:11:51:263: FreeBundle: freeing pBundle=0xa50adde0
[2640] 01-16 20:11:51:263: onecoreuap\net\rras\ras\rasman\rasman\util.c: 2443: port 6 state chg: prev=2, new=3
[2640] 01-16 20:11:51:263: onecoreuap\net\rras\ras\rasman\rasman\util.c: 2459: port 6 state chg: prev=3, new=4
[2640] 01-16 20:11:51:263: 5. Notifying of disconnect on port 6
[2640] 01-16 20:11:51:263: SignalPortDisconnect: Notification handle event for port 6 is not registered.
[2640] 01-16 20:11:51:263: onecoreuap\net\rras\ras\rasman\rasman\util.c: 2573: port 6 async reqtype chg: prev=0, new=0
[2640] 01-16 20:11:51:263: ***** DisconnectType=1,DisconnectReason=4,pConn=0xa6af47e0,cbports=1,signaled=1,hEvent=0xffffffff,fRedial=0
[2640] 01-16 20:11:51:263: Calling DwQueueRedial
[2640] 01-16 20:11:51:263: DwQueueRedial
[2640] 01-16 20:11:51:263: DwQueueRedial returned 0x0
[2640] 01-16 20:11:51:263: onecoreuap\net\rras\ras\rasman\rasman\util.c, 2634: Autoclosing port 6
[2640] 01-16 20:11:51:263: PortClose: port (6). OpenInstances = 1
[2640] 01-16 20:11:51:263: Freeing the notifier list for port 6
[2640] 01-16 20:11:51:263: PortClose (6). OpenInstances = 0
[2640] 01-16 20:11:51:263: onecoreuap\net\rras\ras\rasman\rasman\request.c: 3845: port 6 async reqtype chg: prev=0, new=0
[2640] 01-16 20:11:51:263: onecoreuap\net\rras\ras\rasman\rasman\request.c: 3848: port 6 state chg: prev=4, new=4
[2640] 01-16 20:11:51:263: PortClose: Resetting PCB_OpenedUsage for port: 6.
[2640] 01-16 20:11:51:263: RemoveConnectionPort: port 6, fOwnerClose=0, pConn=0xa6af47e0, pConn->CB_Ports=0
*** [2640] 01-16 20:11:51:263: Completely disconnected connection: Reason: ERROR_LINK_FAILURE (829)
[2640] 01-16 20:11:51:263: SendSensNotification(_RAS_DISCONNECT) for 0x00040000 returns 0x00000000
[2640] 01-16 20:11:51:263: Successfully notified event(128, C:\Users\Glenn\AppData\Roaming\Microsoft\Network\Connections\Pbk\rasphone.pbk, ikev2-test) to the caller.
[2640] 01-16 20:11:51:263: SignalNetman: IRasEventNotify::RasEvent returned S_FALSE
[2640] 01-16 20:11:51:263: DwSendNotificationInternal(ENTRY_DISCONNECTED) rc=0x1
[2640] 01-16 20:11:51:263: RemoveConnectionPort: FreeConnection hconn=0x40000, pconn=0xa6af47e0, AutoClose=1
[2640] 01-16 20:11:51:263: FreeConnection: pConn=0xa6af47e0, 1
[2640] 01-16 20:11:51:263: onecoreuap\net\rras\ras\rasman\rasman\request.c, 3918: Clearing the autoclose flag for port 6
[2640] 01-16 20:11:51:263: fAnyConnectedPorts: 0
[2640] 01-16 20:11:51:263: SetRasmanServiceStopControl: Enabled 1
[2640] 01-16 20:11:51:263: PortClose: DisableAutoWPPTracing failed with error 0x2
[2640] 01-16 20:11:51:263: DisconnectPort Complete
[2640] 01-16 20:11:51:263: onecoreuap\net\rras\ras\rasman\rasman\worker.c: 2077: port 6 state chg: prev=4, new=4
[2640] 01-16 20:11:51:263: onecoreuap\net\rras\ras\rasman\rasman\worker.c: 2081: port 6 async reqtype chg: prev=0, new=0
[1276] 01-16 20:11:51:263: The specified notification entry with cookie 2 found.
[1276] 01-16 20:11:51:263: Dequed notification entry: (128, C:\Users\Glenn\AppData\Roaming\Microsoft\Network\Connections\Pbk\rasphone.pbk, ikev2-test) from the queue.
- Ran the same config on a Cisco CSR1000v router and the exact same thing happened.
- Created a REG_DWORD named "InactivityIdleSeconds" under HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Class{4d36e972-e325-11ce-bfc1-08002be10318}\0012 (WAN Miniport (IKEv2)) and played with its value but that didn't change anything (see here why I tried that)
- Played around with Dead Peer Detection but that didn't change anything either
I can't find what the criteria are for Windows 10 to consider an IKEv2 connection dead or what state changes cause Windows 10 to think the VPN has to be disconnected.
Soooo... any pointers?
Microsoft Technet: 0x3635 means 13877 ERROR_IPSEC_IKE_RPC_DELETE "Deleted via RPC call."
The Cisco IOS router configuration.
Looks like something set to disconnect after 60s. It's not a router session setting though. You may want to check the AAA DC1 server for that, it's probably configured there. – Overmind – 2017-01-17T08:10:39.397
1I've scoured through the Network Policy Server settings and no such setting is configured. Also wouldn't explain why I'm only seeing this behavior on Windows 10 and why it terminates with a link failure. – RedShift – 2017-01-17T11:30:54.150
Different encryption protocols can have different behavior. Make sure the W10 station is using the same exact protocols just as a W7 to connect. Try switching between SSTP, L2TP/IPsec and set PPTP as last resort. – Overmind – 2017-01-17T11:45:59.143
2I'm trying to move away from legacy protocols, so no I can't downgrade... – RedShift – 2017-01-17T17:18:51.440
1I have exactly the same problem, your logs are basically verbatim of my logs. The only difference is I have an Ubuntu server with StrongSwan 5.1 server instead of Cisco router. My Win10 build number is also same.
PPTP is working, but has other issues hence the switch to IPSec. Android StrongSwan as well. – jnovacho – 2017-01-17T20:11:49.453
@jnovacho see the update 3. – RedShift – 2017-01-22T21:01:48.617
Great job figuring it out. Thanks for notifying, we'll how it will play out. I see "Cumulative Security Update for Windows 10 x64" in coming months, which will magically fix the issue. – jnovacho – 2017-01-23T09:13:36.853