Cisco 9971 AnyConnect VPN phone

Started by Dieselboy, April 23, 2016, 01:46:44 AM


Dieselboy

The 9971 was the flagship Cisco desk phone not so long ago, and I've never had it stable.
It doesn't play well with wifi - it will register once then go unregistered and stay like that forever. It's a bit better on the LAN.

Combine wifi and AnyConnect VPN and you have no hope. I've had TAC release private firmware to me to fix it and, to be honest, it just causes other problems.
When we have got it working and registered, there is very bad audio quality when you first answer a call, as if there is packet loss. Then it clears.
Lastly, the longer the phone is registered, the greater the chance of the audio quality being crap. If the phone has been registered all day and you place or receive a call, you might find the quality so bad you can't hold the voice call. Rebooting the phone fixes the problem until it creeps back in again. It seems like a memory leak.

Now that I'm working more from home, I have my 9971 set up to the corp CUCM and I'm using AnyConnect VPN from the phone itself. The experience I have is that the phone will connect to the VPN fine, then take minutes and minutes to register to CUCM. As soon as it is registered it will unregister and then register again. These issues don't happen with the SCCP 8945 connecting the same way to the VPN.

To try and trace the issue yesterday, I ran a ping -t from my laptop in the office to the VPN IP of the phone. Once I did that, the phone remained registered all day long and had no issues at all. I didn't realise that this was what allowed the phone to work, and at the end of the day when I was closing everything down, I closed off the running ping.

Minutes after closing the ping, the phone unregistered and began having those odd issues again. It has been 18 hours since, and the phone has been constantly registering, unregistering, disconnecting and reconnecting the VPN, and repeating. It will not stay registered.

As a test, I ran a ping -t to the next available VPN pool IP address and then disconnected / reconnected the phone. The phone connected to the VPN and the pings started replying. Almost immediately the phone registered to CUCM. It was not delayed for minutes like previously. As of now, the phone has been registered fine for 37 minutes. This is too much to be a coincidence.

I'm guessing the phone has issues with encapsulation / decapsulation for some reason. Having a steady flow of packets seems to keep the phone stable.

BUT WHY!?
I have a TAC case opened.

I remember when I bought this 9971 I was pretty pleased to own it. I hate it with a passion now. I will be trading it in for a DX650 at some point. The CEO has a DX70 for his home office and he loves it. There are no issues at all on the DX70. And the number of features / functionality on that one unit is great.

So basically, ping the phone and it acts normally. Stop the pings and things go to crap.
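
For anyone wanting to reproduce the workaround without leaving a ping -t window open, here is a rough Python sketch of the same idea. It is not a Cisco tool and not necessarily how you would want to run this long term. The function names, the 5 second interval and the 192.0.2.10 address in the usage note are all made up for illustration; substitute the phone's actual VPN pool IP.

```python
import platform
import subprocess
import time


def build_ping_command(host, system=None):
    """Return a command that sends a single ICMP echo request to host."""
    system = system or platform.system()
    # Windows ping takes -n for the count; Linux and macOS take -c.
    count_flag = "-n" if system == "Windows" else "-c"
    return ["ping", count_flag, "1", host]


def keep_alive(host, interval_s=5.0, rounds=None,
               run=subprocess.run, sleep=time.sleep):
    """Ping host every interval_s seconds and count successful pings.

    rounds=None loops until interrupted, much like ping -t on Windows.
    run and sleep are injectable so the loop can be exercised offline.
    """
    replies = 0
    sent = 0
    while rounds is None or sent < rounds:
        result = run(build_ping_command(host),
                     stdout=subprocess.DEVNULL,
                     stderr=subprocess.DEVNULL)
        sent += 1
        # Exit status 0 normally means a reply was received.
        if result.returncode == 0:
            replies += 1
        sleep(interval_s)
    return replies
```

Calling keep_alive("192.0.2.10") from a machine in the office would then run until interrupted with Ctrl-C. The interval only needs to be shorter than whatever is timing the flow out, which at this point in the thread is still unknown.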

deanwebb

So set up a workstation to run constant pings and save some money.

:problem?:
Take a baseball bat and trash all the routers, shout out "IT'S A NETWORK PROBLEM NOW, SUCKERS!" and then peel out of the parking lot in your Ferrari.
"The world could perish if people only worked on things that were easy to handle." -- Vladimir Savchenko
Вопросы есть? Вопросов нет! | BCEB: Belkin Certified Expert Baffler | "Plan B is Plan A with an element of panic." -- John Clarke
Accounting is architecture, remember that!
Air gaps are high-latency Internet connections.

Dieselboy

It doesn't make sense! Stable for over 8 hours now.

All I can think is that in idle mode it has trouble encapsulating or decapsulating the packets. I don't know enough about the inner workings to comment any further really.

Okay so who do I send the bill to for server licensing and electricity costs to ping our 9971 phones so they function, as a phone? Maybe Cisco has a solution where they can initiate cloud pings to all 9971 phones required on an AnyConnect VPN.

I'm tempted to video myself doing a Ctrl-C on the ICMP ping command prompt and see how long it takes for the phone to drop off. I've not seen anything like this before. I wonder how difficult a time I will have convincing Cisco I'm not telling porkie pies.

The only thing I can think of is that the phone connects to the SSL VPN, which is running off a Cisco 2921 IOS router. It is not an ASA, so I don't need a fancy voice VPN license. I of course have an SSL VPN license, which is for 50 concurrent users. We only ever see 5 or 6 users or phones on it though. The router's CPU and memory usage are low. So apart from the fact the phone doesn't connect to an ASA, there's no reason for this to fail. The phone is wired into my POE switch, and my AnyConnect VPN from my laptop on wifi has no problems.

Dieselboy

 :problem?: :problem?:
I started recording a video and I did a Ctrl-C on the ping window. Less than 2 minutes later, the phone unregistered.

What I did notice is that on the phone, after ending the pings it says Tx packets are incrementing but there are no incrementing packets in the Rx direction.

Not ruling out a bug with the IOS VPN component but my 8945 phones are working fine and are relied upon for the USA team. They will tell me if there are issues.

I'm uploading the video but it's almost a gig in size. Going to bed soon as it's 2:20am and I've been what they call "out out". I'll post a link so you can witness what I'm seeing :)

deanwebb

Cool. We'll do what we can to help out.

Dieselboy


deanwebb

My proxy blocks youtu.be, but not youtube.com... Can you post a youtube version of the link so that I don't have to hack and jack my proxy on a loverly Sunday morn?

Dieselboy


deanwebb

OK, I don't know why I missed the connection between VPN and pings before, but the video hit it home for me. Those pings are "interesting traffic", and they keep the VPN tunnel up. Once they go away, if there's no other traffic, the VPN goes down. Why? No interesting traffic.

It's not encaps/decaps, it's the traffic flow itself that's busted between that phone and the other end of its VPN. The keepalive packets, for some reason known only to Cisco, aren't keeping anything alive.

Dieselboy

But the keepalives are interesting traffic too. The idle timeout is longer than the keepalive duration. As I mentioned, it's only the 9971 having the problem. The 8945 and DX70 are fine. As are other VPN users.

deanwebb

That may be an issue with the 9971. If the keepalive was interesting, it would have gone through and the VPN would have stayed up.

On an unrelated note, you sound totally 'Straya, mate!


Dieselboy

 :rofl:
No way do I sound Australian :) I'm from South London :)

So about your comment regarding the VPN staying up and interesting traffic - the VPN stays up for ages after the phone de-registers. I assumed that the disconnect of the VPN and immediate reconnect was the phone doing this itself. I knew there would be interesting traffic as there would be keepalives.
However--
I set up a bunch of packet captures this evening and it turns out that my crappy Telstra Technicolor Gateway is closing the DTLS port in under 30 seconds. So after the phone registers and starts sending keepalives, it sends the next keepalive 30 seconds later, but by then the Telstra firewall has already closed the DTLS port. It seems the firewall does not re-open the port when the phone sends that keepalive. So when the response comes back through the VPN connection, the firewall instead answers with an ICMP host unreachable message, because it has received a UDP packet destined for a random UDP port it no longer has open. The keepalive response never gets through the Telstra firewall to the 9971.

The TCP connection remains up and TCP packets get through, but the VPN uses UDP/443 for transport. Even while this is happening, the phone can send traffic out fine and it reaches the corp office; it's the return traffic that is being dropped at the firewall. So NAT is working, but it appears the Telstra firewall isn't creating the outbound UDP session and allowing the return traffic. I find this a bit odd. Even so, why is the UDP timeout under 30 seconds?
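
To make the timing argument concrete, here is a small Python model of the behaviour seen in the captures. It is only a sketch under assumptions: the 25 second timeout is a guess (the captures only show "under 30 seconds"), the session is treated as staying closed once it has expired, and packets in either direction are assumed to restart the idle timer. The function name and the numbers are illustrative, not taken from the gateway.

```python
def first_expiry(packet_times, idle_timeout_s):
    """Return the time of the first packet that finds the UDP session expired.

    packet_times: sorted times in seconds of packets in either direction,
    measured from when the session was created at t=0.
    Returns None if every packet arrives before the idle timer runs out.
    """
    last_seen = 0.0
    for t in packet_times:
        if t - last_seen >= idle_timeout_s:
            return t  # session already torn down, so this packet is not matched
        last_seen = t  # any matched packet restarts the idle timer
    return None


ASSUMED_TIMEOUT_S = 25.0  # assumption: the thread only says "under 30 seconds"

# Phone keepalives alone, one every 30 seconds.
keepalives_only = [30.0 * n for n in range(1, 5)]
# The same keepalives plus a ping once a second from the office.
with_pings = sorted(set(keepalives_only) | {float(n) for n in range(1, 121)})

print(first_expiry(keepalives_only, ASSUMED_TIMEOUT_S))  # 30.0
print(first_expiry(with_pings, ASSUMED_TIMEOUT_S))       # None
```

With keepalives alone, the very first one at 30 seconds already arrives too late. With a ping every second the gap never gets near the timeout, which matches the phone staying registered for as long as the ping was running.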

What's worse is that the timeout is not configurable on the Telstra firewall because it is so very locked down. So I cannot increase the UDP timeout to at least test it.

So I'm currently trying to figure out how I can get more "interesting" traffic on the phone only. Something like a device setting, such as an NTP update interval, or anything else that would get a response in under 30 seconds.
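
Since the gateway won't show or change its UDP timeout, one way to find out how frequent that traffic needs to be is to bracket the timeout by bisection. The Python below is a sketch of that idea only. reply_survives is a hypothetical callback; in practice it would send a UDP packet out on a fresh flow, stay silent for the given number of seconds, have the far end reply, and report whether the reply arrived. The fake probe and its 25 second figure are invented for the demo.

```python
def estimate_idle_timeout(reply_survives, low_s=1.0, high_s=60.0, tolerance_s=1.0):
    """Bracket a firewall's UDP idle timeout by bisection.

    reply_survives(idle_s) must return True if a reply still gets through
    after the flow has been silent for idle_s seconds.
    Assumes replies survive a gap of low_s and are dropped after high_s.
    Returns (longest gap known to work, shortest gap known to fail).
    """
    while high_s - low_s > tolerance_s:
        mid = (low_s + high_s) / 2
        if reply_survives(mid):
            low_s = mid   # the flow was still open after mid seconds
        else:
            high_s = mid  # the flow had already been closed
    return low_s, high_s


# Stand-in for a real probe: pretend flows expire after 25 seconds of silence.
def fake_probe(idle_s):
    return idle_s < 25.0


works, fails = estimate_idle_timeout(fake_probe)
print(works, fails)
```

Any periodic traffic with an interval comfortably below the first number should keep the port open. Bear in mind that each real probe takes as long as the gap it is testing, so a full run against the gateway would take a few minutes.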

:lol:

deanwebb

There ya go.

And you totally picked up an Aussie accent. It was like troubleshooting with Paul Hogan.