New weird issue, IPSEC VTI VPN packet loss

Started by Dieselboy, June 01, 2016, 01:46:53 AM


Dieselboy

I'm seeing an increasing number of lost packets on one VTI VPN tunnel. Packet captures initially led me to think that the traffic was being dropped somewhere on the internet.
Today I captured lots of traffic at various points and eventually worked backwards to capturing on the VPN router itself, on the egress interface that the VPN tunnel is sourced from. This capture shows missing ESP sequence numbers, as though the router is not putting some ESP packets on the wire.

Sounds like a fun one.

Dieselboy

False alarm
:awesome:
Wireshark's "Expert Info" was lying. The packets are in fact present, just out of sequence. The timestamps show that they were all sent, sometimes with the exact same timestamp, just not in sequence order.

deanwebb

Take a baseball bat and trash all the routers, shout out "IT'S A NETWORK PROBLEM NOW, SUCKERS!" and then peel out of the parking lot in your Ferrari.
"The world could perish if people only worked on things that were easy to handle." -- Vladimir Savchenko
Any questions? No questions! | BCEB: Belkin Certified Expert Baffler | "Plan B is Plan A with an element of panic." -- John Clarke
Accounting is architecture, remember that!
Air gaps are high-latency Internet connections.

Dieselboy

I thought it was the firewall  >:D

What is WAD?

Actually it was partly the firewall this time. When I brought our new internet line up, I did not add a static route pointing out a specific interface for one of the VPN tunnels. I have 3 x internet lines at the corp office, so I have 3 x VTI tunnels going to the remote site. To influence routing to the remote site I have configured 3 x different remote peers, using secondary addresses at the remote site, so each of the 3 tunnels uses a different internet circuit for HA.
What was happening is that the backup VPN tunnel was coming up (I haven't looked into how, exactly), but the ASA was routing it out the primary interface, which was NATting the source to the primary IP. The remote site was syslogging that it was receiving traffic with an invalid SPI.
I'm not sure if this was causing the loss, but it wasn't correct. The issue came about when I changed the default route. In hindsight I should have configured 3 x static routes independent of the default route. My thinking had been that I don't need a static route to VPN peer X if it's covered by a default route. But thinking about it more, if the ASA fails over I don't want VPN traffic being routed wrong.
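To illustrate why per-peer static routes are independent of the default route, here is a rough Python sketch of longest-prefix-match route selection. It is only a model of the general routing rule, not ASA configuration; the peer addresses (documentation range 203.0.113.0/24) and the interface names `isp1` to `isp3` are made up for the example.

```python
import ipaddress

def pick_route(routes, destination):
    """Return the egress interface of the most specific matching route."""
    dest = ipaddress.ip_address(destination)
    matches = [(net, iface) for net, iface in routes if dest in net]
    # Longest prefix wins, so a /32 host route always beats 0.0.0.0/0.
    return max(matches, key=lambda r: r[0].prefixlen)[1]

# Hypothetical setup: three VPN peers, one per internet circuit.
host_routes = [
    (ipaddress.ip_network("203.0.113.1/32"), "isp1"),
    (ipaddress.ip_network("203.0.113.2/32"), "isp2"),
    (ipaddress.ip_network("203.0.113.3/32"), "isp3"),
]
default_only = [(ipaddress.ip_network("0.0.0.0/0"), "isp1")]

# With only a default route, every peer is reached via the primary circuit.
print(pick_route(default_only, "203.0.113.2"))

# With host routes, changing the default route no longer moves VPN traffic.
changed_default = [(ipaddress.ip_network("0.0.0.0/0"), "isp3")] + host_routes
print(pick_route(changed_default, "203.0.113.2"))
```

With only the default route, traffic to the second peer leaves via `isp1`; with the host routes in place it leaves via `isp2` no matter where the default points.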

So when I say it was a firewall problem, what I really mean is that it was a pilot issue - the person controlling the firewall was the issue. I.e. me.
:awesome:


Regarding out of sequence: yes, the packets are seen out of sequence. But looking at the timestamps, I can get around 10 ESP packets sent at the same time, down to the exact millisecond, and within that burst a couple of ESP packets are not in sequence.
Example: 10 x ESP packets sent at 1000 ms after 1pm, seq numbers 1 through 10. I might find that they in fact go 1,2,4,3,5,6,8,9,10,7,
but all of them have the same timestamp, so even though they are out of sequence they were sent at the exact same time - 1000 ms after 1pm :) (example)
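A quick way to tell "reordered" apart from "really missing" is to check the sequence numbers themselves instead of trusting the capture tool's warnings. The Python below is a minimal sketch under the assumption that the ESP sequence numbers have already been exported from the capture into a list; `classify_sequence` is a hypothetical helper name, not part of any tool.

```python
def classify_sequence(seqs):
    """Split observed ESP sequence numbers into reordered and missing ones.

    seqs: sequence numbers in the order they appear in the capture.
    Returns (reordered, missing).
    """
    if not seqs:
        return [], []
    seen = set(seqs)
    # Missing: numbers inside the observed range that never show up at all.
    missing = [n for n in range(min(seqs), max(seqs) + 1) if n not in seen]
    # Reordered: numbers that show up after a higher number was already seen.
    reordered = []
    highest = None
    for n in seqs:
        if highest is not None and n < highest:
            reordered.append(n)
        else:
            highest = n
    return reordered, missing

observed = [1, 2, 4, 3, 5, 6, 8, 9, 10, 7]
reordered, missing = classify_sequence(observed)
print(reordered, missing)  # prints: [3, 7] []
```

For the example burst, 3 and 7 come out as reordered and nothing is missing, which matches what the timestamps suggest.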

When it goes across the internet, I wonder if the sequence changes by the time it arrives at the remote end anyway. I know internet routing between Sri Lanka and Australia is flaky at the moment. Some traffic goes across the USA to get from Australia to Sri Lanka.

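The remote end should cope with this anyway: as far as I understand it, ESP doesn't buffer and re-sort packets, but its anti-replay window accepts packets that arrive out of order as long as they are not too old. I don't think this is a real problem unless the out-of-sequence packets arrive so late that they fall outside that window (or worse, are lost in transit).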
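To put the receiver side in concrete terms, here is a minimal Python sketch of an anti-replay sliding window in the style RFC 4303 describes. The class name `AntiReplayWindow` and the default size of 64 are choices made for illustration (64 is the default the RFC recommends; actual platforms may differ), and the sketch skips the integrity check that a real receiver performs before updating the window.

```python
class AntiReplayWindow:
    """Simplified ESP anti-replay window (32-bit sequence numbers, no ICV check)."""

    def __init__(self, size=64):
        self.size = size
        self.top = 0      # highest sequence number accepted so far
        self.bitmap = 0   # bit i set means sequence (top - i) was already seen

    def accept(self, seq):
        if seq == 0:
            return False  # ESP sequence numbers start at 1
        if seq > self.top:
            # New highest number: slide the window forward and mark it seen.
            shift = seq - self.top
            mask = (1 << self.size) - 1
            self.bitmap = ((self.bitmap << shift) | 1) & mask
            self.top = seq
            return True
        offset = self.top - seq
        if offset >= self.size:
            return False  # too old, fell off the left edge of the window
        if self.bitmap & (1 << offset):
            return False  # duplicate, treated as a replay
        self.bitmap |= 1 << offset  # late but inside the window: accept
        return True

window = AntiReplayWindow()
burst = [1, 2, 4, 3, 5, 6, 8, 9, 10, 7]
print([window.accept(n) for n in burst])  # every packet in the burst is accepted
print(window.accept(7))                   # prints: False (second copy of 7)
```

So a burst reordered by a few positions passes cleanly; only packets delayed by more than the window size, or duplicates, would be dropped.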

deanwebb

WAD == Working As Designed

Routing on a firewall is not a firewall issue, it is a routing issue. Therefore, the following is still true:

:notthefirewall:

Dieselboy

LOL

In that case, it will never EVER be the firewall :)

deanwebb

Hence the image, my friend. Hence the image.

Dieselboy

Question though: why would ESP packets sent out of sequence be WAD?

deanwebb

TCP/IP is built on its ability to handle and work with out of sequence packets, as opposed to IBM's SNA networking. SNA was zero-tolerance with that kind of sloppiness.

Dieselboy

Ah ok.

IPsec ESP is connectionless like UDP (and rides inside UDP 4500 when NAT-T is in use) and somewhat real-time, though. Anyway, there's no problem there. I was just a bit concerned in case there were fragmented packets.

wintermute000

Q: How did the psychic know what was in the IPSEC tunnel?

......

A: Because of ESP

<boom-tish>

deanwebb
