Cisco GetVPN and EIGRP

Started by sgtcasey, January 14, 2015, 04:56:33 PM

sgtcasey

With GetVPN you can exclude some traffic from being encrypted.  Cisco's configuration documentation excludes EIGRP from encryption, the reasoning being that you need the links up before you can encrypt the traffic going over them.

However, what happens if the crypto fails for one reason or another?  Your EIGRP adjacencies stay up while the previously encrypted links no longer pass traffic correctly, yet every device in your EIGRP AS still sees those paths as valid.  Bam, you're losing traffic in a black hole.

My idea is to go ahead and allow the EIGRP traffic to be encrypted.  Now if your crypto fails, your EIGRP does as well, dropping the paths from the routing tables and preventing traffic loss.  Since EIGRP would not come up until the link was already encrypted, floating static routes pointing to the primary and secondary key servers would be needed as well.

This is an interesting problem and I'm looking to get input on how others might view this and what resolutions they can come up with.

Thanks!
Taking the sh out of IT since 2005!

Seittit

quick thought, may not be attractive:

use a separate EIGRP autonomous system for your VPN link.

{core network} <--EIGRP 100--> [ VPN HUB ]  <--EIGRP 200--> [ VPN SITE]  <--EIGRP 100--> {site network}

if you advertise a default route from core network to VPN hub over EIGRP 100, VPN hub takes it and redistributes it into EIGRP process 200 (establishing EIGRP peers over VPN). then VPN SITE router takes the default route in EIGRP process 200 and redistributes it into EIGRP 100 with appropriate metrics, and with that your connectivity from VPN site to core network is up/up.

here's where we go down the rabbit hole.

set up intelligence on VPN site router that says "ping a network within the core network sourced from my EIGRP process 100 interface (LAN facing interface)."

  • if you receive replies, wait five minutes and try again (repeating)
  • if you do not receive replies, redirect traffic to alternative path

This intelligence can be set up with a combination of IP SLA tests and a small EEM script. You can even get fancier and say "if latency / jitter / packet loss / etc is poor traversing the VPN circuit, fail over to alternative path."
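
The probe logic described above can be sketched in Python as a plain decision function. This is only an illustration of the decision, not IP SLA or EEM syntax; the names `ProbeResult` and `next_action` and the latency threshold are assumptions made up for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProbeResult:
    replied: bool                       # did the core network answer the ping?
    latency_ms: Optional[float] = None  # None when no reply came back


# Hypothetical values for illustration only.
RETRY_INTERVAL_S = 300    # "wait five minutes and try again"
MAX_LATENCY_MS = 200.0    # assumed cutoff for a "poor" VPN path


def next_action(result: ProbeResult) -> str:
    """Return 'retry' to stay on the VPN path or 'failover' to redirect."""
    if not result.replied:
        return "failover"
    if result.latency_ms is not None and result.latency_ms > MAX_LATENCY_MS:
        return "failover"
    return "retry"
```

A caller would run the probe, act on `next_action`, and sleep for `RETRY_INTERVAL_S` before probing again while the answer is `"retry"`.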

sgtcasey

Hi Seittit,

This method seems a little more involved than a few floating static routes to maintain connection to the key-server routers.  The encrypted links are being used as redundant links to various remote sites so we're doing a bit of load-balancing across them as well.  I chose GetVPN because we could do a one-to-many type of WAN through our provider's network.  So far I'm liking this method, however I'm concerned about what might happen if we lose crypto (and I suspect it might have already happened but have no way to confirm that as the source of the outage).

Sadly, I never considered this when I was doing my testing of the design.  I've done a bit of Google searching on this and so far haven't come up with anything useful.  I refuse to believe I'm the first one to wonder about this!   :)

wintermute000

Side topic, what is your requirement for GetVPN that is not fulfilled by either traditional IPSEC over GRE (VTI) or DMVPN?
When I last looked @ it, GetVPN was being flogged as a SP technology for customer separation without MPLS so I think of it as the crypto version of a LSP, or maybe I have it completely backwards lol

sgtcasey

Quote from: wintermute000 on January 15, 2015, 06:03:49 PM
Side topic, what is your requirement for GetVPN that is not fulfilled by either traditional IPSEC over GRE (VTI) or DMVPN?
When I last looked @ it, GetVPN was being flogged as a SP technology for customer separation without MPLS so I think of it as the crypto version of a LSP, or maybe I have it completely backwards lol

We needed to encrypt traffic moving over microwave links per the security teams.  Our WAN design calls for point to multi-point so a remote site will have several links to hub sites.  That way if a hub site drops that remote site still has a connection to the rest of our network.  It also helps with allowing traffic to take more direct routes to where it needs to go.

So far GetVPN is working great for us.  It's easy to set up, doesn't require tunnels and such, and so far has not been a problem at all.  I'm just trying to figure out a way to monitor it better than what we're doing now and getting alerts that an EIGRP adjacency dropped is a great way to do that.  :)

wintermute000

With my limited GetVPN knowledge, my reading of it is that it relies on the underlying routing path (Cisco confusingly calls it underlying VPN but they mean MPLS-VPN i.e. a private WAN however you slice it) already in place. It does not establish an overlay network where you can run a separate routing instance over. Hence you need to have routing working OK then the encryption just goes on top transparently.

https://cciethebeginning.wordpress.com/2014/09/23/get-vpn-it-is-all-about-group/


The implication is that if you run your EIGRP within the GetVPN then how does the underlying routing work and your endpoints discover each other in the first place? I note you're proposing floating statics to your key servers but what about simple CE to CE reachability? (or do you have a flat p2mp topology like a VPLS etc.)

Since it's not tunnelled and I see no separate VTI or 'instance' mechanism (sorry, too much JunOS time lately lol) then the conventional IVRF/FVRF type design to split your overlay routing from your underlying routing isn't possible.


Aside from watching for syslogs that include GetVPN messages (are syslogs/snmp etc. encrypted?) the other solution would be to set up IP SLAs from the hub using traffic types that are encrypted.

sgtcasey

You are correct in that GetVPN will not work if the group-member router cannot communicate with the key-server router.  That is why we've come up with the idea to set up those floating static routes on a remote router pointing it to the key-server routers and then the same on the key-server routers.  The group-member router then contacts the key-server router and gets the crypto up and working.  At that point EIGRP would be able to communicate and the EIGRP adjacency would come up and edge out the static route, since the static would have a higher AD value than the default EIGRP one.

My concern comes from the use of our EIGRP summary routes.  At each WAN interface we use a summary statement so we only have a single route being advertised out.  If the crypto stops working but EIGRP remains up then that summary continues to be advertised and other sites will continue to try and send traffic to the router which no longer has working crypto.  A black hole of traffic basically.

As I type this I'm still tossing around ideas in my head...

mynd

Not helpful in finding you a solution, but gots me a question:

Quote from: sgtcasey
That is why we've come up with the idea to set up those floating static routes on a remote router pointing it to the key-server routers and then the same on the key-server routers.  The group-member router then contacts the key-server router and gets the crypto up and working.  At that point EIGRP would be able to communicate and the EIGRP adjacency would come up and edge out the static route since it would have a higher AD value than the default EIGRP one.

This sounds like it will put the encrypted traffic destined for the VPN endpoint over the VPN tunnel o.O

Or are the key server and the tunnel endpoint two different devices? In that regard, do the VPN endpoints need to communicate with the key server after the tunnel is established?

As for monitoring, does GetVPN support Dead Peer Detection? If one side isn't encrypting and the opposite supports DPD, I'd think it would tear down the tunnel.




sgtcasey

No tunnels in GetVPN which is nice.  :)  The key-server routers are dedicated to just managing the TEK/KEK keys being used to encrypt the traffic.  They control key changes and such.  We're using two key-server routers for HA.  One is the primary and the other is the secondary.  If the primary fails or drops offline then we can continue normal key rotations without impact.  Communication between the group members and the key-server routers happens constantly.

I'm just not able to get my head around how to prevent a crypto problem from causing traffic interruption.  If only EIGRP would drop if/when the crypto goes south.  Oddly enough, I can't seem to find *any* information on this with the hours of searching and reading I've done.

[Edit] It does appear that GetVPN supports DPD between the key-server routers.  I've not found anything to indicate it does between the key-server routers and the group-members.  Here are some good links on GetVPN:

http://www.cisco.com/c/dam/en/us/products/collateral/security/group-encrypted-transport-vpn/GETVPN_DIG_version_1_0_External.pdf
http://www.cisco.com/c/en/us/products/collateral/security/group-encrypted-transport-vpn/deployment_guide_c07_554713.html

[Edit #2] I just did some very quick testing using a static route with the AD set to some number higher than EIGRP uses and it works.  With EIGRP disabled (interfaces passive on the test router), I added the appropriate static routes at the site router and then on the directly connected (hub) routers, and was then able to communicate with the key-server router.  I added the crypto map back onto the remote router interface and the crypto came up just fine.  No traffic is being passed right now so as not to interfere with production.  I then made sure the remote site router did not have an active EIGRP adjacency with the site core, enabled EIGRP across the unused WAN link, and the EIGRP route edged out my static as expected.  The next step is to create another crypto group/profile/set/etc. and run an actual test with EIGRP set to be encrypted.

wintermute000

Well that's the 'real' test isn't it.... you didn't exactly need to test whether an EIGRP route would override a static with an AD worse than EIGRP.
I was just going to say, grab a few spare routers and lab whether or not an encrypted EIGRP adjacency will form if the routers have floating statics to the KS.


Do keep us posted, I'm curious about this as my 15 minutes googling didn't shed any light on this issue either, as you say you just find lots of instructions to exclude routing protocols from the encryption ACL.


I do find some people talking about GetVPN over mGRE but then if you're going to do that, why not just use DMVPN. No KS dependency to worry about and you've lost the main advantage (no tunnelling).

sgtcasey

Quote from: wintermute000 on January 22, 2015, 03:20:42 AM
Well that's the 'real' test isn't it.... you didn't exactly need to test whether an EIGRP route would override a static with an AD worse than EIGRP.

Ah, never assume, always test.  :)

I've found problems over the past year with EIGRP and IOS bugs so I don't want to assume it will work.  I'll probably just test using the new hardware already in place.  These are redundant circuits so taking one down to do some changes/testing won't impact my end-users.

LynK

Quote from: sgtcasey on January 22, 2015, 10:34:28 AM
These are redundant circuits so taking one down to do some changes/testing won't impact my end-users.

...#YOLO :professorcat:
Sys Admin: "You have a stuck route"
            Me: "You have an incorrect Default Gateway"

sgtcasey

Quote from: LynK on January 26, 2015, 02:16:14 PM
...#YOLO :professorcat:

Not on my network... we've got patients to think about.  :)

sgtcasey

I thought I would post an update to my thread. 

We found a bug in the Cisco software running the GetVPN key-server routers - Cisco bug ID CSCuq18492.  They would lose connection to each other every so often (despite being able to ping back and forth with no problems).  This caused an outage when both of the key-server routers went primary just before a re-key.  Some of the GetVPN group member routers were associated with the real primary and others with the new primary.  When the re-key happened, the two key servers sent out different keys, so once the group members received their new keys and began to use them, traffic flows stopped.  But since EIGRP was excluded from encryption, all of the newly broken links continued to appear to the routers as valid paths, so traffic kept being sent down them.  How did I find this out?  By a major WAN outage causing impact to the largest sites on my network.

After the code issue was resolved by upgrading to a flavor of 15.4 I then implemented some testing with a new design.  Specifically, putting EIGRP into the encryption so if there were to be a problem with encryption the EIGRP traffic would also stop flowing correctly, any affected EIGRP adjacency would drop, and the path would no longer appear as valid for traffic.
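
To make the difference between the two designs concrete, here is a rough Python model of the split-key scenario. It is a toy sketch, not GetVPN behaviour in detail: the class `GroupMember`, the integer `tek_id`, and the rule "traffic passes only when both ends hold the same key" are simplifying assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class GroupMember:
    name: str
    tek_id: int  # stand-in for whichever traffic key this router installed


def data_path_ok(a: GroupMember, b: GroupMember) -> bool:
    # Assumption: encrypted traffic only passes when both ends share a key.
    return a.tek_id == b.tek_id


def adjacency_up(a: GroupMember, b: GroupMember, eigrp_encrypted: bool) -> bool:
    if not eigrp_encrypted:
        # EIGRP bypasses crypto, so it stays up as long as the link is up
        # (the link itself is assumed healthy in this model).
        return True
    # EIGRP rides inside the encryption, so it shares the data path's fate.
    return data_path_ok(a, b)


def black_hole(a: GroupMember, b: GroupMember, eigrp_encrypted: bool) -> bool:
    """True when routing still advertises a path that cannot carry traffic."""
    return adjacency_up(a, b, eigrp_encrypted) and not data_path_ok(a, b)
```

With a hub holding key 1 and a site holding key 2, `black_hole(hub, site, eigrp_encrypted=False)` is true (the old design) and `black_hole(hub, site, eigrp_encrypted=True)` is false (the new one), because the adjacency drops along with the data path.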

This also brought up another issue... if there is no EIGRP adjacency then how will a remote site router know how to talk to the GetVPN key-server router?  Floating static routes fixed this.  I'm using a static route with an AD of 220 so when EIGRP comes up, its lower AD value will edge out the static route and traffic will flow through the normal paths established by EIGRP.
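
The floating static behaviour comes down to a comparison of administrative distance between routes to the same prefix. A minimal Python sketch of that comparison follows; the `Route` class, `best_route` function, and the prefix `192.0.2.10/32` are placeholders invented for the example, 220 is the AD from this post, and 90 is the default AD for internal EIGRP routes on Cisco IOS.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Route:
    prefix: str          # destination, e.g. the key-server address
    source: str          # "static" or "eigrp"
    admin_distance: int  # lower wins between routes to the same prefix


def best_route(candidates: List[Route]) -> Optional[Route]:
    """Pick the lowest-AD route among candidates for one identical prefix."""
    if not candidates:
        return None
    return min(candidates, key=lambda r: r.admin_distance)


KS_PREFIX = "192.0.2.10/32"  # placeholder, not a real key-server address
floating_static = Route(KS_PREFIX, "static", 220)
eigrp_route = Route(KS_PREFIX, "eigrp", 90)

before_adjacency = best_route([floating_static])
after_adjacency = best_route([floating_static, eigrp_route])
```

One caveat worth keeping in mind: AD is only compared between routes to the same prefix. A router matches the longest prefix first, so a static that is more specific than what EIGRP advertises would keep winning regardless of its AD.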

This has been working now for a couple weeks back into production after tons of testing with application owners and allowing smaller sites to burn in for a while to make sure.  In fact, our provider had a cut fiber a week or so ago which dropped one of our GetVPN hub sites entirely.  The remote site that was up at the time in testing/validation simply dropped that single EIGRP adjacency while continuing to stay connected to the other two hub sites with zero dropped traffic and no reports of issues.

I figured I'd post this here since I couldn't find any other information on this anywhere else on the Internet in my searching.  I'm not saying it doesn't exist... I just couldn't find it.

wintermute000

Cool nice work mate, so you're saying using floating statics for key servers allows you to successfully put eigrp inside the getvpn?