Any caveats doing routing AND VXLAN logical VTEP on same N9K pair

Started by wintermute000, March 17, 2016, 06:51:14 PM

Previous topic - Next topic

wintermute000

A pair of N9Ks can obviously act as a regular pair of L3 switches. They can also form a logical VTEP for VXLAN purposes.

Does anyone know if there are any caveats if you want to perform both functions on the same pair of chassis? i.e. the same pair of 9Ks does the routing for both the underlay and the overlay, and the routed traffic goes over their own VXLAN VTEP.




routerdork

I thought you could do this, but as I review some of the spine/leaf designs I researched before, they all show this happening on separate boxes.
"The thing about quotes on the internet is that you cannot confirm their validity." -Abraham Lincoln

NetworkGroover

Note: I'm not speaking for Cisco specifically here

Your wording is a little hard for me to understand.

Are you asking if you can run routing and VXLAN on the same box?  Of course - this is what a leaf switch does - encapsulates the frame into VXLAN, and then routes it through the spine. VXLAN and routing are two discrete functions, though obviously VXLAN leverages routing to move encapsulated frames from one VTEP to another.
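To illustrate the point that these are two discrete functions living on one box, here's a rough, vendor-neutral-ish sketch in EOS-style syntax (all names, addresses, and AS numbers are made up for illustration): the underlay routing gives the VTEP loopbacks reachability, and the VXLAN mapping simply rides on top of it.

```
! Underlay: ordinary routing so the VTEP loopbacks can reach each other
interface Loopback0
   ip address 192.0.2.1/32                 ! this switch's VTEP address
router bgp 65001
   neighbor 198.51.100.2 remote-as 65000   ! eBGP uplink to the spine
   network 192.0.2.1/32                    ! advertise the VTEP loopback

! Overlay: VXLAN encapsulation, which just uses that reachability
interface Vxlan1
   vxlan source-interface Loopback0
   vxlan vlan 10 vni 10010                 ! map local VLAN 10 into VNI 10010
```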

I'm sure I'm missing the mark though - a diagram could probably help.
Engineer by day, DJ by night, family first always

routerdork

The way I understood his question was whether you can run VXLAN and legacy VLANs, with a typical SVI as the gateway, together on one switch without any gotchas. But maybe I'm off too.
"The thing about quotes on the internet is that you cannot confirm their validity." -Abraham Lincoln

NetworkGroover

Quote from: routerdork on March 18, 2016, 02:22:22 PM
The way I understood his question was whether you can run VXLAN and legacy VLANs, with a typical SVI as the gateway, together on one switch without any gotchas. But maybe I'm off too.

If I'm understanding you correctly, from a vendor-neutral perspective, I don't *think* there should be an issue as long as they're different VLANs, i.e. you don't have an SVI in VLAN 10 that you're using as a gateway and routing from, while also having VLAN 10 mapped to a VNI.
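For concreteness, the "different VLANs" split might look roughly like this in NX-OS-style syntax (VLAN/VNI numbers and addresses are invented, and whether the combination is supported on a given platform is exactly the open question in this thread):

```
! Hypothetical NX-OS sketch: VLAN 10 is VXLAN-extended,
! while a separate VLAN 20 keeps a conventional SVI gateway.
feature nv overlay
feature vn-segment-vlan-based

vlan 10
  vn-segment 10010               ! VLAN 10 mapped to VNI 10010

vlan 20                          ! VLAN 20 stays a legacy VLAN

interface Vlan20
  ip address 192.168.20.1/24     ! conventional SVI gateway, no VNI mapping

interface nve1
  no shutdown
  source-interface loopback0
  member vni 10010 mcast-group 239.1.1.1   ! flood-and-learn VXLAN
```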
Engineer by day, DJ by night, family first always

NetworkGroover

Actually, strike that.... that's not making sense to me I think...

Let me lab this up real quick and get back to you.
Engineer by day, DJ by night, family first always

wintermute000

That would be appreciated. And yes, the traditional VLAN would need to be carried across the VXLAN. Like routerdork says, all the documents seem to split the functions.

NetworkGroover

Well, I built this out in vEOS/GNS3, and it appears to work.  I have a two-tier spine/leaf using eBGP and VXLAN.  My first dummy host is connected to two leaf switches via MLAG.  The second dummy host is connected to a third leaf switch via a single connection.  Both hosts are in VLAN 10.

The MLAG-peered leaf switches are configured with VARP (like HSRP, but active/active), providing a gateway VIP in VLAN 10, and VLAN 10 is also mapped to VNI 10 for VXLAN. I created a loopback on the spine switch and advertised it. I can successfully ping both the loopback on the spine switch and the host across the VXLAN tunnel.

I guess that kind of makes sense - VXLAN is effectively doing an L2 lookup in the MAC table, with the destination MAC address learned via the VXLAN interface (Vxlan1), while reaching a remote subnet is an L3 lookup in the routing table.
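For anyone wanting to reproduce this, a rough sketch of what such a leaf config might look like in EOS-style syntax (the addresses, virtual MAC, VNI, and flood list here are illustrative guesses, not the actual lab config):

```
! Illustrative vEOS leaf: VARP gateway in VLAN 10,
! with VLAN 10 also stretched over VXLAN as VNI 10.
ip virtual-router mac-address 00:1c:73:00:00:99   ! shared virtual MAC for VARP

interface Vlan10
   ip address 10.0.10.2/24                ! per-switch SVI address
   ip virtual-router address 10.0.10.1    ! active/active gateway VIP

interface Vxlan1
   vxlan source-interface Loopback0
   vxlan udp-port 4789
   vxlan vlan 10 vni 10
   vxlan vlan 10 flood vtep 192.0.2.3     ! head-end replication to the remote leaf
```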
Engineer by day, DJ by night, family first always

wintermute000

Thanks for that. The scenario I'm looking @ however is the possibility of using VXLAN as a DCI block collapsed with a conventional core L3 switching block. In this case I'd need L2 extension for some subnets, and L3 routing over a VXLAN segment for others (or even natively over the underlay?). Hey, I didn't come up with the requirements (as usual with these DCI scenarios, the customer typically has no clue and has been sold DCI as a magic potion - in fact we're struggling to even get them to nail down an ironclad set of requirements). Thankfully @ this stage the quote hasn't even landed yet, but I'm doing some preliminary recon.

Hmmm, the Cisco Way (tm) appears to be using separate switches. Check the "Figure 16. Routing Block Design with Cisco Nexus 9300 Platform Switches" section. In this recommended topology, the VTEP switches only do L3 for the purposes of multicast routing between the VTEPs. Redundancy is via back-to-back vPCs, so the routing pair interacts purely @ L2 with the VTEP pair. But again, it doesn't seem to explicitly rule out collapsing the functions.

http://www.cisco.com/c/en/us/products/collateral/switches/nexus-9000-series-switches/white-paper-c11-732453.html

NetworkGroover

I don't understand what you mean by "routing over VXLAN". VXLAN works by VTEPs sharing host MAC addresses - the only routing involved is for VTEP IP reachability.

Anyway, maybe this will help shed some light? Dunno if it's public or not, and it was written a while ago, but give it a try:

https://www.arista.com/custom_data/downloads/?f=/support/download/DesignGuides/Arista_Design_Guide_DCI_with_VXLAN.pdf

EDIT - And unfortunately it's the end of the day for me, and have to prep for a week-long trip tomorrow so I won't have time to read the article you posted and comment further, sorry ;(
Engineer by day, DJ by night, family first always

packetherder

Does this address what you're getting at?

http://blog.ipspace.net/2014/06/trident-2-chipset-and-nexus-9500.html

Quote from the blog:
No routing with overlays (BRKARC-2222 slide 81). The Trident 2 chipset doesn't support routing of VXLAN-encapsulated packets, and based on some other vendors' limitations it seems it has the same challenges with any overlay technology (including TRILL and potentially SPB).

It's my understanding (based on scarce information available) that the problem might lie in the structure of the forwarding pipeline – by the time the chipset figures out it's the overlay tunnel endpoint for the incoming packet, and performs the L2 lookup of the destination MAC address, it's too late for another L3 lookup.

The workaround is hinted at in the BRKARC-2222 presentation: the packet has to be recirculated through the forwarding pipeline.

Remember the front-panel cables between F2 and M1 linecards on Nexus 7000? Same idea, implemented in silicon, probably resulting in similar performance.

Cisco solved the problem with its ACI Leaf Engine (ALE) chipset. One could also implement L3 forwarding on fabric modules in modular switches, or use a second Trident 2 chipset (building a leaf-and-spine architecture within the ToR switch).

Takeaway: Trident 2 has challenges performing L3 forwarding in combination with L2 tunnels. Have a long discussion with your vendor before implementing a design that uses the two features together, even when the datasheets imply everything works just fine.

Finally, looking at the Nexus 9300 architecture (BRKARC-2222 slide 59), there are only 8 40GE lanes between Trident 2 chipset and ALE chipset on Nexus 93128TX, which means that you won't get line rate VXLAN routing on Nexus 93128TX.


wintermute000


Great pickup




EDIT: What I mean by routing over VXLAN is: in a DCI scenario, I may want an SVI on one side to form a routing adjacency with an SVI on the other side via a VXLAN segment. That way I can keep the L3 underlay completely separate and run both L3 and L2 over the VXLAN. The term 'VXLAN routing' as used by much of the literature implies L3 routing from one VXLAN segment to another; that's not exactly what I'm referring to, though again it will come up in certain traffic flows. I'm aware EVPN 'solves' this problem.
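In other words, something like this sketch (EOS-style syntax; the VLAN, VNI, and addressing are made up): the DCI VLAN is stretched between sites as a VNI, and the SVI on each side peers across it. Whether a given platform can route out of a VXLAN-mapped SVI at all is exactly the Trident 2 caveat quoted earlier.

```
! Illustrative "routing over VXLAN" for DCI:
! VLAN 900 is stretched between sites as VNI 900, and the SVIs
! on each side form an OSPF adjacency across the stretched segment.
vlan 900

interface Vlan900
   ip address 172.16.99.1/30          ! far-side SVI would use 172.16.99.2
   ip ospf network point-to-point

interface Vxlan1
   vxlan source-interface Loopback0
   vxlan vlan 900 vni 900             ! DCI transit VLAN carried in VXLAN

router ospf 1
   network 172.16.99.0/30 area 0      ! adjacency forms over the VXLAN segment
```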




All signs point to: do it on separate devices, let the VTEPs be just VTEPs and route only for VTEP reachability, and have upstream devices hit them via L2. I'll run through that Arista PDF on Monday and compare it to the Cisco N9K VXLAN design guides.



