Leaf & Spine Architectures

Started by routerdork, October 08, 2015, 09:01:13 AM

Previous topic - Next topic

burnyd

Quote from: AspiringNetworker on December 23, 2015, 12:23:08 PM
Quote from: that1guy15 on December 23, 2015, 10:43:58 AM
Quote from: burnyd on December 23, 2015, 10:05:13 AM
Quote from: AspiringNetworker on December 22, 2015, 08:29:08 AM
The idea was route-source tracing: just by looking at the AS_PATH you'll know which ToR a route belongs to, even if you have pairs of them in the same AS... or if you're using the same AS for all of them.

I see.  Generally you have enough bandwidth that traffic landing on ToR 1 for ToR 2's loopback/VTEP isn't an issue.  Haha, and if it is an issue, go ahead and add another spine switch.  Also, go to your internal tracker and look at the BGP unnumbered request I put in - the one that allows BGP over IPv6 link-local IPs and carries IPv4 on top.  Right now I want to use 169.254.x.x IPs and keep reusing the same IPs everywhere, but it's more of a political thing internally.

Cases like this are the reason IPv6 link-local is the way to go. Just need more adoption of IPv6... or more flexibility with MP-BGP using the IPv6 AFI to carry IPv4 prefixes.

Not even following what you guys are talking about... lol.  What's the goal? Why so complex?  What's the driver to do whatever it is you're describing that current methods can't address? Just curious at this point.

https://docs.cumulusnetworks.com/display/DOCS/Configuring+Border+Gateway+Protocol+-+BGP

Check out the portion on BGP unnumbered.
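Roughly, BGP unnumbered in Cumulus/FRR syntax looks something like this (interface names and ASN are made up for illustration) - you point the neighbor statement at the interface itself, the session comes up over the IPv6 link-local address, and IPv4 routes are advertised with an IPv6 next hop per RFC 5549:

```
router bgp 65001
 bgp router-id 10.0.0.11
 ! peer over the interface, no /30 or /31 to allocate
 neighbor swp1 interface remote-as external
 neighbor swp2 interface remote-as external
 address-family ipv4 unicast
  network 10.0.0.11/32
```

No per-link addressing to plan or maintain - the only routable IP on the box is the loopback.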

that1guy15

My biggest driver is not needing to allocate and maintain address space for PTP links (/30s or /31s) when they are nothing but transit. With IPv6, link-local establishes and you move on; no real need to provision anything else on the link.

Of course with this use case and others there are factors you have to take into account. But from my standpoint it can simplify provisioning.
That1guy15
@that1guy_15
blog.movingonesandzeros.net

NetworkGroover

#47
Quote from: that1guy15 on December 24, 2015, 09:39:45 AM
My biggest driver is not needing to allocate and maintain address space for PTP links (/30s or /31s) when they are nothing but transit. With IPv6, link-local establishes and you move on; no real need to provision anything else on the link.

Of course with this use case and others there are factors you have to take into account. But from my standpoint it can simplify provisioning.

Elaborate?  I'll have to try to read through this more thoroughly later, but I don't see how this really saves you any work.  At first glance of the doc Dan linked, it seems like it saves you some IPv4 space, but that's just because you're leveraging IPv6 underneath it? Now you have to configure dual stack?  How does this work for vendor interop?  How do you traceroute if you need to?  How does this simplify automation/provisioning?  This isn't me challenging you - this is me asking to be educated.
Engineer by day, DJ by night, family first always

wintermute000

#48
Well, at first glance, what vendor interop? Unless you're mixing vendors in your leaf/spine fabric - but then again, IPv6 link-local routing is standard for, say, OSPFv3.

What always blows my mind is IP unnumbered. Even if I know how to get it working, I just don't 'get' it, especially in the context of routing protocols, like the Cumulus proposal for IP unnumbered OSPF. WTF does the database look like? What are the transit link entries under the type 1 LSAs? What about the type 2s?
Anyone know whether that works in Cisco land (IOU preferably lol)? Worth half an hour of my time to lab and see? (Or an excuse to finally lab up some Cumulus? LOL, I'll throw it on the ever-growing pile containing vMX, vEOS, the new version 15 vSRX, yada yada.)

Also, re-reading some stuff above - AspiringNetworker, can you clarify: what peer link are you talking about in this quote? I see your BGP / route table output but I'm still confused by your setup. Do you have a diagram? I mean, in a classic Clos fabric, each spine is SUPPOSED to have only one link to each leaf, correct? The ECMP is from leaf to leaf (i.e. via multiple spines).
"So technically.. even though that IP is actually one hop away... because I have both leaf switches in the same AS, some traffic will be hashed to the other leaf switch and will have to cross the peer link in order to hit the VTEP IP... no big deal - iBGP will handle it.. but not the MOST optimal path to take. Prepending the AS for route source tracing actually also has a side effect of addressing this.  Now my route to 192.168.254.4 points to the leaf switch that it resides on."

As an aside, can you lab a VXLAN VTEP via vEOS? Multicast included, I presume?

As for your MLAG / virtual VTEP... oh boy, more Ivan flashbacks


https://www.ipspace.net/Redundant_Server-to-Network_Connectivity


Just found out before we went on Xmas break that we lost that massive Arista opportunity - the customer went with Cisco. We're still not sure what went down, aside from the obvious pants-dropping from the teal team. I'm curious to know whether there was any stick that accompanied the offer to lube up.


re: that1guy15, have you labbed this in any detail? The next-hop implications are interesting and painful.


http://www.noction.com/blog/ipv4_bgp_vs_ipv6_bgp
https://supportforums.cisco.com/document/84261/advertising-ipv6-prefixesroutes-over-ipv4-ebgp-peers

burnyd

Quote from: AspiringNetworker on December 24, 2015, 12:40:12 PM

Elaborate?  I'll have to try to read through this more thoroughly later, but I don't see how this really saves you any work.  At first glance of the doc Dan linked, it seems like it saves you some IPv4 space, but that's just because you're leveraging IPv6 underneath it? Now you have to configure dual stack?  How does this work for vendor interop?  How do you traceroute if you need to?  How does this simplify automation/provisioning?  This isn't me challenging you - this is me asking to be educated.

Yes - combine that with IPv6 autoconfig and the BGP dynamic listener and you have no need to statically configure peers or ever worry about p2p links.  You pretty much plug things in wherever, with the same AS on all the leaf switches, the spines remove the private AS, then done... just sayin'.
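A rough sketch of the spine side of that idea in FRR-style syntax (peer-group name and listen range are hypothetical) - instead of configuring each leaf neighbor, the spine just listens for inbound sessions from the fabric range and strips the private AS on the way out:

```
router bgp 64600
 neighbor LEAVES peer-group
 neighbor LEAVES remote-as 65000
 ! accept dynamic peers from anywhere in the fabric range
 bgp listen range 2001:db8:fab::/48 peer-group LEAVES
 address-family ipv4 unicast
  neighbor LEAVES remove-private-AS
```

Combined with ZTP handing the leaves a templated config, a new rack really is close to plug-and-play.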

burnyd

Quote from: wintermute000 on December 25, 2015, 04:50:04 AM
Well, at first glance, what vendor interop? Unless you're mixing vendors in your leaf/spine fabric - but then again, IPv6 link-local routing is standard for, say, OSPFv3.

That's the only catch: traceroute.  If you have an operations staff who are easily confused, they will be like wtf?  But if someone wanted to, they could source the ICMP TTL-exceeded replies from the loopback, which could be routable - it's up to the organization/owner.

If you set this up to leverage IPv6 autoconfiguration, which is similar to DHCP, and the switches started talking to each other and dynamically specified peers in the right AS via ZTP, that's like plug-and-play BGP right there.  The only thing you would even care to advertise, in my case or most cases, might be a few server subnets. In my case I would have only a few SVIs, because I am using NSX.

burnyd

Quote from: burnyd on December 25, 2015, 08:40:33 AM
Yes - combine that with IPv6 autoconfig and the BGP dynamic listener and you have no need to statically configure peers or ever worry about p2p links.  You pretty much plug things in wherever, with the same AS on all the leaf switches, the spines remove the private AS, then done... just sayin'.

I'm not sure how the unnumbered OSPF stuff works, but the BGP stuff is really cool.

As far as the MLAG VTEP stuff goes, what Steve is trying to say is that you have one IP for the VTEPs, not two.  You anycast the VTEP address, so you only need one VTEP for two top-of-rack switches.  On the back end, the two switches sync their ARP and MAC tables to each other within Linux over the peer link so they stay consistent.
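In EOS-style syntax the anycast piece is roughly this (addresses hypothetical) - both MLAG peers configure the identical secondary loopback as the VXLAN source, so remote VTEPs see one logical VTEP:

```
! identical on BOTH MLAG peers
interface Loopback1
   ip address 10.255.1.1/32
!
interface Vxlan1
   vxlan source-interface Loopback1
   vxlan vlan 10 vni 10010
```

Either peer can then terminate VXLAN traffic for 10.255.1.1, and the ARP/MAC sync over the peer link keeps the forwarding state consistent between them.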

NetworkGroover

Quote from: wintermute000 on December 25, 2015, 04:50:04 AM
Also, re-reading some stuff above - AspiringNetworker, can you clarify: what peer link are you talking about in this quote? I see your BGP / route table output but I'm still confused by your setup. Do you have a diagram? I mean, in a classic Clos fabric, each spine is SUPPOSED to have only one link to each leaf, correct? The ECMP is from leaf to another leaf (i.e. via multiple spines).
"So technically.. even though that IP is actually one hop away... because I have both leaf switches in the same AS, some traffic will be hashed to the other leaf switch and will have to cross the peer link in order to hit the VTEP IP... no big deal - iBGP will handle it.. but not the MOST optimal path to take. Prepending the AS for route source tracing actually also has a side effect of addressing this.  Now my route to 192.168.254.4 points to the leaf switch that it resides on."

I can show you my GNS3 setup when I'm more motivated :P  But it's really easy - two spines in AS 64600, and then three leaves split across two ASes.  Leaf 1 and 2 are physically connected to each other in AS 65000 in "rack1", and leaf 3 is standalone in AS 65001 in "rack2".  There is one connection from each leaf to each spine.
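The "some traffic hashes to the other leaf" behavior from the quote above can be sketched with a toy flow hash - the CRC here is just a stand-in for whatever hash the ASIC actually uses, and the names are made up:

```python
import zlib

def ecmp_pick(flow, next_hops):
    """Pick a next hop for a 5-tuple flow, the way an ECMP hash might."""
    key = repr(flow).encode()
    return next_hops[zlib.crc32(key) % len(next_hops)]

# Spine's view: 192.168.254.4/32 learned from both leaf1 and leaf2,
# which sit in the same AS, so both paths are equal cost.
next_hops = ["leaf1", "leaf2"]
flows = [("10.1.0.%d" % i, "192.168.254.4", "tcp", 40000 + i, 443)
         for i in range(20)]
picks = [ecmp_pick(f, next_hops) for f in flows]
# A given flow always hashes the same way, but different flows spread
# across both leaves - so some land on the leaf that doesn't own the
# VTEP IP and have to cross the MLAG peer link to reach it.
```

Prepending on the leaf that doesn't own the VTEP effectively removes one of those equal-cost paths, which is the side effect AspiringNetworker described.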
Engineer by day, DJ by night, family first always

NetworkGroover

#53
Oh, and as far as vEOS VXLAN support - yes, and yes (I think).  Whenever I configure VXLAN I use HER (head-end replication) instead of multicast - and frankly I have little to no experience working with multicast.  vEOS supports pretty much anything that isn't a hardware-specific feature.
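For reference, HER on EOS is just a static flood list on the VXLAN interface instead of a multicast group - something like this (VTEP addresses hypothetical):

```
interface Vxlan1
   vxlan source-interface Loopback1
   vxlan vlan 10 vni 10010
   ! replicate BUM traffic to each remote VTEP, no multicast underlay needed
   vxlan flood vtep 10.255.1.2 10.255.1.3
```

The trade-off is that the head-end switch sends one copy per remote VTEP, but in exchange the underlay stays plain unicast IP.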

And as far as "standard" protocols... as anyone who's done a good amount of vendor interop testing knows, not everyone implements everything in every standard, for numerous reasons.  Testing should always be done to confirm, and nothing should ever be assumed.
Engineer by day, DJ by night, family first always

NetworkGroover

Quote from: wintermute000 on December 25, 2015, 04:50:04 AM
As for your MLAG / virtual VTEP... oh boy, more Ivan flashbacks


https://www.ipspace.net/Redundant_Server-to-Network_Connectivity


"The networking team has to build the network infrastructure before having all the relevant input data"

Pffft - when does THAT ever happen...
Engineer by day, DJ by night, family first always

NetworkGroover

Sorry for the spam...

Winter - if you do find out whether there was anything in particular that sealed the deal for Cisco, I'd love to know.  I suspect it was the typical pants-dropping on price, though (I wonder how long they can keep doing that, or how many times they can pull it off before customers realize it's only a one-time price - when they go to renew, prepare to get YOUR pants dropped).

I know it may happen once in a while due to some feature, but I don't remember ever losing for technical reasons.
Engineer by day, DJ by night, family first always

burnyd

Quote from: AspiringNetworker on December 25, 2015, 06:08:02 PM

"The networking team has to build the network infrastructure before having all the relevant input data"

Pffft - when does THAT ever happen...

HER is the way to go, unless you guys would just let everyone use CVX inside the switch as a KVM VM... I'm not bitter about that at all lol.  But HER just works, and it works well.

burnyd

Quote from: burnyd on December 26, 2015, 03:47:07 PM
HER is the way to go, unless you guys would just let everyone use CVX inside the switch as a KVM VM... I'm not bitter about that at all lol.  But HER just works, and it works well.


Yeah, that stuff will get old real fast.  It's not just Arista; it's other network vendors as well.  As more people get involved and network switches converge into different products (i.e. EVO SDDC, hyperconverged, etc.) I don't see it playing out very well for Cisco. You will always have those guys who 100% rely on the vendor to do all the work, and they will stay with a company like Cisco, but I see that fading away.

NetworkGroover

#58
Quote from: burnyd on December 26, 2015, 03:49:20 PM
Yeah, that stuff will get old real fast.  It's not just Arista; it's other network vendors as well.  As more people get involved and network switches converge into different products (i.e. EVO SDDC, hyperconverged, etc.) I don't see it playing out very well for Cisco. You will always have those guys who 100% rely on the vendor to do all the work, and they will stay with a company like Cisco, but I see that fading away.

Yeah... unfortunately, having "one throat to choke" comes with its own cons (and price tag).

Regarding CVX, I believe that takes a little more resources than what you'll get out of a VM on a switch - at least at any level of real scale.
Engineer by day, DJ by night, family first always

wintermute000

OK, OK, I get it. When you're talking MLAG, the Arista switches are NOT in a traditional stack; it's some kind of Nexus vPC-type feature, correct? (i.e. the switches are still separate entities with separate control planes, but can present a shared EtherChannel to a downstream host)


Also, what is HER?