[BGP] Advertisement of ECMP routes

Started by NetworkGroover, March 18, 2015, 12:02:11 PM

NetworkGroover

Hi guys,

I'm writing a paper on BGP in the data center and could use your help understanding something.  I've set up a virtual lab in GNS3 and attached a diagram to the post.

My question is about what happens at LEAF2.  LEAF2 has two equal-cost paths to reach 10.0.2.0/24:

BGPDC-LEAF2(config-router-bgp)#sh ip bgp | i 10.0.2.0/24
* >Ec 10.0.2.0/24         10.255.255.2     0       100     0       64600 65002 i
*  ec 10.0.2.0/24         10.255.255.10    0       100     0       64600 65002 i

BGPDC-LEAF2(config-router-bgp)#sh ip route
Codes: C - connected, S - static, K - kernel,
       O - OSPF, IA - OSPF inter area, E1 - OSPF external type 1,
       E2 - OSPF external type 2, N1 - OSPF NSSA external type 1,
       N2 - OSPF NSSA external type2, B I - iBGP, B E - eBGP,
       R - RIP, I - ISIS, A B - BGP Aggregate, A O - OSPF Summary,
       NG - Nexthop Group Static Route

Gateway of last resort is not set

C      1.1.1.0/31 is directly connected, Vlan4094
C      1.1.1.2/31 is directly connected, Vlan4093
C      10.0.1.0/24 is directly connected, Vlan10
B E    10.0.2.0/24 [200/0] via 10.255.255.2, Ethernet1
                            via 10.255.255.10, Ethernet2
< .. Omitted for brevity .. >


LEAF2 advertises only the 10.0.2.0/24 route via 10.255.255.2 to LEAF1:

BGPDC-LEAF1(config-if-Et1-2)#sh ip bgp | i 10.0.2.0/24
* >   10.0.2.0/24         10.255.255.2     0       100     0       64600 65002 i

BGPDC-LEAF1(config-if-Et1-2)#sh ip bgp neighbors 1.1.1.3 received-routes | i 10.0.2.0/24
* >   10.0.2.0/24         10.255.255.2     0       100     -      64600 65002 i

BGPDC-LEAF2(config-router-bgp)#sh ip bgp nei 1.1.1.2 advertised-routes | i 10.0.2.0/24
* >Ec 10.0.2.0/24         10.255.255.2     -       100     -      64600 65002 i


Why is it not advertising both? Does it only advertise the "best" of the ECMP routes?

BGPDC-LEAF2(config-router-bgp)#sh ip bgp 10.0.2.0/24
BGP routing table information for VRF default
Router identifier 10.255.254.12, local AS number 65000
BGP routing table entry for 10.0.2.0/24
Paths: 2 available
  64600 65002
    10.255.255.2 from 10.255.255.2 (10.255.254.1)
      Origin IGP, metric 0, localpref 100, weight -, valid, external, ECMP head, best, ECMP contributor
  64600 65002
    10.255.255.10 from 10.255.255.10 (10.255.254.2)
      Origin IGP, metric 0, localpref 100, weight -, valid, external, ECMP, ECMP contributor
Engineer by day, DJ by night, family first always

that1guy15

Yep. Even though the RIB might be doing ECMP using two BGP paths, BGP still only advertises the best path to its neighbors.
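
To make the distinction concrete, here is a minimal Python sketch. It is a toy model, not any vendor's implementation. The `Path` class and the `install_and_advertise` helper are hypothetical names made up for illustration; the next hops and router IDs are copied from the LEAF2 output earlier in the thread. It assumes the candidate paths already tie on every attribute multipath cares about, and that the best path is picked on lowest router ID.

```python
from dataclasses import dataclass
from ipaddress import IPv4Address


@dataclass(frozen=True)
class Path:
    prefix: str
    next_hop: str
    router_id: str  # BGP identifier of the advertising peer


def install_and_advertise(paths, max_paths=2):
    """Toy model: return (next hops installed in the RIB, path advertised).

    Assumes every path is for the same prefix and already ties on
    local-pref, AS-path length, origin and MED, so all are ECMP candidates.
    """
    # Assumption: the tie is broken on lowest router ID.
    ranked = sorted(paths, key=lambda p: IPv4Address(p.router_id))
    best = ranked[0]
    # Multipath: the RIB may get several next hops for forwarding...
    rib_next_hops = [p.next_hop for p in ranked[:max_paths]]
    # ...but without add-path, only the single best path goes to neighbors.
    return rib_next_hops, best


paths = [
    Path("10.0.2.0/24", "10.255.255.10", "10.255.254.2"),
    Path("10.0.2.0/24", "10.255.255.2", "10.255.254.1"),
]
rib_next_hops, advertised = install_and_advertise(paths)
print("RIB next hops:", rib_next_hops)
print("Advertised   :", advertised.next_hop)
```

Both next hops end up in the forwarding decision, but only the path via 10.255.255.2 is handed to the neighbor, which matches what LEAF1 sees.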
That1guy15
@that1guy_15
blog.movingonesandzeros.net

NetworkGroover

Quote from: that1guy15 on March 18, 2015, 12:57:08 PM
Yep. Even though the RIB might be doing ECMP using two BGP paths, BGP still only advertises the best path to its neighbors.

Gotcha - thanks for the verification. Been a while since I played around with this in depth.

NetworkGroover

Quote from: that1guy15 on March 18, 2015, 12:57:08 PM
Yep. Even though the RIB might be doing ECMP using two BGP paths, BGP still only advertises the best path to its neighbors.

Oh hey while I have you, purely being lazy here, but what makes that route the best? Is it lowest router ID (SPINE1 is lower than SPINE2)?

wintermute000

Note this can be changed in the latest IOS-XE. About time... but it potentially makes life very complicated.

http://www.cisco.com/c/en/us/td/docs/ios-xml/ios/iproute_bgp/configuration/xe-3s/irg-xe-3s-book/irg-additional-paths.html#GUID-4EB13F76-7C14-4B74-AFE0-66BF07976BD5

Service providers are not sure whether to say YAY or OMG, we have to rip up our entire RD:RT bag of tricks... i.e. the traditional way of getting around this 'problem':

http://blog.ipspace.net/2012/07/bgp-route-replication-in-mplsvpn-pe.html

NetworkGroover

Quote from: wintermute000 on March 18, 2015, 05:11:27 PM
Note this can be changed in the latest IOS-XE. About time... but it potentially makes life very complicated.

http://www.cisco.com/c/en/us/td/docs/ios-xml/ios/iproute_bgp/configuration/xe-3s/irg-xe-3s-book/irg-additional-paths.html#GUID-4EB13F76-7C14-4B74-AFE0-66BF07976BD5

Service providers are not sure whether to say YAY or OMG, we have to rip up our entire RD:RT bag of tricks... i.e. the traditional way of getting around this 'problem':

http://blog.ipspace.net/2012/07/bgp-route-replication-in-mplsvpn-pe.html

Interesting, but I don't work with IOS. ;)  I don't necessarily need to advertise all ECMP routes either, but I'll keep this in mind for future studies - thanks!

that1guy15

Quote from: AspiringNetworker on March 18, 2015, 03:58:13 PM
Quote from: that1guy15 on March 18, 2015, 12:57:08 PM
Yep. Even though the RIB might be doing ECMP using two BGP paths, BGP still only advertises the best path to its neighbors.

Gotcha - thanks for the verification. Been a while since I played around with this in depth.
http://www.cisco.com/c/en/us/support/docs/ip/border-gateway-protocol-bgp/13753-25.html

NetworkGroover

Quote from: that1guy15 on March 19, 2015, 08:22:39 AM
Quote from: AspiringNetworker on March 18, 2015, 03:58:13 PM
Quote from: that1guy15 on March 18, 2015, 12:57:08 PM
Yep. Even though the RIB might be doing ECMP using two BGP paths, BGP still only advertises the best path to its neighbors.

Gotcha - thanks for the verification. Been a while since I played around with this in depth.
http://www.cisco.com/c/en/us/support/docs/ip/border-gateway-protocol-bgp/13753-25.html

Thanks sir.  I didn't want to include a vendor doc reference in my paper so I found this in the RFC as well - though a little more wordy.  RFC 4271, Section 9.1.2.2, "Breaking Ties (Phase 2)"
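
For anyone following along, here is a rough Python sketch of the tie-breaking order as I read it in RFC 4271, Section 9.1.2.2. It is a simplification, not a full decision process: the `Candidate` class and `tie_break_key` function are hypothetical names, the interior cost of 0 is an assumption (directly connected eBGP peers), and MED is compared across all candidates even though the RFC only compares it between routes from the same neighboring AS. Real implementations also add their own steps (weight, oldest path and so on), so this does not prove why the lab box picked that path; it only shows the result is consistent with lowest router ID.

```python
from dataclasses import dataclass
from ipaddress import IPv4Address


@dataclass(frozen=True)
class Candidate:
    as_path: tuple
    origin: int        # 0 = IGP, 1 = EGP, 2 = INCOMPLETE (lower wins)
    med: int
    is_ebgp: bool
    igp_cost: int      # interior cost to the next hop (assumed 0 here)
    router_id: str     # BGP Identifier of the peer
    peer_address: str


def tie_break_key(c):
    """Sort key approximating the order of RFC 4271, Section 9.1.2.2."""
    return (
        len(c.as_path),                    # a) shortest AS path
        c.origin,                          # b) lowest origin
        c.med,                             # c) lowest MED (simplified)
        0 if c.is_ebgp else 1,             # d) eBGP preferred over iBGP
        c.igp_cost,                        # e) lowest interior cost
        int(IPv4Address(c.router_id)),     # f) lowest BGP Identifier
        int(IPv4Address(c.peer_address)),  # g) lowest peer address
    )


# Values taken from the "sh ip bgp 10.0.2.0/24" output on LEAF2.
via_first_peer = Candidate((64600, 65002), 0, 0, True, 0,
                           "10.255.254.1", "10.255.255.2")
via_second_peer = Candidate((64600, 65002), 0, 0, True, 0,
                            "10.255.254.2", "10.255.255.10")

best = min([via_second_peer, via_first_peer], key=tie_break_key)
print(best.peer_address)
```

The two paths tie on steps a) through e), so the lower BGP Identifier (10.255.254.1) decides it and the path via 10.255.255.2 wins.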

burnyd

Add-path and PIC work out surprisingly well.  However, I have a similar setup in all my data centers, with NX-OS on the leaves and spines.  I have no issues at all: we turn on multipathing and we are good to go.  I have not had to use add-paths or PIC in those environments.

NetworkGroover

Quote from: burnyd on March 21, 2015, 02:46:24 PM
Add-path and PIC work out surprisingly well.  However, I have a similar setup in all my data centers, with NX-OS on the leaves and spines.  I have no issues at all: we turn on multipathing and we are good to go.  I have not had to use add-paths or PIC in those environments.

Yeah as mentioned before, I don't have a need for it either.  Out of curiosity though - what's the use case for it?

burnyd

It's hard to talk about it without drawing it out, but I have 4 internet peerings, all meshed between multiple data centers in one large iBGP/OSPF mesh with full IPv4 tables.  When one of the internet circuits failed, outbound traffic was not failing over as quickly as it should, because obviously that next hop would disappear.  Add-paths made the failover much faster.

NetworkGroover

Quote from: burnyd on March 24, 2015, 11:28:19 AM
It's hard to talk about it without drawing it out, but I have 4 internet peerings, all meshed between multiple data centers in one large iBGP/OSPF mesh with full IPv4 tables.  When one of the internet circuits failed, outbound traffic was not failing over as quickly as it should, because obviously that next hop would disappear.  Add-paths made the failover much faster.

I think I follow: with add-paths, routes are advertised with additional reachable next hops before the failure occurs, instead of only the single current best path, which would disappear in the failure scenario.  Makes sense - thanks.

EDIT - And actually, I think I have a use case for this with BGP in the DC with an ECMP switch fabric for that same reason - improving failover time... I'll have to dig into that.
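
Sketching my understanding in Python to check it. This is a toy model of a receiver only, not real BGP behavior or timing: `fail_next_hop` is a hypothetical helper, and the assumption that add-path advertises both ECMP paths from my lab is mine, not something I have tested.

```python
def fail_next_hop(known_paths, failed_next_hop):
    """Toy model of a receiver reacting to a next hop going away.

    `known_paths` maps prefix -> list of next hops learned from the sender,
    best first. Returns prefix -> surviving next hop, or None when the
    receiver has nothing left and must wait for a fresh advertisement.
    """
    result = {}
    for prefix, next_hops in known_paths.items():
        survivors = [nh for nh in next_hops if nh != failed_next_hop]
        result[prefix] = survivors[0] if survivors else None
    return result


# Without add-path: only the best path was advertised.
best_only = {"10.0.2.0/24": ["10.255.255.2"]}
# With add-path (assumed here to carry both ECMP paths).
with_add_path = {"10.0.2.0/24": ["10.255.255.2", "10.255.255.10"]}

print(fail_next_hop(best_only, "10.255.255.2"))
print(fail_next_hop(with_add_path, "10.255.255.2"))
```

In the first case the receiver is left with no path and has to wait for the sender to reconverge and advertise a new best path. In the second it already holds 10.255.255.10 and can switch locally, which is where the faster failover would come from.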