Ummm.... huh?

Started by NetworkGroover, December 16, 2015, 06:36:40 PM


NetworkGroover

Can someone help me understand something?

I'm looking at the following for the 9372TX/PX:

http://www.cisco.com/c/en/us/products/collateral/switches/nexus-9000-series-switches/white-paper-c11-732452.html

Look at Figure 1. Am I reading this correctly that traffic will enter a front-panel port, hit the NFE, go to the ALE-2, then out an uplink?

If so, then am I also reading this correctly that there are 6 x 42Gbps lanes between the NFE and ALE-2? Because that would mean 252 Gbps total.

If so, then wouldn't that mean that if you have 48 front-panel ports running at 10Gbps simultaneously sending traffic to the spine, you're oversubscribing the link between the NFE and ALE-2 almost 2:1 - just inside the box itself? It sounds too ludicrous to me to make sense and I assume I'm misinterpreting.
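
To sanity-check the arithmetic, here is a minimal sketch using only the numbers above (48 ports at 10 Gbps, 6 lanes at 42 Gbps). The function name `oversubscription` is just something made up for illustration, and it assumes every front-panel port sends at line rate toward the uplinks at the same time:

```python
def oversubscription(ingress_ports, port_gbps, lanes, lane_gbps):
    """Return (offered Gbps, inter-ASIC capacity Gbps, ratio)."""
    offered = ingress_ports * port_gbps   # worst case: all ports at line rate
    capacity = lanes * lane_gbps          # aggregate NFE-to-ALE-2 bandwidth
    return offered, capacity, offered / capacity


offered, capacity, ratio = oversubscription(48, 10, 6, 42)
print(f"{offered} Gbps offered / {capacity} Gbps inter-ASIC = {ratio:.2f}:1")
# prints: 480 Gbps offered / 252 Gbps inter-ASIC = 1.90:1
```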
Engineer by day, DJ by night, family first always

NetworkGroover

I mean I guess it doesn't matter since the uplinks only provide 240 Gbps anyway... but I guess this is where the inter-ASIC buffering (?) occurs and where folks have complained about weird hidden drops...

Reggle

I wouldn't be surprised. We once had a linecard for the 6500 here with a similar situation. Drops towards the backplane due to oversubscription on the ASIC. IIRC, the switch was decent enough to send a Syslog when it occurred.

that1guy15

I'm digging through my training materials for the 9K and they go more in-depth on the 9500 than the 9300. But from what I remember there is oversubscription on the 9300, and 2:1 sounds right since it's geared to be a leaf switch. Don't quote me on it; I'm trying to find where I saw this.
That1guy15
@that1guy_15
blog.movingonesandzeros.net

icecream-guy

my docs don't go into the 9372 specifically
for the 9396PX it's 12x42 between the NFE and ALE
for the 93128TX it's 8x42 between the NFE and ALE (8 of 12 ports are active)

the 9372 are 720G (48 x 10G + 6 x 40G)

for the 9300 series
The ALE has a shared 40MB buffer
The NFE has a shared 12MB buffer

Buffer on ALE (on GEM module):
Shared 10MB egress buffer for all 40GE ports on GEM module;
Shared 20MB egress buffer for traffic coming from GEM 40GE ports and going to 1/10GE front panel port.
Shared 10MB egress buffer for local NFE traffic

My Moral Fibers have been cut.

NetworkGroover

Yeah, front-panel to uplink ratio is 2:1, which is fine, but I just think it's weird that besides that, you have this inter-ASIC path that is oversubscribed as well, so rather than congestion occurring at the uplink port, it can occur between the ASICs. And then how is traffic hashed across those 6 lanes? Could you run into an issue with more than 42G (I've been told this is HiGig (typical), so you really only get 40G) being pinned to one lane? Probably not with enough entropy, just thought it was curious.
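
To make the lane-pinning question concrete, here is a rough sketch of per-flow hashing across the 6 lanes. Big caveat: I don't know what hash the ASIC actually uses, so the SHA-256-based `pick_lane` below is purely a stand-in, and the flows (addresses, ports, 10 Gbps each) are made up. It only shows the general idea that each flow is pinned to one lane and the per-lane load depends on how the flows happen to hash:

```python
import hashlib
from collections import Counter

LANES = 6        # NFE-to-ALE-2 lanes, per the white paper figure
LANE_GBPS = 42   # nominal per-lane rate quoted in this thread


def pick_lane(src_ip, dst_ip, src_port, dst_port, proto, lanes=LANES):
    """Map a 5-tuple to a lane index using a stand-in hash (not the ASIC's)."""
    key = f"{src_ip}|{dst_ip}|{src_port}|{dst_port}|{proto}".encode()
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:4], "big") % lanes


def lane_load(flows, lanes=LANES):
    """Sum offered Gbps per lane for an iterable of (five_tuple, gbps) pairs."""
    load = Counter({lane: 0.0 for lane in range(lanes)})
    for five_tuple, gbps in flows:
        load[pick_lane(*five_tuple, lanes=lanes)] += gbps
    return load


if __name__ == "__main__":
    # 48 hypothetical flows at 10 Gbps each, one per front-panel port.
    flows = [((f"10.0.0.{i}", "10.1.0.1", 40000 + i, 443, 6), 10.0)
             for i in range(48)]
    for lane, gbps in sorted(lane_load(flows).items()):
        status = "over" if gbps > LANE_GBPS else "ok"
        print(f"lane {lane}: {gbps:.0f} Gbps ({status})")
```

With lots of small flows the per-lane totals should even out; a handful of elephant flows landing on the same lane is the case to worry about.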

I kinda knew about this stuff when I was initially researching ACI when it was new, but now revisiting this for a competitive situation.

NetworkGroover

Quote from: ristau5741 on December 17, 2015, 10:29:18 AM
my docs don't go into the 9372 specifically
for the 9396PX it's 12x42 between the NFE and ALE
for the 93128TX it's 8x42 between the NFE and ALE (8 of 12 ports are active)

the 9372 are 720G (48 x 10G + 6 x 40G)

for the 9300 series
The ALE has a shared 40MB buffer
The NFE has a shared 12MB buffer

Buffer on ALE (on GEM module):
Shared 10MB egress buffer for all 40GE ports on GEM module;
Shared 20MB egress buffer for traffic coming from GEM 40GE ports and going to 1/10GE front panel port.
Shared 10MB egress buffer for local NFE traffic

Yeah, the 9372 uses the ALE-2 ASIC, so that's 12MB of buffer shared among all front-panel ports at the NFE, and then an additional 25MB via the ALE-2 shared among all uplink ports, per the diagram. Then there's the whole Buffer Boost piece, and it gets a little hard to follow with 10MB reserved for certain items... meh...

NetworkGroover

Quote from: Reggle on December 17, 2015, 12:46:42 AM
I wouldn't be surprised. We once had a linecard for the 6500 here with a similar situation. Drops towards the backplane due to oversubscription on the ASIC. IIRC, the switch was decent enough to send a Syslog when it occurred.

Yuck - did you at least have a counter exposed so you could see the drops manually from CLI show output?

EDIT - Never mind, I read that as "wasn't decent enough"

burnyd

Which Arista switch are you comparing this to? I would imagine the 7050 family would be similar? I have quite a few 9372s in my environment. VCE is using that exact switch for all of their Vblocks. I can check anything on that switch later if you want.

NetworkGroover

Quote from: burnyd on December 17, 2015, 06:01:34 PM
Which Arista switch are you comparing this to? I would imagine the 7050 family would be similar? I have quite a few 9372s in my environment. VCE is using that exact switch for all of their Vblocks. I can check anything on that switch later if you want.

Just against one of the older 1RU Trident (Not T2) 7050 platforms, yeah.