VSS blew up at my old job

Started by dlots, March 14, 2017, 08:24:52 AM


dlots

Before I left, I advised my old job (the one I left because they wouldn't listen to me) not to do VSS; they didn't need the extra-fast failover for anything in particular, or the extra bandwidth that comes with vPC versus spanning-tree. The senior engineers said I was dumb, that it was super awesome, and that I didn't know what I was talking about.

Well, they ran into a VSS bug the other week, lost both of their data centers for a few hours, and had to reboot everything.

Would it be wrong to call them and tell them "Told you so"?

LynK

but it is easy. we like easy.


everyone needs the extra fast failover!
Sys Admin: "You have a stuck route"
            Me: "You have an incorrect Default Gateway"

deanwebb

Take a baseball bat and trash all the routers, shout out "IT'S A NETWORK PROBLEM NOW, SUCKERS!" and then peel out of the parking lot in your Ferrari.
"The world could perish if people only worked on things that were easy to handle." -- Vladimir Savchenko
Вопросы есть? Вопросов нет! | BCEB: Belkin Certified Expert Baffler | "Plan B is Plan A with an element of panic." -- John Clarke
Accounting is architecture, remember that!
Air gaps are high-latency Internet connections.

Otanx

We did VSS in our first DC because "easy". We haven't hit any bugs, but patching VSS is more involved, and with only one control plane a single configuration error can kill everything. We're in the process of building two new DCs and have decided not to use VSS, and I have a side project to figure out how to break the VSS in DC1 and go back to independent cores. It shouldn't be too hard; we were smart enough to configure HSRP on the majority of the layer 3 interfaces.
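
For what it's worth, the HSRP on those interfaces is nothing fancy. A rough sketch of what each core carries (VLAN, addresses, and group numbers are placeholders, and the other core runs the mirror config with its own real IP and a lower priority):

! layer 3 SVI with an HSRP virtual gateway
interface Vlan10
 ip address 10.1.10.2 255.255.255.0
 standby 10 ip 10.1.10.1
 standby 10 priority 110
 standby 10 preempt

The point is that once we split back to standalone cores, the virtual gateway addresses don't have to change.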

-Otanx

mlan

@dlots - I have seen this once myself, but the team did not have any split-brain detection configured.  Any idea what type of dual-active detection they were running for this incident?  I have been suggesting "fast-hello" for the VSS deployments in my environment.

Also, do you happen to know the bug ID?
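
For reference, fast-hello only needs a dedicated point-to-point link between the two chassis plus one knob under the virtual switch domain. A rough sketch (domain ID and interface are placeholders):

switch virtual domain 100
 dual-active detection fast-hello
!
! dedicated link to the peer chassis; configuring fast-hello removes any other config from the port
interface GigabitEthernet1/5/48
 dual-active fast-hello
 no shutdown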

dlots

Sorry, no to both.
I wasn't very popular with the network team there and don't talk to them much at all.

wintermute000

Shared blast radius. I'm not a fan in a DC scenario. For campus, where you can usually bring it down on a Sunday, it's a different story.

icecream-guy

:professorcat:

My Moral Fibers have been cut.

SimonV

So what's the opinion on stacking vs standalone core switches then?

I'm not a DC guy, doing mostly office and warehouse networks and I almost always install two separate L3 switches with RSTP and fast HSRP timers. Two-link etherchannels to both cores and load-balancing the VLANs by modifying STP cost.

I don't mind stacking in the access layer but I always get the feeling that it's too much of a risk in the core.
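
For the curious, the tuning boils down to two things per VLAN. A rough sketch with made-up VLANs, channel numbers, and timers:

! access switch - prefer the uplink to core 1 for VLAN 10 and the uplink to core 2 for VLAN 20
interface Port-channel1
 spanning-tree vlan 20 cost 200
!
interface Port-channel2
 spanning-tree vlan 10 cost 200

! cores - sub-second HSRP timers on each SVI (hello 250 ms, hold 750 ms)
interface Vlan10
 standby 10 timers msec 250 msec 750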

icecream-guy

Quote from: SimonV on March 15, 2017, 09:29:39 AM
So what's the opinion on stacking vs standalone core switches then?

I'm not a DC guy, doing mostly office and warehouse networks and I almost always install two separate L3 switches with RSTP and fast HSRP timers. Two-link etherchannels to both cores and load-balancing the VLANs by modifying STP cost.

I don't mind stacking in the access layer but I always get the feeling that it's too much of a risk in the core.

So who are you asking? The network engineer who has to support it, or the Cisco engineer/sales guy?

From a network engineer's point of view it's a PITA to support: upgrades, failed modules, failover, and the chassis have to have identical configs, modules, and such.

If you are in a DC, use Nexus and build vPCs. In a campus environment, VSS makes a nice core with all your access switches dual-homed to it via MEC.
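
For anyone who hasn't built MEC: from the access switch it's just an ordinary LACP port-channel; on the VSS side the bundle simply has one member link on each chassis. A rough sketch (interface and channel numbers are placeholders):

! VSS core - one member link per chassis (the first digit is the chassis number)
interface range TenGigabitEthernet1/1/1 , TenGigabitEthernet2/1/1
 switchport
 switchport trunk encapsulation dot1q
 switchport mode trunk
 channel-group 10 mode active
!
interface Port-channel10
 description MEC down to an access switch

The access switch just sees a two-port EtherChannel, same as it would to a single chassis.
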
:professorcat:

My Moral Fibers have been cut.

mlan

Quote from: ristau5741 on March 15, 2017, 10:43:44 AM
In a campus environment, VSS makes a nice core with all your access switches dual-homed to it via MEC.

This is how I'm feeling about the topic.

Otanx

I am not a fan of stacking anywhere. I hit too many issues with stacking when it was new. I understand the idea of managing one config vs multiple, but once you automate all of that you don't care how many devices are being managed.

-Otanx

wintermute000

Maintenance. It's cheaper than separate devices.

mlan

Quote from: wintermute000 on March 15, 2017, 05:54:22 PM
Maintenance. Cheaper than separate devices

Also, stack modules/cables are sometimes less expensive than the separate optics and patch cables you would need to aggregate standalone access switches. That might not be true for every vendor, though. Just playing devil's advocate here; I enjoy these types of pros/cons discussions, and I usually learn something I didn't know or hadn't thought of. ;)

wintermute000

I trust stacking. I somewhat don't trust VSS.

Campus yes, distribution yes; DC or a massive/super-critical campus core, no.