Automation Issue with Cisco Switches

Started by deanwebb, June 30, 2017, 01:02:28 PM

deanwebb

This one's puzzling me... I'd like to see if anyone else is facing this...

NAC will use SNMP and/or CLI access to a switch to assign a switchport to a new VLAN. Every now and then on some of my switches, that access fails and I have to go in and clear, by hand, the VLAN assignments that are no longer valid. This is only on a few switches - maybe one out of every 15 or 20 that we're doing NAC enforcement on.

I talked with a NAC engineer from another firm the other day, and she said that she's seen something similar in her environment on 3850 switches. I didn't check my switches for model, but I have checked for IOS version, and the affected ones run a range of IOS versions, from 12.2(35) up to the 15.0 code train.

I'm wondering, for those of you that do Prime and/or other automation tools, do you ever see a time where a switch just stops responding properly?
Take a baseball bat and trash all the routers, shout out "IT'S A NETWORK PROBLEM NOW, SUCKERS!" and then peel out of the parking lot in your Ferrari.
"The world could perish if people only worked on things that were easy to handle." -- Vladimir Savchenko
Any questions? No questions! | BCEB: Belkin Certified Expert Baffler | "Plan B is Plan A with an element of panic." -- John Clarke
Accounting is architecture, remember that!
Air gaps are high-latency Internet connections.

mlan

We have dabbled in EEM scripting, and have seen random failures where the switch fails to execute the commands properly. We had difficulty pinning down a root cause, but I wonder if the switch CPU is busy at the time, and the command script times out waiting for a response.
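
If the busy-CPU theory holds, one workaround on the automation side is to retry a timed-out command with an increasing delay instead of giving up on the first failure. This is only a minimal sketch under that assumption: `run_with_retry` and the `send_command` callable are hypothetical names, not part of EEM or any NAC product, and the attempt count and delays are arbitrary.

```python
import time


def run_with_retry(send_command, command, attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call send_command(command), retrying on TimeoutError with a doubling delay.

    send_command is a hypothetical callable supplied by the caller (for example,
    a wrapper around an SSH session). The last TimeoutError is re-raised if
    every attempt fails.
    """
    for attempt in range(attempts):
        try:
            return send_command(command)
        except TimeoutError:
            if attempt == attempts - 1:
                raise  # out of attempts, let the caller see the failure
            # Wait 1x, 2x, 4x ... base_delay before the next try.
            sleep(base_delay * 2 ** attempt)
```

The `sleep` parameter is there only so the waiting can be stubbed out when trying the function without a real switch.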

deanwebb

Quote from: mlan on June 30, 2017, 04:16:22 PM
We have dabbled in EEM scripting, and have seen random failures where the switch fails to execute the commands properly. We had difficulty pinning down a root cause, but I wonder if the switch CPU is busy at the time, and the command script times out waiting for a response.

That's an interesting possible cause I haven't thought of. I had been considering possible hangs in the SSH or SNMP daemons on the switch.

Teeraporl

I use this system as well; it works very well.

dlots

Never had the NAC change the VLAN with an SNMP/SSH session. The way I did it "years ago" was to have the RADIUS server reply with a VLAN name; the switch would then look at its VLAN names and select the matching VLAN.

So I log in and the RADIUS server sends back "Staff"; the switch looks at its VLAN names, sees that VLAN 101 is named "Staff", and puts me on VLAN 101.
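
The lookup the switch does there can be sketched roughly like this. It is an illustration of the idea, not how IOS implements it: `resolve_vlan` is a made-up name, the VLAN table is a plain dict, VLAN 102 "Guest" is invented for the example, and exact (case-sensitive) name matching is an assumption.

```python
def resolve_vlan(vlan_table, radius_vlan_name):
    """Return the VLAN ID whose configured name equals the name sent by RADIUS.

    vlan_table maps VLAN ID -> VLAN name, as configured on the switch.
    Returns None when no VLAN has that name (assumption: exact match only).
    """
    for vlan_id, name in vlan_table.items():
        if name == radius_vlan_name:
            return vlan_id
    return None


# Hypothetical switch config: only VLAN 101 "Staff" comes from the post above.
vlan_table = {1: "default", 101: "Staff", 102: "Guest"}

print(resolve_vlan(vlan_table, "Staff"))       # 101
print(resolve_vlan(vlan_table, "Contractor"))  # None, no VLAN with that name
```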

deanwebb

ForeScout CounterACT is able to do VLAN reassignments in this way, as it is not 802.1X-dependent in the way most other (I think *all* other, but I'll hedge my bets here) NAC solutions are.

Resolution, by the way, was that the switches were stacked and the stack master wasn't the one configured in the product. Getting the stack master back to the proper switch makes things work out just fine.
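
For anyone who wants to catch this before it bites, here is a rough sketch of checking which stack member currently holds the master role against the one you expect. Everything in it is an assumption for illustration: the sample text is only patterned on `show switch` output (column layout and the role label, "Master" or "Active", vary by platform and IOS version), and `find_stack_master` and `master_matches` are hypothetical helpers, not part of any NAC product.

```python
import re

# Illustrative text only, patterned on "show switch" output; real output varies.
SAMPLE = """\
Switch#  Role   Mac Address     Priority Version  State
----------------------------------------------------------
*1       Member 0016.9d0c.7500     1      0       Ready
 2       Master 0016.4748.dc80     15     0       Ready
"""

# A row: optional "*", the switch number, then a role of Master or Active.
ROW = re.compile(r"^[ \t]*\*?[ \t]*(\d+)[ \t]+(?:Master|Active)\b", re.MULTILINE)


def find_stack_master(show_switch_output):
    """Return the switch number whose role is Master/Active, or None if not found."""
    match = ROW.search(show_switch_output)
    return int(match.group(1)) if match else None


def master_matches(show_switch_output, expected_switch):
    """True when the current master is the switch number the tooling expects."""
    return find_stack_master(show_switch_output) == expected_switch


print(find_stack_master(SAMPLE))   # 2
print(master_matches(SAMPLE, 1))   # False: master has moved off switch 1
```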