wired 802.1x issue

Started by itech, September 25, 2017, 02:45:21 AM

itech

Hi

We have four Cisco 3850 switches in a stack.

Versions:

WS-C3850-48T 03.06.06E
WS-C3850-48T 03.06.06E
WS-C3850-48T 03.06.06E
WS-C3850-48P 03.06.06E

I configured AAA on the 3850 as follows:

aaa new-model
aaa group server radius x
 server name x
 server name x
 deadtime 1
aaa authentication login default group radius local
aaa authentication login NO none
aaa authentication dot1x default group x
aaa authorization exec default group x local if-authenticated
aaa authorization network default group x
aaa accounting dot1x network start-stop group x
aaa session-id common

I configured the switchport as follows:

switchport access vlan x
switchport mode access
authentication control-direction in
authentication port-control auto
dot1x pae authenticator
storm-control broadcast level 50.00
storm-control action shutdown
spanning-tree portfast

I have a Microsoft NPS server, and other switches with the same config have no problem.

But clients on the 3850 don't authenticate, and I get an error like this in the logs:

dot1x-5-result override authentication result overridden for client

I updated IOS to 03.03.07E, but the problem persists.

Is there anyone who can help me?

Thanks

deanwebb

I am here to help. :awesome:

OK, a few questions for you -
0. What switch is *supposed* to be the stack master? Which switch is *currently* the stack master? Sometimes, things happen in a stack and the stack master loses its status. If you reboot the stack, you can get the stack master back to the correct switch. This is important because some communications with the stack are supposed to be handled by the switch that is supposed to be the stack master and if it's not the master, the communications don't work as desired.

1. What do the RADIUS logs say on the RADIUS server? If you don't have logging on, turn it on and do a test client. The most important part of the log for the test client will be towards the end, where it shows both the error message and the cause of the error. If you can post that after scrubbing IP addresses, that would be great.

2. Is this when you also try to change VLAN with CoA? If so, do your switches support CoA changes in their firmware version? It won't be just the version number, but also version type.

3. If you are doing CoA and the firmware supports CoA, then the next question is if the RADIUS server is passing the correct vendor-specific attributes (VSAs) to the switch to change the VLAN. What VSA settings is the RADIUS server set up to send to the switches?

Take a baseball bat and trash all the routers, shout out "IT'S A NETWORK PROBLEM NOW, SUCKERS!" and then peel out of the parking lot in your Ferrari.
"The world could perish if people only worked on things that were easy to handle." -- Vladimir Savchenko
Any questions? No questions! | BCEB: Belkin Certified Expert Baffler | "Plan B is Plan A with an element of panic." -- John Clarke
Accounting is architecture, remember that!
Air gaps are high-latency Internet connections.

dlots

Do you have multiple ways to authenticate (any MAB on any of the ports, or something like that)? This often happens when you fail 802.1X but get through on profiling or MAB.

If you look at
show authentication sessions interface (interface name)
it will tell you why it failed or succeeded. That might help some.

Check and see if there are failed attempts on the RADIUS server for that login, and the switch is just deciding to let the client on.

Here is a guide on how the authentication works
https://www.cisco.com/c/dam/en/us/solutions/collateral/enterprise/design-zone-security/howto_81_troubleshooting_failed_authc.pdf


You might also check with TAC if you can, I think that's a newer switch model, and Cisco's wired 802.1x is buggy on a lot of their boxes.


itech

Quote from: deanwebb on September 25, 2017, 09:34:31 AM

Thanks for your post, deanwebb.
I reviewed the NPS event logs and noticed the following differences between successful and failed authentications:

<Cisco-AV-Pair data_type="1">service-type=Framed</Cisco-AV-Pair>
<Cisco-AV-Pair data_type="1">audit-session-id=C0A862290000102C3D9E5538</Cisco-AV-Pair>
<Cisco-AV-Pair data_type="1">method=dot1x</Cisco-AV-Pair>



deanwebb

What's in the RADIUS logs? Are you able to share those?

itech

Quote from: deanwebb on September 26, 2017, 10:25:04 AM
Hi deanwebb,
These logs belong to a failed PC:

<Event><Timestamp data_type="4">09/26/2017 16:14:07.616</Timestamp><Computer-Name data_type="1">SERVER HOSTNAME</Computer-Name><Event-Source data_type="1">IAS</Event-Source><Service-Type data_type="0">2</Service-Type><Framed-MTU data_type="0">1500</Framed-MTU><Called-Station-Id data_type="1">C4-44-A0-FF-86-8B</Called-Station-Id><Calling-Station-Id data_type="1">C8-5B-76-FA-C3-6D</Calling-Station-Id><Framed-IP-Address data_type="3">PC IP</Framed-IP-Address><NAS-IP-Address data_type="3">SW IP</NAS-IP-Address><NAS-Port-Id data_type="1">GigabitEthernet1/0/11</NAS-Port-Id><NAS-Port-Type data_type="0">15</NAS-Port-Type><NAS-Port data_type="0">50111</NAS-Port><Client-IP-Address data_type="3">SW IP</Client-IP-Address><Client-Vendor data_type="0">9</Client-Vendor><Client-Friendly-Name data_type="1">SW HOSTNAME</Client-Friendly-Name><Cisco-AV-Pair data_type="1">method=dot1x</Cisco-AV-Pair><Cisco-AV-Pair data_type="1">service-type=Framed</Cisco-AV-Pair><Cisco-AV-Pair data_type="1">audit-session-id=C0A8621F0000003204D2ADEE</Cisco-AV-Pair><User-Name data_type="1">PC HOSTNAME</User-Name><Proxy-Policy-Name data_type="1">Local Wired 802.1X</Proxy-Policy-Name><Provider-Type data_type="0">1</Provider-Type><SAM-Account-Name data_type="1">PC HOSTNAME</SAM-Account-Name><Fully-Qualifed-User-Name data_type="1">PC HOSTNAME</Fully-Qualifed-User-Name><Authentication-Type data_type="0">5</Authentication-Type><NP-Policy-Name data_type="1">Local Wired 802.1X</NP-Policy-Name><Class data_type="1">311 1 SERVER IP 09/15/2017 20:11:23 117420</Class><Packet-Type data_type="0">1</Packet-Type><Reason-Code data_type="0">0</Reason-Code></Event>

deanwebb

You would be looking for logs that contain either the RADIUS Access-Accept or Access-Reject message for that client.
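As a rough sketch of how to pull those out of an NPS XML log: each event is one `<Event>` element like the one posted above, and the Packet-Type field carries the RADIUS packet code (1 = Access-Request, 2 = Access-Accept, 3 = Access-Reject, 11 = Access-Challenge). The event posted above has Packet-Type 1, so it is only the request. The helper name `find_results` and the shortened sample events are made up for illustration, and I am assuming the response events carry the same SAM-Account-Name as the request; if they don't in your logs, match on another field such as Class.

```python
import xml.etree.ElementTree as ET

# RADIUS packet codes as they appear in the NPS Packet-Type field.
PACKET_TYPES = {
    1: "Access-Request",
    2: "Access-Accept",
    3: "Access-Reject",
    11: "Access-Challenge",
}


def find_results(lines, field, value):
    """Return (packet name, reason code) for Accept/Reject events where field == value."""
    results = []
    for line in lines:
        line = line.strip()
        if not line.startswith("<Event>"):
            continue  # skip blank or non-event lines
        event = ET.fromstring(line)
        if event.findtext(field) != value:
            continue  # event belongs to a different client
        packet_type = int(event.findtext("Packet-Type", default="0"))
        if packet_type in (2, 3):
            reason = int(event.findtext("Reason-Code", default="0"))
            results.append((PACKET_TYPES[packet_type], reason))
    return results


# Hypothetical, heavily shortened sample events (not real log output).
sample = [
    '<Event><SAM-Account-Name data_type="1">PC HOSTNAME</SAM-Account-Name>'
    '<Packet-Type data_type="0">1</Packet-Type>'
    '<Reason-Code data_type="0">0</Reason-Code></Event>',
    '<Event><SAM-Account-Name data_type="1">PC HOSTNAME</SAM-Account-Name>'
    '<Packet-Type data_type="0">3</Packet-Type>'
    '<Reason-Code data_type="0">16</Reason-Code></Event>',
]

print(find_results(sample, "SAM-Account-Name", "PC HOSTNAME"))
```

On a real log you would pass the lines of the NPS log file instead of `sample`. The Reason-Code on the Access-Reject event is the part that says why the server refused the client.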

itech

Hi,
I finally resolved this issue. :smug: I have another switch, a 2960X running IOS 15.2.6E, with the same problem. When I downgraded it to 15.0(2a)EX5, the RADIUS problem was resolved. I have not tried this on the 3850.
But after the downgrade a new problem occurred: when I try to connect to the switch via SSH or Telnet, the switch reloads unexpectedly. :evil: I looked into the crash logs. Do you have any idea about the source of the problem?

icecream-guy

Quote from: itech on October 04, 2017, 03:58:12 AM

You may be hitting bug CSCud90069. The workaround is to remove "login on-failure log" or "login on-success log" if configured.
Otherwise, upgrade to 15.2(2)E. I'm not sure what any of this does to the RADIUS functionality.

:professorcat:

My Moral Fibers have been cut.

dlots

Quote from: dlots on September 25, 2017, 12:27:42 PM
You might also check with TAC if you can, I think that's a newer switch model, and Cisco's wired 802.1x is buggy on a lot of their boxes.

Called it :-)

deanwebb

Quote from: ristau5741 on October 04, 2017, 06:16:55 AM

Looks like that's the sweet spot where you might get both RADIUS and SSH.

itech

Yes, this is a sweet bug. :)

I removed

login on-failure log
login on-success log

and the login problem is resolved, and authentication with RADIUS succeeds after a long effort. But all of these solutions are for the 2960X with IOS 15.0(2a)EX5.

My question is now:

Which IOS XE version corresponds to 15.0(2a)EX5? I can't find it anywhere. Or can I update the 3850 with 15.0(2a)EX5?



icecream-guy

Quote from: itech on October 06, 2017, 01:12:12 AM

The IOS XE bin image has the IOS version embedded in its name. The closest I could find would be 3.6.0E, which would be 15.2.2E;
all older versions have been deferred.
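As a rough illustration of the naming point, here is a small sketch that pulls both version strings out of an image file name. The file name used is only an example of the layout I am assuming (platform, SPA, IOS XE version, then IOS version), and `versions_from_image_name` is a made-up helper, not a Cisco tool; check it against the actual file names on your download page.

```python
import re

# Assumed layout: <platform>.SPA.<XE major>.<minor>.<maint>.<train>.<IOS version>.bin
IMAGE_RE = re.compile(
    r"\.SPA\.(\d{2})\.(\d{2})\.(\d{2})\.([A-Z]+)"    # IOS XE part, e.g. 03.06.06.E
    r"\.(\d{2})(\d)-(\d+[a-z]?)\.([A-Z]+\d*[a-z]?)"  # IOS part, e.g. 152-2.E6
    r"\.bin$"
)


def versions_from_image_name(filename):
    """Return (IOS XE version, IOS version) or None if the name does not match."""
    match = IMAGE_RE.search(filename)
    if match is None:
        return None
    xe_major, xe_minor, xe_maint, xe_train, major, minor, maint, train = match.groups()
    ios_xe = f"{xe_major}.{xe_minor}.{xe_maint}{xe_train}"
    ios = f"{major}.{minor}({maint}){train}"
    return ios_xe, ios


# Example file name following the assumed layout.
print(versions_from_image_name("cat3k_caa-universalk9.SPA.03.06.06.E.152-2.E6.bin"))
```

With that example name the helper returns the IOS XE version 03.06.06E together with the IOS version 15.2(2)E6.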

Speaking of versions, that 15.0(2a)EX5 is pretty buggy; Cisco recommends 15.2(2)E7.
If that is not possible, you might want to run through the Cisco IOS Software Checker bug list:

https://tools.cisco.com/security/center/swCheckerShowResults.x?technologySelected=&advisoriesSelected=&versionsSelected=205064&versionNamesSelected=15.0%282a%29EX5&productSelected=ios&allAdvisoriesSelectedByTree=N&advisoryType=0&iosBundleId=cisco-sa-20170927-bundle&adv_key=0.8117186554537584
(CCO LOGIN REQUIRED)

and see if you can remediate some of those bugs.

itech

Unfortunately, the same method did not work on the 3850.
I am trying different IOS versions one by one, but I haven't found a suitable one yet. :evil: :evil: :evil:

deanwebb

Quote from: itech on October 16, 2017, 01:57:00 AM

Have you tried 12.2(53) or 12.2(55)? Those IOS versions are quite reliable.