Quote from: AspiringNetworker on March 09, 2015, 01:08:29 PM
I used to work at a place where I discovered the enterprise root bridge was an access switch in a random wiring closet.
:zomgwtfbbq: :wtf: :eek:
My current network is a hospital system. We have multiple independent businesses that reside on campus and have PCs and such scattered all over the place. The old admins found the "best" command to segregate their network from ours and still pass traffic: spanning-tree bpdufilter.
This command is all over the place, in some of the most random spots and not in the places it should be.
After I came on I did a core replacement, and about once a month the network would blip. IM would drop, phones would go offline, Internet would die for a second or two. Every time, the MST root was shifting away from the core. Once I figured this out I traced it to a closet, and what did I find? A single 2950 belonging to one of these companies, configured with spanning-tree vlan 1-4094 priority 0. My MST root priority was 4096...
It was a chain reaction. The 2950 had two uplinks to two separate closets. One had bpdufilter in place and the other had it removed with the core upgrade. Their STP normally blocked the non-filtered path, but every once in a while that other link would fail, STP would reconverge, and the 2950 would take over root for my whole campus.
Not fun, and I am almost done segregating these companies into their own space.
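A minimal sketch of the guardrails that prevent this sort of takeover, assuming Cisco IOS with MST instance 0 carrying all VLANs; the priority and interface values here are only illustrative:

! On the core: claim the best (lowest) priority so it always wins the root election
spanning-tree mode mst
spanning-tree mst 0 priority 0
!
! On any port facing gear you do not control: root guard puts the port into a
! root-inconsistent (blocking) state if a superior BPDU shows up, instead of
! letting that bridge take over as root
interface GigabitEthernet1/0/24
 description uplink toward tenant closet switch
 spanning-tree guard root

Root guard is a far safer fence than bpdufilter, which just throws BPDUs away and hides the problem until it bites.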
Quote from: that1guy15 on March 09, 2015, 04:45:26 PM
My current network is a hospital system. We have multiple independent businesses that reside on campus and have PCs and such scattered all over the place. The old admins found the "best" command to segregate their network from ours and still pass traffic: spanning-tree bpdufilter.
This command is all over the place, in some of the most random spots and not in the places it should be.
After I came on I did a core replacement, and about once a month the network would blip. IM would drop, phones would go offline, Internet would die for a second or two. Every time, the MST root was shifting away from the core. Once I figured this out I traced it to a closet, and what did I find? A single 2950 belonging to one of these companies, configured with spanning-tree vlan 1-4094 priority 0. My MST root priority was 4096...
It was a chain reaction. The 2950 had two uplinks to two separate closets. One had bpdufilter in place and the other had it removed with the core upgrade. Their STP normally blocked the non-filtered path, but every once in a while that other link would fail, STP would reconverge, and the 2950 would take over root for my whole campus.
Not fun, and I am almost done segregating these companies into their own space.
wow....lol.
It's okay; when I first joined my new company we had 2-3 variants of spanning tree running. The Nexus 7K was running at default priority 32xxx. /facepalm.
I have mentioned this before, but when we were doing cleanup we found that all three cores had spanning tree turned off. The only thing that saved this network is that when you turn spanning tree off, the switch will still forward BPDUs.
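For reference, a minimal sketch of how to spot and undo that state, assuming Cisco IOS; the VLAN range is illustrative:

! Confirm what the switch thinks it is running
show spanning-tree summary
!
! In the running config, look for lines like:  no spanning-tree vlan 1-4094
!
! Re-enable it deliberately, one layer at a time:
spanning-tree mode rapid-pvst
spanning-tree vlan 1-4094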
-Otanx
Damn! I've never walked up on a network with STP disabled...
My current network has about 3-4 major design issues that I am reverse-engineering and correcting, each with their own level of head-desk moments. It's landmine after landmine on this guy. I will say, though, it's not the biggest or most glamorous network, but the experience gained and learning is off the charts. I will wear the scars of this network proudly for a long time!!
Quote from: that1guy15 on March 10, 2015, 02:00:11 PM
Damn! I've never walked up on a network with STP disabled...
I have. We acquired a company that did this. Some of the guys we acquired were telling us about how the network would just randomly "blip" and then come back, but it's always happened so don't worry about it. Had the luxury of interviewing one of their Sr. Engineers a few months later. My boss made sure to ask STP questions. LOL, the guy used the same network as an example during the interview: they turned STP off to increase speed. Bahahaha :rofl: :rofl: :rofl: :rofl: :rofl:
Needless to say it was re-enabled and blips suddenly went away. Weird.
Man, I'm glad I did the Hitler Rants about STP being turned off video... :)
Sounds like this needs to be split off into a "war stories" thread. I still have my GRE/QinQ/no CDP/NAT/tunnel network to tell you guys about :)
Consider it done... War stories thread starting in 3... 2... 1...
INCOMING!
What are the best, the worst, and the weirdest networks you ever worked on?
One of the worst for me was the server/comm room that had a working sprinkler system... and then a hub overheated and started to generate smoke... :wall:
I had a client that installed a mini-split air conditioning unit above a server rack in an 8 x 12 closet. One humid day and an improperly installed drain, and that thing started to dump water. I showed up on a Monday morning and my jaw dropped as I opened the door. The saturated drywall was peeling off the wall and had fallen onto the top of the rack. Water was dripping down both sides, etc. Nobody seemed to be half as concerned as I was.
Quote from: deanwebb on March 09, 2015, 02:49:00 PM
Quote from: AspiringNetworker on March 09, 2015, 01:08:29 PM
I used to work at a place where I discovered the enterprise root bridge was an access switch in a random wiring closet.
:zomgwtfbbq: :wtf: :eek:
Default priority - everywhere.
Intermittent wireless issue... we'd have wireless access, then it would go off suddenly, then it would come back on, something was wrong with the hardware... So I asked where the wireless access point was mounted and nobody knew.
So I went to the switch, traced it down... it was IN MY OFFICE!
So I go back to my office and look around and, behind my chair, is the access point. Turns out, every time I got comfy and leaned back in my chair, the wireless went out. Someone would call, I'd sit up to answer the call and, oh, wait, nevermind, the wireless is back on. I'd check email, get bored, lean back again and then RING!
I found the problem: the power cord to the AP was getting pulled out *_just_enough_* when I tilted back for it to unseat in the outlet. When I tilted forward again, it moved back just enough to get power. So I moved it to the top of a bookcase and that ended the intermittent wireless, which had apparently been going on ever since it was set up. We also had better reception with it there, so I was the hero of the day.
Let me see...
1. A 200+ switch layer-two domain using VTP. We never killed the network, but we were extra careful. STP root was left at the default everywhere, so root was an edge switch in a small building.
2. The one I mentioned above. Cores had STP turned off. All other switches were using default STP configurations, so some were running STP and some RSTP, depending on how old the switch's OS was.
3. A "failover" configuration of GRE with two head-end routers terminating remote sites. Sounds like a good plan: if the primary GRE router died, the backup would take over. However, if the backup GRE router was rebooted it killed all traffic. We never did figure out why. We just decommissioned the entire setup (there's a sketch of the usual way to build this after the list).
4. While in the Army, stationed in Korea. Someone took a rack out of a dishwasher, you know, the piece with all the plastic sticks to hold the plates, and mounted that to the wall. They then zip-tied a switch into this and used the dish rack as cable management. I used to have a photo of this, but I can't find it now.
5. Korea again. Used pair-gain DSL modems to get connectivity over copper cable installed during the Korean War. The cable ran under the runway and could not be replaced. Whenever it rained we lost connectivity.
6. Korea again. During a field exercise we needed to link the primary operations center with the backup. Didn't have fiber. Ran an entire box of CAT5 (1000'). Put ends on it, and plugged it in. It worked, but the entire network ran at a crawl.
7. Working with the Marines. Needed a buried cable run of about 400'. How do you dig a trench 400' long? You steal a forklift (sorry, use an expedited government requisition process to get a forklift), remove one fork, and rotate the other fork so it is facing down. Jam the fork into the ground, and drive forward. If the forklift gets stuck, push it with a truck.
8. Rats chewing on an unprotected fiber run in a building? Line up about 50 glue rat traps along the path, and lay the fiber in the middle. Replace glue traps as needed.
9. Wireless signal was poor. The other group running the wireless finally asked for help. We went over in force (5 network engineers) to troubleshoot. Found they had zip-tied the AP to the steel I-beam that held up the roof. This put a good half inch of metal between most of the building and the antenna.
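On number 3: we never did figure out what that design was actually doing, but for anyone curious, the usual way to build a two-head-end GRE failover is tunnel keepalives plus a floating static (or a routing protocol). Sketched below from the remote-site side, with all addressing invented:

interface Tunnel1
 description primary head-end
 ip address 10.255.1.2 255.255.255.252
 tunnel source GigabitEthernet0/0
 tunnel destination 198.51.100.1
 ! GRE keepalives let the tunnel line protocol go down when the far end dies
 keepalive 5 3
!
interface Tunnel2
 description backup head-end
 ip address 10.255.2.2 255.255.255.252
 tunnel source GigabitEthernet0/0
 tunnel destination 198.51.100.2
 keepalive 5 3
!
! Primary path via Tunnel1; the floating static (AD 250) only takes over
! when Tunnel1 goes down and its connected route disappears
ip route 10.0.0.0 255.0.0.0 10.255.1.1
ip route 10.0.0.0 255.0.0.0 10.255.2.1 250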
That is all for now.
-Otanx
A few jobs ago I worked for a company that spent 500K on some beefy servers, but didn't take into account heat dissipation or the need for A/C. It was 110-120 degrees in the server room on a daily basis. The servers burned themselves out in about a year.
Mystery of the down switch: at my last job we did a closet upgrade, and due to the small closets the switches were mounted vertically, ports up, because of short cables. Every so often a switch would go down for unknown reasons; turned out the power plugs were falling out of the back (bottom) of the switch.
Years ago, I was working for a managed security provider. It was probably my 2nd week on the job. Mid-shift, the internet connection became unusable.
It turns out that the circuit was pegged and it was a bunch of SMTP traffic clogging up the link. We didn't host any SMTP servers....
After tracing some cables, a server was discovered, in the ceiling!! It was placed there by a previous admin.
I love switches and stuff in the ceiling.
Oh this is too funny... Then again working in a small company that grew fast I can see how some of this happens.
Not network related, but I used to have a server at my desk (hidden/locked in my overhead, stayed cool)... It was cheaper to req a new desktop and reconfigure it myself than to get a real rack unit up. I just recently had desktop pick it up. Our IT hated me developing all kinds of apps and reports. When they complained I offered to let them take charge, and then they would leave me alone. Lol
Quote from: deanwebb on March 11, 2015, 01:05:39 PM
I love switches and stuff in the ceiling.
We name our closets by floor, then assign letters, such as 1A or 3D. One of my larger remote sites has a 3.5E closet: when you find the 3E closet you look up, and there is a drop-down ceiling tile in the hallway with a switch and patch panel mounted. It looks just like every other tile and you would never know it was there.
Worked in a hospital. One of the branch offices kept losing network connectivity randomly, for a minute at a time. The router and switch would constantly reboot. Replaced both devices, and then found out that on the other side of the wall was an X-ray machine. The walls did not have proper lining.
I used to have a picture around here somewhere that was of a couple servers in a closet sitting on the floor next to the router and switch. Next to the devices was a water heater with the drain plug pointed directly at them. :drama:
This is my new favorite thread.[emoji1]
Quote from: Nerm on March 11, 2015, 04:28:40 PM
I used to have a picture around here somewhere that was of a couple servers in a closet sitting on the floor next to the router and switch. Next to the devices was a water heater with the drain plug pointed directly at them. :drama:
They didn't call that the fire suppression system?
Quote from: that1guy15 on March 10, 2015, 02:00:11 PM
Damn! I've never walked up on a network with STP disabled...
My current network has about 3-4 major design issues that I am reverse-engineering and correcting, each with their own level of head-desk moments. It's landmine after landmine on this guy. I will say, though, it's not the biggest or most glamorous network, but the experience gained and learning is off the charts. I will wear the scars of this network proudly for a long time!!
This made me feel like a feral warrior, wearing a necklace made of teeth from networks that I have battled and conquered.
Active Directory.
If I knew what the problem with it was, I wouldn't be posting about it here... Working with AD issues in a large enterprise is like dealing with guerrilla warfare. Never know when, where, or how the enemy strikes next.
Our network, when I took over, had 5 different EIGRP ASes redistributed into one another (for no real reason), duplicate subnets in use, and no documentation at all.
6 months, I've been working to get this solution put into place. As of the last meeting, I now go exactly back to where I started and start over again. Not only that, but I will now have meetings with the same people I met with, in the order that I first met with them.
The difference this time is that I've now gotten all of their collective *approvals* to have this next cycle of meetings.
"The bureaucracy will expand to fill the needs of the bureaucracy." - anonymous
School system with about 600 users and yes that is an RV042 running the ship. Sad part is under the shelf that the RV042 is sitting on you will find a 2801 (not visible in this pic) terminating the ISP circuit and then feeding the RV042. Saddest part is the reason they were using the RV042 is because their previous IT guy couldn't get to the "web gui" of the 2801.
(http://www.benchaddix.com/images/IMAG0008.jpg)
Seems like spanning tree everywhere!
Similar issue: access switch in a factory as root, around 300 users, VLAN 1 is the only VLAN, with 10 secondary IPs on it, a /23 on every IP. Needless to say we had a lot of broadcast traffic! Also, all the HP switches went EOL/EOS about 5 years ago. Had 3 fail in 1 afternoon (luckily for the business a Sunday, unlucky for me!). No documentation; if another switch was needed it was just plugged into the closest switch to it. Daisy chains everywhere. Had no idea what was connected to what or where. Needless to say I had issues every day until I gradually started winning the battle. The war is not over yet though!
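To picture it, the VLAN 1 interface looked something like this (sketched in Cisco-style syntax for familiarity; the actual kit was HP and the addresses are made up):

interface Vlan1
 ip address 10.1.0.1 255.255.254.0
 ip address 10.1.2.1 255.255.254.0 secondary
 ip address 10.1.4.1 255.255.254.0 secondary
 ! ...and so on for the rest of the ranges
!
! Ten /23s of hosts, but still one VLAN: every ARP and broadcast from any of
! those subnets hits every host in all of them.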
Also, if anyone is interested, I have a pretty long DDoS story (we are an ecommerce retailer who was DDoS'd for a 1 Bitcoin ransom every week!). I could tell you what we did, what the attacker did, and what we've done since. It's a long story though...
Ta
Please do share the story, good sir. That is what this thread is for.
Ok so it begins...
Backdrop: old DC, 2800-series routers, hundreds of old servers; we are migrating to a new fancy DC, not there yet. All the current kit is full of faults, broken fans, etc.
Saturday afternoon last year:
Panic panic, ring everyone, text everyone, the site's down! On-call engineer logs into our DC routers, we keep losing connectivity, manage to get a sh int off, routers are taking huge amounts of bandwidth. Why?! We must be getting DDoS'd. Call the ISP, yup, confirmed DDoS, they blackhole our traffic...
ISP says the attack was 15Gb
Week after:
Look into DDoS mitigation options, speak to providers/vendors etc
Ransom email from the attacker: give me 1 bitcoin and I will stop...
ignore attacker
Signed up for an Incapsula cloud scrubbing trial, haven't changed DNS yet, still speaking to other vendors.
Many vendors' boxes are aimed at ISPs, their price brackets are huge, and the vendors were baffled by our ISP's inability to protect us...
Saturday again:
Attack, Attack!
everyone get in asap... what can we do?!
Turn on the free trial of the scrubber... DNS changes (our TTL is 5 mins), log in to the trial dashboard, see that they are taking hits, and gradually we get some site activity back
Mr attacker not happy about that... must've searched our RIPE records or queried some other addresses... Attacks the head office where we are trying to stop him (must have been luck)
Head office taken down now!
Head office is where all orders are processed, connected to live DC via VPN
Worst thing about the head office is it's the head office for two sister companies, and we have kit serving both companies on the WAN here :|
Mgmt - we need to not have two companies taken out by this, do something to make sure it only affects 1 company if it happens again, can you do it now...
Dual ISPs at the head office: shut the BGP session with one ISP, NAT out of it and run DMVPN over it. That ISP's own public addressing is now being used, unaffected by the DDoS hits.
Luckily we have a temp L2 link between the new DC and the old DC. Re-route all traffic to the old DC via the new DC and the L2 link there.
We can now get to the live DC regardless of WAN saturation and we still have some services online if he hits head office again.
Week after:
Decided on a DDoS provider, paid for emergency install ($$$$): an on-prem appliance that can take 1Gb, backed up with a cloud scrubbing service (they advertise our address space more attractively than us and send clean traffic down GRE tunnels to our routers)
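The return path from the scrubbing service is just plain GRE, roughly this shape on our side, with every name and address invented for the example and the provider specifics glossed over:

interface Tunnel100
 description clean traffic back from the scrubbing centre
 ip address 172.31.255.2 255.255.255.252
 ! source is our router's public address, destination is the scrubbing centre
 tunnel source 198.51.100.10
 tunnel destination 203.0.113.50
 ! GRE overhead means MTU/MSS need clamping or large packets get dropped
 ip mtu 1400
 ip tcp adjust-mss 1360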
Saturday:
Attack Attack!
on prem device and cloud scrubbing service kicks in, site is up during attack, VICTORY!!
attacks head office, we have cloud service only at head office, divert head office to cloud service too, VICTORY!
Meanwhile... police involved; eventually heard that they arrested someone from France who was ransoming loads of companies for 1 bitcoin...
That was a loooong three weeks. I think my colleague and I worked 5 weeks' worth of time in the space of that 3-week period.
Safe to say we're now at the new DC with new equipment, on-prem devices and cloud services, and back to using BGP on both our links at the head office.
Phew!
Came to this job and wanted to know what the network looked like, so I asked for documentation... There isn't any. None. 200+ routers, an unknown number of switches, and no documentation, and we test IA stuff here so no CDP or anything like that. So I started to make some, and here is what I found:
5 EIGRP AS numbers redistributed into one another (for no real reason; see the sketch after this list for the usual loop-prevention fix)
Multiple overlapping IP ranges
VTP client/server mode with no passwords: found that out when I plugged in a new switch and took out the network (oops)
SIA flaps all the time
Incompatible IOS versions that made EIGRP flip out on a regular basis
Huge amounts of the routing done with static routes
1 router that was a horrid NAT hell, with inside/outside randomly slapped on interfaces for no real reason
No QoS on the backbone gear
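On the multi-AS EIGRP mess: if you genuinely must redistribute between two EIGRP ASes, route tags are about the only thing standing between you and a routing loop. A minimal two-AS sketch, with AS numbers and tag values invented for the example:

router eigrp 100
 redistribute eigrp 200 route-map FROM-200
!
router eigrp 200
 redistribute eigrp 100 route-map FROM-100
!
! Going 200 -> 100: drop anything that originally came from AS 100,
! and tag the rest as having been learned from AS 200
route-map FROM-200 deny 10
 match tag 100
route-map FROM-200 permit 20
 set tag 200
!
! Going 100 -> 200: the mirror image
route-map FROM-100 deny 10
 match tag 200
route-map FROM-100 permit 20
 set tag 100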
I had to log into (and conf t some non-trivial changes on) a 2600 the other day... IOS 12.3 and all. Bonus points: via a BRI dial-on-demand circuit. Flashbacks to CCNA classes circa 2006.
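For anyone who never had the pleasure, a legacy dial-on-demand config looked roughly like this; written from memory with hypothetical numbers, so treat it as illustrative only:

interface BRI0
 ip address 192.168.100.1 255.255.255.252
 encapsulation ppp
 isdn switch-type basic-ni
 ! dial the far end when "interesting" traffic (dialer-group 1) shows up
 dialer map ip 192.168.100.2 name REMOTE broadcast 5551234
 dialer-group 1
 dialer idle-timeout 300
!
! Define what counts as interesting traffic
dialer-list 1 protocol ip permit
!
! Static route pointing across the dial-up link
ip route 10.10.10.0 255.255.255.0 192.168.100.2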
I had to actually look up an RV042 lol.
Due to NDAs I couldn't take a picture of something I got to see today, but I will describe it as best I can.
I go onsite to a customer HQ, and in the wiring closet I find all the existing cable runs are documented. And by documented, I mean a piece of cardboard had what the cable ran to written on it, with a hole poked through it and the cable going through the hole.
:zomgwtfbbq:
:wha?:
Cardboard? As in a cut-up piece of a box with marker on it?
Were the edges smooth or rough on the cardboard pieces?
Was it fire-resistant cardboard? Maybe they used it to keep the EMI crosstalk down by properly spacing their non-spec patch cables from China.
Yep just cut up box kind of cardboard. One of them even still had part of a UPS shipping label on it. lol
Quote from: Nerm on August 21, 2015, 07:09:28 AM
Yep just cut up box kind of cardboard. One of them even still had part of a UPS shipping label on it. lol
:facepalm3:
I don't see how that's that bad... maybe I'm just old school and actually remember what a Krone tool looks like, along with all the tags that are tied to the pairs with jumper wire, lol. The cardboard is just a variant of this.
When I am doing a quick and dirty cabling change and I don't have time/tools to use proper sticky labels, I write the port info on a piece of paper, make a hole in the paper and push the cable through it as an improvised tag. Many such tags still exist 'in production' LOL.
I've also installed a 4900M once where the customer rack was too short to fit the rear rails, leaving the front rack nuts clinging on for dear life at a 15-degree or so angle. As there was no shelf to be had, the customer and I tied up a fistful of cat5 to hold up the back as an improvised cradle. I told him to go buy a shelf the next day. He tells me it's still working fine. This customer also happened to use this comms room as a combined cleaner's closet, and there was broken glass on the floor on a regular basis, since the cleaners also throw broken furniture in there, and even spare furnishings (stock paintings, etc.) and folding chairs, so par for the course.
Paper I have seen, but cardboard? That was a first for me.
After the quick 'n' dirty, it's time to do the clean 'n' nice, not to leave the quick fix in place. Get a label gun and go to town on those wires.
AND LABEL BOTH ENDS.
New one...
Switch at main office goes down. CPU spiking bad on it. Culprit is determined to be scanning traffic.
Techs there find three devices doing the scanning: Server1, Server2, and NAC_SERVER_DEANWEBB_IS_RESPONSIBLE_FOR. OK, so that's not our naming convention, but that's kinda what they see. Two generic server names and IP addresses and one very well-named device that they immediately zero in on because it requires no imagination or investigation.
So the techs put an ACL on the switch that blocks all traffic from the NAC box.
Right about that time, people in the main office report that their wireless is dropping them.
A brief investigation uncovers that the NAC box gets the RADIUS request, but claims the WLC isn't set up for dot1x. The WLC says it sent the request, but never received a response. Well, duh!
NAC has not been a problem with its scans for over a year. It didn't need no blocking.
As for the other, generically-named servers? Turns out, they were Qualys scanners set to full blast and the switch was not exempt from their wrath. Neither did the switch have an ACL on it to block Qualys traffic to its management interface, like it was supposed to have.
:ivan:
Resolution: How about unblocking the NAC and doing a little more investigation? If a switch CPU just hit 100% and you have a vulnerability scanner set on "KILL", check the vuln. scanner FIRST, mmmkay?
:tmyk:
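For completeness, the ACL that was supposed to be on that switch is the boring kind. One way to do it (on an access switch with its management IP on a dedicated SVI, all addresses invented) is an inbound filter that drops the scanners and leaves the NAC/RADIUS traffic alone:

ip access-list extended PROTECT-MGMT
 remark drop the vulnerability scanners before they hammer the management plane
 deny   ip host 192.0.2.50 any
 deny   ip host 192.0.2.51 any
 permit ip any any
!
interface Vlan99
 description switch management SVI
 ip access-group PROTECT-MGMT in

Nothing in there touches the NAC box, so dot1x keeps working while the scanners get told to go away.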
Quote from: dlots on July 07, 2015, 07:02:16 AM
Came to this job and wanted to know what the network looked like, so I asked for documentation... There isn't any. None. 200+ routers, an unknown number of switches, and no documentation, and we test IA stuff here so no CDP or anything like that. So I started to make some, and here is what I found:
5 EIGRP AS numbers redistributed into one another (for no real reason)
Multiple overlapping IP ranges
VTP client/server mode with no passwords: found that out when I plugged in a new switch and took out the network (oops)
SIA flaps all the time
Incompatible IOS versions that made EIGRP flip out on a regular basis
Huge amounts of the routing done with static routes
1 router that was a horrid NAT hell, with inside/outside randomly slapped on interfaces for no real reason
No QoS on the backbone gear
That sounds almost typical.
Quote from: AnthonyC on September 29, 2015, 10:16:51 AM
Quote from: dlots on July 07, 2015, 07:02:16 AM
Came to this job and wanted to know what the network looked like, so I asked for documentation... There isn't any. None. 200+ routers, an unknown number of switches, and no documentation, and we test IA stuff here so no CDP or anything like that. So I started to make some, and here is what I found:
5 EIGRP AS numbers redistributed into one another (for no real reason)
Multiple overlapping IP ranges
VTP client/server mode with no passwords: found that out when I plugged in a new switch and took out the network (oops)
SIA flaps all the time
Incompatible IOS versions that made EIGRP flip out on a regular basis
Huge amounts of the routing done with static routes
1 router that was a horrid NAT hell, with inside/outside randomly slapped on interfaces for no real reason
No QoS on the backbone gear
That sounds almost typical.
Uhhhh... yup. Especially in that space, dlots. I tried not to laugh the other day when VTP was listed as selection criteria for new vendors. I would never, ever, use VTP. I'll take pushing VLANs with Ansible or hell even manual copy/paste over VTP any day of the week for exactly that reason. I understand VTP3 is better but... *shudder* - No.. just.. no....
Recently had this fall in my "projects" folder.
Client bought new server running Windows Server Std 2012 R2 and is having trouble getting AD migrated from old server. *note: Onsite only as old server is not connected to the internet.
I get onsite and find that the old server they are trying to migrate from is running Windows Server 2000. WTF? This client has 3 full-time in-house IT personnel and not one of them thought to check the migration path. Worse, why are they still running a 2000 server to begin with?
Quote from: Nerm on November 11, 2015, 03:39:37 PM
Recently had this fall in my "projects" folder.
Client bought new server running Windows Server Std 2012 R2 and is having trouble getting AD migrated from old server. *note: Onsite only as old server is not connected to the internet.
I get onsite and find that the old server they are trying to migrate from is running Windows Server 2000. WTF? This client has 3 full-time in-house IT personnel and not one of them thought to check the migration path. Worse, why are they still running a 2000 server to begin with?
:haha3: :facepalm4:
I haven't been here much but there were a couple of things I've had to crawl through recently.
We have 2 x 300Mb replication circuits for DR/BCP backup & duplication. In July I started noticing they were heavily unbalanced and CIFS replication traffic was coming to a crawl for no reason. Fast forward through 6 weeks of endless testing and I nailed it down to our Dell open storage DR4100 appliances. The replication buffer was written to use best-effort shared memory (long story). Within 3 days they wrote us a patch that created a round-robin replication buffer for unequal link speeds. Just one of those things.
Replication speeds on a 20Gb team were passing at 4-5Gb per flow across our VCF. Turns out the 45 Drives Storinator storage device we purchased 11 months ago came with a Supermicro motherboard sporting 2 x PCIe 2.0 x4 slots. Apparently the Intel X540 dual 10Gb cards require PCIe 3.0 x8, so we're stuck at a 10Gb max flow rate across that one.
Set up some new Dell M1000e blade centers last week: stacked IO aggregators, dual quad-port 10GBASE-T cards, 80Gb LACP to our QFX switches. Tested 31Gb/s across the fabrics today on the new blade servers.
It's 4pm on a friday!!!
"Can we export a private key and send it to a vendor so he can decrypt Wireshark captures?"
:no:
Quote from: deanwebb on August 10, 2016, 08:43:08 PM
"Can we export a private key and send it to a vendor so he can decrypt Wireshark captures?"
:no:
Well, ya can, if you want to trash the whole PKI: generate a new private key and public key, and then whoever asked this silly question has to contact everyone with the old public key and provide them with the new one.
Quote from: deanwebb on August 10, 2016, 08:43:08 PM
"Can we export a private key and send it to a vendor so he can decrypt Wireshark captures?"
:no:
Not sending to a Vendor, but I have made our server teams give me private keys to decrypt traffic when they try to blame the network. One of our security guys didn't know this was a thing, and his head exploded the first time he saw me do it. Also note our normal procedure is create new keys, import them, do troubleshooting, gen new keys, import those. Then I only get keys to decrypt traffic for a short time.
The few times I have gone this far it is after they refuse to believe me that the network is fine, and tell management they can't meet a deadline because the network team can't fix the network. Then I have to go into "Do the server teams job for them" mode. :matrix:
-Otanx
Quote from: Otanx on August 11, 2016, 09:41:27 AM
Quote from: deanwebb on August 10, 2016, 08:43:08 PM
"Can we export a private key and send it to a vendor so he can decrypt Wireshark captures?"
:no:
Not sending to a Vendor, but I have made our server teams give me private keys to decrypt traffic when they try to blame the network. One of our security guys didn't know this was a thing, and his head exploded the first time he saw me do it. Also note our normal procedure is create new keys, import them, do troubleshooting, gen new keys, import those. Then I only get keys to decrypt traffic for a short time.
The few times I have gone this far it is after they refuse to believe me that the network is fine, and tell management they can't meet a deadline because the network team can't fix the network. Then I have to go into "Do the server teams job for them" mode. :matrix:
-Otanx
Already there with you about doing the server guys' jobs...
:notthefirewall:
Not just the server guys; it's doing everyone else's job, just to prove it's not the network...
Fun little tidbit.
Team lead - "People are having issues with the SVN."
Me - "SVN is fine; server is up. Did they at least try to checkout the repo?"
Team lead - "Lemme ask... {5 mins later} Uhh, Nope."
(https://www.networking-forums.com/Smileys/aaron/902.gif)
"We have to reduce headcount, but want to keep the same project schedule. You think you can manage that?"
:no:
So this happened this past Friday.
I work at a local community college as a part-time tutor in the game department. I'm helping the current ProjectDev class create a game as their Technical Director; I basically am in charge of the server and the SVN repo. I host it, I maintain it, and I also help anyone in the class with issues they have with the game engine inside said repo. I have it set up so that people use a standard key pair to tunnel over SSH to the repo through a combination of PuTTY and TortoiseSVN clients, in order to not have to deal with passwords. This server has been live for about 3 weeks or so. We decided to go this route for 4 main reasons: control, 24/7 access, low cost, and flexibility.
So now that that is out of the way, here is the actual story.
We start class on Friday morning and people are connecting to the server without issues. Everything is fine up until around 11, when a few students report that they can't commit their changes. Other students start to report that they can't download the repo at all. Something is wrong. I check all the logs I can think of and find no issue server-side. A few minutes later, my boss and the tech guy for our department come into the class and go directly to the instructor. They basically say that they just got an urgent email AND a phone call from the head of the DISTRICT-WIDE networking office accusing someone in our room of hacking on the network. (lol) They have a list of IP addresses for the computers that were reporting this issue. We inventory those computers and find that, lo and behold, those were the same computers having issues committing. The instructor, the project lead, and I set up a conference call with the guy and explain to him that we aren't doing anything nefarious, and eventually we're able to get the computers unblocked. We were also able to whitelist my IP so that we won't have this issue in the future, so long as I can get my server behind a static IP proxy of some sort.
First of all, how did it take them 3 weeks to find my server? lol. Second of all, why do they care about people SSHing out of the network? Finally, I'm pissed at the instructor because he could've just called the guy on day 1 and explained the situation, and this embarrassment never would've happened.
1. It took that long because they had to place the operation under observation, to make sure you weren't connected to radical moose or lambs. :problem?:
2. They care because everyone in education cares. Don't you care? I used to care when I was teaching. :wub:
3. Agreed, the guy should have informed others about what was going on so you wouldn't have a skunkworks going on.
3 weeks is quick. I don't remember exactly, but typical time frame for a company to detect a breach is over 200 days. They care because that SSH session could have anything in it. It bypasses any DLP or scanning they are doing to prevent data theft. Of course they should know their network, and I doubt you have credit card or PII data on random classroom computers, but that is the idea. IF they knew the network better they could have figured that out.
-Otanx
Quote from: ristau5741 on August 11, 2016, 11:29:41 AM
Not just the server guys; it's doing everyone else's job, just to prove it's not the network...
QFT
Quote from: Otanx on February 07, 2017, 12:36:34 PM
typical time frame for a company to detect a breach is over 200 days.
That's comforting.
Quote from: Hunterman1043 on February 08, 2017, 08:21:54 AM
Quote from: Otanx on February 07, 2017, 12:36:34 PM
typical time frame for a company to detect a breach is over 200 days.
That's comforting.
Well, that's for breaches that are detected. Undetected breaches can go on being undetected for much longer than that.
Quote from: Hunterman1043 on February 08, 2017, 08:21:54 AM
Quote from: Otanx on February 07, 2017, 12:36:34 PM
typical time frame for a company to detect a breach is over 200 days.
That's comforting.
Download the Verizon Data Breach Report. It requires a free registration to download. It covers this stuff and is very well written.
-Otanx
Quote from: that1guy15 on March 09, 2015, 04:45:26 PM
My current network is a hospital system. We have multiple independent businesses that reside on campus and have PCs and such scattered all over the place. The old admins found the "best" command to segregate their network from ours and still pass traffic: spanning-tree bpdufilter.
This command is all over the place, in some of the most random spots and not in the places it should be.
After I came on I did a core replacement, and about once a month the network would blip. IM would drop, phones would go offline, Internet would die for a second or two. Every time, the MST root was shifting away from the core. Once I figured this out I traced it to a closet, and what did I find? A single 2950 belonging to one of these companies, configured with spanning-tree vlan 1-4094 priority 0. My MST root priority was 4096...
It was a chain reaction. The 2950 had two uplinks to two separate closets. One had bpdufilter in place and the other had it removed with the core upgrade. Their STP normally blocked the non-filtered path, but every once in a while that other link would fail, STP would reconverge, and the 2950 would take over root for my whole campus.
Not fun, and I am almost done segregating these companies into their own space.
Wow, I will pretend that I understood anything.
https://www.reddit.com/r/networking/comments/715q8s/all_of_our_ipsec_vpn_tunnels_from_china_went_down/
That must have caused some international headaches :eek:
At the very least... but, proper connections to/from China should be via properly terminated WAN links...
A parenting war story...
My daughter, 15, is serious about doing computer animation. It's a real passion for her, and that's a good thing in my mind. So, as a dad, I want to help out.
Well, I'm not using my lab *all* the time, so I ask her how long her last rendering project took. She says it ran for a day, got only to 10%, and then she canceled it and started clicking through specs for beefier boxes as she dreamed of spare CPU cycles.
I tell her that I've got an 8-core box with 64GB RAM, how about I spin up a VM for her to do the processing?
Her response: OH HELL YES
Gave her a TB of hard drive space, what the hell. Took about 30 minutes to get it all going and to teach her about how to RDP to another box.
She thanked me, went back to finish her school and then started copying files to the server after she was done with classes.
Around 6:18 PM, I hear my server go whhhhhrrrrrrrrrRRRRRRRRRRRRHEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE
Ah, she's started her rendering!
3 hours later, she was at 100% and I was out in front for "Dad of the Year" honors. The server is back to 3% overall CPU utilization, but I know it'll get fired up again when she needs to render again in our family private cloud.
So today a colleague from a sister company learned why VTP can be dangerous. Especially when misconfigured or not configured at all and left in the default state.
Quote from: Nerm on August 17, 2018, 06:24:31 PM
So today a colleague from a sister company learned why VTP can be dangerous. Especially when misconfigured or not configured at all and left in the default state.
:drama:
Go on...
They were migrating some legacy systems from an old environment to a new one. They decided to accomplish this by spanning some VLANs via an L2 cross-connect between the environments. They configured it all up and brought up the cross-connect interfaces. As soon as they did, the new environment they were migrating to went completely dark.
:whatudo:
I got called in to help sort it out, as this environment was running some extremely important production systems. We console into the core switch for the new environment and immediately they were like "what are those VLANs and where did they come from". Since I already mentioned VTP you can already imagine where this is going.
:bole:
The core switch in the new environment had been left mostly in its default config. Basically they unboxed it, added some VLANs and IPs, and built a production environment on top of it. This meant VTP was left in its default "empty" config. The old environment they were migrating from was actively using VTP. When they brought up the cross-connect, VTP did what it was supposed to do: it wiped the VLANs from the new environment's core switch and injected all the VLANs from the legacy environment.
:explosion1:
I got to be the hero and save the day, but the worst part is I had been consulted on this project a while back and I suggested the cross-connect be L3. They had picked L2 because it was "less work".
:facepalm2:
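Cheap insurance before trunking anything between environments is to take the switch out of the VTP game entirely. A minimal sketch, assuming Cisco IOS:

! See what mode, domain, and revision number you are currently running
show vtp status
!
! Transparent mode: the switch keeps its own VLAN database, ignores incoming
! advertisements, and (in VTP v2) just relays them on its trunks
vtp mode transparent
!
! Newer IOS can turn VTP off outright
vtp mode off

Either one would have left those production VLANs exactly where they were.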
Got paged in the middle of the day, yesterday.
Customer couldn't communicate between policy servers, say 172.16.142.x to 172.16.143.x.
Important first question: is this a new service? (New services are not considered emergencies.) Customer says no.
So an hour or so of troubleshooting: the firewall looks fine, I don't see anything in my capture, so I engage a network engineer.
So he's troubleshooting for like an hour, and asks the second most important question: how long has this service been running? Customer says since Saturday.
Third most important question: has it ever worked? Customer says yes. So routing and switching all look good, ARP tables populated, CAM too. All looks good.
So we start on the end-user configuration (which we shouldn't, not our job). We ask about the default gateway: yes, that's set correctly. Ask about the network mask.
We determine it's not correct: the 172.16.142.x clients had a 255.255.224.0 (/19) netmask rather than the correct 255.255.255.224 (/27). With the /19, the clients think 172.16.143.x is on their own subnet, so they ARP for it directly instead of handing it to the gateway. No wonder it didn't work. So the client
rings up the hosting provider and demands to know why the netmask changed. Gotta run, getting paged again.
Ouch. Netmask screwed up.
Was this an automation problem or a bad admin problem?
Quote from: deanwebb on October 23, 2018, 11:53:36 AM
Ouch. Netmask screwed up.
Was this an automation problem or a bad admin problem?
I don't know. I disconnected when the netmask issue was discovered, not a firewall issue.
They said it did work, but I don't think it ever did.
I don't know which is worse, apps guys that lie, or apps guys that know zilch about networking and how their apps work.
When your all like...
:notthefirewall:
...but then it is the firewall.
:frustration:
^ I feel your pain. It can be not-the-firewall 999 times out of 1000, but your credibility's blown on the one time that it is...
99 of the 100 times it is the firewall, it's actually that someone failed to submit the change ticket correctly. The firewall is configured per the change ticket. Go get your "new" change approved, and we will make it work.
-Otanx