Real world use?

Started by vito_corleone, January 03, 2015, 11:23:48 PM


vito_corleone

Anyone currently deploying some type of SDN (ACI, NSX, ODL, etc, etc) in production? I'm seeing some people playing with stuff in the lab, but no production deployments yet.

wintermute000

#1
Sad to say, I've not seen one non-lab instance of a true SDN deployment here in Australia (mid-large enterprise market; my team alone runs 20+ customer environments for large orgs). By true, I mean a software-defined control plane, not deploying virtual appliances and/or scripting, no matter how complex.

If you're talking NFV, I've done quite a few pfSense deployments and re-engineered one of my private customers to an exclusively pfSense + private VLAN hosting architecture (vs his previous unholy mishmash of NAT / FW rules overlaid with VTI IPSEC tunnels... eek). I also deployed such a topology @ my old ISP as a quick and dirty enterprise IaaS stack (i.e. here's a pfSense, here's a VLAN, here's tools to spin up your VMs within your private VLAN, go nuts). I'm seeing increasing acceptance of NFV, with every vendor and his dog virtualising everything from WAN accelerators to WLCs. I think this is where the low hanging fruit is, as it's easier to migrate, can be done piecemeal (router by router, device by device etc.) and is conceptually easier to explain to mgt (though it really complicates your L2 diagrams and you better have good relationships with the vmware teams!!!).

It's getting to the point where I'm almost inclined to (if I were in charge) deploy an NFV-only 2x host, 2x vCenter, 2x SAN 'dual layer collapsed edge' cluster within the DC where all the network appliances can be hosted in a nice redundant topology, separate from the 'client' servers/vCenters. Turn up the MTU and run anything from Q-in-Q to full MPLS into the prod environments - I'd do something like have CSRs or Vyattas as the inside WAN layer and Juniper Firefly vSRXs on the outside - though having said that, I haven't seen the numbers on switching performance for large-scale NFV. I'd still leave the core switching on ASICs for now, so really it should only be WAN traffic (internal or external) relying on software routing. If this was a SP or hosting environment you could run MPLS straight to the CSRs and break the customer VRFs directly into virtual; I'd love to stand up such an environment.
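To make that last bit concrete, here's a rough sketch of what the virtual PE handoff could look like on a CSR 1000v. Purely illustrative - the VRF name, VLAN, AS number, addresses and route targets are all made up, and it skips the IGP/LDP plumbing you'd obviously want in real life:

! hypothetical CSR 1000v acting as a virtual PE
vrf definition CUST-A
 rd 65000:210
 address-family ipv4
  route-target export 65000:210
  route-target import 65000:210
!
! core-facing interface runs MPLS toward the physical P/PE layer
interface GigabitEthernet1
 ip address 10.0.0.2 255.255.255.252
 mpls ip
!
! customer-facing subinterface breaks the VRF straight into its private VLAN
interface GigabitEthernet2.210
 encapsulation dot1Q 210
 vrf forwarding CUST-A
 ip address 192.168.210.1 255.255.255.0
!
router bgp 65000
 neighbor 10.0.0.1 remote-as 65000
 address-family vpnv4
  neighbor 10.0.0.1 activate
  neighbor 10.0.0.1 send-community extended
 address-family ipv4 vrf CUST-A
  redistribute connected

The customer's VMs then just sit in VLAN 210 behind the virtual PE, which is the 'break the VRF directly into virtual' part.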

Having your R&S dependent upon virtual appliances raises interesting and complex questions re: redundancy and behaviour in any combination of hw/sw/vm-stack failures or changes, hence my idea of dedicating a cluster to virtual R&S (no different really from the idea of having separate distinct vcenter servers or a separate ESXi cluster for vcenter/mgt only). Coming from a SP design background I don't like the idea of a host/client layer issue or failure or change (i.e. Vsphere host environment) affecting the underlying R&S that the entire VLAN or subnet hangs off.

I have started reading Ivan P's free books on ipspace.net re: openflow / overlay DC networks, and I would love to get a shot at NSX. The whole idea of tunnels everywhere for control plane traffic kind of scares me, plus there are the very real scaling issues that get raised (eek, too much multicasting lol).

Rambling now, but heck, I need out of my current ops gig and back into solutions/design lol

deanwebb

We've had mention of SDN, and always the response, "What if the SDN controller screws up on the routing to and from itself? The whole network goes down, right?" This being from a school that says hardware eventually breaks and software eventually works, which is firmly planted in our R&S group.

Coming at it from a security side, though, I see that as a big push for SDN, along with our application teams. The app guys are tired of tying access to host names or IP addresses and the security guys are tired of opening and closing firewall ports as the app access needs change over time. Tie the access to an app to an Active Directory account, and now nobody cares where the guy is located, he's getting access to his apps.

And that last sentence is a huge key. That integration with a user directory, needed to control user access to apps, means that AD becomes another potential point of failure or weakness in an SDN system. It's not just what if the SDN controller throws a piston and can't function: what if the AD connection is broken, or the AD itself isn't managed properly? It's like looking behind the drywall in a house. If you don't look, you don't have to fix anything and the house will pretty much pass inspection. If you do look, and you see a problem, then you HAVE to fix it, with all the attendant expenses.
Take a baseball bat and trash all the routers, shout out "IT'S A NETWORK PROBLEM NOW, SUCKERS!" and then peel out of the parking lot in your Ferrari.
"The world could perish if people only worked on things that were easy to handle." -- Vladimir Savchenko
Вопросы есть? Вопросов нет! | BCEB: Belkin Certified Expert Baffler | "Plan B is Plan A with an element of panic." -- John Clarke
Accounting is architecture, remember that!
Air gaps are high-latency Internet connections.

NetworkGroover

I will say that I've definitely seen it in action, but only regularly in large environments that have a strong DevOps staff, or in academia - scattered elsewhere.  Otherwise, it's mostly "parts" or "flavors" of SDN.  Maybe not the same sense of SDN that you're speaking of here, but could still certainly be considered software-defined.
Engineer by day, DJ by night, family first always

AnthonyC

#4
I have seen some NSX initiatives from clients and interest from government entities.  The company I work for has also deployed vCAC for big organizations.
"It can also be argued that DNA is nothing more than a program designed to preserve itself. Life has become more complex in the overwhelming sea of information. And life, when organized into species, relies upon genes to be its memory system."

dlots

Nope

Right now it seems more like a buzzword than anything else to me: lots of hype, and people saying how awesome it is... but no one is implementing it that I can see.

that1guy15

Quote from: deanwebb on January 04, 2015, 11:20:45 AM
We've had mention of SDN, and always the response, "What if the SDN controller screws up on the routing to and from itself? The whole network goes down, right?" This being from a school that says hardware eventually breaks and software eventually works, which is firmly planted in our R&S group.

All the solutions I have seen demos of (Big Switch, Cisco ACI, Pluribus, Nuage) claim they can lose the controller and not stop the flow of traffic. This makes sense, as the controller does not operate the data plane, it just establishes it. But if the controller does go down, nothing on the network can change.

Vito, I've not heard of anyone deploying. The closest I get is hearing about NSX deployments. burneeed runs a pretty advanced DC and I know he is running NSX.
That1guy15
@that1guy_15
blog.movingonesandzeros.net

javentre

Vito - Hit me up next week (via IM), I have some data points for you.
http://networking.ventrefamily.com

Atrum

I've heard a lot of talk from various companies considering actual implementations.
They seem to be getting stuck at the "How does this save us money right now?" question.

scottsee

I am in the middle of overhauling our SAN and VM environment; I'll be implementing the Nexus 1000v for Hyper-V and VMM shortly. I'm not a fan of Microsoft's approach to extensible switching.
scott see

wintermute000

#10
Just sat through an in-house N9K/ACI (APIC) demo. Looks a LOT like a tarted-up Prime TBH - ACI is differentiating from OpenFlow by leaving the control plane local, not centralised. Our team laughed long and hard when we prodded for 'what's underneath' and got the answer of 'well, basically lots of ISIS and MP-BGP for separating virtual domains, and VXLAN overlay for signalling'. Pretty sure the instructor didn't actually know lol.... The more things change.....


I laughed even harder when the answer to 'will it integrate with N1k' was a flat-out no (oh yeah, it has its own new dvswitch).


BTW there is a migration path for 5k/7k planned. Probably coming in with the latest round of nexus bugs baked into the next firmware releases. lol


The reality is that for most enterprises, there isn't a lot of benefit right now. Yeah, you provision faster... how much is that worth to your typical medium enterprise with 5k headcount and 200 virtual servers, never mind the fact that you're only getting the benefit in the DC and you still gotta run all your traditional WAN and campus LANs. I'm in no way saying this will not change in the future - but right now the use case seems squarely focused on service providers and large multi-tenanted hosting/cloud providers (and the REALLY large enterprises).


It's also running into two opposing trends - how cloud (like all utilities) is converging rapidly due to economies of scale, and at the same time how all the big cloud providers have their own SDN stack (e.g. Google, Amazon, Facebook etc.).


Interesting times

that1guy15

Funny, I am hearing a ton of mixed emotions and reports about the 9K/ACI/APIC demos going on. Some are good and very informative, and others say the same as you: they are horrible. If you want a good solid technical overview of what ACI is all about, check out the Network Field Day 9 presentations done last week.

http://techfieldday.com/appearance/cisco-presents-at-networking-field-day-9/

Also while there, check out Brocade's Vyatta controller solution. I'm very interested in what they have going on right now.

I am with you, I can't wait to see how the next 3-5 years play out!
That1guy15
@that1guy_15
blog.movingonesandzeros.net

wintermute000

Well, I didn't mean to say it was horrible, rather that the details of what was behind the curtain were fairly sparse. Anybody can use a GUI and drag and drop a few network components around, if you get my drift.
Frankly speaking, I would love to take the time to properly drive ACI and/or NSX and make a proper evaluation.


Cheers for the link - I'm a big Brocade fan (doubly so re: Vyatta now that I'm down with the JunOS syntax lol)

that1guy15

Nah, it's all good dude; no one really knows what's gonna happen in this space, and even the big vendors are new to it, so it's pretty up in the air.

I am actually a big fan of the couple of vendors that are using BGP for VXLAN transport. It gives tons of control and flexibility in deployments, and also allows communication between distributed controllers and VTEPs across DCs. Those not using BGP are pretty stuck on intra-DC communication and flows currently. Might not be a bad thing though. IDK.
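For a feel of what that looks like on the box, here's a rough sketch of BGP EVPN as the VXLAN control plane on something like an N9K - the VLAN, VNI, AS number and neighbor address are all made up and the syntax is from memory, so treat it as illustrative only, not a working config:

nv overlay evpn
feature bgp
feature nv overlay
feature vn-segment-vlan-based

! map a tenant VLAN to a VXLAN VNI
vlan 210
  vn-segment 10210

! the VTEP learns remote hosts via BGP instead of flood-and-learn
interface nve1
  no shutdown
  source-interface loopback0
  host-reachability protocol bgp
  member vni 10210
    ingress-replication protocol bgp

! MP-BGP carries the EVPN (MAC/IP reachability) routes between VTEPs
router bgp 65000
  neighbor 10.0.0.10 remote-as 65000
    update-source loopback0
    address-family l2vpn evpn
      send-community extended

evpn
  vni 10210 l2
    rd auto
    route-target import auto
    route-target export auto

The point being that once reachability sits in BGP, extending it between DCs (or up to a controller) is just another BGP peering, which is the flexibility I'm talking about.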

When I saw the NFD8 presentation from Cisco (and also a handful of other vendors), Cisco had the feel of being half-baked. But that was 6+ months ago, and they seemed pretty good last week. I would have to get my hands on it more to check it out. They also had one of the more complex solutions, but it looked like there was room for a shit-ton of flexibility.

I see the light at the end of the CCIE tunnel, and one of the things I'm getting most excited about is ripping out my Cisco CSR1K VMs and dropping in OpenStack and the Brocade controllers. Need to get my hands on some hardware VTEPs (where are you, Arista guy?) and start messing around.
That1guy15
@that1guy_15
blog.movingonesandzeros.net

burnyd

Quote from: wintermute000 on February 16, 2015, 12:02:06 AM
Just sat through an in-house N9K/ACI (APIC) demo. Looks a LOT like a tarted-up Prime TBH - ACI is differentiating from OpenFlow by leaving the control plane local, not centralised. Our team laughed long and hard when we prodded for 'what's underneath' and got the answer of 'well, basically lots of ISIS and MP-BGP for separating virtual domains, and VXLAN overlay for signalling'. Pretty sure the instructor didn't actually know lol.... The more things change.....


I laughed even harder when the answer to 'will it integrate with N1k' was a flat-out no (oh yeah, it has its own new dvswitch).


BTW there is a migration path for 5k/7k planned. Probably coming in with the latest round of nexus bugs baked into the next firmware releases. lol


The reality is that for most enterprises, there isn't a lot of benefit right now. Yeah, you provision faster... how much is that worth to your typical medium enterprise with 5k headcount and 200 virtual servers, never mind the fact that you're only getting the benefit in the DC and you still gotta run all your traditional WAN and campus LANs. I'm in no way saying this will not change in the future - but right now the use case seems squarely focused on service providers and large multi-tenanted hosting/cloud providers (and the REALLY large enterprises).


It's also running into two opposing trends - how cloud (like all utilities) is converging rapidly due to economies of scale, and at the same time how all the big cloud providers have their own SDN stack (e.g. Google, Amazon, Facebook etc.).


Interesting times

I had the 2-day primer, I guess you could say. The web GUI is not that great. Traffic cannot be redirected properly, i.e. into a 3rd-party load balancer or firewall; you need to actually IP things inside of ACI, you can't just direct it at the ACI fabric.

You need multiple EPGs to accomplish the same thing in most cases, due to the leaf switches not being able to tell whether packets are encapsulated in their special header or not.  It's kind of a mess right now.  They talk all the shit in the world about VMware/NSX, but their product is barely able to work in a production environment.  Just sayin...