Syslog

Started by LynK, November 20, 2015, 12:40:24 PM

deanwebb

http://www.sans.org/reading-room/whitepapers/logging/ins-outs-system-logging-syslog-1168
https://tools.ietf.org/html/rfc5424

^ Gonna read those today before diving into marketing slicks. Always best to be armed with knowledge.
Take a baseball bat and trash all the routers, shout out "IT'S A NETWORK PROBLEM NOW, SUCKERS!" and then peel out of the parking lot in your Ferrari.
"The world could perish if people only worked on things that were easy to handle." -- Vladimir Savchenko
Any questions? No questions! | BCEB: Belkin Certified Expert Baffler | "Plan B is Plan A with an element of panic." -- John Clarke
Accounting is architecture, remember that!
Air gaps are high-latency Internet connections.

jinxer

We/I use syslog-ng and ELK. It's much like Splunk, just open source, and it doesn't have all the pre-made plugins.

deanwebb

Reading over the product descriptions for Arcsight, Splunk, Secnology... now I'm wondering if what we really want is a network monitoring solution, whether or not it accepts syslog as an input. These products all seem like they're set up as big data solutions and what we want most is something that will fire off an alert when a switch fan is about to fail or when a router interface goes down.
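
For the "fire off an alert" part, a rough sketch of what the filtering boils down to, whatever product ends up doing it. Every syslog message starts with a <PRI> value, and per RFC 5424 (linked above) PRI = facility * 8 + severity. The sample messages and the alert threshold below are made up for illustration, not taken from any real device.

```python
import re

# Severity names in RFC 5424 order: 0 is the most severe, 7 the least.
SEVERITIES = ["emerg", "alert", "crit", "err", "warning", "notice", "info", "debug"]

PRI_RE = re.compile(r"^<(\d{1,3})>")


def decode_pri(line):
    """Return (facility, severity) from the leading <PRI>, or None if absent."""
    match = PRI_RE.match(line)
    if match is None:
        return None
    pri = int(match.group(1))
    return pri // 8, pri % 8


def should_alert(line, threshold=3):
    """True when the severity is at or above 'err' (threshold is an assumption)."""
    decoded = decode_pri(line)
    if decoded is None:
        return False
    _, severity = decoded
    return severity <= threshold


# Hypothetical messages: 187 = 23 * 8 + 3 (local7.err), 189 = 23 * 8 + 5 (local7.notice)
fan_msg = "<187>sw1: fan speed low (hypothetical message)"
login_msg = "<189>sw1: user logged in (hypothetical message)"
```

Here should_alert(fan_msg) is true and should_alert(login_msg) is false. A real deployment would match on the message text as well, since vendors do not assign severities consistently.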

wintermute000

I've generally seen all-in-one type solutions in smaller networks (10k endpoints etc.), but it sounds like your network is huge, and you're going to want a proper distributed DB backend anyway - make sure your integrated monitoring solution scales to the appropriate size and availability.

Don't discount the issue of performance/indexing/searching either, with a huge DB that's taking tens of thousands of syslogs a day (or hundreds of thousands a day if someone turns on a debug? LOL). It may be a wiser move long term to have a separate back end and front end.

burnyd


Otanx

Syslog is something I work with a lot at my current job. A good document on this is actually from Cisco:

Building Scalable Syslog Management Solutions
http://www.cisco.com/c/en/us/products/collateral/services/high-availability/white_paper_c11-557812.html

We use syslog for troubleshooting, cyber incident correlation, alerting and triggering tasks, etc. Based on what you posted above you are probably going to want a distributed model where you put a collector at each location. Then you can aggregate that back to a main central system from each collector. The main reason to do this is that syslog is UDP by default, so a message lost on the way is never retransmitted, and a collector close to the source keeps that path short. Some systems will allow you to change this, but not everything.

Even the systems that do support syslog over TCP have some issues. I have seen two fairly major applications fail at TCP syslog. One basically treated it as UDP, just using TCP. It would send the log message, but not buffer the message until it was acknowledged. So a packet would get dropped, and the receiver would constantly request a retransmission which the sending system couldn't do (it had already flushed the message). Eventually the connection would reset. The second one would buffer the messages, but never had a timeout. So if the TCP session could not establish, the messages would buffer until the system ran out of memory. Both of these were reported to the vendors (and I hope fixed), but we stopped trusting TCP syslog implementations.
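
To make the two failure modes concrete, here is a minimal sketch of a sender-side buffer that avoids both: a message stays queued until the send succeeds, and the queue is bounded by both count and age so it cannot eat all the memory. The class name, the limits, and the send callable are all hypothetical; this is not how either vendor implemented (or fixed) their product.

```python
import time
from collections import deque


class BoundedLogBuffer:
    """Hold messages until delivery succeeds, with a size cap and an age cap."""

    def __init__(self, send, max_messages=1000, max_age_seconds=60.0,
                 clock=time.monotonic):
        self._send = send              # callable(message) -> True on success
        self._queue = deque()          # (enqueue_time, message) pairs, oldest first
        self._max_messages = max_messages
        self._max_age = max_age_seconds
        self._clock = clock
        self.dropped = 0               # count of messages discarded by either cap

    def enqueue(self, message):
        # Size cap: discard the oldest message instead of growing without limit.
        if len(self._queue) >= self._max_messages:
            self._queue.popleft()
            self.dropped += 1
        self._queue.append((self._clock(), message))

    def flush(self):
        """Try to deliver queued messages in order; return how many were sent."""
        now = self._clock()
        sent = 0
        while self._queue:
            queued_at, message = self._queue[0]
            # Age cap: give up on messages that have waited too long.
            if now - queued_at > self._max_age:
                self._queue.popleft()
                self.dropped += 1
                continue
            # Only remove a message after the send reports success.
            if not self._send(message):
                break
            self._queue.popleft()
            sent += 1
        return sent

    def __len__(self):
        return len(self._queue)
```

The send callable stands in for whatever actually writes to the socket. Dropping the oldest message first is a design choice; some shops would rather drop the newest and keep the start of an incident.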

We drop a collector at major sites, and have everything at that site pointed there using normal syslog. Small sites that don't rate their own collector get pointed to the most reliable major site. Then the systems guys do magic, and pull the files back to the central server, which then forwards them to different systems. Our SIEM gets a copy, a generic syslog-ng box gets a copy so us geeks can grep stuff, Splunk gets a copy so management types can see pretty graphs, etc. The downside is the central collector is not real time. I think the lag is about five minutes because of the "systems magic" that collects everything. This isn't a big headache, because if I am trying to do something in real time I am looking at local logs.

Some other issues you will run into.

1. There is a very common system in use almost everywhere that does not support syslog natively: Windows. This means installing an agent on all your Windows systems. We all love installing more agents, right?
2. Cisco rate limiting syslog. Normally not an issue, but Cisco will log at the rate of the slowest destination it is set to log to. This means that if you are logging to the console, syslog will be rate limited to 9.6 kbps. If you are logging ACLs you can easily exceed this. Easy fix, just do "no logging console". There is a log message that shows up when syslog is dropped because of this. Grep for rate-limit in your Cisco logs.
3. Make sure time is set correctly on everything. It really sucks to try correlating messages when one system is in PST, one EST, and one is not synced to time at all, and is 12 minutes off UTC.
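
On point 3, a small sketch of why normalizing everything to UTC makes correlation painless. The host names and the per-host offsets are made up, and fixed offsets ignore daylight saving time, so treat this as an illustration only. The real fix is NTP plus logging in UTC at the source.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical hosts and the fixed offsets their clocks are assumed to use.
SOURCE_OFFSETS = {
    "router-west": timezone(timedelta(hours=-8)),  # PST
    "router-east": timezone(timedelta(hours=-5)),  # EST
}


def to_utc(host, local_timestamp):
    """Convert a 'YYYY-MM-DD HH:MM:SS' local timestamp from host to UTC."""
    naive = datetime.strptime(local_timestamp, "%Y-%m-%d %H:%M:%S")
    aware = naive.replace(tzinfo=SOURCE_OFFSETS[host])
    return aware.astimezone(timezone.utc)


# The same instant as logged by two devices in different time zones.
west = to_utc("router-west", "2016-01-23 08:00:00")
east = to_utc("router-east", "2016-01-23 11:00:00")
```

Both values come out as 16:00:00 UTC, so the two log lines sort next to each other instead of three hours apart. A device that is not synced at all cannot be fixed this way, because nobody knows its offset.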

-Otanx

deanwebb

Great pointers and points, thanks. This syslog will only be for the network devices. Server guys can go pound sand if they want to use it for anything. :)

Otanx

If you only do network devices you are missing out. Let's look at a common issue. Server guy calls and says your firewall is blocking him. Surprisingly, he knows the source and destination IP and ports. You look at your logs, which only cover network devices. No logs, so you tell him it isn't the firewall because there are no denies being logged. However, if you had system logs as well, you might see a log message that says:

"Aug  4 13:23:00 centos kernel: IPTables-Dropped: IN=em1 OUT= MAC=a2:be:d2:ab:11:af:e2:f2:00:00 SRC=192.168.2.115 DST=192.168.1.23 LEN=52 TOS=0x00 PREC=0x00 TTL=127 ID=9434 DF PROTO=TCP SPT=58428 DPT=443 WINDOW=8192 RES=0x00 SYN URGP=0"

Then you get to point out to the systems guy that it was his own system blocking the connection, and that he will need to fix that. Then you get to ask why you as a network guy are better at troubleshooting system issues than he is. I find that shaming systems guys is the only way to train them to troubleshoot first before blaming the network.

Being serious, the more things you can get logs from, the better. Otherwise you will be missing important information that may help solve a problem.
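
If you want to search for that kind of drop by flow rather than by eye, the KEY=VALUE layout of the message above is easy to pull apart. A minimal sketch, using the log line from this post as the sample; the function names are made up for illustration.

```python
import re

# The iptables log line quoted above, used as sample input.
LINE = (
    "Aug  4 13:23:00 centos kernel: IPTables-Dropped: IN=em1 OUT= "
    "MAC=a2:be:d2:ab:11:af:e2:f2:00:00 SRC=192.168.2.115 DST=192.168.1.23 "
    "LEN=52 TOS=0x00 PREC=0x00 TTL=127 ID=9434 DF PROTO=TCP SPT=58428 "
    "DPT=443 WINDOW=8192 RES=0x00 SYN URGP=0"
)

FIELD_RE = re.compile(r"(\w+)=(\S*)")


def parse_fields(line):
    """Collect KEY=VALUE pairs; bare flags such as DF and SYN are skipped."""
    return dict(FIELD_RE.findall(line))


def matches_flow(fields, src, dst, dport):
    """True when the parsed line is for the given source, destination and port."""
    return (
        fields.get("SRC") == src
        and fields.get("DST") == dst
        and fields.get("DPT") == str(dport)
    )


fields = parse_fields(LINE)
hit = matches_flow(fields, "192.168.2.115", "192.168.1.23", 443)
```

With the sample line, hit is true, which is exactly the evidence you hand back to the server guy.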

-Otanx

deanwebb

I agree 100% about the more devices logged, the better.

In my firm, however, we're so big that the server guys are a tower, network is a tower, client is a tower, etc. Towers everywhere. Backups and antimalware are towers, that's how many towers we got. Getting cross-tower cooperation is culturally... difficult... I know because I'm doing NAC and having to build all those bridges and keep the ones that are built up from burning down.

So, the solution is network device-only. We need the monitoring set up soon, as there's another solution (Orion) about to expire.

wintermute000

Quote from: Otanx on January 23, 2016, 11:21:05 AM
Server guy calls and says your firewall is blocking him. Surprisingly he knows the source and destination IP and ports.

What is this fantastic parallel universe of which you speak?  :glitch:

wintermute000

Attached a screenshot off my lab Splunk - as you can see, it's very much a DB interface

SimonV

We will be rolling out Graylog, a small one-VM setup to begin with, to index our switches and some FWs. I have it running at home and it's already alerting me for a handful of events.

You can have it listen on different ports for different classes of devices, then create streams, counters and statistics based on every aspect of the log messages. Everything you can configure is just Elasticsearch syntax, which you can copy and paste to other parts of the system. Takes some Linux-fu to get it up and running, but once the basics are in it's very straightforward.

And it's open source  :professorcat:

deanwebb

Fun twist... looks like I won't be doing a syslog vendor comparison, after all.

Someone didn't just consider Splunk. He executed a PO and bought Splunk. Lots of it. Because, Splunk!

Looks like I get to read a lot of Splunk documentation in the next few days.

:challenge-accepted:

AnthonyC

Quote from: jinxer on January 22, 2016, 02:12:00 PM
We/I use syslog-ng and ELK. It's much like Splunk, just open source, and it doesn't have all the pre-made plugins.

Neat, we are using similar toolchains to build our own but we use fluentd instead of syslog-ng.
"It can also be argued that DNA is nothing more than a program designed to preserve itself. Life has become more complex in the overwhelming sea of information. And life, when organized into species, relies upon genes to be its memory system."

icecream-guy

Quote from: deanwebb on January 28, 2016, 06:36:42 PM
Fun twist... looks like I won't be doing a syslog vendor comparison, after all.

Someone didn't just consider Splunk. He executed a PO and bought Splunk. Lots of it. Because, Splunk!

Looks like I get to read a lot of Splunk documentation in the next few days.

:challenge-accepted:

Splunk is very memory- and CPU-intensive; without enough of both, it runs like a snail.
:professorcat:

My Moral Fibers have been cut.