Archive for the Sguil Category

Virtual Private Onions

Posted in Crazy Plans, Crypto, NSM, Sguil on 8 October, 2012 by Alec Waters

If you’ve not checked out Security Onion (SO) yet, you really should. It’s a powerhouse Linux distro, running everything an analyst could need to carry out effective Network Security Monitoring (NSM). The latest beta is looking fantastic; watch the video and try it out, you won’t be sorry!

I’m not going to talk about SO itself (Mother Google has plenty of good things to say); instead I’m going to look at the underlying network infrastructure.

A major component of SO is Sguil, a distributed monitoring system that has sensor and server components that are used for monitoring the network and aggregating output from multiple sensors (you can see a Sguil schematic here to understand how it all fits together). The comms between the sensors and the server are protected by SSL, and SO builds on this with autossh tunnels to protect other traffic streams from interception. Ubuntu’s Uncomplicated Firewall (UFW) opens only the necessary ports to allow SO sensors to talk to their server, so all the installations are protected to a high degree.

If all of the connections between the sensors and the server are on your own infrastructure, this is fine (perhaps you have a management VLAN specifically for this purpose, for example). There is however another use case I can think of, where we like to use a slightly different method of sensor-to-server communications.

NSM As A Service?

Let’s say you’re in the business of selling NSM to clients, either short-term (“help me understand what’s on my network”) or long-term (“please be our NSM operators in return for huge bushels of cash”). This will probably involve dropping off SO sensor boxes on customer networks and having them all report in to an SO server back at base. The SO sensors will probably be behind the customer’s NAT and firewalls, but this doesn’t matter, the comms will still work. The network of sensors will look a bit like this:

This particular use case isn’t without issue, though – here is where I think that a slightly different approach to the underlying SO infrastructure has value. Consider the following:

  • The Sguil interface shows you that one of the agents on one of the sensors is down. How do you ssh in to fix it if the sensor is behind someone else’s NAT and firewalls? How do you even know which Internet IP address you should use to try?
  • You suspect one of the sensors is under excessive load. It’d be great to install an snmp agent on the sensor and point Munin at it to try and find the stressed metric. Again, how do we contact the sensor from back at base?
  • You want to implement a custom information flow from the sensor back to the server. You need to think about the security of the new flow – can you implement another autossh tunnel? What if the flow is UDP (e.g., snmp-trap)?
  • One of the sensors is reported lost or stolen. How can you easily stop it from connecting back to base when it’s possibly now in the hands of threat actor du jour?

One possible answer is to build a custom “underlay” network to carry all of the traffic between SO sensors and the server, catering for existing flows and anything else you think of in the future. As a side benefit, the sensors will all have static and known IP addresses, and contacting them from the server will be no problem at all.

Virtual Private Onions

We’ll accomplish this by using OpenVPN to create a secure, privately- and statically-addressed cloud that the sensors and server will use to talk to each other (sorry for using the word “cloud”!). Take our starting point above, showing our distributed sensors at client sites behind client firewalls. By “underlaying” the SO comms with an OpenVPN cloud we’ll end up with is something like this:

The client’s firewalls and NAT are still there, but they’re now hidden by the static, bi-directional comms layer provided by OpenVPN. The sensors all talk over the VPN to the server at; the server in turn is free to contact any of the sensors via their static addresses in the subnet.


I don’t intend for this to be a HOWTO; I want to wait until people tell me that what I’m proposing isn’t a galactic waste of time and completely unnecessary! Conceptually, it works like this:

The SO Server takes on the additional role of an OpenVPN server. Using OpenVPN’s “easy-rsa” feature, the Server also becomes a Certificate Authority capable of issuing certificates to itself and to the sensors.

OpenVPN gives the SO Server a routed tun0 interface with the IP address (note routed – if we wanted to, we can give other subnets back at base that are “behind” the Server access to the SO Sensors via the VPN).

The Sensors become OpenVPN clients, using certificates from the Server to mutually authenticate themselves to it. The Sensors know to point their OpenVPN clients at the static Internet-facing IP address of the server back at base, getting a tun0 interface and a static IP address out of in return.

When setting up the SO components on the Sensor boxes, the sosetup script is told that the server is at via tun0, ensuring that all SO comms go over the VPN.

The VPN carries all traffic between Sensors and Server, be it Sguil’s SSL connections, autossh tunnels, SNMP traps or queries, raw syslog forwarding, psychic lottery predictions, anything. The Server is also free to ssh onto the Sensors via their static VPN addresses.


We’ve implemented OpenVPN with SO in this manner, and it works well for us. I hope this approach has some advantages:

  • It makes for easier firewalling at the Server and Sensor ends. You only need to open up one port, which is by default 1194/UDP. All of the VPN traffic is carried over this port.
  • You have deterministic sensor IP addressing, something you wouldn’t otherwise have had when your sensors were deployed on foreign networks behind NAT, firewalls, and dynamic IP addresses on customer DSL routers.
  • You don’t need to worry about setting up additional autossh tunnels when you dream up a new flow of information  between Sensor and Server.
  • You can easily talk to the sensors from the server, for ssh sessions, SNMP polls, or anything else.
  • If a sensor is lost or stolen, all you need to do is revoke its certificate at the Server end and it won’t be able to connect to the VPN (more on sensor security in a moment).

There are of course some disadvantages:

  • Complexity, which as we all know is the enemy of security.
  • Double encryption. OpenVPN is encrypting stuff, including stuff already encrypted by Sguil and autossh.
  • <insert other glaring flaws here>

Security considerations

On the Server side, keep in mind that it is now a Certificate Authority capable of signing certificates that grant access to your SO VPN. Guard the CA key material well – maybe even use a separate system for certificate signing and key generation. Also, once you’ve generated and deployed certificates and private keys for your sensors, remove the private keys from the Server – these belong on each specific sensor and nowhere else.

On the Sensor side, we have a duty of care towards the clients purchasing the “NSM As A Service”. OpenVPN should be configured to prohibit comms between Sensor nodes, so that a compromise of one Sensor machine doesn’t automatically give the attacker a path onto other customer networks.

If a Sensor machine is lost or stolen whilst deployed at a customer site, we can of course revoke its certificate to prevent it from connecting to the VPN. That’s great, but the /nsm directory on it is still going to contain a whole bunch of data that the client wouldn’t want turning up on eBay along with the stolen sensor.

I have a plan for this, but I’ve not implemented it yet (comments welcome!). The plan is to have the /nsm directory encrypted (a TrueCrypt volume? something else?), but to have the decryption key stored on the SO Server rather than on the Sensor. When the Sensor starts up, it builds its OpenVPN connection to the Server before the SO services start. Using the VPN, the Sensor retrieves the decryption key (one unique key per Sensor) from the Server (wget? curl? something else?), taking care never to store it on the Sensor’s local disk. Decryption key in hand, the Sensor can decrypt /nsm and allow the SO services to kick off.

In this way, a stolen Sensor box represents minimal risk to the client as the /nsm directory is encrypted, with the keymat necessary for decryption stored on the Server. Provided that the theft is communicated to the NSMAAS provider in an expedient fashion, that keymat will be out of reach of the thief once the OpenVPN certificate is revoked.

(Other, crazier, plans involve giving the Sensors GPS receivers and 3G comms for covert phone-home-in-an-emergency situations…)

So there you have it…

…Virtual Private Onions. Any comments?

Alec Waters is responsible for all things security at Dataline Software, and can be emailed at

The Adventures of Packet Tracy, PI

Posted in NSM, Packet Tracy, Sguil on 14 May, 2012 by Alec Waters

My name’s Tracy, Packet Tracy. I’m a PI. It says so on my door. Last Monday morning began like every other with a fumble through the fog of a hangover looking for new and unexplained bruising. The pounding in my head was aided and abetted by the pounding on my door, and aspirin couldn’t fix both. I dragged myself towards the noise and let in a dame wearing stilletos and a scowl that could shatter concrete at fifty paces. She said her rat of a boyfriend had been flirting with some internet floozy, and all the proof was in the pcap on the USB stick that left another bruise after bouncing off my head. No problem, I said, come back Friday for chapter and verse.

The case was tougher than I thought; the pcap was bigger than my last bar tab and there’s no way I want to go over that line by line either. So I ran it past my Bro and a couple buddies of mine, Suri and Carter, but none of us could come up with the dirt. The dame’s been on my back all week for results, and now it’s Sunday and the pounding is back with a vengeance. She’s at the door, calling me names I’m pretty sure my mother never gave me. I considered leaving by the window until I remembered my office is in a basement and there aren’t any.

Time’s up. Gotta face the music… 

So how come PT couldn’t come up with the goods? The evidence was there, surely it was just a matter of time before he found it?

I’m sure a good many of us have been in needle-in-a-haystack situations where we’ve had to rely on some tool or other to process a gigantic pcap. If the tool doesn’t find what we’re looking for, either it’s not there to find or the analyst or tool itself wasn’t up to the job.

So are all tools created equal? Given the same pcap and the same task, will they return the same results?

I thought I’d run a number of URL extraction tools against a standard 128MB pcap, the kind you might find on a Security Onion box, and I set up a quick deathmatch cage fight between httpry, urlsnarf, Suricata, tshark and Bro. All versions are the ones shipped with SO, and the config files haven’t been tweaked in any way from standard.

I realise that this is hardly a thorough and scholarly test, but I think that the results show a point worth making.

The tools were invoked like this:

user@sensor:~/httpexperiment$ httpry -o httpry.log -r capture.pcap
httpry version 0.1.6 -- HTTP logging and information retrieval tool
Copyright (c) 2005-2011 Jason Bittel <>
Writing output to file: httpry.log
5230 http packets parsed

user@sensor:~/httpexperiment$ urlsnarf -n -p capture.pcap > urlsnarf.log
urlsnarf: using capture.pcap [tcp port 80 or port 8080 or port 3128]

user@sensor:~/httpexperiment$ suricata -c /etc/nsm/sensor-bond0/suricata.yaml -r capture.pcap
<tons of output>
4/4/2012 -- 15:35:58 - <Info> - HTTP logger logged 2206 requests

user@sensor:~/httpexperiment$ tshark -r capture.pcap -R http.request -T fields -e frame.time -e ip.src -e ip.dst -e tcp.srcport -e tcp.dstport -e http.request.method -e -e http.request.uri -e http.request.version > tshark.log

user@sensor:~/httpexperiment$ bro -r capture.pcap

Now, we need to process httpry’s output a little, since it includes server responses as well as client requests. Once done, we can see how many HTTP requests were seen by each tool. The results were a little surprising to begin with:

httpry: 2567 requests seen
tshark: 2557 requests seen
Bro: 2240 requests seen
Suricata: 2206 requests seen
urlsnarf: 2198 requests seen

httpry and urlsnarf are often (probably quite rightly) viewed as less sophisticated than Suricata and Bro, and tshark might not be the tool of choice for this kind of work, especially when you’re constantly monitoring traffic on an interface rather than processing a pcap. Bro and Suricata certainly picked up things that httpry didn’t (HTTP on non-standard ports, for example), but there’s quite a gulf between the apparently less-sophisticated and the cutting-edge.

So why? Each tool does the job in a different way, and with its own set of pros and cons. httpry, for example, only listens on ports 80 and 8080 by default, has no support for VLAN-tagged frames, and performs no TCP session reassembly so it is vulnerable to evasion. On the other hand, the more sophisticated Bro and Suricata have, to my knowledge, no such shortcomings.

However, it’s httpry’s brute simplicity that I suspect put it on top of the pile for this particular test. All it looks for is one of a nominated set of HTTP method verbs at the beginning of a TCP segment. It doesn’t care about proper TCP session establishment or teardown, all it’s looking for are a few bytes in the proper place. In contrast, other tools may not log any URLs at all if the TCP session isn’t complete and well-formed.

Each tool’s results were a subset of the others’ collective results – no one tool captured the superset, and one could probably achieve “better” (or at very least, “different”) results by tweaking each tool. Even then I reckon there would still be a pretty good chance that there are undiscovered URLs lying there that a different tool could pick up.

I don’t mean for this to be seen as some kind of ranking of tools, or a statement that httpry and tshark are “better” URL extractors than Bro and Suricata; all I’m trying to say is that you’re almost certain to miss something by only using a single tool for a job like this. On the flip side of course, running two or more URL extractors might seem like a galactic waste of resources…

Your mileage will almost certainly vary, but it’s worth keeping this in mind when the dame is at the door demanding answers!

Alec Waters is responsible for all things security at Dataline Software, and can be emailed at

net-entropy Sguil agent and wiki

Posted in net-entropy, NSM, Sguil on 6 October, 2009 by Alec Waters

The story so far:

I’ve written a basic Sguil agent that will upload net-entropy’s RISING ALARM messages into Sguil. You can download the agent here, and the config file here.

On a Sguil sensor that has net-entropy installed, copy the agent to wherever your other agents live (/usr/local/bin on my system), and the config file to where your other config files live (/etc/nsm/sensor1/ on my system). Then fire it up:

   -c /etc/nsm/sensor1/net-entropy_agent.conf

With a bit of luck, you’ll see the agent register in the Sguil client:

net-entropy sguilAnd we’ll start to see net-entropy messages appear, too:

net-entropy sguil eventsThe bottom right pane of the Sguil client will behave as it does for the PADS agent, and will show you the event detail:

net-entropy sguil detailSguil will correlate these events in the usual fashion, and allow you to right-click and say “Transcript” or “Wireshark”. It all seems to work pretty well!

Finally, the net-entropy project has a new wiki – it’s here. This is the place to go for the latest source code, which now includes a Paninski entropy estimator in addition to the original Shannon estimator. Have fun!

Alec Waters is responsible for all things security at Dataline Software, and can be emailed at alec.waters(at)

Detecting encrypted traffic with frequency analysis – Update

Posted in Crazy Plans, net-entropy, NSM, Sguil on 2 September, 2009 by Alec Waters

I recently wrote about a plan for detecting encrypted traffic, where I mentioned in the comments that I’d come across a package called net-entropy (very detailed writeup here). I’ve been in touch with Julien Olivain, one of the authors, and he’s kindly given me the sources to experiment with.

And experiment I shall – I’ll post my findings when I’ve got some!

Alec Waters is responsible for all things security at Dataline Software, and can be emailed at alec.waters(at)

Detecting encrypted traffic with frequency analysis

Posted in Crazy Plans, net-entropy, NSM, Sguil on 12 August, 2009 by Alec Waters

Let’s start with a little disclaimer:

I am not a cryptanalyst. I am not a mathematician. It is quite possible that I am a complete idiot. You decide.

With that out of the way, let’s begin.

NSM advocates the capture of, amongst other things, full-content data. It is often said that there’s no point in performing full-content capture of encrypted data that you can’t decrypt – why take up disk space with stuff you’ll never be able to read? It’s quite a valid point – one of the networks I look after carries quite a bit of IPSec traffic (tens of gigabytes per day), and I exclude it from my full content capture. I consider it enough, in this instance, to have accurate session information from SANCP or Netflow which is far more economical on disk space.

That said, you can still learn quite a bit from inspecting full-content captures of encrypted data – there is often useful information in the session setup phase that you can read in clear text (e.g., a list of ciphers supported, or SSH version strings, or site certificates, etc.). It still won’t be feasible to decrypt the traffic, but at least you’ll have some clues about its nature.

A while ago, Richard wrote a post called “Is it NSM if…” where he says:

While we’re talking about full content, I suppose I should briefly address the issue of encryption. Yes, encryption is a problem. Shoot, even binary protocols, obscure protocols, and the like make understanding full content difficult and maybe impossible. Yes, intruders use encryption, and those that don’t are fools. The point is that even if you find an encrypted channel when inspecting full content, the fact that it is encrypted has value.

That sounds reasonable to me. If you see some encrypted stuff and you can’t account for it as legitimate (run of the mill HTTPS, expected SSH sessions, etc.) then what you’re looking at is a definite Indicator, worthy of investigation.

So, let’s just ask our capture-wotsits for all the encrypted traffic they’ve got, then, shall we? Hmm. I’m not sure of a good way to do that (if you do, you can stop reading now and please let me know what it is!).


…I’ve got an idea.

Frequency analysis is a useful way to detect the presence of a substitution cipher. You take your ciphertext and draw a nice histogram showing the frequency of all the characters you encounter. Then you can make some assumptions (like the most frequent character was actually an ‘e’ in the plaintext) and proceed from there.

However, the encryption protocols you’re likely to encounter on a network aren’t going to be susceptible to this kind of codebreaking. The ciphertext produced by a decent algorithm will be jolly random in nature, and a frequency analysis will show you a “flat” histogram.

So why am I talking about frequency analysis? Because this post is about detecting encrypted traffic, not decrypting it.

Over at Security Ripcord, there’s a really nifty tool for drawing file histograms. Take a look at the example images – the profile of the histograms is pretty “rough” in nature until you get down to the Truecrypt example – it’s dead flat, because a decent encryption algorithm has produced lots and lots of nice randomness (great terminology, huh? Like I said, I’m not a cryptanalyst or a mathematician!)

So, here’s the Crazy Plan for detecting encypted traffic:

  1. Sample X contiguous bytes of a given session (maybe twice, once for src->dst and once for dst->src). A few kilobytes ought to be enough to get an idea of the level of randomness we’re looking at.
  2. Make your X-byte block start a little way into the session, so that we don’t include any plaintext in the session startup.
  3. Strip off the frame/packet headers (ethernet, IP, TCP, UDP, ESP, whatever) so that you’re only looking at the packet payload.
  4. Perform your frequency analysis of your chunk of payload, and “measure the resultant flatness”.
  5. Your “measure of flatness” equates to the “potential likelihood that this is encrypted”.

Perhaps one could assess the measure of flatness by calculating the standard deviation of the character frequencies? Taking the Truecrypt example, this is going to be pretty close to zero; the TIFF example is going to yield a much higher standard deviation.

Assuming what I’ve babbled on about here is valid, wouldn’t it be great to get this into Sguil? If SANCP or a Snort pre-processor could perform this kind of sampling, you’d be able to execute some SQL like this:

select [columns] from sancp where src_randomness < 1 or dst_randomness < 1

…and you’d have a list of possibly encrypted sessions.

How’s that sound?

This post has been updated here.

Check out InfoSec Institute for IT courses
including computer forensics boot camp training.

Alec Waters is responsible for all things security at Dataline Software, and can be emailed at alec.waters(at)

Visualising Sguil session data with NetFlow

Posted in NSM, Sguil on 10 July, 2009 by Alec Waters

I think this is the first time I’ve explicitly mentioned Sguil, and I’m not going to talk too much about the package itself as many others have already done it for me. Basically, Sguil is a nicely integrated suite of (free) tools that will help you put NSM principles into practice. It has a wonderful investigative interface, and allows easy access to all the alert, session, and full-content data you’ve been capturing via a TAP or SPAN port.

I want to talk about Sguil’s session data capture mechanism. SANCP is used for this purpose, and it’s nicely tied into the alert and full-content data captured by Sguil. It does a jolly good job, too.

SANCP isn’t not the only tool for handling session data, though – Cisco’s NetFlow is another. What I’m going to propose might sound a little odd, but I find it useful – I hope you will, too. Here goes:

My Sguil sensors maintain two stores of session data. One is the standard set as captured by SANCP, the other is a database of NetFlow records that express the same traffic.

“Why?”, I hear you ask. Isn’t this a duplication of data? Don’t you need some fancy Cisco router to export flows? The answers to these are “sort of” and “no”, respectively. Before we get onto how we make this happen, let’s talk about why we want to bother in the first place.

  • There is no easy and cheap visualisation of SANCP data. If you’ve got a TAP on a link and you’re feeding it into a Sguil sensor, it’d be great to see a nice graph of how much traffic is passing in each direction. If you can break it down by IP address and port, so much the better. Yes, you can export the SANCP data into something like Excel to graph it, but that’s hardly “easy” or “cheap”. If we can somehow extract NetFlow data from our TAPped link, we can use any number of NetFlow analysis tools to look at what’s going on.
  • SANCP isn’t (to my knowledge) timely. If you’ve got a long running session (e.g., a huge download) it won’t get logged by SANCP until it has finished (possibly many hours later). NetFlow devices by contrast can be instructed to prematurely export flows for sessions that haven’t finished yet, giving you a more real-time view of the data.
  • SANCP logs one row of data for every session it observes, in terms of source IP/port, destination IP/port, src->dest bytes/packets, dst->src bytes/packets, etc. By storing these “bidirectional” records, you sometimes get entries like this: -> -> -> ->

    All of the rows above show sessions to or from a webserver on Look at the last row, though – somehow is listed as the source rather than the destination (you’d have to ask someone familiar with the SANCP codebase why this happens from time to time). What this means is that if I want to write some SQL to show me all of the sessions hitting my webserver, I can’t just say “WHERE dst_ip = INET_ATON(‘’) and dst_port = 80” because I would miss out on the last row. I need to alter my SQL to include rows where “src_ip = INET_ATON(‘’) and src_port = 80” as well (I guess this is why there are UNION options in Sguil’s query builder!). IMHO SANCP’s “source” and “destination” columns might be better referred to as “peer A” and “peer B”, since there’s no consistency in the way sessions are listed. NetFlow, by contrast, will store unidirectional flows with separate database rows for traffic from a->b and from b->a, thereby avoiding this confusion.

Irrespective of whether you agree with the latter points, I think the visualisation aspect of this is worth it’s weight in gold. Here’s how to do it, and actually make it work in a useful fashion.

Firstly, we need some way of getting NetFlow exports from our TAPped traffic stream. softflowd is your friend here – get it installed on your Sguil box, and run it like this:

/usr/sbin/softflowd -i eth1 -t maxlife=60 -n

Let’s look at the parameters in turn:

“-i eth1” tells softflowd to listen on eth1, which is also Sguil’s monitoring interface in this case.

“-t maxlife=60” tells softflowd to export a flow record for non-expired flows after sixty seconds. This gives us the “timeliness” that SANCP lacks.

“-n” tells softflowd where to send its NetFlow exports to, in this case a NetFlow collector running locally on the Sguil sensor on port 9996.

So now softflowd is looking at Sguil’s traffic stream, and is converting what it sees to NetFlow exports which it is sending to a collector listening on port 9996. All we need now is a NetFlow collector to receive the exports.

There are many collectors available, but the one I’m going to show here is ManageEngine’s Netflow Analyzer (NFA). I use NFA because the free version will work perfectly in this scenario, and because it has one crucial feature that makes all of this useful (we’ll get to that later on).

So, download and install the free version of NFA on your Sguil sensor, and tell it to collect flows on port 9996. In very short order, NFA should start receiving flows and drawing a nice graph. A graph which will have one obvious and fatal flaw about it:


Whoops. We’re only showing “in” traffic, and it’s actually the sum of the “in” and “out” traffic on our monitored link. When you think about it, what else was softflowd supposed to do? After all, the only traffic it sees is coming “in” to eth1, so that’s all it can report.

However, NFA has a nifty feature we can use to make sense of the data – IP Groups. Set up a new IP Group (call it whatever you like), and add the IP addresses that Sguil/Snort considers to be HOME_NET. NFA will then use these IP addresses to determine which flows should be interpreted as “in” and which as “out”. If we navigate to our IP group in NFA we now get a nice graph that looks like this:


Woohoo! Now we’re talking! Now we can use NFA’s other cool features like drag-to-zoom and breakdowns by IP address and port.

Yes, we’ve bloated up our Sguil sensor in the process. Yes, we’re doubling up on session information. But if you’ve just arrived at a customer site and have plugged your sensor into what is (for you) an unfamiliar network, what better amd quicker way to get an idea of the kind of traffic you’re monitoring?

Comments welcome. I’m sure there’s a gigantic flaw in the plan somewhere!


Alec Waters is responsible for all things security at Dataline Software, and can be emailed at alec.waters(at)