Archive for the NSM Category

Virtual Private Onions

Posted in Crazy Plans, Crypto, NSM, Sguil on 8 October, 2012 by Alec Waters

If you’ve not checked out Security Onion (SO) yet, you really should. It’s a powerhouse Linux distro, running everything an analyst could need to carry out effective Network Security Monitoring (NSM). The latest beta is looking fantastic; watch the video and try it out, you won’t be sorry!

I’m not going to talk about SO itself (Mother Google has plenty of good things to say); instead I’m going to look at the underlying network infrastructure.

A major component of SO is Sguil, a distributed monitoring system that has sensor and server components that are used for monitoring the network and aggregating output from multiple sensors (you can see a Sguil schematic here to understand how it all fits together). The comms between the sensors and the server are protected by SSL, and SO builds on this with autossh tunnels to protect other traffic streams from interception. Ubuntu’s Uncomplicated Firewall (UFW) opens only the necessary ports to allow SO sensors to talk to their server, so all the installations are protected to a high degree.

If all of the connections between the sensors and the server are on your own infrastructure, this is fine (perhaps you have a management VLAN specifically for this purpose, for example). There is however another use case I can think of, where we like to use a slightly different method of sensor-to-server communications.

NSM As A Service?

Let’s say you’re in the business of selling NSM to clients, either short-term (“help me understand what’s on my network”) or long-term (“please be our NSM operators in return for huge bushels of cash”). This will probably involve dropping off SO sensor boxes on customer networks and having them all report in to an SO server back at base. The SO sensors will probably be behind the customer’s NAT and firewalls, but this doesn’t matter, the comms will still work. The network of sensors will look a bit like this:

This particular use case isn’t without issue, though – here is where I think that a slightly different approach to the underlying SO infrastructure has value. Consider the following:

  • The Sguil interface shows you that one of the agents on one of the sensors is down. How do you ssh in to fix it if the sensor is behind someone else’s NAT and firewalls? How do you even know which Internet IP address you should use to try?
  • You suspect one of the sensors is under excessive load. It’d be great to install an snmp agent on the sensor and point Munin at it to try and find the stressed metric. Again, how do we contact the sensor from back at base?
  • You want to implement a custom information flow from the sensor back to the server. You need to think about the security of the new flow – can you implement another autossh tunnel? What if the flow is UDP (e.g., snmp-trap)?
  • One of the sensors is reported lost or stolen. How can you easily stop it from connecting back to base when it’s possibly now in the hands of threat actor du jour?

One possible answer is to build a custom “underlay” network to carry all of the traffic between SO sensors and the server, catering for existing flows and anything else you think of in the future. As a side benefit, the sensors will all have static and known IP addresses, and contacting them from the server will be no problem at all.

Virtual Private Onions

We’ll accomplish this by using OpenVPN to create a secure, privately- and statically-addressed cloud that the sensors and server will use to talk to each other (sorry for using the word “cloud”!). Take our starting point above, showing our distributed sensors at client sites behind client firewalls. By “underlaying” the SO comms with an OpenVPN cloud we’ll end up with is something like this:

The client’s firewalls and NAT are still there, but they’re now hidden by the static, bi-directional comms layer provided by OpenVPN. The sensors all talk over the VPN to the server at 10.0.0.1; the server in turn is free to contact any of the sensors via their static addresses in the 10.0.0.0/24 subnet.

Details

I don’t intend for this to be a HOWTO; I want to wait until people tell me that what I’m proposing isn’t a galactic waste of time and completely unnecessary! Conceptually, it works like this:

The SO Server takes on the additional role of an OpenVPN server. Using OpenVPN’s “easy-rsa” feature, the Server also becomes a Certificate Authority capable of issuing certificates to itself and to the sensors.

OpenVPN gives the SO Server a routed tun0 interface with the IP address 10.0.0.1/24 (note routed - if we wanted to, we can give other subnets back at base that are “behind” the Server access to the SO Sensors via the VPN).

The Sensors become OpenVPN clients, using certificates from the Server to mutually authenticate themselves to it. The Sensors know to point their OpenVPN clients at the static Internet-facing IP address of the server back at base, getting a tun0 interface and a static IP address out of 10.0.0.0/24 in return.

When setting up the SO components on the Sensor boxes, the sosetup script is told that the server is at 10.0.0.1 via tun0, ensuring that all SO comms go over the VPN.

The VPN carries all traffic between Sensors and Server, be it Sguil’s SSL connections, autossh tunnels, SNMP traps or queries, raw syslog forwarding, psychic lottery predictions, anything. The Server is also free to ssh onto the Sensors via their static VPN addresses.

Advantages?

We’ve implemented OpenVPN with SO in this manner, and it works well for us. I hope this approach has some advantages:

  • It makes for easier firewalling at the Server and Sensor ends. You only need to open up one port, which is by default 1194/UDP. All of the VPN traffic is carried over this port.
  • You have deterministic sensor IP addressing, something you wouldn’t otherwise have had when your sensors were deployed on foreign networks behind NAT, firewalls, and dynamic IP addresses on customer DSL routers.
  • You don’t need to worry about setting up additional autossh tunnels when you dream up a new flow of information  between Sensor and Server.
  • You can easily talk to the sensors from the server, for ssh sessions, SNMP polls, or anything else.
  • If a sensor is lost or stolen, all you need to do is revoke its certificate at the Server end and it won’t be able to connect to the VPN (more on sensor security in a moment).

There are of course some disadvantages:

  • Complexity, which as we all know is the enemy of security.
  • Double encryption. OpenVPN is encrypting stuff, including stuff already encrypted by Sguil and autossh.
  • <insert other glaring flaws here>

Security considerations

On the Server side, keep in mind that it is now a Certificate Authority capable of signing certificates that grant access to your SO VPN. Guard the CA key material well – maybe even use a separate system for certificate signing and key generation. Also, once you’ve generated and deployed certificates and private keys for your sensors, remove the private keys from the Server – these belong on each specific sensor and nowhere else.

On the Sensor side, we have a duty of care towards the clients purchasing the “NSM As A Service”. OpenVPN should be configured to prohibit comms between Sensor nodes, so that a compromise of one Sensor machine doesn’t automatically give the attacker a path onto other customer networks.

If a Sensor machine is lost or stolen whilst deployed at a customer site, we can of course revoke its certificate to prevent it from connecting to the VPN. That’s great, but the /nsm directory on it is still going to contain a whole bunch of data that the client wouldn’t want turning up on eBay along with the stolen sensor.

I have a plan for this, but I’ve not implemented it yet (comments welcome!). The plan is to have the /nsm directory encrypted (a TrueCrypt volume? something else?), but to have the decryption key stored on the SO Server rather than on the Sensor. When the Sensor starts up, it builds its OpenVPN connection to the Server before the SO services start. Using the VPN, the Sensor retrieves the decryption key (one unique key per Sensor) from the Server (wget? curl? something else?), taking care never to store it on the Sensor’s local disk. Decryption key in hand, the Sensor can decrypt /nsm and allow the SO services to kick off.

In this way, a stolen Sensor box represents minimal risk to the client as the /nsm directory is encrypted, with the keymat necessary for decryption stored on the Server. Provided that the theft is communicated to the NSMAAS provider in an expedient fashion, that keymat will be out of reach of the thief once the OpenVPN certificate is revoked.

(Other, crazier, plans involve giving the Sensors GPS receivers and 3G comms for covert phone-home-in-an-emergency situations…)

So there you have it…

…Virtual Private Onions. Any comments?


Alec Waters is responsible for all things security at Dataline Software, and can be emailed at alec.waters@dataline.co.uk

The Adventures of Packet Tracy, PI

Posted in NSM, Packet Tracy, Sguil on 14 May, 2012 by Alec Waters

My name’s Tracy, Packet Tracy. I’m a PI. It says so on my door. Last Monday morning began like every other with a fumble through the fog of a hangover looking for new and unexplained bruising. The pounding in my head was aided and abetted by the pounding on my door, and aspirin couldn’t fix both. I dragged myself towards the noise and let in a dame wearing stilletos and a scowl that could shatter concrete at fifty paces. She said her rat of a boyfriend had been flirting with some internet floozy, and all the proof was in the pcap on the USB stick that left another bruise after bouncing off my head. No problem, I said, come back Friday for chapter and verse.

The case was tougher than I thought; the pcap was bigger than my last bar tab and there’s no way I want to go over that line by line either. So I ran it past my Bro and a couple buddies of mine, Suri and Carter, but none of us could come up with the dirt. The dame’s been on my back all week for results, and now it’s Sunday and the pounding is back with a vengeance. She’s at the door, calling me names I’m pretty sure my mother never gave me. I considered leaving by the window until I remembered my office is in a basement and there aren’t any.

Time’s up. Gotta face the music… 

So how come PT couldn’t come up with the goods? The evidence was there, surely it was just a matter of time before he found it?

I’m sure a good many of us have been in needle-in-a-haystack situations where we’ve had to rely on some tool or other to process a gigantic pcap. If the tool doesn’t find what we’re looking for, either it’s not there to find or the analyst or tool itself wasn’t up to the job.

So are all tools created equal? Given the same pcap and the same task, will they return the same results?

I thought I’d run a number of URL extraction tools against a standard 128MB pcap, the kind you might find on a Security Onion box, and I set up a quick deathmatch cage fight between httpry, urlsnarf, Suricata, tshark and Bro. All versions are the ones shipped with SO, and the config files haven’t been tweaked in any way from standard.

I realise that this is hardly a thorough and scholarly test, but I think that the results show a point worth making.

The tools were invoked like this:

user@sensor:~/httpexperiment$ httpry -o httpry.log -r capture.pcap
httpry version 0.1.6 -- HTTP logging and information retrieval tool
Copyright (c) 2005-2011 Jason Bittel <jason.bittel@gmail.com>
Writing output to file: httpry.log
5230 http packets parsed

user@sensor:~/httpexperiment$ urlsnarf -n -p capture.pcap > urlsnarf.log
urlsnarf: using capture.pcap [tcp port 80 or port 8080 or port 3128]

user@sensor:~/httpexperiment$ suricata -c /etc/nsm/sensor-bond0/suricata.yaml -r capture.pcap
<tons of output>
4/4/2012 -- 15:35:58 - <Info> - HTTP logger logged 2206 requests

user@sensor:~/httpexperiment$ tshark -r capture.pcap -R http.request -T fields -e frame.time -e ip.src -e ip.dst -e tcp.srcport -e tcp.dstport -e http.request.method -e http.host -e http.request.uri -e http.request.version > tshark.log

user@sensor:~/httpexperiment$ bro -r capture.pcap

Now, we need to process httpry’s output a little, since it includes server responses as well as client requests. Once done, we can see how many HTTP requests were seen by each tool. The results were a little surprising to begin with:

httpry: 2567 requests seen
tshark: 2557 requests seen
Bro: 2240 requests seen
Suricata: 2206 requests seen
urlsnarf: 2198 requests seen

httpry and urlsnarf are often (probably quite rightly) viewed as less sophisticated than Suricata and Bro, and tshark might not be the tool of choice for this kind of work, especially when you’re constantly monitoring traffic on an interface rather than processing a pcap. Bro and Suricata certainly picked up things that httpry didn’t (HTTP on non-standard ports, for example), but there’s quite a gulf between the apparently less-sophisticated and the cutting-edge.

So why? Each tool does the job in a different way, and with its own set of pros and cons. httpry, for example, only listens on ports 80 and 8080 by default, has no support for VLAN-tagged frames, and performs no TCP session reassembly so it is vulnerable to evasion. On the other hand, the more sophisticated Bro and Suricata have, to my knowledge, no such shortcomings.

However, it’s httpry’s brute simplicity that I suspect put it on top of the pile for this particular test. All it looks for is one of a nominated set of HTTP method verbs at the beginning of a TCP segment. It doesn’t care about proper TCP session establishment or teardown, all it’s looking for are a few bytes in the proper place. In contrast, other tools may not log any URLs at all if the TCP session isn’t complete and well-formed.

Each tool’s results were a subset of the others’ collective results – no one tool captured the superset, and one could probably achieve “better” (or at very least, “different”) results by tweaking each tool. Even then I reckon there would still be a pretty good chance that there are undiscovered URLs lying there that a different tool could pick up.

I don’t mean for this to be seen as some kind of ranking of tools, or a statement that httpry and tshark are “better” URL extractors than Bro and Suricata; all I’m trying to say is that you’re almost certain to miss something by only using a single tool for a job like this. On the flip side of course, running two or more URL extractors might seem like a galactic waste of resources…

Your mileage will almost certainly vary, but it’s worth keeping this in mind when the dame is at the door demanding answers!


Alec Waters is responsible for all things security at Dataline Software, and can be emailed at alec.waters@dataline.co.uk

glTail parsers for Snort, net-entropy and viewssld

Posted in net-entropy, NSM on 28 October, 2011 by Alec Waters

glTail is a tool for realtime log visualisation, which according to the website allows you to “view real-time data and statistics from any logfile on any server with SSH, in an intuitive and entertaining way.”

glTail can read from any text logfile you like, and via a set of parsers can extract information such as IP addresses for graphical display. Each row from the logfile may trigger several blobs, e.g. source IP, dest IP, etc, as you can see in the video below:

I’ve written some parsers for Snort, net-entropy and viewssld. A screenshot of them all in action is shown below (click for full size view):

The red blobs are related to Snort, cyan ones to net-entropy, and the yellow shades are from viewssld. The numeric columns show the rate at which each item is appearing, and the length of the coloured highlight bars show the proportion of occurences of a given item relative to the others.

The parser files and a sample config.yaml file that uses them can be found here (snort.rb, net-entropy.rb, viewssld.rb and config.yaml).

Useful?

So, it’s a pretty visualisation of interesting stuff, but is it useful and actionable? It’s certainly hopeless for correlation – when a signature fires, it’s more or less impossible to tell the associated IP addresses and ports even if you have a very quiet sensor. At the other end of the scale, if you’re inundated with blobs you can alter the regexes in snort.rb to match on a specific IP/protocol/signature etc to be a little more selective.

Where I think this may prove most useful is when you’re learning from an incident. If you’ve investigated an incident where someone compromised your webserver, you could pull all the relevant log entries that show:

  • Snort alerts (when the attacker was probing for vulnerabilities)
  • Apache/IIS log entries (showing everything else they did to your server)
  • net-entropy logs (showing the attacker’s outbound backdoor SSH tunnel).

If you were to pump all of these logs through gltail you’d have an effective visualisation of the attack. For inspiration, check this out:


Alec Waters is responsible for all things security at Dataline Software, and can be emailed at alec.waters@dataline.co.uk

A Tale of Two Routers

Posted in Cisco, NSM on 14 September, 2011 by Alec Waters

Take a look at the diagram below, showing two (Cisco) routers. HugeCorpCoreRouter is a mighty behemoth with a six figure price tag. It has redundant route processors, handles many gigabits per second of business-critical traffic, has all sorts of esoteric connections and requires a squad of elite ninja black-ops CCIEs to keep it all running.

TinySOHORouter, by comparison, is a trivial speck on the corporate network diagram. It has a single ADSL connection and performs the usual SOHO tasks of NAT, firewall, DSL dialup, etc. Both routers export Netflow data to a central collector.

As you ponder my da Vinci-like Visio skills, consider the following question. Which router will pose the greater Netflow analysis challenge to the security team?


You’ve probably guessed it by now – the troublesome router is TinySOHORouter. HugeCorpCoreRouter, whilst powerful and complex, has a relatively easy job when it comes to Netflow. TinySOHORouter however has three sticking points that could prove to be troublesome for a Netflow analyst. None of the following features are typically running on your average big beefy HugeCorpCoreRouter:

  1. The firewall process (or any kind of filtering ACL). HugeCorpCoreRouter is concerned with forwarding datagrams as fast as possible through the core – firewall operarions do not live here
  2. The NAT process
  3. The dialer interface associated with the ADSL connection

Let’s look at each of these in turn.

Sponsor Alec!
I’m running the Brighton Half Marathon in aid of Help for Heroes – please sponsor me if you can by clicking the link to the right:

The firewall process

Netflow is, by default, an ingress-based technology, which means that the router’s flow cache is updated when datagrams are received by an interface. However, a datagram doesn’t have to enter and leave the router to leave an impression in the flow cache. This manifests itself in an interesting way when a firewall is sticking its oar in.

The Netflow v5 flow record format has fields that describe the SNMP interface indexes of the input and output interfaces for any given flow. This is useful, because it means that your Netflow analysis tools can tell you that when 10.11.12.13 spoke to the webserver on 192.168.0.1, the traffic from 10.11.12.13 entered the router on FastEthernet4/23 and left it on GigabitEthernet0/2. This also makes it possible to draw pretty per-interface graphs of Netflow traffic. (BTW, you’ll want to use the “snmp-server ifindex persist” command otherwise the SNMP interface indexes could change when the router reloads, which can really confuse analysis!)

But what if there were an ACL in place that drops all traffic to port 80 on 192.168.0.1? Dropped datagrams are one of the byproducts of any kind of firewall or ACL – how does Netflow handle those?

Let’s say a datagram from 10.11.12.13 is received, destined for 192.168.0.1:80. As this destination is denied by an ACL, the router duly drops it. Netflow, being an ingress technology, will still put an entry into the flow cache to describe the flow, despite the fact that the datagram was dropped by an ACL (even if the ACL is applied in the inbound direction on the receiving interface). There is no output interface for the flow in this case, so what does the router put into the flow record to denote this?

Flows that are either a) dropped by the router or b) destined for the router itself (SSH sessions, for example) will have zero in the output interface field, to show that the flow entered the router but did not leave.

So why is this a problem for the analyst?

Let’s say I run a report that shows all destination ports for destination IP address 192.168.0.1 (in a naive attempt to find out “what services have people been using on my server?”). Much to my surprise, port 80 features prominently. Why’s it in the report? Isn’t it blocked by an ACL? Have we been hacked? Has the APT Bogeyman paid us a visit?

Fortunately, we’re safe. Port 80 features because 10.11.12.13 tried to talk to it, causing a flow to be logged despite the fact that the ACL dropped the traffic. If you were to re-run the report asking for the number of bytes transferred between 10.11.12.13 and 192.168.0.1:80, we’d see 40 bytes in the client->server direction (the size of an IP datagram with a TCP SYN in it) and zero bytes in the server->client direction, which describes the ACL drop nicely.

Keep this in mind when designing reports based on Netflow data. Certain products like Netflow Analyser are able to take this behaviour into account to a certain degree (“Suppress Access Control List related drops”). Alternatively, you could use the Netflow v9 flow record format if your router and analysis tools support it. There is a useful field called “FORWARDING STATUS” which tells you if a flow was forwarded, dropped or consumed, allowing the analyst to differentiate between traffic dropped by the router and traffic destined for the router. Very handy.

The NAT process

Our second bugbear can also cause problems, especially if we want to ask questions like “show me all the traffic destined for the single PC behind TinySOHORouter” – the report in this case will be totally blank, even if the PC has been hitting Facebook all day long. But why?

Take the simple case of an HTTP flow between our single PC at 10.11.12.13 (a private IP address on a router’s FastEthernet0 interface) and 123.123.123.123 (a public webserver on the Internet via FastEthernet1). On its way out of the router, the private 10.11.12.13 gets NATted into 111.111.111.111, the IP address of FastEthernet1.

From Netflow’s point of view, it goes like this:

  • A TCP segment from 10.11.12.13 destined for 123.123.123.123 is received on Fa0. An entry in the Netflow cache accounts for this.
  • The router decides that the traffic should be sent out via Fa1, and does a source IP address NAT translation from 10.11.12.13 to 111.111.111.111 before it sends it on its way.
  • The TCP response is eventually received on Fa1 from 123.123.123.123 destined for 111.111.111.111, which is 10.11.12.13′s “outside” address. An entry in the Netflow cache accounts for this.
  • The NAT translation from 111.111.111.111 to 10.11.12.13 takes place, and the TCP response is sent out of Fa0.

Therefore, all of the returning traffic will be shown as destined for 111.111.111.111 and never 10.11.12.13 – this is because input accounting (including Netflow) occurs on the router before the NAT outside-to-inside translation takes place:

http://www.cisco.com/en/US/tech/tk648/tk361/technologies_tech_note09186a0080133ddd.shtml

There are three ways to either get around or assist with this problem:

  1. If your router and Netflow collector support it, disable ingress Netflow accounting on Fa1 and enable both ingress and egress Netflow accounting on Fa0 (the inside interface). This means that all flows will be accounted for on the “inside” of the NAT process. Take care, though – by doing this we are causing Netflow to “ignore” all traffic that does not cross Fa0. This may or may not be a problem, depending on your topology and requirements. Also, think very carefully about this approach if your router has many layer 3 interfaces. If ingress and egress Netflow were to be enabled on both Fa0 and Fa1, there’s a chance your Netflow collector could see duplicated flows.
  2. If your router and Netflow collector support it, you can use the “ip nat log translations flow-export” command. This will log all NAT translations in a flow template that looks like this:
    templateId=259: id=259, fields=11
        field id=8 (ipv4 source address), offset=0, len=4
        field id=225 (natInsideGlobalAddress), offset=4, len=4
        field id=12 (ipv4 destination address), offset=8, len=4
        field id=226 (natOutsideGlobalAddress), offset=12, len=4
        field id=7 (transport source-port), offset=16, len=2
        field id=227 (postNAPTSourceTransportPort), offset=18, len=2
        field id=11 (transport destination-port), offset=20, len=2
        field id=228 (postNAPTDestinationTransportPort), offset=22, len=2
        field id=234 (ingressVRFID), offset=24, len=4
        field id=4 (ip protocol), offset=28, len=1
        field id=230 (natEvent), offset=29, len=1

    This will give you a log of all NAT translations that you can use to find out the actual destination for the traffic from 123.123.123.123 to 111.111.111.111. Your Netflow collector may even be smart enough to correlate this information onto other “standard” flow exports, which would be a very neat trick indeed.

  3. If your router supports it, you can use the “ip nat log translations syslog” command. This will dump all NAT translations to syslog like this:
    Sep 14 12:31:39.740 BST: %IPNAT-6-CREATED:
    tcp 192.168.0.88:4021 212.74.31.235:4021
    192.150.8.200:443 192.150.8.200:443
    Sep 14 12:32:53.733 BST: %IPNAT-6-DELETED:
    tcp 192.168.0.88:4021 212.74.31.235:4021
    192.150.8.200:443 192.150.8.200:443

    Take care, though – this approach has the possibility to add significant load to your router, your syslog server, and your syslog analysis mechanisms – it becomes a manual task to correlate the NAT translations from syslog to the Netflow exports from your router.

The ADSL link’s dialer interface

It varies with platform and configuration, but when using a DSL line with PPPoE/PPPoA a plethora of virtual interfaces get created by the router. Of these, only the following are really of interest:

interface ATM 0/0/0
The physical ADSL interface

interface dialer 0
The dialer interface created by the user in order to connect to the DSL provider

interface virtual-access XX
A virtual interface created by the router, cloned from and bound to interface dialer0

Of these, only the dialer and virtual-access interfaces are layer 3 interfaces that can participate in Netflow, and of these the user only has direct control over the configuration of the dialer interface. So we just enable Netflow on TinySOHORouter’s dialer0 and inside ethernet interfaces and we’re done, right?

Not quite.

If you were to use your Netflow analysis tools to look at an interface graph for dialer0, all you will see is outbound traffic. You’ll also notice that the virtual-access interface has popped up as well, showing only inbound traffic. No one interface has the complete picture.

This is, interestingly enough, the expected behaviour. Traffic from the ethernet network leaves the router via dialer0 because that’s what the default route says to do (“ip route 0.0.0.0 0.0.0.0 dialer0″). Therefore, when the ethernet interface receives a datagram destined for the Internet, Netflow will put the SNMP interface index of dialer0 into the flow cache. However, the router doesn’t actually use dialer0 to send or receive traffic, it uses the virtual-access interface cloned from it. This means that when datagrams are received from the Internet, they enter the router on virtual-accessXX instead of dialer0 or any of the other associated interfaces. This is why the dialer shows only outbound traffic and the virtual-access shows only inbound. All very logical and intuitive, I’m sure you’ll agree…

How to get around this? Either just “keep in it mind” when performing analysis, or hope that your Netflow analysis tools have some way to cater for it by plotting the outbound traffic on dialer0 and the inbound traffic on virtual-accessXX on the same graph.

Those are all the Netflow analysis “gotchas” that spring to mind – can anyone think of any others?


Alec Waters is responsible for all things security at Dataline Software, and can be emailed at alec.waters@dataline.co.uk

Eyesight to the Blind – SSL Decryption for Network Monitoring

Posted in Crypto, NSM on 28 June, 2011 by Alec Waters

Here’s another post I wrote for the InfoSec Institute. This time, the article shows how to add SSL decryption to your NSM infrastructure, restoring the eyesight of sensors blinded by the use of SSL.

You can read the article here; comments welcome, as always!


Alec Waters is responsible for all things security at Dataline Software, and can be emailed at alec.waters@dataline.co.uk

Indiana Jones and the Contextless Artefact of Doom

Posted in NSM on 16 March, 2011 by Alec Waters

If archaeology floats your boat, in the UK there’s a TV series called “Time Team“. Each episode is fronted by the ever-enthusiastic Tony Robinson (best known for playing Baldrick in Blackadder), and documents a three-day excavation of interesting-site-du-jour.

Archaeology is a bit like tech forensics, in that it is the search for truth amidst a gigantic pile of stuff, some of which may be useful, some of which may not. Features (such as walls, ditches, post holes, etc) and finds (pottery, jewellery, medieval trash, etc) take the place of logs and NSM data, but the investigative methodologies have many parallels.

One astonishing episode was called “Celtic Spring”, which featured the team investigating a highly dubious site containing a hotch-potch of different things that really ought not be in such close proximity to one another. With open minds and their usual professionalism, they proceeded to expose what amounted to a hoax perpetrated by a nineteenth century cleric and twentieth century persons unknown. You can watch the episode here (have patience with the advertising, it soon passes!)

The point of this post concerns a spectacular find – an Iron Age sword, of which only two or three have ever been found in Wales. It wasn’t an ordinary iron age sword, either (if such a thing exists!), it was confirmed as a genuine La Tène sword from Switzerland – none of these have ever been found so far from home.

Dr Jones would have been ecstatic. He’d have rushed it back to Marcus and got it in the museum, probably after using it to escape from some dastardly trap.

However, the Time Team archaeologists weren’t so happy. They were cross. Some of them were absolutely livid. As they excavated the sword, they started to get the feeling that things weren’t quite right. It wasn’t buried very deep down. It was in an odd place to find such a thing. It was alone, with no other finds nearby. And most damning of all, it was above a buried strand of barbed wire, meaning it could only have got there after the wire did.

It turns out that barbed wire is a remarkably datable thing. The gauge and metal used, the nature of the twist and the pattern of the barbs all contribute to identification. This particular barbed wire was no more than twenty years old, meaning the sword had been in the ground for less time than this.

This was the cause of the archaeologists’ dismay. Yes, the sword in itself is a wonderful artefact, but, despite the antics of Dr Jones, archaeologists want to understand how people lived more than they want shiny trinkets for the museum. The sword had been removed from its original La Tène context and dumped unceremoniously in a Welsh ditch, presumably so the perpetrator could get his fifteen minutes of fame. Out of context, the sword is useless for understanding the La Tène culture. The archaeologists want to know who owned the sword, how they lived, how they died, and how they were prepared for their journey to the next world. Had the sword been found in context where it was left, these questions could have been answered. As it is, the sword is useless for these purposes.

An archaeologist ignoring the context of a find

An archaeologist ignoring the context of a find

Finally getting back to the NSM domain, the importance of establishing and investigating context is just as clear. Having Snort tell you it’s seen an instance of “Suspicious Inbound AlphaServer UserAgent” isn’t terribly useful on its own. It needs to be placed into context – when did it happen? What was the source? What was the destination? What exactly was the HTTP conversation all about? Did it have any impact? Only by taking the alert in the context of other indicators can answers be had. Even seemingly open-and-shut cases need to have their indicators put into context and investigated fully.

Indicators are hardly ever standalone entities – get out your trowel and brush, open a trench, and don’t stop digging until you have all the answers!


Alec Waters is responsible for all things security at Dataline Software, and can be emailed at alec.waters@dataline.co.uk

Next Generation Naughtiness at the Dead Beef Cafe

Posted in General Security, IPv6, Networking, NSM on 23 December, 2010 by Alec Waters

The IPocalypse is nearly upon us. Amongst the FUD, the four horsemen are revving up their steeds, each bearing 32 bits of the IPv6 Global Multicast Address of Armageddon, ff0e::dead:beef:666.

Making sure that the four horsemen don’t bust into our stables undetected is something of a challenge at the moment; IPv6 can represent a definite network monitoring blind spot or, at worst, an unpoliced path right into the heart of your network. Consider the following:

Routers and Firewalls

Although a router may be capable of routing IPv6, are all the features you use on the router IPv6-enabled? Is the firewall process inspecting IPv6 traffic? If it is, is it as feature-rich as the IPv4 equivalent (e.g., does it support application-layer inspection for protocols like FTP, or HTTP protocol compliance checking?)

End hosts

If you IPv6-enable your infrastructure, you may be inadvertently assigning internal hosts global IPv6 addresses (2001::) via stateless address autoconfiguration. If this happens (deliberately or accidentally), are the hosts reachable from the Internet directly? There’s no safety blanket of NAT for internal hosts like there is in IPv4, and if your network and/or host firewalls aren’t configured for IPv6 you could be wide open.

IDS/IPS

Do your IDS/IPS boxes support IPv6? Snort’s had IPv6 support since (I think) v2.8; the Cisco IPS products are also IPv6-aware, as I’m sure are many others.

Session tracking tools

SANCP has no IPv6 support; Argus does, as does cxtracker. netflow can also be configured to export IPv6 flows using v9 flow exports.

Reporting tools

Even if all of your all-seeing-eyes support IPv6, they’re of little use if your reporting tools don’t. Can your netflow analyser handle IPv6 exports? What about your IDS reporting tools – are they showing you alerts on IPv6 traffic? What about your expensive SIEM box?

The IPv6 Internet is just as rotten as the IPv4 one

We’ve seen some quite prolific IPv6 port scanning just as described here, complete with scans of addresses like 2001:x:x:x::c0:ffee and 2001:x:x:x::dead:beef:cafe. The same scanning host also targeted UDP/53 trying to resolve ‘localhost’, with the same source port (6689) being used for both TCP and UDP scans. I have no idea if this is reconnaissance or part of some kind of research project, but there were nearly 13000 attempts from this one host in the space of about three seconds.

Due to the current lack of visibility into IPv6, it can also make a great bearer of covert channels for an attacker or pentester. Even if you’re not running IPv6 at all, an attacker who gains a foothold within your network could easily set up a low-observable IPv6-over-IPv4 tunnel using one of the many IPv6 transition mechanisms available, such as 6in4 (uses IPv4 protocol 41) or Teredo (encapsulates IPv6 in UDP, and can increase the host’s attack surface by assigning globally routable IPv6 addresses to hosts behind NAT devices, which are otherwise mostly unreachable from the Internet).

The IPocalypse is coming…

…that’s for certain; we just have to make sure we’re ready for it. Even if you’re not using IPv6 right now, you probably will be to some degree a little way down the road. Now’s the time to check the capability of your monitoring infrastructure, and to conduct a traffic audit looking for tunneled IPv6 traffic. Who knows what you might find!


Alec Waters is responsible for all things security at Dataline Software, and can be emailed at alec.waters@dataline.co.uk

Cap’n Quagga’s Pirate Treasure Map

Posted in Cisco, Networking, NSM on 23 November, 2010 by Alec Waters

Avast, me hearties! When a swashbucklin’ pirate sights land whilst sailin’ uncharted waters, the first thing he be doin’ is makin’ a map. Ye can’t be burying ye treasure if ye don’t have a map, yarrr!

PUBLIC SERVICE ANNOUNCEMENT

For everyone’s sanity, the pirate speak ends now. Save it for TLAP day!

When searching for booty on a network, it’s often useful to have a map. If you’ve got a foothold during a pentest, for example, how far does your conquered domain stretch? Is it a single-subnet site behind a SOHO router, or a tiny outpost of a corporate empire spanning several countries?

To get the answer, the best thing to do is ask one of the locals. In this case, we’re going to try to convince a helpful router to give up the goods and tell us what the network looks like. The control plane within the enterprise’s routers contains the routing table, which is essentially a list of destination prefixes (i.e., IP networks) and the next-hop to be used to get there (i.e., which neighbouring router to pass traffic on to in order to reach a destination).

The routing table is populated by a routing protocol (such as BGP, OSPF, EIGRP, RIP, etc), which may in turn have many internal tables and data structures of its own. Interior routing protocols (like OSPF) are concerned with finding the “best” route from A to B within the enterprise using a “technical” perspective; they’re concerned with automatically finding the “shortest” and “fastest” route, as opposed to exterior routing protocols like BGP which are more interested in implementing human-written traffic forwarding policies between different organisations.

The key word above automatic. Interior routing protocols like to discover new neighbouring routers without intervention – it can therefore cater for failed routers that come back online, and allow the network to grow and have the “best” paths recomputed automatically.

So, how are we going to get our treasure map so that we know how far we can explore? We’re going to call in Cap’n Quagga!

Technically, it's James the Pirate Zebra, but seriously man, you try finding a picture of a pirate quagga!! They're extinct, for starters!

Pirate Cap'n Quagga aboard his ship, "Ye Stripy Scallywag"

Quagga is a software implementation of a handful of routing protocols. We’re going to use it to convince the local router that we’re a new member of the pirate fleet, upon which the router will form a neighbour relationship with us. After this has happened we’ll end up with our pirate treasure map, namely the enterprise’s routing table. Finally, we’ll look at ways in which the corporate privateers can detect Cap’n Quagga, and ways to prevent his buckle from swashing in the first place.

For the purposes of this article we’re going to use OSPF, but the principles hold for other protocols too. OSPF is quite a beast, and full discussion of the protocol is well beyond the scope of this article – interested parties should pick up a book.

Step One – Installing and configuring Quagga

I’m using Debian, so ‘apt-get install quagga’ will do the job quite nicely. Once installed, we need to tweak a few files:

/etc/quagga/daemons

This file controls which routing protocols will run. We’re interested only in OSPF for this example, so we can edit it as follows:

zebra=yes
bgpd=no
ospfd=yes
ospf6d=no
ripd=no
ripngd=no

As shown above, we need to turn on the zebra daemon too – ospfd can’t stand alone.

Next, we need to set up some basic config files for zebra and ospfd:

/etc/quagga/zebra.conf

hostname pentest-zebra
password quagga
enable password quagga

/etc/quagga/ospfd.conf

hostname pentest
password quagga
enable password quagga
log stdout

Now we can force a restart of Quagga with ‘/etc/init.d/quagga restart’.

For more information, the Quagga documentation is here, the wiki is here, and there’s a great tutorial here.

Step Two – Climb the rigging to the crow’s nest and get out ye spyglass

We need to work out if there’s a router on the local subnet that’s running OSPF. This step is straightforward, as OSPF sends out multicast “Hello” packets by default every ten seconds – all we have to do is listen for it. As far as capturing this traffic goes, it has a few distinguishing features:

  • The destination IP address is 224.0.0.5, the reserved AllSPFRouters multicast address
  • The IP datagrams have a TTL of one, ensuring that the multicast scope is link local only
  • OSPF does not ride inside TCP or UDP – it has its own IP Protocol number, 89.

The easiest capture filter for tshark/tethereal or their GUI equivalents is simply “ip proto 89″; this will capture OSPF hellos in short order:

Ahoy there, matey!

Apart from confirming the presence of a local OSPF router, this information is critical in establishing the next step on our journey to plunderville – we need Quagga’s simulated router to form a special kind of neighbour relationship with the real router called an “adjacency”. Only once an adjacency has formed will routing information be exchanged. Fortunately, everything we need to know is in the hello packet:

Ye're flying my colours, matey!

For a text only environment, “tshark -i eth0 -f ‘ip proto 89′ -V” provides similar output.

Step Three – configure Quagga’s OSPF daemon

For an adjacency to form (which will allow the exchange of LSAs, which will allow us to populate the OSPF database, which will allow us to run the SPF algorithm, which will allow us to populate the local IP routing table…), we need to configure Quagga so that all of the highlighted parameters above match. The command syntax is very Cisco-esque, and supports context sensitive help, abbreviated commands and tab completion. I’m showing the full commands here, but you can abbreviate as necessary:

# telnet localhost ospfd
Trying 127.0.0.1…
Connected to localhost.
Escape character is ‘^]’.

Hello, this is Quagga (version 0.99.17).
Copyright 1996-2005 Kunihiro Ishiguro, et al.

User Access Verification

Password:
pentest> enable
Password:
pentest# configure terminal
pentest(config)# interface eth0
! Make the hello and dead intervals match what we’ve captured
pentest(config-if)# ospf hello-interval 10
pentest(config-if)# ospf dead-interval 40
pentest(config-if)# exit
pentest(config)# router ospf
! eth0 on this machine was given 192.168.88.49 by DHCP
! The command below will put any interfaces in
! 192.168.88.0/24 into area 0.0.0.4, effectively
! therefore “turning on” OSPF on eth0
! The area id can be specified as an integer (4) or
! as a dotted quad (0.0.0.4)
pentest(config-router)# network 192.168.88.0/24 area 0.0.0.4
pentest(config-router)# exit
pentest(config)# exit

We can check our work by looking at the running-config:

pentest# show running-config

Current configuration:
!
hostname pentest
password quagga
enable password quagga
log stdout
!
!
!
interface eth0
!
interface lo
!
router ospf
network 192.168.88.0/24 area 0.0.0.4
!
line vty
!
end

The Hello and Dead intervals of 10 and 40 are the defaults, which is why they don’t show in the running-config under ‘interface eth0′.

Step Four – Start diggin’, matey!

With a bit of luck, we’ll have formed an OSPF adjacency with the local router:

pentest# show ip ospf neighbor

Neighbor ID Pri  State    Dead Time Address        Interface
172.16.7.6   1  Full/DR  32.051s   192.168.88.1 eth0:192.168.88.49

If we exit from Quagga’s OSPF daemon and connect to zebra instead, we can look at our shiny new routing table. Routes learned via OSPF are prefixed with O:

# telnet localhost zebra
Trying 127.0.0.1…
Connected to localhost.
Escape character is ‘^]’.

Hello, this is Quagga (version 0.99.17).
Copyright 1996-2005 Kunihiro Ishiguro, et al.

User Access Verification

Password:
pentest-zebra> show ip route
Codes: K – kernel route, C – connected, S – static, R – RIP, O – OSPF,
I – ISIS, B – BGP, > – selected route, * – FIB route

O   0.0.0.0/0 [110/1] via 192.168.88.1, eth0, 00:04:45
K>* 0.0.0.0/0 via 192.168.88.1, eth0
O>* 10.4.0.0/26 [110/1012] via 192.168.88.1, eth0, 00:04:46
O>* 10.4.0.64/26 [110/1012] via 192.168.88.1, eth0, 00:04:46
O>* 10.4.0.128/26 [110/1012] via 192.168.88.1, eth0, 00:04:46
O>* 10.4.0.192/26 [110/1012] via 192.168.88.1, eth0, 00:04:46
O>* 10.4.2.0/26 [110/1012] via 192.168.88.1, eth0, 00:04:46
O>* 10.4.3.0/26 [110/1012] via 192.168.88.1, eth0, 00:04:46
O>* 172.16.6.0/30 [110/15] via 192.168.88.1, eth0, 00:04:46
O>* 172.16.6.4/30 [110/16] via 192.168.88.1, eth0, 00:04:46
O>* 172.16.6.8/30 [110/11] via 192.168.88.1, eth0, 00:04:46
O>* 172.16.6.12/30 [110/110] via 192.168.88.1, eth0, 00:04:46
O>* 172.16.7.1/32 [110/12] via 192.168.88.1, eth0, 00:04:46
O>* 172.16.7.2/32 [110/13] via 192.168.88.1, eth0, 00:04:46
O>* 172.16.7.3/32 [110/16] via 192.168.88.1, eth0, 00:04:46
O>* 172.16.7.4/32 [110/1012] via 192.168.88.1, eth0, 00:04:46
O>* 172.16.7.5/32 [110/1012] via 192.168.88.1, eth0, 00:04:46

We clearly are not just sitting on a single-subnet LAN! Here are some of the things we can learn from the routing table:

  • Firstly, we’ve got a few more subnets than merely the local one to enumerate with nmap etc!
  • We can make some kind of estimation on how far away the subnets are by looking at the route metrics. An example above is the ’1012′ part of ‘[110/1012]‘. 1012 is the metric for the route, with the precise meaning of “metric” varying from routing protocol to routing protocol. In the case of OSPF, by default this is the sum of the interface costs between here and the destination, where the interface cost is derived from the interface’s speed. The 110 part denotes the OSPF protocol’s “administrative distance“, which is a measure of trustworthiness of a route offered for inclusion in the routing table by a given routing protocol. If two protocols offer the routing table exactly the same prefix (10.4.3.0/26, for example), the routing protocol with the lowest AD will “win”.
  • A good number of these routes have a prefix length of /26 (i.e., a subnet mask of 255.255.255.192), meaning that they represent 64 IP addresses. These are likely to be host subnets with new victims on them.
  • The /30 routes (4 IP addresses) are likely to be point-to-point links between routers or even WAN or VPN links between sites.
  • The /32 routes (just one IP address) are going to be loopback addresses on individual routers. If you want to target infrastructure directly, these are the ones to go for.

If you want to start digging really deeply, you can look at the OSPF database (show ip ospf database), but that’s waaay out of scope for now.

Step Five – Prepare a broadside!

If we’ve got to this point, we are in a position not only to conduct reconnaissance, but we could also start injecting routes into their routing table or manipulate the prefixes already present in an effort to redirect traffic to us (or to a blackhole). Originating a default route is always fun, since it will take precedence over legitimate static default routes that have been redistributed into OSPF (redistributed routes are “External” in OSPF terminology, and are less preferable to “internal” routes such as our fraudulent default). If we had a working default route of our own, this approach could potentially redirect Internet traffic for the entire enterprise through our Quagga node where we can capture it. Either that or you’ll bring the network to a screaming halt.

Anyway, it’s all moot, since we’re nice pirates and would never consider doing anything like that!

Privateers off the starboard bow, Cap’n!

How can we detect such naughtiness, and even better, prevent it?

The first step is to use the OSPF command ‘log-adjacency-changes’ on all the enterprise’s OSPF routers. This will leave log messages like this:

Nov 23 15:11:24.666 UTC: %OSPF-5-ADJCHG: Process 2, Nbr 192.168.88.49 on Gi­gabitEthernet0/0.2 from LOADING to FULL, Loading Done

Keeping track of adjacency changes is an excellent idea – it’s a metric of the stability of the network, and also offers clues when rogue devices form adjacencies.

Stopping rogue adjacencies altogether can be accomplished in two ways. The first is to make OSPF interfaces on host-only subnets “passive“, which permits them to participate in OSPF without allowing adjacencies to form.

The second method is to use OSPF authentication, whereby a hash of a preshared key is required before an adjacency can be established. Either method is strongly recommended!

As always, keep yer eyes to the horizon, mateys! :)


Alec Waters is responsible for all things security at Dataline Software, and can be emailed at alec.waters@dataline.co.uk

The Analyst’s Creed

Posted in NSM on 2 November, 2010 by Alec Waters

These are my logfiles. There are many like them, but these ones are mine. My logfiles are my best friends. They are my life. I must master them as I must master my life. My logfiles, without me, are useless. Without my logfiles, I am useless. I must comprehend my logfiles’ every word. I must be more vigilant than my enemy who is trying to invade me. I must detect him before he compromises me.

I will…

My network and myself know that what counts in this war is not the products we buy, nor the number of our certifications. We know that it is the detections and investigations that count.

We will detect and investigate…

My network is human, even as I, because it is my life. Thus, I will learn it as a brother. I will learn its weaknesses, its strength, its parts, its accessories, its services and its users. I will assume nothing and verify everything. I will ever guard it against the ravages of opportunistic attack and determined infiltration as I will ever guard my legs, my arms, my eyes and my heart against damage. I will keep my CSIRT trained and ready. We will become part of each other.

We will…

Before $Deity, I swear this creed. My logfiles and myself are the defenders of my enterprise. We are the masters of our enemy. We are the saviors of my life. So be it, until victory is ours and there is no enemy, but peace!

Adapted from the Rifleman’s Creed.


Alec Waters is responsible for all things security at Dataline Software, and can be emailed at alec.waters@dataline.co.uk

The Cisco Kid and the Great Packet Roundup, part two – session data

Posted in Cisco, General Security, NSM on 26 October, 2010 by Alec Waters

In part one, I covered how to use Cisco routers and firewalls to perform full packet capture. This exciting installment will cover how to get network session data out of these devices.

Network session data can be likened to a real-world itemised telephone bill. It tells you who “called” who, at what times, for how long, and how much was said (but not what was said). It’s an excellent lightweight way to see what’s going on armed only with a command prompt.

There are several ways to extract such information from Cisco kit; we’ll look at each in turn, following Part One’s support/troubleshooting/IR scenario of accessing remote devices where you’re not able to make topological changes or install any extra software or hardware.

Netflow

The richest source of session information on Cisco devices is Netflow (I’ll leave it to Cisco to explain how to turn it on). If you’re able to set up a Netflow collector/analyser (like this one (free for two-interface routers), or many others) you can drill down into your session info as far as you like. If you haven’t got an analyser or you can’t install one in time of need, it’s still worth switching on Netflow because you can interrogate the flow cache from the command line.

The command is “show ip cache flow”, and the output is split into two parts. The first shows some statistical information about the flows that the router has observed:

router#sh ip cache flow
IP packet size distribution (3279685 total packets):
 1-32   64   96  128  160  192  224  256  288  320  352  384  416  448  480
 .000 .184 .182 .052 .072 .107 .004 .005 .000 .000 .000 .000 .000 .000 .000

 512  544  576 1024 1536 2048 2560 3072 3584 4096 4608
 .000 .000 .001 .020 .365 .000 .000 .000 .000 .000 .000

IP Flow Switching Cache, 278544 bytes
 57 active, 4039 inactive, 418030 added
 10157020 ager polls, 0 flow alloc failures
 Active flows timeout in 1 minutes
 Inactive flows timeout in 15 seconds
IP Sub Flow Cache, 34056 bytes
 57 active, 967 inactive, 418030 added, 418030 added to flow
 0 alloc failures, 0 force free
 1 chunk, 1 chunk added
 last clearing of statistics never
Protocol     Total    Flows   Packets Bytes  Packets Active(Sec) Idle(Sec)
--------     Flows     /Sec     /Flow  /Pkt     /Sec     /Flow     /Flow
TCP-WWW       6563      0.0       186  1319      1.2       4.7       1.4
TCP-other    16163      0.0         1    47      0.0       0.0      15.4
UDP-DNS         12      0.0         1    67      0.0       0.0      15.6
UDP-NTP       1010      0.0         1    76      0.0       0.0      15.0
UDP-Frag         2      0.0         6   710      0.0       0.2      15.3
UDP-other   316602      0.3         2   156      0.8       0.6      15.4
ICMP         31165      0.0         6    63      0.2      53.4       2.2
IP-other     46438      0.0        21   125      1.0      58.0       2.1
Total:      417955      0.4         7   574      3.3      11.0      12.7

In absence of a graphical Netflow analyser, the Packets/Sec counter is a good barometer of what’s “using up all the bandwidth”. To clear the stats so that you can establish a baseline, you can use the command “clear ip flow stats”.

After the stats comes a listing of all the flows currently being tracked by the router:

SrcIf     SrcIPaddress    DstIf     DstIPaddress    Pr SrcP DstP  Pkts
Fa4       xxx.xxx.xxx.xxx Local     yyy.yyy.yyy.yyy 32 3FAF 037C    16
Tu100     10.7.1.250      BV3       10.4.1.3        06 0051 C07A   663
Tu100     10.7.1.250      BV3       10.4.1.3        06 0050 C0AC   120
BV3       10.4.1.3        Tu100     10.7.1.250      06 C0AC 0050   116
Tu100     192.168.88.20   Local     172.16.7.10     01 0000 0800     5
BV3       10.4.1.3        Fa4       zzz.zzz.zzz.zzz 06 C0A2 0050   429
BV3       10.4.1.3        Tu100     10.7.1.250      06 C07A 0051   366
Fa4       bbb.bbb.bbb.bbb BV3       yyy.yyy.yyy.yyy 06 0050 C0A0     1
BV3       10.4.1.3        Fa4       ddd.ddd.ddd.ddd 06 C07E 0050     1
Tu100     192.168.88.56   Local     172.16.7.10     06 8081 0016     7
Fa4       zzz.zzz.zzz.zzz BV3       yyy.yyy.yyy.yyy 06 0050 C0A2   763
Tu100     192.168.88.28   Local     172.16.7.10     11 04AC 00A1     1
Tu100     192.168.88.28   Local     172.16.7.10     11 04A6 00A1     1
Fa4       aaa.aaa.aaa.aaa Local     yyy.yyy.yyy.yyy 32 275F BD8A     5
Fa4       ccc.ccc.ccc.ccc Local     yyy.yyy.yyy.yyy 32 97F1 E9BE     5
Tu100     10.7.1.242      Local     172.16.7.10     01 0000 0000     3
Fa4       ddd.ddd.ddd.ddd BV3       yyy.yyy.yyy.yyy 06 0050 C07E     1

The tempting simplicity of the table above hides a plethora of gotchas for the unwary:

  • The Pr (IP protocol number),  SrcP (source port) and DstP columns are in hex, but we can all do the conversion in our heads, right? ;)
  • Netflow is a unidirectional technology. That means that if hosts A and B are talking to one another via a single TCP connection, two flows will be logged – one for A->B and one for B->A. For example, these two rows in the table above are talking about the same TCP session (the four-tuple of addresses and ports is the same for both rows):
Tu100     10.7.1.250      BV3       10.4.1.3        06 0051 C07A   663
BV3       10.4.1.3        Tu100     10.7.1.250      06 C07A 0051   366
  • Unless you configure it otherwise, Netflow is an ingress technology. This means that flows are accounted for as they enter the router, not as they leave. You can determine what happens on the egress side of things because when a flow is accounted for the output interface is determined by a FIB lookup and placed in the DstIf column; in this way, you can track a flow’s path through the router. I mention this explicitly because…
  • Netflow does not sit well with NAT. Take a look at these two rows, which represent an HTTP download (port 0×0050 is 80 in decimal) requested of non-local server zzz.zzz.zzz.zzz by client 10.4.1.3:
BV3       10.4.1.3        Fa4       zzz.zzz.zzz.zzz 06 C0A2 0050   429
Fa4       zzz.zzz.zzz.zzz BV3       yyy.yyy.yyy.yyy 06 0050 C0A2   763

So what’s yyy.yyy.yyy.yyy, then? It’s the NAT inside global address representing 10.4.1.3. As Netflow is unidirectional and is recorded as it enters an interface, the returning traffic from zzz.zzz.zzz.zzz will have the post-NAT yyy.yyy.yyy.yyy as its destination address, and will be recorded as such.

Provided that you keep that lot in mind, the flow cache is a powerful tool to explore the traffic your router is handling.

NAT translations

A typical border router may well perform NAT/PAT tasks. If so, you can use the NAT database as a source of session information. On a router, the command is “show ip nat translations [verbose]“; on a PIX/ASA, it’s “show xlate [debug]“:

router#show ip nat translations
Pro Inside global         Inside local   Outside local    Outside global
tcp yyy.yyy.yyy.yyy:49314 10.4.1.3:49314 94.42.37.14:80   94.42.37.14:80
tcp yyy.yyy.yyy.yyy:49316 10.4.1.3:49316 92.123.68.49:80  92.123.68.49:80

If you’ve got a worm on your network that’s desperately trying to spread, chances are you’ll see a ton of NAT translations (which could overwhelm a small router). Rather than paging through thousands of lines of output, you can just ask the device for some NAT statistics. On a router, it’s “show ip nat statistics”; on a PIX/ASA, it’s “show xlate count”.

Keeping tabs on the number of active NAT translations is a worthwhile thing to do. I wrote a story for Security Monkey’s blog a while back which tells the tale of a worm exhausting a router’s memory with NAT translations; you can even graph the number of translations to look for anomalies over time.

Firewall sessions

Another way of extracting session information is to ask the router or PIX about the sessions it is currently tracking for firewall purposes. On a router it’s “show ip inspect sessions [detail]“; on the PIX/ASA, it’s “show conn [detail]“.

router#show ip inspect sessions detail
Established Sessions
 Session 842064A4 (10.4.1.3:49446)=>(92.123.68.81:80) http SIS_OPEN
  Created 00:00:59, Last heard 00:00:58
  Bytes sent (initiator:responder) [440:4269]
  In  SID 92.123.68.81[80:80]=>y.y.y.y[49446:49446] on ACL outside-fw (6 matches)
 Session 84206FC4 (10.4.1.3:49443)=>(92.123.68.81:80) http SIS_OPEN
  Created 00:00:59, Last heard 00:00:59
  Bytes sent (initiator:responder) [440:2121]
  In  SID 92.123.68.81[80:80]=>y.y.y.y[49443:49443] on ACL outside-fw (4 matches)
 Session 8420728C (10.4.1.3:49436)=>(92.123.68.81:80) http SIS_OPEN
  Created 00:01:01, Last heard 00:00:50
  Bytes sent (initiator:responder) [1343:48649]
  In  SID 92.123.68.81[80:80]=>y.y.y.y[49436:49436] on ACL outside-fw (44 matches)

This has the advantage of not being complicated by NAT, but still showing useful bytecounts and session durations.

Last resorts

If none of the above can help you out, there are a couple of last resort options open to you. The first of these is the “ip accounting” interface configuration command on IOS routers. To quote Cisco:

The ip accounting command records the number of bytes (IP header and data) and packets switched through the system on a source and destination IP address basis. Only transit IP traffic is measured and only on an outbound basis; traffic generated by the router access server or terminating in this device is not included in the accounting statistics. Traffic coming from a remote site and transiting through a router is also recorded.

Also note that this command will likely have a performance impact on the router. You may end up causing more problems than you solve by using this! The output of “show ip accounting” will look something like this:

router# show ip accounting
 Source          Destination            Packets      Bytes
 172.16.19.40    192.168.67.20          7            306
 172.16.13.55    192.168.67.20          67           2749
 172.16.2.50     192.168.33.51          17           1111
 172.16.2.50     172.31.2.1             5            319
 172.16.2.50     172.31.1.2             463          30991
 172.16.19.40    172.16.2.1             4            262

If “ip accounting” was a last resort, “debug ip packet” is what you’d use as an even lasterer resort, so much so that I leave it as an exercise for the reader to find out all about it. Don’t blame me when your router chokes to the extent that you can’t even enter “undebug all”…!


Alec Waters is responsible for all things security at Dataline Software, and can be emailed at alec.waters@dataline.co.uk

Follow

Get every new post delivered to your Inbox.