My name’s Tracy, Packet Tracy. I’m a PI. It says so on my door. Last Monday morning began like every other with a fumble through the fog of a hangover looking for new and unexplained bruising. The pounding in my head was aided and abetted by the pounding on my door, and aspirin couldn’t fix both. I dragged myself towards the noise and let in a dame wearing stilletos and a scowl that could shatter concrete at fifty paces. She said her rat of a boyfriend had been flirting with some internet floozy, and all the proof was in the pcap on the USB stick that left another bruise after bouncing off my head. No problem, I said, come back Friday for chapter and verse.
The case was tougher than I thought; the pcap was bigger than my last bar tab and there’s no way I want to go over that line by line either. So I ran it past my Bro and a couple buddies of mine, Suri and Carter, but none of us could come up with the dirt. The dame’s been on my back all week for results, and now it’s Sunday and the pounding is back with a vengeance. She’s at the door, calling me names I’m pretty sure my mother never gave me. I considered leaving by the window until I remembered my office is in a basement and there aren’t any.
Time’s up. Gotta face the music…
So how come PT couldn’t come up with the goods? The evidence was there, surely it was just a matter of time before he found it?
I’m sure a good many of us have been in needle-in-a-haystack situations where we’ve had to rely on some tool or other to process a gigantic pcap. If the tool doesn’t find what we’re looking for, either it’s not there to find or the analyst or tool itself wasn’t up to the job.
So are all tools created equal? Given the same pcap and the same task, will they return the same results?
I thought I’d run a number of URL extraction tools against a standard 128MB pcap, the kind you might find on a Security Onion box, and I set up a quick deathmatch cage fight between httpry, urlsnarf, Suricata, tshark and Bro. All versions are the ones shipped with SO, and the config files haven’t been tweaked in any way from standard.
I realise that this is hardly a thorough and scholarly test, but I think that the results show a point worth making.
The tools were invoked like this:
user@sensor:~/httpexperiment$ httpry -o httpry.log -r capture.pcap httpry version 0.1.6 -- HTTP logging and information retrieval tool Copyright (c) 2005-2011 Jason Bittel <email@example.com> Writing output to file: httpry.log 5230 http packets parsed user@sensor:~/httpexperiment$ urlsnarf -n -p capture.pcap > urlsnarf.log urlsnarf: using capture.pcap [tcp port 80 or port 8080 or port 3128] user@sensor:~/httpexperiment$ suricata -c /etc/nsm/sensor-bond0/suricata.yaml -r capture.pcap <tons of output> 4/4/2012 -- 15:35:58 - <Info> - HTTP logger logged 2206 requests user@sensor:~/httpexperiment$ tshark -r capture.pcap -R http.request -T fields -e frame.time -e ip.src -e ip.dst -e tcp.srcport -e tcp.dstport -e http.request.method -e http.host -e http.request.uri -e http.request.version > tshark.log user@sensor:~/httpexperiment$ bro -r capture.pcap
Now, we need to process httpry’s output a little, since it includes server responses as well as client requests. Once done, we can see how many HTTP requests were seen by each tool. The results were a little surprising to begin with:
httpry: 2567 requests seen
tshark: 2557 requests seen
Bro: 2240 requests seen
Suricata: 2206 requests seen
urlsnarf: 2198 requests seen
httpry and urlsnarf are often (probably quite rightly) viewed as less sophisticated than Suricata and Bro, and tshark might not be the tool of choice for this kind of work, especially when you’re constantly monitoring traffic on an interface rather than processing a pcap. Bro and Suricata certainly picked up things that httpry didn’t (HTTP on non-standard ports, for example), but there’s quite a gulf between the apparently less-sophisticated and the cutting-edge.
So why? Each tool does the job in a different way, and with its own set of pros and cons. httpry, for example, only listens on ports 80 and 8080 by default, has no support for VLAN-tagged frames, and performs no TCP session reassembly so it is vulnerable to evasion. On the other hand, the more sophisticated Bro and Suricata have, to my knowledge, no such shortcomings.
However, it’s httpry’s brute simplicity that I suspect put it on top of the pile for this particular test. All it looks for is one of a nominated set of HTTP method verbs at the beginning of a TCP segment. It doesn’t care about proper TCP session establishment or teardown, all it’s looking for are a few bytes in the proper place. In contrast, other tools may not log any URLs at all if the TCP session isn’t complete and well-formed.
Each tool’s results were a subset of the others’ collective results – no one tool captured the superset, and one could probably achieve “better” (or at very least, “different”) results by tweaking each tool. Even then I reckon there would still be a pretty good chance that there are undiscovered URLs lying there that a different tool could pick up.
I don’t mean for this to be seen as some kind of ranking of tools, or a statement that httpry and tshark are “better” URL extractors than Bro and Suricata; all I’m trying to say is that you’re almost certain to miss something by only using a single tool for a job like this. On the flip side of course, running two or more URL extractors might seem like a galactic waste of resources…
Your mileage will almost certainly vary, but it’s worth keeping this in mind when the dame is at the door demanding answers!
Alec Waters is responsible for all things security at Dataline Software, and can be emailed at firstname.lastname@example.org