Friday, November 8, 2013

Handy Tshark Expressions

Tshark is the CLI version of Wireshark, and it's amazing. I'm going to start collecting some of my favorite tshark one-liners here. Check back often.

Find All Unique Filenames Referenced in SMB2
tshark -r file.pcap -Tfields -e ip.src -e ip.dst -e text -Y "smb2" | grep -oP "GUID handle File: .*?," | sort | uniq | awk -F: '{print $2}' | sed 's/,//'

You don't actually need to include the ip.src and ip.dst fields, since they're not extracted by the grep command. I include them in case I want to do an ad-hoc grep for an IP address during the analysis process. Another way to do the same thing would be to modify the display filter to look only for certain addresses, e.g.:

tshark -r file.pcap -Tfields -e text -Y "smb2 and ip.addr==" | grep -oP "GUID handle File: .*?," | sort | uniq | awk -F: '{print $2}' | sed 's/,//'
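To see what the grep/awk/sed stages are doing without a capture file handy, here's the same extraction pipeline run over a couple of hand-made sample lines. The filenames and the duplicate line are fabricated stand-ins for real tshark "-e text" output:

```shell
# Feed fake SMB2 text lines through the extraction stages of the one-liner
printf '%s\n%s\n%s\n' \
  'Text: GUID handle File: budget.xlsx, more fields' \
  'Text: GUID handle File: notes.txt, more fields' \
  'Text: GUID handle File: budget.xlsx, more fields' |
grep -oP "GUID handle File: .*?," | sort | uniq |
awk -F: '{print $2}' | sed 's/,//'
# prints (note the leading space left behind by the awk split on ":"):
#  budget.xlsx
#  notes.txt
```

The duplicate budget.xlsx line is collapsed by sort/uniq, which is the whole point of the exercise.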

Friday, November 1, 2013

How to Tell if TCP Payloads Are Identical

I was working on a problem today in which vendor tech support was suggesting that a firewall was subtly modifying TCP data payloads. I couldn't find any suggestion of this in the firewall logs, but given that I've seen that vendor's firewall logs lie egregiously in the past, I wanted to verify it independently.

I took a packet capture from both hosts involved in the conversation and started thinking about how to see if the data sent by the server was the same as the data received by the client. I couldn't just compare the capture files themselves, because elements like timestamps, TTLs, and IP checksums would be different.

After a bunch of fiddling around, I came up with the idea of using tshark to extract the TCP payloads for each stream in the capture file and hash the results. If the hashes matched, the TCP payloads were being transferred unmodified. Here are the shell commands to do this:

tshark -r server.pcap -T fields -e tcp.stream | sort -u | sed 's/\r//' | xargs -i tshark -r server.pcap -q -z follow,tcp,raw,{} | md5sum
2cfe2dbb5f6220f29ff8aff82f7f68f5 *-

You then run exactly the same commands on the "client.pcap" file and compare the resulting hashes. Let's break this down a bit more:

tshark -r server.pcap -T fields -e tcp.stream

This invokes tshark to read the "server.pcap" file and output the TCP stream index of each packet. The output is just a long series of integers, one per packet.

The next command, sort -u, produces a logical set of the unique (hence the "-u") stream indexes. In other words, it removes duplicates from the previous list. Not all Unix-like operating systems have the "sort -u" option; if yours is missing it, you can use "| sort | uniq" instead.
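A quick way to convince yourself the two forms are equivalent, using shell-generated sample data in place of the tshark stream-index output:

```shell
# A duplicate-heavy stand-in for the stream index list
printf '2\n0\n1\n0\n2\n2\n' > /tmp/streams.txt
sort -u /tmp/streams.txt        # prints 0, 1, 2, one per line
sort /tmp/streams.txt | uniq    # identical result
```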

Next, sed 's/\r//' removes the line break from the end of the resulting stream indexes. If you don't do this, you'll get an error from the next command.

The next one's a bit of a doozy: xargs -i takes each stream index (remember, these are just integers) and executes the tshark -r server.pcap -q -z follow,tcp,raw,{} command once for each stream index, substituting the input stream index for the {} characters.
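The substitution behavior is easy to demo without tshark; here seq stands in for the stream-index list, and the echoed command is just illustrative text:

```shell
# Each input line is substituted for {} and the command runs once per line.
# Modern GNU xargs spells the option -I{}; -i is the older equivalent form.
seq 0 2 | xargs -I{} echo "would run: tshark ... -z follow,tcp,raw,{}"
```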

The tshark -r server.pcap -q -z follow,tcp,raw,{} command itself reads the capture file a second time, running the familiar "Follow TCP Stream in Raw Format" command from Wireshark on the specified TCP stream index that replaces the {} characters. If you're rusty on Wireshark, "Follow TCP Stream" just dumps the TCP payload data in one of a variety of formats, such as "raw" or ASCII. If you've never used this option in Wireshark, make sure you try it today!

The final command, md5sum, computes an MD5 hash of the preceding input.

To summarize, we've done this: taken a file, extracted all the raw TCP data payloads from its packets (without headers), and hashed the data with MD5. If we do this on two files and the hashes are the same, we know they contain exactly the same TCP data (barring the infinitesimally small probability of an MD5 hash collision).
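The verify-by-hash idea works on any two byte streams, not just pcaps. A toy version with plain files (the HTTP request text is just arbitrary sample payload):

```shell
# Identical byte streams hash identically; a one-byte change breaks the match
printf 'GET / HTTP/1.1\r\n' > /tmp/server_payload
printf 'GET / HTTP/1.1\r\n' > /tmp/client_payload
md5sum /tmp/server_payload /tmp/client_payload   # same hash twice
printf 'GET /x HTTP/1.1\r\n' > /tmp/client_payload
md5sum /tmp/client_payload                       # different hash
```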

In my case, both capture files produced the same hash, proving that the firewall was (for once) playing nice.

Wednesday, October 16, 2013

Java is to JavaScript as Car is to Carpet - a Beginner's Guide

Some recent discussions at work have led me to the surprising realization that lots of people working in IT don't understand that Java and JavaScript are almost completely unrelated to each other. This is actually a fairly important misunderstanding to correct:  it leads to wasted troubleshooting efforts, such as downgrading or upgrading Windows Java installations in response to browser JavaScript errors.

I found the title of this blog entry in a StackOverflow post: "Java is to JavaScript as Car is to Carpet". That's pretty much it, in a nutshell. For the record, the only things that Java and JavaScript have in common are:
  1. They are both programming languages.
  2. The word "Java".
  3. Both came out of the web technology explosion of the mid-1990s.
  4. Both are frequently encountered in the context of web browsers.
Java is a compiled programming language that was originally developed with a major goal of allowing similar or identical codebases to run on different platforms without needing to be recompiled. It does this by compiling to "bytecode" rather than platform-specific machine code; the bytecode then typically runs inside a so-called "Java Virtual Machine". Java was originally developed and controlled by Sun Microsystems (now Oracle), but it has since been re-licensed under the GNU General Public License. Numerous open-source Java implementations now exist, but the Oracle/Sun version is still the most familiar to the average user.

Java is associated with the web browser experience because of the widespread use of Java "applets" that are embedded in browser windows. Applets are not technically part of the browser; the compiled Java bytecode is downloaded by the browser and executed in a Java Virtual Machine (JVM) as a separate process. Applets are frequently transferred as a compressed "Java archive", or JAR file. Applets downloaded by a browser do not necessarily need to run in a browser window, but the fact that they are frequently embedded there leads to some confusion.

Nor is Java necessarily a client-side technology: many popular server-side applications are written in Java and execute in a server-side JVM. Google's Android platform extends things even further, using Java as the programming language but compiling the bytecode to run on its own Dalvik virtual machine.

JavaScript, on the other hand, is an interpreted (i.e., non-compiled) programming language that was originally developed to run inside web browsers. It was developed at Netscape, was later adopted by Microsoft, and was eventually standardized by Ecma International as "ECMAScript". The use of "Java" in the name "JavaScript" was probably an attempt to piggyback on the popularity of Java; the two languages have almost nothing in common from a technical perspective.

JavaScript is most frequently used to control the web browser experience, but there are many projects that use JavaScript completely outside the browser. My first experience with this dates back to the late 1990s, when I used a JavaScript-based commercial tool to automate software deployments to Windows workstations. Today, there are many interesting non-browser-embedded JavaScript platforms, such as Node.js and PhantomJS.

Friday, August 2, 2013

Understanding Flow Export Terminology

The variety of terms used to describe network flow export technologies and components can be pretty confusing. Just last year I wrote a post on web usage tracking and NetFlow that is already a bit obsolete, so here's an attempt to explain some of the newer terms and capabilities in use today.

NetFlow Version 5
NetFlow v5 is sort of the least common denominator in flow technologies. Almost all vendors and devices that support a flow export technology will do NetFlow v5. Because it's only capable of exporting information about packet fields up to layer 4, however, it's not flexible enough to use for analytics that require information about the application layer. NetFlow v5 tracks only the following data:
  • Source interface
  • Source and destination IP address
  • Source and destination port
  • Layer 4 protocol
  • TCP flags
  • Type of Service
  • Egress interface
  • Packet count
  • Byte count
  • BGP origin AS
  • BGP peer AS
  • IP next hop
  • Source netmask
  • Destination netmask
NetFlow Version 9
NetFlow v9 was Cisco's first attempt at defining an extensible flow export format, specified in RFC 3954 back in 2004. It provides a flexible format for building customizable flow export records that contain a wide variety of information types. Many of the goals for flexible flow export were defined in RFC 3917:
  •  Usage-based accounting
  • Traffic profiling
  • Traffic engineering
  • Attack/Intrusion Detection
  • QoS monitoring
The RFC defines 79 field types that may be exported in NetFlow v9 packets, and directs the reader to the Cisco website for further field types. The latest document I could find there defines 104 field types, several of which are reserved for vendor proprietary use and some of which are reserved for Cisco use.

IPFIX
IPFIX is the IETF standard for extensible flow export. The basic protocol is specified in RFC 5101, but details are included in many other RFCs (Wikipedia has a partial list). IPFIX is based directly on NetFlow v9 and is generally interoperable with it, but since it's an open standard it is extensible without Cisco involvement. Hundreds of field types are defined in the IANA IPFIX documentation.

RFC 6759 defines an extension of IPFIX to include application-specific information in IPFIX export packets. This allows deep-packet-inspection technologies (such as Cisco's NBAR) to send information about non-standardized, tunneled, or encrypted application layer protocols to IPFIX collectors.

IPFIX is being used by various vendors (Plixer, Lancope, and nProbe/nTop come to mind) to export HTTP header data, making it capable of being used as a web usage tracker or web forensics tool with the appropriate collector/analyzer software.

Flexible NetFlow
As far as I can tell, Flexible NetFlow is a marketing term used by Cisco to encompass everything about their approach to configuring and implementing NetFlow v9 and IPFIX.

NSEL (NetFlow Security Event Logging)
NSEL is a proprietary extension of NetFlow v9 used by Cisco's ASA firewalls to export firewall log data. It's not clear to me why Cisco didn't use IPFIX for this purpose.

Cisco AVC (Application Visibility and Control)
AVC is another Cisco marketing term that encompasses a variety of technologies surrounding the DPI and application-based routing capabilities in its routers, such as IPFIX, NetFlow v9, NBAR, PfR, ART (Application Response Time), and more.

Other Vendors
As mentioned above, most network technology vendors support NetFlow v5 and/or v9. IPFIX support is now becoming very common. Some vendors use proprietary extensions of NetFlow v9; Riverbed's CascadeFlow is one example of this.

In a followup post, I'll take a look at some tools that produce flow data without using export technologies.

Monday, May 6, 2013

Convert Hex to Decimal in IOS

Lots of IOS commands produce output in hex that I sometimes want to convert to decimal. Common ones for me are stuff like:

show ip cache flow
show ip flow top-talkers

and various debug commands. For example:

Router#sh ip cache flow | i Fa0/1.6
Fa0/1.6   Tu101*  06 0DA7 6D61  345

I have no idea what the port numbers shown in hex (0DA7 and 6D61) are. Fortunately, if the IOS device has the Tcl or Bash shells available, we can quickly convert them.

Method 1: Tcl Shell

Most routers have the Tcl shell available:

Router(tcl)#puts [expr 0xda7]
3495
Router(tcl)#puts [expr 0x6d61]
28001

You could write a callable Tcl script to make this permanently available from normal EXEC mode too.

Method 2: Bash Shell

The Bash shell came out in one of the early IOS 15.0 versions, so you may or may not have it available. You need to explicitly enable it by entering "shell processing full" in global configuration mode.

Router#sh run | i shell
shell processing full  <-- required to enable Bash in IOS 15+
Router#printf "%d" 0xda7
3495
Router#printf "%d" 0x6d61
28001
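If you've copied the hex values off the router and have an ordinary Unix shell in front of you, the same conversion works there too:

```shell
# printf understands 0x-prefixed hex and reuses the format for each argument
printf '%d\n' 0xda7 0x6d61
# prints:
# 3495
# 28001
```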

Friday, April 12, 2013

Building a Ghetto WAN Emulation Network

I wanted a way to do some controlled tests of WAN acceleration products, using a production network. You can buy or rent commercial WAN emulators, but for my purposes it seemed like an improvised solution would suffice. I had a couple of Cisco 2800 routers, a switch, and an ESXi box in my lab that I could press into service, so I built a test network that looks like this:

R1 acts like the WAN router at a branch site. It has a QoS policy with a "shape average" statement on its "WAN" interface to change the bandwidth to whatever we want to test.

R2 simply NATs the test traffic onto an IP address in the production network, since I didn't feel like configuring a new production subnet just for the test.

The ESXi box is where the fun part lives: I created two vSwitches and connected one physical NIC to each. I then spun up a simple Ubuntu 12.04 VM with eth0 and eth1 connected to each of the two vSwitches, giving me a separate network connected to each Cisco router. I then enabled routing on the Linux VM and created the appropriate static routes to enable the test and production networks to communicate. Finally, I used the "netem" WAN emulator built into the Linux kernel to inject delay, jitter, and packet loss into the network. Voila -- a network de-optimizer!

For testing the WAN accelerators, we'll just install one in the test VLAN and one between the Linux router and R2.

Here are the basic steps required to set up the Linux de-optimizer, in case you want to try to build your own:

1) Basic Ubuntu 12.04 VM. I used 4GB RAM and 24GB disk, but you could get away with less.

2) Install an SSH server (sudo apt-get install openssh-server) so you can still get to the VM from the test network when you break the rest of the test environment. Do this before you break something... ask me how I know. Don't forget to enable SSH on your lab routers too...

3) Turn off Network Manager so it doesn't mess with your static addressing and routing config: edit the file /etc/NetworkManager/NetworkManager.conf and make sure the [ifupdown] section reads "managed=false".

4) Configure static addressing on your two NICs by editing /etc/network/interfaces:

auto eth0
iface eth0 inet static
auto eth1
iface eth1 inet static
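
The address lines from my lab aren't shown above; a complete stanza looks something like the following, where all the 10.x addresses are made-up placeholders rather than the ones I actually used:

```
auto eth0
iface eth0 inet static
    address
    netmask

auto eth1
iface eth1 inet static
    address
    netmask
```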

5) Reboot or restart networking.

6) Enable IPv4 routing:

sudo sysctl -w net.ipv4.ip_forward=1

7) Configure static routes for the production and test networks:

sudo route add -net netmask gw
sudo route add -net netmask gw 

In this case, the networks involved are the one connecting to R1, the one connecting to R2, and the test VLAN.

You may also need to delete other routes if they were autoconfigured.

8) After testing that everything works, add some latency:

sudo tc qdisc add dev eth0 root netem delay 50ms 5ms

Do a search for "Linux netem" for a wider array of commands to change delay, jitter, and packet loss.
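For reference, here are a few netem variations I find useful; the delay values and loss percentage are arbitrary examples, not recommendations:

```
# replace the existing qdisc instead of stacking a second one
sudo tc qdisc change dev eth0 root netem delay 100ms 10ms
# add 1% random packet loss on top of the delay
sudo tc qdisc change dev eth0 root netem delay 100ms 10ms loss 1%
# remove the emulation entirely
sudo tc qdisc del dev eth0 root
```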

With this setup, the routing configuration and WAN emulation settings will NOT persist after a reboot, so you can always reboot if you screw something up. Start over at step 6.

Friday, March 29, 2013

Friday Distraction: Who's Leaking >/24 to Global BGP?

[It occurred to me after finishing this that I should have done everything based on ASN, but play time is over for the day...]

An interesting conversation with my friend @denise_donohue led to this question: what providers are leaking prefixes longer than /24 to the global Internet?

Following my continuing theme of "fun stuff you can do by combining IOS and Bash", I ran a two step process via one of my BGP routers to get the answer:

$ ssh routername 'show ip bgp prefix-list GT24' > /tmp/gt24.txt

$ grep "^*" /tmp/gt24.txt | awk '{print $1}' | sed 's/\*>i//g' | awk -F. '{OFS=".";print $1,$2 ".0.0"}'  | sort -u | xargs -i whois {} | grep netname | sort -u

Here's the breakdown:

Extract just the valid BGP prefixes from the router output:

grep "^*" /tmp/gt24.txt | awk '{print $1}'

Extract the prefix itself by stripping the "*>i" marker, substitute ".0.0" for the last two octets to normalize to the parent /16, and remove duplicates:

| sed 's/\*>i//g' | awk -F. '{OFS=".";print $1,$2 ".0.0"}' | sort -u

Send those prefixes one-by-one to the "whois" command, extract the "netname" field, and remove duplicates again:

| xargs -i whois {} | grep netname | sort -u

Note that this takes a while to run because of the time it takes the Whois server to respond.
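The normalization stage is the fiddly part, so here it is run over a few fabricated table lines (the prefixes are documentation-range examples, not real leaks):

```shell
# Simulated "show ip bgp" lines: valid routes start with "*", and the
# router fuses the "*>i" status codes onto the prefix in column one
printf '*>i198.51.100.0/25 ...\n*>i198.51.64.0/26 ...\n*>i203.0.113.128/25 ...\n' |
grep "^*" | awk '{print $1}' | sed 's/\*>i//g' |
awk -F. '{OFS=".";print $1,$2 ".0.0"}' | sort -u
# prints:
# 198.51.0.0
# 203.0.0.0
```

The two 198.51.x prefixes collapse to a single parent /16, which is exactly the de-duplication the one-liner relies on before calling whois.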

The prefix-list that I used to get the output from the first step is as follows:

ip prefix-list GT24 permit ge 25

Note that I used this as an argument to "show ip bgp", not as part of the config!

Now, this obviously isn't entirely accurate, because it only shows the providers that are leaking long prefixes that aren't being filtered by any of my providers, but it's interesting. I also searched based only on the parent /16, so there could be lower-level providers that I'm missing.

Some of them are clearly the same provider tagged with different whois records (e.g., "TBROAD" and "TBROAD-KR").

netname:        AFRINIC-NET-TRANSFERRED-200909

netname:        ASI

netname:        ASIANET

netname:        AquaRaySARL-2

netname:        BIDCMain
(about 100 more)

etc. Run it yourself to get the full list from your router's perspective!

Saturday, March 9, 2013

Quick Tip: Improvised File Transfer

The Python module "SimpleHTTPServer" is traditionally a quick and dirty way to test web code without the overhead of installing a full webserver. You can also use it as a quick way to transfer files between systems, if you have Python available:

jswan@ubuntu:~$ python -m SimpleHTTPServer 8080
Serving HTTP on port 8080 ...

This causes Python to make all the files in the current directory available over HTTP on port 8080. From the client system, you can then use a browser, curl, or wget to transfer the file.
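On Python 3 the module was renamed to http.server, but the trick is the same. Here's an end-to-end sketch from a single shell; the port, directory, and file contents are arbitrary:

```shell
# serve a scratch directory in the background, fetch a file back, clean up
mkdir -p /tmp/httpdemo && echo 'hello from the server' > /tmp/httpdemo/f.txt
cd /tmp/httpdemo
python3 -m http.server 8080 >/dev/null 2>&1 &
SRV_PID=$!
sleep 1
# fetch the file over HTTP (a browser, curl, or wget would work equally well)
python3 -c 'import urllib.request; print(urllib.request.urlopen("http://127.0.0.1:8080/f.txt").read().decode(), end="")'
kill $SRV_PID
```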

No, it's not secure, and yes, it may violate data exfiltration policies (and as an aside, Bro detects this by default). But I've used it fairly often to move files between Windows and Mac or Linux in situations where SCP isn't available and I don't feel like setting up a fileshare.

I've also used this in conjunction with the "time" command to test the effects of latency on various network protocols, and as an improvised way to test WAN optimization software.

Sunday, March 3, 2013

An Operational Reason for Knowing Trivia

I've been largely out of touch with the IT certification scene lately, but I'm sure that people are still complaining incessantly about the fact that they need to memorize "trivia" in order to pass certification tests. Back when I was teaching Cisco classes full-time, my certification-oriented students were particularly bitter about this. Of course, this is a legitimate debate and the definition of "trivia" varies from person to person.

When I saw this article about CloudFlare's world-wide router meltdown, however, I immediately felt a bit smug about all those hours spent learning and teaching about packet-level trivia. If you don't want to read the article, here's the tl;dr:

  • their automated DDoS detection tool detected an attack against a customer using packets sized in the 99,000 byte range.
  • their ops staff pushed rules to their routers to drop those packets
  • their routers crashed and burned
So at this point you should be saying what some of the commenters did: huh? IP packets have a maximum size of 65,535 bytes, because the total length field is 16 bits long.
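The field arithmetic is easy to check in any shell:

```shell
# largest value a 16-bit total-length field can represent
echo $(( (1 << 16) - 1 ))   # prints 65535
```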

In order for this meltdown to happen, they had to have a compounded series of errors:

  • the attack detection tool was coded to allow detection of packet sizes that can't actually occur: no bounds checking.
  • the ops staff didn't retain the "trivia" that they learned in Networking 101, and thus couldn't see the problem with the output generated by the detection tool.
  • the router OS didn't do input validation, and blew up when attempting to configure itself to do something crazy.
There's lots of blame to go around here, and my intent isn't to add to that; rather, my point is to tell everybody who dutifully memorized and retained stuff like the maximum IP packet size to feel good about yourself for a few minutes! And next time you write networking code: do input validation and bounds checking.

Sunday, January 27, 2013

Baby Bro, Part 3: Containers and Loops

Bro has four main container types, which I'm going to cover in somewhat nontraditional order:
  • tables
  • sets
  • vectors
  • records
A table is a collection of indexed key-value pairs: the same idea is referred to as a dictionary, associative array, or hash table in other languages. Here's a simple example that pairs letters with their place in the alphabet:

event bro_init()
    {
    local letters = table([1] = "a", [2] = "b", [3] = "c");
    print letters;
    }

Running it, we get this:

jswan@so12a:~/bro$ bro tables.bro
[3] = c,
[1] = a,
[2] = b

Note that the output isn't in the same order as the script; in Bro, as in most other languages, hash tables are unordered.

Iterating over a table with a "for" loop returns the key, again like other languages:

event bro_init()
    {
    local letters = table([1] = "a", [2] = "b", [3] = "c");
    for (key in letters)
        print letters[key];
    }

And the output:

jswan@so12a:~/bro$ bro tables.bro

Because we printed only the value associated with the key, we never see the key in the output.

It's common in programming to need a container that holds a collection of objects while silently discarding duplicates. This is the mathematical notion of a set, and Bro has a native set type. Consider an example where you have a large list of IP addresses that you got from some other source: an intel feed, a firewall log, a web server log, etc. You want the unique set of addresses that appeared, but you don't care how many times each one appeared. This is the perfect use for a set:

event bro_init()
    {
    local a1 =;
    local a2 =;
    local a3 =;
    local a4 = [fe80::abcd:1];
    local a5 =;
    local a6 =;
    local unique = set(a1,a2,a3,a4,a5,a6);
    print unique;
    }

And the output:

jswan@so12a:~/bro$ bro sets.bro

Note that the repeated address appears only once in the set, despite having been assigned to three different variables.

A vector is Bro's version of a one-dimensional array. Having spent most of my recent programming time in Python, I was surprised to find that Bro vectors work like hash tables for iteration: when you loop over a vector, you get the index into the vector rather than the object itself. Here's an example:

event bro_init()
    {
    local animals = vector("cat","dog","dinosaur","rat");
    for (animal in animals)
        print animal;
    }

The output:
jswan@so12a:~/bro$ bro vectors.bro
0
1
2
3

If you want to iterate a vector and get the object, you have to specify the index:

event bro_init()
    {
    local animals = vector("cat","dog","dinosaur","rat");
    for (index in animals)
        print animals[index];
    }

jswan@so12a:~/bro$ bro vectors.bro
cat
dog
dinosaur
rat

Bro's vectors work pretty much exactly like a table with "counts" as keys (a count is yet another native Bro type that we haven't discussed yet; it's the same as an integer except that it's unsigned, so it can't be negative). In fact, some of the earlier Bro documentation doesn't even show vectors as valid types, so I wonder if they are actually implemented internally as tables.

The last Bro container type is the record, which is sort of the meat and potatoes of Bro's wonderful logging tools. Records are discussed in detail in most of the other beginner-Bro material out there, so I'm not going to cover them here.

This will probably be the last "Baby Bro" post that I do with just the raw language features demonstrated inside the bro_init() event. Any further Bro posts will probably be using Bro in its intended context. Hope this was helpful!

Thursday, January 24, 2013

Cisco Ironport WSA with WCCP and IP Spoofing

Recently I had to set up a transparent proxy with the Cisco Ironport Web Security Appliance (WSA) using WCCP on a Catalyst 6500 with a Sup720, with IP spoofing and web cache ACLs enabled. Like with many technologies, this turned out to be pretty simple but I couldn't find it documented all in one place. Perfect blog fodder!

The network topology looked like this (simplified, but not by much):

Normally when you set up a transparent proxy with WCCP, the IP address of the proxy server is used as the source of the HTTP requests. The problem in this topology is that I wanted the real source address of the client to appear in the firewall logs. The IP spoofing feature on the WSA allows this to happen, but it requires configuring bidirectional WCCP redirection on the Cat6k. If this had been a Cisco ASA firewall, we could have enabled WCCP there and saved some trouble, but in this case the network was using a firewall from another vendor that didn't support WCCP.

One important thing to realize about WCCP on the Catalyst 6500 with the Sup720 is that WCCP egress redirection is done with software switching rather than in hardware, so if you find yourself wanting to use the command "ip wccp redirect out", you're virtually guaranteeing that you're going to redline the CPU on your supervisor. Thus, we want to do only ingress redirection.

The 6500 configuration is as follows:

! this ACL prevents web traffic to internal servers (not shown in diagram)
! from being inspected
ip access-list extended ACL_WCCP
 deny   ip any
 deny   ip any
 permit ip any any

! define the web cache, referencing the ACL above 
! this WCCP service handles only standard HTTP traffic
ip wccp web-cache redirect-list ACL_WCCP
! a second WCCP service is used for reverse-path redirection, which
! is required for IP spoofing to work
ip wccp 90

interface Vlan 10
 description to client networks
 ip wccp web-cache redirect in

interface Vlan 30
 description to firewall cluster 
 ! WCCP service group 90 is applied inbound on the return path
 ! from the Internet so that IP spoofing will work
 ip wccp 90 redirect in

If you have multiple layer 3 interfaces going to client networks, you need to enable WCCP on all of them if reverse-path redirection is enabled. If we weren't using reverse redirection (i.e., if we weren't using IP spoofing), this wouldn't be the case: we could simply leave WCCP disabled on interfaces whose traffic shouldn't be proxied. With reverse redirection, though, the return traffic is always sent to the proxy server; if the proxy server sees the return traffic but not the egress traffic, it gets dropped. If you need to use IP spoofing and still have certain types of web traffic bypass the proxy, you would need to do this with ACLs applied to the WCCP service, rather than simply by leaving WCCP disabled on certain interfaces.

On the WSA, you create two WCCPv2 services under the Network-->Transparent Redirection menu, one with service group 0 (the default), and one with service group 90 (matching the IOS configuration above). On group 90, you enable "redirect based on source port". For both groups, you enter the IP address of the SVI as the upstream router address (the IP address of VLAN 20 in this case). The WCCP service on the WSA then registers automatically with the Sup720.
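To confirm on the 6500 side that the WSA actually registered with both service groups, the standard WCCP show commands are handy (output omitted here):

```
! check registration status and redirect counters for each service
show ip wccp
show ip wccp web-cache detail
show ip wccp 90 detail
```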

Under the Security Services--> Web Proxy Settings menu, you enable transparent mode with IP spoofing for transparent connections only.

That's it: now you have a transparent proxy, with IP spoofing so that your firewall logs show accurate client IP addresses. Handling HTTPS traffic or HTTP traffic on non-standard ports is beyond the scope of this post.

Saturday, January 19, 2013

Baby Bro, Part 2: Conditionals, Address Types

Bro has native types for addresses and networks, making it much easier to work with network data. Today's Baby Bro script shows global variable definition, the use of the address and subnet types, and a simple conditional:

# declaring global variables
# no need to put quotes around addr or subnet variable definitions
global ipv4_host:addr =; 
global ipv4_net:subnet =;
event bro_init()
    {
    if (ipv4_host in ipv4_net)
        # addr and subnet types are autoconverted to strings with fmt
        print fmt("%s is in network %s",ipv4_host,ipv4_net);
    else
        print fmt("host %s is not in network %s",ipv4_host,ipv4_net);
    }

Running this from the CLI, we get the expected output:

jswan@so12a:~/bro$ bro addr_net_types.bro
is in network

Bro also has several interesting built-in functions for working with network data that we'll explore in upcoming posts. For now, we'll take a look at the mask_addr function, which allows you to use Bro as an improvised subnet calculator. You can run a Bro micro-script from the CLI with the -e option, just like the -e flag in Perl or the -c flag in Python:

jswan@so12a:~/bro$ bro -e "print mask_addr(,14);"
jswan@so12a:~/bro$ bro -e "print mask_addr(,31);"

Great for those late-night subnetting sessions after too many microbrews!
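If you don't have Bro handy, the Python 3 standard library does the same mask calculation from the shell; the addresses below are arbitrary documentation-range examples standing in for whatever you'd feed mask_addr:

```shell
# ipaddress has been in the Python 3 stdlib since 3.3; strict=False lets
# you pass a host address and get its containing network back
python3 -c 'import ipaddress; print(ipaddress.ip_network("", strict=False).network_address)'
# prints:
python3 -c 'import ipaddress; print(ipaddress.ip_network("", strict=False).network_address)'
# prints:
```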

Just in case you were wondering: all of this works natively for IPv6, with some changes to the syntax:

jswan@so12a:~/bro$ bro -e "print [fe80::1db9] in [fe80::]/64;"
T
# T is the way Bro outputs "True" in a Boolean test

We'll look at some more IPv6 stuff in an upcoming post.

Tuesday, January 15, 2013

Baby Bro, Part 1: Functions Etc.

[Note: Blogger seems to have done something nasty to my new blog template, so it's back to the old one at least temporarily]

Here's my first "Baby Bro" post. Before getting into using Bro scripting for its intended use of network traffic analysis, I wanted to figure out how to accomplish basic tasks common to most programming languages:

  • Functions
  • Common types and variable definitions
  • Loops
  • Conditionals
  • Iteration of container types
  • Basic string and arithmetic operations
This is the kind of stuff that many programmers can figure out instantly by looking at a language reference sheet, but I think it helps the rest of us to have explicit examples.

I'm not sure if I'll get through all of them in this series, but here's a start: a main dish of functions, with a side of string formatting and concatenation.

# "add_one" is the function name
# (received_value:int) is the parameter and its type
# the final "int" is the type returned by the return statement
function add_one(received_value:int): int
    {
    local returned_value = received_value + 1;
    return returned_value;
    }

# this function shows two strings passed in, returning a string
function concat(a:string,b:string): string
    {
    return a + " " + b; # one way of doing string concatenation
    }

event bro_init() # bro_init() fires when Bro starts running
    {
    local x = 3; # defining a local variable
    local y = add_one(x); # using the first function defined above
    print fmt("%d + 1 = %d",x,y); # formatted printing as in printf
    print concat("first","second"); # using the second function defined above
    }

I think this is fairly self explanatory, given the comments. We have two functions:

  • add_one: adds one to whatever integer is passed into the function, and returns the resulting integer.
  • concat: concatenates two strings, separated by a space, and returns the result. There is a built-in string function for this, but I wanted to show that you can also do it with "+".
I also show local variable definition (Bro also has globals, defined with the global keyword) and string formatting. String formatting is basically the same as printf in other languages.

We can run this from the CLI with no PCAP ingestion just to get the standard output:

jswan@so12a:~/bro$ bro test.bro
3 + 1 = 4
first second

Monday, January 14, 2013

Basic Bro Language References

Finding simple examples of Bro language features is somewhat difficult: the scripts that come packaged with Bro are written by experts in the language and are quite idiomatic.

Here are some of the basic Bro language references I've found so far. In upcoming blog posts, I'll show some "Baby Bro" that is even more basic than these examples.

From Ryesecurity by Scott Runnels (@srunnels):
Solving Network Forensic Challenges with Bro, Part 1
Solving Network Forensic Challenges with Bro, Part 2
Solving Network Forensic Challenges with Bro, Part 3
Logging YouTube Titles with Bro

Justin Azoff's (@JustinAzoff) Bro Presentation on Github

The Official Bro Workshop 2011 Pages

The Bro Language Cheat Sheet

Sunday, January 13, 2013


Yee-haw! New blog template!

Trying to figure out how to do source code highlighting in a way that doesn't suck or rely on external JavaScript hosting.

Friday, January 11, 2013

"Hello World" in Bro IDS

One of the reasons I don't blog that much is that I generally assume that everything worth blogging has already been done, and that everyone reading is probably smarter than me and doesn't need me to explain things. I'm going to pretend that those are non-issues and try to blog more, no matter how basic the topic. 

I just got back from FloCon 2013, which was quite interesting. The highlight for me was some informal after-hours knowledge-dumps from Seth Hall (@remor) and Liam Randall (@hectaman) on the subject of Bro (@Bro_IDS).

Before these mini-lessons, I didn't really have a good idea of how to get started with Bro scripting. Now I do!

The stuff we covered was actually a lot more complex than Hello World, but in the spirit of beginning coders everywhere, here's how you do "Hello World" in Bro (and a little more):

ubuntu@ip-10-73-25-224:~/bro$ cat hello.bro
event bro_init()
    {
    print "Hello World!";
    }

event bro_done()
    {
    print "Goodbye World!";
    }

The fundamental idea behind Bro is that it's a scripting language that responds to events that are derived from packet streams. When Bro is monitoring a raw packet feed or ingesting a pcap file, it fires events whenever something interesting happens: FTP sessions, HTTP sessions, SSH sessions, etc.

The script above responds to the two simplest Bro events: the startup and shutdown of the software. If we run Bro from the CLI with a dummy pcap file to ingest, it writes the output to the terminal:

ubuntu@ip-10-73-25-224:~/bro$ bro -C -r foo.pcap hello.bro
Hello World!
Goodbye World!

Of course, this isn't something we'd ever do in a real Bro environment; we'd want to be actually looking at the packet stream and taking actions in response to it. Stay tuned for more.