Tshark is the CLI version of Wireshark, and it's amazing. I'm going to start collecting some of my favorite tshark one-liners here. Check back often.
Find All Unique Filenames Referenced in SMB2
tshark -r file.pcap -Tfields -e ip.src -e ip.dst -e text smb2 | grep -oP "GUID handle File: .*?," | sort | uniq | awk -F: '{print $2}' | sed 's/,//'
Notes:
You don't actually need to include the ip.src and ip.dst fields, since they're not extracted by the grep command. I include them in case I want to do an ad-hoc grep for an IP address during the analysis process. Another way to do the same thing would be to modify the display filter to look only for certain addresses, e.g.:
tshark -r file.pcap -Tfields -e text smb2 and ip.addr==1.1.1.1 | grep -oP "GUID handle File: .*?," | sort | uniq | awk -F: '{print $2}' | sed 's/,//'
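The post-processing half of that pipeline (grep, sort, uniq, awk, sed) can also be sketched in Python, which is handy if you'd rather capture tshark's output once and slice it several ways. This is only an illustration: the function name is made up, and the sample text is a hypothetical line shaped to match the grep pattern, not real tshark output.

```python
import re

# Same pattern the grep command uses, with the filename captured as a group.
FILE_PATTERN = re.compile(r"GUID handle File: (.*?),")


def unique_smb2_filenames(tshark_text):
    """Return the sorted, de-duplicated filenames found in tshark's text output."""
    return sorted({name.strip() for name in FILE_PATTERN.findall(tshark_text)})


# Hypothetical sample input, shaped only to match the pattern above.
sample = (
    "1.1.1.1\t2.2.2.2\tGUID handle File: docs\\report.docx, more text\n"
    "1.1.1.1\t2.2.2.2\tGUID handle File: docs\\report.docx, more text\n"
    "2.2.2.2\t1.1.1.1\tGUID handle File: notes.txt, more text\n"
)
for name in unique_smb2_filenames(sample):
    print(name)
```

One difference from the shell version: awk -F: keeps only the text up to the next colon, while the regex group here keeps the whole name, so a filename that contains a colon survives intact.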
Friday, November 8, 2013
Friday, November 1, 2013
How to Tell if TCP Payloads Are Identical
I was working on a problem today in which vendor tech support was suggesting that a firewall was subtly modifying TCP data payloads. I couldn't find any suggestion of this in the firewall logs, but since that vendor's firewall logs have lied egregiously in the past, I wanted to verify it independently.
I took a packet capture from both hosts involved in the conversation and started thinking about how to see if the data sent by the server was the same as the data received by the client. I couldn't just compare the capture files themselves, because elements like timestamps, TTLs, and IP checksums would be different.
After a bunch of fiddling around, I came up with the idea of using tshark to extract the TCP payloads for each stream in the capture file and hash the results. If the hashes matched, the TCP payloads were being transferred unmodified. Here are the shell commands to do this:
tshark -r server.pcap -T fields -e tcp.stream | sort -u | sed 's/\r//' | xargs -i tshark -r server.pcap -q -z follow,tcp,raw,{} | md5sum
2cfe2dbb5f6220f29ff8aff82f7f68f5 *-
You then run exactly the same commands on the "client.pcap" file and compare the resulting hashes. Let's break this down a bit more:
tshark -r server.pcap -T fields -e tcp.stream
This invokes tshark to read the "server.pcap" file and output the TCP stream index of each packet. This is just a long series of integers:
0
0
1
2
1
etc.
The next command, sort -u, produces a logical set of the unique (hence the "-u") stream indexes. In other words, it removes duplicates from the previous list. Not all Unix-like operating systems have the "sort -u" option; if yours is missing it, you can use "| sort | uniq" instead.
Next, sed 's/\r//' strips the carriage return (\r) from the end of each stream index; tshark's output can include one, for example when it runs on Windows. If you don't do this, you'll get an error from the next command.
The next one's a bit of a doozy: xargs -i takes each stream index (remember, these are just integers) and executes the tshark -r server.pcap -q -z follow,tcp,raw,{} command once for each stream index, substituting the input stream index for the {} characters. (If your xargs doesn't support -i, -I{} does the same thing.)
The tshark -r server.pcap -q -z follow,tcp,raw,{} command itself reads the capture file a second time, running the familiar "Follow TCP Stream in Raw Format" command from Wireshark on the specified TCP stream index that replaces the {} characters. If you're rusty on Wireshark, "Follow TCP Stream" just dumps the TCP payload data in one of a variety of formats, such as "raw" or ASCII. If you've never used this option in Wireshark, make sure you try it today!
The final command, md5sum, computes an MD5 hash of the preceding output.
To summarize, we've done this: taken a file, extracted all the raw TCP data payloads from its packets (without headers), and hashed the data with MD5. If we do this on two files and the hashes are the same, we know they contain exactly the same TCP data (barring the infinitesimally small probability of an MD5 hash collision).
In my case, both capture files produced the same hash, proving that the firewall was (for once) playing nice.
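If you'd rather not chain everything through xargs, the same idea can be sketched in Python. This is a rough equivalent rather than a drop-in replacement: the function names are made up for this example, it assumes tshark is on the PATH, and because it sorts stream indexes numerically (sort -u sorts them as text, so 10 comes before 2), its hash won't match the one from the shell pipeline. What matters is that it treats both capture files the same way.

```python
import hashlib
import subprocess


def unique_stream_ids(field_output):
    """Turn `tshark -T fields -e tcp.stream` output into sorted unique integers."""
    ids = set()
    for line in field_output.splitlines():
        line = line.strip()  # drops a trailing \r as well as other whitespace
        if line:  # skip blank lines (e.g. packets with no TCP stream index)
            ids.add(int(line))
    return sorted(ids)


def follow_command(pcap, stream_id):
    """Build the same 'follow TCP stream, raw' command used in the one-liner."""
    return ["tshark", "-r", pcap, "-q", "-z", "follow,tcp,raw,%d" % stream_id]


def payload_digest(pcap):
    """MD5 over the raw follow output of every TCP stream in the capture."""
    fields = subprocess.run(
        ["tshark", "-r", pcap, "-T", "fields", "-e", "tcp.stream"],
        capture_output=True, text=True, check=True,
    ).stdout
    digest = hashlib.md5()
    for stream_id in unique_stream_ids(fields):
        result = subprocess.run(
            follow_command(pcap, stream_id), capture_output=True, check=True
        )
        digest.update(result.stdout)
    return digest.hexdigest()
```

Calling payload_digest("server.pcap") and payload_digest("client.pcap") and comparing the two strings gives the same yes/no answer as comparing the md5sum lines.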
Wednesday, October 16, 2013
Java is to JavaScript as Car is to Carpet - a Beginner's Guide
Some recent discussions at work have led me to the surprising realization that lots of people working in IT don't understand that Java and JavaScript are almost completely unrelated to each other. This is actually a fairly important misunderstanding to correct: it leads to wasted troubleshooting efforts, such as downgrading or upgrading Windows Java installations in response to browser JavaScript errors.
I found the title of this blog entry in a StackOverflow post: "Java is to JavaScript as Car is to Carpet". That's pretty much it, in a nutshell. For the record, the only things that Java and JavaScript have in common are:
- They are both programming languages.
- The word "Java".
- Both came out of the web technology explosion of the mid-1990s.
- Both are frequently encountered in the context of web browsers.
Java is associated with the web browser experience because of the widespread use of Java "applets" that are embedded in browser windows. Applets are not technically part of the browser; the compiled Java bytecode is downloaded by the browser and executed in a Java Virtual Machine (JVM) as a separate process. Applets are frequently transferred as a compressed "Java archive", or JAR file. Applets downloaded by a browser do not necessarily need to run in a browser window, but the fact that they are frequently embedded there leads to some confusion.
Neither is Java necessarily a client-side technology: many popular server-side applications are written in Java and execute in a server-side JVM. Google's Android platform extends things even further, using Java as the programming language but compiling the bytecode to execute on their own proprietary virtual machine.
JavaScript, on the other hand, is an interpreted (i.e., non-compiled) programming language that was originally developed to run inside web browsers. It was developed at Netscape and was later adopted by Microsoft and standardized as "ECMAScript". The use of "Java" in the name "JavaScript" was probably an attempt to piggyback on the popularity of Java; the two languages have almost nothing in common from a technical perspective.
JavaScript is most frequently used to control the web browser experience, but there are many projects that use JavaScript completely outside the browser. My first experience with this dates back to the late 1990s, when I used a JavaScript-based commercial tool to automate software deployments to Windows workstations. Today, there are many interesting non-browser-embedded JavaScript platforms, such as Node.js and PhantomJS.
Friday, August 2, 2013
Understanding Flow Export Terminology
The variety of terms used to describe network flow export technologies and components can be pretty confusing. Just last year I wrote a post on web usage tracking and NetFlow that is already a bit obsolete, so here's an attempt to explain some of the newer terms and capabilities in use today.
NetFlow Version 5
NetFlow v5 is sort of the least common denominator in flow technologies. Almost all vendors and devices that support a flow export technology will do NetFlow v5. Because it's only capable of exporting information about packet fields up to layer 4, however, it's not flexible enough to use for analytics that require information about the application layer. NetFlow v5 tracks only the following data:
- Source interface
- Source and destination IP address
- Layer 4 protocol
- Source and destination layer 4 ports
- TCP flags
- Type of Service
- Egress interface
- Packet count
- Byte count
- BGP origin AS
- BGP peer AS
- IP next hop
- Source netmask
- Destination netmask
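To make the fixed layout concrete, here is a minimal Python sketch that unpacks one 48-byte NetFlow v5 flow record with the struct module. The field order follows Cisco's published v5 record layout as I understand it; the Python names and the sample values are invented for illustration. A real collector would also need to parse the 24-byte packet header that precedes the records.

```python
import socket
import struct

# One NetFlow v5 flow record: 48 bytes, network byte order.
V5_RECORD = struct.Struct("!4s4s4sHHIIIIHHBBBBHHBBH")
V5_FIELDS = (
    "srcaddr", "dstaddr", "nexthop", "input_if", "output_if",
    "packets", "octets", "first", "last", "srcport", "dstport",
    "pad1", "tcp_flags", "protocol", "tos", "src_as", "dst_as",
    "src_mask", "dst_mask", "pad2",
)


def parse_v5_record(data):
    """Unpack a single v5 flow record into a dict keyed by field name."""
    record = dict(zip(V5_FIELDS, V5_RECORD.unpack(data)))
    for key in ("srcaddr", "dstaddr", "nexthop"):
        record[key] = socket.inet_ntoa(record[key])  # bytes -> dotted quad
    del record["pad1"], record["pad2"]  # padding carries no information
    return record


# Made-up record: 10 packets / 1500 bytes of TCP from 10.0.0.1 to 10.0.0.2.
sample = V5_RECORD.pack(
    socket.inet_aton("10.0.0.1"), socket.inet_aton("10.0.0.2"),
    socket.inet_aton("10.0.0.254"), 1, 2, 10, 1500, 0, 1000,
    3495, 80, 0, 0x18, 6, 0, 0, 0, 24, 24, 0,
)
print(parse_v5_record(sample))
```

Every field has a fixed offset and width, which is exactly why v5 can't carry anything the format's designers didn't anticipate.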
NetFlow Version 9
NetFlow v9 was Cisco's first attempt at defining an extensible flow export format, documented in RFC 3954 back in 2004. It provides a flexible format for building customizable flow export records that contain a wide variety of information types. Many of the goals for flexible flow export were defined in RFC 3917:
- Usage-based accounting
- Traffic profiling
- Traffic engineering
- Attack/Intrusion Detection
- QoS monitoring
IPFIX
IPFIX is the IETF standard for extensible flow export. The basic protocol is specified in RFC 5101, but details are included in many other RFCs (Wikipedia has a partial list). IPFIX is based directly on NetFlow v9 and is generally interoperable, but since it's an open standard it is extensible without Cisco involvement. Hundreds of field types are defined in the IANA IPFIX documentation.
RFC 6759 defines an extension of IPFIX to include application-specific information in IPFIX export packets. This allows deep-packet-inspection technologies (such as Cisco's NBAR) to send information about non-standardized, tunneled, or encrypted application layer protocols to IPFIX collectors.
IPFIX is being used by various vendors (Plixer, Lancope, and nProbe/nTop come to mind) to export HTTP header data, making it capable of being used as a web usage tracker or web forensics tool with the appropriate collector/analyzer software.
Flexible NetFlow
As far as I can tell, Flexible NetFlow is a marketing term used by Cisco to encompass everything about their approach to configuring and implementing NetFlow v9 and IPFIX.
NSEL (NetFlow Security Event Logging)
NSEL is a proprietary extension of NetFlow v9 used by Cisco's ASA firewalls to export firewall log data. It's not clear to me why Cisco didn't use IPFIX for this purpose.
Cisco AVC (Application Visibility and Control)
AVC is another Cisco marketing term that encompasses a variety of technologies surrounding the DPI and application-based routing capabilities in its routers, such as IPFIX, NetFlow v9, NBAR, PfR, ART (Application Response Time), and more.
Other Vendors
As mentioned above, most network technology vendors support NetFlow v5 and/or v9. IPFIX support is now becoming very common. Some vendors use proprietary extensions of NetFlow v9; Riverbed's CascadeFlow is one example of this.
In a followup post, I'll take a look at some tools that produce flow data without using export technologies.
Monday, May 6, 2013
Convert Hex to Decimal in IOS
Lots of IOS commands produce output in hex that I sometimes want to convert to decimal. Common ones for me are stuff like:
show ip cache flow
show ip flow top-talkers
and various debug commands. For example:
Router#sh ip cache flow | i Fa1/0.6
Fa0/1.6 10.5.188.158 Tu101* 10.5.24.15 06 0DA7 6D61 345
I have no idea what the port numbers in columns six and seven are. Fortunately, if the IOS device has the Tcl or Bash shells available, we can quickly convert them.
Method 1: Tcl Shell
Most routers have the Tcl shell available:
Router#tclsh
Router(tcl)#puts [expr 0xda7]
3495
Router(tcl)#puts [expr 0x6d61]
28001
You could write a callable Tcl script to make this permanently available from normal EXEC mode too.
Method 2: Bash Shell
The Bash shell came out in one of the early IOS 15.0 versions, so you may or may not have it available. You need to explicitly enable it by entering "shell processing full" in global configuration mode.
Router#sh run | i shell
shell processing full < required to enable Bash in IOS 15+
Router#printf "%d" 0xda7
3495
Router#printf "%d" 0x6d61
28001
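If neither shell is available on the device, you can paste the output into a script on your workstation instead. A small Python sketch (the function names are invented for this example, and the column layout is assumed from the sample line above):

```python
def hex_to_decimal(value):
    """Convert a hex string such as '0DA7' or '0xda7' to an integer."""
    return int(value, 16)


def decode_flow_line(line):
    """Pull protocol and ports out of one 'show ip cache flow' entry."""
    fields = line.split()
    # Assumed columns: SrcIf SrcIP DstIf DstIP Pr SrcP DstP Pkts
    protocol, src_port, dst_port = (hex_to_decimal(f) for f in fields[4:7])
    return protocol, src_port, dst_port


print(decode_flow_line("Fa0/1.6 10.5.188.158 Tu101* 10.5.24.15 06 0DA7 6D61 345"))
# prints (6, 3495, 28001)
```

The protocol column is hex too, so 06 is IP protocol 6 (TCP) and 11 would be 17 (UDP).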
Saturday, March 9, 2013
Quick Tip: Improvised File Transfer
The Python module "SimpleHTTPServer" is traditionally a quick and dirty way to test web code without the overhead of installing a full webserver. You can also use it as a quick way to transfer files between systems, if you have Python available:
jswan@ubuntu:~$ python -m SimpleHTTPServer 8080
Serving HTTP on 0.0.0.0 port 8080 ...
This causes Python to make all the files in the current directory available over HTTP on port 8080. From the client system, you can then use a browser, curl, or wget to transfer the file.
No, it's not secure, and yes, it may violate data exfiltration policies (and as an aside, Bro detects this by default). But I've used it fairly often to move files between Windows and Mac or Linux in situations where SCP isn't available and I don't feel like setting up a fileshare.
I've also used this in conjunction with the "time" command to test the effects of latency on various network protocols, and as an improvised way to test WAN optimization software.
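On Python 3 the module was renamed, so the equivalent command is "python3 -m http.server 8080". If you want the same thing from inside a script, something like the following should work on Python 3.7 or later. The helper name and its defaults are made up for this example, and it is no more secure than the one-liner.

```python
import functools
import http.server
import threading


def serve_directory(directory, host="0.0.0.0", port=8080):
    """Serve `directory` over HTTP in a background thread and return the server."""
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=directory
    )
    server = http.server.ThreadingHTTPServer((host, port), handler)
    # Daemon thread: the server dies with the script instead of blocking exit.
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

Call serve_directory(".") to start sharing the current directory, keep the script alive for as long as the transfer takes, and call shutdown() on the returned server when you're done.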
Sunday, March 3, 2013
An Operational Reason for Knowing Trivia
I've been largely out of touch with the IT certification scene lately, but I'm sure that people are still complaining incessantly about the fact that they need to memorize "trivia" in order to pass certification tests. Back when I was teaching Cisco classes full-time, my certification-oriented students were particularly bitter about this. Of course, this is a legitimate debate and the definition of "trivia" varies from person to person.
When I saw this article about CloudFlare's world-wide router meltdown, however, I immediately felt a bit smug about all those hours spent learning and teaching about packet-level trivia. If you don't want to read the article, here's the tl;dr:
- their automated DDoS detection tool detected an attack against a customer using packets sized in the 99,000 byte range.
- their ops staff pushed rules to their routers to drop those packets
- their routers crashed and burned
In order for this meltdown to happen, they had to have a compounded series of errors:
- the attack detection tool was coded to allow detection of packet sizes that can't actually occur: no bounds checking.
- the ops staff didn't retain the "trivia" that they learned in Networking 101, and thus couldn't see the problem with the output generated by the detection tool.
- the router OS didn't do input validation, and blew up when attempting to configure itself to do something crazy.
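The first failure in that list is the easiest to illustrate. The IPv4 total length field is 16 bits wide, so no IPv4 packet can be larger than 65,535 bytes, and a detection tool could refuse to emit a rule outside that range. A hedged sketch in Python, with made-up names (I have no idea what CloudFlare's tooling actually looks like):

```python
MAX_IPV4_PACKET = 0xFFFF  # 16-bit total length field: 65,535 bytes
MIN_IPV4_PACKET = 20      # smallest possible IPv4 header, no payload


def validate_length_range(low, high):
    """Reject packet-length match ranges that cannot occur in IPv4."""
    if low > high:
        raise ValueError("range is inverted: %d > %d" % (low, high))
    if low < MIN_IPV4_PACKET or high > MAX_IPV4_PACKET:
        raise ValueError(
            "length range %d-%d is outside %d-%d"
            % (low, high, MIN_IPV4_PACKET, MAX_IPV4_PACKET)
        )
    return low, high
```

With a check like this in place, a request to match packets in the 99,000 byte range would raise an error before anything was pushed to a router.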
Sunday, January 27, 2013
Baby Bro, Part 3: Containers and Loops
Bro has four main container types, which I'm going to cover in somewhat nontraditional order:
- tables
- sets
- vectors
- records
A table is a collection of indexed key-value pairs: the same idea is referred to as a dictionary, associative array, or hash table in other languages. Here's a simple example that pairs letters with their place in the alphabet:
event bro_init()
    {
    local letters = table([1] = "a", [2] = "b", [3] = "c");
    print letters;
    }
Running it, we get this:
jswan@so12a:~/bro$ bro tables.bro
{
[3] = c,
[1] = a,
[2] = b
}
Note that the output isn't in the same order as the script; in Bro, like in most other languages, hash tables are unordered.
Iterating over a table with a "for" loop returns the key, again like other languages:
event bro_init()
    {
    local letters = table([1] = "a", [2] = "b", [3] = "c");

    for (key in letters)
        {
        print letters[key];
        }
    }
And the output:
jswan@so12a:~/bro$ bro tables.bro
c
b
a
Because we printed only the value associated with the key, we never see the key in the output.
Sets
It's common in programming to need a data type that allows one to make a collection of distinct objects, without containing multiple instances of identical objects. This is the mathematical notion of a set. Bro has a native set type. Consider an example where you have a large list of IP addresses that you got from some other source: an intel feed, a firewall log, a web server log, etc. You want to get a unique set of addresses that have appeared, but you don't care how many times they appeared. This is the perfect use for a set:
event bro_init()
    {
    local a1 = 1.1.1.1;
    local a2 = 2.2.2.2;
    local a3 = 3.3.3.3;
    local a4 = [fe80::abcd:1];
    local a5 = 2.2.2.2;
    local a6 = 2.2.2.2;

    local unique = set(a1,a2,a3,a4,a5,a6);

    print unique;
    }
And the output:
jswan@so12a:~/bro$ bro sets.bro
{
1.1.1.1,
3.3.3.3,
2.2.2.2,
fe80::abcd:1
}
Note that 2.2.2.2 appears only once in the set, despite having been included three times as different variables.
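For readers coming from Python, the same de-duplication looks like this with the built-in set type and the standard ipaddress module. It's only an analogy to the Bro script above, not Bro code, and the variable names are arbitrary.

```python
import ipaddress

raw = ["1.1.1.1", "2.2.2.2", "3.3.3.3", "fe80::abcd:1", "2.2.2.2", "2.2.2.2"]

# Parsing first means equal addresses compare equal regardless of spelling.
unique = {ipaddress.ip_address(a) for a in raw}

print(len(unique))  # 4

# IPv4 and IPv6 objects can't be ordered against each other directly,
# so sort on (version, numeric value) instead.
for addr in sorted(unique, key=lambda a: (a.version, int(a))):
    print(addr)
```

As in Bro, the set itself has no meaningful order; the sort is only there to make the printed output stable.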
Vectors
A vector is Bro's version of a one dimensional array. Having spent most of my recent programming time in Python, I was surprised to find that Bro vectors work like hash tables for iteration: when you loop over a vector, you get the index into the vector rather than the object itself. Here's an example:
event bro_init()
    {
    local animals = vector("cat","dog","dinosaur","rat");
    for (animal in animals)
        {
        print animal;
        }
    }
The output:
jswan@so12a:~/bro$ bro vectors.bro
0
1
2
3
If you want to iterate a vector and get the object, you have to specify the index:
event bro_init()
    {
    local animals = vector("cat","dog","dinosaur","rat");
    for (index in animals)
        {
        print animals[index];
        }
    }
jswan@so12a:~/bro$ bro vectors.bro
cat
dog
dinosaur
rat
Bro's vectors work pretty much exactly like a table with "counts" as keys (a count is yet another native Bro type that we haven't discussed yet; it's the same as an integer but it's always unsigned, so it can't be negative). In fact, some of the earlier Bro documentation doesn't even show vectors as valid types, so I wonder if they are actually implemented internally as tables.
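To show why this surprised me, here is the Python side of the comparison. It's plain Python, included only as a contrast with the Bro behavior described above:

```python
animals = ["cat", "dog", "dinosaur", "rat"]

# Iterating a Python list yields the elements themselves...
elements = [animal for animal in animals]

# ...so getting Bro-style behavior (indexes) takes range() or enumerate().
indexes = [index for index in range(len(animals))]

# A dict, like a Bro table, yields its keys.
letters = {1: "a", 2: "b", 3: "c"}
keys = [key for key in letters]

print(elements)  # ['cat', 'dog', 'dinosaur', 'rat']
print(indexes)   # [0, 1, 2, 3]
print(keys)      # [1, 2, 3]
```

So in Python the list and the dict behave differently under a for loop, while in Bro the vector and the table behave the same way.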
Records
The last Bro container type is the record, which is sort of the meat and potatoes of Bro's wonderful logging tools. Records are discussed in detail in most of the other beginner-Bro material out there, so I'm not going to cover them here.
This will probably be the last "Baby Bro" post that I do with just the raw language features demonstrated inside the bro_init() event. Any further Bro posts will probably be using Bro in its intended context. Hope this was helpful!
Thursday, January 24, 2013
Cisco Ironport WSA with WCCP and IP Spoofing
Recently I had to set up a transparent proxy with the Cisco Ironport Web Security Appliance (WSA) using WCCP on a Catalyst 6500 with a Sup720, with IP spoofing and web cache ACLs enabled. Like with many technologies, this turned out to be pretty simple but I couldn't find it documented all in one place. Perfect blog fodder!
The network topology looked like this (simplified, but not by much):
Normally when you set up a transparent proxy with WCCP, the IP address of the proxy server is used as the source of the HTTP requests. The problem in this topology is that I wanted the real source address of the client to appear in the firewall logs. The IP spoofing feature on the WSA allows this to happen, but it requires configuring bidirectional WCCP redirection on the Cat6k. If this had been a Cisco ASA firewall, we could have enabled WCCP there and saved some trouble, but in this case the network was using a firewall from another vendor that didn't support WCCP.
One important thing to realize about WCCP on the Catalyst 6500 with the Sup720 is that WCCP egress redirection is done with software switching rather than in hardware, so if you find yourself wanting to use the command "ip wccp redirect out", you're virtually guaranteeing that you're going to redline the CPU on your supervisor. Thus, we want to do only ingress redirection.
The 6500 configuration is as follows:
! this ACL prevents web traffic to internal servers (not shown in diagram)
! from being inspected
ip access-list extended ACL_WCCP
deny ip any 10.0.0.0 0.255.255.255
deny ip any 192.168.0.0 0.0.255.255
permit ip any any
! define the web cache, referencing the ACL above
! this WCCP service handles only standard HTTP traffic
ip wccp web-cache redirect-list ACL_WCCP
! a second WCCP service is used for reverse-path redirection, which
! is required for IP spoofing to work
ip wccp 90
interface Vlan 10
description to client networks
ip wccp web-cache redirect in
interface Vlan 30
description to firewall cluster
! WCCP service group 90 is applied inbound on the return path
! from the Internet so that IP spoofing will work
ip wccp 90 redirect in
If you have multiple layer 3 interfaces going to client networks, you need to enable WCCP on all of them if reverse-path redirection is enabled. If we weren't using reverse redirection (i.e., if we weren't using IP spoofing), this wouldn't be the case: we could simply leave WCCP disabled on interfaces whose traffic shouldn't be proxied. With reverse redirection, though, the return traffic is always sent to the proxy server; if the proxy server sees the return traffic but not the egress traffic, it gets dropped. If you need to use IP spoofing and still have certain types of web traffic bypass the proxy, you would need to do this with ACLs applied to the WCCP service, rather than simply by leaving WCCP disabled on certain interfaces.
On the WSA, you create two WCCPv2 services under the Network-->Transparent Redirection menu, one with service group 0 (the default), and one with service group 90 (matching the the IOS configuration above). On group 90, you enable "redirect based on source port". For both groups, you enter the IP address of the SVI as the upstream router address (the IP address of VLAN 20 in this case). The WCCP service on the WSA then registers automatically with the Sup720.
Under the Security Services--> Web Proxy Settings menu, you enable transparent mode with IP spoofing for transparent connections only.
That's it: now you have a transparent proxy, with IP spoofing so that your firewall logs show accurate client IP addresses. Handling HTTPS traffic or HTTP traffic on non-standard ports is beyond the scope of this post.
The network topology looked like this (simplified, but not by much):
Normally when you set up a transparent proxy with WCCP, the IP address of the proxy server is used as the source of the HTTP requests. The problem in this topology is that I wanted the real source address of the client to appear in the firewall logs. The IP spoofing feature on the WSA allows this to happen, but it requires configuring bidirectional WCCP redirection on the Cat6k. If this had been a Cisco ASA firewall, we could have enabled WCCP there and saved some trouble, but in this case the network was using a firewall from another vendor that didn't support WCCP.
One important thing to realize about WCCP on the Catalyst 6500 with the Sup720 is that WCCP egress redirection is done with software switching rather than in hardware. If you find yourself wanting to use a command of the form "ip wccp <service> redirect out", you're virtually guaranteeing that you're going to redline the CPU on your supervisor. Thus, we want to do only ingress redirection.
The 6500 configuration is as follows:
! this ACL prevents web traffic to internal servers (not shown in diagram)
! from being inspected
ip access-list extended ACL_WCCP
 deny ip any 10.0.0.0 0.255.255.255
 deny ip any 192.168.0.0 0.0.255.255
 permit ip any any
! define the web cache, referencing the ACL above
! this WCCP service handles only standard HTTP traffic
ip wccp web-cache redirect-list ACL_WCCP
! a second WCCP service is used for reverse-path redirection, which
! is required for IP spoofing to work
ip wccp 90
interface Vlan 10
 description to client networks
 ip wccp web-cache redirect in
interface Vlan 30
 description to firewall cluster
 ! WCCP service group 90 is applied inbound on the return path
 ! from the Internet so that IP spoofing will work
 ip wccp 90 redirect in
If you have multiple layer 3 interfaces going to client networks, you need to enable WCCP on all of them if reverse-path redirection is enabled. If we weren't using reverse redirection (i.e., if we weren't using IP spoofing), this wouldn't be the case: we could simply leave WCCP disabled on interfaces whose traffic shouldn't be proxied. With reverse redirection, though, the return traffic is always sent to the proxy server; if the proxy server sees the return traffic for a connection but never saw the outbound request, that return traffic gets dropped. If you need to use IP spoofing and still have certain types of web traffic bypass the proxy, you would need to do this with ACLs applied to the WCCP service, rather than simply by leaving WCCP disabled on certain interfaces.
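To reason about which destinations a redirect list like ACL_WCCP exempts from the proxy, a rough Python model can help. This is only a sketch and not part of the switch configuration: the function name is_redirected and the test addresses are made up for illustration, and the model looks only at destination addresses (it ignores protocol and port, which the web-cache service itself restricts to standard HTTP).

```python
import ipaddress

# Hypothetical model of the ACL_WCCP redirect list shown above.
# Each entry is (action, destination network); the first match wins, as in IOS.
ACL_WCCP = [
    ("deny", ipaddress.ip_network("10.0.0.0/8")),      # wildcard 0.255.255.255
    ("deny", ipaddress.ip_network("192.168.0.0/16")),  # wildcard 0.0.255.255
    ("permit", ipaddress.ip_network("0.0.0.0/0")),     # permit ip any any
]


def is_redirected(dst: str) -> bool:
    """Return True if traffic to dst would be handed to the proxy."""
    addr = ipaddress.ip_address(dst)
    for action, net in ACL_WCCP:
        if addr in net:
            return action == "permit"
    return False  # implicit deny at the end of every IOS ACL


if __name__ == "__main__":
    for dst in ("10.1.2.3", "192.168.5.5", "93.184.216.34"):
        print(dst, "redirect" if is_redirected(dst) else "bypass")
```

Adding another bypass rule in this model means inserting a "deny" entry ahead of the final permit, which mirrors how you would add a deny line to the ACL on the switch.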
On the WSA, you create two WCCPv2 services under the Network-->Transparent Redirection menu, one with service group 0 (the default), and one with service group 90 (matching the IOS configuration above). On group 90, you enable "redirect based on source port". For both groups, you enter the IP address of the SVI as the upstream router address (the IP address of VLAN 20 in this case). The WCCP service on the WSA then registers automatically with the Sup720.
Under the Security Services-->Web Proxy Settings menu, you enable transparent mode with IP spoofing for transparent connections only.
That's it: now you have a transparent proxy, with IP spoofing so that your firewall logs show accurate client IP addresses. Handling HTTPS traffic or HTTP traffic on non-standard ports is beyond the scope of this post.
Saturday, January 19, 2013
Baby Bro, Part 2: Conditionals, Address Types
Bro has native types for addresses and networks, making it much easier to work with network data. Today's Baby Bro script shows global variable definition, the use of the address and subnet types, and a simple conditional:
# declaring global variables
# no need to put quotes around addr or subnet variable definitions
global ipv4_host:addr = 1.1.1.1;
global ipv4_net:subnet = 1.1.0.0/16;

event bro_init() {
    if (ipv4_host in ipv4_net) {
        # addr and subnet types are autoconverted to strings with fmt
        print fmt("%s is in network %s",ipv4_host,ipv4_net);
    }
    else {
        print fmt("host %s is not in network %s",ipv4_host,ipv4_net);
    }
}
Running this from the CLI, we get the expected output:
jswan@so12a:~/bro$ bro addr_net_types.bro
1.1.1.1 is in network 1.1.0.0/16
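For readers more at home in Python, the same membership test can be approximated with the standard-library ipaddress module. This is a comparison sketch rather than anything Bro does internally; the helper name describe is invented here.

```python
import ipaddress

# Python needs the values parsed from strings; Bro has literal syntax for them.
ipv4_host = ipaddress.ip_address("1.1.1.1")
ipv4_net = ipaddress.ip_network("1.1.0.0/16")


def describe(host, net) -> str:
    """Mirror the conditional in the Bro script above."""
    if host in net:
        return f"{host} is in network {net}"
    return f"host {host} is not in network {net}"


print(describe(ipv4_host, ipv4_net))  # 1.1.1.1 is in network 1.1.0.0/16
```

The main difference is that Bro treats addresses and subnets as first-class literals, so no parsing step is needed.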
Bro also has several interesting built-in functions for working with network data that we'll explore in upcoming posts. For now, we'll take a look at the mask_addr function, which allows you to use Bro as an improvised subnet calculator. You can run a Bro micro-script from the CLI with the -e option, just like the -e flag in Perl or the -c flag in Python:
jswan@so12a:~/bro$ bro -e "print mask_addr(10.18.32.199,14);"
10.16.0.0/14
jswan@so12a:~/bro$ bro -e "print mask_addr(10.18.32.199,31);"
10.18.32.198/31
Great for those late-night subnetting sessions after too many microbrews!
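As a cross-check on the two results above, a rough Python stand-in for mask_addr can be written with the ipaddress module. The Python function below only borrows the name for readability; it is an assumption-laden sketch, not Bro's implementation.

```python
import ipaddress


def mask_addr(addr: str, prefix_len: int):
    """Return the network that contains addr at the given prefix length."""
    # strict=False makes ip_network() zero the host bits instead of raising.
    return ipaddress.ip_network(f"{addr}/{prefix_len}", strict=False)


print(mask_addr("10.18.32.199", 14))  # 10.16.0.0/14
print(mask_addr("10.18.32.199", 31))  # 10.18.32.198/31
```

Both lines agree with the Bro output: at /14 the second octet 18 is masked down to 16, and at /31 only the last bit of 199 is cleared.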
Just in case you were wondering: all of this works natively for IPv6, with some changes to the syntax:
jswan@so12a:~/bro$ bro -e "print [fe80::1db9] in [fe80::]/64;"
T # T is the way Bro outputs "True" in a Boolean test
We'll look at some more IPv6 stuff in an upcoming post.
Tuesday, January 15, 2013
Baby Bro, Part 1: Functions Etc.
[Note: Blogger seems to have done something nasty to my new blog template, so it's back to the old one at least temporarily]
Here's my first "Baby Bro" post. Before getting into using Bro scripting for its intended use of network traffic analysis, I wanted to figure out how to accomplish basic tasks common to most programming languages:
- Functions
- Common types and variable definitions
- Loops
- Conditionals
- Iteration of container types
- Basic string and arithmetic operations
I'm not sure if I'll get through all of them in this series, but here's a start: a main dish of functions, with a side of string formatting and concatenation.
# "add_one" is the function name
# (received_value:int) is the variable and type passed into the function
# the final "int" is the type returned by the return statement
function add_one(received_value:int): int
{
    local returned_value = received_value + 1;
    return returned_value;
}

# this function shows two strings passed in, returning a string
function concat(a:string,b:string): string
{
    return a + " " + b; # one way of doing string concatenation
}

event bro_init() # bro_init() fires when Bro starts running
{
    local x = 3; # defining a local variable
    local y = add_one(x); # using the first function defined above
    print fmt("%d + 1 = %d",x,y); # formatted printing as in printf

    print concat("first","second"); # using the second function defined above
}
I think this is fairly self-explanatory, given the comments. We have two functions:
- add_one: adds one to whatever integer is passed into the function, and returns the resulting integer.
- concat: concatenates two strings, separated by a space, and returns the result. There is a built-in string function for this, but I wanted to show that you can also do it with "+".
We can run this from the CLI with no PCAP ingestion just to get the standard output:
jswan@so12a:~/bro$ bro test.bro
3 + 1 = 4
first second
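If it helps to map the syntax onto something familiar, here is an approximate Python rendering of the same script. It is only a side-by-side sketch: Python has no bro_init() event, so the body of the handler simply runs at the top level.

```python
def add_one(received_value: int) -> int:
    # Same shape as the Bro function: typed argument, typed return value.
    returned_value = received_value + 1
    return returned_value


def concat(a: str, b: str) -> str:
    # String concatenation with "+", as in the Bro version.
    return a + " " + b


x = 3
y = add_one(x)
print("%d + 1 = %d" % (x, y))     # 3 + 1 = 4
print(concat("first", "second"))  # first second
```

The type annotations play the role of Bro's ":int" and ":string" declarations, although Python does not enforce them at runtime.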
Monday, January 14, 2013
Basic Bro Language References
Finding simple examples of Bro language features is somewhat difficult: the scripts that come packaged with Bro are written by experts in the language and are quite idiomatic.
Here are some of the basic Bro language references I've found so far. In upcoming blog posts, I'll show some "Baby Bro" that is even more basic than these examples.
From Ryesecurity by Scott Runnels (@srunnels):
Solving Network Forensic Challenges with Bro, Part 1
Solving Network Forensic Challenges with Bro, Part 2
Solving Network Forensic Challenges with Bro, Part 3
Logging YouTube Titles with Bro
Justin Azoff's (@JustinAzoff) Bro Presentation on Github
The Official Bro Workshop 2011 Pages
The Bro Language Cheat Sheet
Sunday, January 13, 2013
Changes!
Yee-haw! New blog template!
Trying to figure out how to do source code highlighting in a way that doesn't suck or rely on external JavaScript hosting.
Friday, January 11, 2013
"Hello World" in Bro IDS
One of the reasons I don't blog that much is that I generally assume that everything worth blogging has already been done, and that everyone reading is probably smarter than me and doesn't need me to explain things. I'm going to pretend that those are non-issues and try to blog more, no matter how basic the topic.
I just got back from FloCon 2013, which was quite interesting. The highlight for me was some informal after-hours knowledge-dumps from Seth Hall (@remor) and Liam Randall (@hectaman) on the subject of Bro (@Bro_IDS).
Before these mini-lessons, I didn't really have a good idea of how to get started with Bro scripting. Now I do!
The stuff we covered was actually a lot more complex than Hello World, but in the spirit of beginning coders everywhere, here's how you do "Hello World" in Bro (and a little more):
ubuntu@ip-10-73-25-224:~/bro$ cat hello.bro
event bro_init() {
    print "Hello World!";
}

event bro_done() {
    print "Goodbye World!";
}
The fundamental idea behind Bro is that it's a scripting language that responds to events that are derived from packet streams. When Bro is monitoring a raw packet feed or ingesting a pcap file, it fires events whenever something interesting happens: FTP sessions, HTTP sessions, SSH sessions, etc.
The script above responds to the two simplest Bro events: the startup and shutdown of the software. If we run Bro from the CLI with a dummy pcap file to ingest, it writes the output to the terminal:
ubuntu@ip-10-73-25-224:~/bro$ bro -C -r foo.pcap hello.bro
Hello World!
Goodbye World!
Of course, this isn't something we'd ever do in a real Bro environment; we'd want to be actually looking at the packet stream and taking actions in response to it. Stay tuned for more.
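To make the event-driven idea above concrete, here is a toy Python model of handlers being registered and fired. It is a loose analogy under invented names (event, fire, run, handlers) and says nothing about how Bro is actually implemented.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

# Hypothetical registry mapping an event name to its handler functions.
handlers: DefaultDict[str, List[Callable[[], None]]] = defaultdict(list)


def event(name: str):
    """Decorator that registers a function as a handler for the named event."""
    def register(func):
        handlers[name].append(func)
        return func
    return register


def fire(name: str) -> None:
    """Call every handler registered for the named event, in order."""
    for handler in handlers[name]:
        handler()


@event("bro_init")
def hello() -> None:
    print("Hello World!")


@event("bro_done")
def goodbye() -> None:
    print("Goodbye World!")


def run(packets) -> None:
    """Toy engine: startup event, packet loop, shutdown event."""
    fire("bro_init")
    for _ in packets:
        pass  # a real engine would analyze the packet and fire protocol events
    fire("bro_done")


run([])
```

With an empty packet list only the startup and shutdown handlers run, which is the same behavior the hello.bro script shows against a dummy pcap.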