tag:blogger.com,1999:blog-50296899811581135882024-03-19T02:37:17.208-06:00Loopback MountainVery Intermittent Geekery: Cisco IOS, VoIP, Infosec, etc.Jay Swanhttp://www.blogger.com/profile/02571029118821999072noreply@blogger.comBlogger100125tag:blogger.com,1999:blog-5029689981158113588.post-67532067557505946982015-09-18T09:02:00.002-06:002015-09-19T12:16:01.685-06:00Moving to GitHub PagesI've moved this blog to <a href="https://jayswan.github.io/" target="_blank">GitHub Pages</a>.<br />
<br />
<a href="https://jayswan.github.io/rss.xml" target="_blank">The RSS feed is here. </a>Jay Swanhttp://www.blogger.com/profile/02571029118821999072noreply@blogger.com123tag:blogger.com,1999:blog-5029689981158113588.post-16313396733309993772015-07-15T13:12:00.000-06:002015-07-15T13:12:08.406-06:00Installing netmiko on Windows<a href="https://github.com/ktbyers/netmiko" target="_blank">Netmiko </a>is a Python module by <a href="https://twitter.com/kirkbyers" target="_blank">Kirk Byers</a> that provides a wrapper around the <a href="http://www.paramiko.org/" target="_blank">Paramiko </a>SSH module for doing screen scraping and CLI automation on network devices.<br />
<br />
Paramiko has some dependencies that make installation on Windows a tad tricky. Here's a quick way to get it done:<br />
<br />
<ol>
<li>Install <a href="https://store.continuum.io/cshop/anaconda/" target="_blank">Anaconda</a>.</li>
<li>From the Anaconda shell, run "conda install paramiko".</li>
<li>From the Anaconda shell, run "pip install scp".</li>
<li>Install <a href="https://www.git-scm.com/downloads" target="_blank">git for Windows</a>.</li>
<li>Clone netmiko with "git clone <a href="https://github.com/ktbyers/netmiko">https://github.com/ktbyers/netmiko</a>"</li>
<li>cd into the netmiko directory and run "python setup.py install".</li>
</ol>
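Once installed, basic usage is just a dictionary of connection parameters and a call to netmiko's ConnectHandler. A minimal sketch (the device details below are placeholders, not a real host; the import is deferred into the function so the snippet loads even before netmiko is installed):

```python
def show_version(device):
    """Connect to a device with netmiko and return 'show version' output."""
    from netmiko import ConnectHandler  # verifies the install worked
    conn = ConnectHandler(**device)
    output = conn.send_command("show version")
    conn.disconnect()
    return output

# Hypothetical device parameters -- substitute your own.
device = {
    "device_type": "cisco_ios",
    "host": "192.0.2.1",
    "username": "admin",
    "password": "secret",
}
```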
Done! Screen scrape away, and don't forget to hound your vendors for real APIs... :-)Jay Swanhttp://www.blogger.com/profile/02571029118821999072noreply@blogger.com79tag:blogger.com,1999:blog-5029689981158113588.post-41537302606067099962015-07-14T11:35:00.000-06:002015-07-14T11:37:41.378-06:00Extracting Traffic from Rolling Capture FilesEvery so often I need to extract a subset of traffic from a set of rolling timestamped pcap files. One common place I do this is with <a href="https://github.com/Security-Onion-Solutions/security-onion" target="_blank">Security Onion</a>; one of the great features of SO is its full-packet-capture feature: you can easily pivot from Snort, Suricata, or Bro logs to a full packet capture view, or download the associated pcap file.<br />
<br />
But what if you don't have an associated alert or Bro log entry? Or if you're doing pcap on some system that's not as user-friendly as Security Onion, but nonetheless supports rolling captures?<br />
<br />
The way I usually do this is with <span style="font-family: "Courier New",Courier,monospace;">find </span>and <span style="font-family: "Courier New",Courier,monospace;">xargs</span>. Here's an example of my most common workflow, using timestamps as the filtering criteria for <span style="font-family: "Courier New",Courier,monospace;">find</span>:<br />
<span style="font-family: "Courier New",Courier,monospace;"><br /></span>
<span style="font-family: "Courier New",Courier,monospace;"><span style="font-size: x-small;">> find . -newerct "16:07" ! -newerct "16:10" | xargs -I {} tcpdump -r {} -w /tmp/{} host 8.8.8.8</span></span><br />
<span style="font-family: "Courier New",Courier,monospace;"><span style="font-size: x-small;">> cd /tmp</span></span><br />
<span style="font-family: "Courier New",Courier,monospace;"><span style="font-size: x-small;">> mergecap -w merged.pcap *.pcap</span></span><br />
<br />
Translated:<br />
<ol>
<li>Find all files in the current directory created after 16:07 but not created after 16:10. This requires GNU <span style="font-family: "Courier New",Courier,monospace;">find</span> 4.3.3 or later. It supports many different time and date formats.</li>
<li>Using <span style="font-family: "Courier New",Courier,monospace;">xargs</span>, filter each file with the "<span style="font-family: "Courier New",Courier,monospace;">host 8.8.8.8</span>" BPF expression and write it to <span style="font-family: "Courier New",Courier,monospace;">/tmp</span> with the same filename.</li>
<li>Merge all the .pcap files in <span style="font-family: "Courier New",Courier,monospace;">/tmp</span> into <span style="font-family: "Courier New",Courier,monospace;">merged.pcap</span>.</li>
</ol>
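If GNU find isn't available, the time-window selection step is easy to sketch in Python (using st_mtime as a stand-in for find's change-time test; the tcpdump and mergecap steps would still run via the shell):

```python
import time
from pathlib import Path

def captures_in_window(directory, start_hhmm, end_hhmm, day=None):
    """Return capture files last modified in [start, end) on the given day.

    start_hhmm and end_hhmm are "HH:MM" strings, mirroring the
    arguments given to find's -newerct tests above.
    """
    day = day or time.strftime("%Y-%m-%d")
    fmt = "%Y-%m-%d %H:%M"
    start = time.mktime(time.strptime("%s %s" % (day, start_hhmm), fmt))
    end = time.mktime(time.strptime("%s %s" % (day, end_hhmm), fmt))
    return sorted(p for p in Path(directory).iterdir()
                  if p.is_file() and start <= p.stat().st_mtime < end)
```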
You can easily modify this workflow to fit other use cases. Jay Swanhttp://www.blogger.com/profile/02571029118821999072noreply@blogger.com9tag:blogger.com,1999:blog-5029689981158113588.post-65149577242102121742015-05-15T14:52:00.004-06:002015-05-15T14:52:58.041-06:00More ADN (Awk Defined Networking)Want to know how many IPv4 nodes are in each of your VLANs? Use ADN:<br />
<br />
<span style="font-size: x-small;"><span style="font-family: "Courier New",Courier,monospace;">ssh myswitch 'sh arp | i Vlan' | awk '{print $NF}' | sort | uniq -c | sort -rn</span></span><br />
<span style="font-size: x-small;"><span style="font-family: "Courier New",Courier,monospace;"><br /></span></span>
<span style="font-family: "Courier New",Courier,monospace;"><span style="font-size: x-small;"> 79 Vlan38<br /> 65 Vlan42<br /> 58 Vlan34<br /> 22 Vlan36<br /> 21 Vlan32<br /> 20 Vlan40<br /> 9 Vlan3<br /> 7 Vlan8<br /> 5 Vlan6<br /> 5 Vlan204<br /> 5 Vlan203<br /> 5 Vlan2<br /> 4 Vlan74<br /> 3 Vlan82<br /> 3 Vlan4</span></span>Jay Swanhttp://www.blogger.com/profile/02571029118821999072noreply@blogger.com5tag:blogger.com,1999:blog-5029689981158113588.post-78450262177691663792015-04-24T10:39:00.000-06:002015-04-24T10:42:47.032-06:00ADN - Awk Defined NetworkingBecause I have yet to transition to a completely software-defined network in which everything configures itself (wink wink), I still have to do tasks like bulk VLAN changes.<br />
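For comparison, the same pipeline sketched in Python (the sample ARP output below is made up; the parsing mirrors awk's $NF):

```python
from collections import Counter

def vlan_counts(arp_output):
    """Count ARP entries per VLAN, like awk | sort | uniq -c | sort -rn."""
    counts = Counter(line.split()[-1]        # last field, like awk's $NF
                     for line in arp_output.splitlines()
                     if "Vlan" in line)
    return counts.most_common()              # sorted descending

sample = """\
Internet  10.1.38.10   5   aabb.cc00.0100  ARPA   Vlan38
Internet  10.1.38.11   2   aabb.cc00.0200  ARPA   Vlan38
Internet  10.1.42.5    0   aabb.cc00.0300  ARPA   Vlan42
"""
```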
<br />
Thanks to a recent innovation called ADN, or "AWK Defined Networking", I can do this in a shorter time window than the average bathroom break. For example, I just had a request to change all ports on a large access switch stack that are currently in VLAN 76 to VLAN 64:<br />
<br />
<span style="font-family: "Courier New",Courier,monospace;"># <b>ssh switch_name.foo.com 'show int status | i _76_' | grep Gi | awk '{print "int ",$1,"\n","description PC/Phone","\n","switchport access vlan 64"}'</b><br />Password: ***</span><br />
<br />
<span style="font-family: "Courier New",Courier,monospace;">int Gi1/0/25<br /> description PC/Phone<br /> switchport access vlan 64<br />int Gi1/0/26<br /> description PC/Phone<br /> switchport access vlan 64</span><br />
<span style="font-family: "Courier New",Courier,monospace;">[many more deleted] </span><br />
<br />
Then I copied and pasted the results into config mode. Back to lounging on the beach.<br />
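For the record, the awk one-liner translates to a few lines of Python as well (the sample "show int status" rows below are made up):

```python
def vlan_change_config(status_output, new_vlan):
    """Emit config lines moving every Gi port in the output to new_vlan."""
    lines = []
    for row in status_output.splitlines():
        fields = row.split()
        if fields and fields[0].startswith("Gi"):
            lines.append("int %s" % fields[0])
            lines.append(" description PC/Phone")
            lines.append(" switchport access vlan %d" % new_vlan)
    return "\n".join(lines)

sample = """\
Gi1/0/25  PC/Phone  connected   76  a-full  a-100  10/100/1000BaseTX
Gi1/0/26  PC/Phone  notconnect  76  auto    auto   10/100/1000BaseTX
"""
```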
<br />
<a href="http://ferd.ca/awk-in-20-minutes.html" target="_blank">Not even any Python skills required!</a>Jay Swanhttp://www.blogger.com/profile/02571029118821999072noreply@blogger.com4tag:blogger.com,1999:blog-5029689981158113588.post-46458937525700426052015-03-26T20:17:00.001-06:002015-03-26T20:17:28.814-06:00Quick Example: Elasticsearch Bulk Index API with PythonA <a href="https://gist.github.com/jayswan/a8d9920ef74516a02fe1" target="_blank">quick example</a> that shows how to use <a href="http://elasticsearch-py.readthedocs.org/en/master/helpers.html#elasticsearch.helpers.bulk" target="_blank">Elasticsearch bulk indexing from the Python client</a>. This is dramatically faster than indexing documents one at a time in a loop with the index() method.<br />
<br />
<script src="https://gist.github.com/jayswan/a8d9920ef74516a02fe1.js"></script>
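The heart of the bulk approach is just building an iterable of action dicts rather than calling index() per document. A minimal sketch (the index and type names here are made up; with a live cluster you would pass the actions to elasticsearch.helpers.bulk):

```python
def make_actions(docs, index="my-index", doc_type="my-type"):
    """Yield one bulk action per document instead of one index() call each."""
    for doc in docs:
        yield {"_index": index, "_type": doc_type, "_source": doc}

actions = list(make_actions([{"msg": "hello"}, {"msg": "world"}]))

# Against a real cluster (not runnable here):
#   from elasticsearch import Elasticsearch
#   from elasticsearch.helpers import bulk
#   bulk(Elasticsearch(), make_actions(docs))
```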
Jay Swanhttp://www.blogger.com/profile/02571029118821999072noreply@blogger.com10tag:blogger.com,1999:blog-5029689981158113588.post-52436883680826438022015-02-28T20:27:00.001-07:002015-02-28T20:32:32.149-07:00Filtering .raw fields with Python Elasticsearch DSL High-Level ClientIt took me a while to figure out how to search the <span style="font-family: "Courier New",Courier,monospace;">not_analyzed</span> ".raw" fields created by Logstash in Elasticsearch indices, using the high-level Python Elasticsearch client. Because keyword argument names can't contain dots, Python throws an error if you try it the intuitive way (this assumes you've already set up a client as <b><i>es</i></b> and an index as <b><i>i</i></b>, as shown in the <a href="http://elasticsearch-dsl.readthedocs.org/en/latest/index.html#" target="_blank">docs</a>):<br />
<br />
<script src="https://gist.github.com/jayswan/d4ddd71a35bb5f1ad86f.js"></script>
Instead, you create a dictionary with your parameters and unpack it using the ** operator:<br />
<br />
<script src="https://gist.github.com/jayswan/3a7621d909b15c832cfb.js"></script>
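In pure Python terms, the trick is nothing Elasticsearch-specific: it's just dict unpacking into keyword arguments. A tiny illustration (the field name and stand-in function are hypothetical):

```python
def filter_term(**kwargs):
    """Stand-in for the DSL's .filter('term', ...) keyword interface."""
    return kwargs

# "host.raw" isn't a valid Python identifier, so this is a SyntaxError:
#   filter_term(host.raw='web01')
# Unpacking a dict with ** sidesteps the restriction:
params = {'host.raw': 'web01'}
result = filter_term(**params)
```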
This produces the Elasticsearch query we want:<br />
<br />
<script src="https://gist.github.com/jayswan/c04eee5287cc7cbc5ea1.js"></script>Jay Swanhttp://www.blogger.com/profile/02571029118821999072noreply@blogger.com7tag:blogger.com,1999:blog-5029689981158113588.post-9208405159731178912015-01-15T15:17:00.000-07:002015-01-15T15:17:05.343-07:00Pleasing terminal colors on Security OnionTo get the lovely Solarized theme working in Security Onion:<br />
<ol>
<li><span style="font-family: "Courier New",Courier,monospace;">sudo apt-get install gnome-terminal</span></li>
<ol>
<li>I'm sure there's a way to get it working in the default xfce4 terminal, but I couldn't figure it out. </li>
</ol>
<li>Follow instructions here: <a href="http://stackoverflow.com/questions/23118916/configuring-solarized-colorscheme-in-gnome-terminal-tmux-and-vim">http://stackoverflow.com/questions/23118916/configuring-solarized-colorscheme-in-gnome-terminal-tmux-and-vim</a></li>
</ol>
Jay Swanhttp://www.blogger.com/profile/02571029118821999072noreply@blogger.com4tag:blogger.com,1999:blog-5029689981158113588.post-59448512895544653802015-01-08T15:21:00.000-07:002015-01-08T15:21:02.746-07:00Problems with kvm-ok in VIRL with VMWare PlayerI'm installing Cisco VIRL, and despite following the instructions regarding nested virtualization settings, the kvm-ok command was still complaining. I needed to edit the .vmx file for the VIRL VM and add/edit the following:<br />
<br />
<pre><code>monitor.virtual_mmu = "hardware"
monitor.virtual_exec = "hardware"
vhv.enable = "TRUE"
monitor_control.restrict_backdoor = "true"</code></pre>
Jay Swanhttp://www.blogger.com/profile/02571029118821999072noreply@blogger.com3tag:blogger.com,1999:blog-5029689981158113588.post-37279981067909462802015-01-02T10:45:00.001-07:002015-01-02T10:45:42.416-07:00My Network ToolkitA while back, Chris Marget of <a href="http://www.fragmentationneeded.net/" target="_blank">Fragmentation Needed</a> posted a run-down of his comprehensive and extremely clever <a href="http://www.fragmentationneeded.net/2013/07/network-toolkit.html" target="_blank">network toolkit</a>. Because I'm something of a weight weenie, mine is a lot more slimmed down. I thought I'd post it here:<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh9NtwneI7n6IlWwnVuSSv3n-E0WBU6Vo6AH7dpm793kYasObrklPlCezOCs5lV6e-6Ch24scPi8nZs534RxBO9Bng0tyvm3s7mEDDS36EFqNKULPvoS38cfKRIGir6Dei-VTeawv7IHEU/s1600/toolkit.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh9NtwneI7n6IlWwnVuSSv3n-E0WBU6Vo6AH7dpm793kYasObrklPlCezOCs5lV6e-6Ch24scPi8nZs534RxBO9Bng0tyvm3s7mEDDS36EFqNKULPvoS38cfKRIGir6Dei-VTeawv7IHEU/s1600/toolkit.jpg" height="239" width="320" /></a></div>
<br />
<br />
<br />
<br />
The contents:<br />
<br />
<ol>
<li>Two random USB drives (in case I need to leave one with somebody).</li>
<li>Single-mode and multi-mode LC fiber loopback plugs.</li>
<li>Rack PDU plug adapter.</li>
<li>Awesome PicQuic compact screwdriver (thanks to Chris's post).</li>
<li>T1 loopback plug (red) (because we still have T1s out here in the boonies).</li>
<li>Cat-6 pass-through plug (white).</li>
<li>Crossover adapter (orange).</li>
<li>Sharpie.</li>
<li>Console setup:</li>
<ol>
<li>USB-to-DB9 adapter.</li>
<li>DB9-to-RJ45 adapter.</li>
<li>Flat Cat-6 cable.</li>
<li>Rollover adapter.</li>
<li>Velcro tie</li>
<li>Flat Cat-6 cable with velcro tie.</li>
</ol>
</ol>
The console setup could probably be improved by adding a DB9 null-modem adapter. The coolest thing (IMO) that I'm missing from Chris's setup is the Bluetooth console adapter -- maybe one day. <br />
<br />
The Fenix AA light and Leatherman Skeletool CX almost always live in a pocket rather than the kit and go with me everywhere. The kit all fits into a small zippered case that used to hold a Dell laptop power supply.<br />
<br />
My main goal here was to have all the hard-to-find professional stuff in one small package. I have a separate "personal" kit that contains stuff like headphones, USB cables, and chargers for personal electronics.Jay Swanhttp://www.blogger.com/profile/02571029118821999072noreply@blogger.com2tag:blogger.com,1999:blog-5029689981158113588.post-34951906303775479332014-12-05T08:49:00.001-07:002014-12-05T08:49:28.982-07:00Imposing Artificial Limitations to Develop SkillsI'm a big fan of imposing artificial limitations on yourself in order to aid skill development. Here are some quick ideas:<br />
<br />
<ul>
<li>When troubleshooting network devices from the CLI, try not to look at the configuration. Use only "show" or "debug" commands instead. I found this enormously beneficial when practicing for CCIE.</li>
<li>When troubleshooting larger operational issues or learning a new environment, try not to log into individual devices at all. Force yourself to use only your network management system, NetFlow, packet captures, or host-based tools like ping, traceroute, or nmap.</li>
<li>When learning automation or orchestration skills, force yourself to write scripts, run API calls, or use your favorite orchestration tool to do simple things, even if it doesn't seem like they merit the extra effort.</li>
</ul>
Jay Swanhttp://www.blogger.com/profile/02571029118821999072noreply@blogger.com2tag:blogger.com,1999:blog-5029689981158113588.post-64465236741032726062014-07-01T19:24:00.001-06:002014-07-02T09:06:21.573-06:00Simple Python Syslog CounterRecently I did a <a href="http://packetpushers.net/show-192-logging-design-best-practices/" target="_blank">Packet Pushers episode about log management</a>. In it, I mentioned some of the custom Python scripts that I run to do basic syslog analysis, and someone asked about them in the comments.<br />
<br />
The script I'm presenting here isn't one of the actual ones that I run in production, but it's close. The real one sends emails, does DNS lookups, keeps a "rare messages" database using sqlite3, and a few other things, but I wanted to keep this simple.<br />
<br />
One of the problems I see with getting started with log analysis is that people tend to approach it like a typical vendor RFP project: list some requirements, survey the market, evaluate and buy a product to fit your requirements. Sounds good, right? The problem with log analysis is that often you don't know what your requirements really are until you start looking at data.<br />
<br />
A simple message counting script like this lets you look at your data, and provides a simple platform on which you can start to iterate to find your specific needs. It also lets us look at some cool Python features.<br />
<br />
I don't recommend pushing this too far: once you have a decent idea of what your data looks like and what you want to do with it, set up <a href="http://logstash.net/" target="_blank">Logstash</a>, <a href="http://graylog2.com/" target="_blank">Graylog2</a>, or a commercial product like Splunk (if you can afford it).<br />
<br />
That said, <a href="https://gist.github.com/jayswan/96df3f0b9606f2ce84f2" target="_blank">here's the Python</a>:<br />
<br />
<br />
<script src="https://gist.github.com/jayswan/96df3f0b9606f2ce84f2.js"></script>
I tried to make this as self-documenting as possible. You run it from the CLI with a syslog file as the argument, and you get this:<br />
<br />
<span style="font-family: "Courier New",Courier,monospace;">$ python simple_syslog_count.py sample.txt<br /> 214 SEC-6-IPACCESSLOGP<br /> 15 SEC-6-IPACCESSLOGRL<br /> 10 LINEPROTO-5-UPDOWN<br /> 10 LINK-3-UPDOWN<br /> 7 USER-3-SYSTEM_MSG<br /> 4 STACKMGR-4-STACK_LINK_CHANGE<br /> 4 DUAL-5-NBRCHANGE<br /> 3 IPPHONE-6-UNREGISTER_NORMAL<br /> 3 CRYPTO-4-PKT_REPLAY_ERR<br /> 3 SEC-6-IPACCESSLOGRP<br /> 3 SEC-6-IPACCESSLOGSP<br /> 2 SSH-5-SSH2_USERAUTH<br /> 2 SSH-5-SSH2_SESSION<br /> 2 SSH-5-SSH2_CLOSE<br /><br />10.1.16.12<br /><br /> 6 SEC-6-IPACCESSLOGP<br /><br />10.1.24.3<br /><br /> 2 LINEPROTO-5-UPDOWN<br /> 2 LINK-3-UPDOWN</span><br />
<br />
[Stuff deleted for brevity]<br />
<br />
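Stripped of the email, DNS, and sqlite extras, the counting core of such a script fits in a few lines. A sketch (the regexes assume the usual %FACILITY-SEVERITY-MNEMONIC: Cisco message shape and an IPv4 reporter address somewhere on the line; like the gist, it makes no assumption about field positions):

```python
import re
from collections import Counter, defaultdict

# %FACILITY-SEVERITY-MNEMONIC: is the usual Cisco syslog message shape
CISCO_MSG = re.compile(r'%(?P<msg>[A-Z0-9_-]+):')
REPORTER = re.compile(r'(?P<ip>(?:\d{1,3}\.){3}\d{1,3})')

def count_syslog(lines):
    """Return overall message counts and per-reporter message counts."""
    totals = Counter()
    per_reporter = defaultdict(Counter)
    for line in lines:
        m, r = CISCO_MSG.search(line), REPORTER.search(line)
        if m and r:
            totals[m.group('msg')] += 1
            per_reporter[r.group('ip')][m.group('msg')] += 1
    return totals, per_reporter
```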
For Pythonistas, the script makes use of a few cool language features:<br />
<h3>
Named, Compiled Regexes</h3>
<ul>
<li>We can name a regex match with the <span class="s"><span style="font-family: "Courier New",Courier,monospace;">(?P&lt;yourname&gt;PATTERN)</span> syntax, which makes it easy to understand when it's referenced later with the <span style="font-family: "Courier New",Courier,monospace;">.group('yourname')</span> method on the match object.</span></li>
<li><span class="s">This is demonstrated in lines 36-39 and 58-59 of the gist shown above. </span></li>
<li><span class="s">It would be more efficient to capture these fields by splitting the line with the .split() string method, but I wanted the script to work for unknown field positions -- hence the regex. </span> </li>
</ul>
<h3>
Multiplication of Strings</h3>
<ul>
<li>We control indentation by multiplying the ' ' string (that is, a single space enclosed in quotes) by an integer value in the print_counter function (line 50).</li>
<ul>
<li>The reason this works is that the Python <span style="font-family: "Courier New",Courier,monospace;">str</span> class defines a special <span style="font-family: "Courier New",Courier,monospace;">__mul__ </span>method that controls how the * operator works for objects of that class:<br /><span style="font-family: "Courier New",Courier,monospace;">>>> 'foo'.__mul__(3)<br />'foofoofoo'<br />>>> 'foo' * 3<br />'foofoofoo'</span></li>
</ul>
</ul>
<h3>
collections.Counter Objects</h3>
<ul>
<li>Counter objects are a subclass of dictionaries that know how to count things. Jeremy Schulman talked about these in a comment on the <a href="http://unroutable.blogspot.com/2014/06/python-sets-handy-for-network-data.html" target="_blank">previous post</a>. Here, we use Counters to build both the overall message counts and the per-device message counts:</li>
</ul>
<blockquote class="tr_bq">
<span style="font-family: "Courier New",Courier,monospace;">>>> my_msg = 'timestamp ip_address stuff %MY-4-MESSAGE:other stuff'<br />>>> CISCO_MSG = re.compile('%(?P&lt;msg&gt;.*?):')<br />>>> from collections import Counter<br />>>> test_counter = Counter()<br />>>> this_msg = re.search(CISCO_MSG,my_msg).group('msg')<br />>>> this_msg<br />'MY-4-MESSAGE'<br />>>> test_counter[this_msg] += 1<br />>>> test_counter<br />Counter({'MY-4-MESSAGE': 1})</span></blockquote>
<h3>
collections.defaultdict Dictionaries</h3>
<ul>
<li>It could get annoying when you're assigning dictionary values inside a loop, because you get errors when the key doesn't exist yet. This is a contrived example, but it illustrates the point:<br /><br /><span style="font-family: "Courier New",Courier,monospace;">>>> reporters = {}<br />>>> for reporter in ['1.1.1.1','2.2.2.2']:<br />... reporters[reporter].append('foo')<br />... <br />Traceback (most recent call last):<br /> File "&lt;stdin&gt;", line 2, in &lt;module&gt;<br />KeyError: '1.1.1.1'</span><br /> </li>
<li>To fix this, you can catch the exception:<br /><br /><span style="font-family: "Courier New",Courier,monospace;">>>> reporters = {}<br />>>> for reporter in ['1.1.1.1','2.2.2.2']:<br />... try:<br />... reporters[reporter].append('foo')<br />... reporters[reporter].append('bar')<br />... except KeyError:<br />... reporters[reporter] = ['foo']<br />... reporters[reporter].append('bar')</span></li>
</ul>
<ul>
<li>As usual, though, Python has a more elegant way in the <span style="font-family: "Courier New",Courier,monospace;">collections</span> module: <span style="font-family: "Courier New",Courier,monospace;">defaultdict</span> </li>
</ul>
<blockquote class="tr_bq">
<span style="font-family: "Courier New",Courier,monospace;">>>> from collections import defaultdict<br />>>> reporters = defaultdict(list)<br />>>> for reporter in ['1.1.1.1','2.2.2.2']:<br />... reporters[reporter].append('foo')<br />... reporters[reporter].append('bar')<br />>>> reporters<br />defaultdict(&lt;type 'list'&gt;, {'1.1.1.1': ['foo', 'bar'], '2.2.2.2': ['foo', 'bar']})</span></blockquote>
In the syslog counter script, we use a <span style="font-family: "Courier New",Courier,monospace;">collections.Counter</span> object as the type for our defaultdict. This allows us to build a per-syslog-reporter dictionary that shows how many times each message appears for each reporter, while only looping through the input once (line 66):<br />
<br />
<span style="font-family: "Courier New",Courier,monospace;"> per_reporter_counts[reporter][msg] += 1</span><br />
<br />
Here, the dictionary <span style="font-family: "Courier New",Courier,monospace;">per_reporter_counts</span> has the IPv4 addresses of the syslog reporters as keys, with a <span style="font-family: "Courier New",Courier,monospace;">Counter</span> object as the value holding the counts for each message type:<br />
<br />
<span style="font-family: "Courier New",Courier,monospace;">>>> from collections import Counter,defaultdict<br />>>> per_reporter_counts = defaultdict(Counter)<br />>>> per_reporter_counts['1.1.1.1']['SOME-5-MESSAGE'] += 1<br />>>> per_reporter_counts<br />defaultdict(&lt;class 'collections.Counter'&gt;, {'1.1.1.1': Counter({'SOME-5-MESSAGE': 1})})<br />>>> per_reporter_counts['1.1.1.1']['SOME-5-MESSAGE'] += 5<br />>>> per_reporter_counts<br />defaultdict(&lt;class 'collections.Counter'&gt;, {'1.1.1.1': Counter({'SOME-5-MESSAGE': 6})})</span><br />
<br />
If you got this far, you can go implement it for IPv6 addresses. :-)Jay Swanhttp://www.blogger.com/profile/02571029118821999072noreply@blogger.com5tag:blogger.com,1999:blog-5029689981158113588.post-57515171426992651962014-06-20T10:55:00.001-06:002014-06-20T10:55:02.147-06:00Python Sets: Handy for Network DataMy Python-related posts seem to get the most reads, so here's another one!<br />
<br />
A problem that comes up fairly often in networking is finding the number of occurrences of unique items in a large collection of data: let's say you want to find all of the unique IP addresses that accessed a website, traversed a firewall, got denied by an ACL, or whatever. Maybe you've extracted the following list from a log file:<br />
<br />
<span style="font-size: small;"><span style="font-family: "Courier New",Courier,monospace;">1.1.1.1</span></span><br />
<span style="font-size: small;"><span style="font-family: "Courier New",Courier,monospace;">2.2.2.2</span></span><br />
<span style="font-size: small;"><span style="font-family: "Courier New",Courier,monospace;">3.3.3.3</span></span><br />
<span style="font-size: small;"><span style="font-family: "Courier New",Courier,monospace;">1.1.1.1</span></span><br />
<span style="font-size: small;"><span style="font-family: "Courier New",Courier,monospace;">5.5.5.5</span></span><br />
<span style="font-size: small;"><span style="font-family: "Courier New",Courier,monospace;">5.5.5.5</span></span><br />
<span style="font-size: small;"><span style="font-family: "Courier New",Courier,monospace;">1.1.1.1</span></span><br />
<span style="font-size: small;"><span style="font-family: "Courier New",Courier,monospace;">2.2.2.2 </span></span><br />
<span style="font-size: small;"><span style="font-family: "Courier New",Courier,monospace;">...</span></span><br />
<br />
and you need to reduce this to:<br />
<br />
<span style="font-size: small;"><span style="font-family: "Courier New",Courier,monospace;">1.1.1.1</span></span><br />
<span style="font-size: small;"><span style="font-family: "Courier New",Courier,monospace;">2.2.2.2</span></span><br />
<span style="font-size: small;"><span style="font-family: "Courier New",Courier,monospace;">3.3.3.3</span></span><br />
<span style="font-size: small;"><span style="font-family: "Courier New",Courier,monospace;">5.5.5.5</span></span><br />
<br />
In other words, we're removing the duplicates. In low-level programming languages, removing duplicates is a bit of a pain: generally you need to implement an efficient way to sort an array of items, then traverse the sorted array to check for adjacent duplicates and remove them. In a language that has dictionaries (also known as hash tables or associative arrays), you can do it by adding each item as a key in your dictionary with an empty value, then extract the keys. In Python:<br />
<br />
<span style="font-size: small;"><span style="font-family: "Courier New",Courier,monospace;">>>> items = ['1.1.1.1','2.2.2.2','3.3.3.3','1.1.1.1','5.5.5.5','5.5.5.5','1.1.1.1','2.2.2.2']<br />>>> d = {}<br />>>> for item in items:<br />... d[item] = None<br />...<br />>>> d<br />{'5.5.5.5': None, '3.3.3.3': None, '1.1.1.1': None, '2.2.2.2': None}<br />>>> unique = d.keys()<br />>>> unique<br />['5.5.5.5', '3.3.3.3', '1.1.1.1', '2.2.2.2']</span></span><br />
<br />
or, more concisely using a dictionary comprehension:<br />
<br />
<span style="font-size: small;"><span style="font-family: "Courier New",Courier,monospace;">>>> {item:None for item in items}.keys()<br />['5.5.5.5', '3.3.3.3', '1.1.1.1', '2.2.2.2']</span></span><br />
<br />
Python has an even better way, however: the "set" type, which emulates the mathematical idea of a set as a collection of distinct items. If you create an empty set and add items to it, duplicates will automatically be thrown away:<br />
<br />
<span style="font-size: small;"><span style="font-family: "Courier New",Courier,monospace;">>>> s = set()<br />>>> s.add('1.1.1.1')<br />>>> s<br />set(['1.1.1.1'])<br />>>> s.add('2.2.2.2')<br />>>> s.add('1.1.1.1')<br />>>> s<br />set(['1.1.1.1', '2.2.2.2'])<br />>>> for item in items:<br />... s.add(item)<br />...<br />>>> s<br />set(['5.5.5.5', '3.3.3.3', '1.1.1.1', '2.2.2.2'])</span></span><br />
<br />
Predictably, you can use set comprehensions just like list comprehensions to do the same thing as a one liner:<br />
<br />
<span style="font-size: small;"><span style="font-family: "Courier New",Courier,monospace;">>>> {item for item in items}<br />set(['5.5.5.5', '3.3.3.3', '1.1.1.1', '2.2.2.2'])</span></span><br />
<br />
Or, if you have a list built already you can just convert it to a set:<br />
<br />
<span style="font-size: small;"><span style="font-family: "Courier New",Courier,monospace;">>>> set(items)<br />set(['5.5.5.5', '3.3.3.3', '1.1.1.1', '2.2.2.2'])</span></span><br />
<br />
Python also provides methods for the most common types of set operations: union, intersection, difference and symmetric difference. Because these methods accept lists or other iterables, you can quickly find similarities between collections of items:<br />
<br />
<span style="font-size: small;"><span style="font-family: "Courier New",Courier,monospace;">>>> items<br />['1.1.1.1', '2.2.2.2', '3.3.3.3', '1.1.1.1', '5.5.5.5', '5.5.5.5', '1.1.1.1', '2.2.2.2']<br />>>> more_items = ['1.1.1.1','8.8.8.8','1.1.1.1','7.7.7.7','2.2.2.2']<br />>>> set(items).intersection(more_items)<br />set(['1.1.1.1', '2.2.2.2'])</span></span><br />
<span style="font-size: small;"><span style="font-family: "Courier New",Courier,monospace;">>>> set(items).difference(more_items)<br />set(['5.5.5.5', '3.3.3.3'])</span></span><br />
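The union and symmetric_difference methods round out the set operations, and work the same way on the sample data:

```python
items = ['1.1.1.1', '2.2.2.2', '3.3.3.3', '1.1.1.1', '5.5.5.5']
more_items = ['1.1.1.1', '8.8.8.8', '7.7.7.7', '2.2.2.2']

# union: addresses seen in either collection
seen_anywhere = set(items).union(more_items)

# symmetric difference: addresses seen in exactly one collection
seen_once = set(items).symmetric_difference(more_items)
```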
<br />
Have fun!<br />
<br />Jay Swanhttp://www.blogger.com/profile/02571029118821999072noreply@blogger.com55tag:blogger.com,1999:blog-5029689981158113588.post-11130815433983937962014-04-02T09:54:00.001-06:002014-04-02T09:54:03.981-06:00Fun with Router IP Traffic Export and NSM<i><span style="font-size: large;">The Basics </span></i><br />
I finally got around to setting up <a href="http://blog.securityonion.net/" target="_blank">Security Onion</a> (the best network security monitoring package available) to monitor my home network, only to discover that my Cisco 891 router doesn't support the right form of SPAN. Here's how I worked around it. The topology looks like this:<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj4qMLtBtiJ4_txUkskCn8FfnGGjZOdolPsDuIHv32E_4gf593z8b8C1_OlLCpzN5iLAyaGUy2e_D2VyMdhgKINanMJwthxZcsSCvXAbDLgnZRit0lU3H9uyTx0KW8fAKqFNjG1q6hKqu0/s1600/lab.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj4qMLtBtiJ4_txUkskCn8FfnGGjZOdolPsDuIHv32E_4gf593z8b8C1_OlLCpzN5iLAyaGUy2e_D2VyMdhgKINanMJwthxZcsSCvXAbDLgnZRit0lU3H9uyTx0KW8fAKqFNjG1q6hKqu0/s1600/lab.png" height="328" width="400" /></a></div>
<br />
The 891 router has an integrated 8-port switch module, so the simple case would have been a traditional SPAN setup; something like this:<br />
<span style="font-size: small;"><br /></span>
<span style="font-family: "Courier New",Courier,monospace; font-size: small;"><span style="font-family: "Courier New",Courier,monospace;">! vlan 10 is the user VLAN</span> </span><br />
<span style="font-family: "Courier New",Courier,monospace; font-size: small;">monitor session 1 source interface vlan 10</span><br />
<span style="font-family: "Courier New",Courier,monospace; font-size: small;">monitor session 1 destination interface FastEthernet0</span><br />
<br />
with the server's monitoring NIC connected to FastEthernet0.<br />
<br />
The problem is that the 891 doesn't support using a VLAN as a source interface, and because of the way the embedded WAP works, a physical source interface won't work either. Hence, I turned to an obscure feature that's helped me occasionally in the past: <a href="http://www.cisco.com/c/en/us/td/docs/ios-xml/ios/sec_usr_cfg/configuration/15-mt/sec-usr-cfg-15-mt-book/sec-ip-traff-export.html" target="_blank">Router IP Traffic Export</a>. This is a feature for IOS software platforms that lets you enable SPAN-like functionality on almost any source interface.<br />
<br />
The configuration looks like this:<br />
<br />
<span style="font-family: "Courier New",Courier,monospace;">ip traffic-export profile RITE_MIRROR<br /> interface FastEthernet0<br /> bidirectional<br /> mac-address 6805.ca21.2ddd</span><br />
<span style="font-family: "Courier New",Courier,monospace;"><br /></span>
<span style="font-family: "Courier New",Courier,monospace;">interface Vlan10</span><br />
<span style="font-family: "Courier New",Courier,monospace;"> ip traffic-export apply RITE_MIRROR</span><br />
<br />
This takes all traffic routed across the Vlan10 SVI and sends it out the FastEthernet0 interface, rewriting the destination MAC address to the specified value. I used the MAC address of my monitoring NIC, but it shouldn't matter in this case because the monitoring NIC is directly attached. If I wanted to copy the traffic across a switched interface, it would matter.<br />
<br />
My ESXi host (a low-cost machine from <a href="https://zareason.com/" target="_blank">Zareason</a> with 32GB of RAM) has two physical NICs: one for all of the regular VM traffic (using 802.1q to separate VLANs if needed) and one for monitoring. The monitoring pNIC is attached to a promiscuous-mode vSwitch in ESXi, which in turn is connected to the monitoring vNIC on the Security Onion VM. The effect of this is identical to SPAN-ing all the traffic from VLAN 10 to my Security Onion monitoring system; I get Snort, Bro, Argus, and full packet capture with just the built-in software tools in IOS and ESXi.<br />
<br /><span style="font-size: large;"><i>Oddity: RITE Capture, Tunnels?</i></span><br />
Interestingly, you can also use RITE to <a href="http://www.cisco.com/c/en/us/td/docs/ios/12_4t/12_4t11/ht_rawip.html#wp1051438" target="_blank">capture traffic to a RAM buffer and export it to a pcap file</a>. I don't understand why you would use this instead of the much more flexible <a href="http://www.cisco.com/c/en/us/td/docs/ios-xml/ios/epc/configuration/15-mt/epc-15-mt-book/nm-packet-capture.html" target="_blank">Embedded Packet Capture Feature</a>, though.<br />
<br />
Another thing I've wondered is whether you could use an L2 tunnel to send the mirrored traffic elsewhere in the network. The destination interface must be a physical Ethernet interface, but it would be interesting to try using an L2TPv3 tunnel from an Ethernet interface to another router--I have no idea if this would work. <br />
<br />
<i><span style="font-size: large;">Production Use?</span></i><br />
Cisco makes the <a href="http://www.cisco.com/c/dam/en/us/products/collateral/servers-unified-computing/ucs-e-series-servers/data_sheet_c78-705787.pdf" target="_blank">UCS E-series blades</a> for ISR G2 routers that let you run a hypervisor on a blade inside your router chassis. These things have an external Ethernet port on them, so you should be able to connect a RITE export interface to the external port on an E-series blade, and run Security Onion inside your router. I've always wanted to try this, but I haven't been able to get funding yet to test it.Jay Swanhttp://www.blogger.com/profile/02571029118821999072noreply@blogger.com3tag:blogger.com,1999:blog-5029689981158113588.post-6282890813902666032014-03-21T16:04:00.000-06:002014-03-21T16:21:16.412-06:00Quick Thoughts on the Micro Data CenterHere's something that's been on my radar lately: while all the talk in the networking world seems to be about the so-called "massively scalable" data center, almost all of the people I talk to in my world are dealing with the fact that data centers are rapidly getting smaller due to virtualization efficiencies. This seems to be the rule rather than the exception for small-to-medium sized enterprises.<br />
<br />
In the micro data center that sits down the hall from me, for example, we've gone from 26 physical servers to 18 in the last few months, and we're scheduled to lose several more as older hypervisor hosts get replaced with newer, denser models. I suspect we'll eventually stabilize at around a dozen physical servers hosting in the low hundreds of VMs. We could get much denser, but things like political boundaries inevitably step in to keep the count higher than it might be otherwise. The case is similar in our other main facility.<br />
<br />
From a networking perspective, this is interesting: I've heard vendor and VAR account managers remark lately that virtualization is cutting into their hardware sales. I'm most familiar with Cisco's offerings, and at least right now they don't seem to be looking at the micro-DC market as a niche worth addressing: high-port-count switches are basically all that are available. Cisco's design guide for the small-to-medium data center starts in the 100-300 10GE port range, which with modern virtualization hosts will support quite a few thousand typical enterprise VMs.<br />
<br />
Having purchased the bulk of our older-generation servers before 10GE was cheap, we're just getting started with 10GE to the server access layer. Realistically, within a year or so a pair of redundant, reasonably feature-rich 24-32 port 10GE switches will be all we need for server access, probably in 10GBASE-T. Today, my best Cisco option seems to be the Nexus 9300 series, but it still has a lot of ports I'll never use.<br />
<br />
One thought I've had is to standardize on the Catalyst 4500-X for all DC, campus core, and campus distribution use. With VSS, the topologies are simple. The space, power, and cooling requirements are minimal, and the redundancy is there. It has good layer 3 capabilities, along with excellent SPAN and NetFlow support. The only thing it seems to be lacking today is an upgrade path to 40GE, but that may be acceptable in low-port-density environments. Having one platform to manage would be nice. The drawbacks, of course, are a higher per-port cost and lack of scalability -- but again, scalability isn't really the problem.<br />
<br />
Comments welcome.Jay Swanhttp://www.blogger.com/profile/02571029118821999072noreply@blogger.com5tag:blogger.com,1999:blog-5029689981158113588.post-45954677469018973152014-03-05T20:26:00.001-07:002014-03-09T09:04:28.814-06:00Packet Capture in Diverse / Tunneled Networks?(With the usual caveats that I am just a hick from Colorado, I don't know what I'm talking about, etc.)<br />
<br />
I just read <a href="http://netcraftsmen.net/component/easyblog/blogger/listings/pwelcher.html?Itemid=374" target="_blank">Pete Welcher's superb series</a> on NSX, DFA, ACI, and other SDN stuff on the <a href="http://netcraftsmen.net/blogs.html" target="_blank">Chesapeake Netcraftsmen</a> blog, and it helped me think more clearly about a problem that's been bothering me for a long time: how do we do realistically scalable packet capture in networks that make extensive use of ECMP and/or tunnels? Here's a sample network that Pete used:<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="http://netcraftsmen.net/images/stories/71/20140103-figure1.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="http://netcraftsmen.net/images/stories/71/20140103-figure1.jpg" height="211" width="320" /></a></div>
<br />
Conventionally, we place packet capture devices at choke points in the network. But in medium-to-large data center designs, one of the main goals is to eliminate choke points: if we assume this is a relatively small standard ECMP leaf-spine design, each of the leaf switches has four equal-cost routed paths through the spine switches, and each spine switch has at least as many downlinks as there are leaf switches. The hypervisors each have two physical paths to the leaf switches, and in a high-density virtualization design we probably don't have a very good idea of which VM resides on which hypervisor at any point in time.<br />
<br />
Now, add to that the tunneling features present in hypervisor-centric network virtualization schemes: traffic between two VMs attached to different hypervisors is tunneled inside VXLAN, GRE, or STT packets, depending on how you have things set up. The source and destination IP addresses of the "outer" hypervisor-to-hypervisor packet are not the same as the "inner" VM-to-VM addresses, and presumably it's the latter that interest us. Thus, it's hard to even figure out which packets to capture. If we capture all of them (hard at any kind of scale; good 10G line-rate capture is still expensive and troublesome), we still have to filter for the inner tunnel headers to figure out what we're looking at.<br />
<br />
What can we do? I see a few options:<br />
<br />
<br />
<ol>
<li>Put a huge mirror/tap switch between the leaf and spine. Gigamon makes some big ones, with up to 64 x 40GE ports or 256 x 10GE ports. When you max those out, start putting them at the end of each row. They advertise the ability to pop all kinds of different tunnel headers in hardware, along with lots of cool filtering and load-balancing capabilities.</li>
<li>Buy or roll your own rack-mount packet capture appliances on commodity hardware, and run ad hoc SPAN sessions to them from the leaf switches.</li>
<li>Install hypervisor-based packet capture VMs on all your hypervisors and capture from promiscuous mode vSwitches. There are lots of commercial solutions here, or you could roll your own. Update: <a href="https://twitter.com/pjwelcher" target="_blank">Pete Welcher</a> responded on Twitter and mentioned the option of doing packet capture pre-tunnel-encap or post-tunnel-decap. That's originally what I was thinking of with this option, but after reviewing a couple of his posts again, it appears there may be scenarios where the hypervisor makes tunnels to itself, so a better way of doing it might be to implement a packet capture API in the hypervisor itself that can control the point in the tunnel chain where the capture takes place. The next question is: where do we retrieve the capture? Does the API send the capture to a VM, save it on a datastore, dump it to a physical port, send it via another tunnel akin to ERSPAN? I'd want multiple options.</li>
<li>Make sure Wireshark, tshark, or tcpdump is installed on every VM.</li>
<li>Give up on intra-DC capture and focus only at the ingress/egress points. </li>
</ol>
Option 1 is the only one that really confronts both the network diversity and tunnel encapsulation head-on. Those boxes and their administrative overhead don't come cheap, but today this is probably the most fiddle-free option. The other options require a lot of customization and manual intervention that may or may not interfere with change-control procedures, and don't provide obvious solutions for de-obfuscating the tunneled traffic. Option 2 also suffers from serious scalability problems in ECMP designs. Option 5 just avoids the problem, but might work for some people.<br />
<br />
However, the whole point of these designs is "SDN". What I *hope* is going to happen as SDN controllers start to become available is that the controller will be sufficiently aware of VM location that it can instruct the appropriate vSwitch OR leaf-switch OR spine switch to copy packets that meet a certain set of criteria to a particular destination port. Call it super-SPAN (can you tell I'm not headed for a new career in product naming?). It would be nice to be able to define the copied packets in different ways:<br />
<ul>
<li>Conventional L3/L4 5-tuple. This would be nice because it could be informed by NetFlow/IPFIX data, without the need for DPI on the flow-exporter.</li>
<li>VM DNS name, port profile, or parent hypervisor.</li>
<li>QoS class.</li>
<li>"Application profile" -- it remains to be seen exactly what this means, but this is one of those SDN holy-grail things that allows more granular definition of traffic types.</li>
</ul>
Finally, it would be nice if the controller was smart enough to be able to load-balance the copied packets when necessary, so that the same capture target sees both sides of a given flow.<br />
<br />
Again, I know nothing about plans for this stuff from any vendor. But I hope the powers-that-be in the SDN/etc world are thinking about at least some of these kinds of capabilities.<br />
<br />
And... about the time they get that all figured out, we'll have to be dealing with a bunch of that traffic being encrypted between VMs or hypervisors... Jay Swanhttp://www.blogger.com/profile/02571029118821999072noreply@blogger.com4tag:blogger.com,1999:blog-5029689981158113588.post-49356539814817175102014-02-14T15:05:00.003-07:002014-02-14T15:06:55.663-07:00Programmatically Configuring Interface Descriptions from Phone DescriptionsI wrote some Python code that allows you to do the following:<br />
<ol>
<li>Query a Catalyst switch CDP neighbor table from its HTTPS interface,</li>
<li>Extract the device names of the attached IP phones,</li>
<li>Query Communications Manager for the IP phone device description, </li>
<li>Apply the device description as the switch interface description.</li>
</ol>
Obviously, this makes it much easier to see whose phone is attached to a switch port.<br />
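The two text-manipulation steps in the middle of that workflow are easy to sketch in plain Python. The function names and data shapes below are illustrative only, not the actual code in the repo; steps 1 and 3 (the switch HTTPS query and the AXL SOAP call) are assumed to supply the two dictionaries:<br />
<br />

```python
import re

def phone_names_from_cdp(cdp_text):
    """Pull IP phone device names (SEP + 12 hex digits) out of
    'show cdp neighbors detail' output (step 2)."""
    return re.findall(r"Device ID:\s*(SEP[0-9A-Fa-f]{12})", cdp_text)

def interface_descriptions(port_to_phone, phone_to_desc):
    """Turn {interface: phone} and {phone: CUCM description} maps
    into IOS configuration lines (step 4)."""
    config = []
    for port, phone in sorted(port_to_phone.items()):
        desc = phone_to_desc.get(phone)
        if desc:
            config.append("interface %s" % port)
            config.append(" description %s" % desc)
    return config
```

The generated lines can then be pushed through the switch's HTTPS configuration interface, or just printed for copy/paste.<br />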
<br />
I hope that this example saves someone the head-banging that I incurred while trying to figure out the AXL XML/SOAP API for Communications Manager.<br />
<br />
I haven't tested this extensively; all my testing has been on Catalyst 3560 and 3750 switches and CUCM version 8.6. Using the --auto switch to automatically configure the switch is quite slow; this is a limitation of the HTTPS interface rather than the script code. It may be faster to leave that option off and manually copy/paste the printed configuration if you're in a hurry.<br />
<br />
Note that your switch must be configured to allow configuration via the HTTPS interface; you may need to modify your TACACS/etc. configurations accordingly.<br />
<br />
All the relevant info is in the <a href="https://github.com/jayswan/cdp_cucm" target="_blank">Github repo</a>.Jay Swanhttp://www.blogger.com/profile/02571029118821999072noreply@blogger.com2tag:blogger.com,1999:blog-5029689981158113588.post-3928502822272152972014-01-20T11:07:00.000-07:002014-01-20T19:15:44.732-07:00Why Network Engineers Should Learn ProgrammingBecause Microsoft Excel is not a text editor. Seriously.<br />
<br />
This is a followup to the previous post, inspired by Ethan Banks of <a href="http://packetpushers.net/" target="_blank">Packet Pushers</a>. If you do operational networking at all, you deal with text files all the time: logs, debug output, configuration files, command line diagnostics, and more. I'm constantly amazed when I see people open Word or Excel to do their text editing, often one keystroke at a time.<br />
<br />
The number one reason to learn basic programming is to automate that stuff. Personally, I use a combination of traditional Unix shell tools and Python to get the job done, but you could probably do it all with one or the other.<br />
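As a tiny example of the kind of thing I mean (the log lines here are made up, but representative), tallying syslog messages by mnemonic is a few lines of Python rather than an afternoon of spreadsheet surgery:<br />
<br />

```python
import re
from collections import Counter

def count_mnemonics(lines):
    """Tally Cisco-style syslog mnemonics (e.g. %LINK-3-UPDOWN)
    from an iterable of log lines."""
    counts = Counter()
    for line in lines:
        m = re.search(r"%[\w-]+", line)  # first %FACILITY-SEV-MNEMONIC token
        if m:
            counts[m.group(0)] += 1
    return counts
```

Point it at a file object or a list of lines and you get a sorted-ready Counter of message types.<br />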
<br />
There are lots of other reasons to learn programming too, many of which will be discussed on an upcoming Packet Pushers episode. But if you don't believe any of those, this one alone makes it worth the effort.<br />
<br />
Step away from the spreadsheet. Do it now.Jay Swanhttp://www.blogger.com/profile/02571029118821999072noreply@blogger.com1tag:blogger.com,1999:blog-5029689981158113588.post-83879445972205957082014-01-17T10:14:00.001-07:002014-01-17T10:14:17.720-07:00Quick Thoughts on Learning PythonI was scheduled to be a guest on an upcoming episode of the Packet Pushers podcast, on the topic of Python for network engineers. Unfortunately due to bad luck I'm not going to be able to make the recording. Here are some quick thoughts on learning Python. If you're already an expert programmer you already know how to learn languages, so this post isn't for you.<br />
<br />
Scenario 1: You've coded in another language, but you're not an expert.<br />
I would start with the basic Python class at <a href="https://developers.google.com/edu/python/?csw=1" target="_blank">Google Code</a>. It's targeted specifically at people who know basic programming skills in some other language. It was perfect for me; I went through the exercises and was able to quickly start writing simple, useful Python scripts.<br />
<br />
Scenario 2: You don't know how to write code at all.<br />
Start with the <a href="https://www.udacity.com/course/cs101" target="_blank">Udacity CS101</a> class if you like guided learning, or <a href="http://learnpythonthehardway.org/" target="_blank">Learn Python the Hard Way</a> if you prefer books. Be prepared to spend a lot of time on either. It's not easy the first time around.<br />
<br />
After you've gotten through one of those two scenarios, do the following:<br />
<br />
<br />
<ol>
<li>Spend time browsing the documentation for the Python Standard Library. Python is a large language, and chances are there's something in the standard library that will help you meet your goals. If you find yourself writing a lot of lines of code to accomplish something fairly simple, look harder. I recommend skimming the documentation for every module, then looking more carefully at the ones that interest you.</li>
<li><a href="http://www.amazon.com/Matt-Harrison/e/B0077BQLH6/ref=ntt_athr_dp_pel_pop_1" target="_blank">Matt Harrison's books</a> on basic and intermediate Python are excellent. I recommend buying them and reading them.</li>
<li>Jeff Knupp's <a href="http://www.jeffknupp.com/writing-idiomatic-python-ebook/" target="_blank">Writing Idiomatic Python</a> is really good as you gain skills. It's a bit rough around the edges, but it will help you avoid common beginner mistakes and is well worth the read.</li>
<li>Practice a lot. I enjoy the math puzzles on <a href="http://projecteuler.net/" target="_blank">Project Euler</a>. These are not Python specific, but their structure makes them well suited to quick problem solving in any language. Beware -- it's very addictive!</li>
</ol>
Jay Swanhttp://www.blogger.com/profile/02571029118821999072noreply@blogger.com2tag:blogger.com,1999:blog-5029689981158113588.post-55280261075930943392014-01-08T10:57:00.001-07:002014-01-08T10:57:29.781-07:00Using Bro DNS Logging for Network ManagementI was recently asked if someone in our desktop support group could get alerted when certain laptops connected to the corporate network. We have a lot of employees who work at industrial locations and rarely connect their machines to our internal networks, so the support group likes to take those rare opportunities to do management tasks that aren't otherwise automated.<br />
<br />
The two mechanisms that came to mind for alerting on these events are DHCP address assignment, and DNS autoregistration. While we do send DHCP logs to a central archive, the process of alerting on a frequently changing list of hostnames would be somewhat cumbersome. I have been looking for ways to use Bro for network management tasks, so this seemed like a natural use case.<br />
<br />
We already had Bro instances monitoring DNS traffic for two of our central DNS servers. I don't fully understand how Windows DNS autoregistration works, but from looking at the Bro logs, it appears that the DHCP server sends a DNS SOA query to the central DNS servers containing the hostname of the device to which it assigns a lease.<br />
<br />
I wanted to learn how to use the input framework in Bro 2.2, so I wrote the following script and loaded it from local.bro:<br />
<br />
<a href="https://gist.github.com/jayswan/8321141" target="_blank">https://gist.github.com/jayswan/8321141</a><br />
<script src="https://gist.github.com/jayswan/8321141.js"></script>
<br />
This raises a Bro notice whenever one of the hostnames in the hostnames.txt file is seen in a DNS SOA query. I then set up local.bro and broctl to email this notice type to the correct person.<br />
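For comparison, roughly the same check can be done offline against Bro's tab-separated dns.log with plain Python. This is a sketch, not the input-framework script in the gist; it reads the column names from the #fields header rather than hard-coding positions:<br />
<br />

```python
def watched_queries(dns_log_lines, hostnames):
    """Yield (ts, query) for DNS queries whose first label matches a
    watched hostname. Expects Bro dns.log lines, including the
    '#fields' header that names the columns."""
    watched = {h.lower() for h in hostnames}
    fields = []
    for line in dns_log_lines:
        if line.startswith("#fields"):
            # '#fields\tts\tuid\t...' -> column names after the tag
            fields = line.rstrip("\n").split("\t")[1:]
            continue
        if line.startswith("#") or not line.strip():
            continue
        row = dict(zip(fields, line.rstrip("\n").split("\t")))
        query = row.get("query", "")
        if query.split(".")[0].lower() in watched:
            yield row.get("ts", ""), query
```

Feeding it the contents of hostnames.txt and a dns.log gives the same hits the notice raises, which is handy for testing the watch list before deploying it.<br />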
<br />
This works for now, but I'd love to hear from any more experienced Bro programmers about better ways to do it. Jay Swanhttp://www.blogger.com/profile/02571029118821999072noreply@blogger.com0tag:blogger.com,1999:blog-5029689981158113588.post-25364743526141654792014-01-02T16:58:00.002-07:002014-01-02T16:58:54.397-07:00New Year Resolution: Code CleanupI enjoyed Ethan Banks' post on <a href="http://ethancbanks.com/2013/12/29/new-years-thoughts-start-with-documentation/" target="_blank">New Year's Thoughts: Start with Documentation</a>, so I thought I'd write about what I'm doing this week: code cleanup. Over the last couple of years I've written a decent amount of code to automate mundane network management tasks. As quick one-off hacks have turned into things that I actually depend on, I've noticed a lot of ugliness that I want to fix.<br />
<br />
Everything here is assuming Python as the language of choice:<br />
<ul>
<li>For scripts that send email, check to make sure the list of mail receivers is up-to-date.</li>
<li>Look for those nasty embedded IP addresses and replace them with DNS names. </li>
<li>Change from old-style <a href="http://docs.python.org/2/library/stdtypes.html#file-objects" target="_blank"><i>open(FILE)/close(FILE)</i> constructs to <i>with open(FILE) as f</i> constructs</a>.</li>
<li>Get rid of "pickles" for persistent structured data storage. <a href="http://docs.python.org/2/library/pickle.html" target="_blank">Pickles</a> are a form of native Python object serialization that are quick and convenient, but have a lot of potential problems. I've mostly used Python's native <a href="http://docs.python.org/2/library/sqlite3.html" target="_blank">SQLite3</a> library to store data in simple persistent databases, but occasionally I just use plain text files.</li>
<li>Look for repetitive code in the main script logic and try to move it into functions or classes where possible. For example, I had several scripts that were building email reports via clunky string concatenation, so I created a Report class that knows how to append to itself and do simple formatting.</li>
<li>Remove unused module imports (that were usually there for debugging).</li>
<li>Standardize module imports, declaration of constants, etc.</li>
<li>Add more comments!</li>
<li>Remove old code that was commented out for whatever reason.</li>
<li>Look for places where I was creating huge lists in memory and try to figure out a way to reduce memory consumption with <a href="http://docs.python.org/2/reference/expressions.html?highlight=generator#generator-iterator-methods" target="_blank">generators</a>.</li>
</ul>
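Two of the items above (the <i>with</i> construct and generators) look like this in practice; a contrived sketch:<br />
<br />

```python
def count_long_lines(lines, minlen=80):
    """Count lines at least minlen characters long.

    The generator expression consumes one line at a time, so a huge
    input never has to be held in memory as a list.
    """
    return sum(1 for line in lines if len(line.rstrip("\n")) >= minlen)

def count_long_lines_in_file(path, minlen=80):
    # 'with' closes the file even if an exception is raised partway
    # through -- this replaces the old open()/close() pattern.
    with open(path) as f:
        return count_long_lines(f, minlen)
```

Because a file object is itself an iterator of lines, the same function works on files, lists, or any other iterable without modification.<br />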
Things I haven't done, but probably should:<br />
<ul>
<li>Change to new-style <a href="http://docs.python.org/2/library/string.html#format-specification-mini-language" target="_blank">string formatting</a>.</li>
<li>Write some tests, particularly for log parsers.</li>
<li>Migrate from <a href="http://docs.python.org/2/library/optparse.html" target="_blank">optparse</a> to <a href="http://docs.python.org/2/library/argparse.html" target="_blank">argparse</a> for handling CLI flags. </li>
</ul>
Things I haven't done, and probably never will:<br />
<ul>
<li> Migrate to <a href="http://alexgaynor.net/2013/dec/30/about-python-3/" target="_blank">Python 3</a>.</li>
</ul>
Jay Swanhttp://www.blogger.com/profile/02571029118821999072noreply@blogger.com1tag:blogger.com,1999:blog-5029689981158113588.post-84601862096961542992013-11-08T16:07:00.000-07:002013-11-08T16:07:16.141-07:00Handy Tshark ExpressionsTshark is the CLI version of Wireshark, and it's amazing. I'm going to start collecting some of my favorite tshark one-liners here. Check back often.<br />
<br />
<b>Find All Unique Filenames Referenced in SMB2 </b><br />
<span style="font-family: "Courier New",Courier,monospace;">tshark -r file.pcap -Tfields -e ip.src -e ip.dst -e text -Y smb2 | grep -oP "GUID handle File: .*?," | sort | uniq | awk -F: '{print $2}' | sed 's/,//'</span><br />
<br />
Notes:<br />
You don't actually need to include the ip.src and ip.dst fields, since they're not extracted by the grep command. I include them in case I want to do an ad-hoc grep for an IP address during the analysis process. Another way to do the same thing would be to modify the display filter to look only for certain addresses, e.g.:<br />
<br />
<span style="font-family: "Courier New",Courier,monospace;">tshark -r file.pcap -Tfields -e text -Y "smb2 and ip.addr==1.1.1.1" | grep -oP "GUID
handle File: .*?," | sort | uniq | awk -F: '{print $2}' | sed 's/,//'</span>Jay Swanhttp://www.blogger.com/profile/02571029118821999072noreply@blogger.com1tag:blogger.com,1999:blog-5029689981158113588.post-36505207146880135532013-11-01T19:41:00.000-06:002013-11-01T19:41:53.769-06:00How to Tell if TCP Payloads Are Identical<span style="font-family: inherit;">I was working on a problem today in which vendor tech support was suggesting that a firewall was subtly modifying TCP data payloads. I couldn't find any suggestion of this in the firewall logs, but seeing as how I've seen that vendor's firewall logs lie egregiously in the past, I wanted to verify it independently.</span><br />
<span style="font-family: inherit;"><br /></span>
<span style="font-family: inherit;">I took a packet capture from both hosts involved in the conversation and started thinking about how to see if the data sent by the server was the same as the data received by the client. I couldn't just compare the capture files themselves, because elements like timestamps, TTLs, and IP checksums would be different.</span><br />
<span style="font-family: inherit;"><br /></span>
<span style="font-family: inherit;">After a bunch of fiddling around, I came up with the idea of using tshark to extract the TCP payloads for each stream in the capture file and hash the results. If the hashes matched, the TCP payloads were being transferred unmodified. Here are the shell commands to do this:</span><br />
<br />
<span style="font-family: Courier New, Courier, monospace;">tshark -r server.pcap -T fields -e tcp.stream | sort -u | sed 's/\r//' | xargs -i tshark -r server.pcap -q -z follow,tcp,raw,{} | md5sum</span><br />
<span style="font-family: Courier New, Courier, monospace;">2cfe2dbb5f6220f29ff8aff82f7f68f5 *-</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;"><br /></span>
<span style="font-family: inherit;">You then run exactly the same commands on the "client.pcap" file and compare the resulting hashes. Let's break this down a bit more:</span><br />
<span style="font-family: inherit; font-size: x-small;"><br /></span>
<span style="font-family: 'Courier New', Courier, monospace;">tshark -r server.pcap -T fields -e tcp.stream</span><br />
<br />
This invokes tshark to read the "server.pcap" file and output the TCP stream indexes of each packet. This is just a long series of integers:<br />
<span style="font-size: x-small;"><br /></span>
0<br />
0<br />
1<br />
2<br />
1<br />
etc.<br />
<span style="font-size: x-small;"><br /></span>
The next command, <span style="font-family: Courier New, Courier, monospace;">sort -u</span>, produces a logical set of the unique (hence the "-u") stream indexes. In other words, it removes duplicates from the previous list. Not all Unix-like operating systems have the "<span style="font-family: Courier New, Courier, monospace;">sort -u</span>" option; if yours is missing it, you can use "<span style="font-family: Courier New, Courier, monospace;">| sort | uniq</span>" instead.<br />
<br />
Next,<span style="font-family: Courier New, Courier, monospace;"> sed 's/\r//'</span> removes the line break from the end of the resulting stream indexes. If you don't do this, you'll get an error from the next command.<br />
<span style="font-size: x-small;"><br /></span>
The next one's a bit of a doozy: <span style="font-family: Courier New, Courier, monospace;">xargs -i</span> takes each stream index (remember, these are just integers) and executes the <span style="font-family: 'Courier New', Courier, monospace;">tshark -r server.pcap -q -z follow,tcp,raw,{}</span> command once for each stream index, substituting the input stream index for the {} characters.<br />
<br />
The <span style="font-family: 'Courier New', Courier, monospace;">tshark -r server.pcap -q -z follow,tcp,raw,{} </span><span style="font-family: inherit;">command itself reads the capture file a second time, running the familiar "Follow TCP Stream in Raw Format" command from Wireshark on the specified TCP stream index that replaces the {} characters. If you're rusty on Wireshark, "Follow TCP Stream" just dumps the TCP payload data in one of a variety of formats, such as "raw" or ASCII. If you've never used this option in Wireshark, make sure you try it today!</span><br />
<span style="font-family: inherit;"><br /></span>
<span style="font-family: inherit;">The final command, </span><span style="font-family: Courier New, Courier, monospace;">md5sum</span><span style="font-family: inherit;">, runs a MD5 hash on the preceding input.</span><br />
<span style="font-family: inherit;"><br /></span>
To summarize, we've done this: taken a file, extracted all the raw TCP data payloads from its packets (without headers), and hashed the data with MD5. If we do this on two files and the hashes are the same, we know they contain exactly the same TCP data (barring the infinitesimally small probability of an MD5 hash collision).<br />
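If you'd rather do the final comparison in Python (say, after redirecting each tshark pipeline's raw output to a file instead of piping it to md5sum), the hashing step translates directly:<br />
<br />

```python
import hashlib

def md5_of_file(path, blocksize=65536):
    """Hash a file in chunks so large payload dumps never have to
    fit in memory at once."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(blocksize), b""):
            h.update(block)
    return h.hexdigest()

def payloads_identical(path_a, path_b):
    """True if the two payload dumps hash identically."""
    return md5_of_file(path_a) == md5_of_file(path_b)
```

This is the same comparison md5sum performs, just easier to fold into a larger script that also captures or reports on the results.<br />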
<br />
In my case, both capture files produced the same hash, proving that the firewall was (for once) playing nice.Jay Swanhttp://www.blogger.com/profile/02571029118821999072noreply@blogger.com2tag:blogger.com,1999:blog-5029689981158113588.post-87698896882974876132013-10-16T10:53:00.002-06:002013-10-16T11:02:39.654-06:00Java is to JavaScript as Car is to Carpet - a Beginner's GuideSome recent discussions at work have led me to the surprising realization that lots of people working in IT don't understand that Java and JavaScript are almost completely unrelated to each other. This is actually a fairly important misunderstanding to correct: it leads to wasted troubleshooting efforts, such as downgrading or upgrading Windows Java installations in response to browser JavaScript errors.<br />
<br />
I found the title of this blog entry in a <a href="http://stackoverflow.com/questions/245062/whats-the-difference-between-javascript-and-java" target="_blank">StackOverflow post</a>: "Java is to JavaScript as Car is to Carpet". That's pretty much it, in a nutshell. For the record, the only things that Java and JavaScript have in common are:<br />
<ol>
<li>They are both programming languages. </li>
<li>The word "Java".</li>
<li>Both came out of the web technology explosion of the early 1990s.</li>
<li>Both are frequently encountered in the context of web browsers.</li>
</ol>
<a href="https://en.wikipedia.org/wiki/Java_programming_language">Java </a>is a compiled programming language that was originally developed with a major goal of allowing similar or identical codebases to run on different platforms without needing to be recompiled. It does this by compiling to "bytecode" rather than platform-specific machine code, which then typically runs inside a so-called "Java Virtual Machine". Java was originally developed and controlled by Sun Microsystems (now Oracle), but it has since been re-licensed under the GNU General Public License. Numerous open-source Java implementations now exist, but the Oracle/Sun version is still the most familiar to the average user.<br />
<br />
Java is associated with the web browser experience because of the widespread use of Java "applets" that are embedded in browser windows. Applets are not technically part of the browser; the compiled Java bytecode is downloaded by the browser and executed in a Java Virtual Machine (JVM) as a separate process. Applets are frequently transferred as a compressed "Java archive", or JAR file. Applets downloaded by a browser do not necessarily need to run in a browser window, but the fact that they are frequently embedded there leads to some confusion.<br />
<br />
Neither is Java necessarily a client-side technology: many popular server-side applications are written in Java and execute in a server-side JVM. Google's Android platform extends things even further, using Java as the programming language but compiling the bytecode to execute on their own proprietary virtual machine.<br />
<br />
<a href="https://en.wikipedia.org/wiki/Javascript">JavaScript</a>, on the other hand, is an interpreted (i.e., non-compiled) programming language that was originally developed to run inside web browsers. It was developed at Netscape and was later adopted by Microsoft and standardized as "ECMAScript". The use of "Java" in the name "JavaScript" was probably an attempt to piggyback on the popularity of Java; the two languages have almost nothing in common from a technical perspective.<br />
<br />
JavaScript is most frequently used to control the web browser experience, but there are many projects that use JavaScript completely outside the browser. My first experience with this dates back to the late 1990s, when I used a JavaScript-based commercial tool to automate software deployments to Windows workstations. Today, there are many interesting non-browser-embedded JavaScript platforms, such as <a href="http://nodejs.org/">Node.js</a> and <a href="https://github.com/ariya/phantomjs">PhantomJS</a>.Jay Swanhttp://www.blogger.com/profile/02571029118821999072noreply@blogger.com13tag:blogger.com,1999:blog-5029689981158113588.post-72403035189017395312013-08-02T16:43:00.000-06:002013-09-04T07:00:02.475-06:00Understanding Flow Export TerminologyThe variety of terms used to describe network flow export technologies and components can be pretty confusing. Just last year I wrote a post on <a href="http://unroutable.blogspot.com/2012/04/why-netflow-isnt-web-usage-tracker.html" target="_blank">web usage tracking and NetFlow</a> that is already a bit obsolete, so here's an attempt to explain some of the newer terms and capabilities in use today. <br />
<br />
<b>NetFlow Version 5</b><br />
NetFlow v5 is sort of the least common denominator in flow technologies. Almost all vendors and devices that support a flow export technology will do NetFlow v5. Because it's only capable of exporting information about packet fields up to layer 4, however, it's not flexible enough to use for analytics that require information about the application layer. NetFlow v5 tracks only the following data:<br />
<ul>
<li>Source interface</li>
<li>Source and destination IP address</li>
<li>Source and destination port</li>
<li>Layer 4 protocol</li>
<li>TCP flags</li>
<li>Type of Service</li>
<li>Egress interface</li>
<li>Packet count</li>
<li>Byte count</li>
<li>Flow start and end timestamps</li>
<li>BGP origin AS</li>
<li>BGP peer AS</li>
<li>IP next hop</li>
<li>Source netmask</li>
<li>Destination netmask</li>
</ul>
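Because the v5 record layout is fixed, a record can be decoded with a single struct unpack. Here's a minimal sketch in Python that parses one 48-byte v5 flow record (it assumes the 24-byte export packet header has already been consumed, and the field names are my own labels for Cisco's published v5 layout):<br />

```python
import socket
import struct

# NetFlow v5 flow record: fixed 48-byte layout, network byte order.
# 'x' marks the pad bytes in Cisco's published v5 record format.
V5_RECORD = struct.Struct("!IIIHHIIIIHHxBBBHHBBxx")

FIELDS = (
    "srcaddr", "dstaddr", "nexthop", "input_if", "output_if",
    "packets", "octets", "first", "last", "srcport", "dstport",
    "tcp_flags", "proto", "tos", "src_as", "dst_as",
    "src_mask", "dst_mask",
)

def parse_v5_record(data):
    """Decode one 48-byte NetFlow v5 record into a dict."""
    values = dict(zip(FIELDS, V5_RECORD.unpack(data)))
    # Render the three address fields as dotted quads.
    for key in ("srcaddr", "dstaddr", "nexthop"):
        values[key] = socket.inet_ntoa(struct.pack("!I", values[key]))
    return values
```

A real collector would first read the packet header to get the record count, then slice the payload into 48-byte chunks and run each through a function like this.<br />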
<b>NetFlow Version 9</b><br />
NetFlow v9 was Cisco's first attempt at defining an extensible flow export format, defined in <a href="https://www.ietf.org/rfc/rfc3954.txt" target="_blank">RFC 3954</a> back in 2004. It provides a flexible format for building customizable flow export records that contain a wide variety of information types. Many of the goals for flexible flow export were defined in <a href="https://tools.ietf.org/html/rfc3917" target="_blank">RFC 3917</a>:<br />
<ul>
<li> Usage-based accounting</li>
<li>Traffic profiling</li>
<li>Traffic engineering</li>
<li>Attack/Intrusion Detection</li>
<li>QoS monitoring</li>
</ul>
The RFC defines 79 field types that may be exported in NetFlow v9 packets, and directs the reader to the Cisco website for further field types. The <a href="http://www.cisco.com/en/US/technologies/tk648/tk362/technologies_white_paper09186a00800a3db9.html" target="_blank">latest document I could find</a> there defines 104 field types, several of which are reserved for vendor proprietary use and some of which are reserved for Cisco use.<br />
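The extensibility comes from templates: an exporter periodically sends a template FlowSet describing which field types (and lengths) its data records contain, and the collector uses that template to decode subsequent data FlowSets. A minimal sketch of decoding a v9 template FlowSet, assuming the 20-byte v9 packet header has already been consumed (real packets can carry multiple template and data FlowSets):<br />

```python
import struct

def parse_v9_template_flowset(data):
    """Decode one NetFlow v9 template FlowSet (FlowSet ID 0).

    `data` starts at the FlowSet header. Returns a dict of
    {template_id: [(field_type, field_length), ...]}.
    """
    flowset_id, flowset_len = struct.unpack("!HH", data[:4])
    assert flowset_id == 0, "not a template FlowSet"
    templates = {}
    offset = 4
    while offset + 4 <= flowset_len:
        template_id, field_count = struct.unpack("!HH", data[offset:offset + 4])
        offset += 4
        if template_id < 256:
            break  # data template IDs start at 256; anything lower is padding
        fields = []
        for _ in range(field_count):
            ftype, flen = struct.unpack("!HH", data[offset:offset + 4])
            fields.append((ftype, flen))
            offset += 4
        templates[template_id] = fields
    return templates
```

The field type numbers are the ones from the RFC and Cisco's table, e.g. type 8 is IPV4_SRC_ADDR and type 7 is L4_SRC_PORT.<br />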
<br />
<b>IPFIX</b><br />
IPFIX is the IETF standard for extensible flow export. The basic protocol is specified in <a href="https://www.ietf.org/rfc/rfc5101.txt" target="_blank">RFC 5101</a>, but details are included in many other RFCs (Wikipedia has a <a href="https://en.wikipedia.org/wiki/IPFIX#External_links" target="_blank">partial list</a>). IPFIX is based directly on NetFlow v9 and is generally interoperable, but since it's an open standard it is extensible without Cisco involvement. Hundreds of field types are defined in the <a href="https://www.iana.org/assignments/ipfix/ipfix.xhtml" target="_blank">IANA IPFIX documentation</a>.<br />
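IPFIX keeps the v9 template machinery but changes the message header: the version number is 10, and the header carries a total message length in bytes rather than a record count. A minimal decode of the 16-byte header, per the layout in RFC 5101:<br />

```python
import struct

IPFIX_HEADER = struct.Struct("!HHIII")  # 16 bytes, network byte order

def parse_ipfix_header(data):
    """Decode the 16-byte IPFIX message header (RFC 5101)."""
    version, length, export_time, sequence, domain_id = IPFIX_HEADER.unpack(data[:16])
    assert version == 10, "not an IPFIX message"
    return {
        "version": version,
        "length": length,              # total message length in bytes
        "export_time": export_time,    # seconds since the Unix epoch
        "sequence": sequence,
        "observation_domain_id": domain_id,
    }
```

The length-in-bytes change is what lets IPFIX carry variable-length fields (like the HTTP header data mentioned below), which a fixed record count can't describe.<br />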
<br />
<a href="https://datatracker.ietf.org/doc/rfc6759/?include_text=1" target="_blank">RFC 6759</a> defines an extension of IPFIX to include application-specific information in IPFIX export packets. This allows deep-packet-inspection technologies (such as Cisco's NBAR) to send information about non-standardized, tunneled, or encrypted application layer protocols to IPFIX collectors.<br />
<br />
IPFIX is being used by various vendors (Plixer, Lancope, and nProbe/nTop come to mind) to export HTTP header data, making it capable of being used as a web usage tracker or web forensics tool with the appropriate collector/analyzer software. <br />
<br />
<b>Flexible NetFlow</b><br />
As far as I can tell, Flexible NetFlow is a marketing term used by Cisco to encompass everything about their approach to configuring and implementing NetFlow v9 and IPFIX.<br />
<br />
<b>NSEL (NetFlow Security Event Logging)</b><br />
NSEL is a proprietary extension of NetFlow v9 used by Cisco's ASA firewalls to export firewall log data. It's not clear to me why Cisco didn't use IPFIX for this purpose.<br />
<br />
<b>Cisco AVC (Application Visibility and Control)</b><br />
AVC is another Cisco marketing term that encompasses a variety of technologies surrounding the DPI and application-based routing capabilities in its routers, such as IPFIX, NetFlow v9, NBAR, PfR, ART (Application Response Time), and more.<br />
<br />
<b>Other Vendors</b><br />
As mentioned above, most network technology vendors support NetFlow v5 and/or v9. IPFIX support is now becoming very common. Some vendors use proprietary extensions of NetFlow v9; Riverbed's CascadeFlow is one example of this.<br />
<br />
In a followup post, I'll take a look at some tools that produce flow data without using export technologies.<br />
<br />Jay Swanhttp://www.blogger.com/profile/02571029118821999072noreply@blogger.com1