Wednesday, November 14, 2012

Rebooting Routers and Random Search Engine Fodder

I frequently tell junior networking folks that rebooting routers without a good reason is a sign of weakness. All too many people who got their start as Windows sysadmins have learned from experience that the way to fix any weird problem is to reboot. In a Windows environment it usually works.

In the networking world, that doesn't cut it. One of the defining characteristics of a good network engineer is a belief in a "culture of availability", in which you don't take down services in a haphazard, unplanned fashion just to see what happens. Reboot-just-to-see-what-happens isn't a part of that culture.

Unfortunately, software bugs sometimes make reloads necessary, but I prefer to identify them before rebooting. Twice in the last month, though, I've run into bugs that required a reboot without much research beforehand. The first one involved a brand-new 2900-series router that needed a reboot before the SRST EULA would show up as accepted: annoying! I opened a TAC case on this one and was advised to try a reboot before anything else; I never got a bug ID, either.

The second one involved an end-of-support voice gateway that I had to shift from MGCP to H.323 during part of an M&A transition. After making the changes, calls coming in on the PRI would connect, then disconnect immediately before matching any dial-peers: everything in debug isdn q931 looked normal, yet debug voip ccapi showed nothing. The console would log this message:

vnm_dsprm_voice_connect: mismatch htsp

Google was remarkably unhelpful on this, with only a few matches and none that were enlightening. I'm guessing that "vnm" stands for "voice network module", and I know that "htsp" shows up in a lot of voice module debugs, but I don't know much more. The "dsp" part seemed promising, but the gateway had no DSP farm configuration. I'd tried various combinations of "shut/no shut" on the T1 controller, voice port, and D channel with no luck.

With nothing else to go on and end-of-support equipment that had been up for about three years, I decided to give in to weakness and try a reload, just to see what would happen. It immediately started working. Bah.

So, I'm posting this here so that search engines will find it and other people can try a reload if they run into the same situation.

Wednesday, October 24, 2012

Walk on the Wild Side: VoIP over VPN over Internet

Over the years I've seen or heard a lot of snide or offhand comments (from vendors, at conferences, on Twitter, etc.) regarding running voice over Internet VPNs in the enterprise environment. It's often taken for granted that people will pay for MPLS VPNs just to be able to control voice quality, and people who don't do so are sometimes assumed to be either too dumb to know better, or at least "deserving" of what they get.

At the company for which I work, we've migrated over the last several years from a WAN consisting mostly of leased circuits and MPLS VPNs to running almost entirely on IPSec VPNs over the Internet. The biggest reason is cost: we work in what I only half-jokingly refer to on Twitter as #ExtremeRuralNetworking. Many of our WAN sites are in remote locations served by only one small rural LEC, with extremely long distances from the central office. Provisioning MPLS VPNs through nationwide carriers to these sites can be unbelievably expensive, sometimes 20-30 times the cost of a DSL circuit or an Internet T1, or even more. Frequently the only service available is based on some kind of long-range wireless technology. Sometimes local providers will sell you (or a nationwide MPLS provider) something that claims to be a terrestrial T1 but actually includes microwave hops. Occasionally an MPLS provider's backhaul network doubles or even triples the latency compared to an Internet path.

Our experience has been this: VoIP performance over Internet VPNs is almost as good as over MPLS VPNs with dedicated service planes. There is definitely a small percentage of the time that voice quality suffers, but when we asked our business units if they would rather have better voice quality or pay substantially larger WAN bills, the choice was easy. I would go so far as to say that 99+% of the time, most people can't distinguish between Internet VPN voice versus MPLS VPN voice.

Keeping in mind that we deal with relatively low call volume (i.e., we're not running call centers over Internet VPNs!), here are a few things I've learned in setting up VPNs for the best voice-over-VPN-over-Internet quality:
  • Set up your QoS mostly the same way you would over private circuits or MPLS VPNs: put voice in a priority queue, reserve bandwidth for call control, use a scavenger class, etc.
  • Shape your traffic to the provisioned rate of the circuit rather than the physical interface speed; if you have a 1.5 Mb contract delivered over a 100 Mb physical link, make sure you shape to 1.5 Mb.
  • Avoid radically asymmetric circuits if the "up" speed is very slow. We had several sites with a 12 Mb "down" speed and a 768 kbps "up" speed; this proved to be mostly unworkable given our traffic patterns.
  • Use the same providers where possible, and consider the number of AS hops between you and the target AS. This makes for better consistency between sites and streamlines troubleshooting. It also usually results in small reductions in latency, which make a big difference in voice quality.
  • Latency is usually the biggest variable in VoIP over VPN setups. The modern Internet combined with good voice codecs is surprisingly good at dealing with packet loss and jitter, but latency is often highly variable, and as I said above, makes a bigger difference than I would have thought.
  •  Simple and consistent configurations make things easier, as always. We use Cisco DMVPN, which makes for a pretty easy configuration template.
  • If you have more than one uplink, choose the one with the lowest latency as your primary, and check it periodically. Small providers frequently change their transit providers, and it's not uncommon to see big changes in performance through the same small provider several times a year; a rough way to spot-check this from a script appears after this list.
  • Set user expectations. If your business knows you're saving them money and the price is that they have to switch to cell phones, long distance PSTN dialing, or POTS lines a few times a year, they'll deal with it.
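Here's the latency spot-check mentioned in the list above. It times TCP connects rather than ICMP, which is crude but needs nothing beyond the Python standard library; the target host names are placeholders:

import socket
import time

def tcp_connect_ms(host, port=443, attempts=5):
    """Return the median TCP connect time to host:port, in milliseconds."""
    samples = []
    for _ in range(attempts):
        start = time.time()
        sock = socket.create_connection((host, port), timeout=2)
        samples.append((time.time() - start) * 1000.0)
        sock.close()
    return sorted(samples)[len(samples) // 2]

for target in ('vpn-headend.example.com', 'www.example.com'):
    print('%s: %.1f ms' % (target, tcp_connect_ms(target)))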
Your mileage may vary.

Wednesday, August 22, 2012

Tip for Solarwinds NCM Users Facing SmartNet Renewals

It's time for our annual Cisco SmartNet renewals, which always results in much pain, wailing, and gnashing of teeth. If you use Solarwinds Network Configuration Manager, you already have your serial numbers and hostnames in the database. However, there's no canned report that gives you just those two items without a bunch of extra information. Here's a SQL query that makes it a bit easier by extracting the hostname and serial number of your devices:

select nodecaption,chassisid from nodes,cisco_chassis where nodes.nodeid=cisco_chassis.nodeid order by nodecaption asc

If you paste the results of this query into an Excel sheet, you can then use a VLOOKUP function to match the serial numbers your Cisco channel partner sends you to the hostnames recorded in NCM. This makes the true-up a lot easier.
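If you'd rather skip the spreadsheet step, the same match is easy to script. Here's a rough Python sketch; the file names are invented for the example: ncm_export.csv holds the hostname,serial rows from the query above, and partner_serials.csv holds the serial numbers from your channel partner, one per line.

import csv

# Build a serial -> hostname lookup from the NCM query results.
ncm = {}
with open('ncm_export.csv') as f:
    for hostname, serial in csv.reader(f):
        ncm[serial.strip().upper()] = hostname

# Check each serial number on the partner's renewal list against NCM.
with open('partner_serials.csv') as f:
    for row in csv.reader(f):
        if not row:
            continue
        serial = row[0].strip().upper()
        print('%s,%s' % (serial, ncm.get(serial, 'NOT FOUND IN NCM')))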

Tuesday, August 21, 2012

IOS Quick Tip: Find Never Used Switch Ports

Here's a quick regex that allows you to find ports that aren't just currently inactive, but which have never been used since the switch was reloaded:

c3750-ASW2A#sh int counter | i \ 0\ +0\ +0\ +0
Fa1/0/2                0             0             0             0
Fa1/0/5                0             0             0             0
Fa1/0/6                0             0             0             0
Fa1/0/7                0             0             0             0

There's a space after each one of the backslashes. What we're really looking for is ports that show zero packets for all of their interface counters: the regex matches lines containing a zero preceded by a space and followed by three more zeros, each separated by one or more spaces.

Note: you don't actually need the backslashes in IOS; apparently spaces don't need to be escaped in the IOS regex parser. Using them is a habit formed by working with regexes in other OSes. For example:

c3750-ASW2A#sh int counter | i 0 +0 +0 +0
Fa1/0/2                0             0             0             0
Fa1/0/5                0             0             0             0

Initially, I used a $ at the end of the regex to match the zero at the end of the line, but this doesn't seem to work in all IOS versions; I suspect some of them must have whitespace after the zero. In later images, you can use the "count" filter to count the number of never-used ports. Note that because there are separate sections for input and output packets in "show interface counter", you'll need to divide the result by 2:

c3750-ASW2A#sh int counter | count \ 0\ +0\ +0\ +0
Number of lines which match regexp = 60 <-- divide this by 2

If you're running the command via SSH from a Unix-y shell, you can get just the interface names like this:

$ ssh 10.1.1.5 'sh int counter | i 0 +0 +0 +0' | sort | uniq | cut -d ' ' -f 1
Fa1/0/10
Fa1/0/13
Fa1/0/14
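If you'd rather post-process saved output than fiddle with CLI filters, the same idea translates to a few lines of Python. This assumes you've saved "show interface counters" output to a text file; the file name is made up for the example:

import re

# Same idea as the IOS regex above: keep lines where all four counters are zero.
zero_line = re.compile(r'^(\S+)\s+0\s+0\s+0\s+0\s*$')

never_used = set()
with open('switch_counters.txt') as f:
    for line in f:
        match = zero_line.match(line)
        if match:
            never_used.add(match.group(1))  # the interface name

# The set removes the input/output section duplicates, like sort|uniq above.
for port in sorted(never_used):
    print(port)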

Thursday, August 9, 2012

BroExchange 2012 Linkfest

I just went to the 2012 BroExchange, the conference for users of Bro-IDS. I am almost completely new to Bro, but had a great time learning from some seriously smart folks. Here's a random linkfest of stuff:

Brownian, a web front-end to an ElasticSearch back-end for Bro logs.
auditing-sshd, a version of OpenSSH that exports user activity data to Bro (or other log infra, I suppose)
parallel, a GNU tool for executing shell jobs in parallel.
cpacket, maker of mirror switches, similar to Gigamon.
Bro developer Seth Hall's github page
The awesome Security Onion Linux distro with Bro pre-installed.

I may add more as I decipher my notes.

Friday, July 27, 2012

Active Directory and ASA LDAP Authentication

A quick note on using LDAP for multi-domain authentication with Cisco ASA and an Active Directory global catalog server... when using the ASA to match on the value of a user's memberOf attribute, like this:

ldap attribute-map MY_MAP_NAME
  map-value memberOf "CN=foo,OU=bar,DC=example,DC=com" MY_GROUP_POLICY

...the Active Directory group needs to have certain properties:

  1. It must be a security group with universal scope.
  2. Users in the group must have a primary group different from the group matched by the ASA.
  3. The user's primary group must have universal scope.
I don't know if this still holds true if you have only a single domain and you're using the regular Active Directory LDAP service instead of the global catalog service, but in a multi-domain setup the GCS does not correctly report the "memberOf" attribute unless these conditions are met. This is an Active Directory quirk and thus is not directly related to ASAs, but troubleshooting an ASA issue was how I discovered it.
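One way to see what the global catalog actually returns for a test user, independent of the ASA, is to query it directly. Here's a rough sketch using the Python ldap3 library; the server name, bind account, and search base are placeholders:

from ldap3 import Server, Connection, ALL

# Port 3268 is the global catalog service; 389 would be the regular domain LDAP service.
server = Server('gc.example.com', port=3268, get_info=ALL)
conn = Connection(server, user='svc-ldap@example.com', password='xxxxxxxx', auto_bind=True)

conn.search(search_base='DC=example,DC=com',
            search_filter='(sAMAccountName=jdoe)',
            attributes=['memberOf', 'primaryGroupID'])

for entry in conn.entries:
    # memberOf should include the universal group the ASA is matching on
    print(entry.memberOf)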

Sunday, July 1, 2012

CiscoLive 2012 Rundown

Late but better than never:

2012 was my seventh consecutive year at CiscoLive, and for probably the third or fourth year in a row it was the only in-person training event I attended. I've tried to shift to on-line training as much as possible, but the social aspects of CiscoLive are important enough to me that I prefer to go in person. I hadn't been to San Diego in about 15 years, and I really enjoyed the location. I ended up in a hotel that was about 1.5 miles from the conference center, but due to the pleasant location I only ended up taking the shuttle once the entire week; the walk was great.

I didn't have an overall focus this year like I have in years past; I wanted to get exposure to some new stuff, increase depth in other areas, and explore a few things just for fun. I'm not going to review every session, but here are some highlights:

DWDM 101
We don't use DWDM at my company, but it's an area about which I've always wanted to learn more. This was a great introductory session with a lot of new material for me; I'm definitely going to have to review it again if I want to retain it.

Cisco ISR G2: Architectural Overview and Use Cases
My company uses ISRs extensively, so this was a great way to catch up on newer developments in branch routing. The UCS E-series servers that allow you to embed a hypervisor into the branch router were probably the most interesting new product of the show for me. Even though I work for a fairly small company, we are still surprisingly siloed between network and server teams, so it will be interesting to see if the server guys will be interested; they are strongly committed to Dell at this point.

Data Center Design for the Small and Medium Business  
I was pretty excited about this, since it's pitched exactly at my use case. My company falls into the upper end of Cisco's definition of "medium", and is pushing into the small enterprise category in terms of employee count. However, we have a very distributed network, so our data center needs still fall squarely into the "medium" design category.

Architecting Solutions for Security Investigations and Monitoring
I went to this session because the presenter, Martin Nystrom, is one of the great lesser-known presenters at the show. I've been to several of his sessions over the years, and they're often lightly attended despite being some of the best sessions I've seen. I guess logging and monitoring just don't sound sexy enough. Despite the unfortunate verbification of the word "architect" in the session title, this was possibly my favorite session this year. The best thing about Martin's sessions is that he is not in marketing: he runs Cisco's incident response team, and if Cisco products can't do the job, they either get them fixed or use something else. He explains exactly how and why Cisco designs for incident detection and response, and helpfully, often talks about what they tried that didn't work.

Global MPLS WAN Redesign Case Study
I went to this purely out of academic interest, but it turned out to be fascinating. This was a detailed case study of a network migration for an unspecified large government agency with the craziest customer requirements EVER. If you sit down and think about what would possibly corner you into NEEDING to build MPLS VPNs over a mesh of static GRE tunnels using 7600s, you would only be scratching the surface. Only the government could design something this weird.

Exploring the Engineering Behind the Making of a Switch
If you don't have a background in electrical or mechanical engineering, check this out: a really well done explanation of what it takes to bring a switch to market, from customer requirements to chassis design to ASIC design.

The NOC at CiscoLive
This is the last session of the show, and is thus sparsely attended, but it's worth seeing. A generally light-hearted, funny, and informative presentation about designing, building, and running the CiscoLive show network. The highlight this year was the revelation that the big outages at the beginning of the show were caused by blade reloads on a 6500 after somebody "borrowed" a bunch of RAM off the blade before it was shipped to San Diego. The wireless network engineer's explanation of how this was blamed on "wireless problems" was hilarious.

Keynotes

  1. I wish John Chambers would stop giving the same keynote every year.
  2. Padmasree Warrior is a better presenter than her boss. Maybe we can start with her next time.
  3. Please find someone new to make the executive slide decks. They are really awful.
  4. Jamie and Adam were awesome. Mythbusters is pretty much the only show I watch and I was psyched to see them. Thanks CiscoLive!
CCIE NetVet Reception
There should be a hard limit on the number of comments to Chambers about problems with specific products. Maybe it should be zero. I like seeing people ask hard questions, but droning on about your problems with the Wizbang 4000 or whatever is a waste of time.

General Comments
Great job with the vegetarian food options! I am not a vegetarian (I just like to see vegetables suffer), but I eat a lot of vegetables and I really appreciate the great selection this year.

I've also enjoyed Cisco's efforts over the years to reduce the amount of garbage produced by the event. For 2012, I appreciated the reduced amount of junk included with the (excellent this year) backpack. Next time I'm going to try to remember a coffee mug to reduce the number of paper cups I use.

All that said: I'm not sure if I'll go next year... getting to Orlando from rural Colorado is a pain, so I may skip it in favor of something like Sharkfest or Splunk's conference. If I don't make it, I'll see everyone in San Francisco in 2014.

Friday, April 27, 2012

CUCM CDR Cause Codes

Lately I've been doing a fair amount of work with CUCM call detail records. The cause codes are always a pain, since they're recorded as numeric codes whose meanings are listed only in otherwise obscure documents.

To help with this, I created a big list of all the different cause codes used in Cisco Unified Communication Manager call detail records, put them into Python dictionary format, and posted them on Github:

https://github.com/jayswan/cdr/blob/master/causecodes.py

They should be human readable as-is, and easy enough to convert into hash tables for use in other scripting languages. Hopefully this will save someone from having to pore through the Cisco PDFs that otherwise are the only source for this information. I used the CUCM 6.x documentation when creating it, but it should be quite similar for other versions.
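To show how you might use it, here's a trimmed-down illustration. The dictionary name and the handful of codes below are just standard Q.850 values typed from memory; grab the real thing from the Github link above:

# A few standard Q.850 cause codes; see the Github link for the complete set.
cause_codes = {
    1: 'Unallocated (unassigned) number',
    16: 'Normal call clearing',
    17: 'User busy',
}

def cause_to_text(code):
    """Translate a CDR cause code into a readable string."""
    return cause_codes.get(int(code), 'Unknown cause code: %s' % code)

print(cause_to_text(16))    # Normal call clearing
print(cause_to_text('17'))  # User busy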

One of these days I'll clean up my CDR analysis scripts enough for public consumption.

Wednesday, April 18, 2012

Cisco IOS 15.2M and 15.2T Command Reference & Configuration Guides

For my own reference; as of today Cisco's search tool doesn't show these on the first page of results even if you search for the explicit document names. Google and DuckDuckGo do better, but don't find the root pages.

Command References
Configuration Guides

Monday, April 16, 2012

CCIE: Five Year Reflections

I passed the CCIE Routing & Switching lab five years ago today. Back then my number seemed enormous, but now five years later I'm already below the halfway point (as of this writing I believe the numbers are in the mid-to-high thirty-thousands). A lot has changed since then: it seems like "data center" and "cloud" have taken over almost completely as the hot topics in network engineering (with "software defined networking" hot on their heels), and it seems like Cisco has lost some of its shine due to its rapid diversification into markets outside pure networking and the rise of tough competitors in networking niche markets. We've gone through a huge economic contraction that we may or may not be exiting. In the certification world, Cisco has added several new tracks, and other companies have added their own coveted expert-level certifications.

I want to write about a few trends in certification and professional development that I've either observed personally, or that seem to be the subject of frequent discussion on the Internets.

Consolidation
One of the most interesting things I've noticed as a regular attendee at Cisco Live is that almost all CCIEs are in one of three categories:
  1. Consultants working for Cisco resellers.
  2. Employees of Cisco or one of its competitors.
  3. Instructors working in the training and certification business.
Maybe this is sampling bias: perhaps it's just that the majority of CCIEs who attend Cisco Live also fall into one of those categories, and the ones who don't aren't attending in droves. Still, it seems comparatively rare to find CCIEs who are actually employed full time in network design or operations for a single company. I think one reason for this is that as IT employees in operational positions gain experience and seniority, their training and professional development opportunities decrease, possibly due to increasing costs, lack of availability of advanced training, and reluctance of employers to have their A-Teamers away from the office.

I think this is unfortunate, and it may be one cause behind the churn that companies tend to experience among high-level technical employees. The expense of maintaining training and professional development programs for these employees may also be a factor in the amount of outsourcing that we see in the network engineering field.

Track Proliferation and Specialization
I feel like the CCIE Routing & Switching track is kind of like a black belt in a legitimate martial art: it represents a thorough mastery of the basics, impresses novices who don't know any better, and hopefully impresses upon its recipients that they are really just at the beginning of the path. It still seems to me like it would be hard to pass the CCIE lab without understanding fundamental networking really well, but apparently it is possible; it's not uncommon to read about "paper CCIEs", and I've met at least one myself.

For me, the whole motivation behind studying for the lab was to confirm and exercise my understanding of the basics. I'm not a consultant or reseller, and I'm no longer a trainer; I actually work on the same network every day, and although my employer was very supportive of my studies, they certainly didn't require it. This motivation is one of the reasons that I haven't gone on to another track: they're too product specific for what I do. I work daily with Cisco security, voice, and wireless products, but I'm not intellectually driven by that kind of product specificity in anything resembling the way that I'm driven by the underlying theory and practice of general networking. The logical next step for me would probably be the CCDE, and indeed I was lucky enough to be invited as a beta participant in that program. I got spanked badly on the practical and haven't gone back, at least partly because the exam made it clear to me that even if I passed, I wouldn't have the real-world experience of working on multi-thousand router networks to go along with it.

Defining the Super-Generalist
None of this is meant to diminish the accomplishment inherent in the other CCIE tracks in any way: I remain extremely impressed by my friends who have passed the other tracks. However, for people working in mainstream IT networking my observation has been that the world could use more super-generalists. What skills should the super-generalist have? Here's my take on it:

[Edited to add: I'm not saying that this is a high-level skill set that substitutes for a CCIE. I'm saying this is a good base for working towards CCIE, and that if you find yourself missing big chunks of this while working on your second CCIE, you might consider re-prioritizing your learning.]
  • Extremely solid IPv4 networking fundamentals. Certification programs are supposed to emphasize the basics, but I see CCNP-level people who haven't yet fully grokked ARP, STP, connection-oriented vs connectionless concepts, or why routing protocols work the way they do, even if they can explain how they work.
  • A growing familiarity with IPv6, and an appreciation of how protocols other than IPv4 have attempted to solve common problems.
  • The ability to use Wireshark and tcpdump and interpret the resulting data.
  • An understanding of the inner workings of common application-layer protocols, especially HTTP, DNS, and SMTP (yeah yeah, you can say email is dead but people still scream when it breaks). People can and do make entire careers out of each one, but understanding the basics is imperative. I am always amazed at how common it is to see server admins who don't understand HTTP response codes or how a recursive DNS query works.
  • A familiarity with the internals of both Windows and Linux.
  • Familiarity with common virtual machine platforms and how they affect networks.
  • The basics of a scripting language and the common automation tools in the platforms with which you work most frequently.
  • Fundamentals of network monitoring: SNMP, NetFlow, syslog, WMI, taps and mirror ports, considerations for asymmetric flows, etc.
  • The basics of databases. This has long been one of my weakest areas, and something I've been working on fixing.
  • The security considerations surrounding all of the above--and not just from a control standpoint. It's not enough to just know packet filtering and encryption; you also need to understand more than a little about the psychological aspects of security and privacy, and you should understand how your monitoring and diagnostic tools can be used both for good and ill.
  • The big-picture of how the Internet works: what BGP is and the common ways that ISPs connect to customers and to each other, what CDNs are, the role of IANA and the RIRs, what the IETF and RFCs are, etc.
  • A little respect for the ones who have gone before us, and some knowledge of Internet folklore. You damn well ought to know a little about the likes of Paul Baran, Jon Postel, Vint Cerf, Radia Perlman, and many others.
  • The ability to write and speak coherently!
I'm sure I've left a few things out (add them in the comments), but even with just these you can iterate through them for years on end.

Thursday, April 12, 2012

Thoughts on Udacity CS101

Over the last 7 weeks I took the first-ever offering of Udacity's CS101 class. This was billed as a free basic computer science class for raw beginners, using Python as the language of choice. I'm neither a beginner nor a skilled programmer: I started programming more years ago than I care to admit, but I've never done it as a core part of my job, and I'm entirely self-taught. Over the years I've used only a small number of languages:
  • BASIC, back in the Apple ][ days
  • Perl (off and on--mostly off), since the Perl 4.x days
  • Objective-C (back on the OpenStep platform, before MacOS X revived it)
  • I've also played around with several other languages, including C, Pascal, Tcl, JavaScript, and others, but haven't done anything with them beyond the play stage.
  • Python
I started my exposure to Python maybe 18 months ago when I had a few small work-related scripting projects, and I wanted to learn something new. I've historically used Perl for networking-related scripts, but although I'd probably used it more than any other language, it's never really clicked for me in an intuitive way.

I started by working through the introductory Python course on Google Code, and it immediately felt right; I was able to quickly start writing small useful scripts without a lot of trouble. After that, I spent some time working through some other tutorials, materials from Pycon, and puzzles at Project Euler. I also wrote quite a few small projects at work, some of which I've blogged about here. Naturally, I was excited when I heard about Udacity's new curriculum; I've always wanted to take some CS classes to fill in holes in my base knowledge, but I've never had time. So, how did it go?

The good:
  • The user interface and website functionality was GREAT. I loved the format and delivery style of the videos. The tablet interface that the instructors use to deliver the course material is outstanding.
  • The embedded Python interpreter works really well. Only a few times did I feel it was necessary to code outside the embedded tools.
  • The homework assignments were well crafted and fairly graded.
  • The instructors were excellent. I really liked the "field trips" into real-world environments and the trips into computing history.
  • The short class cycles are really nice, and make it easier to keep up with the course.
Caveats (I'm not saying this is bad... just stuff to be aware of)
  • This doesn't seem to be a course for raw beginners, unless you have a ton of time to put into it. It started off really slow, but quickly ramped up the pace. If I had had no programming background I wouldn't have been able to keep up after the first few weeks.
  • It doesn't seem to be directed at intermediate programmers either. With the exception of the final sections on recursion, I didn't learn anything completely new, but the experience of doing the homework was still excellent for practicing and clarifying the basics. That said, I don't think they could have done it much better... when you are offering a class to thousands of participants, you need to make some tough choices. Frame your expectations accordingly.
  •  I wish that they had introduced some core Python concepts earlier and worked backwards into implementation details, rather than working forwards from smaller pieces. For example: before learning about Python dictionaries (i.e. hash tables), we had to implement a simple hash table using lists. This was interesting, but tedious--and I had the advantage of being able to see where they were going with it. I felt like it would have been better to introduce the dictionary first, then explain how you would implement it using simpler components. They also skipped some of the constructs that make Python seem so much more powerful and intuitive than some other languages, such as list comprehensions and generator expressions. The instructors have much more experience with teaching the material than I ever will, however--so maybe their way works better for the majority of students.
  • I never got into the discussion forums--they seemed rather chaotic, and every time I checked them I had to wade through a lot of posts complaining about issues with the grading. At the same time, the forums are really the only place you can get any personalized assistance, so I guess I'll need to figure them out in the future.
Overall, I thought it was a great experience and I'll definitely be taking another course from Udacity in the future.

Thursday, April 5, 2012

Why NetFlow Isn't A Web Usage Tracker

2013 Update:
As I mentioned in an update to the original post, HTTP tracking is now available via custom IPFIX exports in a variety of products. Cisco's ISR G2 and ASR 1K routers now have this export available as part of their MACE feature set (Data license required). You'll still need a collector capable of dealing with this export record, however. The content below still applies to NetFlow v5 and traditional NetFlow v9.

Here's a question I find myself answering frequently on the Solarwinds NetFlow forum:

How can I use NetFlow to track the websites being accessed from my network?

The short answer that I usually give on the forum is this: you can't, because NetFlow v5 doesn't track HTTP headers. With this blog post, though, I'll go into the answer in more detail so that I can refer people to it in the future.

First, a quick review of what NetFlow is, and how it works:
  • When NetFlow is enabled on a router interface, the router begins to track information about the traffic that transits the interface. This information is stored in a data structure called the flow cache.
  • Periodically, the contents of the flow cache can be exported to a "collector", which is a process running on an external system that receives and stores flow data. This process is called "NetFlow Data Export", or NDE. Typically the collector is tied into an "analyzer", which massages the flow data into something useful for human network analysts.
    •  NDE is optional. One can gather useful information from NetFlow solely from the command-line without ever using an external collector.
  • Data that can be tracked by NetFlow depends on the version. The most commonly deployed version today is NetFlow version 5, which tracks the following key fields:
    • Source interface
    • Source and destination IP address
    • Layer 4 protocol (e.g., ICMP, TCP, UDP, OSPF, ESP, etc.)
    • Source and destination port number (if the layer 4 protocol is TCP or UDP)
    • Type of service value
  • These "key fields" are used to define a "flow"; that is, a unidirectional conversation between a pair of hosts. Because flows are unidirectional, an important feature in NetFlow analysis software is the ability to pair the two sides of a flow to give a complete picture of the conversation.
  • Other "non-key" fields are also tracked. In NetFlow version 5, the other fields are as follows. Note that not all collector software preserves all the fields.
    • TCP flags (used by the router to determine the beginning and end of a TCP flow)
    • Egress interface
    • Timestamps
    • Packet and byte count for the flow
    • BGP origin AS and peer AS
    • IP next-hop
    • Source and destination netmask
  • NetFlow v9, Cisco Flexible NetFlow, and IPFIX (the IETF flow protocol, which is very similar to NetFlow v9) allow user-defined fields that can track any part of the packet headers. IPFIX offers enough flexibility to track information about HTTP sessions, and many vendors are starting to implement this capability.
  • Many vendors have defined other flow protocols that offer more or fewer capabilities, but virtually all of them duplicate at least the functions of NetFlow v5.
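To make the "key fields" idea concrete, here's a toy sketch of a flow cache in Python. The structure and field names are mine, purely for illustration; it's not how IOS implements the cache internally:

from collections import defaultdict

# Each unique combination of key fields identifies one unidirectional flow.
flow_cache = defaultdict(lambda: {'packets': 0, 'bytes': 0})

def account_packet(src_if, src_ip, dst_ip, proto, src_port, dst_port, tos, length):
    key = (src_if, src_ip, dst_ip, proto, src_port, dst_port, tos)
    flow_cache[key]['packets'] += 1
    flow_cache[key]['bytes'] += length

# Three packets from a client to a web server all land in the same flow entry;
# the server's replies would show up as a separate flow in the other direction.
for size in (60, 1500, 1500):
    account_packet('Gi0/1', '10.1.1.10', '203.0.113.5', 6, 51515, 80, 0, size)

for key, counters in flow_cache.items():
    print('%s -> %s' % (key, counters))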
For reference, here's a snapshot from a packet capture of a NetFlow v5 export packet (the destination public IP address has been disguised as an RFC 1918 address):

    pdu 1/30
        SrcAddr: 203.79.123.118 (203.79.123.118)
        DstAddr: 10.118.218.102 (10.118.218.102)
        NextHop: 0.0.0.0 (0.0.0.0)
        InputInt: 1
        OutputInt: 0
        Packets: 3
        Octets: 144
        [Duration: 1.388000000 seconds]
            StartTime: 3422510.740000000 seconds
            EndTime: 3422512.128000000 seconds
        SrcPort: 3546
        DstPort: 445 <-- probably a port scan for open Microsoft services
        padding
        TCP Flags: 0x02
        Protocol: 6  <-- this is the layer 4 protocol; i.e. TCP
        IP ToS: 0x00
        SrcAS: 4768 <-- this particular router is tracking BGP Origin-AS
        DstAS: 0
        SrcMask: 22 (prefix: 203.79.120.0/22)
        DstMask: 30 (prefix: 10.118.218.100/30)
        padding
Returning to our original question:

NetFlow v5 isn't a good web usage tracker because nowhere in the list of fields above do we see "HTTP header". The HTTP header is the part of the application layer payload that actually specifies the website and URL that's being requested. Here's a sample from another packet capture:

GET / HTTP/1.1
User-Agent: curl/7.21.6 (i686-pc-linux-gnu) libcurl/7.21.6 OpenSSL/1.0.0e zlib/1.2.3.4 libidn/1.22 librtmp/2.3
Host: www.ubuntu.com
Accept: */*

This is the request sent by the HTTP client (in this case the "curl" command-line HTTP utility) when accessing http://www.ubuntu.com. The "GET / HTTP/1.1" request line asks for the root ("/") of the website referenced by the "Host:" field; i.e., www.ubuntu.com.

The IP address used in this request was 91.189.89.88. However, if we do a reverse lookup on this address, the record returned is different:

$ dig -x 91.189.89.88 +short
privet.canonical.com.

A little search-engine-fu shows that several other websites are hosted at the same IP address:

kubuntu.org
canonical.com

If we do the same trick with other websites (like unroutable.blogspot.com, hosted by Google), we can easily find cases in which there are dozens of websites hosted at the same IP address.

Because NetFlow doesn't extract the HTTP header from TCP flows, we have only the IP address to go on. As we've seen here, many different websites can be hosted at the same IP address; there's no way to tell just from NetFlow whether a user visited www.canonical.com or www.ubuntu.com. Furthermore, with the most popular sites hosted on content distribution caches or cloud service providers, the reverse DNS lookups for high-bandwidth port 80 flows frequently resolve to names in networks like Akamai, Limelight, Google, Amazon Web Services, Rackspace, etc., even if those content distribution networks have nothing to do with the content of the actual website that was visited.

The bottom line is this: if you want to track what websites are visited by users on a network, NetFlow v5 isn't the best tool, or even a good one. A web proxy (e.g., Squid) or a web content filter (e.g., Websense, Cisco WSA, etc.) is probably the best tool, since they track not only HTTP Host headers but also (usually) the Active Directory username associated with the request.

Other tools that could do the job are security-related tools like httpry or Bro-IDS, both of which have features for HTTP request tracking. These tools are both available in the excellent Security Onion Linux distribution.

[Edited to add] The anonymous commenter below observes that nProbe exports HTTP header information via IPFIX, and notes that some vendors have firewalls that do so as well. nProbe is an excellent free tool that takes a raw packet stream and converts it to NetFlow or IPFIX export format.

Friday, March 30, 2012

Configure Windows IP Addressing from Command Prompt

For reference, because I have a mental block about remembering this:

netsh interface ipv4 set dnsservers "Local Area Connection" static 8.8.8.8 primary
netsh interface ipv4 set address "Local Area Connection" static 1.1.1.2 255.255.255.0 1.1.1.1

And to go back to DHCP:

netsh interface ipv4 set address "Local Area Connection" source=dhcp

Wednesday, March 7, 2012

PyLab with iPython on MacOS X 64-bit

Putting this here for search engine fodder, since it took me a while to figure out...

If you try to run the way-cool PyLab feature of iPython on MacOS X with a 64-bit platform (e.g., my iMac running Lion), you get this:

ImportError: dlopen(/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy/core/multiarray.so, 2): no suitable image found.  Did find:
    /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy/core/multiarray.so: no matching architecture in universal wrapper

What this rather cryptic error means is that the numpy library is 32-bit only. To get it to work, simply run iPython in 32-bit mode with the "arch -i386" command:

$ arch -i386 ipython --pylab

The numpy module should now load normally, and PyLab should start.

Friday, February 10, 2012

Cisco CDR Timestamp and IP Address Conversion

The call detail records (CDRs) used by Cisco Unified Communications Manager (aka Call Manager) can be tricky to interpret. Two fields that are frequently confusing are the IP address fields and the timestamp field.

Continuing my theme of using Python to help with Cisco product administration, here's how to convert timestamps and IP addresses to human-readable format.

Timestamps are pretty easy. They're recorded in UNIX epoch time and there are lots of examples available showing how to convert them using Excel or various scripting tools. In Python:

import time

def time_to_string(time_value):
    """convert Unix epoch time to a human readable string"""
    return time.strftime("%m/%d/%Y %H:%M:%S",time.localtime(float(time_value)))

It took me a while to figure out how to convert IPv4 addresses, since CUCM uses signed 32-bit integers to represent IP addresses and Python's long integers can be of infinite length. A review of how to do the conversion manually is in order. Let's say that a CUCM CDR lists an IP address as "-2126438902". First, we convert this signed 32-bit integer to hex using Windows calculator:

-2126438902 = 0x81411E0A

Next, we break the hex number into 1-byte chunks, and reverse the order:

0x0A
0x1E
0x41
0x81

Finally, we convert each byte to decimal and put them together into an IP address:

10.30.65.129

Here's the Python function to do the conversion:
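def int_to_ip(int_value):
    """Convert a signed 32-bit integer from a CUCM CDR to a dotted-decimal string."""
    # Same steps as the manual conversion above: treat the value as an
    # unsigned 32-bit number, then read its four bytes in reverse order.
    unsigned = int(int_value) & 0xFFFFFFFF
    octets = [(unsigned >> shift) & 0xFF for shift in (0, 8, 16, 24)]
    return '.'.join(str(octet) for octet in octets)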



And here it is in the interpreter showing that it works for reasonable input types:

>>> int_to_ip('-2126438902')
'10.30.65.129'
>>> int_to_ip(-2126438902)
'10.30.65.129'
>>> int_to_ip(-2126438902L)
'10.30.65.129'

I don't have access to an IPv6-aware CUCM install, so you're on your own for that!

Sunday, January 29, 2012

Amusement: Python and Netmasks

As a network engineer, I frequently need to convert between hex and decimal. While I'm reasonably good at doing this in my head for smaller numbers, when it comes to deciphering stuff like higher TCP or UDP port numbers written in hex, I usually end up using the Python interpreter that's open somewhere on my machine. For me, the Python interpreter is the best general purpose calculator app I've found. Using the port number for Flash as an example:

jswan$ python
Python 2.7.2 (v2.7.2:8527427914a2, Jun 11 2011, 15:22:34)
[GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> 0x78f
1935
>>> hex(1935)
'0x78f'

The topic of numeric base conversions often makes my mind drift to every former networking instructor's pet topic: IPv4 subnetting. The other day, I started playing with using the Python interpreter to find network IDs, which is really easy as long as you're using hex:

>>> hex(0x0a0a0a25 & 0xffffffe0)
'0xa0a0a20'

This little snippet finds the bitwise-AND of 10.10.10.37 and 255.255.255.224, which as every up-and-coming CCNA knows is the network ID for that subnet: 10.10.10.32. Back when I was teaching Cisco classes full-time, I used to have a never-ending argument with another instructor about the fact that I didn't teach bitwise operations as part of subnetting: my position was that if all you are doing is solving subnet problems with your squishy human brain, you don't need to learn a bunch of truth tables when there are easier ways to do it in human memory. However, if you're writing code you actually do need to do bitwise operations.

Unfortunately most of us (me included) aren't wired for reading IPv4 addresses in hex. So I started wondering how little Python code I could use to calculate network IDs for IPv4 in dotted decimal. After a little screwing around, I started wondering if I could fit the entire thing into a Twitter post. Here's what I came up with:

while 1:'.'.join([str(a&m)for a,m in zip([int(n)for n in raw_input('addr?').split('.')],[int(n)for n in raw_input('mask?').split('.')])])

Try it:

addr?1.1.1.37
mask?255.255.255.224
'1.1.1.32'

Now I realize this bit of code is neither particularly clever nor easy to read, which makes it bad code. But hey, it fits in a single tweet. It works by using one of Python's coolest features, the list comprehension. If we start with the innermost parts, it makes more sense:

[int(n) for n in raw_input('addr?').split('.')]
[int(n) for n in raw_input('mask?').split('.')]

These two sections return lists of integers corresponding to the address and mask the user enters. For example:

[1,1,1,37]
[255,255,255,224]

Next, the "zip" function returns a list of tuples that pair corresponding entries in the two lists:

[(1, 255), (1, 255), (1, 255), (37, 224)]

Next, the outermost list comprehension performs a bitwise-AND of each tuple, returning the octets of our network ID in a new list:

[1,1,1,32]

Finally, the "join" method puts them back together into "1.1.1.32", and "while 1:" makes it loop until you Ctrl-C out of it.

Working with netmasks in CIDR notation is a bit more complicated, and requires more than one line of code--I'll save that for another post.