Thursday, December 31, 2009

predicting reload times on Catalyst 3560/3750

During a recent IOS upgrade on a Catalyst 3560, I was connected to the console and noticed that the reload was taking much longer than usual because of some operations by the "Front-end Microcode IMG MGR". The output looked like this:

POST: PortASIC RingLoopback Tests : Begin
POST: PortASIC RingLoopback Tests : End, Status Passed

front_end/ (directory)
extracting front_end/fe_type_1 (34760 bytes)
extracting front_end/front_end_ucode_info (86 bytes)
extracting front_end/fe_type_2 (73104 bytes)
extracting ucode_info (76 bytes)

Front-end Microcode IMG MGR: Installed 3 image(s) in cache:

Front-end Microcode IMG MGR: found microcode images for 3 devices.
Image for front-end 0: flash:/front_end_ucode_cache/ucode.1
Image for front-end 7: flash:/front_end_ucode_cache/ucode.1
Image for front-end 14: flash:/front_end_ucode_cache/ucode.1

Front-end Microcode IMG MGR: Preparing to program device microcode...
Front-end Microcode IMG MGR: Preparing to program device[0]...26580 bytes.
Front-end Microcode IMG MGR: Programming device 0...rwRrrrrrrwsssspsssspsssspsss
[output truncated]

I opened a TAC case to find out what this was, because if you rely on highly predictable reload times during a maintenance window, this could throw a wrench into your plans.

It turns out that these Catalyst switches have a special-purpose microcontroller that rarely needs to be upgraded. When it does need upgrading, however, the upgrade happens as a normal part of a new IOS image load. This upgrade makes the first reload to the new IOS take much longer than usual. I didn't time it, but I would guess 3-4 times longer than normal.

Microcontroller upgrades are not typically listed in the image release notes, so the only way to know for sure how long a particular upgrade is going to take is to test it in a lab, using the exact same before/after images that you will use in production.
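If you capture the console session during the lab test, a few lines of script can tell you whether the microcode programming step ran. The following is only a minimal sketch: the pattern is copied from the output quoted above, the function name is made up for illustration, and other IOS releases may word the message differently.

```python
import re

# Marker copied from the console output quoted in this post; other IOS
# releases may word it differently, so treat the pattern as an assumption.
PROGRAMMING_RE = re.compile(r"Front-end Microcode IMG MGR: Programming device (\d+)")

def microcode_devices_programmed(console_log: str) -> list[int]:
    """Return the device numbers the image manager reported programming."""
    return [int(m.group(1)) for m in PROGRAMMING_RE.finditer(console_log)]

# Three lines taken from the capture above.
sample = """\
Front-end Microcode IMG MGR: Preparing to program device microcode...
Front-end Microcode IMG MGR: Preparing to program device[0]...26580 bytes.
Front-end Microcode IMG MGR: Programming device 0...rwRrrrrrrwsssspsssspsssspsss
"""

devices = microcode_devices_programmed(sample)
if devices:
    print("microcode upgrade seen for device(s):", devices)
else:
    print("no microcode programming lines found")
```

An empty result from a lab reload between the same before/after images suggests, but does not prove, that the production reload will skip the slow step as well.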

Monday, December 21, 2009

ACLs and TCAMs in Catalyst Switches

One of the things you need to look at when designing networks with Catalyst switches is the potential for TCAM exhaustion due to ACL and QoS configuration. Here are a couple of documents that explain the issue:

Catalyst 6500

Catalyst 4500 and 4900 Series

Tuesday, December 15, 2009

ten steps of small LAN design

A few days ago I posted an amusing comment on Ivan Pepelnjak's always excellent Cisco IOS Hints and Tricks blog, and he found it funny enough to create a separate post. I'll repeat my ten step program here for future reference:

  1. Build everything at layer 2 because "it's simpler".
  2. Scale a little.
  3. Things start breaking mysteriously. Run around in circles. Learn about packet sniffers and STP.
  4. Learn about layer 3 features in switches you already own. Start routing.
  5. Scale more.
  6. Things start breaking mysteriously. Learn about TCAMs. Start wishing for NetFlow.
  7. Redesign. Buy stuff.
  8. Scale more.
  9. VMware jockeys start asking about bridging across the WAN.
  10. Enroll in hair loss program.

incoming dial-peers

I had an interesting troubleshooting experience that showed me that I didn't fully understand how incoming dial-peers work with POTS lines.

I had a simple H.323 config that hands off a call arriving on an FXO port to CallManager:

voice-port 1/0/2
 connection plar 7001
 description POTS line
 caller-id enable

dial-peer voice 7001 voip
 destination-pattern 700.
 session target ipv4:
 dtmf-relay h245-alphanumeric

When a call was placed to the line connected to the FXO port on 1/0/2, the call would be sent to the IP phone with the wrong caller ID.

I ran "debug voip ccapi" and discovered that the incoming dial-peer was not the default dial-peer 0 but another dial-peer (numbers sanitized):

dial-peer voice 1000 pots
 description 555-1212
 destination-pattern 1212
 clid network-number 9705551212
 port 1/0/2

This dial-peer had accidentally been left active from a prior configuration, and its "clid network-number" command was thus overwriting the correct caller ID.

I didn't know this previously, but it turns out that an incoming POTS dial peer is matched if it has a "port" statement equal to the inbound voice-port, AND any one of the following three commands is present:

incoming called-number
answer-address
destination-pattern

Removing the destination-pattern command or removing the dial-peer entirely corrects the problem and causes dial-peer 0 to be matched inbound.
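The rule is easier to reason about as a small model. The sketch below is not how IOS is implemented, and the names (PotsDialPeer, inbound_pots_peer, QUALIFYING) are invented for illustration. The set of qualifying commands is my reading of Cisco's dial-peer matching documentation, so treat it as an assumption. The model also ignores how IOS chooses among several eligible peers and simply returns the first one.

```python
from dataclasses import dataclass, field

# Commands that make a POTS dial-peer eligible for inbound matching in this
# sketch. The set is an assumption, not something read back from a router.
QUALIFYING = {"incoming called-number", "answer-address", "destination-pattern"}

@dataclass
class PotsDialPeer:
    tag: int
    port: str
    commands: set = field(default_factory=set)  # command keywords configured on the peer

def inbound_pots_peer(voice_port: str, peers: list) -> int:
    """Return the tag of the first eligible peer, or 0 for the default dial-peer."""
    for peer in peers:
        # Eligible: same port AND at least one qualifying command present.
        if peer.port == voice_port and peer.commands & QUALIFYING:
            return peer.tag
    return 0

# The leftover dial-peer from this post: it still has a destination-pattern.
stale = PotsDialPeer(1000, "1/0/2", {"destination-pattern", "clid network-number"})
print(inbound_pots_peer("1/0/2", [stale]))  # 1000

# The same peer with destination-pattern removed falls through to dial-peer 0.
fixed = PotsDialPeer(1000, "1/0/2", {"clid network-number"})
print(inbound_pots_peer("1/0/2", [fixed]))  # 0
```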

Tuesday, December 8, 2009

simple exclusion filters

I use these constantly (and many others, but these come to mind first):

display only interfaces with assigned IP addresses:
sh ip int b | e una

display only active switch interfaces:
sh int status | e not

display CDP neighbors, except phones:
sh cdp n | e SEP
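Each of these works because "e" (short for "exclude") drops every output line that contains a match for the pattern, wherever on the line it appears. A rough Python model of that behavior shows one side effect worth knowing about. This is a sketch only: the sample lines are hand-written and hypothetical, not real switch output, and Python's regex syntax is not identical to the IOS one.

```python
import re

def exclude(output: str, pattern: str) -> str:
    """Drop every line that contains a match for pattern, like '| e pattern'."""
    rx = re.compile(pattern)
    return "\n".join(line for line in output.splitlines() if not rx.search(line))

# Hypothetical, hand-written lines in the general shape of "sh int status" output.
sample = """\
Gi0/1  uplink            connected   trunk
Gi0/2  printer           notconnect  10
Gi0/3  do not unplug     connected   20"""

# Only the Gi0/1 line survives: Gi0/2 is dropped for "notconnect", and
# Gi0/3 is dropped because its description happens to contain "not".
print(exclude(sample, "not"))
```

The short patterns are convenient to type, but a description or hostname containing the same characters will be filtered out too.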