Friday, June 21, 2013

Spanning-tree Review

Spanning Tree Protocol (STP) was standardized in IEEE 802.1D as a layer-2 protocol designed to eliminate switching loops, or rather "bridging" loops, since bridges were the prominent device when the standard was adopted.

Root Bridge

The root bridge is the "master" bridge for maintaining all STP information within a network. When a switch boots up, it initially assumes it is the root bridge and sends Bridge Protocol Data Units (BPDUs) out all of its ports, at which point an election takes place. The switch with the lowest bridge ID becomes the root.

The bridge ID is composed of two parts. The first is the bridge priority, a number between 0 and 61440 that can be configured on the switch in increments of 4096. The default priority, which all Cisco switches are initially configured with, is 32768 (0x8000 in hex). The second component is the switch's MAC address. In an election, the switch with the lowest bridge ID becomes the root: the priority is compared first, and the MAC address breaks any tie.
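The comparison can be sketched in a few lines of Python. This is only an illustration of the "priority first, then MAC" rule, not how a switch implements it; the BridgeID class, the elect_root helper, and the MAC addresses are all made up for the example.

```python
from typing import NamedTuple


class BridgeID(NamedTuple):
    priority: int  # 0-61440, in increments of 4096
    mac: str       # hypothetical address, e.g. "00:1a:2b:3c:4d:5e"

    def key(self):
        # Priority is compared first; the MAC (as a number) breaks ties.
        return (self.priority, int(self.mac.replace(":", ""), 16))


def elect_root(bridges):
    """Return the bridge with the lowest bridge ID."""
    return min(bridges, key=BridgeID.key)


switches = [
    BridgeID(32768, "00:1a:2b:3c:4d:5e"),
    BridgeID(32768, "00:1a:2b:3c:4d:01"),
    BridgeID(4096, "00:ff:ff:ff:ff:ff"),
]

# The switch with priority 4096 wins even though its MAC is the highest.
root = elect_root(switches)

# With equal (default) priorities, the lowest MAC decides.
root_by_mac = elect_root(switches[:2])
```

Note that with every switch left at the default priority, the election falls through to the MAC address, which usually means the oldest switch wins rather than the most capable one.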

It is therefore advisable, in a large or complex network, to configure a lower priority on a robust core switch so that the switch acting as root can handle the workload and sits in a central location.

Following the initial election, the root bridge multicasts BPDUs every two seconds. Each BPDU contains the bridge ID of the root bridge, and as long as that ID is lower than the ID of the switch receiving the BPDU, no election is forced. If a BPDU arrives advertising a lower bridge ID than that of the current root, an election takes place and the STP topology reconverges.

Root ports

After the root bridge is elected, each nonroot switch determines its best path back to the root. The port on the path with the lowest cumulative cost becomes the root port and is placed into a forwarding state; the switch's other, redundant paths to the root are placed in a blocking state. The cumulative cost is carried in the BPDU, along with the root bridge ID and the sending switch's bridge ID. A link's cost is inversely related to its bandwidth, so the lowest-cost path is the fastest path back to the root. Costs are:

10 Gb:  Cost 2
1 Gb:  Cost 4
100 Mb:  Cost 19
10 Mb:  Cost 100

BPDUs sent by the root advertise a root path cost of 0. Each switch that receives a BPDU adds the cost of the interface on which it was received to that value and forwards the information to other switches with the new cost. Each switch in turn performs the same action to determine the most efficient path to the root.
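As a rough worked example of that accumulation, the sketch below sums the costs from the table above along two candidate paths. The PORT_COST table and the root_path_cost function are names invented for this illustration, and the two paths are hypothetical.

```python
# Link speed in Mb/s -> STP cost, from the table above.
PORT_COST = {10_000: 2, 1_000: 4, 100: 19, 10: 100}


def root_path_cost(link_speeds_mbps):
    """Add up the receiving-port cost of every link between the root and this switch."""
    cost = 0  # the root itself advertises a cost of 0
    for speed in link_speeds_mbps:
        cost += PORT_COST[speed]
    return cost


# Path A: two gigabit hops back to the root.
path_a = root_path_cost([1_000, 1_000])

# Path B: one direct 100 Mb link to the root.
path_b = root_path_cost([100])

# The lower cumulative cost wins, regardless of hop count.
best = "A" if path_a < path_b else "B"
```

Here path A costs 8 and path B costs 19, so the two-hop gigabit path is preferred over the direct 100 Mb link.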

When equal-cost paths are available, the switch first compares the bridge IDs of the neighboring switches that sent the BPDUs and selects the port facing the lowest ID as the root port. If both links go to the same switch, it uses the lowest port priority, which is 128 by default but can be set administratively to prefer one link over another. If the port priorities are equal, the lowest port number determines the root port. The port priority and port number compared here are the ones the neighboring switch advertises in its BPDUs.
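The tie-breaking order can be expressed as a single tuple comparison. The sketch below assumes the port priority and port number being compared are those advertised by the upstream switch in its BPDU; the Candidate class, the pick_root_port helper, and the interface names are hypothetical.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    local_port: str            # hypothetical local interface name
    root_path_cost: int        # cumulative cost to the root via this port
    sender_bridge_id: tuple    # (priority, MAC as an integer) of the neighbor
    sender_port_priority: int  # 128 by default
    sender_port_number: int


def pick_root_port(candidates):
    """Lowest cost wins; ties fall through to bridge ID, port priority, port number."""
    best = min(
        candidates,
        key=lambda c: (
            c.root_path_cost,
            c.sender_bridge_id,
            c.sender_port_priority,
            c.sender_port_number,
        ),
    )
    return best.local_port


neighbor = (32768, 0x001A2B3C4D5E)

# Two equal-cost links to the same neighbor, default priorities:
# the lower advertised port number decides.
default_links = [
    Candidate("Gi0/1", 4, neighbor, 128, 2),
    Candidate("Gi0/2", 4, neighbor, 128, 1),
]
chosen_default = pick_root_port(default_links)

# Lowering the priority on the first link overrides the port number.
tuned_links = [
    Candidate("Gi0/1", 4, neighbor, 64, 2),
    Candidate("Gi0/2", 4, neighbor, 128, 1),
]
chosen_tuned = pick_root_port(tuned_links)
```

With default priorities the link seen on Gi0/2 is chosen, and after the priority change Gi0/1 takes over.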

Designated Ports

Designated ports connect to another switch and give that link its path back to the root bridge. They are the "input" to the "output" of the root port: on each link, the designated port is the one advertising the lowest path cost to the root, selected by the same path cost criteria as the root port.

Blocked Ports

When a switch has two paths back to the root bridge, STP blocks client traffic on the port with the higher path cost to the root. The port still listens for BPDUs so that the switch can react to topology changes and notifications, but it does not forward any data.

Port State Transitions

STP ports pass through several states when transitioning from blocking to forwarding. Those states are:

Blocking - no client data is sent, but the port still listens for BPDUs
Listening - no client data is sent, but the port listens for topology updates
Learning - no client data is sent, but the port learns MAC addresses
Forwarding - the port passes data and continues listening for BPDUs

The timers associated with those states:

Blocking - 20 seconds (max age timeout, or 10 BPDUs missed)
Listening - 15 seconds
Learning - 15 seconds

When a link goes down, the switch that detects the failure sends a topology change notification BPDU (TCN BPDU) toward the root. This is one of the few times a BPDU doesn't originate from the root bridge. The root eventually receives the TCN and notifies the switches in its topology to age out MAC addresses more quickly, shortening the aging time from its default of 300 seconds to the forward delay (15 seconds by default). The switches then rebuild the topology by assigning ports to designated, root, or blocking roles based on the updated condition of the network.

Because of these default timers, a network could take up to 50 seconds (20 + 15 + 15) to converge and start sending data again. To modify the default timers, only the root bridge needs to be adjusted; the timer changes propagate through the topology.
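The 50-second figure is simply the sum of the three timers listed earlier. A small sketch of that arithmetic follows; the DEFAULT_TIMERS table and worst_case_convergence function are names invented for the example, and the "tuned" values are purely illustrative, not recommendations.

```python
# Seconds a port can spend in each state before forwarding, using the defaults above.
DEFAULT_TIMERS = {"blocking": 20, "listening": 15, "learning": 15}


def worst_case_convergence(timers=DEFAULT_TIMERS):
    """Total seconds from losing BPDUs to forwarding traffic again."""
    return sum(timers.values())


default_total = worst_case_convergence()

# Hypothetical shorter timers, only to show how the total scales.
tuned_total = worst_case_convergence(
    {"blocking": 10, "listening": 7, "learning": 7}
)
```

With the defaults this gives 50 seconds, and the illustrative shorter timers give 24.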