Friday, July 13, 2012

Spanning Tree and Portfast


So, I have a lot of buffer full / output drops on ALL of my switches.  Spent some time with TAC yesterday troubleshooting, and it seems that I am having a ton of broadcasts, as a result of Spanning Tree topology changes.

sh span detail | in is executing|from|topology changes
 VLAN#### is executing the ieee compatible Spanning Tree protocol
  Number of topology changes 168945 last change occurred 00:00:27 ago
          from GigabitEthernet1/0/11

The TAC engineer narrowed it down to one port, which I chased down yesterday to a big Canon network copier.

Why would a Canon copier be causing Spanning Tree topology changes?

The port was in trunk mode, as all of our ports are in trunk mode to accommodate our data and (non-Cisco) VoIP VLANs. Since the copier doesn't talk on the phone much, I put its port into access mode and the Spanning Tree topology changes stopped.
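
A minimal sketch of that kind of change, for reference. The interface is an assumption (it is the port named in the output above, which may or may not be the copier's), and VLAN 10 is a made-up data VLAN number.

```
! Sketch only: the interface and VLAN 10 are assumptions
configure terminal
 interface GigabitEthernet1/0/11
  ! stop trunking and carry only the data VLAN
  switchport mode access
  switchport access vlan 10
 end
```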

For a bit.

Then they started again. This time from a different port. Then another. This prompted some reading, as my TAC tech is out for the weekend. I found this from Cisco:

“As soon as a bridge detects a change in the topology of the network (a link that goes down or goes to forwarding), it advertises the event to the whole bridged network.”

So, it wasn’t just the copier. Anytime a link goes up or down, a TCN is sent out.

“The more hosts are in the network, the higher are the probabilities of getting a topology change. For instance, a directly attached host triggers a topology change when it is power cycled. In very large (and flat) networks, a point can be reached where the network is perpetually in a topology change status.”

This is us.  Perpetual topology change, perpetual TCN back to the root bridge, and then root broadcast of change notification.

So what is the fix?

"Avoid TCN Generation with the portfast Command


The portfast feature is a Cisco proprietary change in the STP implementation. The command is applied to specific ports and has two effects: 


Ports that come up are put directly in the forwarding STP mode, instead of going through the learning and listening process. The STP still runs on ports with portfast. 

The switch never generates a TCN when a port configured for portfast goes up or down. 


Enable portfast on ports where the connected hosts are very likely to bring their link up and down (typically end stations that users frequently power cycle). This feature should not be necessary for server ports. It should definitely be avoided on ports that lead to hubs or other bridges. A port that directly transitions to forwarding state on a redundant link can cause temporary bridging loops.


 Topology changes can be useful, so do not enable portfast on a port for which a link that goes up or down is a significant event for the network."
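
On a plain access port, the advice quoted above comes down to one interface command. A hedged sketch, where GigabitEthernet1/0/5 is a hypothetical end-station port and not one from my network:

```
! Sketch: enable portfast on a single end-station access port
! (GigabitEthernet1/0/5 is a hypothetical port)
configure terminal
 interface GigabitEthernet1/0/5
  switchport mode access
  spanning-tree portfast
 end
! confirm the setting on that port
show spanning-tree interface GigabitEthernet1/0/5 portfast
```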

OK, I got it. I know about portfast. But most of my ports are trunks. So, I have to use portfast trunk. And I don't want to enable portfast trunk unless I have some BPDU protection, as we do have techs and users plugging switches in from time to time, some temporary, some permanent. So, portfast somewhat nerfs STP, right? I don't want a user seeing a network cable lying on the table, and plugging it back into the switch it's already plugged into and causing a loop. It's happened a couple of times already.
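
One way to pair portfast trunk with BPDU protection is BPDU guard, which err-disables the port as soon as a BPDU arrives on it. A sketch under assumptions: GigabitEthernet1/0/10 is a hypothetical port, and the 300-second recovery interval is just an example value.

```
! Sketch only: port number and recovery interval are assumptions
configure terminal
 interface GigabitEthernet1/0/10
  ! needed on platforms that also support ISL, such as the 3750;
  ! a 2960 does not accept this line
  switchport trunk encapsulation dot1q
  switchport mode trunk
  spanning-tree portfast trunk
  ! err-disable the port if any BPDU is received on it
  spanning-tree bpduguard enable
 exit
 ! optional: re-enable err-disabled ports automatically
 errdisable recovery cause bpduguard
 errdisable recovery interval 300
end
```

The trade-off is that BPDU guard also shuts down the port when a tech plugs in a switch on purpose, so ports meant for permanent switches would need to be left out.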

So, does Portfast really keep STP from doing its job? I have a 3750x on my desk, and on it I have two ports configured as trunks in two vlans, and have added portfast trunk to each port. I have plugged a 2960 into port 1/0/48 on the 3750x. I can ping the 2960's management address, and the port is happy.




Take a look at Spanning Tree as we sit. The 2960 is the root, as it has a lower MAC address (presumably both bridge priorities are still at the default, so the MAC address breaks the tie). Fine for our purposes. We just want to see what happens.



Turned on Spanning Tree Event Debugging. (Don't do this in production)
#debug spanning-tree events
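
A short sketch of the full debug cycle, including turning it back off. The `terminal monitor` line only matters on a Telnet/SSH session; on the console, debug output shows up by default.

```
! on a Telnet/SSH session, debug output is only shown after this
terminal monitor
debug spanning-tree events
! ... run the test ...
! turn this one debug off again
no debug spanning-tree events
! or stop every running debug at once
undebug all
```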

Now, let's plug 1/0/47 into the same switch.



You can see the port immediately jumps to forwarding, but then two-tenths of a second later, STP blocks the port. Eight-tenths later, the port goes back to up? So is it up, or blocked? The light is green...  Turns out it is both.

#show ip interface brief shows us that the port is up, but #show spanning-tree shows us that it is indeed blocked.
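
The two views can be compared side by side with commands along these lines. This assumes 1/0/47 is the port that ended up blocked, as in the test above.

```
! link state only: the port shows up/up
show ip interface brief | include 1/0/47
! STP role and state of that port, per VLAN
show spanning-tree interface GigabitEthernet1/0/47
! list only the ports STP is currently blocking
show spanning-tree blockedports
```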





So, Spanning Tree does its job, even though portfast is enabled. Why then, all the concern about having portfast enabled on a port that a switch can be plugged into? Two-tenths of a second seems pretty fast to me.

Just for giggles, I am going to plug both ends of a cable into the 2960 to see if I can really get a loop going. These ports have the default configuration on them (or lack thereof). So, no trunk, or portfast trunk, or even portfast.



Nope. No loop. But the difference is, the port was never forwarding. It went from listening to blocking, and stayed that way. So, even though it took a second and a half to block, it didn't have a chance to make a loop. Incidentally, there were no STP events triggered on the 3750 when this "loop" was made.

What do you think? Is two-tenths of a second really enough time to start an irreversible switching loop?  I am certainly not going to run out and throw portfast on all of my access ports, but this was a fun experiment nonetheless.



Monday, July 9, 2012

CCNA!

I passed, with a 902! And, yes, I'm allowed to post my score according to the Cisco posting guidelines. Read them here.

Feels great! Had a case of the Mondays and a little panic when I woke up and my mind was blank. Could barely remember my name. But, when I sat down, it all came back. Went very well, and I feel awesome.

Vacation is short, though. My CCNP ROUTE class starts tomorrow at 8 am. 

On to the next one!