Thursday, October 27, 2016

Unicast Flooding

CCIE - General Network Challenges

Section 1.1.c (i)


Unicast flooding is one of those network topics that is easy to overlook because its operation is so simple and common. Understanding the unicast flood process and its causes can lead to a more efficient network, while a lack of understanding can result in a noticeably degraded one.

What are Unicast Floods? To begin, here are a couple of definitions:

  •  Unicast - a transmission intended for a single destination. It can be an L2 (destination host MAC address) or L3 (destination host IP address) concept.
  •  Unicast Flood - a Layer 2 concept. The undesirable behavior of a switch treating a unicast frame like a broadcast frame, flooding it out all switchports except the one it was received on.

When a switch receives a unicast frame destined for an unknown host (more specifically, when the frame's destination MAC address is not stored in the CAM table), the switch floods the frame out all ports that make up the broadcast domain, save the originating port. The hope is that the destination host will receive the frame and respond with a frame of its own, whose source MAC address the switch can use to populate its CAM table.
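The learn-then-forward behavior described above can be sketched as a toy model. This is only an illustration, not how any real switch is implemented: the `LearningSwitch` class, its `receive` method, the port numbers, and the shortened MAC addresses are all made up for the example.

```python
class LearningSwitch:
    """Toy model of a Layer 2 switch with a CAM (MAC address) table."""

    def __init__(self, ports):
        self.ports = list(ports)
        self.cam = {}  # MAC address -> port it was last seen on

    def receive(self, in_port, src_mac, dst_mac):
        """Learn the source MAC, then return the list of egress ports."""
        self.cam[src_mac] = in_port
        out_port = self.cam.get(dst_mac)
        if out_port is None:
            # Unknown unicast: flood out every port except the ingress port.
            return [p for p in self.ports if p != in_port]
        if out_port == in_port:
            return []  # destination sits on the ingress port; nothing to send
        return [out_port]


switch = LearningSwitch(ports=[1, 2, 3, 4])
first = switch.receive(1, "aa:aa", "bb:bb")   # bb:bb unknown, so flooded
reply = switch.receive(3, "bb:bb", "aa:aa")   # aa:aa was learned on port 1
second = switch.receive(1, "aa:aa", "bb:bb")  # bb:bb is now known on port 3
print(first, reply, second)
```

Only the first frame is flooded. Once the reply has been seen, the switch forwards the rest of the conversation out a single port.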

Unicast flooding is normal but undesirable behavior. As mentioned above, it is the result of a lack of information. Often it takes a flood of only one or two frames for a switch to discover the information required to forward the remaining frames more efficiently. Unicast flooding can really be thought of as a host or endpoint discovery technique.

The following events, if not compensated for, can cause a unicast flood:


CAM table is full

Probably the simplest cause and the easiest to detect, but also the most impactful. If the CAM table of a switch is at capacity, the switch is unable to learn any new MAC addresses and will flood every unicast frame whose destination is not already in the table. This has a quick and dramatic impact on network resources, with interface utilization spiking heavily.
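A full table can be illustrated with a small sketch. The `BoundedCam` class and its capacity of two entries are hypothetical, chosen only to keep the example short. Real CAM tables hold thousands of entries, and how a given platform behaves at capacity may differ from this model.

```python
class BoundedCam:
    """Toy CAM table that refuses new entries once it is full."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = {}  # MAC address -> port

    def learn(self, mac, port):
        """Return True if the MAC was stored, False if the table is full."""
        if mac in self.entries or len(self.entries) < self.capacity:
            self.entries[mac] = port
            return True
        return False

    def lookup(self, mac):
        """Return the egress port, or None when the frame must be flooded."""
        return self.entries.get(mac)


cam = BoundedCam(capacity=2)
hosts = [("aa:aa", 1), ("bb:bb", 2), ("cc:cc", 3)]
learned = [cam.learn(mac, port) for mac, port in hosts]
must_flood = cam.lookup("cc:cc") is None
print(learned, must_flood)
```

The third host is never learned, so every frame addressed to it is flooded for as long as the table stays full.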

Asymmetric forwarding paths at Layer 2

This is commonly the result of a first hop redundancy protocol (such as HSRP) running between two Layer 3 switches that act as the default gateway for host networks. If both switches present equally valid Layer 3 routes, the upstream network will try to leverage all paths. STP, however, leaves only a single Layer 2 path to any one host, and because hosts send their outbound traffic to the active gateway, the standby switch rarely sees frames sourced from them. The result is that traffic inbound on the HSRP standby switch may have a valid host ARP entry but no associated CAM entry. The switch is forced to flood every such frame until the ARP entry times out, at which point the standby switch must ARP for the destination. That ARP exchange populates not only the ARP cache but also the CAM table.

ARP entry, but no CAM entry

By default, ARP timers are longer than CAM aging timers:

== Cisco Default Timers ==
             Minutes          Seconds
CAM Table    5                300
ARP Cache    240 (4 hours)    14,400

If any device needs to forward a frame for which it has an ARP entry (that is, an IP-to-MAC correlation) but no corresponding MAC entry in the CAM table, the forwarding device will be forced to flood the unicast frame. This becomes most apparent with asymmetric forwarding paths, as discussed above. The solution is to make the ARP and CAM timers match, in one of the following ways:
  • Changing the CAM timeout to 4 hours/14,400 seconds (recommended for stable networks with a large number of ARP entries)
  • Changing the ARP timeout to 5 minutes/300 seconds or less (more ARP requests will be generated; possibly the better method in smaller networks)
  • Matching both ARP and CAM to some arbitrary value (because, sure, why not)
More modern solutions involve the implementation of technologies such as VSS or vPC to create intelligent, logical blocks in the forwarding path.
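The effect of the timer mismatch is simple arithmetic, sketched below. The `flood_window` helper is hypothetical, and it assumes the worst case: both entries are created at the same moment and the host then sends nothing that would refresh the CAM entry.

```python
CAM_AGING_SECONDS = 300       # 5 minutes
ARP_TIMEOUT_SECONDS = 14_400  # 4 hours


def flood_window(arp_timeout, cam_aging):
    """Seconds during which an ARP entry exists without a CAM entry."""
    return max(0, arp_timeout - cam_aging)


default_window = flood_window(ARP_TIMEOUT_SECONDS, CAM_AGING_SECONDS)
matched_window = flood_window(ARP_TIMEOUT_SECONDS, ARP_TIMEOUT_SECONDS)
print(default_window, default_window // 60, matched_window)
```

With the default timers the window is 14,100 seconds (235 minutes). Matching the timers, in either direction, closes it.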

STP topology changes

STP topology changes trigger a TCN (Topology Change Notification) BPDU. These TCNs affect the CAM entries on the respective interfaces. In the case of traditional 802.1D STP (or PVST+), receiving a TCN on an interface causes all associated CAM entries to change their aging timers to 15 seconds. 802.1w RSTP (or Rapid PVST+) and 802.1s (MST) flush the CAM entries immediately, bypassing the 15-second rule. Either way, the result is unknown unicast flooding until the CAM table is rebuilt. In a stable network this behavior is acceptable, as there should be few TCN BPDUs. PortFast can be configured on interfaces that connect to end hosts rather than to other switches, so that link changes on those ports do not generate TCNs.
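The difference between the two reactions can be sketched with a toy function. The `entries_after_tcn` helper, its mode names, and the sample entry ages are all hypothetical. The model also simplifies heavily: it treats the RSTP/MST flush as clearing every entry at once and ignores which ports the TCN actually applies to.

```python
SHORT_AGING_SECONDS = 15  # shortened aging time used by 802.1D after a TCN


def entries_after_tcn(cam_ages, mode):
    """Return the MACs still in the CAM table once the TCN is handled.

    cam_ages maps MAC address -> seconds since the entry was last refreshed.
    """
    if mode == "802.1D":
        # Aging time drops to 15 s, so anything idle that long ages out.
        return {mac for mac, age in cam_ages.items()
                if age < SHORT_AGING_SECONDS}
    if mode == "RSTP":
        # Simplification: the flush removes every entry immediately.
        return set()
    raise ValueError(f"unknown mode: {mode}")


cam_ages = {"aa:aa": 2, "bb:bb": 40, "cc:cc": 200}
legacy = entries_after_tcn(cam_ages, "802.1D")
rapid = entries_after_tcn(cam_ages, "RSTP")
print(sorted(legacy), sorted(rapid))
```

In this sketch only the recently active host survives under 802.1D, and nothing survives under RSTP. Frames to every removed MAC are flooded until that host is heard from again.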
