CEF Load Balancing

Posted by Bradley | BGP | Tuesday 7 July 2009 11:11

Looking into CEF there are 3 main methods of load balancing per destination, per packet, and per port.

Per Destination – The original algorithm creates a 4 bit hash of the source and destination IP address and load balances based on this 16 value hash, an issue with this is that every router in the routing domain uses the same algorithm and this can cause something called CEF polarisation.

CEF polarisation occurs when traffic uses per destination load balancing and the same algorithm is used throughout the network which causes traffic to not be load balanced after the first distribution. In the example below if 100Mbs of traffic was coming into R1, it would be load balanced 50/50, with 50Mbs to R2 and 50Mbs to R3, but as R2 & R3 will use the same algorithm to determine which path the traffic will take, but as the algorithim is idential it will be a 100/0 split, with 50Mbs going to R4 and R7 and no data going to R5 or R6.

       R4-
     /
   R2
  /  \
 /    R5-
R1
 \     R6-
  \   /
    R3
      \
       R7-

To counter this issue a newer algorithm called the universal algorithm was developed where a 32 bit value is added to the algorithm, this value can be manually set but defaults to the highest loopback IP on the router. Per destination load balancing with the universal algorithm is the current default method of load balancing.

If there are a number of tunnels such as L2TP, GRE, MPLS etc operating through the router this could also cause route polarisation due to the low number of sessions, as such the tunnel algorithm was developed to solve this issue. It appears to be a pretty undocumented feature, so any more information on the algorithm would be appreciated.

The algorithm can be changed as required with the following command;

Router(config)#ip cef load-sharing algorithm ?
  include-ports  Algorithm that includes Layer 4 ports
  original       Original algorithm
  tunnel         Algorithm for use in tunnel only environments
  universal      Algorithm for use in most environments

Per Packet – In this mode packets are load shared in a round robin way, it can result in increased jitter due to multiple paths and as such generally not advisable.

Per Port – This is suitable for heavily NATed networks with a low number of hosts, and utilises a hashing function based on the layer 4 port numbers. As NATed hosts have a distributed range of source port numbers this allows for efficient load balancing in such situations.

Note:  Some more configuration examples to follower later

BGP Route Reflectors

Posted by Bradley | BGP | Friday 29 August 2008 02:32

BGP Route Reflectors (RRs) defined in RFC 1966 reduces the need for iBGP peers to be in a full mesh, by relaxing the standard rule where iBGP routes are not sent to iBGP peers.

A designated Route Reflector router treats all neighbour routers as one of 3 types, clients, non-clients and eBGP. The relaxed rules for Route Reflectors are;

  • Updates received from a client are sent to all clients & non-clients
  • Updates received from a non-client are sent only to clients
  • Updates received from eBGP peers are sent to all clients & non-clients

A Route Reflector and its clients combine to form a Route Reflector cluster, a BGP design could have multiple clusters or a cluster with multiple Route Reflectors. If muliple clusters exist then one Route Reflector per cluster must be peered directly with a Route Reflector in another cluster, although usually the Route Reflectors from all clusters peer directly.

As one of the loop prevention rules of BGP has been removed (never send iBGP learned routes to iBGP peers) it is possible that loops could form. To prevent this BGP route reflectors only reflect the best routes, also there are 2 optional non-transitive BGP attributes which further prevent looping these are;

ORIGINATOR_ID, this is a 4 byte attribute which contains the Router ID of the first iBGP peer which originated the route into the AS. Route Reflectors never send an update to the router specified in the ORIGINATOR_ID

CLUSTER_LIST, is list of the cluster IDs which the update has been passed through, each Route Reflector will add its own cluster ID to the list before sending an update. If a Route Reflector sees its own cluster ID in the list it ignores the update.