Friday, May 1, 2015

BGP Wedgies - Demystified

When we configure BGP, we usually expect the network to converge to the same state once all the peerings come up. Under rare circumstances this is not the case. In this blogtorial, we will explore one such corner case in which, depending on the order of events, the network can settle into a stable but unintended state, also known as a BGP Wedgie.

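Here is the topology: R1, R2 and R3 are in AS 1 and R4 is in AS 4. R1 and R3 each have an eBGP session to R4, and R2 is the route reflector for R1 and R3.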



As usual let's go ahead and get the routers configured and give this topology a wedgie.


R1 interfaces and BGP config.

 R1#sh run int gig1.12  
 interface GigabitEthernet1.12  
  encapsulation dot1Q 12  
  ip address 12.12.12.1 255.255.255.0  
 end  
!
 R1#sh run | sec bgp  
 router bgp 1  
  bgp log-neighbor-changes  
  timers bgp 1 3  
  redistribute connected  
  neighbor 12.12.12.2 remote-as 1  
  neighbor 14.14.14.4 remote-as 4  
!
 R1#sh run int gig1.14  
 interface GigabitEthernet1.14  
  encapsulation dot1Q 14  
  ip address 14.14.14.1 255.255.255.0  
 end  

R2 interfaces and BGP config. Notice that we are setting the BGP Local Preference to 150 (default is 100) on any incoming routes from R3 to R2. 

Local Preference is carried in updates between iBGP neighbors (it is not sent to eBGP peers), and the route with the highest Local Preference is preferred during best path selection.

 R2#sh run int gig1.12  
 !  
 interface GigabitEthernet1.12  
  encapsulation dot1Q 12  
  ip address 12.12.12.2 255.255.255.0  
 end  
 R2#sh run int gig1.23  
 !  
 interface GigabitEthernet1.23  
  encapsulation dot1Q 23  
  ip address 23.23.23.2 255.255.255.0  
 end  
 R2#sh run | sec bgp  
 router bgp 1  
  bgp log-neighbor-changes  
  timers bgp 1 3  
  redistribute connected  
  neighbor 12.12.12.1 remote-as 1  
  neighbor 12.12.12.1 route-reflector-client  
  neighbor 23.23.23.3 remote-as 1  
  neighbor 23.23.23.3 route-reflector-client  
  neighbor 23.23.23.3 route-map set-lp-high-from-r3 in  !!-Set local-preference high-!!
!
 R2#show route-map set-lp-high-from-r3  
 route-map set-lp-high-from-r3, permit, sequence 10  
  Match clauses:  
  Set clauses:  
   local-preference 150  
  Policy routing matches: 0 packets, 0 bytes  

R3 interface and BGP config. 

 R3#sh run int gig1.23  
!  
 interface GigabitEthernet1.23  
  encapsulation dot1Q 23  
  ip address 23.23.23.3 255.255.255.0  
 end  
 R3#sh run int gig1.34  
!  
 interface GigabitEthernet1.34  
  encapsulation dot1Q 34  
  ip address 34.34.34.3 255.255.255.0  
 end  
 R3#sh run | sec bgp  
 router bgp 1  
  bgp log-neighbor-changes  
  timers bgp 1 3  
  redistribute connected  
  neighbor 23.23.23.2 remote-as 1  
  neighbor 34.34.34.4 remote-as 4  

R4 interface and BGP config. 


 R4#sh run int gig1.34  
!  
 interface GigabitEthernet1.34  
  encapsulation dot1Q 34  
  ip address 34.34.34.4 255.255.255.0  
 end  
 R4#sh run int gig1.14  
!  
 interface GigabitEthernet1.14  
  encapsulation dot1Q 14  
  ip address 14.14.14.4 255.255.255.0  
 end  
 R4#sh run | sec bgp  
 router bgp 4  
  bgp log-neighbor-changes  
  network 4.4.4.4 mask 255.255.255.255  
  neighbor 14.14.14.1 remote-as 1  
  neighbor 34.34.34.3 remote-as 1  
  neighbor 34.34.34.3 route-map as-prepend out !!-Set as-prepend out to R3-!!
!
 R4#sh route-map as-prepend   
 route-map as-prepend, permit, sequence 10  
  Match clauses:  
  Set clauses:  
   as-path prepend 4 4  
  Policy routing matches: 0 packets, 0 bytes  
 R4#  
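
To make the effect of the as-prepend route-map concrete, here is a minimal Python sketch of how the AS path that R4 sends to each neighbor is built. The helper name advertise_ebgp is hypothetical and only models the path list, not a real BGP implementation.

```python
# Hypothetical helper: models how R4's outbound route-map and the eBGP
# advertisement together produce the AS path that a neighbor receives.
def advertise_ebgp(as_path, local_as, prepend=()):
    """Return the AS path a neighbor in another AS would receive."""
    # The route-map prepends its AS numbers, and the router adds its own
    # AS number when the update leaves over the eBGP session.
    return [local_as, *prepend, *as_path]

# R4 originates 4.4.4.4/32, so the path starts out empty.
to_r1 = advertise_ebgp([], local_as=4)                  # no route-map toward R1
to_r3 = advertise_ebgp([], local_as=4, prepend=(4, 4))  # "as-path prepend 4 4"

print(to_r1)  # [4]
print(to_r3)  # [4, 4, 4]
```

So R1 should see the path "4" and R3 should see the path "4 4 4" for the same prefix.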

In this scenario, R3 should be the exit point for 4.4.4.4/32 and R1 should be the backup exit point, so let's run a traceroute from R1 to verify.

 R1#traceroute 4.4.4.4  
 Type escape sequence to abort.  
 Tracing the route to 4.4.4.4  
 VRF info: (vrf in name/id, vrf out name/id)  
  1 12.12.12.2 4 msec 4 msec 4 msec  
  2 23.23.23.3 5 msec 5 msec 5 msec  
  3 34.34.34.4 3 msec * 4 msec  
 R1#  

 R1#sh ip bgp   
    Network     Next Hop      Metric LocPrf Weight Path  
  *>i 4.4.4.4/32    34.34.34.4        0  150   0    4 4 4 i  
  *                 14.14.14.4        0        0    4 i  

As you can see, from R1's point of view that is indeed the case.

Why is R1 going towards R3? 

Before R2 reflects the route to R1, it sets the BGP Local Preference to 150 on any route received from R3. R1 therefore prefers the iBGP route learned via R2 over the eBGP route learned directly from R4, even though the eBGP route has the shorter AS path.

Remember that the BGP best path selection order is Weight, Local Preference, locally originated routes, AS path length, and so on.
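
The first few steps of that comparison can be sketched in Python. This is a simplified illustration, not the router's actual algorithm: the Path class and the function names are made up for this example, and only the four steps listed above are modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Path:
    """One candidate path for a prefix (hypothetical model)."""
    next_hop: str
    as_path: tuple
    local_pref: int = 100        # default Local Preference
    weight: int = 0
    locally_originated: bool = False


def preference_key(p):
    # Lower tuple wins: higher weight, then higher Local Preference,
    # then locally originated, then shorter AS path.
    return (-p.weight, -p.local_pref, not p.locally_originated, len(p.as_path))


def best_path(paths):
    return min(paths, key=preference_key)


# The two paths R1 holds for 4.4.4.4/32 in the output above.
via_r3 = Path("34.34.34.4", (4, 4, 4), local_pref=150)
via_r4 = Path("14.14.14.4", (4,))

print(best_path([via_r3, via_r4]).next_hop)  # 34.34.34.4
```

Local Preference is compared before AS path length, so the longer prepended path wins on R1 as long as it carries the higher Local Preference.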

Let's create the BGP Wedgie

Let's flap the link between R3 and R4 and observe the traceroute again from R1.

 R3#conf t  
 Enter configuration commands, one per line. End with CNTL/Z.  
 R3(config)#int gig1.34  
 R3(config-subif)#shut   
 %BGP-5-NBR_RESET: Neighbor 34.34.34.4 reset (Interface flap)  
 %BGP-5-ADJCHANGE: neighbor 34.34.34.4 Down Interface flap  
 %BGP_SESSION-5-ADJCHANGE: neighbor 34.34.34.4 IPv4 Unicast topology base removed from session Interface flap  
 R3(config-subif)#no shut   
 R3#  
 %SYS-5-CONFIG_I: Configured from console by console  
 R3#  
 %BGP-5-ADJCHANGE: neighbor 34.34.34.4 Up   

 R1#traceroute 4.4.4.4  
 Type escape sequence to abort.  
 Tracing the route to 4.4.4.4  
 VRF info: (vrf in name/id, vrf out name/id)  
  1 14.14.14.4 4 msec * 3 msec  
 R1#  

Why is R1 now going straight to R4? 

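R1 is now going straight to R4 because the R3-R4 link flapped and the previous route to 4.4.4.4/32 via R3 was withdrawn. R1 then selected its eBGP route via R4 as best and advertised it to R2, which reflected it to R3. Once the BGP session between R3 and R4 re-established after the flap, R3 had two routes in its BGP table: one from R1 (reflected by R2) and one from R4. Both have the same Local Preference on R3, because the route-map that sets 150 is applied inbound on R2, not on R3. The decision therefore falls to AS path length, and R3 picks the route via R1 because the route from R4 carries the prepended path. Since R3's best path is now an iBGP-learned route, R3 does not advertise the route learned from R4 to R2, so R2 never gets the chance to raise its Local Preference to 150 and the network stays on the backup path.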

 R3#sh ip bgp   
    Network     Next Hop      Metric LocPrf Weight Path  
  *  4.4.4.4/32    34.34.34.4    0        0     0   4 4 4 i  
  *>i              14.14.14.4    0        100   0   4 i  

Under normal conditions, a BGP speaker only advertises its best path to its peers. This behavior can be changed to some extent, for example with BGP Best External.
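
The order dependence can be reproduced with a toy Python model of this lab. It is a rough sketch under several assumptions: a path is reduced to a tuple (local_pref, as_path_length, exit_router), only Local Preference and AS path length are compared, and the function names are invented for this example.

```python
def better(a, b):
    """Prefer higher Local Preference, then shorter AS path."""
    if a is None or b is None:
        return a or b
    return min(a, b, key=lambda p: (-p[0], p[1]))


def converge(best, link14_up, link34_up):
    """Recompute each router's best path until nothing changes."""
    for _ in range(20):
        before = dict(best)
        # R1: eBGP path from R4 versus the path reflected by R2.
        ext = (100, 1, "R1") if link14_up else None
        refl = best["R2"] if best["R2"] and best["R2"][2] != "R1" else None
        best["R1"] = better(ext, refl)
        # R2: a client only sends its best path, and only if it is the
        # client's own eBGP-learned path. Routes from R3 get LP 150.
        from_r1 = best["R1"] if best["R1"] and best["R1"][2] == "R1" else None
        r3 = best["R3"]
        from_r3 = (150, r3[1], "R3") if r3 and r3[2] == "R3" else None
        best["R2"] = better(from_r1, from_r3)
        # R3: prepended eBGP path from R4 versus the path reflected by R2.
        ext = (100, 3, "R3") if link34_up else None
        refl = best["R2"] if best["R2"] and best["R2"][2] != "R3" else None
        best["R3"] = better(ext, refl)
        if best == before:
            return best
    raise RuntimeError("did not converge")


best = {"R1": None, "R2": None, "R3": None}
converge(best, link14_up=False, link34_up=True)   # primary link comes up first
intended = converge(best, True, True)["R1"][2]    # then the backup link
converge(best, True, False)                       # primary link goes down...
wedged = converge(best, True, True)["R1"][2]      # ...and comes back up
converge(best, False, True)                       # backup link goes down...
recovered = converge(best, True, True)["R1"][2]   # ...and comes back up

print(intended, wedged, recovered)  # R3 R1 R3
```

The same configuration with both links up ends in two different stable states, and in this model only a flap of the backup link moves traffic back to the intended exit.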

Conclusion

I believe I have come across this scenario once before: one of our primary links to a client flapped, we started using our backup path to them, and we never failed back to the primary. Only after we flapped the backup link did we fail back to the primary path. When we brought this issue to our client's attention, we were told that it was a bug in their router's BGP implementation, but I strongly suspect that they have a BGP Wedgie ... they just don't know it yet.

If you are interested in reading further, please see RFC 4264.
