Equal-Cost Multi-Path Routing (ECMP) HowTo

About

This document provides example use cases and scenarios that show how Equal-Cost Multi-Path (ECMP) routing can be applied and how it may behave in different situations.

Introduction

ECMP allows routes to be configured so that multiple paths to the same destination exist simultaneously. This means that different flows bound for the same destination can be routed over different paths. This is useful for a number of reasons, for example:

  • Load balancing of traffic flows, bound for the same destination, over different paths.

  • Distribution of frames on multiple links for better bandwidth utilization.

  • Allowing traffic flows to fall back to an alternate path when a directly connected link fails, without waiting for routing protocol updates.

This document provides a few use cases and scenarios that describe how ECMP can be applied and how it functions. They are intended to give a basic understanding of how ECMP is used, so that the same approach can be applied to similar scenarios.

Case 1: Stateless Load Balancing

In this use case, we aim for a router configuration that provides basic load balancing using ECMP, by configuring two different default gateways. Refer to Figure 1 below for an example setup.

                     .--.-.
                    ( (    )__
                   (_,  \ ) ,_)  Internet
                     '-'--`--'
                      |     |
                      |     |
              .-------'     '-------.
              |                     |
              |                     |
         .----+----.           .----+----.
         |         |           |         |
         |  ISP1   |           |  ISP2   |
         |         |           |         |
         '----+----'           '----+----'
          .99 |                     | .99
              |                     |
172.16.1.0/24 '-------.     .-------' 172.16.2.0/24
                      |     |
                   .1 |     | .1
                    .-+-----+-.
                    |         |
                    |   R1    |
                    |         |  GW: 172.16.1.99
                    '----+----'      172.16.2.99
                         |
                         |
                     .--.-.
                    ( (    )__
                   (_,  \ ) ,_)  Lan
                     '-'--`--'

Figure 1: Load balancing over two different ISPs, by utilizing two same-cost default gateways on R1.

On the device R1 we have simply configured two different default routes, with the same cost, each pointing to one of the ISPs as the next hop gateway. The intention is that traffic flows from devices in the Lan will be distributed as much as possible between ISP1 and ISP2.

Note

This could of course be scaled up by having even more next hops towards additional devices. This example uses only two next hops for the sake of simplicity.

Configure

No ECMP-specific configuration is needed on router R1. We simply configure two default routes towards two different next hop addresses:

Note

Remember to NOT configure the routes with different distances.

R1:/#> configure ip
R1:/config/ip/#> route default 172.16.1.99
R1:/config/ip/#> route default 172.16.2.99
R1:/config/ip/#> leave
Applying configuration.
Configuration activated.  Remember "copy run start" to save to flash (NVRAM).
R1:/#>

Now that the routes have been configured, the routing table should look something like the following:

R1:/#> show ip route
S - Static | C - Connected | K - Kernel route  | > - Selected route
O - OSPF   | R - RIP       | [Distance/Metric] | * - FIB route

S>* 0.0.0.0/0 [1/0] via 172.16.1.99, vlan1, weight 1, 00:00:03
  *                 via 172.16.2.99, vlan2, weight 1, 00:00:03
C>* 172.16.1.0/24 is directly connected, vlan1, 00:01:32
C>* 172.16.2.0/24 is directly connected, vlan2, 00:01:32

Notice that the 0.0.0.0/0 (default) destination has two next hop addresses listed for it. Traffic bound for the Internet that enters router R1 from any device in the Lan network should now be given a next hop towards either ISP1 or ISP2, on a per-flow basis.
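
When checking several routers it can help to pick out the multi-path entries from the routing table automatically. The following is a minimal Python sketch that parses output in the format shown above. The function name fib_next_hops is hypothetical, and the parsing relies only on the layout of this example output, so it may need adjusting for other output.

```python
import re

# Output copied from the "show ip route" example above.
SAMPLE = """\
S>* 0.0.0.0/0 [1/0] via 172.16.1.99, vlan1, weight 1, 00:00:03
  *                 via 172.16.2.99, vlan2, weight 1, 00:00:03
C>* 172.16.1.0/24 is directly connected, vlan1, 00:01:32
C>* 172.16.2.0/24 is directly connected, vlan2, 00:01:32
"""

def fib_next_hops(output):
    """Return {prefix: [(next_hop, interface), ...]} for FIB ('*') routes."""
    table = {}
    prefix = None
    for line in output.splitlines():
        # A line that starts with a route code carries the prefix. A
        # continuation line starts with spaces and reuses the last prefix.
        start = re.match(r"\S+\s+(\d+\.\d+\.\d+\.\d+/\d+)", line)
        if start:
            prefix = start.group(1)
        via = re.search(r"via (\d+\.\d+\.\d+\.\d+), (\S+),", line)
        # Only next hops flagged with '*' are installed in the FIB.
        if via and prefix and "*" in line[:3]:
            table.setdefault(prefix, []).append((via.group(1), via.group(2)))
    return table

for prefix, hops in fib_next_hops(SAMPLE).items():
    if len(hops) > 1:
        print(prefix, "is an ECMP route via", hops)
```

A prefix that is reported with more than one next hop is being handled as an ECMP route.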

Note

The effectiveness of the load balancing depends on how the hash mapping is performed. There is no guarantee of an exact 50/50 split across the two ISPs.
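
To illustrate why the split is rarely exact, the following Python sketch maps flows to next hops with a simple hash. It is only an illustration: the hash actually used by the router is implementation-specific and not described here. The CRC32 hash, the 5-tuple key, the Lan addresses in 10.0.0.0/24 and the server address 203.0.113.10 are all assumptions made for the example.

```python
import zlib

# Next hops taken from Figure 1.
NEXT_HOPS = ["172.16.1.99", "172.16.2.99"]

def pick_next_hop(src_ip, dst_ip, proto, src_port, dst_port, next_hops=NEXT_HOPS):
    """Map a flow 5-tuple to one next hop (illustrative hash, not the router's)."""
    key = f"{src_ip}|{dst_ip}|{proto}|{src_port}|{dst_port}".encode()
    return next_hops[zlib.crc32(key) % len(next_hops)]

# 1000 hypothetical flows from Lan hosts towards one Internet server.
counts = {hop: 0 for hop in NEXT_HOPS}
for i in range(1000):
    flow = (f"10.0.0.{i % 250 + 1}", "203.0.113.10", "tcp", 40000 + i, 443)
    counts[pick_next_hop(*flow)] += 1

print(counts)
```

Every packet of a given flow produces the same key and therefore the same next hop, while the per-gateway counts depend entirely on which flows happen to be present.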

Case 2: Load Balancing and Redundant Routes

This use case provides an example of how multiple routers can be configured with ECMP routes to provide load balancing, while also providing several redundant routes. An example of this type of setup is shown in Figure 2 below.

   Network-A                                                           Network-B
   .--.-.                                                               .--.-.
  ( (    )__                                                           ( (    )__
 (_,  \ ) ,_)                                                         (_,  \ ) ,_)
   '-'--`--'                                                           '-'--`--'
      |                  .---------.           .---------.                  |
      |           NET-1  |         |   NET-3   |         |  NET-5           |
172.16.1.0/24    .-------+   R2    +-----------+   R4    +-------.    172.16.2.0/24
      |        .´      .2|         |.2       .4|         |.4      `.        |
      |      .´          '----+----+           +----+----'          `.      |
 .----+----+´                 |     `.       .´     |                 `+----+----.
 |         |.1                |       `.   .´       |                .6|         |
 |   R1    |            NET-9 |         `.´         | NET-10           |   R6    |
 |         |.1                |   NET-7.´ `.NET-8   |                .6|         |
 '---------+.                 |      .´     `.      |                 .+---------'
             `.          .----+----+´         `+----+----.          .´
               `.      .3|         |.3       .5|         |.5      .´
                 `-------+   R3    +-----------+   R5    +-------´
                  NET-2  |         |   NET-4   |         |  NET-6
                         '---------'           '---------'

                Networks ==========================================
                NET-1: 192.168.10.0/24     NET-6:  192.168.60.0/24
                NET-2: 192.168.20.0/24     NET-7:  192.168.70.0/24
                NET-3: 192.168.30.0/24     NET-8:  192.168.80.0/24
                NET-4: 192.168.40.0/24     NET-9:  192.168.90.0/24
                NET-5: 192.168.50.0/24     NET-10: 192.168.100.0/24
                ===================================================

Figure 2: Routers with redundant connections, in order to achieve load balancing and route redundancy for traffic between Network-A and Network-B.
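
The addressing in Figure 2 appears to follow a simple convention: on every NET-n network, router Rk uses the host address .k. The Python sketch below derives the interface addresses from that convention. The NETS and LINKS tables are transcribed from the figure, and the helper name interface_address is hypothetical.

```python
import ipaddress

# Networks from the table in Figure 2.
NETS = {
    "NET-1": "192.168.10.0/24", "NET-2": "192.168.20.0/24",
    "NET-3": "192.168.30.0/24", "NET-4": "192.168.40.0/24",
    "NET-5": "192.168.50.0/24", "NET-6": "192.168.60.0/24",
    "NET-7": "192.168.70.0/24", "NET-8": "192.168.80.0/24",
    "NET-9": "192.168.90.0/24", "NET-10": "192.168.100.0/24",
}

# Router numbers attached to each network, read from the figure.
LINKS = {
    "NET-1": (1, 2), "NET-2": (1, 3), "NET-3": (2, 4), "NET-4": (3, 5),
    "NET-5": (4, 6), "NET-6": (5, 6), "NET-7": (3, 4), "NET-8": (2, 5),
    "NET-9": (2, 3), "NET-10": (4, 5),
}

def interface_address(net, router):
    """Address of router R<router> on the given network (host part = router number)."""
    return ipaddress.ip_network(NETS[net]).network_address + router

for net, (a, b) in LINKS.items():
    print(f"{net}: R{a} = {interface_address(net, a)}, R{b} = {interface_address(net, b)}")
```

These are the addresses that show up as next hops in the static route configuration and in the traceroute output later in this document.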

In the setup above, we want to ensure connectivity between the two networks Network-A and Network-B. On each of the routers, routes are configured with the same distance so that they are of equal cost and ECMP routing can be utilized. Because of the redundant paths, link failures should not impact the traffic sent between Network-A and Network-B.

With this setup many paths will exist at the same time that could be utilized, depending on how the multi-path hash is calculated for each traffic flow. Each of the routers will technically be able to send the traffic on any equal cost route. As an example, the following paths are of the same cost and could all be selected for different flows:

  • Network-A -> R1 -> R2 -> R4 -> R6 -> Network-B
  • Network-A -> R1 -> R2 -> R5 -> R6 -> Network-B
  • Network-A -> R1 -> R3 -> R5 -> R6 -> Network-B
  • Network-A -> R1 -> R3 -> R4 -> R6 -> Network-B
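
The four paths above can be reproduced by enumerating all minimum-hop paths between R1 and R6 in the topology of Figure 2. The following Python sketch does this with a breadth-first search. Treating every link as one hop of equal cost is an assumption made for the illustration, and the function name shortest_paths is hypothetical.

```python
from collections import deque

# Router-to-router links from Figure 2 (NET-1 to NET-10).
LINKS = [
    ("R1", "R2"), ("R1", "R3"), ("R2", "R4"), ("R3", "R5"), ("R4", "R6"),
    ("R5", "R6"), ("R3", "R4"), ("R2", "R5"), ("R2", "R3"), ("R4", "R5"),
]

def shortest_paths(links, src, dst):
    """Return every minimum-hop path from src to dst."""
    neighbours = {}
    for a, b in links:
        neighbours.setdefault(a, []).append(b)
        neighbours.setdefault(b, []).append(a)

    # Breadth-first search gives the hop distance from src to every router.
    dist = {src: 0}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        for peer in neighbours.get(node, []):
            if peer not in dist:
                dist[peer] = dist[node] + 1
                queue.append(peer)

    # Follow only links that lead exactly one hop further away from src.
    paths = []
    def walk(node, path):
        if node == dst:
            paths.append(path)
            return
        for peer in neighbours.get(node, []):
            if dist.get(peer) == dist[node] + 1:
                walk(peer, path + [peer])
    walk(src, [src])
    return paths

for path in shortest_paths(LINKS, "R1", "R6"):
    print(" -> ".join(path))
```

The links between R2 and R3 and between R4 and R5 never appear in the result, since they connect routers that are equally far from R1.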

Note

Any route between R2 and R3, and likewise between R4 and R5, should be configured with a higher distance, so that its cost differs from that of the other routes. We do not want those routes to be part of the ECMP routing, since such a path requires an extra hop. Rather, such a route should only be used if the others are not usable, for example because of a link failure.

In most link failure scenarios, depending on the number of failed links and where they are located, connectivity should still be possible between the two networks. However, depending on where the failure is, some traffic flows may use a path that is not the most efficient. As an example, if the links for NET-3 and NET-8 go down, both of the following paths could be valid:

  • Network-A -> R1 -> R3 -> R5 -> R6 -> Network-B
  • Network-A -> R1 -> R2 -> R3 -> R5 -> R6 -> Network-B

As we can see, in that specific scenario some traffic flows could use a path that has an extra hop. This should mostly be a problem with static routing. A dynamic routing protocol should select the better path once the updated routing information has propagated to each of the routers.
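
The static routing behaviour in this failure scenario can be sketched in Python as follows. The TOWARDS_B table is a simplified model of the static routes towards Network-B used in this document, expressed as neighbouring routers instead of addresses. The function name possible_paths is hypothetical, and the model assumes that a router only stops using a route when its own link to that next hop is down.

```python
# Static next hops towards Network-B as (neighbour, distance).
# R6 is directly connected to Network-B and needs no route.
TOWARDS_B = {
    "R1": [("R2", 1), ("R3", 1)],
    "R2": [("R4", 1), ("R5", 1), ("R3", 10)],
    "R3": [("R5", 1), ("R4", 1), ("R2", 10)],
    "R4": [("R6", 1), ("R5", 10)],
    "R5": [("R6", 1), ("R4", 10)],
}

def possible_paths(failed_links, node="R1", path=()):
    """List every path a flow could take from node to R6."""
    path = path + (node,)
    if node == "R6":
        return [path]
    # Skipping routers that were already visited is a simplification
    # that keeps the sketch free of forwarding loops.
    usable = [(peer, dist) for peer, dist in TOWARDS_B[node]
              if frozenset((node, peer)) not in failed_links and peer not in path]
    if not usable:
        return []  # no usable route: traffic is dropped here
    best = min(dist for _, dist in usable)
    paths = []
    for peer, dist in usable:
        if dist == best:  # only the lowest-distance routes are active
            paths.extend(possible_paths(failed_links, peer, path))
    return paths

# NET-3 (R2-R4) and NET-8 (R2-R5) are down.
failed = {frozenset(("R2", "R4")), frozenset(("R2", "R5"))}
for p in possible_paths(failed):
    print(" -> ".join(p))
```

In this model R1 still hashes flows onto R2, because its own link to R2 is up, so those flows take the backup route over NET-9 and arrive with one hop more than the flows sent directly to R3.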

Warning

Even with this setup, multiple link failures can leave us routing traffic to a device that is unable to forward it further. As an example, if the links for networks NET-3, NET-7 and NET-10 on router R4 are down, router R6 could still forward traffic towards R4, since with static routing R6 cannot know that those links are down. With a dynamic routing protocol this should not be an issue.
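
The situation in this warning can be sketched in Python as follows. The ROUTES_TO_A table lists the static routes towards Network-A on R6 and R4 as used in this document. The function name installed_next_hops is hypothetical, and the sketch assumes that a router only withdraws a static route when the directly attached network of its next hop is down.

```python
# Static routes towards Network-A (172.16.1.0/24) on R6 and R4:
# (network of the next hop, next hop address, distance).
ROUTES_TO_A = {
    "R6": [("NET-5", "192.168.50.4", 1), ("NET-6", "192.168.60.5", 1)],
    "R4": [("NET-3", "192.168.30.2", 1), ("NET-7", "192.168.70.3", 1),
           ("NET-10", "192.168.100.5", 10)],
}

def installed_next_hops(router, failed_nets):
    """Next hops a router still uses, given the networks that are down."""
    usable = [(hop, dist) for net, hop, dist in ROUTES_TO_A[router]
              if net not in failed_nets]
    if not usable:
        return []
    best = min(dist for _, dist in usable)
    return [hop for hop, dist in usable if dist == best]

failed = {"NET-3", "NET-7", "NET-10"}
print("R6 forwards to:", installed_next_hops("R6", failed))
print("R4 forwards to:", installed_next_hops("R4", failed))
```

R6 keeps both next hops, including 192.168.50.4 on R4, while R4 itself has no usable route left. Flows that R6 hashes onto R4 are therefore dropped.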

Configure With Static Routes

This is an example of how the devices can be configured with static routes. It is somewhat more cumbersome than simply configuring either RIP or OSPF, and it becomes more and more difficult the larger the topology becomes. However, since the topology in this example is not particularly extensive, it is still useful as an example.

For this configuration it is assumed that all of the interfaces have already been configured as intended. We need to configure each of the routers with the necessary static routes to be able to reach Network-A at 172.16.1.0/24 and Network-B at 172.16.2.0/24. Since we want to utilize ECMP routing, all the possible next hops for each destination IP address must be considered.

When we set up the routes over the networks NET-9 and NET-10, we need to remember to use a higher distance value. A higher distance value increases the cost of those routes, so that they are not considered equal and do not take part in the ECMP routing. This is because a next hop over either of those two networks always results in an additional hop to reach any destination in Network-A or Network-B. We only want those routes to be considered if the other routes are unavailable, for instance due to link failure. In essence, they should serve as a backup path, and not be part of the regular multi-path distribution of traffic flows that results from the ECMP routing.
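
The effect of the distance value can be sketched in Python as follows. Among the usable routes for a prefix, only those with the lowest distance become active, and if several share that distance they form the ECMP set. The route list is taken from the R2 configuration in this example. The function name active_next_hops and the down parameter, a set of next hops that are currently unreachable, are hypothetical.

```python
def active_next_hops(routes, prefix, down=()):
    """Return the next hops in use for prefix: lowest distance among usable routes."""
    usable = [r for r in routes if r[0] == prefix and r[1] not in down]
    if not usable:
        return []
    best = min(distance for _, _, distance in usable)
    return [hop for _, hop, distance in usable if distance == best]

# Static routes on R2: (prefix, next hop, distance).
R2_ROUTES = [
    ("172.16.1.0/24", "192.168.10.1", 1),
    ("172.16.2.0/24", "192.168.30.4", 1),
    ("172.16.2.0/24", "192.168.80.5", 1),
    ("172.16.1.0/24", "192.168.90.3", 10),   # backup over NET-9
    ("172.16.2.0/24", "192.168.90.3", 10),   # backup over NET-9
]

# All links up: two equal-cost next hops towards Network-B.
print(active_next_hops(R2_ROUTES, "172.16.2.0/24"))

# NET-3 and NET-8 down: only the backup route over NET-9 remains.
print(active_next_hops(R2_ROUTES, "172.16.2.0/24",
                       down={"192.168.30.4", "192.168.80.5"}))
```

If the backup routes were given distance 1 as well, 192.168.90.3 would join the ECMP set and some flows would take the extra hop even with all links up.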

R1 Configuration:

R1:/#> configure
R1:/config/#> ip
R1:/config/ip/#> route 172.16.2.0/24 192.168.10.2
R1:/config/ip/#> route 172.16.2.0/24 192.168.20.3
R1:/config/ip/#> leave
Applying configuration.
Configuration activated.  Remember "copy run start" to save to flash (NVRAM).
R1:/#> show ip route
S - Static | C - Connected | K - Kernel route  | > - Selected route
O - OSPF   | R - RIP       | [Distance/Metric] | * - FIB route

C>* 172.16.1.0/24 is directly connected, vlan7, 02:29:32
S>* 172.16.2.0/24 [1/0] via 192.168.10.2, vlan1, weight 1, 00:29:57
  *                     via 192.168.20.3, vlan2, weight 1, 00:29:57
C>* 192.168.10.0/24 is directly connected, vlan1, 01:02:26
C>* 192.168.20.0/24 is directly connected, vlan2, 00:29:57

R2 Configuration:

R2:/#> configure
R2:/config/#> ip
R2:/config/ip/#> route 172.16.1.0/24 192.168.10.1
R2:/config/ip/#> route 172.16.2.0/24 192.168.30.4
R2:/config/ip/#> route 172.16.2.0/24 192.168.80.5
R2:/config/ip/#> route 172.16.1.0/24 192.168.90.3 10
R2:/config/ip/#> route 172.16.2.0/24 192.168.90.3 10
R2:/config/ip/#> leave
Applying configuration.
Configuration activated.  Remember "copy run start" to save to flash (NVRAM).
R2:/#> show ip route
S - Static | C - Connected | K - Kernel route  | > - Selected route
O - OSPF   | R - RIP       | [Distance/Metric] | * - FIB route

S>* 172.16.1.0/24 [1/0] via 192.168.10.1, vlan2, weight 1, 01:04:50
S   172.16.1.0/24 [10/0] via 192.168.90.3, vlan4, weight 1, 01:28:08
S>* 172.16.2.0/24 [1/0] via 192.168.30.4, vlan1, weight 1, 00:31:51
  *                     via 192.168.80.5, vlan3, weight 1, 00:31:51
S   172.16.2.0/24 [10/0] via 192.168.90.3, vlan4, weight 1, 01:22:15
C>* 192.168.10.0/24 is directly connected, vlan2, 01:04:50
C>* 192.168.30.0/24 is directly connected, vlan1, 00:31:51
C>* 192.168.80.0/24 is directly connected, vlan3, 00:32:10
C>* 192.168.90.0/24 is directly connected, vlan4, 01:28:08

R3 Configuration:

R3:/#> configure
R3:/config/#> ip
R3:/config/ip/#> route 172.16.1.0/24 192.168.20.1 1
R3:/config/ip/#> route 172.16.2.0/24 192.168.40.5 1
R3:/config/ip/#> route 172.16.2.0/24 192.168.70.4 1
R3:/config/ip/#> route 172.16.1.0/24 192.168.90.2 10
R3:/config/ip/#> route 172.16.2.0/24 192.168.90.2 10
R3:/config/ip/#> leave
Applying configuration.
Configuration activated.  Remember "copy run start" to save to flash (NVRAM).
R3:/#> show ip route
S - Static | C - Connected | K - Kernel route  | > - Selected route
O - OSPF   | R - RIP       | [Distance/Metric] | * - FIB route

S>* 172.16.1.0/24 [1/0] via 192.168.20.1, vlan1, weight 1, 00:39:17
S   172.16.1.0/24 [10/0] via 192.168.90.2, vlan4, weight 1, 01:34:24
S>* 172.16.2.0/24 [1/0] via 192.168.40.5, vlan2, weight 1, 00:39:15
  *                     via 192.168.70.4, vlan3, weight 1, 00:39:15
S   172.16.2.0/24 [10/0] via 192.168.90.2, vlan4, weight 1, 01:29:50
C>* 192.168.20.0/24 is directly connected, vlan1, 00:39:17
C>* 192.168.40.0/24 is directly connected, vlan2, 00:39:15
C>* 192.168.70.0/24 is directly connected, vlan3, 01:25:02
C>* 192.168.90.0/24 is directly connected, vlan4, 01:34:24

R4 Configuration:

R4:/#> configure
R4:/config/#> ip
R4:/config/ip/#> route 172.16.1.0/24 192.168.30.2 1
R4:/config/ip/#> route 172.16.1.0/24 192.168.70.3 1
R4:/config/ip/#> route 172.16.2.0/24 192.168.50.6 1
R4:/config/ip/#> route 172.16.2.0/24 192.168.100.5 10
R4:/config/ip/#> route 172.16.1.0/24 192.168.100.5 10
R4:/config/ip/#> leave
Applying configuration.
Configuration activated.  Remember "copy run start" to save to flash (NVRAM).
R4:/#> show ip route
S - Static | C - Connected | K - Kernel route  | > - Selected route
O - OSPF   | R - RIP       | [Distance/Metric] | * - FIB route

S>* 172.16.1.0/24 [1/0] via 192.168.30.2, vlan2, weight 1, 00:41:40
  *                     via 192.168.70.3, vlan3, weight 1, 00:41:40
S   172.16.1.0/24 [10/0] via 192.168.100.5, vlan4, weight 1, 01:27:57
S>* 172.16.2.0/24 [1/0] via 192.168.50.6, vlan1, weight 1, 00:40:34
S   172.16.2.0/24 [10/0] via 192.168.100.5, vlan4, weight 1, 01:27:57
C>* 192.168.30.0/24 is directly connected, vlan2, 00:41:40
C>* 192.168.50.0/24 is directly connected, vlan1, 00:40:34
C>* 192.168.70.0/24 is directly connected, vlan3, 01:27:53
C>* 192.168.100.0/24 is directly connected, vlan4, 01:27:57

R5 Configuration:

R5:/#> configure
R5:/config/#> ip
R5:/config/ip/#> route 172.16.1.0/24 192.168.40.3 1
R5:/config/ip/#> route 172.16.1.0/24 192.168.80.2 1
R5:/config/ip/#> route 172.16.2.0/24 192.168.60.6 1
R5:/config/ip/#> route 172.16.2.0/24 192.168.100.4 10
R5:/config/ip/#> route 172.16.1.0/24 192.168.100.4 10
R5:/config/ip/#> leave
Applying configuration.
Configuration activated.  Remember "copy run start" to save to flash (NVRAM).
R5:/#> show ip route
S - Static | C - Connected | K - Kernel route  | > - Selected route
O - OSPF   | R - RIP       | [Distance/Metric] | * - FIB route

S>* 172.16.1.0/24 [1/0] via 192.168.40.3, vlan1, weight 1, 00:43:08
  *                     via 192.168.80.2, vlan3, weight 1, 00:43:08
S   172.16.1.0/24 [10/0] via 192.168.100.4, vlan4, weight 1, 01:29:03
S   172.16.2.0/24 [10/0] via 192.168.100.4, vlan4, weight 1, 01:29:03
S>* 172.16.2.0/24 [1/0] via 192.168.60.6, vlan2, weight 1, 01:30:28
C>* 192.168.40.0/24 is directly connected, vlan1, 00:43:12
C>* 192.168.60.0/24 is directly connected, vlan2, 01:30:28
C>* 192.168.80.0/24 is directly connected, vlan3, 00:43:08
C>* 192.168.100.0/24 is directly connected, vlan4, 01:29:03

R6 Configuration:

R6:/#> configure
R6:/config/#> ip
R6:/config/ip/#> route 172.16.1.0/24 192.168.50.4 1
R6:/config/ip/#> route 172.16.1.0/24 192.168.60.5 1
R6:/config/ip/#> leave
Applying configuration.
Configuration activated.  Remember "copy run start" to save to flash (NVRAM).
R6:/#> show ip route
S - Static | C - Connected | K - Kernel route  | > - Selected route
O - OSPF   | R - RIP       | [Distance/Metric] | * - FIB route

S>* 172.16.1.0/24 [1/0] via 192.168.50.4, vlan2, weight 1, 00:42:20
  *                     via 192.168.60.5, vlan1, weight 1, 00:42:20
C>* 172.16.2.0/24 is directly connected, vlan7, 01:56:12
C>* 192.168.50.0/24 is directly connected, vlan2, 00:42:20
C>* 192.168.60.0/24 is directly connected, vlan1, 01:31:09

Configure With OSPF

In this example we will configure the routers with OSPF instead of static routes. The end result should mostly be the same in terms of ECMP route selection. In addition, the setup should be more resilient to the failure scenarios in which the static routes stop working.

Warning

This setup is supposed to replace the configuration in the example Configure With Static Routes above, not to be combined with it.

As in the previous example, we assume that all of the relevant interfaces have already been configured. The configuration example focuses only on the setup of OSPF on the routers. We will use a very simple OSPF configuration: the only thing we really need to do is tell OSPF which networks it should know about.

Only routers R1 and R6 need an additional configuration option, the redistribute connected command. We want OSPF to know about the networks Network-A and Network-B, which are directly connected to R1 and R6 respectively but exist outside of the OSPF domain. Thus, if we want to be able to route traffic between those two networks, the OSPF routing domain must be made aware of them. Otherwise no routes towards those destinations will be inserted into any of the routers' routing tables.

Note

While using OSPF will ensure a more optimal path during link failures, the failover time could in some cases be a bit longer. The reason for this is that the routing information from the protocol must propagate throughout the network.
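
To illustrate how a link-state protocol arrives at equal-cost next hops, the following Python sketch runs a shortest-path computation over the topology of Figure 2 and collects every first hop that lies on a lowest-cost path. It is a simplified model and not the OSPF implementation of the router: the cost of 1 per link is an assumption (real OSPF interface costs depend on configuration), and the function name ecmp_next_hops is hypothetical.

```python
import heapq

# Router pairs joined by each network in Figure 2.
LINKS = {
    "NET-1": ("R1", "R2"), "NET-2": ("R1", "R3"), "NET-3": ("R2", "R4"),
    "NET-4": ("R3", "R5"), "NET-5": ("R4", "R6"), "NET-6": ("R5", "R6"),
    "NET-7": ("R3", "R4"), "NET-8": ("R2", "R5"), "NET-9": ("R2", "R3"),
    "NET-10": ("R4", "R5"),
}

def ecmp_next_hops(src, dst, failed_nets=(), cost=1):
    """Return the neighbours of src that lie on a lowest-cost path to dst."""
    graph = {}
    for net, (a, b) in LINKS.items():
        if net not in failed_nets:
            graph.setdefault(a, []).append(b)
            graph.setdefault(b, []).append(a)

    dist = {src: 0}
    first_hops = {src: set()}
    heap = [(0, src)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist[node]:
            continue  # stale heap entry
        for peer in graph.get(node, []):
            new_dist = d + cost
            # Leaving src, the first hop is the neighbour itself.
            hops = {peer} if node == src else first_hops[node]
            if new_dist < dist.get(peer, float("inf")):
                dist[peer] = new_dist
                first_hops[peer] = set(hops)
                heapq.heappush(heap, (new_dist, peer))
            elif new_dist == dist[peer]:
                first_hops[peer] |= hops  # another equal-cost path
    return sorted(first_hops.get(dst, set()))

print(ecmp_next_hops("R1", "R6"))
print(ecmp_next_hops("R1", "R6", failed_nets={"NET-3", "NET-8"}))
```

With all links up, R1 gets both R2 and R3 as next hops towards R6. With NET-3 and NET-8 down, only R3 remains, since the path through R2 would now need the extra hop over NET-9.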

R1 Configuration:

R1:/#> configure
R1:/config/#> router
R1:/config/router/#> ospf
R1:/config/router/ospf/#> network 192.168.10.0/24
R1:/config/router/ospf/#> network 192.168.20.0/24
R1:/config/router/ospf/#> redistribute connected
R1:/config/router/ospf/#> leave
Applying configuration.
Configuration activated.  Remember "copy run start" to save to flash (NVRAM).
R1:/#>

R2 Configuration:

R2:/#> configure
R2:/config/#> router
R2:/config/router/#> ospf
R2:/config/router/ospf/#> network 192.168.10.0/24
R2:/config/router/ospf/#> network 192.168.30.0/24
R2:/config/router/ospf/#> network 192.168.90.0/24
R2:/config/router/ospf/#> network 192.168.80.0/24
R2:/config/router/ospf/#> leave
Applying configuration.
Configuration activated.  Remember "copy run start" to save to flash (NVRAM).
R2:/#>

R3 Configuration:

R3:/#> configure
R3:/config/#> router
R3:/config/router/#> ospf
R3:/config/router/ospf/#> network 192.168.20.0/24
R3:/config/router/ospf/#> network 192.168.90.0/24
R3:/config/router/ospf/#> network 192.168.70.0/24
R3:/config/router/ospf/#> network 192.168.40.0/24
R3:/config/router/ospf/#> leave
Applying configuration.
Configuration activated.  Remember "copy run start" to save to flash (NVRAM).
R3:/#>

R4 Configuration:

R4:/#> configure
R4:/config/#> router
R4:/config/router/#> ospf
R4:/config/router/ospf/#> network 192.168.30.0/24
R4:/config/router/ospf/#> network 192.168.50.0/24
R4:/config/router/ospf/#> network 192.168.100.0/24
R4:/config/router/ospf/#> network 192.168.70.0/24
R4:/config/router/ospf/#> leave
Applying configuration.
Configuration activated.  Remember "copy run start" to save to flash (NVRAM).
R4:/#>

R5 Configuration:

R5:/#> configure
R5:/config/#> router
R5:/config/router/#> ospf
R5:/config/router/ospf/#> network 192.168.100.0/24
R5:/config/router/ospf/#> network 192.168.60.0/24
R5:/config/router/ospf/#> network 192.168.80.0/24
R5:/config/router/ospf/#> network 192.168.40.0/24
R5:/config/router/ospf/#> leave
Applying configuration.
Configuration activated.  Remember "copy run start" to save to flash (NVRAM).
R5:/#>

R6 Configuration:

R6:/#> configure
R6:/config/#> router
R6:/config/router/#> ospf
R6:/config/router/ospf/#> network 192.168.50.0/24
R6:/config/router/ospf/#> network 192.168.60.0/24
R6:/config/router/ospf/#> redistribute connected
R6:/config/router/ospf/#> leave
Applying configuration.
Configuration activated.  Remember "copy run start" to save to flash (NVRAM).
R6:/#>

Status and Verification

If we have managed to configure everything correctly we should now be able to send traffic between the two networks. We could do some basic verification by simply pinging some device from one network to the other:

root@Network-A-Host1:/home/admin # ping 172.16.2.99 -c 3
PING 172.16.2.99 (172.16.2.99): 56 data bytes
64 bytes from 172.16.2.99: seq=0 ttl=60 time=4.063 ms
64 bytes from 172.16.2.99: seq=1 ttl=60 time=7.465 ms
64 bytes from 172.16.2.99: seq=2 ttl=60 time=6.815 ms

--- 172.16.2.99 ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 4.063/6.114/7.465 ms

In order to check what specific path we are taking to reach the target, we could utilize traceroute:

root@Host-A:/home/admin # traceroute 172.16.2.99
traceroute to 172.16.2.99 (172.16.2.99), 30 hops max, 46 byte packets
 1  172.16.1.1 (172.16.1.1)  1.268 ms  0.956 ms  0.914 ms
 2  192.168.10.2 (192.168.10.2)  2.167 ms  2.722 ms  1.618 ms
 3  192.168.70.4 (192.168.70.4)  2.787 ms  1.928 ms  1.722 ms
 4  192.168.60.6 (192.168.60.6)  3.757 ms  1.791 ms  1.287 ms
 5  172.16.2.99 (172.16.2.99)  1.682 ms  2.385 ms  2.022 ms

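Note that a reply address does not always belong to the link the probe arrived on: hop 3 answers from R4's NET-7 address (192.168.70.4) although R2 reaches R4 over NET-3, presumably because the reply is itself routed back over an equal-cost path. The traceroute output corresponds to the following hops: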

  • Network-A-Host1 -> R1 -> R2 -> R4 -> R6 -> Network-B-Host1
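
Translating traceroute output into router names can be automated. The Python sketch below relies on the addressing in Figure 2, where the last octet of an address on any 192.168.x.0/24 network is the router number. Treating 172.16.1.1 as R1 is an assumption based on the first hop of the output, and the function names router_name and hops are hypothetical.

```python
import re

# Output copied from the first traceroute above.
SAMPLE = """\
traceroute to 172.16.2.99 (172.16.2.99), 30 hops max, 46 byte packets
 1  172.16.1.1 (172.16.1.1)  1.268 ms  0.956 ms  0.914 ms
 2  192.168.10.2 (192.168.10.2)  2.167 ms  2.722 ms  1.618 ms
 3  192.168.70.4 (192.168.70.4)  2.787 ms  1.928 ms  1.722 ms
 4  192.168.60.6 (192.168.60.6)  3.757 ms  1.791 ms  1.287 ms
 5  172.16.2.99 (172.16.2.99)  1.682 ms  2.385 ms  2.022 ms
"""

def router_name(addr):
    """Map a hop address to a router name, or return the address unchanged."""
    octets = addr.split(".")
    if octets[:2] == ["192", "168"]:
        return "R" + octets[3]      # last octet is the router number
    if addr == "172.16.1.1":
        return "R1"                 # assumed Network-A address of R1
    return addr

def hops(output):
    """Return the hop names in order, skipping the header line."""
    names = []
    for line in output.splitlines():
        match = re.match(r"\s*\d+\s+(\d+\.\d+\.\d+\.\d+)\s", line)
        if match:
            names.append(router_name(match.group(1)))
    return names

print(" -> ".join(hops(SAMPLE)))
```

The final hop is left as a plain address, since it is the destination host and not a router.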

When we perform a traceroute to another host, in this case located at 172.16.2.222, we can see that because of the ECMP routing a different path was selected:

root@Host-A:/home/admin # traceroute 172.16.2.222
traceroute to 172.16.2.222 (172.16.2.222), 30 hops max, 46 byte packets
 1  172.16.1.1 (172.16.1.1)  1.522 ms  0.852 ms  0.884 ms
 2  192.168.20.3 (192.168.20.3)  1.532 ms  1.306 ms  1.409 ms
 3  192.168.80.5 (192.168.80.5)  2.445 ms  1.319 ms  1.289 ms
 4  192.168.60.6 (192.168.60.6)  2.082 ms  1.702 ms  1.622 ms
 5  172.16.2.222 (172.16.2.222)  5.326 ms  4.561 ms  3.892 ms

The traceroute output corresponds to the following hops:

  • Network-A-Host1 -> R1 -> R3 -> R5 -> R6 -> Network-B-Host2

If a link failure occurs, traffic should still continue to flow. As an example, when we ping a host located in Network-B at 172.16.2.99 from Network-A, the traffic takes the following route:

  • Network-A-Host1 -> R1 -> R2 -> R4 -> R6 -> Network-B-Host1

Then a link failure occurs for NET-3, between R2 and R4. When this occurs, the traffic will instead be routed along the following path:

  • Network-A-Host1 -> R1 -> R2 -> R5 -> R6 -> Network-B-Host1

We now hop from R2 to R5 over NET-8 instead. This can clearly be seen with traceroute:

root@Host-A:/home/admin # traceroute 172.16.2.99
traceroute to 172.16.2.99 (172.16.2.99), 30 hops max, 46 byte packets
 1  172.16.1.1 (172.16.1.1)  1.017 ms  0.776 ms  1.826 ms
 2  192.168.10.2 (192.168.10.2)  2.072 ms  2.215 ms  2.577 ms
 3  192.168.80.5 (192.168.80.5)  3.365 ms  1.871 ms  1.691 ms
 4  192.168.60.6 (192.168.60.6)  2.324 ms  2.509 ms  2.301 ms
 5  172.16.2.99 (172.16.2.99)  2.441 ms  2.976 ms  2.020 ms