SMS Blog
EVPN+IRB Over MPLS With JUNOS and IOS-XR
By Zachary Cayou, Network Engineer, SMS
Introduction
I was given a project to implement EVPN+IRB over MPLS in our network, with the bonus to make it interoperable between JUNOS and IOS-XR routers. At the time, the depths of Google revealed precisely zero examples, guides, or blog posts of anyone attempting to do this. In addition, vendor documentation on the subject tends to assume a particular network design, which we do not follow. As a result, it became an interesting experiment of trial/error, dissecting documentation and RFCs, and a fair amount of head scratching.
The purpose of this post is to outline the compatibility, limitations, and tweaks necessary to implement EVPN+IRB in a multivendor environment with JUNOS and IOS-XR.
Primer
Ethernet VPN (EVPN) is a next-generation VPN protocol for building both L2 and L3 VPNs. EVPN attempts to address many of the challenges faced by traditional L2VPN protocols such as VPLS, while also providing L3VPN capabilities.
Adding integrated routing and bridging (IRB) into EVPN enables both L2 forwarding of intra-subnet traffic and L3 forwarding of inter-subnet traffic within the L3VPN. This facilitates stretching a L2 domain across the core when L2 reachability is needed, while providing optimal forwarding of L3 traffic, and enabling VM mobility support with distributed anycast gateways.
EVPN, its associated protocols, and their applications are far too broad a topic to cover in depth here, so the details and challenges discussed hereafter assume a fair understanding of EVPN already. Details on EVPN implementations may be found in the respective vendor’s documentation.
The specific platforms and OS versions referenced:
- Juniper MX80s running JUNOS 20.4R1.12
- Cisco ASR9010s running IOS-XR 6.7.3
Interoperability
Is EVPN+IRB over MPLS interoperable between JUNOS and IOS-XR? At the time of writing… no, strictly speaking it is not, for reasons I’ll outline below. The two platforms implement incompatible EVPN+IRB behavior. That said, with a certain degree of workarounds, loose interoperability can be achieved.
While EVPN is a mature technology, EVPN+IRB is less so. Vendors began adding IRB features of EVPN in advance of governing RFCs. RFC 9135 (Integrated Routing and Bridging in Ethernet VPN (EVPN)) was in draft until October 2021. This RFC largely outlines the two IRB models that IOS-XR and JUNOS follow, symmetric and asymmetric, respectively.
Asymmetric IRB
In the asymmetric IRB model, the lookup operation for inter-subnet routing is asymmetric on the ingress and egress PE. When H1 sends a packet destined for H4, PE1 interface X receives the frame and conducts an IP lookup for the destination in its VRF table, where the longest match resolves to the network on interface Y. PE1 then does a lookup for H4’s MAC, which resolves as an EVPN learned adjacency. The packet is then encapsulated with a source MAC of interface Y and a destination MAC of H4, and forwarded as a L2 payload across the core to PE2. At PE2, a single MAC lookup is performed to switch the traffic to H4. While this model achieves some simplicity in the control plane, and provides for centralized routing, it introduces limitations in scalability and flexibility. Since the ingress PE must be able to do the MAC lookup for the destination, it follows that it must also contain every IRB interface and install adjacency rewrites in the forwarding plane for every host in the routing domain, regardless of whether the PE has any local hosts in that network.
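The two lookups on the ingress PE can be sketched in a few lines of Python. This is only an illustrative model, not vendor behavior: the subnets, MAC addresses, and the names `IRB_SUBNETS`, `PE1_ADJACENCIES`, and `asymmetric_ingress` are all assumptions made up for the example.

```python
import ipaddress

# Hypothetical addressing, chosen only for illustration.
IRB_SUBNETS = {
    "X": ipaddress.ip_network("172.16.1.0/24"),  # H1's subnet
    "Y": ipaddress.ip_network("172.16.2.0/24"),  # H4's subnet
}
IRB_MACS = {"X": "00:00:00:00:00:0a", "Y": "00:00:00:00:00:0b"}

# Adjacencies PE1 must hold in the asymmetric model, including remote hosts.
PE1_ADJACENCIES = {
    ipaddress.ip_address("172.16.2.4"): ("00:50:56:00:00:04", "PE2"),  # H4
}


def asymmetric_ingress(dst_ip: str) -> dict:
    """Return the L2 rewrite PE1 applies before sending across the core."""
    dst = ipaddress.ip_address(dst_ip)
    # Lookup 1: longest-prefix match against the connected IRB subnets.
    candidates = [name for name, net in IRB_SUBNETS.items() if dst in net]
    if not candidates:
        raise LookupError(f"no route to {dst}")
    irb = max(candidates, key=lambda name: IRB_SUBNETS[name].prefixlen)
    # Lookup 2: resolve the destination MAC from the EVPN-learned adjacency.
    if dst not in PE1_ADJACENCIES:
        raise LookupError(f"no adjacency for {dst}")
    dst_mac, egress_pe = PE1_ADJACENCIES[dst]
    return {"src_mac": IRB_MACS[irb], "dst_mac": dst_mac,
            "payload": "L2", "egress_pe": egress_pe}
```

Note that the sketch fails for any host missing from `PE1_ADJACENCIES`, which mirrors the scaling cost: the ingress PE needs an entry for every remote host it may route to.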
Symmetric IRB
In the symmetric model, the lookup operation for inter-subnet routing is symmetric on the ingress and egress PE. When host H1 sends a packet destined for H4, PE1 interface X receives the frame and conducts an IP lookup for the destination in its VRF table, where the longest match resolves to a host route learned from PE2. The traffic is forwarded as a L3 payload across the core to PE2. PE2 then does a lookup for H4’s MAC, which resolves as a local adjacency on interface Y, and the packet can be forwarded to H4.
In the symmetric model, it should be clear that PE1 need not store a L2 adjacency for H4 in the forwarding plane, nor does interface Y need to exist on PE1 at all. This is achieved by including an additional label (aka Label2) and an additional VRF route-target into EVPN Type-2 MAC/IP advertisements. The additional route-target and label are used the same as they would be in a VPNv4 advertisement: to import the host-route into the correct VRF and to provide a forwarding label.
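As a rough sketch of how the extra attributes are used, the Python below models a Type-2 MAC/IP route with optional Label2 and VRF route-target fields. The class and function names, label values, and route-targets are hypothetical; only the idea (import the /32 into the VRF when Label2 and a matching VRF RT are present) comes from the text above.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class MacIpRoute:
    """Simplified EVPN Type-2 MAC/IP advertisement (illustrative fields only)."""
    mac: str
    ip: str
    label1: int                      # L2 forwarding label
    evi_rt: str                      # route-target of the EVPN instance
    label2: Optional[int] = None     # L3 forwarding label, symmetric mode only
    vrf_rt: Optional[str] = None     # VRF route-target, symmetric mode only


def import_host_route(route: MacIpRoute, vrf_import_rt: str,
                      vrf_table: Dict[str, int]) -> bool:
    """Install ip/32 -> Label2 in the VRF, as a VPNv4 import would."""
    if route.label2 is None or route.vrf_rt != vrf_import_rt:
        return False  # asymmetric-style advertisement: nothing to import
    vrf_table[f"{route.ip}/32"] = route.label2
    return True
```

With this model, an advertisement that carries only `label1` and `evi_rt` leaves the VRF table untouched, which is the asymmetric case.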
Compatibility
RFC 9135 says that asymmetric and symmetric IRB modes may coexist on the same network, and that the egress PE indirectly dictates the mode by the presence or absence of Label2 and the VRF Route-Target in the EVPN Type-2 advertisements. In other words, coexistence only means that PEs can prefer to operate in different modes; they must still be capable of both. As it turns out, IOS-XR operates exclusively in symmetric mode, and JUNOS operates exclusively in asymmetric mode.
Let’s look at where this breaks down in the data plane. Where PE1 is JUNOS and PE2 is IOS-XR, PE1 sends an EVPN Type-2 MAC/IP advertisement without Label2/VRF RT for H1, and PE2 sends an EVPN Type-2 MAC/IP advertisement with Label2/VRF RT for H4. PE1 does not recognize the Label2/VRF RT attributes and ignores them, but it still installs an adjacency for the H4 MAC/IP in the forwarding table. PE1 performs an IP lookup for H4, finds interface Y as the longest match, does a MAC lookup for H4, forwards the L2 payload to PE2, and PE2 successfully switches the packet to H4. In the return direction from H4, PE2 performs an IP lookup for H1, finds interface X as the longest match, but fails to resolve a MAC for H1. Even though PE2 is aware of the MAC/IP binding from EVPN in the control plane, this binding is not installed as an adjacency in the data plane. PE2 operates purely in the symmetric mode and expects to see Label2/VRF RT in the Type-2 advertisement for H1 if inter-subnet routing was desired.
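The mismatch can be reduced to two predicates, one per platform. This is a deliberately simplified model of the behavior described above, not a description of either vendor's implementation; the function names, dictionary keys, and all values are invented for illustration.

```python
def junos_installs_adjacency(adv: dict) -> bool:
    """Asymmetric-only PE: a MAC/IP binding suffices; Label2/VRF RT are ignored."""
    return bool(adv.get("mac")) and bool(adv.get("ip"))


def iosxr_installs_host_route(adv: dict) -> bool:
    """Symmetric-only PE: routes to the host only if Label2 and VRF RT are present."""
    return adv.get("label2") is not None and adv.get("vrf_rt") is not None


# Hypothetical Type-2 advertisements for the two hosts.
h1_from_junos = {"mac": "00:50:56:00:00:01", "ip": "172.16.1.1"}
h4_from_iosxr = {"mac": "00:50:56:00:00:04", "ip": "172.16.2.4",
                 "label2": 24001, "vrf_rt": "65000:1"}

forward_path_ok = junos_installs_adjacency(h4_from_iosxr)   # H1 -> H4 via PE1
return_path_ok = iosxr_installs_host_route(h1_from_junos)   # H4 -> H1 via PE2
```

In this model the forward path resolves and the return path does not, matching the one-way failure described above.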
The ultimate problem is bidirectional traffic fails because there is no way to properly route traffic from a PE that only supports symmetric mode to a PE that only supports asymmetric mode. Therefore, we can conclude that EVPN+IRB in isolation is not presently interoperable between JUNOS and IOS-XR. However, since the problem is purely that of a routing type, there are other ways we could approach the problem to still make it work.
Solution
Our network consists of hundreds of sites, each typically with a pair of PEs connected to a L2 campus infrastructure. Each subnet’s gateway lives as a FHRP VIP shared between the two PEs. Routing for each site is distributed over VPNv4. The introduction of EVPN+IRB in our network was intended to support stretching subnets across sites for services that required L2 connectivity, as well as supporting VM mobility for failover events. Our requirements included that a host stretched to any site must always be able to route on the local PE.
The interoperability problem outlined above breaks down specifically due to the lack of host-routes advertised from the JUNOS PE to IOS-XR PE, but in our network we are already running another protocol that we could use to solve this problem: VPNv4. The solution is to make the JUNOS PEs advertise the local EVPN host-routes inside VPNv4. While this does introduce additional overhead on the control-plane, as every EVPN Type-2 MAC/IP advertisement from the JUNOS PE will also have a corresponding VPNv4 advertisement, this expense is trivial for our use cases. By only injecting the EVPN host routes into VPNv4 on the JUNOS PE, we end up running EVPN in the asymmetric model on the JUNOS to IOS-XR path, and in the symmetric model on the IOS-XR to JUNOS path. At this point we’re still bound by the limitations of the asymmetric model on the JUNOS PEs. When we take the same solution one step further by also advertising the host-routes in VPNv4 from the IOS-XR PE, then we finally replicate the symmetric model bidirectionally.
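The three stages described above (no VPNv4 host routes, host routes from JUNOS only, host routes from both platforms) can be summarized with a small decision function. It is a simplification under stated assumptions: `path_mode` is a hypothetical helper, and it assumes a VPNv4 /32 always wins the longest-match lookup over the connected IRB subnet.

```python
def path_mode(ingress: str, egress: str, vpnv4_advertisers: set) -> str:
    """Forwarding model used from the ingress PE toward a host behind the egress PE."""
    if egress in vpnv4_advertisers:
        # Assumption: the /32 learned over VPNv4 is the longest match, so the
        # ingress PE routes the packet as an L3 payload (symmetric behavior).
        return "symmetric"
    if ingress == "junos":
        # JUNOS falls back to the EVPN-learned MAC/IP adjacency.
        return "asymmetric"
    # IOS-XR has neither a host route nor a data-plane adjacency.
    return "unreachable"


stages = {
    "no workaround": set(),
    "JUNOS advertises host routes": {"junos"},
    "both advertise host routes": {"junos", "iosxr"},
}
summary = {
    name: (path_mode("junos", "iosxr", adv), path_mode("iosxr", "junos", adv))
    for name, adv in stages.items()
}
```

Each tuple in `summary` holds the JUNOS to IOS-XR mode first and the IOS-XR to JUNOS mode second.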
Configuration Steps
The following are the configuration steps utilized to achieve loose EVPN+IRB interoperability on our network. For brevity, this assumes the control-plane is already configured with VPNv4 and EVPN address families enabled, and with relevant VRFs already created.
1. Create the attachment circuits and set the ethernet circuit parameters. In our design, each PE in a pair connects to the L2 campus but not as a LAG, so we configure the ethernet segment in single-active mode.
IOS-XR:
evpn interface Bundle-Ether1
evpn interface Bundle-Ether1 ethernet-segment
evpn interface Bundle-Ether1 ethernet-segment identifier type 0 00.00.00.00.00.00.00.00.01
evpn interface Bundle-Ether1 ethernet-segment load-balancing-mode single-active
interface Bundle-Ether1.1000 l2transport
interface Bundle-Ether1.1000 l2transport description v1000;VRF-A;172.16.0.0/24
interface Bundle-Ether1.1000 l2transport encapsulation dot1q 1000
interface Bundle-Ether1.1000 l2transport rewrite ingress tag pop 1 symmetric
JUNOS:
set interfaces ae1 flexible-vlan-tagging
set interfaces ae1 encapsulation flexible-ethernet-services
set interfaces ae1 esi 00:00:00:00:00:00:00:00:00:02
set interfaces ae1 esi single-active
set interfaces ae1 unit 1000 description "v1000;VRF-A;172.16.0.0/24"
set interfaces ae1 unit 1000 encapsulation vlan-bridge
set interfaces ae1 unit 1000 vlan-id 1000
2. Create the IRB interfaces in the respective VRF. To provide gateway ARP consistency as a distributed anycast gateway, the MAC address must be statically assigned and replicated on each PE.
IOS-XR PE:
The “host-routing” knob enables the symmetric behavior in the control plane.
interface BVI1000 description VRF-A;172.16.0.0/24
interface BVI1000 host-routing
interface BVI1000 vrf VRF-A
interface BVI1000 ipv4 address 172.16.0.1 255.255.255.0
interface BVI1000 mac-address 0.0.1000
JUNOS:
set interfaces irb unit 1000 description "VRF-A;172.16.0.0/24"
set interfaces irb unit 1000 family inet address 172.16.0.1/24
set interfaces irb unit 1000 mac 00:00:00:00:10:00
set routing-instances VRF-A interface irb.1000
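Because the two platforms write MAC addresses differently (dotted groups on IOS-XR, colon-separated octets on JUNOS), it is easy to configure anycast gateway MACs that look alike but differ. A small check like the following can confirm they match; `normalize_mac` is a hypothetical helper written for this post, not part of any vendor tooling.

```python
def normalize_mac(text: str) -> str:
    """Return a MAC as 12 lowercase hex digits, accepting either notation."""
    if "." in text:
        # IOS-XR style: three groups of up to four hex digits, e.g. 0.0.1000
        groups = text.split(".")
        if len(groups) != 3:
            raise ValueError(f"expected 3 dotted groups: {text!r}")
        digits = "".join(group.zfill(4) for group in groups)
    else:
        # JUNOS style: six colon-separated octets, e.g. 00:00:00:00:10:00
        octets = text.split(":")
        if len(octets) != 6:
            raise ValueError(f"expected 6 octets: {text!r}")
        digits = "".join(octet.zfill(2) for octet in octets)
    if len(digits) != 12:
        raise ValueError(f"wrong length for a MAC address: {text!r}")
    int(digits, 16)  # raises ValueError if any character is not hex
    return digits.lower()


def same_gateway_mac(iosxr_mac: str, junos_mac: str) -> bool:
    """True when both configurations describe the same 48-bit address."""
    return normalize_mac(iosxr_mac) == normalize_mac(junos_mac)
```

For the values used in this step, `same_gateway_mac("0.0.1000", "00:00:00:00:10:00")` evaluates to `True`.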
3. Create the EVPN instance.
IOS-XR:
The binding of the attachment circuit(s), and IRB interface is done inside of a L2VPN configuration, which is then associated with the EVPN instance.
evpn evi 1000
evpn evi 1000 bgp
evpn evi 1000 bgp rd 1000:1000
evpn evi 1000 route-target import 1000:1000
evpn evi 1000 route-target export 1000:1000
evpn evi 1000 description VRF-A;172.16.0.0/24
l2vpn bridge group VRF-A
l2vpn bridge group VRF-A bridge-domain 1000
l2vpn bridge group VRF-A bridge-domain 1000 interface Bundle-Ether1.1000
l2vpn bridge group VRF-A bridge-domain 1000 routed interface BVI1000
l2vpn bridge group VRF-A bridge-domain 1000 evi 1000
JUNOS:
The EVPN instance is created as a routing-instance of type EVPN.
By default, JUNOS will suppress both ingress/egress ARPs across the core, and instead proxy the response utilizing information known from the Type-2 MAC/IP advertisements. IOS-XR does not implement this feature, so ARP requests/replies must be allowed across the core. The hidden command ‘no-arp-suppression’ is necessary to disable this behavior on JUNOS.
JUNOS also implements default-gateway MAC synchronization by default. In our use case with distributed anycast gateways, this feature is not necessary since all gateway MACs are statically set, and should be disabled with the “default-gateway do-not-advertise” knob.
Finally, JUNOS by default does not insert a control-word in front of the payload for egress traffic, while IOS-XR by default does. A control-word is a 4-byte field, beginning with a nibble of zeros, that sits between the bottom MPLS label and the payload. The purpose is to ensure a “dumb” transit device does not mistake a L2 payload whose destination MAC starts with a 4 or 6 for a L3 IPv4 or IPv6 payload. This must be consistent across all PEs, otherwise the payload offset on received traffic will be inconsistent.
set routing-instances VRF-A-evpn-1000 protocols evpn interface ae1.1000
set routing-instances VRF-A-evpn-1000 protocols evpn no-arp-suppression
set routing-instances VRF-A-evpn-1000 protocols evpn default-gateway do-not-advertise
set routing-instances VRF-A-evpn-1000 protocols evpn control-word
set routing-instances VRF-A-evpn-1000 instance-type evpn
set routing-instances VRF-A-evpn-1000 vlan-id none
set routing-instances VRF-A-evpn-1000 routing-interface irb.1000
set routing-instances VRF-A-evpn-1000 interface ae1.1000
set routing-instances VRF-A-evpn-1000 route-distinguisher 1000:1000
set routing-instances VRF-A-evpn-1000 vrf-target target:1000:1000
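To see why the control-word matters, the sketch below imitates a transit device that guesses the payload type from the first nibble after the MPLS label stack. The heuristic, the frame contents, and the name `transit_guess` are illustrative assumptions, not a description of any specific hardware.

```python
CONTROL_WORD = bytes(4)  # four bytes; the leading nibble is zero


def transit_guess(payload: bytes) -> str:
    """Guess the payload type the way a naive load-balancing hash might."""
    first_nibble = payload[0] >> 4
    if first_nibble == 4:
        return "ipv4"
    if first_nibble == 6:
        return "ipv6"
    return "non-ip"


# Hypothetical Ethernet frame. The destination MAC is first on the wire,
# and here it happens to begin with 0x4c.
ethernet_frame = bytes.fromhex(
    "4c0000000001"   # destination MAC
    "000000001000"   # source MAC
    "0800"           # EtherType: IPv4
)

without_cw = transit_guess(ethernet_frame)
with_cw = transit_guess(CONTROL_WORD + ethernet_frame)
```

Without the control-word the frame is misread as IPv4; with it, the leading zero nibble steers the device away from IP parsing. The same four bytes also shift the payload offset, which is why every PE must agree on whether they are present.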
4. Advertise host routes in VPNv4… i.e., the interoperability workaround.
IOS-XR:
This command imports the local EVPN IRB adjacencies as host-routes into the VRF table, allowing for advertisement in VPNv4 or other protocols.
vrf VRF-A address-family ipv4 unicast import from bridge-domain advertise-as-vpn
JUNOS:
In JUNOS the local EVPN IRB adjacencies already exist in the VRF table, and advertising them requires nothing other than allowing routes from protocol evpn.
JUNOS will also advertise both Type-2 MAC/IP advertisements and Type-2 MAC-only advertisements. Here we only require the MAC/IP advertisements, so to reduce control-plane overhead, the MAC-only advertisements should be filtered.
set policy-options policy-statement VRF-A-export term EVPN from protocol evpn
set policy-options policy-statement VRF-A-export term EVPN then accept
set policy-options policy-statement rr-bgp-export term EVPN from family evpn
set policy-options policy-statement rr-bgp-export term EVPN from evpn-mac-route mac-only
set policy-options policy-statement rr-bgp-export term EVPN then reject
Future
The solution in place in our network is not optimal with regard to control-plane utilization, as we are required to double the necessary BGP advertisements for any given host in an EVPN. However, it remains more than viable at the scale we plan to deploy EVPN for the foreseeable future. JUNOS has recently implemented support for the symmetric IRB model for EVPN+IRB over VXLAN, so presumably support over MPLS is on the horizon. At that point, transitioning to the native symmetric model in EVPN would be desirable for both overhead and protocol simplicity.