VXLAN communication process:
For two virtual terminals in the same VXLAN, the communication process can be summarized as follows:
- The sender sends the receiver a data frame that carries the virtual MAC addresses of both the sender and the receiver.
- The VTEP node connected to the sender receives the data frame. It looks up the VXLAN to which the sender belongs and the VTEP node to which the receiver is connected, adds the VXLAN header, the outer UDP header, and the outer IP header, and then sends the packet to the destination VTEP node.
- The packet is transmitted over the physical network to the destination VTEP node.
- After receiving the packet, the destination VTEP node removes the outer IP header and the outer UDP header, then checks the VNI of the packet and the destination MAC address of the inner data frame. After confirming that the receiver is connected to this VTEP node, it removes the VXLAN header and delivers the inner data frame to the receiver.
- The receiver receives the data frame and the transfer is complete.
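The steps above can be sketched in a few lines of Python. This is only an illustration, not a real VTEP: the lookup table `REMOTE_VTEPS`, the function names, and the addresses are all hypothetical, and the outer UDP and IP headers are left to the operating system's socket layer instead of being built by hand. Only the 8-byte VXLAN header (an 8-bit flags field with the I flag, 24 reserved bits, the 24-bit VNI, and 8 reserved bits, as laid out in RFC 7348) is constructed explicitly.

```python
import struct
from typing import Dict, Set, Tuple

VXLAN_FLAG_VNI_VALID = 0x08  # "I" flag from RFC 7348: the VNI field is valid

# Hypothetical table on the sending VTEP:
# (VNI, destination MAC) -> IP address of the VTEP the receiver is attached to
REMOTE_VTEPS: Dict[Tuple[int, bytes], str] = {
    (5000, bytes.fromhex("02000000000b")): "192.0.2.2",
}


def add_vxlan_header(vni: int, inner_frame: bytes) -> bytes:
    """Prepend the 8-byte VXLAN header to an inner Ethernet frame."""
    if not 0 <= vni < 2 ** 24:
        raise ValueError("VNI must fit in 24 bits")
    # Word 1: flags (8 bits) + reserved (24 bits)
    # Word 2: VNI (24 bits) + reserved (8 bits)
    header = struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)
    return header + inner_frame


def strip_vxlan_header(udp_payload: bytes) -> Tuple[int, bytes]:
    """Return (VNI, inner frame) from the payload of the outer UDP datagram."""
    if len(udp_payload) < 8:
        raise ValueError("payload shorter than a VXLAN header")
    word1, word2 = struct.unpack("!II", udp_payload[:8])
    if not (word1 >> 24) & VXLAN_FLAG_VNI_VALID:
        raise ValueError("I flag not set, VNI is not valid")
    return word2 >> 8, udp_payload[8:]


def vtep_send(vni: int, inner_frame: bytes) -> Tuple[str, bytes]:
    """Sender-side VTEP: find the remote VTEP, then encapsulate."""
    dst_mac = inner_frame[:6]  # an Ethernet frame starts with the destination MAC
    remote_vtep_ip = REMOTE_VTEPS[(vni, dst_mac)]
    return remote_vtep_ip, add_vxlan_header(vni, inner_frame)


def vtep_receive(udp_payload: bytes, local_macs: Dict[int, Set[bytes]]) -> bytes:
    """Receiver-side VTEP: decapsulate and check that the receiver is local."""
    vni, inner_frame = strip_vxlan_header(udp_payload)
    if inner_frame[:6] not in local_macs.get(vni, set()):
        raise LookupError("receiver is not attached to this VTEP in this VNI")
    return inner_frame
```

In this sketch, `vtep_send` returns the IP address of the destination VTEP together with the bytes that would become the payload of the outer UDP datagram, and `vtep_receive` hands back the original frame only if the VNI and destination MAC match a locally attached receiver.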
VXLAN network model:
As you can see from the figure, the VXLAN network introduces the following elements that do not exist in a traditional data center network:
The VTEP (VXLAN Tunnel Endpoint) is the edge device of the VXLAN network and the start and end point of the VXLAN tunnel. All processing of VXLAN packets is performed on it. In short, it is the absolute protagonist in the VXLAN network. The VTEP can be either a network device (such as Huawei’s CE series switch) or a server where the virtual machine is located. So how does it work? The answer will be announced later.
VNI (VXLAN Network Identifier)
As mentioned above, the VLAN ID takes up only 12 bits in an Ethernet data frame, which makes the isolation capability of VLANs inadequate in data center networks. VNI emerged specifically to solve this problem. A VNI is a user ID similar to a VLAN ID, and one VNI represents one tenant.
Virtual machines belonging to different VNIs cannot communicate with each other directly at Layer 2. When VXLAN packets are encapsulated, the VNI is given a 24-bit field, which is enough to isolate a massive number of tenants. We will introduce the detailed implementation later.
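To put numbers on the difference, here is a small worked calculation. It assumes the field widths of the two standards: 12 bits for the VLAN ID and 24 bits for the VNI as defined in RFC 7348. The helper name `segment_count` is only for illustration.

```python
VLAN_ID_BITS = 12  # width of the VLAN ID field in an 802.1Q tag
VNI_BITS = 24      # width of the VNI field in the VXLAN header (RFC 7348)


def segment_count(bits: int) -> int:
    """Number of distinct identifiers that fit in a field of the given width."""
    return 2 ** bits


print(segment_count(VLAN_ID_BITS))  # 4096
print(segment_count(VNI_BITS))      # 16777216
print(segment_count(VNI_BITS) // segment_count(VLAN_ID_BITS))  # 4096
```

In other words, the VNI space is 4096 times larger than the VLAN ID space, roughly 16 million segments instead of about 4 thousand.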
A “tunnel” is a logical concept, and not a new one; the familiar GRE is an example. To put it bluntly, the original message is “transformed” and “packaged” so that it can be transmitted on a bearer network (such as an IP network). From the perspective of the host, it is as if there were a direct link between the start and end points of the original message, and this seemingly direct link is the “tunnel.” As the name implies, the “VXLAN tunnel” is used to transmit packets that are encapsulated in VXLAN. It is a virtual channel established between two VTEPs.
RFC 7348 specifies that the payload inside a VXLAN packet must be an Ethernet frame, which limits the scope of use of the VXLAN protocol. To let VXLAN carry other protocols as Overlay payload, an IETF draft is exploring the VXLAN Generic Protocol Extension (GPE), a general-purpose protocol encapsulation for VXLAN.
The GPE header uses some of the bits that the original RFC 7348 marks as reserved.
- Version (Ver): Indicates the VXLAN GPE protocol version. The initial value is 0.
- Next Protocol Bit (P bit): If the P bit is 1, the Next Protocol field is valid.
- BUM Traffic Bit (B bit): If the B bit is 1, it indicates that the encapsulated packet in the VXLAN is a BUM packet.
- OAM Flag Bit (O bit): If the O bit is 1, the encapsulated packet in the VXLAN is an OAM packet.
- Next Protocol: 8 bits. Indicates the protocol format of the encapsulated packets inside the VXLAN.
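As a rough illustration of how these fields could be read from a header, the sketch below parses the first 8 bytes of a VXLAN GPE packet. The bit positions are an assumption based on the layout shown in the VXLAN GPE draft revisions that include the B bit (first byte `R R Ver Ver I P B O`, Next Protocol in the fourth byte, VNI in the next three bytes); check them against the draft revision you are working with. The function name is hypothetical.

```python
import struct
from typing import Dict


def parse_vxlan_gpe_header(header: bytes) -> Dict[str, int]:
    """Split the first 8 bytes of a VXLAN GPE header into its fields.

    Assumed layout (not confirmed by the article itself):
      byte 0     : R R Ver Ver I P B O
      bytes 1-2  : reserved
      byte 3     : Next Protocol
      bytes 4-6  : VNI
      byte 7     : reserved
    """
    if len(header) < 8:
        raise ValueError("a VXLAN GPE header is 8 bytes long")
    flags, _reserved, next_protocol, vni_word = struct.unpack("!BHBI", header[:8])
    return {
        "version": (flags >> 4) & 0x3,   # Ver, 2 bits
        "i_bit": (flags >> 3) & 0x1,     # VNI is valid
        "p_bit": (flags >> 2) & 0x1,     # Next Protocol field is valid
        "b_bit": (flags >> 1) & 0x1,     # payload is BUM traffic
        "o_bit": flags & 0x1,            # payload is an OAM packet
        "next_protocol": next_protocol,  # 8 bits
        "vni": vni_word >> 8,            # upper 24 bits of the second word
    }
```

A receiver would typically look at `p_bit` first and only interpret `next_protocol` when that bit is 1, as the list above describes.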
BUM stands for broadcast, unknown unicast, and multicast traffic. Depending on how the flooded traffic is copied, forwarding can be divided into unicast routing mode (head-end replication) and multicast routing mode (core replication). In head-end replication mode, the VTEP itself is responsible for copying the packets: it sends one copy to the local site through its local interfaces and one copy to every remote VTEP in the VXLAN through the VXLAN tunnels. After receiving the VXLAN packet, a remote VTEP decapsulates it and floods the original frame within the VXLAN at its local site. To avoid loops, a remote VTEP never floods a packet received from a VXLAN tunnel to other VXLAN tunnels.
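The head-end replication rule, including the loop-avoidance check, can be sketched as follows. The `Vtep` class and its fields are hypothetical and model a single VXLAN only; a real VTEP keeps this state per VNI.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple


@dataclass
class Vtep:
    """Toy model of one VTEP's flooding state for a single VXLAN."""
    local_ports: Set[str]    # interfaces toward the local site
    remote_vteps: Set[str]   # peers reachable through VXLAN tunnels

    def flood(self, ingress: str, from_tunnel: bool) -> List[Tuple[str, str]]:
        """Return the (kind, target) copies made for one BUM frame."""
        # Always flood to the local site, except back out of the ingress port.
        copies = [("port", p) for p in sorted(self.local_ports) if p != ingress]
        # Loop avoidance: a frame that arrived over a tunnel is never
        # copied to other tunnels.
        if not from_tunnel:
            copies += [("tunnel", v) for v in sorted(self.remote_vteps)]
        return copies
```

For a frame arriving on a local port, `flood` returns the other local ports plus every remote VTEP. For a frame arriving from a tunnel, it returns local ports only.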
VXLAN Layer 3 Gateway – L3 Gateway
The VXLAN Layer 3 gateway provides Layer 3 forwarding for the VXLAN. Each VXLAN is associated with a VSI virtual interface (VXLAN virtual interface), and the IP address assigned to that interface serves as the gateway of all VMs in the VXLAN.
The main functions of the VXLAN Layer 3 Gateway:
- Enable communication between virtual machines in a VXLAN and non-VXLAN networks
- Enable communication between virtual machines in different VXLANs
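A minimal sketch of the forwarding decision behind these two functions, using Python's standard `ipaddress` module. The table `VSI_INTERFACES`, the VNI numbers, the subnets, and the function name are all made up for illustration; a real gateway would use its routing table.

```python
import ipaddress
from typing import Dict, Optional, Tuple

# Hypothetical gateway table: VNI -> address of that VXLAN's VSI virtual interface
VSI_INTERFACES: Dict[int, ipaddress.IPv4Interface] = {
    10010: ipaddress.ip_interface("10.1.0.1/24"),
    10020: ipaddress.ip_interface("10.2.0.1/24"),
}


def l3_forward(src_vni: int, dst_ip: str) -> Tuple[str, Optional[int]]:
    """Decide what happens to a packet sent by a VM in src_vni."""
    dst = ipaddress.ip_address(dst_ip)
    # Same subnet as the sender's VXLAN: plain Layer 2 forwarding, no gateway.
    if dst in VSI_INTERFACES[src_vni].network:
        return ("bridge-in-vxlan", src_vni)
    # Subnet of another VXLAN: the gateway routes between the two VXLANs.
    for vni, vsi in VSI_INTERFACES.items():
        if dst in vsi.network:
            return ("route-to-vxlan", vni)
    # Anything else leaves the overlay toward a non-VXLAN network.
    return ("route-to-non-vxlan", None)
```

For example, a VM in VNI 10010 sending to `10.2.0.7` would be routed into VNI 10020, while a packet to `198.51.100.9` would be routed out to a non-VXLAN network.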
VXLAN Layer 3 gateways are available in both centralized and distributed modes depending on how they are deployed.
In the centralized mode, the gateways are all placed on the Spine device. All traffic between VXLANs, as well as all traffic between VXLAN and non-VXLAN networks, must pass through the Spine. The advantage of a centralized gateway is that, because every flow passes through the Spine device, it is easier to implement traffic control and automatic traffic steering. The disadvantage is that the Spine device comes under heavy load, which is not conducive to large-scale deployment.
In the distributed VXLAN Layer 3 gateway solution, each VTEP device can act as a VXLAN IP gateway and perform Layer 3 forwarding for the traffic of its local site. A distributed Layer 3 gateway solves the problem of excessive load on the Spine device caused by the concentration of traffic, and it also allows the network to be extended flexibly. On a distributed gateway network, the Spine device is not a VTEP. It is only part of the Underlay network and is responsible for forwarding ordinary IP packets.
VXLAN has become the preferred choice among current SDN Overlay technologies thanks to its simple data plane and good compatibility, but it still has a long way to go.
For example, exploring the VXLAN GPE encapsulation is one direction, and solving QoS for the VXLAN tunnel is another. The control plane also needs to do more: how to better implement on-demand customization, how to achieve intelligent traffic adjustment, how to be more compatible with heterogeneous devices, and so on. I believe that the future will give us a better answer.
Part of the content is selected from H3C product support and service technology column “Interpretation of VXLAN”