
The Case for Express Virtio (XVIO) – Part 1

By Sujal Das | Nov 03, 2016

SmartNIC and XVIO take the operational efficiency of OpenStack cloud server infrastructures to new heights.

I am writing this from the OpenStack Summit in Barcelona, fresh from speaking to many data center operators who are deploying OpenStack-based clouds. Virtual machines (VMs) and the applications that run in them form the core building blocks of today’s modern cloud data centers. Revenue from VMs is the lifeline of public cloud service providers. Enterprises that deploy private cloud infrastructure rely on VMs and their apps to deliver the needed services to their customers. Data center operators rely on the promise of deploying SDN and NFV, and on realizing deployment efficiencies using COTS servers, to achieve their revenue, service and other business goals. However, I heard all of them echo what James Hamilton of Amazon Web Services said a few years ago, “Data center networks are in my way,” but with a subtle difference. Instead of data center networks, some said SDN, and others said OpenStack networking or Open vSwitch (OVS) networking or Contrail vRouter networking. When it came to describing their challenges, the resounding common theme was poor server infrastructure efficiency. It sure hurts when 60% of data center infrastructure costs come from servers and related power and cooling.

So what is it about current networking solutions that is really hurting OpenStack-based server infrastructure efficiencies? Let’s look under the hood.

VM workloads differ – each requires a different networking configuration

VMs come in many flavors based on what they run: web server applications, Hadoop big data or other database applications, networking applications, or security applications. The VMs have different profiles in terms of resource requirements: the number of virtual CPU cores (vCPUs), the amount of I/O bandwidth (packets per second), memory (megabytes), disk (gigabytes), tenancy requirements (single versus multiple), reporting of metrics related to their operation and performance behavior, and others. The vCPUs and amount of I/O bandwidth (packets per second) for each VM are relevant in this discussion. For example, a web server application typically requires a moderate number of vCPU cores and lower I/O bandwidth, while Hadoop big data and networking application VMs typically require a moderate-to-high number of vCPU cores and high I/O bandwidth. Delivering optimal I/O bandwidth to VMs of different flavors is a challenge; it requires different networking configurations and affects the heterogeneity and efficiency of server infrastructures.

Networking configuration options: Virtio, DPDK or SR-IOV

The networking services for VMs in OpenStack deployments are mostly delivered using OVS or Contrail vRouter. The I/O bandwidth delivered through such datapaths to VMs is affected by the following:

The data path decisions on the host or hypervisor. The OVS and vRouter datapaths that deliver needed network services can execute in one of the following modes on the host or hypervisor:

1. Linux Kernel Space: The datapath typically executes in the Linux kernel space running on x86 CPU cores. This mode delivers the lowest I/O bandwidth, consuming a high number of CPU cores. More CPU cores can be allocated to the kernel datapath processing to enable higher performance, but in most cases performance caps off at less than 5 Mpps (million packets per second) with about 12 CPU cores. The network administrator has to pin CPU cores to the datapath processing tasks to ensure predictable performance.
2. Software Acceleration (DPDK): For higher performance, the datapaths may execute in the Linux user space, using the Data Plane Development Kit (DPDK) running on x86 CPU cores. More CPU cores can be allocated to the user space datapath processing to enable higher performance, but in most cases performance caps off at less than 8 Mpps with about 12 CPU cores. Again, the network administrator has to pin CPU cores to the datapath processing tasks to ensure predictable performance.
3. Hardware Assisted Bypass (PCIe Passthrough or SR-IOV): In the case of PCIe Passthrough, the entire PCIe device is mapped to the VM, bypassing the hypervisor. In the case of SR-IOV, the hardware and the firmware provide a mechanism to partition the device so that it can be used by multiple VMs at the same time. SR-IOV is the technology that is relevant to virtualization in this context. In this case, networking services in the OVS and vRouter datapaths become unavailable.
4. Hardware Accelerated (SmartNICs): For even higher performance, the datapaths may execute in a SmartNIC such as Netronome’s Agilio platform, which can achieve up to 28 Mpps, consuming only one CPU core for control plane processing related to the OVS or vRouter datapath. Since the datapath runs in dedicated CPU cores in the SmartNIC, performance is predictable and no extra provisioning tasks are required from the network administrator. All networking services in the OVS and vRouter datapath remain available.
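To put the figures quoted above side by side, here is a minimal Python sketch that works out throughput per host CPU core for the three modes that come with numbers. The Mpps values are the upper bounds stated in the list (the kernel and DPDK figures are "less than" values, so the real ratios are lower), SR-IOV is left out because no figure is quoted for it, and the class, the function and the 24-core server are hypothetical names and values chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatapathMode:
    name: str
    peak_mpps: float  # upper bound quoted in the text, not a measurement
    host_cores: int   # x86 cores consumed on the host, as quoted in the text

    @property
    def mpps_per_core(self) -> float:
        """Throughput delivered per host core consumed by the datapath."""
        return self.peak_mpps / self.host_cores


# Figures taken from the list above.
MODES = [
    DatapathMode("kernel", 5.0, 12),
    DatapathMode("dpdk", 8.0, 12),
    DatapathMode("smartnic", 28.0, 1),
]


def cores_left_for_vms(mode: DatapathMode, total_cores: int) -> int:
    """Host cores still free for VMs after the datapath takes its share."""
    return total_cores - mode.host_cores


if __name__ == "__main__":
    total = 24  # assumption: a hypothetical 24-core server
    for mode in MODES:
        print(
            f"{mode.name:9s} {mode.mpps_per_core:6.2f} Mpps/core, "
            f"{cores_left_for_vms(mode, total)} cores left for VMs"
        )
```

On that hypothetical server, the kernel and DPDK datapaths each leave half of the cores for VMs, while the SmartNIC case leaves all but one.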

The I/O Interface between the VM and Host or Hypervisor. The data can be delivered to the VM from the OVS and vRouter datapaths in one of the following ways:

1. Virtio: In this case, VMs are completely hardware independent and can therefore be easily migrated across servers to boost server infrastructure efficiency. Applications running in all popular guest operating systems in the VMs require no change, making onboarding of customer or third party VMs easy. Live migration of VMs across servers is feasible. Networking services provided by the OVS and vRouter datapaths are available to the VMs. I/O bandwidth to and from the VM is lower than with DPDK and SR-IOV.
2. DPDK: In this case, VMs require a DPDK poll mode driver that is hardware independent. Applications need to be modified to leverage the performance benefits of DPDK, and as a result, onboarding of customer and third party VMs and applications is not as seamless. Live migration of VMs across servers is feasible. Networking services provided by the OVS and vRouter datapaths are available to the VMs, but only in a limited way if the user space DPDK datapath is used. Such services typically evolve rapidly in the kernel and need to be ported to user space, which may take time or be difficult to implement. (This deficiency of DPDK was underscored by the Linux kernel maintainer David Miller at the recent Netdev 1.2 conference in Tokyo, when he very aptly repeated multiple times that “DPDK is not Linux.”) I/O bandwidth to and from the VM is higher than with Virtio but significantly lower than with SR-IOV.
3. SR-IOV (Single Root I/O Virtualization): In this case, VMs require a hardware-dependent driver in the VM. Applications need not be modified to leverage the performance benefits of SR-IOV. Onboarding of customer and third party VMs and applications is not possible unless the vendor hardware driver is available in the guest operating system. Live migration of VMs across servers is not feasible. Networking services provided by the OVS and vRouter datapaths are not available to the VMs if the datapaths are implemented in the kernel space or in user space with DPDK, but they are available if the datapaths are implemented in a SmartNIC. I/O bandwidth to and from the VM is the highest using SR-IOV.
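The trade-offs among the three interfaces can be captured in a small lookup table. The Python sketch below is only an illustration of the list above: the field names, the 1-to-3 bandwidth ranking (ordering only, not measured values) and the `candidates` helper are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VmInterface:
    name: str
    hardware_independent: bool  # driver in the VM is not tied to a NIC vendor
    unmodified_apps: bool       # applications run without changes
    live_migration: bool        # VMs can be live migrated across servers
    bandwidth_rank: int         # 1 = lowest, 3 = highest (ordering only)


# Properties as described in the list above.
INTERFACES = [
    VmInterface("virtio", True, True, True, 1),
    VmInterface("dpdk", True, False, True, 2),
    VmInterface("sr-iov", False, True, False, 3),
]


def candidates(need_live_migration: bool,
               need_unmodified_apps: bool,
               min_bandwidth_rank: int) -> list:
    """Return the names of the interfaces that meet every stated requirement."""
    return [
        i.name
        for i in INTERFACES
        if (i.live_migration or not need_live_migration)
        and (i.unmodified_apps or not need_unmodified_apps)
        and i.bandwidth_rank >= min_bandwidth_rank
    ]


if __name__ == "__main__":
    # A VM that must be migratable, run unmodified apps, and needs
    # more bandwidth than Virtio delivers.
    print(candidates(True, True, 2))
```

Asking for live migration, unmodified applications and better-than-Virtio bandwidth at the same time returns an empty list, which is the gap the rest of this post is about.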

Adverse effects on data center operational efficiency

When operators have to deal with different VM profiles, it is impossible to achieve an optimal, homogeneous end-to-end configuration across all servers. For example, the operator could take different paths based on the datapath options in the host/hypervisor and the I/O interface between the VM and the host/hypervisor:

1. Figure 1 - OVS in Host + Virtio in VM: One option is to use the kernel datapath for OVS or vRouter and Virtio-based delivery of data to VMs across all servers. The result is a homogeneous server deployment managed using OpenStack. The challenge here is poor performance for the VMs that require higher packets per second, or not having enough CPU cores left to deploy an adequate number of VMs in the server. This results in poor SLAs and server sprawl.


OVS in host and Virtio in VM


2. Figure 2 - OVS-DPDK in Host + DPDK PMD or Virtio to VM: A second option is to use the DPDK user space datapath for OVS or vRouter and DPDK or Virtio-based delivery of data to VMs across all servers. The result is a homogeneous server deployment managed with OpenStack. The challenge here is mediocre performance for the VMs that require higher packets per second, or not having enough CPU cores left to deploy an adequate number of VMs in the server. Also, to deliver adequate performance, the administrator has to pin a different number of cores to the DPDK datapath on each server. As a result, cores could be wasted on some servers while performance is inadequate on others. Finding an efficient way to configure DPDK with the right number of cores per server can become a nightmare, driving operational costs higher. (This challenge is further explained in the next section.)

OVS-DPDK in Host and DPDK PMD or Virtio to VM


3. Figure 3 - Traditional NIC + SR-IOV: A third option is to create a silo of servers that are configured using SR-IOV to deliver the highest performance. If the datapaths are running in kernel or user space, as discussed earlier, all SDN-based services provided by OVS or vRouter are lost. VMs on this silo of servers cannot be migrated. This makes efficient management of the servers difficult, taking operational costs higher.

Traditional NIC and SR-IOV


4. Figure 4 - SmartNIC + SR-IOV: A fourth option is to create a silo of servers that are configured using SR-IOV to deliver the highest performance. The datapath runs in a SmartNIC, as discussed earlier, and all SDN-based services provided by OVS or vRouter are kept intact. VMs on this silo of servers cannot be migrated, however. This makes efficient management of the servers difficult, taking operational costs higher.

SmartNIC and SR-IOV


As can be seen, none of the above options holistically meets the performance, flexibility, networking services and resource utilization requirements of the VMs and the servers they are hosted in. Data center operators end up selecting multiple options, which ultimately means a heterogeneous environment with high CAPEX and OPEX. This is depicted in Figure 5.


Server configuration Diagram


Figure 5: Servers are configured differently to service different VM profiles. VMs either cannot be migrated at all or are hard to migrate.


In the next and final part of this blog, my colleague Abhijeet Prabhune will write about solutions to the operational challenges I highlighted here.

Read the Blog, "The Case for Express Virtio (XVIO) - Part 2" by Abhijeet Prabhune.