SlideShare a Scribd company logo
1 of 34
Download to read offline
Receive side scaling (RSS) with
eBPF in QEMU and virtio-net
Yan Vugenfirer - CEO, Daynix
Agenda
• What is RSS?


• History: RSS and virtio-net


• What is eBPF?


• Using eBPF for packet steering (RSS) in
virtio-net
What is RSS?
• Receive side scaling - distribution of packets’ processing
among CPUs


• A NIC uses a hashing function to compute a hash value
over a defined area


• Hash value is used to index an indirection table


• The values in the indirection table are used to assign the
received data to a CPU


• With MSI support, a NIC can also interrupt the associated
CPU
CPU1 CPU2 CPU3
CPU0
What is RSS?
NIC
RX	handling RX	handling RX	handling RX	handling
NIC	Driver
HW	
queue
HW	
queue
HW	
queue
HW	
queue
Packet	classification
What is RSS?
Packet	
header
Hash	
function	
(Toeplitz)
QueueCPU	
number
Redirection	table	(up	to	128	entries)
History: RSS and virtio-net
• Let’s use RSS with virtio-net!


• Utilisation CPUs for packet processing


• Cash locality for network applications


• Microsoft WHQL requirement for high speed
devices
History: RSS and virtio-net
• No multi-queue in virtio


• SW implementation in Windows guest driver (similar to RFS in Linux)
CPU1 CPU2 CPU3
CPU0
virtio-net
RX	
handling
RX	
handling
RX	
handling
NetKVM	
RX	virt-queue
Packet	classification
RX	handling
History: RSS and virtio-net
• virtio-net became multi queue device


• Due to Windows requirements - hybrid
model. Interrupt received on specific CPU
core, but could be rescheduled to another


• Works good for TCP


• Might not work for UDP


• Support legacy interrupts for old OSes
History: RSS and virtio-net
CPU1
CPU0
virtio-net
NetKVM	
RX	virt-queue
Packet	classification
RX	handling
RX	virt-queue
Packet	classification
RX	handling
History: RSS and virtio-net
• virtio spec changes


• Set steering mode


• Pass the device redirection tables


• Set hash value in virtio-net-hdr


• No inter-processor interrupts due to re-scheduling


• Vision: HW will do all the heavy work


• Implementations


• SW only POC in QEMU


• eBPF
virtio spec changes - capabilities
• VIRTIO_NET_F_RSS


• VIRTIO_NET_F_MQ must be set
virtio spec changes - device configuration
• virtio_net_config
struct virtio_net_con
fi
g
{

u8 mac[6]
;

le16 status
;

le16 max_virtqueue_pairs
;

le16 mtu
;

le32 speed
;

u8 duplex
;

u8 rss_max_key_size
;

le16 rss_max_indirection_table_length
;

le32 supported_hash_types
;

};
virtio spec changes - setting RSS parameters
• VIRTIO_NET_CTRL_MQ_RSS_CONFIG
struct virtio_net_rss_con
fi
g
{

le32 hash_types
;

le16 indirection_table_mask
;

le16 unclassi
fi
ed_queue
;

le16 indirection_table[indirection_table_length]
;

le16 max_tx_vq
;

u8 hash_key_length
;

u8 hash_key_data[hash_key_length]
;

};
virtio spec changes - virtio-net-hdr
struct virtio_net_hdr
{

u8
fl
ags
;

u8 gso_type
;

le16 hdr_len
;

le16 gso_size
;

le16 csum_start
;

le16 csum_offset
;

le16 num_buffers
;

le32 hash_value; (Only if
VIRTIO_NET_F_HASH_REPORT negotiated
)

le16 hash_report; (Only if
VIRTIO_NET_F_HASH_REPORT negotiated
)

le16 padding_reserved; (Only if
VIRTIO_NET_F_HASH_REPORT negotiated
)

};
What is eBPF?
• Enable running sandboxed code in Linux
kernel


• The code can be loaded at run time


• Used for security, tracing, networking,
observability
How can eBPF help us?
• Calculate the RSS hash and return the
queue index for incoming packets


• Populate the hash value in virtio_net_hdr
(work in progress)
The “magic”
• Loading eBPF program using IOCTL
TUNSETSTEERINGEBPF


• tun_struct has steering_prog field


• If eBPF program for steering is loaded,
tun_select_queue will call it with
bpf_prog_run_clear_cb
Hash population (work in progress)
• Population from eBPF program


• virtio_net_hdr with additional fields


• Work in progress in kernel


• Enlarge virtio_net_hdr in all kernel
modules


• Keep calculated hash in SKB and copy
it to virtio_net_hdr
eBPF program source in QEMU
• tun_rss_steering_prog


• tools/ebpf/rss.bpf.c


• Use clang to compile


• tools/ebpf/Makefile.ebpf
eBPF program skeleton
• During QEMU compilation include file is populated
with the compiled binary


• bpftool gen skeleton rss.bpf.o > rss.bpf.skeleton.h


• Helpers to initialise maps (mechanism to share data
between eBPF program and kerneluserspace)


• Some changes to support libvirt - mmapping the
shared data structure to user space (3 maps in
current main branch without mmaping, 1 map in
pending patches)
Configuration map
• The configuration map is BPF array map that
contains everything required for RSS:


• Supported hash flows: IPv4, TCPv4, UDPv4,
IPv6, IPv6ex, TCPv6, UDPv6


• Indirections table size (max 128)


• Default queue


• Toeplitz hash key - 40 bytes


• Indirections table - 128 entries
Loading eBPF program
• Two mechanisms


• QEMU using function in skeleton file.
Calling bpf syscall


• eBPF helper program (with libvirt) -
QEMU gets file descriptors from libvirt
with already loaded ebpf program and
mapping of the ebpf map (patches under
review)
Loading eBPF program
• Possible load failures


• Kernel support. Current solution requires 5.8+


• Without helper


• QEMU process capabilities: CAP_BPF,
CAP_NET_ADMIN


• sysctl kernel.unprivileged_bpf_disabled=1


• libbpf not present


• In case of helper usage - mismatch between helper and
QEMU


• Stamp is a hash of skeleton include file
Fallback
• Built it QEMU RSS steering


• Can be triggered also by live migration


• Hash population is enabled in QEMU
command line, because there is still not
hash population from eBPF program
Live migration
• Known issue: migrating to old kernel


• eBPF load failure


• Fallback to in-QEMU RSS steering
QEMU command line
• Multi-queue should be enabled


• -smp with vCPU for each queue-pair


• -device virtio-net-pci,
rss=on,hash=on,ebpf_rss_fds=<fd0,fd1>
QEMU command line
• rss=on


• Try to load eBPF from skeleton or by using
provided file descriptors


• Fallback to “built-in” RSS steering in QEMU if
cannot load eBPF program


• hash=on


• Populate hash in virtio_net_hdr


• ebpf_rss_fds - optional, provide file descriptors for
eBPF program and map
libvirt integration
• QEMU should run with least possible privileges


• eBPF helper


• Stamping the helper during compilation time


• Redirection table mapping


• Additional command line options to provide file
descriptors to QEMU


• Patches under review
Current status
• Initial support was merged to QEMU


• libvirt integration patches in QEMU and
libvirt are under discussion on mailing lists


• Hash population by eBPF program -
pending additional work for next set of
patches
Pending patches
• QEMU libvirt integration: https://lists.nongnu.org/
archive/html/qemu-devel/2021-07/msg03535.html


• libvirt patches: https://listman.redhat.com/archives/
libvir-list/2021-July/msg00836.html


• RSS support in Linux virtio-net driver: https://
lists.linuxfoundation.org/pipermail/virtualization/
2021-August/055940.html


• In kernel hash calculation reporting to guest driver:
https://lkml.org/lkml/2021/1/12/1329
virtio-net and eBPF future
• Packet filtering with vhost


• Security?
yan@daynix.com
Links
• https://www.kernel.org/doc/Documentation/networking/scaling.txt


• https://docs.microsoft.com/en-us/windows-hardware/drivers/network/
introduction-to-receive-side-scaling


• https://ebpf.io


• https://docs.microsoft.com/en-us/windows-hardware/drivers/network/rss-
with-a-single-hardware-receive-queue


• https://docs.microsoft.com/en-us/windows-hardware/drivers/network/rss-
with-hardware-queuing


• https://docs.microsoft.com/en-us/windows-hardware/drivers/network/rss-
with-message-signaled-interrupts


• https://docs.microsoft.com/en-us/windows-hardware/drivers/network/rss-
hashing-functions

More Related Content

What's hot

Xdp and ebpf_maps
Xdp and ebpf_mapsXdp and ebpf_maps
Xdp and ebpf_mapslcplcp1
 
Boosting I/O Performance with KVM io_uring
Boosting I/O Performance with KVM io_uringBoosting I/O Performance with KVM io_uring
Boosting I/O Performance with KVM io_uringShapeBlue
 
Understanding eBPF in a Hurry!
Understanding eBPF in a Hurry!Understanding eBPF in a Hurry!
Understanding eBPF in a Hurry!Ray Jenkins
 
LinuxCon 2015 Linux Kernel Networking Walkthrough
LinuxCon 2015 Linux Kernel Networking WalkthroughLinuxCon 2015 Linux Kernel Networking Walkthrough
LinuxCon 2015 Linux Kernel Networking WalkthroughThomas Graf
 
Linux kernel debugging
Linux kernel debuggingLinux kernel debugging
Linux kernel debuggingHao-Ran Liu
 
KVM tools and enterprise usage
KVM tools and enterprise usageKVM tools and enterprise usage
KVM tools and enterprise usagevincentvdk
 
Fun with Network Interfaces
Fun with Network InterfacesFun with Network Interfaces
Fun with Network InterfacesKernel TLV
 
Extreme Linux Performance Monitoring and Tuning
Extreme Linux Performance Monitoring and TuningExtreme Linux Performance Monitoring and Tuning
Extreme Linux Performance Monitoring and TuningMilind Koyande
 
BPF - in-kernel virtual machine
BPF - in-kernel virtual machineBPF - in-kernel virtual machine
BPF - in-kernel virtual machineAlexei Starovoitov
 
EBPF and Linux Networking
EBPF and Linux NetworkingEBPF and Linux Networking
EBPF and Linux NetworkingPLUMgrid
 
Linux MMAP & Ioremap introduction
Linux MMAP & Ioremap introductionLinux MMAP & Ioremap introduction
Linux MMAP & Ioremap introductionGene Chang
 
Kdump and the kernel crash dump analysis
Kdump and the kernel crash dump analysisKdump and the kernel crash dump analysis
Kdump and the kernel crash dump analysisBuland Singh
 
Enable DPDK and SR-IOV for containerized virtual network functions with zun
Enable DPDK and SR-IOV for containerized virtual network functions with zunEnable DPDK and SR-IOV for containerized virtual network functions with zun
Enable DPDK and SR-IOV for containerized virtual network functions with zunheut2008
 
COSCUP 2020 RISC-V 32 bit linux highmem porting
COSCUP 2020 RISC-V 32 bit linux highmem portingCOSCUP 2020 RISC-V 32 bit linux highmem porting
COSCUP 2020 RISC-V 32 bit linux highmem portingEric Lin
 

What's hot (20)

Xdp and ebpf_maps
Xdp and ebpf_mapsXdp and ebpf_maps
Xdp and ebpf_maps
 
Dpdk pmd
Dpdk pmdDpdk pmd
Dpdk pmd
 
Boosting I/O Performance with KVM io_uring
Boosting I/O Performance with KVM io_uringBoosting I/O Performance with KVM io_uring
Boosting I/O Performance with KVM io_uring
 
Understanding eBPF in a Hurry!
Understanding eBPF in a Hurry!Understanding eBPF in a Hurry!
Understanding eBPF in a Hurry!
 
LinuxCon 2015 Linux Kernel Networking Walkthrough
LinuxCon 2015 Linux Kernel Networking WalkthroughLinuxCon 2015 Linux Kernel Networking Walkthrough
LinuxCon 2015 Linux Kernel Networking Walkthrough
 
Linux kernel debugging
Linux kernel debuggingLinux kernel debugging
Linux kernel debugging
 
KVM tools and enterprise usage
KVM tools and enterprise usageKVM tools and enterprise usage
KVM tools and enterprise usage
 
Qemu Pcie
Qemu PcieQemu Pcie
Qemu Pcie
 
DPDK In Depth
DPDK In DepthDPDK In Depth
DPDK In Depth
 
Fun with Network Interfaces
Fun with Network InterfacesFun with Network Interfaces
Fun with Network Interfaces
 
Extreme Linux Performance Monitoring and Tuning
Extreme Linux Performance Monitoring and TuningExtreme Linux Performance Monitoring and Tuning
Extreme Linux Performance Monitoring and Tuning
 
Understanding DPDK
Understanding DPDKUnderstanding DPDK
Understanding DPDK
 
Userspace networking
Userspace networkingUserspace networking
Userspace networking
 
BPF - in-kernel virtual machine
BPF - in-kernel virtual machineBPF - in-kernel virtual machine
BPF - in-kernel virtual machine
 
EBPF and Linux Networking
EBPF and Linux NetworkingEBPF and Linux Networking
EBPF and Linux Networking
 
Linux MMAP & Ioremap introduction
Linux MMAP & Ioremap introductionLinux MMAP & Ioremap introduction
Linux MMAP & Ioremap introduction
 
Kdump and the kernel crash dump analysis
Kdump and the kernel crash dump analysisKdump and the kernel crash dump analysis
Kdump and the kernel crash dump analysis
 
Enable DPDK and SR-IOV for containerized virtual network functions with zun
Enable DPDK and SR-IOV for containerized virtual network functions with zunEnable DPDK and SR-IOV for containerized virtual network functions with zun
Enable DPDK and SR-IOV for containerized virtual network functions with zun
 
COSCUP 2020 RISC-V 32 bit linux highmem porting
COSCUP 2020 RISC-V 32 bit linux highmem portingCOSCUP 2020 RISC-V 32 bit linux highmem porting
COSCUP 2020 RISC-V 32 bit linux highmem porting
 
Dpdk performance
Dpdk performanceDpdk performance
Dpdk performance
 

Similar to Receive side scaling (RSS) with eBPF in QEMU and virtio-net

Fastsocket Linxiaofeng
Fastsocket LinxiaofengFastsocket Linxiaofeng
Fastsocket LinxiaofengMichael Zhang
 
Install FD.IO VPP On Intel(r) Architecture & Test with Trex*
Install FD.IO VPP On Intel(r) Architecture & Test with Trex*Install FD.IO VPP On Intel(r) Architecture & Test with Trex*
Install FD.IO VPP On Intel(r) Architecture & Test with Trex*Michelle Holley
 
Project Slides for Website 2020-22.pptx
Project Slides for Website 2020-22.pptxProject Slides for Website 2020-22.pptx
Project Slides for Website 2020-22.pptxAkshitAgiwal1
 
CETH for XDP [Linux Meetup Santa Clara | July 2016]
CETH for XDP [Linux Meetup Santa Clara | July 2016] CETH for XDP [Linux Meetup Santa Clara | July 2016]
CETH for XDP [Linux Meetup Santa Clara | July 2016] IO Visor Project
 
VMworld 2016: vSphere 6.x Host Resource Deep Dive
VMworld 2016: vSphere 6.x Host Resource Deep DiveVMworld 2016: vSphere 6.x Host Resource Deep Dive
VMworld 2016: vSphere 6.x Host Resource Deep DiveVMworld
 
Assisting User’s Transition to Titan’s Accelerated Architecture
Assisting User’s Transition to Titan’s Accelerated ArchitectureAssisting User’s Transition to Titan’s Accelerated Architecture
Assisting User’s Transition to Titan’s Accelerated Architectureinside-BigData.com
 
BKK16-103 OpenCSD - Open for Business!
BKK16-103 OpenCSD - Open for Business!BKK16-103 OpenCSD - Open for Business!
BKK16-103 OpenCSD - Open for Business!Linaro
 
Load Balancing
Load BalancingLoad Balancing
Load Balancingoptalink
 
Spy hard, challenges of 100G deep packet inspection on x86 platform
Spy hard, challenges of 100G deep packet inspection on x86 platformSpy hard, challenges of 100G deep packet inspection on x86 platform
Spy hard, challenges of 100G deep packet inspection on x86 platformRedge Technologies
 
Scaling ingest pipelines with high performance computing principles - Rajiv K...
Scaling ingest pipelines with high performance computing principles - Rajiv K...Scaling ingest pipelines with high performance computing principles - Rajiv K...
Scaling ingest pipelines with high performance computing principles - Rajiv K...SignalFx
 
Introduction to ARM big.LITTLE technology
Introduction to ARM big.LITTLE technologyIntroduction to ARM big.LITTLE technology
Introduction to ARM big.LITTLE technology義洋 顏
 
Revisão: Forwarding Metamorphosis: Fast Programmable Match-Action Processing ...
Revisão: Forwarding Metamorphosis: Fast Programmable Match-Action Processing ...Revisão: Forwarding Metamorphosis: Fast Programmable Match-Action Processing ...
Revisão: Forwarding Metamorphosis: Fast Programmable Match-Action Processing ...Bruno Castelucci
 
DPDK Summit 2015 - Aspera - Charles Shiflett
DPDK Summit 2015 - Aspera - Charles ShiflettDPDK Summit 2015 - Aspera - Charles Shiflett
DPDK Summit 2015 - Aspera - Charles ShiflettJim St. Leger
 
2016 NCTU P4 Workshop
2016 NCTU P4 Workshop2016 NCTU P4 Workshop
2016 NCTU P4 WorkshopYi Tseng
 

Similar to Receive side scaling (RSS) with eBPF in QEMU and virtio-net (20)

Fastsocket Linxiaofeng
Fastsocket LinxiaofengFastsocket Linxiaofeng
Fastsocket Linxiaofeng
 
eBPF Basics
eBPF BasicseBPF Basics
eBPF Basics
 
Install FD.IO VPP On Intel(r) Architecture & Test with Trex*
Install FD.IO VPP On Intel(r) Architecture & Test with Trex*Install FD.IO VPP On Intel(r) Architecture & Test with Trex*
Install FD.IO VPP On Intel(r) Architecture & Test with Trex*
 
Project Slides for Website 2020-22.pptx
Project Slides for Website 2020-22.pptxProject Slides for Website 2020-22.pptx
Project Slides for Website 2020-22.pptx
 
Linux Kernel Live Patching
Linux Kernel Live PatchingLinux Kernel Live Patching
Linux Kernel Live Patching
 
CETH for XDP [Linux Meetup Santa Clara | July 2016]
CETH for XDP [Linux Meetup Santa Clara | July 2016] CETH for XDP [Linux Meetup Santa Clara | July 2016]
CETH for XDP [Linux Meetup Santa Clara | July 2016]
 
VMworld 2016: vSphere 6.x Host Resource Deep Dive
VMworld 2016: vSphere 6.x Host Resource Deep DiveVMworld 2016: vSphere 6.x Host Resource Deep Dive
VMworld 2016: vSphere 6.x Host Resource Deep Dive
 
Assisting User’s Transition to Titan’s Accelerated Architecture
Assisting User’s Transition to Titan’s Accelerated ArchitectureAssisting User’s Transition to Titan’s Accelerated Architecture
Assisting User’s Transition to Titan’s Accelerated Architecture
 
15 ia64
15 ia6415 ia64
15 ia64
 
ch2.pptx
ch2.pptxch2.pptx
ch2.pptx
 
Ebpf ovsconf-2016
Ebpf ovsconf-2016Ebpf ovsconf-2016
Ebpf ovsconf-2016
 
BKK16-103 OpenCSD - Open for Business!
BKK16-103 OpenCSD - Open for Business!BKK16-103 OpenCSD - Open for Business!
BKK16-103 OpenCSD - Open for Business!
 
Load Balancing
Load BalancingLoad Balancing
Load Balancing
 
Building a Router
Building a RouterBuilding a Router
Building a Router
 
Spy hard, challenges of 100G deep packet inspection on x86 platform
Spy hard, challenges of 100G deep packet inspection on x86 platformSpy hard, challenges of 100G deep packet inspection on x86 platform
Spy hard, challenges of 100G deep packet inspection on x86 platform
 
Scaling ingest pipelines with high performance computing principles - Rajiv K...
Scaling ingest pipelines with high performance computing principles - Rajiv K...Scaling ingest pipelines with high performance computing principles - Rajiv K...
Scaling ingest pipelines with high performance computing principles - Rajiv K...
 
Introduction to ARM big.LITTLE technology
Introduction to ARM big.LITTLE technologyIntroduction to ARM big.LITTLE technology
Introduction to ARM big.LITTLE technology
 
Revisão: Forwarding Metamorphosis: Fast Programmable Match-Action Processing ...
Revisão: Forwarding Metamorphosis: Fast Programmable Match-Action Processing ...Revisão: Forwarding Metamorphosis: Fast Programmable Match-Action Processing ...
Revisão: Forwarding Metamorphosis: Fast Programmable Match-Action Processing ...
 
DPDK Summit 2015 - Aspera - Charles Shiflett
DPDK Summit 2015 - Aspera - Charles ShiflettDPDK Summit 2015 - Aspera - Charles Shiflett
DPDK Summit 2015 - Aspera - Charles Shiflett
 
2016 NCTU P4 Workshop
2016 NCTU P4 Workshop2016 NCTU P4 Workshop
2016 NCTU P4 Workshop
 

More from Yan Vugenfirer

HCK-CI: Enabling CI for Windows Guest Paravirtualized Drivers - Kostiantyn Ko...
HCK-CI: Enabling CI for Windows Guest Paravirtualized Drivers - Kostiantyn Ko...HCK-CI: Enabling CI for Windows Guest Paravirtualized Drivers - Kostiantyn Ko...
HCK-CI: Enabling CI for Windows Guest Paravirtualized Drivers - Kostiantyn Ko...Yan Vugenfirer
 
Implementing SR-IOv failover for Windows guests during live migration
Implementing SR-IOv failover for Windows guests during live migrationImplementing SR-IOv failover for Windows guests during live migration
Implementing SR-IOv failover for Windows guests during live migrationYan Vugenfirer
 
Qemu device prototyping
Qemu device prototypingQemu device prototyping
Qemu device prototypingYan Vugenfirer
 
Windows network teaming
Windows network teamingWindows network teaming
Windows network teamingYan Vugenfirer
 
Rebuild presentation - IoT Israel MeetUp
Rebuild presentation - IoT Israel MeetUpRebuild presentation - IoT Israel MeetUp
Rebuild presentation - IoT Israel MeetUpYan Vugenfirer
 
Rebuild presentation during Docker's Birthday party
Rebuild presentation during Docker's Birthday partyRebuild presentation during Docker's Birthday party
Rebuild presentation during Docker's Birthday partyYan Vugenfirer
 
Contributing to open source using Git
Contributing to open source using GitContributing to open source using Git
Contributing to open source using GitYan Vugenfirer
 
Microsoft Hardware Certification Kit (HCK) setup
Microsoft Hardware Certification Kit (HCK) setupMicrosoft Hardware Certification Kit (HCK) setup
Microsoft Hardware Certification Kit (HCK) setupYan Vugenfirer
 
Building “old” Windows drivers (XP, Vista, 2003 and 2008) with Visual Studio ...
Building “old” Windows drivers (XP, Vista, 2003 and 2008) with Visual Studio ...Building “old” Windows drivers (XP, Vista, 2003 and 2008) with Visual Studio ...
Building “old” Windows drivers (XP, Vista, 2003 and 2008) with Visual Studio ...Yan Vugenfirer
 
Advanced NDISTest options
Advanced NDISTest optionsAdvanced NDISTest options
Advanced NDISTest optionsYan Vugenfirer
 
QEMU Development and Testing Automation Using MS HCK - Anton Nayshtut and Yan...
QEMU Development and Testing Automation Using MS HCK - Anton Nayshtut and Yan...QEMU Development and Testing Automation Using MS HCK - Anton Nayshtut and Yan...
QEMU Development and Testing Automation Using MS HCK - Anton Nayshtut and Yan...Yan Vugenfirer
 
Windows guest debugging presentation from KVM Forum 2012
Windows guest debugging presentation from KVM Forum 2012Windows guest debugging presentation from KVM Forum 2012
Windows guest debugging presentation from KVM Forum 2012Yan Vugenfirer
 

More from Yan Vugenfirer (14)

HCK-CI: Enabling CI for Windows Guest Paravirtualized Drivers - Kostiantyn Ko...
HCK-CI: Enabling CI for Windows Guest Paravirtualized Drivers - Kostiantyn Ko...HCK-CI: Enabling CI for Windows Guest Paravirtualized Drivers - Kostiantyn Ko...
HCK-CI: Enabling CI for Windows Guest Paravirtualized Drivers - Kostiantyn Ko...
 
Implementing SR-IOv failover for Windows guests during live migration
Implementing SR-IOv failover for Windows guests during live migrationImplementing SR-IOv failover for Windows guests during live migration
Implementing SR-IOv failover for Windows guests during live migration
 
Qemu device prototyping
Qemu device prototypingQemu device prototyping
Qemu device prototyping
 
Windows network teaming
Windows network teamingWindows network teaming
Windows network teaming
 
Rebuild presentation - IoT Israel MeetUp
Rebuild presentation - IoT Israel MeetUpRebuild presentation - IoT Israel MeetUp
Rebuild presentation - IoT Israel MeetUp
 
Rebuild presentation during Docker's Birthday party
Rebuild presentation during Docker's Birthday partyRebuild presentation during Docker's Birthday party
Rebuild presentation during Docker's Birthday party
 
Contributing to open source using Git
Contributing to open source using GitContributing to open source using Git
Contributing to open source using Git
 
Introduction to Git
Introduction to GitIntroduction to Git
Introduction to Git
 
Microsoft Hardware Certification Kit (HCK) setup
Microsoft Hardware Certification Kit (HCK) setupMicrosoft Hardware Certification Kit (HCK) setup
Microsoft Hardware Certification Kit (HCK) setup
 
UsbDk at a Glance 
UsbDk at a Glance UsbDk at a Glance 
UsbDk at a Glance 
 
Building “old” Windows drivers (XP, Vista, 2003 and 2008) with Visual Studio ...
Building “old” Windows drivers (XP, Vista, 2003 and 2008) with Visual Studio ...Building “old” Windows drivers (XP, Vista, 2003 and 2008) with Visual Studio ...
Building “old” Windows drivers (XP, Vista, 2003 and 2008) with Visual Studio ...
 
Advanced NDISTest options
Advanced NDISTest optionsAdvanced NDISTest options
Advanced NDISTest options
 
QEMU Development and Testing Automation Using MS HCK - Anton Nayshtut and Yan...
QEMU Development and Testing Automation Using MS HCK - Anton Nayshtut and Yan...QEMU Development and Testing Automation Using MS HCK - Anton Nayshtut and Yan...
QEMU Development and Testing Automation Using MS HCK - Anton Nayshtut and Yan...
 
Windows guest debugging presentation from KVM Forum 2012
Windows guest debugging presentation from KVM Forum 2012Windows guest debugging presentation from KVM Forum 2012
Windows guest debugging presentation from KVM Forum 2012
 

Recently uploaded

Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVshikhaohhpro
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️Delhi Call girls
 
Pharm-D Biostatistics and Research methodology
Pharm-D Biostatistics and Research methodologyPharm-D Biostatistics and Research methodology
Pharm-D Biostatistics and Research methodologyAnusha Are
 
MarTech Trend 2024 Book : Marketing Technology Trends (2024 Edition) How Data...
MarTech Trend 2024 Book : Marketing Technology Trends (2024 Edition) How Data...MarTech Trend 2024 Book : Marketing Technology Trends (2024 Edition) How Data...
MarTech Trend 2024 Book : Marketing Technology Trends (2024 Edition) How Data...Jittipong Loespradit
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsAlberto González Trastoy
 
10 Trends Likely to Shape Enterprise Technology in 2024
10 Trends Likely to Shape Enterprise Technology in 202410 Trends Likely to Shape Enterprise Technology in 2024
10 Trends Likely to Shape Enterprise Technology in 2024Mind IT Systems
 
Right Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsRight Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsJhone kinadey
 
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...Health
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfkalichargn70th171
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providermohitmore19
 
%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein
%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein
%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfonteinmasabamasaba
 
Unlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language ModelsUnlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language Modelsaagamshah0812
 
%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein
%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein
%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfonteinmasabamasaba
 
Define the academic and professional writing..pdf
Define the academic and professional writing..pdfDefine the academic and professional writing..pdf
Define the academic and professional writing..pdfPearlKirahMaeRagusta1
 
8257 interfacing 2 in microprocessor for btech students
8257 interfacing 2 in microprocessor for btech students8257 interfacing 2 in microprocessor for btech students
8257 interfacing 2 in microprocessor for btech studentsHimanshiGarg82
 
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisamasabamasaba
 
Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...
Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...
Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...SelfMade bd
 
Payment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdf
Payment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdfPayment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdf
Payment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdfkalichargn70th171
 
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerHow To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerThousandEyes
 

Recently uploaded (20)

Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTV
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
 
Pharm-D Biostatistics and Research methodology
Pharm-D Biostatistics and Research methodologyPharm-D Biostatistics and Research methodology
Pharm-D Biostatistics and Research methodology
 
MarTech Trend 2024 Book : Marketing Technology Trends (2024 Edition) How Data...
MarTech Trend 2024 Book : Marketing Technology Trends (2024 Edition) How Data...MarTech Trend 2024 Book : Marketing Technology Trends (2024 Edition) How Data...
MarTech Trend 2024 Book : Marketing Technology Trends (2024 Edition) How Data...
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
 
10 Trends Likely to Shape Enterprise Technology in 2024
10 Trends Likely to Shape Enterprise Technology in 202410 Trends Likely to Shape Enterprise Technology in 2024
10 Trends Likely to Shape Enterprise Technology in 2024
 
Right Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsRight Money Management App For Your Financial Goals
Right Money Management App For Your Financial Goals
 
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service provider
 
%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein
%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein
%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein
 
Unlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language ModelsUnlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language Models
 
%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein
%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein
%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein
 
Define the academic and professional writing..pdf
Define the academic and professional writing..pdfDefine the academic and professional writing..pdf
Define the academic and professional writing..pdf
 
8257 interfacing 2 in microprocessor for btech students
8257 interfacing 2 in microprocessor for btech students8257 interfacing 2 in microprocessor for btech students
8257 interfacing 2 in microprocessor for btech students
 
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
 
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...
Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...
Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...
 
Payment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdf
Payment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdfPayment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdf
Payment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdf
 
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerHow To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
 

Receive side scaling (RSS) with eBPF in QEMU and virtio-net

  • 1. Receive side scaling (RSS) with eBPF in QEMU and virtio-net Yan Vugenfirer - CEO, Daynix
  • 2. Agenda • What is RSS? • History: RSS and virtio-net • What is eBPF? • Using eBPF for packet steering (RSS) in virtio-net
  • 3. What is RSS? • Receive side scaling - distribution of packets’ processing among CPUs • A NIC uses a hashing function to compute a hash value over a defined area • Hash value is used to index an indirection table • The values in the indirection table are used to assign the received data to a CPU • With MSI support, a NIC can also interrupt the associated CPU
  • 4. CPU1 CPU2 CPU3 CPU0 What is RSS? NIC RX handling RX handling RX handling RX handling NIC Driver HW queue HW queue HW queue HW queue Packet classification
  • 6. History: RSS and virtio-net • Let’s use RSS with virtio-net! • Utilisation CPUs for packet processing • Cash locality for network applications • Microsoft WHQL requirement for high speed devices
  • 7. History: RSS and virtio-net • No multi-queue in virtio • SW implementation in Windows guest driver (similar to RFS in Linux) CPU1 CPU2 CPU3 CPU0 virtio-net RX handling RX handling RX handling NetKVM RX virt-queue Packet classification RX handling
  • 8. History: RSS and virtio-net • virtio-net became multi queue device • Due to Windows requirements - hybrid model. Interrupt received on specific CPU core, but could be rescheduled to another • Works good for TCP • Might not work for UDP • Support legacy interrupts for old OSes
  • 9. History: RSS and virtio-net CPU1 CPU0 virtio-net NetKVM RX virt-queue Packet classification RX handling RX virt-queue Packet classification RX handling
  • 10. History: RSS and virtio-net • virtio spec changes • Set steering mode • Pass the device redirection tables • Set hash value in virtio-net-hdr • No inter-processor interrupts due to re-scheduling • Vision: HW will do all the heavy work • Implementations • SW only POC in QEMU • eBPF
  • 11. virtio spec changes - capabilities • VIRTIO_NET_F_RSS • VIRTIO_NET_F_MQ must be set
  • 12. virtio spec changes - device configuration • virtio_net_config struct virtio_net_con fi g { u8 mac[6] ; le16 status ; le16 max_virtqueue_pairs ; le16 mtu ; le32 speed ; u8 duplex ; u8 rss_max_key_size ; le16 rss_max_indirection_table_length ; le32 supported_hash_types ; };
  • 13. virtio spec changes - setting RSS parameters • VIRTIO_NET_CTRL_MQ_RSS_CONFIG struct virtio_net_rss_con fi g { le32 hash_types ; le16 indirection_table_mask ; le16 unclassi fi ed_queue ; le16 indirection_table[indirection_table_length] ; le16 max_tx_vq ; u8 hash_key_length ; u8 hash_key_data[hash_key_length] ; };
  • 14. virtio spec changes - virtio-net-hdr struct virtio_net_hdr { u8 fl ags ; u8 gso_type ; le16 hdr_len ; le16 gso_size ; le16 csum_start ; le16 csum_offset ; le16 num_buffers ; le32 hash_value; (Only if VIRTIO_NET_F_HASH_REPORT negotiated ) le16 hash_report; (Only if VIRTIO_NET_F_HASH_REPORT negotiated ) le16 padding_reserved; (Only if VIRTIO_NET_F_HASH_REPORT negotiated ) };
  • 15. What is eBPF? • Enable running sandboxed code in Linux kernel • The code can be loaded at run time • Used for security, tracing, networking, observability
  • 16. How can eBPF help us? • Calculate the RSS hash and return the queue index for incoming packets • Populate the hash value in virtio_net_hdr (work in progress)
  • 17. The “magic” • Loading eBPF program using IOCTL TUNSETSTEERINGEBPF • tun_struct has steering_prog field • If eBPF program for steering is loaded, tun_select_queue will call it with bpf_prog_run_clear_cb
  • 18. Hash population (work in progress) • Population from eBPF program • virtio_net_hdr with additional fields • Work in progress in kernel • Enlarge virtio_net_hdr in all kernel modules • Keep calculated hash in SKB and copy it to virtio_net_hdr
  • 19. eBPF program source in QEMU • tun_rss_steering_prog • tools/ebpf/rss.bpf.c • Use clang to compile • tools/ebpf/Makefile.ebpf
  • 20. eBPF program skeleton • During QEMU compilation include file is populated with the compiled binary • bpftool gen skeleton rss.bpf.o > rss.bpf.skeleton.h • Helpers to initialise maps (mechanism to share data between eBPF program and kerneluserspace) • Some changes to support libvirt - mmapping the shared data structure to user space (3 maps in current main branch without mmaping, 1 map in pending patches)
  • 21. Configuration map • The configuration map is BPF array map that contains everything required for RSS: • Supported hash flows: IPv4, TCPv4, UDPv4, IPv6, IPv6ex, TCPv6, UDPv6 • Indirections table size (max 128) • Default queue • Toeplitz hash key - 40 bytes • Indirections table - 128 entries
  • 22. Loading eBPF program • Two mechanisms • QEMU using function in skeleton file. Calling bpf syscall • eBPF helper program (with libvirt) - QEMU gets file descriptors from libvirt with already loaded ebpf program and mapping of the ebpf map (patches under review)
  • 23. Loading eBPF program • Possible load failures • Kernel support. Current solution requires 5.8+ • Without helper • QEMU process capabilities: CAP_BPF, CAP_NET_ADMIN • sysctl kernel.unprivileged_bpf_disabled=1 • libbpf not present • In case of helper usage - mismatch between helper and QEMU • Stamp is a hash of skeleton include file
  • 24. Fallback • Built it QEMU RSS steering • Can be triggered also by live migration • Hash population is enabled in QEMU command line, because there is still not hash population from eBPF program
  • 25. Live migration • Known issue: migrating to old kernel • eBPF load failure • Fallback to in-QEMU RSS steering
  • 26. QEMU command line • Multi-queue should be enabled • -smp with vCPU for each queue-pair • -device virtio-net-pci, rss=on,hash=on,ebpf_rss_fds=<fd0,fd1>
  • 27. QEMU command line • rss=on • Try to load eBPF from skeleton or by using provided file descriptors • Fallback to “built-in” RSS steering in QEMU if cannot load eBPF program • hash=on • Populate hash in virtio_net_hdr • ebpf_rss_fds - optional, provide file descriptors for eBPF program and map
  • 28. libvirt integration • QEMU should run with least possible privileges • eBPF helper • Stamping the helper during compilation time • Redirection table mapping • Additional command line options to provide file descriptors to QEMU • Patches under review
  • 29. Current status • Initial support was merged to QEMU • libvirt integration patches in QEMU and libvirt are under discussion on mailing lists • Hash population by eBPF program - pending additional work for next set of patches
  • 30. Pending patches • QEMU libvirt integration: https://lists.nongnu.org/ archive/html/qemu-devel/2021-07/msg03535.html • libvirt patches: https://listman.redhat.com/archives/ libvir-list/2021-July/msg00836.html • RSS support in Linux virtio-net driver: https:// lists.linuxfoundation.org/pipermail/virtualization/ 2021-August/055940.html • In kernel hash calculation reporting to guest driver: https://lkml.org/lkml/2021/1/12/1329
  • 31. virtio-net and eBPF future • Packet filtering with vhost • Security?
  • 33.
  • 34. Links • https://www.kernel.org/doc/Documentation/networking/scaling.txt • https://docs.microsoft.com/en-us/windows-hardware/drivers/network/ introduction-to-receive-side-scaling • https://ebpf.io • https://docs.microsoft.com/en-us/windows-hardware/drivers/network/rss- with-a-single-hardware-receive-queue • https://docs.microsoft.com/en-us/windows-hardware/drivers/network/rss- with-hardware-queuing • https://docs.microsoft.com/en-us/windows-hardware/drivers/network/rss- with-message-signaled-interrupts • https://docs.microsoft.com/en-us/windows-hardware/drivers/network/rss- hashing-functions