Setup
For the test a single WHLE board was used, identified as whle_ls1046_1, and a PC equipped with a network card with two 10G ports. The single PC was set up to emulate two separate machines serving as endpoints for the WHLE board to route between. Please note that the actual speed test results depend on the CPU power of the PC, which, for the tests to measure only undisturbed WHLE performance, must be able to handle traffic at both ends at a level exceeding the capability of a single WHLE board, with a good margin. In the tests carried out, a 12-core 4.5 GHz x86_64 machine was more than sufficient for the task.
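A quick way to confirm the PC side meets these requirements (a sketch; the interface names match those used in the setup below and may differ on other machines) is to check the reported link speed and core count:
root@PC:~# ethtool ens1f0 | grep Speed   # should report 10000Mb/s
root@PC:~# ethtool ens1f1 | grep Speed   # should report 10000Mb/s
root@PC:~# nproc                         # number of available CPU cores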
Before proceeding, please make sure you have followed the common setup described in WHLE-LS1046 kernel DPAA drivers.
Router
Connection diagram
The connection speed from the ens1f1 interface to ens1f0 on the PC will be measured, with whle_ls1046_1 configured as a router. isolated_ns denotes the network namespace in which the ens1f0 interface has to be enclosed in order to force the PC to send packets through whle_ls1046_1 instead of short-circuiting them to the local interface.
Network Setup
PC
root@PC:~# ip netns add isolated_ns
root@PC:~# ip link set ens1f0 netns isolated_ns
root@PC:~# ip netns exec isolated_ns ip addr flush ens1f0
root@PC:~# ip netns exec isolated_ns ip addr add 192.168.10.1/24 dev ens1f0
root@PC:~# ip netns exec isolated_ns ip addr add 192.168.11.1/24 dev ens1f0
root@PC:~# ip netns exec isolated_ns ip link set dev ens1f0 up    # moving an interface into a namespace brings it down
root@PC:~# ip netns exec isolated_ns ip route add 192.168.30.0/24 via 192.168.10.2
root@PC:~# ip netns exec isolated_ns ip route add 192.168.31.0/24 via 192.168.11.2
root@PC:~# ip addr flush ens1f1
root@PC:~# ip address add 192.168.30.2/24 dev ens1f1
root@PC:~# ip address add 192.168.31.2/24 dev ens1f1
root@PC:~# ip route add 192.168.10.0/24 via 192.168.30.1
root@PC:~# ip route add 192.168.11.0/24 via 192.168.31.1
whle_ls1046_1
root@whle-ls1046a:~# ip address flush eth1
root@whle-ls1046a:~# ip address flush eth2
root@whle-ls1046a:~# ip address flush eth3
root@whle-ls1046a:~# ip address flush eth5
root@whle-ls1046a:~# ip address flush eth4
root@whle-ls1046a:~# ip addr add 192.168.10.2/24 dev eth5
root@whle-ls1046a:~# ip addr add 192.168.11.2/24 dev eth5
root@whle-ls1046a:~# ip addr add 192.168.30.1/24 dev eth4
root@whle-ls1046a:~# ip addr add 192.168.31.1/24 dev eth4
root@whle-ls1046a:~# ip link set dev eth4 up
root@whle-ls1046a:~# ip link set dev eth5 up
root@whle-ls1046a:~# echo 1 > /proc/sys/net/ipv4/ip_forward
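Before starting the throughput tests it is worth confirming that packets actually traverse the router; a minimal check using the addresses configured above:
root@PC:~# ping -c 3 192.168.30.1                              # PC -> whle_ls1046_1 (eth4), direct
root@PC:~# ping -c 3 192.168.10.1                              # PC -> ens1f0 in isolated_ns, routed through whle_ls1046_1
root@PC:~# ip netns exec isolated_ns ping -c 3 192.168.31.2    # reverse direction, also routed through whle_ls1046_1
If the routed pings fail, make sure ens1f0 is up inside the namespace and that ip_forward is enabled on the board.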
Tests
Iperf servers
On PC, launch four instances of the iperf3 server, listening on ports 5201-5204. The ip netns exec command requires root access.
PC
root@PC:~# ip netns exec isolated_ns iperf3 -s -p 5201 &
root@PC:~# ip netns exec isolated_ns iperf3 -s -p 5202 &
root@PC:~# ip netns exec isolated_ns iperf3 -s -p 5203 &
root@PC:~# ip netns exec isolated_ns iperf3 -s -p 5204 &
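The servers can be verified to be listening inside the namespace, for example with ss from iproute2:
root@PC:~# ip netns exec isolated_ns ss -tlnp | grep iperf3    # should list ports 5201-5204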
Iperf clients
Launching the iperf3 clients is a bit more involved, as reproducibility requires controlling the source port, which is an ephemeral port assigned by the system from a specific range, usually 32768-60999 on Linux. This can be achieved by temporarily narrowing the range so that the iperf3 client has effectively only one port to choose from. Use the following helper script, which wraps the iperf3 call in the ephemeral port range setup:
iperfc.sh:
#!/usr/bin/bash
set -x

ip=$1
port=$2
label=$3

# Narrow the ephemeral port range to match the destination port, e.g. for
# port 5201 the range becomes 55200-55201, making the client's source port
# (and thus the RSS flow-to-CPU assignment) reproducible.
echo "5$(( port - 1 )) 5${port}" > /proc/sys/net/ipv4/ip_local_port_range

# Start the client in the background so the port range can be restored
# right after the connection has been made.
iperf3 -c ${ip} -p ${port} --time 0 --title ${label} --omit 5 &
sleep 1

# Restore the default Linux ephemeral port range.
echo "32768 60999" > /proc/sys/net/ipv4/ip_local_port_range
Run the clients simultaneously:
PC
root@PC:~# (
./iperfc.sh 192.168.11.1 5201 A
./iperfc.sh 192.168.10.1 5202 B
./iperfc.sh 192.168.11.1 5203 C
./iperfc.sh 192.168.11.1 5204 D
)
...
B: [ 5] 0.00-1.00 sec 281 MBytes 2.36 Gbits/sec 322 382 KBytes
C: [ 5] 4.00-5.00 sec 241 MBytes 2.02 Gbits/sec 523 546 KBytes (omitted)
D: [ 5] 3.00-4.00 sec 278 MBytes 2.33 Gbits/sec 399 404 KBytes (omitted)
A: [ 5] 2.00-3.00 sec 270 MBytes 2.27 Gbits/sec 327 443 KBytes
B: [ 5] 1.00-2.00 sec 258 MBytes 2.16 Gbits/sec 292 399 KBytes
C: [ 5] 0.00-1.00 sec 358 MBytes 3.00 Gbits/sec 348 556 KBytes
D: [ 5] 4.00-5.00 sec 236 MBytes 1.98 Gbits/sec 234 485 KBytes (omitted)
A: [ 5] 3.00-4.00 sec 304 MBytes 2.55 Gbits/sec 1248 508 KBytes
B: [ 5] 2.00-3.00 sec 220 MBytes 1.85 Gbits/sec 877 393 KBytes
C: [ 5] 1.00-2.00 sec 346 MBytes 2.90 Gbits/sec 437 380 KBytes
...
The addresses and ports are picked in such a way that every iperf3 flow is handled by a different core on whle_ls1046_1 (cores 0, 2, 1, 3, respectively).
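The distribution can be cross-checked on whle_ls1046_1 while the test is running by watching how the per-CPU NET_RX softirq counters grow (a sketch; the counters in /proc/softirqs are generic, although watch may need to be installed):
root@whle-ls1046a:~# watch -n 1 'grep -E "CPU|NET_RX" /proc/softirqs'
The NET_RX counts should increase at a similar rate on all four cores.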
Stop all the clients after some time.
PC
root@PC:~# kill $(ps a | grep 'iperf3 -[c]' | awk '{ print $1; }')
...
C: [ 5] 181.00-182.00 sec 188 MBytes 1.57 Gbits/sec 81 444 KBytes
D: [ 5] 180.00-181.00 sec 372 MBytes 3.12 Gbits/sec 180 707 KBytes
D: [ 5] 181.00-181.55 sec 198 MBytes 2.99 Gbits/sec 165 687 KBytes
D: - - - - - - - - - - - - - - - - - - - - - - - - -
C: [ 5] 182.00-182.56 sec 152 MBytes 2.27 Gbits/sec 241 505 KBytes
D: [ ID] Interval Transfer Bitrate Retr
A: [ 5] 184.00-184.58 sec 125 MBytes 1.80 Gbits/sec 453 716 KBytes
C: - - - - - - - - - - - - - - - - - - - - - - - - -
D: [ 5] 0.00-181.55 sec 49.6 GBytes 2.35 Gbits/sec 68979 sender
A: - - - - - - - - - - - - - - - - - - - - - - - - -
C: [ ID] Interval Transfer Bitrate Retr
D: [ 5] 0.00-181.55 sec 0.00 Bytes 0.00 bits/sec receiver
A: [ ID] Interval Transfer Bitrate Retr
B: [ 5] 183.00-183.57 sec 156 MBytes 2.28 Gbits/sec 323 515 KBytes
B: - - - - - - - - - - - - - - - - - - - - - - - - -
C: [ 5] 0.00-182.56 sec 50.4 GBytes 2.37 Gbits/sec 66951 sender
A: [ 5] 0.00-184.58 sec 50.1 GBytes 2.33 Gbits/sec 71061 sender
iperf3: interrupt - the client has terminated
C: [ 5] 0.00-182.56 sec 0.00 Bytes 0.00 bits/sec receiver
A: [ 5] 0.00-184.58 sec 0.00 Bytes 0.00 bits/sec receiver
B: [ ID] Interval Transfer Bitrate Retr
B: [ 5] 0.00-183.57 sec 48.9 GBytes 2.29 Gbits/sec 67345 sender
B: [ 5] 0.00-183.57 sec 0.00 Bytes 0.00 bits/sec receiver
iperf3: interrupt - the client has terminated
iperf3: interrupt - the client has terminated
iperf3: interrupt - the client has terminated
Sum the values from the lines ending with "sender":
D: [ 5] 0.00-181.55 sec 49.6 GBytes 2.35 Gbits/sec 68979 sender
...
C: [ 5] 0.00-182.56 sec 50.4 GBytes 2.37 Gbits/sec 66951 sender
...
A: [ 5] 0.00-184.58 sec 50.1 GBytes 2.33 Gbits/sec 71061 sender
...
B: [ 5] 0.00-183.57 sec 48.9 GBytes 2.29 Gbits/sec 67345 sender
The total bandwidth achieved is 2.35 + 2.37 + 2.33 + 2.29 = 9.34 Gb/s. This is effectively the upper limit for TCP throughput on a 10 Gb/s physical link, proving that the WHLE-LS1046A board is able to handle routing at its network interface's limit using the standard kernel drivers.
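If the client output is captured to a file (iperf_clients.log is a hypothetical name used only for illustration), the same sum can be obtained with a short awk one-liner that picks the bitrate field from each final "sender" line:
root@PC:~# awk '/sender/ { sum += $(NF-3) } END { printf "%.2f Gbits/sec\n", sum }' iperf_clients.log
With the results above this prints 9.34 Gbits/sec, matching the manual sum.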
WHLE work analysis
Consider the snapshot from the top command run on whle_ls1046_1 during the performance test:
The si column shows the CPU time spent in software interrupts, in this case almost exclusively network interrupts. The near-zero system and user time shows that the routing work is carried out in the interrupt context alone. The load, spread evenly at ~73% across all cores, stems from picking the right parameters (IP source address, IP destination address, TCP source port, TCP destination port) so that the four data flows are assigned by the driver's RSS to four separate CPUs. The idle time (id) at ~25% shows that the WHLE operates at about 75% capacity, leaving a decent margin for more realistic routing tasks, with bigger routing tables and less perfectly CPU-balanced traffic.
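For a non-interactive record of the same figures, mpstat from the sysstat package (if it is available on the board) reports the per-CPU %soft and %idle split; a sketch:
root@whle-ls1046a:~# mpstat -P ALL 1 5    # per-CPU utilisation, 5 samples at 1 s intervals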
L2 Bridge
Connection diagram
Network Setup
PC
root@PC:~# ip netns add isolated_ns
root@PC:~# ip link set ens1f0 netns isolated_ns
root@PC:~# ip netns exec isolated_ns ip addr flush ens1f0
root@PC:~# ip netns exec isolated_ns ip addr add 192.168.30.1/24 dev ens1f0
root@PC:~# ip netns exec isolated_ns ip link set dev ens1f0 up    # moving an interface into a namespace brings it down
root@PC:~# ip addr flush ens1f1
root@PC:~# ip address add 192.168.30.2/24 dev ens1f1
whle_ls1046_1
root@whle-ls1046a:~# ip address flush eth4
root@whle-ls1046a:~# ip address flush eth5
root@whle-ls1046a:~# ip link set dev eth4 down
root@whle-ls1046a:~# ip link set dev eth5 down
root@whle-ls1046a:~# brctl addbr br0
root@whle-ls1046a:~# brctl addif br0 eth4
root@whle-ls1046a:~# brctl addif br0 eth5
root@whle-ls1046a:~# ip link set dev br0 up
root@whle-ls1046a:~# ip link set dev eth4 up
root@whle-ls1046a:~# ip link set dev eth5 up
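brctl comes from the legacy bridge-utils package; if it is not available on the board, an equivalent bridge can be set up with iproute2 alone (a sketch producing the same br0 configuration):
root@whle-ls1046a:~# ip link add name br0 type bridge
root@whle-ls1046a:~# ip link set dev eth4 master br0
root@whle-ls1046a:~# ip link set dev eth5 master br0
root@whle-ls1046a:~# ip link set dev br0 up
root@whle-ls1046a:~# ip link set dev eth4 up
root@whle-ls1046a:~# ip link set dev eth5 up
Once traffic is flowing, bridge fdb show br br0 can be used to confirm that the PC's MAC addresses have been learned on eth4 and eth5.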
Tests
Iperf servers
On PC, launch two instances of the iperf3 server, listening on ports 5201 and 5202. The ip netns exec command requires root access.
PC
root@PC:~# ip netns exec isolated_ns iperf3 -s -p 5201 &
root@PC:~# ip netns exec isolated_ns iperf3 -s -p 5202 &
Iperf clients
Run two clients simultaneously (use the iperfc.sh script from the Iperf clients section above):
PC
root@PC:~# (
./iperfc.sh 192.168.30.1 5201 A
./iperfc.sh 192.168.30.1 5202 B
)
...
A: [ 5] 3.00-4.00 sec 559 MBytes 4.69 Gbits/sec 283 588 KBytes (omitted)
B: [ 5] 2.00-3.00 sec 552 MBytes 4.64 Gbits/sec 283 595 KBytes (omitted)
A: [ 5] 4.00-5.00 sec 561 MBytes 4.71 Gbits/sec 352 580 KBytes (omitted)
B: [ 5] 3.00-4.00 sec 550 MBytes 4.61 Gbits/sec 248 584 KBytes (omitted)
A: [ 5] 0.00-1.00 sec 560 MBytes 4.70 Gbits/sec 304 406 KBytes
B: [ 5] 4.00-5.00 sec 552 MBytes 4.63 Gbits/sec 282 413 KBytes (omitted)
A: [ 5] 1.00-2.00 sec 560 MBytes 4.70 Gbits/sec 254 443 KBytes
B: [ 5] 0.00-1.00 sec 555 MBytes 4.66 Gbits/sec 119 602 KBytes
...
The addresses and ports are picked in such a way that each iperf3 flow is handled by a different core on whle_ls1046_1 (cores 0 and 1, respectively).
Stop all the clients after some time.
root@PC:~# kill $(ps a | grep 'iperf3 -[c]' | awk '{ print $1; }')
...
B: [ 5] 202.00-203.00 sec 548 MBytes 4.59 Gbits/sec 250 594 KBytes
A: [ 5] 204.00-205.00 sec 562 MBytes 4.72 Gbits/sec 268 420 KBytes
B: [ 5] 203.00-204.00 sec 552 MBytes 4.63 Gbits/sec 289 409 KBytes
B: [ 5] 204.00-204.49 sec 268 MBytes 4.62 Gbits/sec 151 286 KBytes
A: [ 5] 205.00-205.50 sec 280 MBytes 4.69 Gbits/sec 72 409 KBytes
B: - - - - - - - - - - - - - - - - - - - - - - - - -
A: - - - - - - - - - - - - - - - - - - - - - - - - -
B: [ ID] Interval Transfer Bitrate Retr
A: [ ID] Interval Transfer Bitrate Retr
A: [ 5] 0.00-205.50 sec 112 GBytes 4.68 Gbits/sec 53456 sender
A: [ 5] 0.00-205.50 sec 0.00 Bytes 0.00 bits/sec receiver
B: [ 5] 0.00-204.49 sec 110 GBytes 4.63 Gbits/sec 51300 sender
iperf3: interrupt - the client has terminated
B: [ 5] 0.00-204.49 sec 0.00 Bytes 0.00 bits/sec receiver
iperf3: interrupt - the client has terminated
...
Sum the values from the lines ending with "sender":
A: [ 5] 0.00-205.50 sec 112 GBytes 4.68 Gbits/sec 53456 sender
...
B: [ 5] 0.00-204.49 sec 110 GBytes 4.63 Gbits/sec 51300 sender
The total bandwidth achieved is 4.68 + 4.63 = 9.31 Gb/s. This is effectively the upper limit for TCP throughput on a 10 Gb/s physical link, proving that the WHLE-LS1046A board is able to handle bridging at its network interface's limit using the standard kernel drivers.
WHLE work analysis
A snapshot from the top command run on whle_ls1046_1 during the performance test: