
Setup

The test used a single WHLE board, identified as whle_ls1046_1, and a PC equipped with a network card with two 10G ports. The single PC was set up to emulate two separate machines serving as endpoints for the WHLE board to route between. Please note that the actual speed test results depend on the CPU power of the PC: for the tests to measure WHLE performance alone, the PC must be able to handle traffic at both ends at a level exceeding the ability of a single WHLE board, with a good margin. In the tests carried out, a 12-core 4.5 GHz x86_64 machine was more than sufficient for the task.

Before proceeding, please make sure you have followed the common setup described in WHLE-LS1046 kernel DPAA drivers.

Router

Connection diagram

The connection speed from the ens1f1 interface to ens1f0 on the PC will be measured, with whle_ls1046_1 configured as a router. isolated_ns denotes the network namespace in which the ens1f0 interface has to be enclosed to force the PC to send packets through whle_ls1046_1 instead of short-circuiting them through the local stack.

router_whle-pc.drawio.png

Network Setup

PC
root@PC:~# ip netns add isolated_ns
root@PC:~# ip link set ens1f0 netns isolated_ns
root@PC:~# ip netns exec isolated_ns ip addr flush ens1f0
root@PC:~# ip netns exec isolated_ns ip addr add 192.168.10.1/24 dev ens1f0
root@PC:~# ip netns exec isolated_ns ip addr add 192.168.11.1/24 dev ens1f0
root@PC:~# ip netns exec isolated_ns ip route add 192.168.30.0/24 via 192.168.10.2
root@PC:~# ip netns exec isolated_ns ip route add 192.168.31.0/24 via 192.168.11.2
root@PC:~# ip addr flush ens1f1
root@PC:~# ip address add 192.168.30.2/24 dev ens1f1
root@PC:~# ip address add 192.168.31.2/24 dev ens1f1
root@PC:~# ip route add 192.168.10.0/24 via 192.168.30.1
root@PC:~# ip route add 192.168.11.0/24 via 192.168.31.1
whle_ls1046_1
root@whle-ls1046a:~# ip address flush eth1
root@whle-ls1046a:~# ip address flush eth2
root@whle-ls1046a:~# ip address flush eth3
root@whle-ls1046a:~# ip address flush eth5
root@whle-ls1046a:~# ip address flush eth4
root@whle-ls1046a:~# ip addr add 192.168.10.2/24 dev eth5
root@whle-ls1046a:~# ip addr add 192.168.11.2/24 dev eth5
root@whle-ls1046a:~# ip addr add 192.168.30.1/24 dev eth4
root@whle-ls1046a:~# ip addr add 192.168.31.1/24 dev eth4
root@whle-ls1046a:~# ip link set dev eth4 up
root@whle-ls1046a:~# ip link set dev eth5 up
root@whle-ls1046a:~# echo 1 > /proc/sys/net/ipv4/ip_forward
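With both ends configured, it may be worth verifying the setup with a quick connectivity check from the PC before starting the measurements. This is an optional sanity check, not part of the original procedure; each ping below has to traverse whle_ls1046_1, so a failure points at the routing setup rather than iperf3.

```shell
# From isolated_ns, reach ens1f1's address: the packet goes out ens1f0,
# through eth5 -> eth4 on the WHLE board, and back in on ens1f1.
ip netns exec isolated_ns ping -c 3 192.168.30.2
# Reverse direction: from the default namespace to ens1f0's address,
# routed via 192.168.30.1 (eth4) on the WHLE board.
ping -c 3 192.168.10.1
```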

Tests

Iperf servers

On the PC, launch four iperf3 server instances, listening on ports 5201-5204. The ip netns exec command requires root access.

PC
root@PC:~# ip netns exec isolated_ns iperf3 -s -p 5201 &
root@PC:~# ip netns exec isolated_ns iperf3 -s -p 5202 &
root@PC:~# ip netns exec isolated_ns iperf3 -s -p 5203 &
root@PC:~# ip netns exec isolated_ns iperf3 -s -p 5204 &

Iperf clients

Launching the iperf3 clients is a bit more involved, as reproducibility requires controlling the source port, which is an ephemeral port assigned by the system from a specific range, usually 32768-60999 on Linux. This can be achieved by temporarily narrowing the range to a window of just two ports derived from the server port, leaving the iperf3 client effectively no choice of source port. Use the following helper script, which wraps the iperf3 call in the ephemeral port range setup:

iperfc.sh:

#!/usr/bin/bash
# Usage: iperfc.sh <server-ip> <server-port> <label>
# Temporarily narrows the ephemeral port range so the client's source port
# is predictable, launches iperf3 in the background, then restores the
# Linux default range.
set -x
ip=$1
port=$2
label=$3
# For port 5201 this writes "55200 55201", a two-port window.
echo "5$(( port - 1 )) 5${port}" > /proc/sys/net/ipv4/ip_local_port_range
iperf3 -c ${ip} -p ${port} --time 0 --title ${label} --omit 5 &
sleep 1
echo "32768 60999" > /proc/sys/net/ipv4/ip_local_port_range
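For illustration, the range string the script derives can be reproduced in isolation. For port 5201 it yields a two-port window; the extra port plausibly accounts for iperf3 opening more than one connection (control plus data), though that interpretation is ours, not stated in the original:

```shell
# Reproduce the ephemeral-range string the script writes for port 5201:
# "5" concatenated with (port - 1), then "5" concatenated with the port.
port=5201
echo "5$(( port - 1 )) 5${port}"   # prints "55200 55201"
```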

Run the clients simultaneously:

PC
root@PC:~# (
    ./iperfc.sh 192.168.11.1 5201 A
    ./iperfc.sh 192.168.10.1 5202 B
    ./iperfc.sh 192.168.11.1 5203 C
    ./iperfc.sh 192.168.11.1 5204 D
)
...
B:  [  5]   0.00-1.00   sec   281 MBytes  2.36 Gbits/sec  322    382 KBytes       
C:  [  5]   4.00-5.00   sec   241 MBytes  2.02 Gbits/sec  523    546 KBytes       (omitted)
D:  [  5]   3.00-4.00   sec   278 MBytes  2.33 Gbits/sec  399    404 KBytes       (omitted)
A:  [  5]   2.00-3.00   sec   270 MBytes  2.27 Gbits/sec  327    443 KBytes       
B:  [  5]   1.00-2.00   sec   258 MBytes  2.16 Gbits/sec  292    399 KBytes       
C:  [  5]   0.00-1.00   sec   358 MBytes  3.00 Gbits/sec  348    556 KBytes       
D:  [  5]   4.00-5.00   sec   236 MBytes  1.98 Gbits/sec  234    485 KBytes       (omitted)
A:  [  5]   3.00-4.00   sec   304 MBytes  2.55 Gbits/sec  1248    508 KBytes       
B:  [  5]   2.00-3.00   sec   220 MBytes  1.85 Gbits/sec  877    393 KBytes       
C:  [  5]   1.00-2.00   sec   346 MBytes  2.90 Gbits/sec  437    380 KBytes           
...

The addresses and ports are picked in such a way that every iperf3 flow is handled by a different core on whle_ls1046_1 (cores 0, 2, 1, 3, respectively).

Stop all the clients after some time.

PC
root@PC:~# kill $(ps a | grep 'iperf3 -[c]' | awk '{ print $1; }')
...
C:  [  5] 181.00-182.00 sec   188 MBytes  1.57 Gbits/sec   81    444 KBytes       
D:  [  5] 180.00-181.00 sec   372 MBytes  3.12 Gbits/sec  180    707 KBytes       
D:  [  5] 181.00-181.55 sec   198 MBytes  2.99 Gbits/sec  165    687 KBytes       
D:  - - - - - - - - - - - - - - - - - - - - - - - - -
C:  [  5] 182.00-182.56 sec   152 MBytes  2.27 Gbits/sec  241    505 KBytes       
D:  [ ID] Interval           Transfer     Bitrate         Retr
A:  [  5] 184.00-184.58 sec   125 MBytes  1.80 Gbits/sec  453    716 KBytes       
C:  - - - - - - - - - - - - - - - - - - - - - - - - -
D:  [  5]   0.00-181.55 sec  49.6 GBytes  2.35 Gbits/sec  68979             sender
A:  - - - - - - - - - - - - - - - - - - - - - - - - -
C:  [ ID] Interval           Transfer     Bitrate         Retr
D:  [  5]   0.00-181.55 sec  0.00 Bytes  0.00 bits/sec                  receiver
A:  [ ID] Interval           Transfer     Bitrate         Retr
B:  [  5] 183.00-183.57 sec   156 MBytes  2.28 Gbits/sec  323    515 KBytes       
B:  - - - - - - - - - - - - - - - - - - - - - - - - -
C:  [  5]   0.00-182.56 sec  50.4 GBytes  2.37 Gbits/sec  66951             sender
A:  [  5]   0.00-184.58 sec  50.1 GBytes  2.33 Gbits/sec  71061             sender
iperf3: interrupt - the client has terminated
C:  [  5]   0.00-182.56 sec  0.00 Bytes  0.00 bits/sec                  receiver
A:  [  5]   0.00-184.58 sec  0.00 Bytes  0.00 bits/sec                  receiver
B:  [ ID] Interval           Transfer     Bitrate         Retr
B:  [  5]   0.00-183.57 sec  48.9 GBytes  2.29 Gbits/sec  67345             sender
B:  [  5]   0.00-183.57 sec  0.00 Bytes  0.00 bits/sec                  receiver
iperf3: interrupt - the client has terminated
iperf3: interrupt - the client has terminated
iperf3: interrupt - the client has terminated

Sum the bitrate values from the lines ending with "sender":

D:  [  5]   0.00-181.55 sec  49.6 GBytes  2.35 Gbits/sec  68979             sender
...
C:  [  5]   0.00-182.56 sec  50.4 GBytes  2.37 Gbits/sec  66951             sender
...
A:  [  5]   0.00-184.58 sec  50.1 GBytes  2.33 Gbits/sec  71061             sender
...
B:  [  5]   0.00-183.57 sec  48.9 GBytes  2.29 Gbits/sec  67345             sender

The total bandwidth achieved is 2.35 + 2.37 + 2.33 + 2.29 = 9.34 Gb/s. This is effectively the upper limit for the TCP protocol on a 10 Gb/s physical link, proving that the WHLE-LS1046A board is able to handle routing at its network interfaces' limit using standard kernel drivers.
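The summation can also be automated with a small awk one-liner over the captured iperf3 output (an optional convenience; the field index assumes the summary-line layout shown above, where the bitrate is the eighth whitespace-separated column):

```shell
# Sum the per-flow bitrates from the final "sender" summary lines.
awk '/sender/ { sum += $8 } END { printf "%.2f Gb/s\n", sum }' <<'EOF'
D:  [  5]   0.00-181.55 sec  49.6 GBytes  2.35 Gbits/sec  68979             sender
C:  [  5]   0.00-182.56 sec  50.4 GBytes  2.37 Gbits/sec  66951             sender
A:  [  5]   0.00-184.58 sec  50.1 GBytes  2.33 Gbits/sec  71061             sender
B:  [  5]   0.00-183.57 sec  48.9 GBytes  2.29 Gbits/sec  67345             sender
EOF
# prints "9.34 Gb/s"
```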

WHLE work analysis

Consider a snapshot of the top command run on whle_ls1046_1 during the performance test:

whle-load.png

The si column shows the CPU time spent in software interrupts, in this case almost exclusively network interrupts. The near-zero system and user time shows that the routing task is carried out in the interrupts alone. The load, spread evenly at ~73% across all cores, stems from picking the right parameters (IP source address, IP destination address, TCP source port, TCP destination port) to define four data flows, which the driver's RSS assigns to four separate CPUs. The idle time (id) at ~25% shows that the WHLE operates at 75% capacity, providing a decent margin to account for more realistic routing tasks, with bigger routing tables and traffic spread less evenly across CPUs.
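Besides top, the per-CPU software-interrupt distribution can be read directly from /proc/softirqs (a generic Linux interface, not WHLE-specific). During the test, the NET_RX counters should grow at a similar rate on all four cores if RSS is spreading the flows as intended:

```shell
# Print the CPU header row plus the NET_RX (network receive softirq)
# counters for each CPU.
awk 'NR==1 || /NET_RX/' /proc/softirqs
```

Running the command twice a few seconds apart and comparing the per-CPU deltas shows which cores are actually servicing receive interrupts.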

L2 Bridge

Connection diagram

bridge_whle-pc.drawio.png

Network Setup

PC
root@PC:~# ip netns add isolated_ns
root@PC:~# ip link set ens1f0 netns isolated_ns
root@PC:~# ip netns exec isolated_ns ip addr flush ens1f0 
root@PC:~# ip netns exec isolated_ns ip addr add 192.168.30.1/24 dev ens1f0
root@PC:~# ip addr flush ens1f1
root@PC:~# ip address add 192.168.30.2/24 dev ens1f1
whle_ls1046_1
root@whle-ls1046a:~# ip address flush eth4
root@whle-ls1046a:~# ip address flush eth5
root@whle-ls1046a:~# ip link set dev eth4 down
root@whle-ls1046a:~# ip link set dev eth5 down
root@whle-ls1046a:~# brctl addbr br0
root@whle-ls1046a:~# brctl addif br0 eth4
root@whle-ls1046a:~# brctl addif br0 eth5
root@whle-ls1046a:~# ip link set dev br0 up
root@whle-ls1046a:~# ip link set dev eth4 up
root@whle-ls1046a:~# ip link set dev eth5 up
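Note that brctl (from bridge-utils) is considered legacy on current Linux systems. Where it is unavailable, the same bridge can be created with iproute2 alone; the following is an equivalent sketch assuming the same interface names:

```shell
# Create br0 and enslave both 10G ports using iproute2 instead of brctl.
ip link add name br0 type bridge
ip link set dev eth4 master br0
ip link set dev eth5 master br0
ip link set dev br0 up
ip link set dev eth4 up
ip link set dev eth5 up
```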

Tests

Iperf servers

On the PC, launch two iperf3 server instances, listening on ports 5201 and 5202. The ip netns exec command requires root access.

PC
root@PC:~# ip netns exec isolated_ns iperf3 -s -p 5201 &
root@PC:~# ip netns exec isolated_ns iperf3 -s -p 5202 &

Iperf clients

Run two clients simultaneously (use the iperfc.sh script from the Iperf clients section of the router test above):

PC
root@PC:~# (
    ./iperfc.sh 192.168.30.1 5201 A
    ./iperfc.sh 192.168.30.1 5202 B
)
...
A:  [  5]   3.00-4.00   sec   559 MBytes  4.69 Gbits/sec  283    588 KBytes       (omitted)
B:  [  5]   2.00-3.00   sec   552 MBytes  4.64 Gbits/sec  283    595 KBytes       (omitted)
A:  [  5]   4.00-5.00   sec   561 MBytes  4.71 Gbits/sec  352    580 KBytes       (omitted)
B:  [  5]   3.00-4.00   sec   550 MBytes  4.61 Gbits/sec  248    584 KBytes       (omitted)
A:  [  5]   0.00-1.00   sec   560 MBytes  4.70 Gbits/sec  304    406 KBytes       
B:  [  5]   4.00-5.00   sec   552 MBytes  4.63 Gbits/sec  282    413 KBytes       (omitted)
A:  [  5]   1.00-2.00   sec   560 MBytes  4.70 Gbits/sec  254    443 KBytes       
B:  [  5]   0.00-1.00   sec   555 MBytes  4.66 Gbits/sec  119    602 KBytes       
...

The addresses and ports are picked in such a way that each iperf3 flow is handled by a different core on whle_ls1046_1 (cores 0, 1, respectively).

Stop all the clients after some time.

root@PC:~# kill $(ps a | grep 'iperf3 -[c]' | awk '{ print $1; }')
...
B:  [  5] 202.00-203.00 sec   548 MBytes  4.59 Gbits/sec  250    594 KBytes       
A:  [  5] 204.00-205.00 sec   562 MBytes  4.72 Gbits/sec  268    420 KBytes       
B:  [  5] 203.00-204.00 sec   552 MBytes  4.63 Gbits/sec  289    409 KBytes       
B:  [  5] 204.00-204.49 sec   268 MBytes  4.62 Gbits/sec  151    286 KBytes       
A:  [  5] 205.00-205.50 sec   280 MBytes  4.69 Gbits/sec   72    409 KBytes       
B:  - - - - - - - - - - - - - - - - - - - - - - - - -
A:  - - - - - - - - - - - - - - - - - - - - - - - - -
B:  [ ID] Interval           Transfer     Bitrate         Retr
A:  [ ID] Interval           Transfer     Bitrate         Retr
A:  [  5]   0.00-205.50 sec   112 GBytes  4.68 Gbits/sec  53456             sender
A:  [  5]   0.00-205.50 sec  0.00 Bytes  0.00 bits/sec                  receiver
B:  [  5]   0.00-204.49 sec   110 GBytes  4.63 Gbits/sec  51300             sender
iperf3: interrupt - the client has terminated
B:  [  5]   0.00-204.49 sec  0.00 Bytes  0.00 bits/sec                  receiver
iperf3: interrupt - the client has terminated
...

Sum the bitrate values from the lines ending with "sender":

A:  [  5]   0.00-205.50 sec   112 GBytes  4.68 Gbits/sec  53456             sender
...
B:  [  5]   0.00-204.49 sec   110 GBytes  4.63 Gbits/sec  51300             sender

The total bandwidth achieved is 4.68 + 4.63 = 9.31 Gb/s. This is effectively the upper limit for the TCP protocol on a 10 Gb/s physical link, proving that the WHLE-LS1046A board is able to handle bridging at its network interfaces' limit using standard kernel drivers.

WHLE work analysis

A snapshot of the top command run on whle_ls1046_1 during the performance test:

whle-load-bridge.png
