Router Mode: 3x WHLE Setup
Setup
For the tests, three WHLE-LS1046A boards were used, identified as whle_ls1046_0, whle_ls1046_1, and whle_ls1046_2. The following sections describe the setup common to all of them.
System
The Conclusive Ubuntu image was used, version ubuntu-jammy-whle-ls1046a-2024-05-15-6ccb862.img, available at https://gitlab.conclusive.pl/devices/ubuntu-build/-/packages/109. For the flashing procedure, please refer to Installing Ubuntu image.
Kernel
The tests used kernel version 6.5.6-32773-whle-ls1-g1dbc2136b533, although any recent Conclusive 6.5 or 6.1 kernel should work.
No additional kernel arguments are needed; the tests used the defaults:
root@whle-ls1046a:~# cat /proc/cmdline
root=UUID=4e117e32-2acc-400d-8d97-915b6c99c41e console=ttyS0,115200 earlycon=uart8250,mmio,0x21c0500 rootwait rw
Software
root@whle-ls1046a:~# apt-get update
root@whle-ls1046a:~# apt-get install -q0 iperf3 bridge-utils
Router
Connection diagram
The connection speed from whle_ls1046_2 to whle_ls1046_0 will be measured, with whle_ls1046_1 configured as a router.
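A sketch of the topology, reconstructed from the addressing in the Network Setup commands below (the eth4/eth5 link placement is inferred from those commands):

```
whle_ls1046_2                 whle_ls1046_1 (router)              whle_ls1046_0
     eth5 <------------> eth4              eth5 <------------> eth4
 192.168.30.2        192.168.30.1     192.168.10.2         192.168.10.1
 192.168.31.2        192.168.31.1     192.168.11.2         192.168.11.1
```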
Drivers setup
The tests were carried out using the iperf3 tool over TCP, with servers on whle_ls1046_0 and clients on whle_ls1046_2, so the main network flow runs whle_ls1046_2 → whle_ls1046_1 → whle_ls1046_0. The overall transfer speed depends on how fast whle_ls1046_1 and whle_ls1046_2 are able to receive data, which is bound by CPU frequency and depends on how the traffic flows are shared between cores. To achieve optimal bandwidth it's necessary (on each of whle_ls1046_1 and whle_ls1046_2) to bind each iperf3 flow to a single core and to make sure the bindings don't overlap.
The kernel DPAA driver allows some limited traffic control with the help of ethtool (https://docs.kernel.org/networking/device_drivers/ethernet/freescale/dpaa.html). Core binding is achieved with the Receive Side Scaling (RSS) feature, enabled on the 10G ports with
root@whle-ls1046a:~# for dev in eth4 eth5; do ethtool -N ${dev} rx-flow-hash tcp4 sfdn; done
RSS is enabled by default, so the command above shouldn't be necessary. The sfdn argument specifies the packet header fields that define a flow and are used to map it to a specific core:
s: source IP address
d: destination IP address
f: source port (Layer 4 header bytes 0 & 1)
n: destination port (Layer 4 header bytes 2 & 3)
The current set of fields used in RSS for the given network device can be printed with
root@whle-ls1046a:~# ethtool -n eth5 rx-flow-hash tcp4
TCP over IPV4 flows use these fields for computing Hash flow key:
IP SA
IP DA
L4 bytes 0 & 1 [TCP/UDP src port]
L4 bytes 2 & 3 [TCP/UDP dst port]
The exact core a given flow will be bound to is derived from the hash of the above fields, so it is not known in advance, although it is deterministic. Unfortunately, it's apparently not possible to reduce this set of four fields: using any one of s, d, f, n in the ethtool call enables the other three as well.
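Whether the flows actually land on distinct cores can be observed while traffic is running. A minimal sketch, using the standard Linux /proc interface (not specific to DPAA):

```shell
#!/usr/bin/env bash
# Print per-CPU NET_RX softirq deltas over a 1-second window.
# Run on whle_ls1046_1 or whle_ls1046_2 during a transfer: the CPUs with
# large deltas are the ones the RSS hash mapped the flows to.
read -r -a before <<< "$(grep 'NET_RX:' /proc/softirqs)"
sleep 1
read -r -a after <<< "$(grep 'NET_RX:' /proc/softirqs)"
# Field 0 is the "NET_RX:" label; fields 1..n are the per-CPU counters.
for (( i = 1; i < ${#before[@]}; i++ )); do
    printf 'CPU%d: %d\n' $(( i - 1 )) $(( after[i] - before[i] ))
done
```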
Network Setup
whle_ls1046_0
root@whle-ls1046a:~# ip address flush eth1
root@whle-ls1046a:~# ip address flush eth2
root@whle-ls1046a:~# ip address flush eth3
root@whle-ls1046a:~# ip address flush eth5
root@whle-ls1046a:~# ip address flush eth4
root@whle-ls1046a:~# ip addr add 192.168.10.1/24 dev eth4
root@whle-ls1046a:~# ip addr add 192.168.11.1/24 dev eth4
root@whle-ls1046a:~# ip link set dev eth4 up
root@whle-ls1046a:~# ip route add 192.168.30.0/24 via 192.168.10.2
root@whle-ls1046a:~# ip route add 192.168.31.0/24 via 192.168.11.2
whle_ls1046_1
root@whle-ls1046a:~# ip address flush eth1
root@whle-ls1046a:~# ip address flush eth2
root@whle-ls1046a:~# ip address flush eth3
root@whle-ls1046a:~# ip address flush eth5
root@whle-ls1046a:~# ip address flush eth4
root@whle-ls1046a:~# ip addr add 192.168.10.2/24 dev eth5
root@whle-ls1046a:~# ip addr add 192.168.11.2/24 dev eth5
root@whle-ls1046a:~# ip addr add 192.168.30.1/24 dev eth4
root@whle-ls1046a:~# ip addr add 192.168.31.1/24 dev eth4
root@whle-ls1046a:~# ip link set dev eth4 up
root@whle-ls1046a:~# ip link set dev eth5 up
root@whle-ls1046a:~# echo 1 > /proc/sys/net/ipv4/ip_forward
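Note that ip_forward set via /proc does not survive a reboot. If the router role should persist, a sysctl drop-in can be used instead (a sketch; the standard sysctl.d location is assumed):

```
# /etc/sysctl.d/99-router.conf
# Persistent equivalent of: echo 1 > /proc/sys/net/ipv4/ip_forward
net.ipv4.ip_forward = 1
```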
whle_ls1046_2
root@whle-ls1046a:~# ip address flush eth1
root@whle-ls1046a:~# ip address flush eth2
root@whle-ls1046a:~# ip address flush eth3
root@whle-ls1046a:~# ip address flush eth5
root@whle-ls1046a:~# ip address flush eth4
root@whle-ls1046a:~# ip addr add 192.168.30.2/24 dev eth5
root@whle-ls1046a:~# ip addr add 192.168.31.2/24 dev eth5
root@whle-ls1046a:~# ip link set dev eth5 up
root@whle-ls1046a:~# ip route add 192.168.11.0/24 via 192.168.31.1
root@whle-ls1046a:~# ip route add 192.168.10.0/24 via 192.168.30.1
Tests
Iperf servers
On whle_ls1046_0 launch four instances of the iperf3 server, listening on ports 5201-5204.
whle_ls1046_0
root@whle-ls1046a:~# iperf3 -s -p 5201 &
root@whle-ls1046a:~# iperf3 -s -p 5202 &
root@whle-ls1046a:~# iperf3 -s -p 5203 &
root@whle-ls1046a:~# iperf3 -s -p 5204 &
Iperf clients
Launching the iperf3 clients is a bit more involved, as reproducibility requires controlling the source port: an ephemeral port assigned by the system from a specific range, usually 32768-60999 on Linux. This can be achieved by temporarily narrowing that range so the iperf3 client has effectively no choice of source port. Use the following helper script, which wraps the iperf3 call in the ephemeral port range setup:
iperfc.sh:
#!/usr/bin/bash
# Usage: iperfc.sh <server-ip> <server-port> <label>
set -x
ip=$1
port=$2
label=$3
# Narrow the ephemeral range so the client's source port is predictable.
echo "5$(( port - 1 )) 5${port}" > /proc/sys/net/ipv4/ip_local_port_range
iperf3 -c ${ip} -p ${port} --time 0 --title ${label} --omit 5 &
sleep 1
# Restore the default range.
echo "32768 60999" > /proc/sys/net/ipv4/ip_local_port_range
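For example, for port 5201 the script writes the range 55200 55201, so the client's source port comes from that two-port window. The arithmetic for all four client ports (pure string expansion, safe to run anywhere):

```shell
# Show the ephemeral range the script sets for each client port.
for port in 5201 5202 5203 5204; do
    echo "port ${port} -> ip_local_port_range: 5$(( port - 1 )) 5${port}"
done
```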
Run the clients simultaneously:
root@whle-ls1046a:~# (
./iperfc.sh 192.168.11.1 5201 A
./iperfc.sh 192.168.10.1 5202 B
./iperfc.sh 192.168.11.1 5203 C
./iperfc.sh 192.168.11.1 5204 D
)
...
C: [ 5] 0.00-1.00 sec 341 MBytes 2.86 Gbits/sec 93 284 KBytes
A: [ 5] 3.00-4.00 sec 341 MBytes 2.87 Gbits/sec 355 492 KBytes
B: [ 5] 2.00-3.00 sec 358 MBytes 3.00 Gbits/sec 346 465 KBytes
C: [ 5] 1.00-2.00 sec 342 MBytes 2.87 Gbits/sec 70 513 KBytes
A: [ 5] 4.00-5.00 sec 321 MBytes 2.69 Gbits/sec 340 428 KBytes
B: [ 5] 3.00-4.00 sec 362 MBytes 3.04 Gbits/sec 497 370 KBytes
C: [ 5] 2.00-3.00 sec 368 MBytes 3.09 Gbits/sec 97 230 KBytes
A: [ 5] 5.00-6.00 sec 330 MBytes 2.76 Gbits/sec 229 409 KBytes
...
The addresses and ports are picked in such a way that every iperf3 flow is handled by a different core on whle_ls1046_1 and whle_ls1046_2 (cores 0, 2, 1, 3, respectively).
Stop all the clients after some time:
root@whle-ls1046a:~# pkill ^iperf3$
...
C: [ 5] 74.00-75.00 sec 221 MBytes 1.86 Gbits/sec 626 448 KBytes
D: [ 5] 73.00-74.00 sec 224 MBytes 1.88 Gbits/sec 599 479 KBytes
B: [ 5] 76.00-76.80 sec 162 MBytes 1.71 Gbits/sec 880 455 KBytes
B: - - - - - - - - - - - - - - - - - - - - - - - - -
B: [ ID] Interval Transfer Bitrate Retr
B: [ 5] 0.00-76.80 sec 19.3 GBytes 2.16 Gbits/sec 27008 sender
B: [ 5] 0.00-76.80 sec 0.00 Bytes 0.00 bits/sec receiver
iperf3: interrupt - the client has terminated
D: [ 5] 74.00-74.77 sec 249 MBytes 2.71 Gbits/sec 438 550 KBytes
D: - - - - - - - - - - - - - - - - - - - - - - - - -
D: [ ID] Interval Transfer Bitrate Retr
D: [ 5] 0.00-74.77 sec 19.5 GBytes 2.24 Gbits/sec 27807 sender
D: [ 5] 0.00-74.77 sec 0.00 Bytes 0.00 bits/sec receiver
iperf3: interrupt - the client has terminated
A: [ 5] 77.00-77.80 sec 162 MBytes 1.69 Gbits/sec 1031 450 KBytes
A: - - - - - - - - - - - - - - - - - - - - - - - - -
A: [ ID] Interval Transfer Bitrate Retr
A: [ 5] 0.00-77.80 sec 19.7 GBytes 2.17 Gbits/sec 31143 sender
A: [ 5] 0.00-77.80 sec 0.00 Bytes 0.00 bits/sec receiver
iperf3: interrupt - the client has terminated
C: [ 5] 75.00-75.79 sec 156 MBytes 1.66 Gbits/sec 1164 636 KBytes
C: - - - - - - - - - - - - - - - - - - - - - - - - -
C: [ ID] Interval Transfer Bitrate Retr
C: [ 5] 0.00-75.79 sec 19.1 GBytes 2.16 Gbits/sec 30719 sender
C: [ 5] 0.00-75.79 sec 0.00 Bytes 0.00 bits/sec receiver
Sum the values from the lines ending with "sender":
B: [ 5] 0.00-76.80 sec 19.3 GBytes 2.16 Gbits/sec 27008 sender
...
D: [ 5] 0.00-74.77 sec 19.5 GBytes 2.24 Gbits/sec 27807 sender
...
A: [ 5] 0.00-77.80 sec 19.7 GBytes 2.17 Gbits/sec 31143 sender
...
C: [ 5] 0.00-75.79 sec 19.1 GBytes 2.16 Gbits/sec 30719 sender
The total bandwidth achieved is 2.16 + 2.24 + 2.17 + 2.16 = 8.73 Gb/s. The high number of retransmissions suggests that the receiving endpoint, whle_ls1046_2, is the weakest link.
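As a quick check, the per-client sender bitrates can be summed with awk (the values are copied from the sender lines above):

```shell
# Sum the four per-client sender bitrates (Gbit/s).
echo "2.16 2.24 2.17 2.16" |
    awk '{ s = 0; for (i = 1; i <= NF; i++) s += $i; printf "%.2f Gbit/s\n", s }'
```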