About
This article describes how to set up WHLE-LS1046A, using the standard upstream DPAA driver, as a router that gives strict priority to ssh packets, making the router's non-ssh workload nearly transparent to any ssh connections going through it.
The article aims to showcase the practical use of the DPAA hardware-offloaded Multiqueue Priority Discipline (mqprio qdisc) in conjunction with iptables, so the setup focuses only on congestion of the network interface. Controlling access to other resources, such as CPU time, which would be required to make the router's workload truly transparent to ssh connections, is outside the scope of this article.
Connection diagram
The setup is similar to the one used in Router/Bridge Mode: PC + WHLE Setup: the whle_ls1046 board acts as a router between two links connected to the testing PC. The difference is that one of the links is 1 Gb/s instead of both being 10 Gb/s. This makes it possible to saturate the physical link with little load on WHLE’s processing power, which simplifies the setup and eliminates other factors that could influence the outcome of the ssh throughput measurements.
The speed of the ssh connection will be measured between the enxc84d4423262e and ens1f0 interfaces on PC. The isolated_ns denotes the network namespace in which the ens1f0 interface had to be enclosed to force PC to send the packets through whle_ls1046 instead of short-circuiting to the local interface.
[Connection diagram: PC ens1f0 (inside isolated_ns) to whle_ls1046 eth5 at 10 Gb/s; whle_ls1046 eth1 to PC enxc84d4423262e at 1 Gb/s]
Three scenarios for the ssh connection will be considered:
no other traffic than ssh,
ssh connection over a link saturated with iperf3 traffic, without using DPAA’s priority queues,
ssh connection over a link saturated with iperf3 traffic, with the usage of DPAA’s priority queues.
Network setup
PC
Code Block | ||
---|---|---|
| ||
root@PC:~# ip netns add isolated_ns
root@PC:~# ip link set ens1f0 netns isolated_ns
root@PC:~# ip netns exec isolated_ns ip addr flush ens1f0
root@PC:~# ip netns exec isolated_ns ip addr add 192.168.10.1/24 dev ens1f0
root@PC:~# ip netns exec isolated_ns ip link set dev ens1f0 up
root@PC:~# ip netns exec isolated_ns ip route delete 192.168.3.0/24
root@PC:~# ip netns exec isolated_ns ip route add 192.168.3.0/24 via 192.168.10.2
root@PC:~# ip addr flush enxc84d4423262e
root@PC:~# ip address add 192.168.3.1/24 dev enxc84d4423262e
root@PC:~# ip link set dev enxc84d4423262e up
root@PC:~# ip route delete 192.168.10.0/24
root@PC:~# ip route add 192.168.10.0/24 via 192.168.3.2 |
whle_ls1046a
Code Block |
---|
root@whle-ls1046a:~# ip address flush eth1
root@whle-ls1046a:~# ip address flush eth5
root@whle-ls1046a:~# ip addr add 192.168.3.2/24 dev eth1
root@whle-ls1046a:~# ip addr add 192.168.10.2/24 dev eth5
root@whle-ls1046a:~# ip link set dev eth1 up
root@whle-ls1046a:~# ip link set dev eth5 up
root@whle-ls1046a:~# echo 1 > /proc/sys/net/ipv4/ip_forward |
By default the network interfaces on WHLE are controlled by the NetworkManager service, and the effects of the ip commands above will be periodically overwritten with its own configuration. It may be necessary to temporarily stop the service
Code Block |
---|
root@whle-ls1046a:~# systemctl stop NetworkManager |
or to configure it to ignore the eth1 and eth5 interfaces with a configuration like
Code Block |
---|
root@whle-ls1046a:~# echo '
[main]
plugins=ifupdown,keyfile
[keyfile]
unmanaged-devices=interface-name:eth1,interface-name:eth5
' > /etc/NetworkManager/NetworkManager.conf
root@whle-ls1046a:~# systemctl restart NetworkManager |
Services setup
PC
Code Block |
---|
root@PC:~# ip netns exec isolated_ns iperf3 --server --daemon |
Keep in mind that starting the iperf3 server within the isolated network namespace isolated_ns makes it reachable only through the 192.168.10.1 address. Attempts to connect the client through a different address will result in a cryptic Bad file descriptor error.
Code Block |
---|
root@whle-ls1046a:~# iperf3 --client 192.168.3.1
iperf3: error - unable to send control message: Bad file descriptor |
It’s assumed that there is an ssh daemon running on PC already.
Tests
Control case: scp transfer through empty network
To measure the ssh throughput, the scp program will be used on a reasonably large file of about 700 MB, assumed to be at /home/user/files/download.xz on PC. It will be sent to /home/user on the same machine.
PC
Code Block |
---|
root@PC:~# time ip netns exec isolated_ns scp /home/user/files/download.xz user@192.168.3.1: |
Code Block |
---|
download.xz 100% 706MB 111.7MB/s 00:06
real 0m6,757s
user 0m3,617s
sys 0m1,786s |
Root access was needed to execute the ip netns command. Transferring the whole file through the empty network takes around 7 seconds.
The direction of the transfer is important in this experiment. Queue prioritization in the DPAA architecture (or with any other mqprio implementation, for that matter) applies only to egress traffic. Sending the local file /home/user/files/download.xz to the “remote” location 192.168.3.1 from the isolated namespace implies the following order of processing for the majority of ssh traffic:
PC’s CPU,
PC’s ens1f0 interface (egress),
whle_ls1046’s eth5 interface (ingress),
whle_ls1046’s CPU,
whle_ls1046’s eth1 interface (egress),
PC’s enxc84d4423262e interface (ingress),
PC’s CPU.
Given that the maximum throughput of 1 Gb/s for the whole connection leaves plenty of headroom on whle_ls1046’s CPU (let alone PC’s) and that the ens1f0 - eth5 link is 10 Gb/s, the 1 Gb/s enxc84d4423262e - eth1 link becomes the bottleneck, with packets congesting at the eth1 funnel where the DPAA prioritization can come into play. Had the transfer gone the other way, e.g. with
Code Block |
---|
root@PC:~# time ip netns exec isolated_ns scp user@192.168.3.1:/home/user/files/download.xz . |
the funnel would form at the testing machine’s enxc84d4423262e interface.
Test case: scp transfer on saturated link, no prioritization
Start the iperf3 flow to saturate the 1 Gb/s link.
PC
Code Block |
---|
user@PC:~$ iperf3 --client 192.168.10.1 --time 0 --reverse |
Code Block |
---|
Connecting to host 192.168.10.1, port 5201
Reverse mode, remote host 192.168.10.1 is sending
[ 5] local 192.168.3.1 port 53244 connected to 192.168.10.1 port 5201
[ ID] Interval Transfer Bitrate
[ 5] 0.00-1.00 sec 112 MBytes 942 Mbits/sec
[ 5] 1.00-2.00 sec 112 MBytes 942 Mbits/sec
[ 5] 2.00-3.00 sec 112 MBytes 942 Mbits/sec
... |
Once again the direction of iperf3’s flow is important: it must match the direction of scp’s transfer, or there would be no conflict between them to arbitrate. By default iperf3 sends data from client to server. Using the --reverse flag reverses it, ensuring that the data traverses ens1f0 (egress) → eth5 (ingress) → eth1 (egress) → enxc84d4423262e (ingress).
Perform the scp transfer in another console.
PC
Code Block |
---|
root@PC:~# time ip netns exec isolated_ns scp /home/user/files/download.xz user@192.168.3.1: |
Code Block |
---|
download.xz 100% 706MB 55.8MB/s 00:12
real 0m13,106s
user 0m6,229s
sys 0m2,324s |
The time to transfer the file doubled. Meanwhile in iperf3’s logs:
Code Block |
---|
...
[ 5] 29.00-30.00 sec 112 MBytes 941 Mbits/sec
[ 5] 30.00-31.00 sec 112 MBytes 942 Mbits/sec
[ 5] 31.00-32.00 sec 112 MBytes 942 Mbits/sec
[ 5] 32.00-33.00 sec 71.8 MBytes 602 Mbits/sec <-- scp transfer start
[ 5] 33.00-34.00 sec 56.9 MBytes 477 Mbits/sec
[ 5] 34.00-35.00 sec 57.6 MBytes 483 Mbits/sec
[ 5] 35.00-36.00 sec 57.6 MBytes 483 Mbits/sec
[ 5] 36.00-37.00 sec 57.6 MBytes 483 Mbits/sec
[ 5] 37.00-38.00 sec 57.2 MBytes 480 Mbits/sec
[ 5] 38.00-39.00 sec 55.7 MBytes 468 Mbits/sec
[ 5] 39.00-40.00 sec 55.2 MBytes 463 Mbits/sec
[ 5] 40.00-41.00 sec 55.2 MBytes 463 Mbits/sec
[ 5] 41.00-42.00 sec 55.2 MBytes 463 Mbits/sec
[ 5] 42.00-43.00 sec 55.2 MBytes 463 Mbits/sec
[ 5] 43.00-44.00 sec 55.2 MBytes 463 Mbits/sec
[ 5] 44.00-45.00 sec 58.7 MBytes 493 Mbits/sec <-- scp transfer finish
[ 5] 45.00-46.00 sec 112 MBytes 942 Mbits/sec
[ 5] 46.00-47.00 sec 112 MBytes 941 Mbits/sec
[ 5] 47.00-48.00 sec 112 MBytes 942 Mbits/sec
[ 5] 48.00-49.00 sec 112 MBytes 942 Mbits/sec
[ 5] 49.00-50.00 sec 112 MBytes 941 Mbits/sec
... |
This shows that with the default queuing discipline the 1 Gb/s link is shared evenly between iperf3 and scp, the expected behavior when neither flow has higher priority than the other.
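The even split can be cross-checked with a little unit arithmetic. The sketch below assumes that scp reports its rate in units of 2^20 bytes per second and converts it to the Mbits/sec that iperf3 prints; the function name scp_rate_to_mbit is made up for this illustration.

```shell
#!/bin/sh
# Convert an scp-style rate (assumed to be MiB/s, i.e. 2^20 bytes per second)
# to Mbit/s (10^6 bits per second), the unit iperf3 prints.
# scp_rate_to_mbit is a hypothetical helper, not part of any tool.
scp_rate_to_mbit() {
    LC_ALL=C awk -v rate="$1" 'BEGIN { printf "%.1f\n", rate * 1048576 * 8 / 1000000 }'
}

scp_rate_to_mbit 111.7   # rate measured on the empty network
scp_rate_to_mbit 55.8    # rate measured on the saturated link
```

Under that assumption the 55.8MB/s reported by scp is roughly 468 Mbit/s, about the same as the 463 to 483 Mbits/sec left to iperf3 during the transfer, and the two together come close to the 942 Mbits/sec that iperf3 reached alone.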
Test case: scp transfer on saturated link, with prioritization
Setting iptables
Configure iptables to assign the highest skb priority to ssh packets.
whle_ls1046a
Code Block |
---|
root@whle-ls1046a:~# iptables -t mangle -F
root@whle-ls1046a:~# iptables -t mangle -A POSTROUTING -p tcp --dport 22 -j CLASSIFY --set-class 0:f
root@whle-ls1046a:~# iptables -t mangle -A POSTROUTING -p tcp --sport 22 -j CLASSIFY --set-class 0:f |
This should result in the following table:
Code Block |
---|
root@whle-ls1046a:~# iptables -t mangle --list -v |
Code Block |
---|
Chain PREROUTING (policy ACCEPT 0 packets, 0 bytes)
pkts bytes target prot opt in out source destination
Chain INPUT (policy ACCEPT 0 packets, 0 bytes)
pkts bytes target prot opt in out source destination
Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)
pkts bytes target prot opt in out source destination
Chain OUTPUT (policy ACCEPT 0 packets, 0 bytes)
pkts bytes target prot opt in out source destination
Chain POSTROUTING (policy ACCEPT 0 packets, 0 bytes)
pkts bytes target prot opt in out source destination
0 0 CLASSIFY tcp -- any any anywhere anywhere tcp dpt:ssh CLASSIFY set 0:f
0 0 CLASSIFY tcp -- any any anywhere anywhere tcp spt:ssh CLASSIFY set 0:f |
The configuration makes use of the mangle table, which is designed for packet modification. While the packets themselves aren’t modified in this scenario, the socket buffer structures the kernel uses to represent them are, namely their priority field.
The first command simply flushes the mangle table’s configuration to make sure no other rules apply. The second command assigns the priority 15 to any TCP packet whose destination port is 22. The third command does the same for packets whose source port is 22. This effectively covers all standard ssh connections.
The actual priority assignment is done, indirectly, by the --set-class 0:f fragment. From the iptables-extensions manual:
CLASSIFY
This module allows you to set the skb->priority value (and thus classify the packet into a specific CBQ class).
--set-class major:minor
Set the major and minor class value. The values are always interpreted as hexadecimal even if no 0x prefix is given.
Unfortunately the documentation doesn’t provide the actual correspondence between the major:minor class specification and the resulting skb->priority value. This can be found in the iptables source (iptables-1.8.7/extensions/libxt_CLASSIFY.c):
Code Block | ||
---|---|---|
| ||
static int CLASSIFY_string_to_priority(const char *s, unsigned int *p)
{
    unsigned int i, j;
    if (sscanf(s, "%x:%x", &i, &j) != 2)
        return 1;
    *p = TC_H_MAKE(i<<16, j);
    return 0;
} |
and the kernel’s source (include/uapi/linux/pkt_sched.h):
Code Block | ||
---|---|---|
| ||
#define TC_H_MAJ_MASK (0xFFFF0000U)
#define TC_H_MIN_MASK (0x0000FFFFU)
...
#define TC_H_MAKE(maj,min) (((maj)&TC_H_MAJ_MASK)|((min)&TC_H_MIN_MASK)) |
From this it can be concluded that as long as major is 0 and minor < 0x10000, skb->priority is simply the value of minor. To use a different priority, 10 for example, one would use --set-class 0:a. Values of skb->priority higher than 0xF aren’t recognized by the mqprio qdisc anyway.
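The same computation can be reproduced with plain shell arithmetic. This is only a sketch mirroring the two snippets above; classify_to_priority is a hypothetical helper name, and its two arguments are the hexadecimal major and minor parts as they would be written after --set-class.

```shell
#!/bin/sh
# Mirror of TC_H_MAKE(major << 16, minor) from the snippets above, using
# shell arithmetic. Both arguments are hexadecimal strings without a 0x
# prefix, as accepted by --set-class.
# classify_to_priority is a hypothetical helper name.
classify_to_priority() {
    major=$((0x$1))
    minor=$((0x$2))
    echo $(( ((major << 16) & 0xFFFF0000) | (minor & 0x0000FFFF) ))
}

classify_to_priority 0 f    # class 0:f
classify_to_priority 0 a    # class 0:a
classify_to_priority 1 1    # a non-zero major lands in the upper 16 bits
```

With major set to 0 the result is the minor value itself (15 and 10 for the first two calls), while any non-zero major pushes the result far above 0xF.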
The use of the POSTROUTING chain means that the prioritization occurs right before the packet is sent to the network interface. It’s not strictly required to do it at the last moment, and the FORWARD chain could be used as well. The OUTPUT chain, however, applies only to packets generated by whle_ls1046 itself, so the routed packets would remain unaffected, while the PREROUTING and INPUT chains aren’t even accepted together with the CLASSIFY target by the iptables command.
Setting tc discusses the Multiqueue Priority Queue Discipline (mqprio qdisc) hardware offloading implemented by the standard kernel DPAA driver, how to set it up with the tc command, and how to monitor it.
Setting tc
The driver’s documentation mentions the following command:
Code Block |
---|
tc qdisc add dev <int> root handle 1: \
mqprio num_tc 4 map 0 0 0 0 1 1 1 1 2 2 2 2 3 3 3 3 hw 1 |
Set up the queue discipline for the eth1 interface.
whle_ls1046a
Code Block |
---|
root@whle-ls1046a:~# tc qdisc del dev eth1 root handle 1:
root@whle-ls1046a:~# tc qdisc add dev eth1 root handle 1: \
mqprio num_tc 4 map 0 0 0 0 1 1 1 1 2 2 2 2 3 3 3 3 hw 1 |
The first command deletes any qdisc that may have been assigned to eth1 already. It may return an error when there is none; that’s not a problem.
This command encapsulates:
traffic classes,
packets’ skb priority,
mapping between skb priority and traffic classes,
DPAA Frame Queues,
DPAA Work Queues,
device’s channel.
The second command initializes 1024 DPAA queues in 4 different classes, each having a different DPAA priority (so called to distinguish it from the skb priority that iptables is concerned with).
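How an skb priority ends up in a traffic class follows from the map argument alone: its sixteen entries are indexed by skb priority 0 to 15, and each entry names a traffic class. A minimal sketch of that lookup, with a hypothetical helper name prio_to_tc and the map copied from the tc command above:

```shell
#!/bin/sh
# The 16 map entries from the tc command: entry N is the traffic class
# assigned to skb priority N.
MQPRIO_MAP="0 0 0 0 1 1 1 1 2 2 2 2 3 3 3 3"

# prio_to_tc is a hypothetical helper: print the traffic class for an
# skb priority in the range 0..15.
prio_to_tc() {
    # Put the priority and the map entries into the positional parameters,
    # drop the priority, then skip that many map entries.
    set -- "$1" $MQPRIO_MAP
    prio=$1
    shift
    shift "$prio"
    echo "$1"
}

prio_to_tc 15   # ssh packets classified with --set-class 0:f
prio_to_tc 0    # a packet whose skb priority is 0
```

With this map, skb priority 15 falls into traffic class 3 and skb priority 0 into traffic class 0, so the packets marked by the iptables rules and unmarked packets with priority 0 are queued in different classes.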
...
Start the iperf3 flow to saturate the link.
PC
Code Block |
---|
user@PC:~$ iperf3 --client 192.168.10.1 --time 0 --reverse |
...
Perform the scp transfer in another console.
PC
Code Block |
---|
root@PC:~# time ip netns exec isolated_ns scp /home/user/files/download.xz user@192.168.3.1: |
...