Control groups, or cgroups, are a Linux mechanism for controlling how processes use hardware resources: resource limits are defined, grouped in a hierarchical structure, and processes are assigned to them. In particular, cgroups can be used to specify the skb priority of all network packets generated by a specific process. This provides a convenient way to prioritize network traffic generated in communication with the WHLE board itself (as opposed to the traffic passing through it when it’s used as a router, a case described in Ssh Prioritization (iptables)).

Connection Diagram

The network is very straightforward and consists of a single 1 Gb/s link between a testing machine (PC) and a WHLE board (whle_ls1046). Two iperf3 streams sending data from whle_ls1046 to PC will compete for the link’s throughput. The cgroups mechanism will be used to assign different traffic classes to the associated iperf3 processes, and the resulting changes in data transfer speed will be observed.

[Diagram: direct_whle-pc_1G.drawio]

Setup

Network Setup

PC
Code Block
root@PC~# ip addr flush enxc84d4423262e
root@PC~# ip address add 192.168.3.1/24 dev enxc84d4423262e
root@PC~# ip link set dev enxc84d4423262e up
whle_ls1046
Code Block
root@whle-ls1046a:~# ip address flush eth1
root@whle-ls1046a:~# ip addr add 192.168.3.2/24 dev eth1
root@whle-ls1046a:~# ip link set dev eth1 up

By default the network interfaces on WHLE are controlled by the NetworkManager service, and the effects of the ip commands above will be periodically overwritten with its own configuration. It may be necessary to temporarily stop the service:

Code Block
root@whle-ls1046a:~# systemctl stop NetworkManager

or to configure it to ignore the eth1 interface with a configuration like

Code Block
root@whle-ls1046a:~# echo '
[main]
plugins=ifupdown,keyfile

[keyfile]
unmanaged-devices=interface-name:eth1
' > /etc/NetworkManager/NetworkManager.conf
root@whle-ls1046a:~# systemctl restart NetworkManager

Cgroups Hierarchy Preparation

The cgroups hierarchy can be defined in many ways. Instead of a minimal hierarchy specific to this scenario, a more generic directory tree will be used. It allows any skb priority from the 0 .. 15 range (covering all priority levels recognized by the tc command) to be assigned to all network packets generated by a process with a given PID, in a straightforward fashion like

Code Block
languagebash
echo ‹pid› > /sys/fs/cgroup/net_prio/prio-‹skb-priority›/cgroup.procs

for example:

Code Block
languagebash
echo 730 > /sys/fs/cgroup/net_prio/prio-4/cgroup.procs
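
The assignment above can be wrapped in a small helper that rejects priorities outside the 0 .. 15 range before touching the cgroup filesystem. This is only a sketch: the function name assign_prio and its optional third argument (the hierarchy root, defaulting to /sys/fs/cgroup/net_prio) are assumptions introduced here for illustration, and the helper expects the prio-‹skb-priority› directories to exist already.

```bash
# Hypothetical helper: move a PID into the prio-<N> cgroup after validating N.
# The optional third argument (hierarchy root) is an assumption, not part of
# the original setup; it defaults to the path used throughout this article.
assign_prio() {
    local pid="$1"
    local prio="$2"
    local root="${3:-/sys/fs/cgroup/net_prio}"
    # Accept only non-negative integers.
    case "${prio}" in
        ''|*[!0-9]*)
            echo "priority must be an integer: '${prio}'" >&2
            return 1
            ;;
    esac
    if [ "${prio}" -gt 15 ]; then
        echo "priority out of range 0..15: ${prio}" >&2
        return 1
    fi
    if [ ! -d "${root}/prio-${prio}" ]; then
        echo "missing cgroup directory: ${root}/prio-${prio}" >&2
        return 1
    fi
    echo "${pid}" > "${root}/prio-${prio}/cgroup.procs"
}

# Usage (same effect as the example above):
#   assign_prio 730 4
```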

The script that creates this hierarchy is as follows:

cgroups-setup.sh:
Code Block
languagebash
#!/usr/bin/bash
mkdir /sys/fs/cgroup/net_prio
mount -t cgroup -o net_prio none /sys/fs/cgroup/net_prio
mkdir /sys/fs/cgroup/net_prio/prio-{0..15}
for p in {0..15}; do
    for if in $(cd /sys/class/net/; ls); do
        echo "${if} ${p}" > /sys/fs/cgroup/net_prio/prio-${p}/net_prio.ifpriomap
    done
done
  • mkdir /sys/fs/cgroup/net_prio
    This command creates the root directory for the network priority hierarchy inside /sys/fs/cgroup, which should already be present on the system. The name net_prio is arbitrary; it was chosen to reflect the name of the module used to mount the cgroups filesystem there.

  • mount -t cgroup -o net_prio none /sys/fs/cgroup/net_prio
    This command mounts the virtual filesystem used to communicate the PIDs’ priority assignments to the kernel. The -t cgroup option selects cgroups V1. Unfortunately the more modern cgroups V2 cannot be used in this case, as the net_prio module is not defined for it yet. After mounting the filesystem the following listing should appear:

    Code Block
    root@whle-ls1046a:~# ls -1 /sys/fs/cgroup/net_prio
    cgroup.clone_children
    cgroup.procs
    cgroup.sane_behavior
    net_prio.ifpriomap
    net_prio.prioidx
    notify_on_release
    release_agent
    tasks

    Of these files only the following are relevant in further discussion:

    • net_prio.ifpriomap
      The default priorities per network interface. More details below.

    • cgroup.procs
      List of all PIDs whose packet priority isn’t modified in any way.

  • mkdir /sys/fs/cgroup/net_prio/prio-{0..15}
    This command creates directories prio-0, prio-1, …, prio-15 inside /sys/fs/cgroup/net_prio. Each of them will be automatically populated with files:

    Code Block
    root@whle-ls1046a:~# ls -1 /sys/fs/cgroup/net_prio/prio-13
    cgroup.clone_children
    cgroup.procs
    net_prio.ifpriomap
    net_prio.prioidx
    notify_on_release
    tasks

    Again, only two are of concern here:

    • net_prio.ifpriomap
      The mapping of network interfaces to skb priorities, like

      Code Block
      root@whle-ls1046a:~# cat /sys/fs/cgroup/net_prio/prio-13/net_prio.ifpriomap
      lo 0
      eth0 0
      eth1 4
      eth2 4
      eth3 8
      eth4 0
      eth5 0

      While the initial discussion of cgroups mentioned assigning skb priority to PIDs, the actual subject of a priority assignment is the (PID, interface) pair. This file covers the interface part.

    • cgroup.procs
      List of all PIDs whose packets are assigned priorities according to the map given in net_prio.ifpriomap.

  • echo "${if} ${p}" > /sys/fs/cgroup/net_prio/prio-${p}/net_prio.ifpriomap
    This line, executed for each network interface if, results in a uniform mapping in prio-‹p›/net_prio.ifpriomap like

    Code Block
    eth0 ‹p›
    eth1 ‹p›
    eth2 ‹p›
    eth3 ‹p›
    eth4 ‹p›
    eth5 ‹p›

    for example:

    Code Block
    root@whle-ls1046a:~# cat /sys/fs/cgroup/net_prio/prio-13/net_prio.ifpriomap
    lo 0
    eth0 13
    eth1 13
    eth2 13
    eth3 13
    eth4 13
    eth5 13

    This uniform mapping hides the per-interface granularity of prioritization, which isn’t needed here.
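
Whether a given net_prio.ifpriomap file really holds a uniform mapping can be checked with a short function. The sketch below is illustrative only: the name ifpriomap_mismatches is hypothetical, and skipping lo is an assumption based on the listing above, where lo stays at 0. The function prints every interface whose priority differs from the expected one, so empty output means the map is uniform.

```bash
# Hypothetical check: print "<interface> <priority>" for every entry of an
# ifpriomap file whose priority differs from the expected value.
# The loopback interface is skipped (assumption, see the lead-in).
ifpriomap_mismatches() {
    local expected="$1"
    local map_file="$2"
    local name prio
    while read -r name prio; do
        # Ignore empty lines.
        [ -n "${name}" ] || continue
        if [ "${name}" = "lo" ]; then
            continue
        fi
        if [ "${prio}" != "${expected}" ]; then
            echo "${name} ${prio}"
        fi
    done < "${map_file}"
}

# Usage on the board, for the prio-13 group:
#   ifpriomap_mismatches 13 /sys/fs/cgroup/net_prio/prio-13/net_prio.ifpriomap
```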

Save the script in the cgroups-setup.sh file and run it on a WHLE-LS1046A board.

whle_ls1046a
Code Block
root@whle-ls1046a:~# chmod +x cgroups-setup.sh
root@whle-ls1046a:~# ./cgroups-setup.sh
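
After the script has run, it may be worth confirming that all sixteen priority groups were created. The following is a minimal sketch of such a check; the function name missing_prio_dirs and its optional root argument are assumptions made for this example. It prints the name of every prio-‹p› directory that lacks a net_prio.ifpriomap file, so empty output indicates a complete hierarchy.

```bash
# Hypothetical check: list the prio-<p> groups (p = 0..15) that are missing
# under the given hierarchy root (default: /sys/fs/cgroup/net_prio).
missing_prio_dirs() {
    local root="${1:-/sys/fs/cgroup/net_prio}"
    local p=0
    while [ "${p}" -le 15 ]; do
        # Every mounted net_prio group contains a net_prio.ifpriomap file.
        if [ ! -f "${root}/prio-${p}/net_prio.ifpriomap" ]; then
            echo "prio-${p}"
        fi
        p=$((p + 1))
    done
}

# Usage on the board:
#   missing_prio_dirs
```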

Iperf3 Setup

PC

Two iperf3 streams will be created, with servers launched on PC and clients on whle_ls1046, with the default client → server data flow direction.

The direction of the transfer is important in this experiment. Queue prioritization in the DPAA architecture (or any other mqprio architecture, for that matter) applies only to egress traffic. Sending data from whle_ls1046 to the remote address 192.168.3.1 implies the following order of processing for the majority of iperf3 traffic:

  1. whle_ls1046’s CPU,

  2. whle_ls1046’s eth1 interface (egress),

  3. PC’s enxc84d4423262e interface (ingress),

  4. PC’s CPU.

Given that the maximum throughput of 1 Gb/s for the whole connection leaves plenty of headroom on whle_ls1046’s CPU (let alone PC’s), the enxc84d4423262e - eth1 link becomes the bottleneck, with packets congesting at the eth1 funnel, where the DPAA prioritization can come into play. If the transfer went the other way, e.g. with clients run on PC and servers on whle_ls1046, the funnel would form at the testing machine’s enxc84d4423262e interface.

Given the peculiarities of setting up the iperf3 process’ priority on whle_ls1046, it’s easier to track data transfer speed on the PC side by launching iperf3 in the foreground instead of as a daemon.

PC, console 1
Code Block
user@PC~$ iperf3 --server --port 5201
Code Block
-----------------------------------------------------------
Server listening on 5201
-----------------------------------------------------------
PC, console 2
Code Block
user@PC~$ iperf3 --server --port 5202
Code Block
-----------------------------------------------------------
Server listening on 5202
-----------------------------------------------------------

whle_ls1046

Launching iperf3 clients on the WHLE board must follow a specific protocol:

  1. Clean any mqprio qdiscs on the eth1 interface.

  2. Run iperf3 client.

  3. Obtain the client’s PID.

  4. Assign a specific net_prio priority to the given PID, using cgroups.

  5. Define the mqprio qdisc on the eth1 interface.

While the (2) < (3) < (4) ordering is pretty obvious, the rest may not be. Practice has shown that changing the packet priority of a process with an ongoing connection while mqprio is already set up leads to inconsistent results, with the change sometimes reflected on the wire and sometimes not. In contrast, setting up the mqprio qdisc while all the traffic is already running results in consistent behavior.

Because of this it’s useful to define some bash functions that implement the above ordering. First, a launch_iperf_with_priority function will be defined, which starts the iperf3 client and assigns it a specific priority.

whle_ls1046
Code Block
languagebash
root@whle-ls1046a:~#
launch_iperf_with_priority() {
    local port=$1
    local prio=$2
    local iperf_time=$3
    echo "Launching iperf3, port ${port}, priority ${prio}"
    iperf3 --port "${port}" --client 192.168.3.1 --time "${iperf_time}" > /dev/null &
    # $! holds the PID of the background job just started. Unlike pgrep -f,
    # it cannot miss a client that has not exec'ed yet or match another process.
    local pid=$!
    echo "${pid}" > "/sys/fs/cgroup/net_prio/prio-${prio}/cgroup.procs"
}

The opposite operation will be realized by the kill_iperf procedure.

whle_ls1046
Code Block
languagebash
root@whle-ls1046a:~#
kill_iperf() {
    local port=$1
    pkill -f "iperf3 --port ${port}"
}

Example usage:

whle_ls1046
Code Block
root@whle-ls1046a:~# launch_iperf_with_priority 5201 0 10
Launching iperf3, port 5201, priority 0
[1] 493
root@whle-ls1046a:~# kill_iperf 5201

This creates a connection with the server at 192.168.3.1, port 5201, for 10 seconds, with the packets sent having skb priority 0. The client is then killed without waiting for it to finish.

Building on this, a third and final function will be defined, which coordinates launching two iperf3 streams with different priorities for the same time period and creating the mqprio qdisc.

whle_ls1046
Code Block
languagebash
root@whle-ls1046a:~#
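test_iperf() {
    local port1=$1
    local prio1=$2
    local port2=$3
    local prio2=$4
    local iperf_time=$5
    # Stop clients left over from a previous run.
    kill_iperf "${port1}"
    kill_iperf "${port2}"
    # Step 1: remove any existing mqprio qdisc (tc reports an error if none is set).
    tc qdisc del dev eth1 root handle 1:
    # Steps 2-4: start both clients and assign their priorities.
    launch_iperf_with_priority "${port1}" "${prio1}" "${iperf_time}"
    launch_iperf_with_priority "${port2}" "${prio2}" "${iperf_time}"
    # Step 5: set up mqprio only after the traffic is running.
    tc qdisc add dev eth1 root handle 1: mqprio num_tc 4 \
       map 0 0 0 0 1 1 1 1 2 2 2 2 3 3 3 3 hw 1
    # Wait for both clients to finish.
    sleep "${iperf_time}"
}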

The tc qdisc ... command is the same as the one used in https://conclusive.atlassian.net/wiki/spaces/CW/pages/580124673/Traffic+Control+with+tc#Example; see that article for a detailed description.
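
The map argument assigns a traffic class to each skb priority by position: the first value is the class for priority 0, the second for priority 1, and so on. As a minimal sketch of that lookup, the function below returns the traffic class for a given skb priority under the map used in test_iperf. The name prio_to_tc is hypothetical and not part of tc.

```bash
# Hypothetical helper: map an skb priority (0..15) to its traffic class
# under the mqprio map "0 0 0 0 1 1 1 1 2 2 2 2 3 3 3 3".
prio_to_tc() {
    local prio="$1"
    # Accept only integers in the 0..15 range.
    case "${prio}" in
        ''|*[!0-9]*) return 1 ;;
    esac
    [ "${prio}" -le 15 ] || return 1
    # Load the map into the positional parameters:
    # position = skb priority, value = traffic class.
    set -- 0 0 0 0 1 1 1 1 2 2 2 2 3 3 3 3
    # Drop the first <prio> entries so that $1 is the entry for this priority.
    shift "${prio}"
    echo "$1"
}

for p in 0 4 8 12; do
    echo "skb priority ${p} -> traffic class $(prio_to_tc "${p}")"
done
```

The loop prints classes 0, 1, 2 and 3 for priorities 0, 4, 8 and 12 respectively, which is why those four priorities are convenient representatives of the four classes.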

Tests

Same priorities

Assuming that iperf3 servers at ports 5201, 5202 are running on PC, run the following command on WHLE:

whle_ls1046
Code Block
root@whle-ls1046a:~# test_iperf 5201 4 5202 4 6
Launching iperf3, port 5201, priority 4
[1] 735
Launching iperf3, port 5202, priority 4
[2] 738

This creates two iperf3 streams with the same skb priority 4, which maps to traffic class 1. Meanwhile, on the PC side:

PC, console 1
Code Block
Accepted connection from 192.168.3.2, port 54202
[  5] local 192.168.3.1 port 5201 connected to 192.168.3.2 port 54208
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-1.00   sec  55.0 MBytes   462 Mbits/sec                  
[  5]   1.00-2.00   sec  56.1 MBytes   471 Mbits/sec                  
[  5]   2.00-3.00   sec  56.1 MBytes   471 Mbits/sec                  
[  5]   3.00-4.00   sec  56.1 MBytes   471 Mbits/sec                  
[  5]   4.00-5.00   sec  56.1 MBytes   471 Mbits/sec                  
[  5]   5.00-6.00   sec  56.1 MBytes   471 Mbits/sec                  
[  5]   6.00-6.04   sec  2.46 MBytes   467 Mbits/sec                  
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-6.04   sec   338 MBytes   469 Mbits/sec                  receiver
-----------------------------------------------------------
Server listening on 5201
-----------------------------------------------------------
PC, console 2
Code Block
Accepted connection from 192.168.3.2, port 46446
[  5] local 192.168.3.1 port 5202 connected to 192.168.3.2 port 46458
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-1.00   sec  53.9 MBytes   452 Mbits/sec                  
[  5]   1.00-2.00   sec  56.1 MBytes   471 Mbits/sec                  
[  5]   2.00-3.00   sec  56.1 MBytes   471 Mbits/sec                  
[  5]   3.00-4.00   sec  56.1 MBytes   471 Mbits/sec                  
[  5]   4.00-5.00   sec  56.1 MBytes   471 Mbits/sec                  
[  5]   5.00-6.00   sec  56.1 MBytes   471 Mbits/sec                  
[  5]   6.00-6.04   sec  3.69 MBytes   793 Mbits/sec                  
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-6.04   sec   338 MBytes   470 Mbits/sec                  receiver
-----------------------------------------------------------
Server listening on 5202
-----------------------------------------------------------

The experiment shows that the link’s throughput is shared evenly between streams in the same traffic class. Similar results would be obtained with the calls:

Code Block
test_iperf 5201 0 5202 0 6
test_iperf 5201 8 5202 8 6
test_iperf 5201 12 5202 12 6

(Together with the run above, these cover all 4 traffic classes defined by tc; skb priorities other than 0, 4, 8, 12 map to one of the same classes.)
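
To put a number on how the link is shared, the two receiver bitrates reported by the iperf3 servers can be converted into percentage shares. The function below is a small sketch of that arithmetic; the name bitrate_share is hypothetical, and the values are typed in by hand from the receiver summary lines rather than parsed from iperf3 output.

```bash
# Hypothetical helper: print each stream's share of the combined bitrate,
# in percent, given two bitrates in the same unit (e.g. Mbits/sec).
bitrate_share() {
    # LC_ALL=C keeps the decimal separator a dot regardless of locale.
    LC_ALL=C awk -v a="$1" -v b="$2" 'BEGIN {
        total = a + b
        if (total == 0) { print "0.0 0.0"; exit }
        printf "%.1f %.1f\n", 100 * a / total, 100 * b / total
    }'
}

# Receiver bitrates from the same-priority run above (ports 5201 and 5202).
bitrate_share 469 470    # prints "49.9 50.1"
```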

Different priorities

Run test_iperf with different skb priorities, making sure that they map to different traffic classes, for example:

whle_ls1046
Code Block
root@whle-ls1046a:~# test_iperf 5201 0 5202 4 6
Launching iperf3, port 5201, priority 0
[1] 774
Launching iperf3, port 5202, priority 4
[2] 776

Meanwhile, on the PC side:

PC, console 1
Code Block
Accepted connection from 192.168.3.2, port 48344
[  5] local 192.168.3.1 port 5201 connected to 192.168.3.2 port 48350
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-1.00   sec  8.84 MBytes  74.1 Mbits/sec                  
[  5]   1.00-2.00   sec  0.00 Bytes  0.00 bits/sec                  
[  5]   2.00-3.00   sec  0.00 Bytes  0.00 bits/sec                  
[  5]   3.00-4.00   sec  0.00 Bytes  0.00 bits/sec                  
[  5]   4.00-5.00   sec  0.00 Bytes  0.00 bits/sec                  
[  5]   5.00-6.00   sec  0.00 Bytes  0.00 bits/sec                  
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-6.08   sec  8.84 MBytes  12.2 Mbits/sec                  receiver
-----------------------------------------------------------
Server listening on 5201
-----------------------------------------------------------
PC, console 2
Code Block
Accepted connection from 192.168.3.2, port 42704
[  5] local 192.168.3.1 port 5202 connected to 192.168.3.2 port 42720
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-1.00   sec   100 MBytes   839 Mbits/sec                  
[  5]   1.00-2.00   sec   112 MBytes   942 Mbits/sec                  
[  5]   2.00-3.00   sec   112 MBytes   942 Mbits/sec                  
[  5]   3.00-4.00   sec   112 MBytes   942 Mbits/sec                  
[  5]   4.00-5.00   sec   112 MBytes   942 Mbits/sec                  
[  5]   5.00-6.00   sec   112 MBytes   942 Mbits/sec                  
[  5]   6.00-6.04   sec  4.71 MBytes   937 Mbits/sec                  
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-6.04   sec   666 MBytes   925 Mbits/sec                  receiver
-----------------------------------------------------------
Server listening on 5202
-----------------------------------------------------------

This shows that traffic class 1 (skb priority 4) has strict priority over traffic class 0 (skb priority 0). Similar results would be obtained with any of the calls:

Code Block
test_iperf 5201 0 5202 8 6
test_iperf 5201 0 5202 12 6
test_iperf 5201 4 5202 8 6
test_iperf 5201 4 5202 12 6
test_iperf 5201 8 5202 12 6

(Together with the run above, these cover all pairs of the 4 traffic classes defined by tc; skb priorities other than 0, 4, 8, 12 result in one of the class pairs above.)
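
The full set of class pairs can also be generated instead of typed by hand. The sketch below prints one test_iperf call per unordered pair of distinct traffic classes, using skb priorities 0, 4, 8 and 12 as representatives of classes 0 to 3. The function name list_class_pair_tests is hypothetical. It only prints the commands and does not run them, so it can be tried safely away from the board.

```bash
# Hypothetical helper: print a test_iperf call for every unordered pair of
# distinct traffic classes, represented by skb priorities 0, 4, 8 and 12.
list_class_pair_tests() {
    local duration="${1:-6}"
    local a b
    for a in 0 4 8 12; do
        for b in 0 4 8 12; do
            # Keep only pairs with a < b so each pair is listed once.
            if [ "${a}" -lt "${b}" ]; then
                echo "test_iperf 5201 ${a} 5202 ${b} ${duration}"
            fi
        done
    done
}

list_class_pair_tests 6
```

This prints six calls: the 0/4 pair used in the run above followed by the five alternatives listed after it.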