
Background:-

I have an ARM-based system with HTB set up on the eth and wlan interfaces. Here is the HTB configuration:-

tc class add dev eth1 parent 1:1 classid 1:1 htb rate 1Gbit ceil 1Gbit burst 18000b cburst 18000b
tc class add dev eth1 parent 1:1 classid 1:a100 htb rate 60Mbit ceil 60Mbit burst 18000b cburst 18000b
tc class add dev eth1 parent 1:a100 classid 1:10f htb rate 100Kbit ceil 60Mbit burst 18000b cburst 18000b
tc class add dev eth1 parent 1:10f classid 1:100 htb rate 25Kbit ceil 60Mbit burst 18000b cburst 18000b prio 3
tc class add dev eth1 parent 1:10f classid 1:101 htb rate 25Kbit ceil 60Mbit burst 18000b cburst 18000b prio 2
tc class add dev eth1 parent 1:10f classid 1:102 htb rate 25Kbit ceil 60Mbit burst 18000b cburst 18000b prio 1
tc class add dev eth1 parent 1:10f classid 1:103 htb rate 25Kbit ceil 60Mbit burst 18000b cburst 18000b prio 0

Here is its tree representation:-

+---(1:1) htb rate 1Gbit ceil 1Gbit burst 18000b cburst 18000b 
     |    Sent 200796370 bytes 152179 pkt (dropped 0, overlimits 0 requeues 0) 
     |    rate 0bit 0pps backlog 0b 0p requeues 0 
     |
     +---(1:54) htb prio 2 rate 50Mbit ceil 1Gbit burst 18000b cburst 18000b 
     |          Sent 2521539 bytes 19693 pkt (dropped 0, overlimits 0 requeues 0) 
     |          rate 0bit 0pps backlog 0b 0p requeues 0 
     |     
     +---(1:a100) htb rate 60Mbit ceil 60Mbit burst 18000b cburst 18000b 
          |       Sent 198274831 bytes 132486 pkt (dropped 0, overlimits 0 requeues 0) 
          |       rate 0bit 0pps backlog 0b 0p requeues 0 
          |
          +---(1:10f) htb rate 100Kbit ceil 60Mbit burst 18000b cburst 18000b 
               |      Sent 198274831 bytes 132486 pkt (dropped 0, overlimits 0 requeues 0) 
               |      rate 0bit 0pps backlog 0b 0p requeues 0 
               |
               +---(1:101) htb prio 2 rate 25Kbit ceil 60Mbit burst 18000b cburst 18000b 
               |           Sent 198208856 bytes 132155 pkt (dropped 82134, overlimits 0 requeues 0) 
               |           rate 0bit 0pps backlog 0b 0p requeues 0 
               |     
               +---(1:100) htb prio 3 rate 25Kbit ceil 60Mbit burst 18000b cburst 18000b 
               |           Sent 64079 bytes 299 pkt (dropped 0, overlimits 0 requeues 0) 
               |           rate 0bit 0pps backlog 0b 0p requeues 0 
               |     
               +---(1:103) htb prio 0 rate 25Kbit ceil 100Kbit burst 18000b cburst 18000b 
               |           Sent 630 bytes 7 pkt (dropped 0, overlimits 0 requeues 0) 
               |           rate 0bit 0pps backlog 0b 0p requeues 0 
               |     
               +---(1:102) htb prio 1 rate 25Kbit ceil 60Mbit burst 18000b cburst 18000b 
                           Sent 1266 bytes 25 pkt (dropped 0, overlimits 0 requeues 0) 
                           rate 0bit 0pps backlog 0b 0p requeues 0

The problem: I only ever achieve at most 70% of the ceil rate, even with iperf UDP traffic on the local network. With a 60 Mbps limit set for both uplink and downlink, I barely get 40 Mbps. In the tree above you can see that class 1:101 (the data class) drops a lot of packets. I am trying to understand why this happens, since the class shouldn't run out of tokens while serving throughput below its ceil rate.
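For reference, the test traffic is generated roughly along these lines (the server address is just a placeholder and the exact iperf flags may differ from the ones I actually used):

iperf -s -u -i 1                        # UDP server on the receiving host
iperf -c 192.168.1.10 -u -b 60M -t 30   # UDP client pushing ~60 Mbps for 30 s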

Edit-1: Here is the trimmed output of tc -s -s -d q ls dev eth1

qdisc htb 1: root refcnt 5 r2q 10 default 54 direct_packets_stat 0 ver 3.17 direct_qlen 64000
 Sent 370545050 bytes 354529 pkt (dropped 86336, overlimits 443788 requeues 0) 
 backlog 0b 0p requeues 0
qdisc pfifo 101: parent 1:101 limit 10p
 Sent 356446201 bytes 252349 pkt (dropped 86263, overlimits 0 requeues 0) 
 backlog 0b 0p requeues 0

Please let me know if more info is needed to debug this.

Vo1dSpace

1 Answer


You should check the underlying queue (the leaf qdisc) of this class.

In your statistics output you see dropped packets but no overlimits, which means the class is not exceeding its bandwidth.

If you haven't configured a leaf qdisc explicitly, you likely have the default pfifo queue with a limited depth. By default, the pfifo queue depth equals the TX queue length of the underlying interface. It looks like the pfifo queue sometimes overflows and the excess packets are dropped.
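For example, something like this shows both the interface TX queue length and the limit of the auto-created leaf queues (the exact output format depends on your iproute2 version):

ip link show eth1              # the "qlen" value is the TX queue length
tc -s -d qdisc show dev eth1   # pfifo entries under the leaf classes report "limit <N>p"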

How to debug and fix:

  • Check the queueing disciplines (qdiscs) of the interface. Unfortunately, I don't know a command to extract info about a single qdisc (there isn't something like tc qdisc get ...), so check the full output of the tc -s -s -d q ls dev eth1 command.
  • Check the statistics of the child queue of the 1:101 class.
  • Try to increase this queue's depth with an explicit configuration: tc qdisc add dev eth1 parent 1:101 handle 101: <qdisc_type> limit <queue_depth>.
  • Even better, use a fair-queueing and/or RED-style qdisc such as sfq to avoid flow starvation (see the example commands just after this list).
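For example (a sketch only; the queue depth and qdisc choice here are illustrative, tune them for your setup):

tc qdisc replace dev eth1 parent 1:101 handle 101: pfifo limit 1000
# or, for per-flow fairness instead of a plain FIFO:
tc qdisc replace dev eth1 parent 1:101 handle 101: sfq perturb 10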
Anton Danilov
  • Thanks for the reply. I have updated the question with the output of `tc -s -s -d q ls dev eth1`. As you can see, the overlimits counter of the root qdisc is very high, but the child node 1:101 shows 0 (not sure if that indicates anything). The limit per queue is declared as 5p or 10p; I am not sure what "p" means here. – Vo1dSpace Jun 08 '20 at 05:51
  • `p` means packets. The overlimits counter is a secondary issue; increase the leaf queue depth first. – Anton Danilov Jun 09 '20 at 18:24
  • I tried increasing txqueuelen from 1000 to 64000. The qdisc's direct_qlen also inherits the changed value, but the throughput still remains the same :( – Vo1dSpace Jun 11 '20 at 10:42
  • But what about the queue depth of the leaf queue? Try increasing that first. – Anton Danilov Jun 11 '20 at 20:49
  • Yes, I tried increasing it, but it did not help. However, I have two observations: (1) 1:10f does not lend many tokens to its child 1:101 even though it has enough tokens available; instead it borrows from its parent 1:a100 and passes those on to its child. (2) Deleting the 1:10f class and attaching 1:101 directly to 1:a100 makes everything work fine (but I can't use this as a fix, since that class is essential for me). – Vo1dSpace Jun 13 '20 at 05:58