Rationale

I mentioned in an earlier post that prioritizing small packets upstream is almost the proverbial silver bullet when it comes to QoS at home. Any ADSL user who runs interactive applications such as SSH has surely noticed how laggy SSH gets while you upload something from home, say your holiday pictures with scp to your web server. Downloads also get slow during the upload, because the TCP ACKs for the download sit in the same upstream queue behind the upload. VoIP and online gaming suffer too. The canonical solution is to set DSCP markings at the sender, or to mark DSCP based on IP address or port.

But I feel that is unnecessarily complex for a typical home scenario, since all of the important/interactive stuff uses small packets, while the bandwidth-hogging applications essentially all send MTU-size packets. I've chosen 200B or less as a small packet, which is an arbitrary decision I made about a decade ago when I first set this up; I'm sure it could just as well be something like 1300B. So without further rambling, here are IOS (ISR) and JunOS (SRX) examples of how to roll this out on your CPE.

IOS example

class-map match-any SMALL-PACKETS
 match packet length max 200
!
policy-map WAN-OUT
 class SMALL-PACKETS
  priority percent 75
 class class-default
  fair-queue
  random-detect
!
interface ATM0.100 point-to-point
 pvc 0/100
  vbr-nrt 2000 2000
  tx-ring-limit 3
  service-policy output WAN-OUT
 !
!

JunOS example

ytti@gw.fi> show configuration interfaces vlan unit 0 family inet filter
input FROM_LAN;

ytti@gw.fi> show configuration firewall family inet filter FROM_LAN
term small_packets {
    from {
        packet-length 0-200;
    }
    then {
        forwarding-class expedited-forwarding;
        next term;
    }
}
term rest {
    then accept;
}

ytti@gw.fi> show configuration class-of-service interfaces at-1/0/0
unit 0 {
    scheduler-map WAN_OUT;
}

ytti@gw.fi> show configuration class-of-service scheduler-maps WAN_OUT
forwarding-class best-effort scheduler BE;
forwarding-class assured-forwarding scheduler AF;
forwarding-class expedited-forwarding scheduler EF;
forwarding-class network-control scheduler NC;

ytti@gw.fi> show configuration class-of-service schedulers
BE {
    transmit-rate percent 5;
}
AF {
    transmit-rate percent 5;
}
EF {
    transmit-rate percent 85;
}
NC {
    transmit-rate percent 5;
}

ytti@gw.fi>
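Both configs boil down to the same two-queue logic, so here is a minimal Python sketch of it, just to make the idea concrete. The function names are mine, not anything from IOS or JunOS, and it is a simplification: it assumes all packets are already queued and ignores the bandwidth cap that the real priority queue has.

```python
from collections import deque

SMALL_PACKET_MAX = 200  # bytes; same arbitrary threshold as in the configs


def classify(length):
    """Return the queue name for a packet of the given length in bytes."""
    return "priority" if length <= SMALL_PACKET_MAX else "default"


def schedule(packet_lengths):
    """Return packet lengths in send order: small packets first, then the rest.

    Simplification (assumption): everything is already queued, and the
    priority queue is not policed to a percentage of the link.
    """
    queues = {"priority": deque(), "default": deque()}
    for length in packet_lengths:
        queues[classify(length)].append(length)
    return list(queues["priority"]) + list(queues["default"])


# Two small interactive packets stuck between MTU-size upload packets.
print(schedule([1500, 1500, 64, 1500, 120]))
```

The small packets jump ahead of the bulk ones, while the order inside each queue stays untouched.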

Additional information

In IOS you need to tune vbr-nrt to match your upstream ATM rate. Modern IOS will automatically scale it down to the real rate if it's set too high, so you don't have to worry about it too much. If you have a very slow connection, like 256kbps, you might want to set tx-ring-limit to 2. Unfortunately I haven't found out how to tune the tx-ring size on JNPR, and the default feels a bit too large, as IOS is somewhat more responsive during congestion.

You can test whether it's working by sending a large file upstream while pinging some other host. Compare how it looks with and without QoS applied to the egress interface: you should see high delay on ping without QoS and normal delay with QoS. In JunOS you can use 'show interfaces queue X' or 'show interfaces X detail' to confirm that you're seeing drops in best-effort, not in expedited-forwarding. In IOS you can use 'show policy-map interface X output' to confirm you're seeing drops in class-default, not in SMALL-PACKETS.
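To get a feel for why the tx-ring size matters on slow links, here is a rough back-of-the-envelope calculation in Python. It rests on assumptions of mine: the ring limit is counted in full 1500-byte packets (on some platforms the unit is different), and ATM cell overhead is ignored. Packets already in the tx-ring are past the queueing logic, so a small packet has to wait behind all of them.

```python
def serialization_delay_ms(packet_bytes, rate_kbps):
    """Time to clock one packet onto the wire, in milliseconds."""
    # bytes * 8 gives bits; bits / (kbit/s) gives milliseconds.
    return packet_bytes * 8 / rate_kbps


def tx_ring_delay_ms(ring_packets, rate_kbps, packet_bytes=1500):
    """Worst-case wait behind a tx-ring full of MTU-size packets.

    Assumption: the ring limit is in packets and ATM overhead is ignored.
    """
    return ring_packets * serialization_delay_ms(packet_bytes, rate_kbps)


for rate_kbps, ring in [(2000, 3), (256, 3), (256, 2)]:
    delay = tx_ring_delay_ms(ring, rate_kbps)
    print(f"{rate_kbps} kbps, ring of {ring}: {delay:.1f} ms")
```

Under these assumptions a ring of 3 costs about 18 ms at 2000 kbps but about 141 ms at 256 kbps, and dropping the ring to 2 brings the latter down to about 94 ms.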
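If you want to compare the two ping runs by numbers rather than by eye, a small helper like the one below can summarize them. This is only a sketch: it assumes the usual 'time=NN ms' reply format of common ping implementations, and the sample lines are invented purely to show the expected input shape, they are not measurements.

```python
import re
import statistics

# Matches the per-reply RTT field, e.g. "time=12.3 ms" or "time<1 ms".
RTT_RE = re.compile(r"time[=<]([\d.]+)\s*ms")


def rtt_summary(ping_output):
    """Return (min, median, max) RTT in ms parsed from ping reply lines."""
    rtts = [float(m.group(1)) for m in RTT_RE.finditer(ping_output)]
    if not rtts:
        raise ValueError("no RTT samples found")
    return min(rtts), statistics.median(rtts), max(rtts)


# Invented sample input (not real measurements), using a documentation address.
sample = """\
64 bytes from 192.0.2.1: icmp_seq=1 ttl=57 time=412 ms
64 bytes from 192.0.2.1: icmp_seq=2 ttl=57 time=388 ms
64 bytes from 192.0.2.1: icmp_seq=3 ttl=57 time=455 ms
"""
print(rtt_summary(sample))
```

Run it once on the output captured without the service policy and once with it, and compare the medians.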

There is one particular problem for people using ssh ControlMaster, which multiplexes multiple connections over the same network socket. It's really great, as you only log in to the remote host once and further ssh/scp sessions start without delay and without authentication. It's especially great if you're hopping through multiple intermediate ssh hosts, where it can reduce the delay of opening an ssh session from 4-5s to 100ms. But when it comes to QoS it's quite poor: if you have an interactive ssh session to your server and then use scp to upload data to that same server, you will notice that the interactive ssh is laggy even with QoS. This is of course how it should work. While your CPE reorders packets to send the interactive (small) packets first, the far-end server will not hand the out-of-order small packets to userspace, because TCP guarantees ordering, so the far-end server keeps those packets jailed until the original (laggy) order is restored. A quick fix is to disable ControlMaster for scp, via 'scp -o ControlPath=none foo bar:'.
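The 'jailed packets' effect is easy to show with a toy model of a TCP receiver. The sketch below is a simplification of my own: it numbers whole segments instead of bytes and ignores retransmits and windows, but it shows why reordering on the wire does not help inside a single TCP connection.

```python
def deliver_in_order(arrivals):
    """Simulate a TCP receiver handing segments to userspace in order.

    arrivals: segment numbers in the order they come off the wire.
    Returns (arrival_tick, segment) pairs in the order userspace sees them.
    Simplification (assumption): segments are numbered 0, 1, 2, ... and
    nothing is lost.
    """
    buffered = set()
    next_seq = 0
    delivered = []
    for tick, seq in enumerate(arrivals):
        buffered.add(seq)
        # Release every segment that is now contiguous with what was delivered.
        while next_seq in buffered:
            buffered.remove(next_seq)
            delivered.append((tick, next_seq))
            next_seq += 1
    return delivered


# Segments 0-2 are scp bulk data, segment 3 is a keystroke that the CPE
# moved to the front of the queue.
print(deliver_in_order([3, 0, 1, 2]))
```

The keystroke arrives at tick 0 but only reaches userspace at tick 3, after all the bulk segments ahead of it in the stream. With a separate TCP connection for scp, the keystroke would be segment 0 of its own stream and would be delivered immediately.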

If you're designing an L2 discovery protocol, I suppose one of your mandatory requirements is that you can 'machine walk' the network after you find one box, i.e. you are able to learn your neighbor devices and their ports. LLDP makes no such guarantees.

You have 4 mandatory TLVs, types [0123]: End of LLDPDU, Chassis ID, Port ID and TTL. Chassis ID has 7 subtypes which the implementation is free to choose from: entPhysicalAlias (two distinct cases), ifAlias, MAC address, networkAddress, ifName or locally assigned. Port ID also has 7 subtypes which the implementation is free to choose from: ifAlias, entPhysicalAlias, MAC address, networkAddress, ifName, agent circuit ID or locally assigned.

Now you can send whatever trash you like via locally assigned and still be a fully compliant implementation. It seems it would be wise to mandate sending the management address (networkAddress) in Chassis ID and the SNMP ifIndex in Port ID (plus any _additional_ ones you may want to send, i.e. more than 1, which is not allowed today). This way you'd immediately know which OID to query and from which node. Obviously this assumes that we always have an IP address and an SNMP implementation. If we absolutely must support some corner cases where this is not true, we should specify different mandatory requirements for devices without a networkAddress and an SNMP implementation. As it stands, because of some corner cases we can never trust an LLDP implementation to be useful.

A clear sign that LLDP is not actually meeting real-world demands is that Port ID is often locally assigned and populated with the SNMP ifIndex. You just have to know that a given device works like this; there is no way to programmatically know it beforehand.
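For reference, this is roughly what pulling the mandatory TLVs out of an LLDPDU looks like in Python. It is a minimal sketch: the helper names and the sample frame are mine, and it only handles the TLV header (7-bit type, 9-bit length) and the subtype byte of Chassis ID and Port ID.

```python
import struct

CHASSIS_ID_SUBTYPES = {
    1: "chassis component (entPhysicalAlias)",
    2: "interface alias (ifAlias)",
    3: "port component (entPhysicalAlias)",
    4: "MAC address",
    5: "network address",
    6: "interface name (ifName)",
    7: "locally assigned",
}
PORT_ID_SUBTYPES = {
    1: "interface alias (ifAlias)",
    2: "port component (entPhysicalAlias)",
    3: "MAC address",
    4: "network address",
    5: "interface name (ifName)",
    6: "agent circuit ID",
    7: "locally assigned",
}


def parse_tlvs(lldpdu):
    """Yield (type, value) for each TLV, stopping after End of LLDPDU."""
    offset = 0
    while offset + 2 <= len(lldpdu):
        (header,) = struct.unpack_from("!H", lldpdu, offset)
        tlv_type, length = header >> 9, header & 0x1FF
        yield tlv_type, lldpdu[offset + 2 : offset + 2 + length]
        if tlv_type == 0:
            break
        offset += 2 + length


def describe_mandatory(lldpdu):
    """Return subtype names and raw IDs for Chassis ID and Port ID, plus TTL."""
    out = {}
    for tlv_type, value in parse_tlvs(lldpdu):
        if tlv_type == 1 and value:
            out["chassis"] = (CHASSIS_ID_SUBTYPES.get(value[0], "reserved"), value[1:])
        elif tlv_type == 2 and value:
            out["port"] = (PORT_ID_SUBTYPES.get(value[0], "reserved"), value[1:])
        elif tlv_type == 3 and len(value) >= 2:
            out["ttl"] = struct.unpack("!H", value[:2])[0]
    return out


def tlv(tlv_type, value):
    """Build one TLV (hypothetical helper, used only for the sample below)."""
    return struct.pack("!H", (tlv_type << 9) | len(value)) + value


# Hand-built sample LLDPDU, not a capture from a real device.
sample = (
    tlv(1, b"\x04" + bytes.fromhex("020000000001"))  # Chassis ID: MAC address
    + tlv(2, b"\x07" + b"42")                         # Port ID: locally assigned
    + tlv(3, struct.pack("!H", 120))                  # TTL in seconds
    + tlv(0, b"")                                     # End of LLDPDU
)
print(describe_mandatory(sample))
```

Note that the parser can tell you which subtype was used, but for 'locally assigned' it can say nothing about what the bytes mean.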
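In practice this means your discovery tool ends up with guesswork like the sketch below. The function is hypothetical and deliberately naive: it treats a locally assigned Port ID made only of ASCII digits as an ifIndex, which is exactly the kind of per-device assumption the standard gives you no way to verify.

```python
PORT_ID_LOCALLY_ASSIGNED = 7  # Port ID subtype value


def guess_ifindex(port_subtype, port_id):
    """Guess an SNMP ifIndex from a Port ID TLV, or return None.

    This is a heuristic (assumption): nothing in LLDP says that a locally
    assigned Port ID is an ifIndex, even when it happens to look like one.
    """
    if port_subtype == PORT_ID_LOCALLY_ASSIGNED and port_id.isdigit():
        return int(port_id.decode("ascii"))
    return None


print(guess_ifindex(7, b"42"))        # digits: guessed to be ifIndex 42
print(guess_ifindex(7, b"uplink-1"))  # free-form text: no guess possible
```

A numeric string could just as well be a slot or port number, so the result still needs a sanity check against the device's ifTable.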

random musings about networks and everything

2012-03-31

Silver bullet for home QoS

2012-03-14

LLDP / 802.1AB-2009 blows