2011-08-15

When should you advertise default route?

Never

There are two typical scenarios where people carry a default route in a dynamic routing protocol. I'll address them separately, explaining why you shouldn't do it and what you should do instead.

CE (eBGP) PE

This is probably the most common scenario. Maybe you're giving your customer a default route, maybe it's your own firewall, or really any situation where the neighbor won't carry a full routing table and isn't strictly in the same administrative domain.

The problem with a default route here is that if your PE gets disconnected from the core, it still originates the default route. The CE is unaware of this, so you blackhole customer traffic until BGP is manually shut down. You could conditionally advertise the default, but that is just useless overhead. Instead of a default you should advertise to the CE an aggregate route that is originated from multiple core boxes, such as your PA aggregate, or really any stable route originated from multiple places but not from the local PE.

The customer would then just add this to their router:

    # ios
    ip route 0.0.0.0 0.0.0.0 192.0.2.0 name floating_default

    # junos
    route 0.0.0.0/0 {
        qualified-next-hop 192.0.2.0 {
            interface xe-0/0/0.0;
        }
        resolve;
    }

Now if your PE gets disconnected from the core, it stops advertising 192.0.2.0/24 to the CE, and this static route no longer recurses to the CE<->PE interface. If 192.0.2.0/24 is no longer available anywhere, the static route is invalid and the next available default route can be used. If 192.0.2.0/24 is still available via an alternative provider, that path is used automatically.
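
The recursion logic is easy to sketch. The following is only a toy model in Python, not how IOS or Junos implement it: the function names and the table layout are made up for illustration, and it assumes next-hop resolution never uses a default route.

```python
import ipaddress

def resolve(table, next_hop):
    """Longest-prefix match for next_hop, ignoring default routes (assumption)."""
    addr = ipaddress.ip_address(next_hop)
    candidates = [p for p in table if p.prefixlen > 0 and addr in p]
    return max(candidates, key=lambda p: p.prefixlen, default=None)

def static_default_active(table, next_hop):
    """A recursive static default is usable only while its next hop resolves."""
    return resolve(table, next_hop) is not None

# Hypothetical CE routing table: prefix -> where it was learned.
aggregate = ipaddress.ip_network("192.0.2.0/24")
table = {aggregate: "eBGP from PE"}

up = static_default_active(table, "192.0.2.0")    # PE still hears the aggregate from the core
del table[aggregate]                               # PE lost the core, aggregate is withdrawn
down = static_default_active(table, "192.0.2.0")
print(up, down)  # True False
```

If a second provider also advertised 192.0.2.0/24, the entry would stay in the table and the static default would simply resolve via that provider.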

A slight cosmetic complaint: if you add an interface to the static route, IOS disables recursion, so you cannot enforce that the static route disappears when the next hop does not recurse behind that one interface. But it is purely cosmetic, as the functionality remains regardless of whether 192.0.2.0/24 continues to exist or disappears completely. If it continues to exist, the customer just needs to set local-pref/MED on 192.0.2.0/24 to get the expected backup default selection.

PE router without full table

The typical solution is to have two RRs originate a default route to their iBGP peers. The problem is that the RRs probably aren't always on the optimal forwarding path, especially during a single fault, and in many networks they never are. So you'd stop originating the default in iBGP and instead add this to every router that has a full BGP view:

    interface Loopback1
     description Anycast default
     ip address 192.0.2.0 255.255.255.255
     no ip redirects
     no ip proxy-arp
    !
    router isis
     passive-interface Loopback1

Obviously the PE box would just have a static default towards 192.0.2.0. This way the PE always forwards packets towards the nearest core box which is up and has a full BGP table, so you always get best-path egress forwarding without having a full BGP view and without a best-path RR. Effectively it is as if every router with a full table had an iBGP session to you and was originating a default.
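
To illustrate why anycast gives you nearest-exit behaviour, here is a small Python sketch. It assumes the IGP simply picks the lowest-metric path to whichever router advertises 192.0.2.0/32; the topology, router names and metrics are hypothetical.

```python
import heapq

def igp_distances(graph, source):
    """Dijkstra over an IGP topology given as {node: {neighbor: metric}}."""
    dist = {source: 0}
    queue = [(0, source)]
    while queue:
        cost, node = heapq.heappop(queue)
        if cost > dist.get(node, float("inf")):
            continue  # stale queue entry
        for neighbor, metric in graph.get(node, {}).items():
            new_cost = cost + metric
            if new_cost < dist.get(neighbor, float("inf")):
                dist[neighbor] = new_cost
                heapq.heappush(queue, (new_cost, neighbor))
    return dist

def anycast_egress(graph, pe, originators):
    """Closest reachable router advertising the anycast loopback, or None."""
    dist = igp_distances(graph, pe)
    reachable = [r for r in originators if r in dist]
    return min(reachable, key=lambda r: dist[r], default=None)

# Hypothetical topology: pe1 is 10 away from core1 and 20 (via core1) from core2.
graph = {
    "pe1": {"core1": 10, "core2": 30},
    "core1": {"pe1": 10, "core2": 10},
    "core2": {"pe1": 30, "core1": 10},
}
normal = anycast_egress(graph, "pe1", {"core1", "core2"})

# core1 fails: drop it and its links, as the IGP would after reconvergence.
failed = {n: {m: c for m, c in nbrs.items() if m != "core1"}
          for n, nbrs in graph.items() if n != "core1"}
backup = anycast_egress(failed, "pe1", {"core1", "core2"})
print(normal, backup)  # core1 core2
```

No BGP state on the PE changes between the two cases; the static default just follows the IGP to whichever anycast originator is closest.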

Exception that proves the rule

If the end device does not support recursive routes, then obviously this won't work. There still are such devices, though it's questionable whether you want to be routing on such devices to begin with.

2011-08-11

IPv6 ACL bypass

IPv6 designers recognized that the IPv4 header has several faults, and these were addressed to varying degrees. Particularly annoying were IPv4 options, which made the IPv4 header length variable and so caused the TCP/UDP/ICMP data to shift. The IPv6 header is fixed length and has a 'next-header' field which tells you how to parse the data after the IP header. Typically 'next-header' would be TCP, UDP or ICMP, and the rest of the packet would be exactly like in IPv4 (apart from the mandatory checksum in UDP).

Where the complexity (some might say design fault) comes in is that 'next-header' can also be any of a large number of more exotic extension headers, each of which has a 'next-header' field of its own. The standard does not specify any limit on how many headers you can have, so you need to be able to parse the packet up to MTU length. The final extension header would typically point to TCP/UDP/ICMP, and a normal IPv4-style packet would follow.
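
As a rough illustration of what "parsing the chain" means, here is a Python sketch that builds a packet with a few minimal destination-option headers in front of TCP and then walks the 'next-header' fields the way an end host would. It is a simplification: addresses are left as zeros, the TCP header is mostly zeros, and the walker only understands the extension headers that share the generic (next-header, length) layout.

```python
import struct

HOP_BY_HOP, ROUTING, DEST_OPTS, TCP = 0, 43, 60, 6
# Only the extension headers that share the generic (next-header, length) layout.
GENERIC_EXT = {HOP_BY_HOP, ROUTING, DEST_OPTS}

def build_packet(num_dest_opts, dst_port):
    """Fixed IPv6 header, then minimal 8-byte destination-option headers, then TCP."""
    tcp = struct.pack("!HH", 40000, dst_port) + bytes(16)  # 20 bytes, ports come first
    ext = b""
    for i in range(num_dest_opts):
        nxt = DEST_OPTS if i < num_dest_opts - 1 else TCP
        ext += bytes([nxt, 0, 1, 4, 0, 0, 0, 0])           # PadN option fills the 8 bytes
    first = DEST_OPTS if num_dest_opts else TCP
    fixed = struct.pack("!IHBB", 6 << 28, len(ext) + len(tcp), first, 64) + bytes(32)
    return fixed + ext + tcp

def walk_chain(packet):
    """Follow next-header fields; return (upper-layer protocol, its byte offset)."""
    nxt, offset = packet[6], 40
    while nxt in GENERIC_EXT:
        nxt, ext_len = packet[offset], packet[offset + 1]
        offset += 8 * (ext_len + 1)                        # length is in 8-byte units
    return nxt, offset

packet = build_packet(num_dest_opts=3, dst_port=25)
proto, offset = walk_chain(packet)
(dst_port,) = struct.unpack_from("!H", packet, offset + 2)
print(packet[6], proto, offset, dst_port)  # 60 6 64 25
```

The first 'next-header' says 60 (destination options), and only after walking three headers does the parser learn that the payload is TCP to port 25, starting at byte 64.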

Unfortunately no practical router has an MTU-wide view of the packet. You have a 64B, 128B or 256B view, after which you are completely unaware of the packet content; it's just bits in memory which you cannot process in any meaningful way. Your PC doesn't have the same problem: it has no specialized hardware for quickly forwarding large numbers of packets, so it will happily parse the packet up to MTU length.

What this translates to is that you can craft an IPv6 packet where the TCP port information lies beyond the router's view, so the router will not know it is a TCP packet or what ports it is using, but the receiving PC will understand it normally. So if you have an ACL where you drop some TCP/UDP/ICMP packets and then allow the rest, those rules can be bypassed on a very typical router. An example could be:
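
To put rough numbers on those view sizes, here is a back-of-the-envelope Python sketch. It assumes the view starts at the IPv6 header (no L2 header counted) and that the extension headers are the minimum 8 bytes each; real hardware will differ in the details.

```python
FIXED_HEADER = 40     # IPv6 fixed header length in bytes
MIN_EXT_HEADER = 8    # smallest possible extension header
PORT_FIELDS = 4       # TCP/UDP source and destination port

def ports_visible(view_bytes, num_ext_headers, ext_header_len=MIN_EXT_HEADER):
    """True if both port fields fall inside the first view_bytes of the packet."""
    needed = FIXED_HEADER + num_ext_headers * ext_header_len + PORT_FIELDS
    return needed <= view_bytes

def headers_needed_to_hide_ports(view_bytes):
    """Fewest minimal extension headers that push the ports past the view."""
    count = 0
    while ports_visible(view_bytes, count):
        count += 1
    return count

for view in (64, 128, 256):
    print(view, headers_needed_to_hide_ports(view))
# 64 3
# 128 11
# 256 27
```

Extension headers can be much longer than 8 bytes, so a single large header can do the same job as a stack of minimal ones.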

    term my_smtp {
        from {
            destination-address 2001:db8::42/128;
        }
        then accept;
    }
    term no_spam {
        from {
            next-header tcp;
            destination-port 25;
        }
        then discard;
    }
    term accept {
        then accept;
    }

Now this will be bypassed, because our 'next-header' is not tcp but an extension header. The far-end unmodified PC with unmodified software will still treat the packet normally. Or maybe it is a server where you allow ssh from the management net, drop all other packets to tcp/22 and permit the rest. As long as you permit the rest instead of discarding it, the bypass will work.
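
The bypass can be modelled in a few lines of Python. This is a toy first-match evaluator, not Junos behaviour in detail: it assumes the filter only sees the first 'next-header' field and only knows the ports when that field is the upper-layer protocol. The 2001:db8::99 destination is a made-up host.

```python
TCP, DEST_OPTS = 6, 60

# The filter from the text: each term is (name, match conditions, action).
FILTER = [
    ("my_smtp", {"dst": "2001:db8::42"}, "accept"),
    ("no_spam", {"next_header": TCP, "dst_port": 25}, "discard"),
    ("accept", {}, "accept"),
]

def router_view(dst, first_next_header, upper_proto, dst_port):
    """What a filter matching only the first next-header field can use.
    Assumption: ports are known only when the first next-header is the upper layer."""
    seen = {"dst": dst, "next_header": first_next_header}
    if first_next_header == upper_proto:
        seen["dst_port"] = dst_port
    return seen

def evaluate(terms, seen):
    """First matching term wins; an empty condition set matches everything."""
    for name, conditions, action in terms:
        if all(seen.get(field) == value for field, value in conditions.items()):
            return name, action
    return None, "discard"  # implicit discard when no term matches

plain = router_view("2001:db8::99", TCP, TCP, 25)
crafted = router_view("2001:db8::99", DEST_OPTS, TCP, 25)
print(evaluate(FILTER, plain), evaluate(FILTER, crafted))
# ('no_spam', 'discard') ('accept', 'accept')
```

Both packets carry TCP to port 25, but only the plain one hits no_spam. Change the last term to discard and the crafted packet is dropped as well, which is why "deny the rest" policies are not affected.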

How should this be fixed? Well, IPv6 should have modified ICMP/TCP/UDP/etc. to contain a 'next-header' field, and mandated that they appear before any extension header, forcing non-extension headers to live in fixed bit positions. Obviously the ship has sailed for this fix. Now it is heavily platform dependent what will happen. cisco.com claims that they punt packets which they fail to parse correctly. This is sane; just be sure to police the punts and you have a pretty good solution. Juniper before Trio is pretty much a lost cause.

Juniper Trio behaves remarkably well, but the CLI is lagging behind. Trio will actually find the TCP/UDP headers as long as there are 29 or fewer 'destination-option' headers before TCP/UDP. If there are 30 'destination-option' headers before TCP/UDP, the packet is dropped in hardware by the 'bad IPv6 options pkt DISC(9)' exception. The problem is that the CLI is unaware of this capability: you don't have 'protocol tcp' to say you want TCP, you only have 'next-header tcp', which only looks at the first next-header field in the IP packet. If you omit 'next-header' and just match 'destination-port', and you have 29 or fewer 'destination-option' headers, JNPR will match correctly; you just lose the ability to differentiate between TCP and UDP. This is true for 10.4R4 and 11.2R1.

Trio should be fixed by adding a 'protocol' match to the CLI (Trio already classifies the packet correctly), and the 'bad IPv6 options pkt DISC(9)' exception should punt (via a policer) instead of discarding, so that the RE can parse the packet correctly. You could ask what /realistic/ packet would be dropped by the Trio parser, but I think that is beside the point: the IPv6 standard allows for it, so you should parse it, even if only via a punt with poor performance.

You can see packets failing the Trio parser via the PFE:

    # show jnh 0 exceptions terse
    Reason                             Type         Packets      Bytes
    ==================================================================

    Packet Exceptions
    ----------------------
    bad IPv6 options pkt               DISC( 9)    24808567 26495549556