The Linux networking code makes extensive use of hash tables to implement caches to support packet classification. One of these caches, the routing cache, can be used to mount effective denial of service attacks, using an algorithmic complexity attack.
The routing cache (or "dst cache") caches routing decisions for a traffic flow. A traffic flow consists of packets which have the same IPv4 source and destination address and the same TOS value in the IP header. These flows are unidirectional; for a two-way communication, two flows exist, one in each direction. Even if the cache is also called "dst cache" for historical reasons, the cache covers more than just destination addresses.
When a packet arrives, the kernel must route it. The IP routing code checks for a suitable traffic flow and reuses the cached routing decisions, if possible. Otherwise, it makes a new routing decision and creates a new traffic flow, by updating the routing cache accordingly. This routing occurs on single-homed host with disabled IP forwarding as well as on full-table routers.
The routing cache is implemented as a hash table, in a rather particular way. The bucket count is an integral power of two which is fixed on system boot and scaled according to the amount of physical RAM. The hash function is GF(2)-linear (which means that it is easy to find collisions). Collision chaining is used to store different entries which hash to the same bucket. A garbage collection mechanism ensures that the size of the cache stays below the configured maximum entry count. This entry count is scaled with available system memory, too.
Note that there are additional hash tables in the networking code. For example, IP connection tracking adds an additional hash table (which uses a different, but still rather weak hash function).
Our attack is targeted at a host and uses packets with carefully chosen source addresses and TOS values to trigger collisions in the lower bits of the routing cache hash function. (Note that these collisions have nothing to do with colliding packets on the wire.) As a result, all these packets create distinct flows which are stored in a linear list hooked to a single bucket to a hash table. In essence, this reduces the hash table to a linear list, and finding entries becomes extremely expensive when the list is very long. (This effect is detailed in the paper cited below.)
The effectiveness of the attack depends significantly on the maximum size of routing cache. As described above, the default maximum size depends on the amount of physical RAM present in the machine. Therefore, machines with more RAM are more vulnerable if they operate in the default configuration. For example, we were able to freeze a machine with four gigabytes of RAM with a stream of about 400 packets per second. (The same machine remained unaffected when we used random source addresses instead of source addresses that lead to collisions in the hash function.)
Of course, the hash function is extremely simple, but this is not source of the problem. Even though it is possible to find collisions by solving a rather small system of linear equations over the field of two elements, the main problem results from the fact that the attacker can determine the hash bucket for traffic flows (and send packets based on that information).
Red Hat published a security advisory which includes a patch (see below) which changes the hash function to a non-linear, keyed hash function. While the the hash function is not cryptographically strong, it is certainly much more complicated (if not even impossible) for an attacker to trigger collisions. (As an additional protection, the key is changed every ten minutes.) In our experience, the patch, applied to stock Linux 2.4.20, works reasonably well in typical denial of service situations.
If you cannot apply the patch and are confronted with an attack of this type, there are two options to protect machines: setting rate limits using
iptables, or decreasing the routing cache size. Choosing suitable rate limits is very complicated, so it is not recommended. You can decrease the routing cache size using the
/proc interface. (If the total size of the cache is reduced, the maximum length of a collision chain is reduced, too, and this particular attack is no longer possible.) As
root, run the following commands:
# echo 4096 > /proc/sys/net/ipv4/route/max_size # echo 2048 > /proc/sys/net/ipv4/route/gc_thresh #
(On most systems, you can edit
/etc/sysctl.conf to make these changes permanent.)
However, note that this approach of decreasing the cache size has a severe impact on routing performance if the number of parallel flows processed by the machine exceeds the routing cache size.
How significant is this problem?
The Linux IP stack is not very robust against various types of denial of service attacks. As a result, this problem is unlikely to have any practical consequences. We recommend to apply the patch to fix this problem during routine maintenance and not to change the maximum routing cache size preventively because of the potential performance impact.
Are other vendors affected by the problem?
At this time, we do not believe that Cisco IOS routers or machines running Solaris or one of the BSD variants are affected. However, only source code inspection can reveal if a product is affected, and vendors are encouraged to verify that their products are unaffected. Flow-based routing using hash tables is particularly prone to this vulnerability, and implementations of this principle should therefore be scrutinized.
Is exploit code available publicly?
To our knowledge, this kind of attack is currently (July 2003) not in the wild, and no widely available attack tool supports it.
Does this attack affect only routers?
No, it is also relevant for hosts. The routing cache includes both source and destination addresses, and it is possible to spoof source addresses accordingly. However, routers are at somewhat greater risk because to attack them, you can choose the destination addresses in a way that trigger collisions which does not require
root privileges (or special IP packet generation code) on the attacking host.
I've read that it is impossible to spoof source addresses on the current Internet, thanks to ingress filtering, so this attack is not a problem, right?
While proper ingress (and egress) filtering is a standard practice to reduce source address spoofing, it isn't universally applied throughout the Internet. A lot of denial of service attacks still use spoofed source addresses and arrive at the intended victim.
Is it possible to use rate limits to counter the attack?
It is possible, but not recommended. To protect machines which large amounts of memory in default configurations, ridiculously low rate limits would be required which would enable denial of service attacks on their own. Note that all rate limits which protect the routing cache have to be applied in the
PREROUTING chain, as the standard
INPUT chain is processed after a packet has already updated the routing cache.
Netfilter connection tracking uses a huge hash table as well. Is it affected?
Yes, we believe that it is affected by the essentially same problem. The Red Hat patch corrects Netfilter connection tracking, too.
Will the Red Hat patch fix other performance issues with the routing cache?
Unfortunately, the answer is no. We now have multiple reports that Linux routers break down according to the inefficiency of the routing cache under stress, at incredible low packet rates. These problems continue to exist and are likely to persist until the kernel developers eliminate the routing cache.
Which Linux version resolves this issue?
This issue is resolved in Linux 2.4.21.
Why took it so long before this bug was fixed?
Kernel developers were contacted at the beginning of April, when the issue was independently discovered in the Linux kernel, not in February, when the first technical report was written by Scott Crosby and Dan Wallach.
Scott A. Crosby, Dan S. Wallach, Denial of Service via Algorithmic Complexity Attacks
The patch for Linux 2.4.20 that has been published by Red Hat.
2003-05-18: added CVE reference
2003-05-19: corrected incorrect wording ("hash table" instead of "hash function" at the end of the first section)
2003-07-30: reformatted, still no attacks
2003-07-31: Linux 2.4.21 contains a fix