Less known Solaris features - IP Multipathing (Part 4): Foundations 3

IPMP vs. Link aggregation

Link aggregation has been available on many switches for quite a while now. With link aggregation it is possible to bundle a number of interfaces into a single logical interface. Link aggregation has a failure protection mechanism as well. At first it was somewhat similar to link-based failure detection: when the link goes down on a member of an aggregation, the switch takes that link out of the aggregation and puts it back as soon as the link comes up again. Later something similar to the probe-based mechanism found its way into the Ethernet standards. It's called LACP. With LACP special frames are used on a link to determine whether the other side of the connection is in the same aggregate (a common configuration error in the early days was a non-matching aggregation configuration) and whether there really is an Ethernet connection between both ends. I won't go into the details now, as this will be the topic of another tutorial in the next days. But the main purpose of link aggregation is to create a bigger pipe when a single Ethernet connection isn't enough.

So … why should you use IPMP? The reason is a simple one: when you use link aggregation, all your connections have to terminate on the same switch, so this mechanism won't really help you in the case of a switch failure. IPMP doesn't work at Layer 2 of the network, it works at Layer 3, and so it doesn't have this constraint. The connections of an IPMP group can end on different switches, they can have different speeds, they could even be of a different technology, as long as they use IP (this was more of an advantage in the past; today, in the "Ethernet everything" age, this point has lost its appeal). I tend to say that link aggregation is a performance technology with some high-availability capabilities, whereas IPMP is a high-availability technology with some performance capabilities.
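Just to give you a rough impression of what a link aggregation looks like on the Solaris side, here is a minimal sketch using dladm. The interface names net0/net1 and the name aggr0 are just placeholders, and the syntax shown here is the newer dladm variant; the details belong to the aggregation tutorial mentioned above.

# dladm create-aggr -L active -l net0 -l net1 aggr0
# dladm show-aggr

The -L active option enables LACP on the aggregation, so the server and the switch can verify with LACP frames that both ends really agree about the aggregate.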

Loadspreading

A source of frequent questions is the load-spreading feature of IPMP. Customers have asked me if this is comparable to link aggregation. My answer is "Yes, but not really!" Perhaps this is the right moment to explain a thing about IPMP. When you look at the interfaces of a classic IPMP configuration, it looks like the IP addresses are assigned to physical interfaces. But that isn't the whole truth. When you send out data on such an interface, it's spread over all active interfaces of the group. (Active doesn't mean functional. An interface can be functional and still not be used for IP traffic: an interface can be declared as a standby interface, so it may be functional but the IPMP subsystem won't use it. That's useful when you have a 10 GbE interface and a 1 GbE interface. You don't want to use the 1 GbE interface in normal operation, but it's better than nothing in case the 10 GbE interface fails.)

But you have to be cautious: IPMP can do this only for outbound traffic. As IPMP is a server-only technology, there is no counterpart for it on the switch, so there is no load spreading on the switch. The switches don't know about this situation. When an inbound packet reaches the default gateway, the router uses the usual mechanisms to get the Ethernet address for the IP address and sends the data to this Ethernet address. As there can be just one Ethernet address for every IP address, the inbound communication will always use just one interface. This isn't a problem for many workloads, as many server applications send more data than they receive (for example a webserver). But as soon as your application receives a lot of data (for example a fileserver), you should opt for another load-distribution mechanism.

However, there is a trick to circumvent this constraint: a single IPMP group can provide several data addresses. By carefully distributing these data addresses over the physical interfaces you are able to distribute the inbound load as well. So when you are able to use multiple IP addresses, you can do such a manual spreading of the inbound load (a small sketch follows at the end of this section). However, real load-spreading mechanisms with the help of the switches (like the bundling of Ethernet links via LACP) will yield a much better distribution of the inbound traffic in many cases. But this disadvantage comes with an advantage: you are not bound to a single switch to use this load spreading. You could terminate every interface of your server on a separate switch and the IPMP group still spreads the traffic over all interfaces. That isn't possible with the standard link-aggregation technologies of Ethernet.

I want to end this section with a short warning: both aggregation technologies will not increase your bandwidth when you have just a single IP data stream. Both technologies will use the same Ethernet interface for a given communication relation between client and server. It's possible to separate the streams even based on Layer 4 properties, but in the end a single FTP download will use just one of your links. This is necessary to prevent out-of-order packets (which you want to prevent for performance reasons) caused by different trip times of the data on separate links.
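Coming back to the manual inbound-spreading trick mentioned above, here is a minimal sketch with classic IPMP. The interface names, the group name "production" and the addresses are just examples: two data addresses in the same IPMP group, each initially hosted on a different physical interface, so that inbound traffic to the two addresses arrives on different links as long as both interfaces are healthy.

# ifconfig e1000g0 plumb 192.168.1.10 netmask 255.255.255.0 group production up
# ifconfig e1000g1 plumb 192.168.1.11 netmask 255.255.255.0 group production up

Clients that connect to the first address reach the server via e1000g0, clients using the second address via e1000g1; in the case of a failure, IPMP moves the affected data address to the surviving interface.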

Classic IPMP vs. new IPMP

There are many similar concepts in classic IPMP and new IPMP. But both implementations have important differences as well. The most important difference is the binding of the data addresses to the interfaces.
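To make this difference a little bit more tangible, a minimal sketch (all interface names, the group name and the addresses are just placeholders, and the second variant assumes a Solaris release that already ships the ipadm command): in classic IPMP the data address lives directly on a physical interface that is a member of the group,

# ifconfig e1000g0 plumb 192.168.1.10 netmask 255.255.255.0 group production up

whereas in new IPMP the data addresses are configured on a dedicated IPMP interface that sits on top of the physical interfaces:

# ipadm create-ip net0
# ipadm create-ip net1
# ipadm create-ipmp ipmp0
# ipadm add-ipmp -i net0 -i net1 ipmp0
# ipadm create-addr -T static -a 192.168.1.10/24 ipmp0/data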

in.mpathd

There is a component in both variants that controls all the mechanisms surrounding IP multipathing: the in.mpathd daemon.

jmoekamp@hivemind:~$ ps -ef | grep "mpathd" | grep -v "grep"
    root  4523     1   0   Jan 19 ?           8:22 /lib/inet/in.mpathd

This daemon is automatically started by ifconfig as soon as you configure something in conjunction with IPMP on your system. The in.mpathd process is responsible for network adapter failure detection, repair detection, recovery, automatic failover and failback.
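The behaviour of in.mpathd can be tuned via its configuration file /etc/default/mpathd. As a rough sketch (the exact wording in the file differs between releases), the three tunables and their usual defaults look like this:

# Time in milliseconds in.mpathd may take to detect an interface failure
FAILURE_DETECTION_TIME=10000
# Automatically fail back to a repaired interface
FAILBACK=yes
# Only watch interfaces that are part of an IPMP group
TRACK_INTERFACES_ONLY_WITH_GROUPS=yes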