2.1. Address Resolution Protocol (ARP)


Address Resolution Protocol (ARP) hovers in the shadows of most networks. Because of its simplicity, by comparison to higher layer protocols, ARP rarely intrudes upon the network administrator's routine. All modern IP-capable operating systems provide support for ARP. The uncommon alternative to ARP is static link-layer-to-IP mappings.

ARP defines the exchanges between network interfaces connected to an Ethernet media segment in order to map an IP address to a link layer address on demand. Link layer addresses are hardware addresses (although they are not immutable) on Ethernet cards and IP addresses are logical addresses assigned to machines attached to the Ethernet. In the rest of this chapter, link layer addresses may be known by many different names: Ethernet addresses, Media Access Control (MAC) addresses, and even hardware addresses. Arguably, the correct term from the kernel's perspective is "link layer address" because this address can be changed (on many Ethernet cards) via command line tools. Nevertheless, these terms are not realistically distinct and can be used interchangeably.

2.1.1. Overview of Address Resolution Protocol

Address Resolution Protocol (ARP) exists solely to glue together the IP and Ethernet networking layers. Since networking hardware such as switches, hubs, and bridges operates on Ethernet frames, it is unaware of the higher layer data carried by these frames ^[9]. Similarly, IP layer devices, operating on IP packets, need to be able to transmit their IP data on Ethernets. ARP defines the conversation by which IP capable hosts can exchange mappings of their Ethernet and IP addressing.

ARP is used to locate the Ethernet address associated with a desired IP address. When a machine has a packet bound for another IP on a locally connected Ethernet network, it will send a broadcast Ethernet frame containing an ARP request onto the Ethernet. All machines with the same Ethernet broadcast address will receive this packet ^[10]. If a machine receives the ARP request and it hosts the IP requested, it will respond with the link layer address on which it will receive packets for that IP address. N.B., the arp_filter sysctl will alter this behaviour somewhat.

Once the requestor receives the response packet, it associates the MAC address and the IP address. This information is stored in the ARP cache. The ARP cache can be manipulated with the ip neighbor and arp commands. To learn how and when to manipulate the ARP cache, see Section B.1, “arp”.
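
The request, reply, and cache steps described above can be sketched as a toy model. This is only an illustration under stated assumptions: no packets are sent, and the Host, Segment and resolve names are hypothetical. The addresses are those of tristan and masq-gw on the example network.

```python
# Toy model of ARP resolution on one Ethernet segment (no real packets are sent).
# Host, Segment and resolve() are hypothetical names used only for illustration.


class Host:
    def __init__(self, name, mac, ips):
        self.name = name
        self.mac = mac
        self.ips = set(ips)
        self.arp_cache = {}  # IP address -> link layer address

    def handle_arp_request(self, target_ip):
        """Return our link layer address if we host target_ip, else None."""
        return self.mac if target_ip in self.ips else None


class Segment:
    def __init__(self, hosts):
        self.hosts = list(hosts)

    def broadcast_who_has(self, sender, target_ip):
        """Deliver a broadcast request to every other host; collect replies."""
        replies = []
        for host in self.hosts:
            if host is sender:
                continue
            mac = host.handle_arp_request(target_ip)
            if mac is not None:
                replies.append(mac)
        return replies


def resolve(sender, segment, target_ip):
    """Look in the cache first; fall back to a broadcast request."""
    if target_ip in sender.arp_cache:
        return sender.arp_cache[target_ip]
    replies = segment.broadcast_who_has(sender, target_ip)
    if not replies:
        return None  # nobody on the segment claims the address
    sender.arp_cache[target_ip] = replies[0]
    return replies[0]


tristan = Host("tristan", "00:80:c8:f8:4a:51", ["192.168.99.35"])
masq_gw = Host("masq-gw", "00:80:c8:f8:5c:73", ["192.168.99.254"])
lan = Segment([tristan, masq_gw])

print(resolve(tristan, lan, "192.168.99.254"))
print(tristan.arp_cache)
```

A second call to resolve() for the same address is answered from the cache without another broadcast, which is the efficiency the ARP cache exists to provide.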

In Example 1.2, “Testing reachability of a locally connected host with ping”, we used ping to test reachability of masq-gw. Using a packet sniffer to capture the sequence of packets on the Ethernet as a result of tristan's attempt to ping provides an example of ARP in flagrante delicto. Consult the example network map for a visual representation of the network layout in which this traffic occurs.

This is an archetypal conversation between two computers exchanging relevant hardware addressing so that they can pass IP packets. The ARP exchange itself consists of two Ethernet frames.

Example 2.1. ARP conversation captured with tcpdump ^[11]

[root@masq-gw]# tcpdump -ennqti eth0 \( arp or icmp \)
tcpdump: listening on eth0
0:80:c8:f8:4a:51 ff:ff:ff:ff:ff:ff 42: arp who-has 192.168.99.254 tell 192.168.99.35             
0:80:c8:f8:5c:73 0:80:c8:f8:4a:51 60: arp reply 192.168.99.254 is-at 0:80:c8:f8:5c:73            
0:80:c8:f8:4a:51 0:80:c8:f8:5c:73 98: 192.168.99.35 > 192.168.99.254: icmp: echo request (DF)    
0:80:c8:f8:5c:73 0:80:c8:f8:4a:51 98: 192.168.99.254 > 192.168.99.35: icmp: echo reply

The first frame is a broadcast Ethernet frame, identifiable by the destination Ethernet address with all bits set (ff:ff:ff:ff:ff:ff). It contains an ARP request from tristan for IP address 192.168.99.254. The request includes the source link layer address and the IP address of the requestor, which provides enough information for the owner of the IP address to reply with its link layer address.

The ARP reply from masq-gw includes its link layer address and declaration of ownership of the requested IP address. Note that the ARP reply is a unicast response to a broadcast request. The payload of the ARP reply contains the link layer address mapping.

The machine which initiated the ARP request (tristan) now has enough information to encapsulate an IP packet in an Ethernet frame and forward it to the link layer address of the recipient (00:80:c8:f8:5c:73).

The final two packets in Example 2.1, “ARP conversation captured with tcpdump” display the link layer header and the encapsulated ICMP packets exchanged between these two hosts. Examining the ARP cache on each of these hosts would reveal entries on each host for the other host's link layer address.

This is the most common form of ARP traffic on an Ethernet. In summary, an ARP request is transmitted in a broadcast Ethernet frame. The ARP reply is a unicast response, containing the desired information, sent to the requestor's link layer address.
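
To make the frame contents concrete, here is a minimal sketch that builds the request and the reply with Python's struct module. It assumes the standard IPv4-over-Ethernet ARP layout; the helper names mac_bytes and arp_frame are hypothetical, and nothing is transmitted. The addresses are taken from Example 2.1.

```python
import socket
import struct

ETHERTYPE_ARP = 0x0806
ARP_REQUEST, ARP_REPLY = 1, 2
BROADCAST = "ff:ff:ff:ff:ff:ff"


def mac_bytes(mac):
    """Convert 'aa:bb:cc:dd:ee:ff' (or tcpdump's '0:80:...') to 6 bytes."""
    return bytes(int(part, 16) for part in mac.split(":"))


def arp_frame(op, eth_dst, sha, spa, tha, tpa):
    """Build an Ethernet frame carrying an IPv4-over-Ethernet ARP packet."""
    eth_header = struct.pack(
        "!6s6sH", mac_bytes(eth_dst), mac_bytes(sha), ETHERTYPE_ARP
    )
    arp_packet = struct.pack(
        "!HHBBH6s4s6s4s",
        1,        # hardware type: Ethernet
        0x0800,   # protocol type: IPv4
        6,        # hardware address length
        4,        # protocol address length
        op,       # 1 = request, 2 = reply
        mac_bytes(sha), socket.inet_aton(spa),   # sender addresses
        mac_bytes(tha), socket.inet_aton(tpa),   # target addresses
    )
    return eth_header + arp_packet


# Broadcast request from tristan: the target link layer address is unknown.
request = arp_frame(ARP_REQUEST, BROADCAST,
                    "0:80:c8:f8:4a:51", "192.168.99.35",
                    "0:0:0:0:0:0", "192.168.99.254")

# Unicast reply from masq-gw, addressed to the requestor.
reply = arp_frame(ARP_REPLY, "0:80:c8:f8:4a:51",
                  "0:80:c8:f8:5c:73", "192.168.99.254",
                  "0:80:c8:f8:4a:51", "192.168.99.35")

print(len(request))        # 42
print(request[:6].hex())   # ffffffffffff
```

The unpadded frame is 42 bytes (14 bytes of Ethernet header plus 28 bytes of ARP), which matches the length tcpdump reports for the request in Example 2.1. Short frames are padded to the Ethernet minimum on the wire, which is likely why the reply is reported as 60 bytes.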

A less common usage of ARP is gratuitous ARP, where a machine announces its ownership of an IP address on a media segment. The arping utility can generate these gratuitous ARP frames. Linux kernels will respect gratuitous ARP frames ^[12].

Example 2.2. Gratuitous ARP reply frames

[root@tristan]# arping -q -c 3 -A -I eth0 192.168.99.35
[root@masq-gw]# tcpdump -c 3 -nni eth2 arp
tcpdump: listening on eth2
06:02:50.626330 arp reply 192.168.99.35 is-at 0:80:c8:f8:4a:51 (0:80:c8:f8:4a:51) 
06:02:51.622727 arp reply 192.168.99.35 is-at 0:80:c8:f8:4a:51 (0:80:c8:f8:4a:51) 
06:02:52.620954 arp reply 192.168.99.35 is-at 0:80:c8:f8:4a:51 (0:80:c8:f8:4a:51)

The frames generated in Example 2.2, “Gratuitous ARP reply frames” are ARP replies to a question never asked. This sort of ARP is common in failover solutions and is also used for nefarious purposes, for example by ARP spoofing tools such as ettercap.

Unsolicited ARP request frames, on the other hand, are broadcast ARP requests initiated by a host owning an IP address.

Example 2.3. Unsolicited ARP request frames

[root@tristan]# arping -q -c 3 -U -I eth0 192.168.99.35
[root@masq-gw]# tcpdump -c 3 -nni eth2 arp
tcpdump: listening on eth2
06:28:23.172068 arp who-has 192.168.99.35 (ff:ff:ff:ff:ff:ff) tell 192.168.99.35
06:28:24.167290 arp who-has 192.168.99.35 (ff:ff:ff:ff:ff:ff) tell 192.168.99.35
06:28:25.167250 arp who-has 192.168.99.35 (ff:ff:ff:ff:ff:ff) tell 192.168.99.35
[root@masq-gw]# ip neigh show

These two uses of arping can help diagnose Ethernet and ARP problems--particularly hosts replying for addresses which do not belong to them.
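
The two kinds of announcement differ from ordinary ARP traffic in only a couple of header fields: the sender and target IP address are the same, and the opcode decides whether the frame is a gratuitous reply or an unsolicited request. The following sketch classifies packets on that basis. The ArpPacket type and classify function are hypothetical names for illustration, not part of any tool mentioned here.

```python
from dataclasses import dataclass

ARP_REQUEST, ARP_REPLY = 1, 2


@dataclass
class ArpPacket:
    op: int          # 1 = request, 2 = reply
    sender_ip: str
    target_ip: str


def classify(pkt):
    """Label a packet using only its opcode and the two IP address fields."""
    announces_self = pkt.sender_ip == pkt.target_ip
    if pkt.op == ARP_REPLY:
        return "gratuitous reply" if announces_self else "ordinary reply"
    if pkt.op == ARP_REQUEST:
        return "unsolicited request" if announces_self else "ordinary request"
    return "unknown"


# As generated by arping -A (Example 2.2) and arping -U (Example 2.3).
print(classify(ArpPacket(ARP_REPLY, "192.168.99.35", "192.168.99.35")))
print(classify(ArpPacket(ARP_REQUEST, "192.168.99.35", "192.168.99.35")))
# The ordinary request from Example 2.1.
print(classify(ArpPacket(ARP_REQUEST, "192.168.99.35", "192.168.99.254")))
```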

To avoid IP address collisions on dynamic networks (where hosts are turning on and off, connecting and disconnecting and otherwise changing IP addresses) duplicate address detection becomes important. Fortunately, arping provides this functionality as well. A startup script could include the arping utility in duplicate address detection mode to select between IP addresses or methods of acquiring an IP address.

Example 2.4. Duplicate Address Detection with ARP

[root@tristan]# arping -D -I eth0 192.168.99.147; echo $?
ARPING 192.168.99.147 from 0.0.0.0 eth0
Unicast reply from 192.168.99.147 [00:80:C8:E8:1E:FC] for 192.168.99.147 [00:80:C8:E8:1E:FC] 0.702ms
Sent 1 probes (1 broadcast(s))
Received 1 response(s)
1
[root@tristan]# tcpdump -eqtnni eth2 arp
tcpdump: listening on eth2
0:80:c8:f8:4a:51 ff:ff:ff:ff:ff:ff 60: arp who-has 192.168.99.147 (ff:ff:ff:ff:ff:ff) tell 0.0.0.0
0:80:c8:e8:1e:fc 0:80:c8:f8:4a:51 42: arp reply 192.168.99.147 is-at 0:80:c8:e8:1e:fc (0:80:c8:e8:1e:fc)
[root@masq-gw]# ip neigh show
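
A startup script along the lines suggested above might be sketched as follows. It relies on the exit status shown in Example 2.4: arping -D exits with 0 when no reply is received and with a non-zero status when another host answers. The function names, the candidate addresses, and the probe count and timeout are assumptions for illustration. Running the real probe requires root privileges and the arping utility.

```python
import subprocess


def arping_probe(ip, dev, count=2, timeout=3):
    """Return True if no other host answered for ip (arping -D exited 0)."""
    result = subprocess.run(
        ["arping", "-q", "-D", "-c", str(count), "-w", str(timeout),
         "-I", dev, ip],
        check=False,
    )
    # Any non-zero status (a reply was seen, or arping failed) counts as
    # "not safe to use".
    return result.returncode == 0


def pick_free_address(candidates, dev, probe=arping_probe):
    """Return the first candidate address nobody answers for, or None."""
    for ip in candidates:
        if probe(ip, dev):
            return ip
    return None


# Example call on a real host (hypothetical candidate list):
#   pick_free_address(["192.168.99.147", "192.168.99.148"], "eth0")
```

The probe is passed in as a parameter so that the selection logic can be exercised without touching the network, for example with a stand-in function that consults a list of known addresses.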

Address Resolution Protocol, which provides a method to connect physical network addresses with logical network addresses, is a key element in the deployment of IP on Ethernet networks.

2.1.2. The ARP cache

In simplest terms, an ARP cache is a stored mapping of IP addresses to link layer addresses. An ARP cache obviates the need for an ARP request/reply conversation for each IP packet exchanged. Naturally, this efficiency comes with a price. Each host maintains its own ARP cache, which can become outdated when a host is replaced, or an IP address moves from one host to another. The ARP cache is also known as the neighbor table.

To display the ARP cache, the venerable and cross-platform arp admirably dispatches its duty. As with many of the iproute2 tools, more information is available via ip neighbor than with arp. Example 2.5, “ARP cache listings with arp and ip neighbor” below illustrates the differences between the output of these two tools.

Example 2.5. ARP cache listings with arp and ip neighbor

[root@tristan]# arp -na
? (192.168.99.7) at 00:80:C8:E8:1E:FC [ether] on eth0
? (192.168.99.254) at 00:80:C8:F8:5C:73 [ether] on eth0
[root@tristan]# ip neighbor show
192.168.99.7 dev eth0 lladdr 00:80:c8:e8:1e:fc nud reachable
192.168.99.254 dev eth0 lladdr 00:80:c8:f8:5c:73 nud reachable

A major difference between the information reported by ip neighbor and arp is the state of the proxy ARP table. The only way to list permanently advertised entries in the neighbor table (proxy ARP entries) is with arp.
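
The ip neighbor listing in Example 2.5 is regular enough to be read by a script. The sketch below assumes exactly the line format shown in that example, with the state introduced by the nud keyword. Other iproute2 releases may print the state differently, so treat the parser as illustrative. The function name parse_ip_neighbor is hypothetical.

```python
def parse_ip_neighbor(text):
    """Parse 'ip neighbor show' lines of the form used in Example 2.5."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        entry = {"ip": fields[0], "dev": None, "lladdr": None, "state": None}
        rest = fields[1:]
        i = 0
        while i < len(rest):
            # Each known keyword is followed by its value.
            if rest[i] in ("dev", "lladdr", "nud") and i + 1 < len(rest):
                key = "state" if rest[i] == "nud" else rest[i]
                entry[key] = rest[i + 1]
                i += 2
            else:
                i += 1
        entries.append(entry)
    return entries


sample = """\
192.168.99.7 dev eth0 lladdr 00:80:c8:e8:1e:fc nud reachable
192.168.99.254 dev eth0 lladdr 00:80:c8:f8:5c:73 nud reachable
"""

for entry in parse_ip_neighbor(sample):
    print(entry["ip"], entry["lladdr"], entry["state"])
```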

Entries in the ARP cache are periodically and automatically verified unless continually used. Along with net/ipv4/neigh/$DEV/gc_stale_time, there are a number of other parameters in net/ipv4/neigh/$DEV which control the expiration of entries in the ARP cache.
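
The current values of these parameters can be read straight from the proc filesystem. This is a small sketch that assumes the sysctl tree is mounted under /proc/sys; the function name read_neigh_params is hypothetical, and the base directory is a parameter only so that the function can be tried against a scratch directory.

```python
from pathlib import Path


def read_neigh_params(dev, names, base="/proc/sys/net/ipv4/neigh"):
    """Read integer neighbor table parameters for one device."""
    params = {}
    for name in names:
        path = Path(base) / dev / name
        try:
            params[name] = int(path.read_text().strip())
        except (OSError, ValueError):
            params[name] = None  # missing file or unexpected content
    return params


if __name__ == "__main__":
    # "default" holds the values inherited by newly created devices.
    print(read_neigh_params(
        "default", ["gc_stale_time", "ucast_solicit", "mcast_solicit"]))
```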

When a host is down or disconnected from the Ethernet, there is a period of time during which other hosts may have an ARP cache entry for the disconnected host. Any other machine may display a neighbor table with the link layer address of the recently disconnected host. Because there is a recently known-good link layer address on which the IP was reachable, the entry will abide. At gc_stale_time the state of the entry will change, reflecting the need to verify the reachability of the link layer address. When the disconnected host fails to respond to ARP requests, the neighbor table entry will be marked as incomplete.

Here are the possible states for entries in the neighbor table.

Table 2.1. Active ARP cache entry states

ARP cache entry state	meaning	action if used
permanent	never expires; never verified	reset use counter
noarp	normal expiration; never verified	reset use counter
reachable	normal expiration	reset use counter
stale	still usable; needs verification	reset use counter; change state to delay
delay	schedule ARP request; needs verification	reset use counter
probe	sending ARP request	reset use counter
incomplete	first ARP request sent	send ARP request
failed	no response received	send ARP request

To resume the example, a host (192.168.99.7) in tristan's ARP cache on the example network has just been disconnected. A series of events will occur as tristan's ARP cache entry for 192.168.99.7 expires and gets scheduled for verification. Imagine that the following commands are run to capture each of these states immediately before the state changes.

Example 2.6. ARP cache timeout

[root@tristan]# ip neighbor show 192.168.99.7
192.168.99.7 dev eth0 lladdr 00:80:c8:e8:1e:fc nud reachable     
[root@tristan]# ip neighbor show 192.168.99.7
192.168.99.7 dev eth0 lladdr 00:80:c8:e8:1e:fc nud stale         
[root@tristan]# ip neighbor show 192.168.99.7
192.168.99.7 dev eth0 lladdr 00:80:c8:e8:1e:fc nud delay         
[root@tristan]# ip neighbor show 192.168.99.7
192.168.99.7 dev eth0 lladdr 00:80:c8:e8:1e:fc nud probe         
[root@tristan]# ip neighbor show 192.168.99.7
192.168.99.7 dev eth0  nud incomplete

	The entry for 192.168.99.7 has not yet expired, although the host has already been disconnected from the network. During this time, `tristan` will continue to send out Ethernet frames with the destination frame address set to the link layer address according to this entry.
	It has been `gc_stale_time` seconds since the entry was last verified, so the state has changed to stale.
	This entry in the neighbor table has been requested. Because the entry was in a stale state, the link layer address was used, but now the kernel needs to verify the accuracy of the address. The kernel will soon send an ARP request for the destination IP address.
	The kernel is actively performing address resolution for the entry. It will send a total of `ucast_solicit` frames to the last known link layer address to attempt to verify reachability of the address. Failing this, it will send `mcast_solicit` broadcast frames before altering the ARP cache state and returning an error to any higher layer services.
	After all attempts to reach the destination address have failed, the entry will appear in the neighbor table in this state.
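
The sequence of states in Example 2.6 can be summarized as a small transition table. This is a toy model of the walk-through above and of Table 2.1, not kernel code: the event names are hypothetical labels, and the final state follows what Example 2.6 displays.

```python
# Toy walk through the neighbor states listed in Example 2.6.
# Event names are hypothetical labels, not kernel identifiers.
TRANSITIONS = {
    ("reachable", "stale_timer"): "stale",        # gc_stale_time elapsed
    ("stale", "used"): "delay",                   # entry used while stale
    ("delay", "delay_timer"): "probe",            # kernel starts probing
    ("probe", "probes_exhausted"): "incomplete",  # as shown in Example 2.6
}

NEVER_VERIFIED = ("permanent", "noarp")


def next_state(state, event):
    """Return the state that follows `state` when `event` occurs."""
    if state in NEVER_VERIFIED:
        return state
    if event == "reply":
        return "reachable"  # a valid ARP reply confirms the entry
    return TRANSITIONS.get((state, event), state)


def walk(state, events):
    """Apply events in order and return every state visited."""
    history = [state]
    for event in events:
        state = next_state(state, event)
        history.append(state)
    return history


def max_probes(ucast_solicit, mcast_solicit):
    """Unicast probes first, then broadcast probes, before giving up."""
    return ucast_solicit + mcast_solicit


print(walk("reachable",
           ["stale_timer", "used", "delay_timer", "probes_exhausted"]))
```

Had the disconnected host answered any of the probes, the reply event would have returned the entry to the reachable state instead.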

The remaining neighbor table states are visible when initial ARP requests are made. If no ARP cache entry exists for a requested destination IP, the kernel will generate up to mcast_solicit ARP requests until receiving an answer. During this discovery period, the ARP cache entry will be listed in an incomplete state. If the lookup does not succeed after the specified number of ARP requests, the ARP cache entry will be listed in a failed state. If the lookup does succeed, that is, after receipt of a corresponding ARP reply, the kernel enters the response into the ARP cache and resets the confirmation and update timers.

For machines not using a static mapping for link layer and IP addresses, ARP provides on-demand mappings. The remainder of this section will cover the methods available under Linux to control the address resolution protocol.

2.1.3. ARP Suppression

Complete ARP suppression is not difficult at all. ARP suppression can be accomplished under Linux on a per-interface basis by setting the noarp flag on any Ethernet interface. Disabling ARP will require static neighbor table mappings for all hosts wishing to exchange packets across the Ethernet.

To suppress ARP on an interface simply use ip link set dev $DEV arp off as in Example B.7, “Using ip link set to change device flags” or ifconfig $DEV -arp as in Example C.5, “Setting interface flags with ifconfig”. Complete ARP suppression will prevent the host from sending any ARP requests or responding with any ARP replies.
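
Because a suppressed interface needs a static mapping for every peer, it can help to generate the commands from a single table. The sketch below only builds the command lines and runs nothing. The function name static_arp_commands is hypothetical, and the two mappings are taken from the example network for illustration.

```python
def static_arp_commands(dev, neighbors):
    """Build commands that pin static neighbor entries, then disable ARP."""
    commands = []
    for ip, lladdr in sorted(neighbors.items()):
        commands.append(
            ["ip", "neighbor", "add", ip, "lladdr", lladdr,
             "nud", "permanent", "dev", dev]
        )
    # Disable ARP last, so the static entries are in place beforehand.
    commands.append(["ip", "link", "set", "dev", dev, "arp", "off"])
    return commands


peers = {
    "192.168.99.254": "00:80:c8:f8:5c:73",
    "192.168.99.7": "00:80:c8:e8:1e:fc",
}

for command in static_arp_commands("eth0", peers):
    print(" ".join(command))
```

Every other host on the segment needs a matching static entry for this machine, since it will no longer answer their ARP requests.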

2.1.4. The ARP Flux Problem

When a Linux machine is connected to a network segment with multiple network cards, a potential problem with the link layer address to IP address mapping can occur. The machine may respond to ARP requests on both Ethernet interfaces. On the machine creating the ARP request, these multiple answers can cause confusion, or worse yet, non-deterministic population of the ARP cache. Known as ARP flux ^[13], this can lead to the possibly puzzling effect that an IP migrates non-deterministically through multiple link layer addresses. It's important to understand that ARP flux typically only affects hosts which have multiple physical connections to the same medium or broadcast domain.

This is a simple illustration of the problem in a network where a server has two Ethernet adapters connected to the same media segment. They need not have IP addresses in the same IP network for the ARP reply to be generated by each interface. Note the first two replies received in response to the ARP broadcast request. These replies arrive from conflicting link layer addresses in response to this request. Also notice the greater time required for the sending and receiving hosts to process the broadcast ARP request frames than the unicast frames which follow (probes two and three).

Example 2.7. ARP flux

[root@real-client]# arping -I eth0 -c 3 10.10.20.67
ARPING 10.10.20.67 from 10.10.20.33 eth0
Unicast reply from 10.10.20.67 [00:80:C8:7E:71:D4]  11.298ms
Unicast reply from 10.10.20.67 [00:80:C8:E8:1E:FC]  12.077ms
Unicast reply from 10.10.20.67 [00:80:C8:E8:1E:FC]  1.542ms
Unicast reply from 10.10.20.67 [00:80:C8:E8:1E:FC]  1.547ms
Sent 3 probes (1 broadcast(s))
Received 4 response(s)
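
Spotting this symptom can be automated by collecting the link layer addresses that answer for each IP address. The sketch below parses arping output in the format shown in Example 2.7; the function names and the regular expression are illustrative assumptions, and other arping implementations may format their replies differently.

```python
import re

# Matches "... reply from <ip> [<mac>]" as printed in Example 2.7.
REPLY_RE = re.compile(r"reply from (\S+) \[([0-9A-Fa-f:]+)\]")


def macs_by_ip(arping_output):
    """Map each answering IP address to the set of MACs that replied."""
    seen = {}
    for ip, mac in REPLY_RE.findall(arping_output):
        seen.setdefault(ip, set()).add(mac.lower())
    return seen


def flux_suspects(arping_output):
    """Return only the IPs answered from more than one link layer address."""
    return {ip: macs for ip, macs in macs_by_ip(arping_output).items()
            if len(macs) > 1}


output = """\
ARPING 10.10.20.67 from 10.10.20.33 eth0
Unicast reply from 10.10.20.67 [00:80:C8:7E:71:D4]  11.298ms
Unicast reply from 10.10.20.67 [00:80:C8:E8:1E:FC]  12.077ms
Unicast reply from 10.10.20.67 [00:80:C8:E8:1E:FC]  1.542ms
Unicast reply from 10.10.20.67 [00:80:C8:E8:1E:FC]  1.547ms
"""

for ip, macs in flux_suspects(output).items():
    print(ip, sorted(macs))
```

More than one address for a single IP is only a hint. The same symptom appears when two different hosts claim the same IP address.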

There are four solutions to this problem. The common solution for kernel 2.4 harnesses the arp_filter sysctl, while the common solution for kernel 2.2 takes advantage of the hidden sysctl. These two solutions alter the behaviour of ARP on a per interface basis and only if the functionality has been enabled.

Alternate solutions which provide much greater control of ARP (possibly documented here at a later date) include Julian Anastasov's ip arp tool and his noarp route flag. While these tools were conceived in the course of the Linux Virtual Server project, they have practical application outside this realm.

2.1.4.1. ARP flux prevention with `arp_filter`

One method for preventing ARP flux involves the use of net/ipv4/conf/$DEV/arp_filter. In short, the use of arp_filter causes the recipient (in the case below, real-server) to perform a route lookup to determine the interface through which to send the reply, instead of the default behaviour (shown above), replying from all Ethernet interfaces which receive the request.

The arp_filter solution can have unintended effects if the only route to the destination is through one of the network cards. In Example 2.8, “Correction of ARP flux with conf/$DEV/arp_filter”, real-client will demonstrate this. This instructive example should highlight the shortcomings of the arp_filter solution in very complex networks where finer-grained control is required.

In general, the arp_filter solution sufficiently solves the ARP flux problem. First, hosts do not generate ARP requests for networks to which they do not have a direct route (see Section 4.2, “Routing to Locally Connected Networks”) and second, when such a route exists, the host normally chooses a source address in the same network as the destination. So, the arp_filter solution is a good general solution, but does not adequately address the occasional need for more control over ARP requests and replies.

Example 2.8. Correction of ARP flux with conf/$DEV/arp_filter

[root@real-server]# echo 1 > /proc/sys/net/ipv4/conf/all/arp_filter
[root@real-server]# echo 1 > /proc/sys/net/ipv4/conf/eth0/arp_filter
[root@real-server]# echo 1 > /proc/sys/net/ipv4/conf/eth1/arp_filter
[root@real-server]# ip address show dev eth0
2: eth0: <BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast qlen 100
    link/ether 00:80:c8:e8:1e:fc brd ff:ff:ff:ff:ff:ff
    inet 10.10.20.67/24 scope global eth0
[root@real-server]# ip address show dev eth1
3: eth1: <BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast qlen 100
    link/ether 00:80:c8:7e:71:d4 brd ff:ff:ff:ff:ff:ff
    inet 192.168.100.1/24 brd 192.168.100.255 scope global eth1    
[root@real-client]# arping -I eth0 -c 3 10.10.20.67
ARPING 10.10.20.67 from 10.10.20.33 eth0
Unicast reply from 10.10.20.67 [00:80:C8:E8:1E:FC]  0.882ms
Unicast reply from 10.10.20.67 [00:80:C8:E8:1E:FC]  1.221ms
Unicast reply from 10.10.20.67 [00:80:C8:E8:1E:FC]  1.487ms        
Sent 3 probes (1 broadcast(s))
Received 3 response(s)
[root@real-client]# arping -I eth0 -c 3 192.168.100.1
ARPING 192.168.100.1 from 10.10.20.33 eth0
Unicast reply from 192.168.100.1 [00:80:C8:E8:1E:FC]  0.877ms
Unicast reply from 192.168.100.1 [00:80:C8:E8:1E:FC]  1.517ms
Unicast reply from 192.168.100.1 [00:80:C8:E8:1E:FC]  1.661ms      
Sent 3 probes (1 broadcast(s))
Received 3 response(s)
[root@real-client]# ip neighbor del 192.168.100.1 dev eth0         
[root@real-client]# ip address add 192.168.100.2/24 brd + dev eth0 
[root@real-client]# arping -I eth0 -c 3 192.168.100.1
ARPING 192.168.100.1 from 192.168.100.2 eth0
Unicast reply from 192.168.100.1 [00:80:C8:7E:71:D4]  0.804ms
Unicast reply from 192.168.100.1 [00:80:C8:7E:71:D4]  1.381ms
Unicast reply from 192.168.100.1 [00:80:C8:7E:71:D4]  2.487ms      
Sent 3 probes (1 broadcast(s))
Received 3 response(s)

	Set the sysctl variables to enable the `arp_filter` functionality. After this, you might expect that ARP replies for 10.10.20.67 would only advertise the link layer address on eth0 (00:80:c8:e8:1e:fc).
	Here is the expected behaviour. Only one reply comes in for the IP 10.10.20.67 after the `arp_filter` sysctl has been enabled. The reply originates from the interface on `real-server` which actually hosts the IP address. Note that the source address on the ARP queries is 10.10.20.33, and that the ARP query causes `real-server` to perform a route lookup on 10.10.20.33 to choose an interface from which to send the reply.
	Here, `real-client` requests the link layer address of the host 192.168.100.1, but the source IP on the request packet (chosen according to the rules for source address selection) is 10.10.20.33. When `real-server` looks up a route to this destination, it chooses its eth0, and replies with the link layer address of its eth0. Conventional networking needs should not run afoul of this oddity of the `arp_filter` ARP flux prevention technique.
	Remove the entry in the neighbor table before testing again.
	By adding an IP address in the same network as the intended destination (which would be rather common where multiple IP networks share the same medium or broadcast domain), the kernel can now select a different source address for the ARP request packets.
	Note the source address of the ARP queries is now 192.168.100.2. When `real-server` performs a route lookup for the 192.168.100.0/24 destination, the chosen path is through eth1. The ARP reply packets now have the correct link layer address.

In general, the arp_filter solution should suffice, but this knowledge can be key in determining whether or not an alternate solution, such as an ARP filtering solution, is necessary.
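
The decision made by real-server in Example 2.8 can be modelled in a few lines. This is a simplified sketch of the behaviour described in this section, not the kernel's implementation: it assumes the requested IP is hosted on the machine, and it reduces the routing table to the two connected networks of real-server. The function names are hypothetical.

```python
import ipaddress


def route_lookup(routes, dst):
    """Longest-prefix match; returns the outgoing device or None."""
    dst = ipaddress.ip_address(dst)
    best = None
    for prefix, dev in routes:
        net = ipaddress.ip_network(prefix)
        if dst in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, dev)
    return best[1] if best else None


def replying_interfaces(routes, receiving_devs, sender_ip, arp_filter=True):
    """Which of the interfaces that saw the request will answer it?"""
    if not arp_filter:
        return list(receiving_devs)  # default: every receiving interface
    out_dev = route_lookup(routes, sender_ip)
    return [dev for dev in receiving_devs if dev == out_dev]


# Connected routes on real-server, from the addresses in Example 2.8.
routes = [("10.10.20.0/24", "eth0"), ("192.168.100.0/24", "eth1")]
both = ["eth0", "eth1"]  # both cards share the medium and see the broadcast

print(replying_interfaces(routes, both, "10.10.20.33"))    # ['eth0']
print(replying_interfaces(routes, both, "192.168.100.2"))  # ['eth1']
print(replying_interfaces(routes, both, "10.10.20.33", arp_filter=False))
```

The first call reproduces the oddity in the example: a request for 192.168.100.1 sent from 10.10.20.33 is answered with the link layer address of eth0, because the route back to the sender leaves through eth0.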

2.1.4.2. ARP flux prevention with `hidden`

The ARP flux problem can also be combatted with a kernel patch by Julian Anastasov, which was incorporated into the 2.2.14+ kernel series, but never into the 2.4+ kernel series. Therefore, the functionality may not be available in all kernels.

The sysctl net/ipv4/conf/$DEV/hidden toggles the generation of ARP replies for requested IPs. It marks an interface and all of its IP addresses invisible to other interfaces for the purpose of ARP requests. By default, when an ARP request arrives on any interface, the kernel tests to see if the IP address is locally hosted anywhere on the machine. If the IP is found on any interface, the kernel will generate a reply.

Since this is not always desirable, the hidden sysctl can be employed. This prevents the kernel from finding the IP address when testing to see what IP addresses are locally hosted. The kernel can always find IPs hosted on the interface on which the packet arrived, but it cannot find addresses which are hidden.
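
The rule just described can be expressed as a short predicate. This is a simplified model of the behaviour as described here, not the patch itself. The function name, the interface names, and the addresses are hypothetical, loosely following a firewall with a public eth0 and an internal eth1.

```python
def answers_arp(in_dev, target_ip, addresses, hidden):
    """Would the host reply to a request for target_ip arriving on in_dev?

    addresses maps each interface to the set of IPs it hosts;
    hidden is the set of interfaces with the hidden sysctl enabled.
    """
    for dev, ips in addresses.items():
        if target_ip not in ips:
            continue
        # Addresses on the arrival interface are always found; addresses on
        # other interfaces are found only if that interface is not hidden.
        if dev == in_dev or dev not in hidden:
            return True
    return False


# Hypothetical firewall: public address on eth0, internal address on eth1.
addresses = {"eth0": {"192.0.2.1"}, "eth1": {"192.168.1.1"}}

# Default behaviour: the internal address is answered on the public side.
print(answers_arp("eth0", "192.168.1.1", addresses, hidden=set()))
# With eth1 hidden, only requests arriving on eth1 are answered.
print(answers_arp("eth0", "192.168.1.1", addresses, hidden={"eth1"}))
print(answers_arp("eth1", "192.168.1.1", addresses, hidden={"eth1"}))
```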

As shown in Example 2.9, “Correction of ARP flux with net/$DEV/hidden”, not only can ARP flux be corrected, but sensitive information about the IP addresses available on a Linux machine can be safeguarded ^[14]. This makes the hidden sysctl useful for preventing unwanted IP disclosure via ARP on multi-homed hosts, in addition to preventing ARP flux on hosts connected to the same network medium.

Example 2.9. Correction of ARP flux with net/$DEV/hidden

[root@real-client]# arping -I eth0 -c 1 172.19.22.254
ARPING 172.19.22.254 from 172.19.22.2 eth0
Unicast reply from 172.19.22.254 [00:60:F5:08:8A:2D]  0.704ms
Unicast reply from 172.19.22.254 [00:60:F5:08:8A:2E]  0.844ms
Unicast reply from 172.19.22.254 [00:60:F5:08:8A:2F]  0.918ms
Unicast reply from 172.19.22.254 [00:60:F5:08:8A:2C]  0.974ms
Sent 1 probes (1 broadcast(s))
Received 4 response(s)
[root@real-server]# for i in all eth2 eth3 eth4 eth5 ; do
> echo 1 > /proc/sys/net/ipv4/conf/$i/hidden
> done
[root@real-client]# arping -I eth0 -c 2 172.19.22.254
ARPING 172.19.22.254 from 172.19.22.2 eth0
Unicast reply from 172.19.22.254 [00:60:F5:08:8A:2D]  0.710ms
Unicast reply from 172.19.22.254 [00:60:F5:08:8A:2D]  0.624ms
Sent 2 probes (1 broadcast(s))
Received 2 response(s)

These are two examples of methods to prevent ARP flux. Other alternatives for correcting this problem are documented in Section 2.3, “ARP filtering”, where much more sophisticated tools are available for manipulation and control over the ARP functions of Linux.

^[9] Some networking equipment vendors have built devices which are sold as high performance switches and are capable of performing operations on higher layer contents of Ethernet frames. Typically, however, a switching device is not capable of operating on IP packets.

^[10] The kernel uses the Ethernet broadcast address configured on the link layer device. This is rarely anything but ff:ff:ff:ff:ff:ff. In the extraordinary event that this is not the Ethernet broadcast address in your network, see Section B.3.7, “Changing hardware or Ethernet broadcast address with ip link set”.

^[11] tcpdump is one of a number of utilities for watching packets visible to an interface. For further introduction to tcpdump, see Section G.5, “tcpdump”.

^[12] I have repeatedly tested using arping in gratuitous ARP mode, and have found that linux kernels appear to respect gratuitous ARP. This is a surprise. Does anybody have ideas about this? Must research!

^[13] I have seen it called names other than ARP flux--anybody out there heard of this called anything besides ARP flux?

^[14] Consider a masquerading firewall which answers ARP requests on a public segment for IPs hosted on an internal interface. This amounts to inadvertent exposure of internal addressing, and can be used by an attacker as part of a data-gathering or reconnaissance operation on a network.