Sun Microsystems, Inc.
27.  IP Network Multipathing (Overview)
   
 

Removing Network Adapters From Multipathing Groups

When you run the ifconfig command with the group parameter set to a null string, the interface is removed from its existing group. See "How to Remove an Interface From a Group". Be careful when removing interfaces from a group. If some other interface in the multipathing group has failed, a failover might already have happened. For example, if hme0 failed previously, all of its addresses were failed over to hme1 (if hme1 is part of the same group). If hme1 is then removed from the group, in.mpathd returns all the failover addresses to some other interface in the group. If no other interface in the group is functioning, failover might not restore all network access.
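The caution above can be expressed as a small pre-check. The following Python sketch is illustrative only: the `group_state` mapping and both function names are hypothetical, and the argument list simply mirrors the ifconfig group parameter with a null string as described in the text. It does not run any command.

```python
def removal_argv(ifname):
    """Build the argument list for removing `ifname` from its group.

    Mirrors the form described in the text: the group parameter
    followed by a null (empty) string. Nothing is executed here.
    """
    return ["ifconfig", ifname, "group", ""]


def safe_to_remove(ifname, group_state):
    """Return True if some other interface in the group is functioning.

    group_state is a hypothetical mapping of interface name -> bool,
    where True means the interface is currently functioning.
    """
    return any(ok for name, ok in group_state.items() if name != ifname)


# hme0 failed earlier, so its addresses are currently hosted on hme1.
state = {"hme0": False, "hme1": True}

print(removal_argv("hme1"))
print(safe_to_remove("hme1", state))  # False: only the failed hme0 would remain
print(safe_to_remove("hme0", state))  # True: hme1 is still functioning
```

A check like this only flags the risky case. It does not replace verifying the actual interface state before removal.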

Similarly, when an interface is part of the group and the interface needs to be unplumbed, you should remove the interface from the group first. Then ensure that the interface has all the original IP addresses configured on it. The in.mpathd daemon tries to restore the original configuration of an interface that is removed from the group. You need to ensure that the configuration is restored before unplumbing the interface. Refer to "Multipathing Daemon" to see how interfaces look before and after a failover.

Detached Network Adapters

Dynamic Reconfiguration (DR) uses IP Network Multipathing to decommission a specific network device without impacting existing IP users. Before a NIC is DR-detached (taken offline), all failover IP addresses that are hosted on that NIC are automatically failed over to another NIC in the same IP Network Multipathing group. The test addresses are then brought down and the NIC is unplumbed.

With the IP Network Multipathing reboot-safe feature, the static IP addresses in the /etc/hostname.* file that are associated with the missing card are hosted automatically on an alternate interface within the same IP Network Multipathing group. However, these addresses are returned to the original interface automatically if the original interface is inserted back into the system at a later time.

Multipathing Daemon

The in.mpathd multipathing daemon detects failures and repairs in two ways: by sending out probes on all the interfaces that are part of a group, and by monitoring the RUNNING flag on each interface in the group. When an interface is part of a group and has a test address, the daemon starts sending out probes to determine failures on that interface. If the daemon does not receive any replies to five consecutive probes, or if the RUNNING flag is not set, the interface is considered to have failed.

The probing rate depends on the failure detection time. By default, the failure detection time is 10 seconds. Thus, the probing rate is approximately one probe every two seconds. To avoid synchronization in the network, probing is not strictly periodic.

When in.mpathd considers an interface to have failed, it performs a failover of the network access from the failed interface to another interface in the group that is functioning properly. If a standby interface is configured, it is chosen for failover of the IP addresses, broadcasts, and multicast memberships. If no standby interface exists, the interface with the least number of IP addresses is chosen. Refer to the in.mpathd(1M) man page for more information.
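The arithmetic and the selection rule described above can be sketched in a few lines of Python. This is a simplified model, not the daemon's implementation: the dictionary keys and function names are hypothetical, the randomization of probe timing is not modeled, and the tie-breaking among several standby interfaces (fewest addresses) is an assumption.

```python
PROBES_PER_DETECTION = 5  # consecutive unanswered probes that mark a failure


def probe_interval_ms(failure_detection_time_ms=10000):
    """Average time between probes for a given failure detection time."""
    return failure_detection_time_ms / PROBES_PER_DETECTION


def choose_failover_target(candidates):
    """Pick the interface that takes over the failed interface's addresses.

    candidates: list of dicts with the hypothetical keys
    'name', 'functioning', 'standby', and 'address_count'.
    A functioning standby interface is preferred; otherwise the
    functioning interface with the fewest IP addresses is chosen.
    Returns None if no interface in the group is functioning.
    """
    working = [c for c in candidates if c["functioning"]]
    if not working:
        return None
    standbys = [c for c in working if c["standby"]]
    pool = standbys if standbys else working
    return min(pool, key=lambda c: c["address_count"])["name"]


group = [
    {"name": "hme1", "functioning": True, "standby": False, "address_count": 2},
    {"name": "hme2", "functioning": True, "standby": False, "address_count": 1},
]

print(probe_interval_ms())            # 2000.0 ms for the 10-second default
print(choose_failover_target(group))  # hme2: fewest addresses, no standby
```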

The following two examples show a typical configuration and how the configuration automatically changes when an interface fails. When the hme0 interface fails, notice that all addresses move from hme0 to hme1.


Example 27-1 Interface Configuration Before an Interface Failure

hme0: flags=9000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
     inet 19.16.85.19 netmask ffffff00 broadcast 19.16.85.255
     groupname test
hme0:1: flags=9000843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,NOFAILOVER> mtu 1500 index 2
     inet 19.16.85.21 netmask ffffff00 broadcast 19.16.85.255
hme1: flags=9000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
     inet 19.16.85.20 netmask ffffff00 broadcast 19.16.85.255
     groupname test
hme1:1: flags=9000843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,NOFAILOVER> mtu 1500 index 2
     inet 19.16.85.22 netmask ffffff00 broadcast 19.16.85.255
hme0: flags=a000841<UP,RUNNING,MULTICAST,IPv6,NOFAILOVER> mtu 1500 index 2
     inet6 fe80::a00:20ff:feb9:19fa/10
     groupname test
hme1: flags=a000841<UP,RUNNING,MULTICAST,IPv6,NOFAILOVER> mtu 1500 index 2
     inet6 fe80::a00:20ff:feb9:1bfc/10
     groupname test


Example 27-2 Interface Configuration After an Interface Failure

hme0: flags=19000842<BROADCAST,RUNNING,MULTICAST,IPv4,NOFAILOVER,FAILED> mtu 0 index 2
        inet 0.0.0.0 netmask 0 
        groupname test
hme0:1: flags=19040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,NOFAILOVER,FAILED> mtu 1500 index 2
        inet 19.16.85.21 netmask ffffff00 broadcast 19.16.85.255
hme1: flags=9000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
        inet 19.16.85.20 netmask ffffff00 broadcast 19.16.85.255
        groupname test
hme1:1: flags=9000843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,NOFAILOVER> mtu 1500 index 2
        inet 19.16.85.22 netmask ffffff00 broadcast 19.16.85.255
hme1:2: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 6
        inet 19.16.85.19 netmask ffffff00 broadcast 19.16.85.255
hme0: flags=a000841<UP,RUNNING,MULTICAST,IPv6,NOFAILOVER,FAILED> mtu 1500 index 2
        inet6 fe80::a00:20ff:feb9:19fa/10 
        groupname test
hme1: flags=a000841<UP,RUNNING,MULTICAST,IPv6,NOFAILOVER> mtu 1500 index 2
        inet6 fe80::a00:20ff:feb9:1bfc/10 
        groupname test

You can see that the FAILED flag is set on hme0 to indicate that hme0 has failed. You can also see that hme1:2 has been created. hme1:2 hosts the address 19.16.85.19, which was originally configured on hme0, so that address is now accessible through hme1. Multicast memberships that are associated with 19.16.85.19 can still receive packets, but now through hme1. When the failover of address 19.16.85.19 from hme0 to hme1 occurred, a dummy address 0.0.0.0 was created on hme0. The dummy address is created so that hme0 can still be accessed, because hme0:1 cannot exist without hme0. The dummy address is removed when a subsequent failback takes place.
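As an illustration of reading such output programmatically, the following Python sketch lists the interfaces whose flags include FAILED. It assumes only the header-line layout shown in the examples (`name: flags=hex<FLAG,...>`); the function name and regular expression are hypothetical and are not part of any Solaris tool.

```python
import re

# Matches header lines such as "hme0:1: flags=19040843<UP,...,FAILED> mtu ...".
HEADER = re.compile(r"^(?P<name>\S+): flags=[0-9a-f]+<(?P<flags>[^>]*)>")


def failed_interfaces(ifconfig_output):
    """Return interface names whose flag list contains FAILED, in order."""
    failed = []
    for line in ifconfig_output.splitlines():
        match = HEADER.match(line)
        if not match:
            continue  # continuation lines (inet, groupname) are skipped
        flags = match.group("flags").split(",")
        name = match.group("name")
        # Compare whole flag names so NOFAILOVER is never mistaken for FAILED.
        if "FAILED" in flags and name not in failed:
            failed.append(name)
    return failed


sample = """\
hme0: flags=19000842<BROADCAST,RUNNING,MULTICAST,IPv4,NOFAILOVER,FAILED> mtu 0 index 2
        inet 0.0.0.0 netmask 0
        groupname test
hme1: flags=9000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
        inet 19.16.85.20 netmask ffffff00 broadcast 19.16.85.255
        groupname test
"""

print(failed_interfaces(sample))  # ['hme0']
```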

Similarly, failover of the IPv6 address from hme0 to hme1 occurred. In IPv6, multicast memberships are associated with interface indexes. They also fail over from hme0 to hme1. All the addresses that in.ndpd configures also move, although this action is not shown in the examples.

The in.mpathd daemon continues to probe through the failed NIC, hme0. After the daemon receives 10 consecutive replies for a default failure detection time of 10 seconds, the daemon considers the interface repaired and invokes the failback. After failback, the original configuration is reestablished.
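The failure and repair thresholds can be modeled as a pair of consecutive-event counters. The Python sketch below is a simplified model under stated assumptions: the class and method names are hypothetical, the thresholds of 5 unanswered probes and 10 replies correspond to the default failure detection time of 10 seconds, and monitoring of the RUNNING flag is not modeled.

```python
class ProbeTracker:
    """Track consecutive probe results for one interface."""

    def __init__(self, fail_after=5, repair_after=10):
        self.fail_after = fail_after      # consecutive misses -> failed
        self.repair_after = repair_after  # consecutive replies -> repaired
        self.failed = False
        self.misses = 0
        self.replies = 0

    def record(self, reply_received):
        """Record one probe result and return the current failed state."""
        if reply_received:
            self.replies += 1
            self.misses = 0
            if self.failed and self.replies >= self.repair_after:
                self.failed = False  # repair detected: failback can start
        else:
            self.misses += 1
            self.replies = 0
            if not self.failed and self.misses >= self.fail_after:
                self.failed = True   # failure detected: failover can start
        return self.failed


tracker = ProbeTracker()
for _ in range(5):
    tracker.record(False)
print(tracker.failed)  # True after five consecutive unanswered probes

for _ in range(10):
    tracker.record(True)
print(tracker.failed)  # False after ten consecutive replies
```

Any unanswered probe during the repair phase resets the reply counter, so the ten replies must be consecutive.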

See the in.mpathd(1M) man page for a description of all error messages that are logged on the console during failures and repairs.

Multipathing Configuration File

The in.mpathd daemon uses the settings in the /etc/default/mpathd configuration file to invoke multipathing. Changes to this file are read by in.mpathd at startup and on SIGHUP. This file contains the following default settings and information:

#
# Time taken by mpathd to detect a NIC failure in ms. The minimum time
# that can be specified is 100 ms.
#
FAILURE_DETECTION_TIME=10000
#
# Failback is enabled by default. To disable failback turn off this option
#
FAILBACK=yes
#
# By default only interfaces configured as part of multipathing groups
# are tracked. Turn off this option to track all network interfaces
# on the system
#
TRACK_INTERFACES_ONLY_WITH_GROUPS=yes
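To make the file format concrete, here is a minimal Python sketch that reads settings in this `KEY=value` form. It is not how in.mpathd itself parses the file. The function name and the returned keys are hypothetical, the defaults are the values listed above, and rejecting a detection time below 100 ms is an assumption based on the comment in the file.

```python
DEFAULTS = {
    "FAILURE_DETECTION_TIME": "10000",
    "FAILBACK": "yes",
    "TRACK_INTERFACES_ONLY_WITH_GROUPS": "yes",
}
MIN_DETECTION_MS = 100  # minimum stated in the file's own comment


def parse_mpathd(text):
    """Parse KEY=value lines, ignoring blank lines and '#' comments."""
    settings = dict(DEFAULTS)
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()

    detection_ms = int(settings["FAILURE_DETECTION_TIME"])
    if detection_ms < MIN_DETECTION_MS:
        raise ValueError("FAILURE_DETECTION_TIME must be at least 100 ms")

    return {
        "failure_detection_ms": detection_ms,
        "failback": settings["FAILBACK"].lower() == "yes",
        "track_only_groups":
            settings["TRACK_INTERFACES_ONLY_WITH_GROUPS"].lower() == "yes",
    }


sample = """\
# Example contents in the format shown above
FAILURE_DETECTION_TIME=10000
FAILBACK=no
TRACK_INTERFACES_ONLY_WITH_GROUPS=yes
"""

print(parse_mpathd(sample))
```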

"How to Configure the Multipathing Configuration File" shows the steps you perform to configure the /etc/default/mpathd configuration file.

Failure Detection Time

You can set a lower failure detection time. However, a low value might not be achievable if the load on the network is too high. In that case, in.mpathd prints a message on the console indicating that the time cannot be met. The daemon also prints the time that it can currently meet. If the probe responses come back correctly, in.mpathd meets the failure detection time that is provided in this file.

Failback

After a failover, failback occurs when the failed interface is repaired. However, in.mpathd does not fail back the interface if FAILBACK is set to no.

As noted in "Detecting Physical Interface Failures", automatic failback is supported for physical interfaces that are not present at system boot. See "How to Recover a Physical Interface That Was Not Present at System Boot".

Track Interfaces Only With Groups Option

When you turn off this option, in.mpathd tracks all interfaces in the system. When a failure is detected, an appropriate message is logged on the console. For this option to function properly, Ethernet addresses on all the interfaces must be unique.

 
 
 