This chapter describes
After the servers are started, each server contacts the other. After contact is established, the main server provides the backup server with a private pool of IP addresses that it can use in the event of a failure. The main server then updates the backup server whenever it performs an operation for a DHCP client.
Under normal conditions, the main server continues to update the backup server and the backup server allows the main server to service DHCP client requests.
If a failure occurs on the main server, the backup server takes over and renews the addresses of the existing clients and offers addresses to new clients. When the main server is operational again, it automatically reintegrates with the backup server without administrator intervention.
The failover protocol is designed to protect against several kinds of failures:
Each server operates differently in each of these regimes. Table 3-1 describes the server operations.
| Regime | Main Server | Backup Server |
|---|---|---|
| Normal | Responsive to all DHCP client requests and allocates IP addresses to new clients from its pool of available IP addresses. It has allocated to the backup server some IP addresses for the backup server to use if communications are interrupted. | Unresponsive to DHCP client requests except renewals or rebinding requests. The backup server has requested and received a set of IP addresses to use for allocation to new DHCP clients if communication with the main server is interrupted. |
| Communications Interrupted | Responsive to all DHCP client requests. It cannot tell if the backup server has gone down or if the backup server is just unable to communicate. It operates normally, although it cannot reallocate an IP address from one DHCP client to another while in this regime. | Cannot tell if the main server is down or simply not communicating. In either case, the backup server is responsive to all DHCP client requests and can allocate IP addresses from the pool of available addresses it has received from the main server. |
| Partner Down | The running server is guaranteed that the other server is down. The running server has control of all of the IP addresses, can offer any configured lease time or lease extension period, and at any time can reallocate an IP address from one client to another. A server will only transition to Partner Down if it is informed that the other partner is indeed down. The notification can come either through the protocol (used when the partner knows that it is going down) or because the server was unable to communicate with its partner, automatically entered the Communications Interrupted regime, and the administrator used the setPartnerDown command. The setPartnerDown command tells the server that its partner is down. You could configure failover to effect an automatic transition from Communications Interrupted to Partner Down after the safe period has passed, but doing so would run the risk of duplicate IP address allocations if the partner is not actually down. | Same as for the main server: the description applies to whichever server is still running. |

Servers usually transition between Normal and Communications Interrupted as one or the other server goes up and down.

Ideally you would let the servers move from the Normal to the Communications Interrupted regimes and back again, since these are safe, and you would never need to use administrative intervention to move a server into the Partner Down regime. In some cases, however, this is not practical because a server running in the Communications Interrupted regime is not using the available IP addresses efficiently, and this may restrict the amount of time a server can effectively service DHCP clients.
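
The regime transitions described above can be sketched as a small lookup table. This is a simplified illustration, not the failover protocol itself: the event names and the `next_regime` function are hypothetical labels for the conditions in the text, and the direct return from Partner Down to Normal stands in for the automatic reintegration the servers perform.

```python
from enum import Enum


class Regime(Enum):
    NORMAL = "Normal"
    COMMUNICATIONS_INTERRUPTED = "Communications Interrupted"
    PARTNER_DOWN = "Partner Down"


# Event names are hypothetical; they label the conditions described in the text.
_TRANSITIONS = {
    (Regime.NORMAL, "contact_lost"): Regime.COMMUNICATIONS_INTERRUPTED,
    (Regime.COMMUNICATIONS_INTERRUPTED, "contact_restored"): Regime.NORMAL,
    # The partner announces through the protocol that it is going down.
    (Regime.NORMAL, "partner_announced_shutdown"): Regime.PARTNER_DOWN,
    # The administrator issues setPartnerDown.
    (Regime.COMMUNICATIONS_INTERRUPTED, "admin_set_partner_down"): Regime.PARTNER_DOWN,
    # Optional and risky: automatic transition once the safe period has passed.
    (Regime.COMMUNICATIONS_INTERRUPTED, "safe_period_expired"): Regime.PARTNER_DOWN,
    # Simplification of automatic reintegration when the partner returns.
    (Regime.PARTNER_DOWN, "contact_restored"): Regime.NORMAL,
}


def next_regime(current, event, safe_period_enabled=False):
    """Return the regime after `event`; stay in `current` if the event does not apply."""
    if event == "safe_period_expired" and not safe_period_enabled:
        return current
    return _TRANSITIONS.get((current, event), current)
```

Keeping the safe-period transition behind an explicit flag mirrors the point made in the table: a server should not enter Partner Down on its own unless that risk has been deliberately accepted.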
There are restrictions on either server running in the Communications Interrupted regime that do not apply to a server running in the Partner Down regime:
In addition, if the backup server is running in the Communications Interrupted regime, the following restriction applies:
The length of time a server can successfully run in the Communications Interrupted regime is limited only by the number of IP addresses that have been allocated to it, and the corresponding arrival rate of the DHCP client DISCOVER packets for new clients. When there is a high arrival rate of new DHCP clients or a high turnover rate of the client IP addresses, you may need to move the server into the Partner Down regime more quickly.
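
As a rough illustration of that limit, the following sketch estimates how long a server could keep serving new clients while in the Communications Interrupted regime. The function name and the assumption of a steady arrival rate are illustrative only; real arrival rates vary through the day.

```python
def hours_until_pool_exhausted(addresses_held, new_clients_per_hour):
    """Estimate how many hours the held addresses last at a steady arrival rate.

    Renewals from existing clients do not consume new addresses, so only
    previously unseen clients are counted.
    """
    if addresses_held < 0 or new_clients_per_hour < 0:
        raise ValueError("inputs must be non-negative")
    if new_clients_per_hour == 0:
        return float("inf")  # no new clients: the pool is never exhausted
    return addresses_held / new_clients_per_hour


# Example: 50 addresses and about 5 new clients per hour last roughly 10 hours.
runway = hours_until_pool_exhausted(50, 5)
```

If the estimated figure is shorter than the time staff need to diagnose the outage, that is a sign the server may have to be moved to Partner Down sooner.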
You need to configure the main server to allocate a percentage of the currently available addresses in each scope's address pool to the backup server. These addresses are then not available to the main server to allocate to DHCP clients. The backup server uses these addresses in the event that it is running, but cannot talk to the main server, and has not been told that the main server is down.
The question is what percentage of addresses from the main server should be given to the backup. There is no single percentage that will suffice for all environments. It depends on the arrival rate of new DHCP clients and the reaction time of your network administration staff.
The backup server needs enough addresses from each scope to satisfy the requests of all new DHCP clients that arrive during the period in which the backup does not know whether or not the main server is down.
If, during the day, the administrative staff is able to respond within a two-hour period to a Communications Interrupted condition and determine whether the main server is working, then the backup server needs enough addresses to support a reasonable upper bound on the number of new DHCP clients that might arrive during that two-hour period.
If, during off-hours, the administrative staff is able to respond within a 12-hour period to the same situation, and considering that the arrival rate of previously unheard-from DHCP clients is also lower, then the backup server needs enough addresses to support a reasonable upper bound on the number of DHCP clients that might arrive during that 12-hour period.
Consequently, the number of addresses over which the backup requires sole control would be the greater of the two numbers, and would ultimately be expressed as a percentage of the currently available (unreserved) addresses in each scope.
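
The sizing rule above can be worked through as a short calculation. This is a minimal sketch under stated assumptions: the function name, the `(response_hours, peak_new_clients_per_hour)` input shape, and the sample figures are hypothetical, and the result is only a starting point for choosing the percentage you configure per scope.

```python
import math


def backup_percentage(available_addresses, windows):
    """Return (addresses_needed, percent_of_available) for the backup server.

    `windows` is an iterable of (response_hours, peak_new_clients_per_hour)
    pairs, one per staffing period (for example, daytime and off-hours).
    The backup needs the greater of the per-window requirements.
    """
    if available_addresses <= 0:
        raise ValueError("available_addresses must be positive")
    needed = max(math.ceil(hours * rate) for hours, rate in windows)
    percent = min(100.0, 100.0 * needed / available_addresses)
    return needed, percent


# Example: 200 available addresses in a scope.
# Daytime: 2-hour response, up to 10 new clients per hour -> 20 addresses.
# Off-hours: 12-hour response, up to 1 new client per hour -> 12 addresses.
needed, percent = backup_percentage(200, [(2, 10), (12, 1)])
# needed == 20, percent == 10.0
```

Because the percentage applies to the currently available addresses in each scope, the calculation should be repeated per scope rather than once for the whole server.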
If you are using client-class, remember that some clients can only use some set of scopes and other clients can only use other sets of scopes.
When using dynamic BOOTP, do the following:
If both servers are still operating, but cannot communicate, you have no choice but to leave them in COMMUNICATIONS-INTERRUPTED state. In most situations, however, when one server is down for an extended period and the operational server can no longer function effectively in COMMUNICATIONS-INTERRUPTED state, it must be moved into the PARTNER-DOWN state.
There are two ways that a server can move into this state:
Configuring the safe period entails some risk, because it allows one server to enter the PARTNER-DOWN state when the other server may not be down. If this should occur, duplicate IP addresses could be allocated.
The purpose of the safe period is to give network operations staff some time to react to a server moving into the COMMUNICATIONS-INTERRUPTED state. During the safe period, the only requirement is that the network operations staff determine whether both servers are still running and, if they are, either fix the network communications failure or take one of the servers down before the safe period expires.
The length of the safe period is installation specific, and depends in large part on the number of unallocated IP addresses within the subnet address pool and the expected frequency of arrival of previously unknown DHCP clients requiring IP addresses. Many environments should be able to support safe periods of several days.
During this safe period, either server allows renewals from any existing client. The only limitation is the need for IP addresses for the DHCP server to hand out to new DHCP clients and the need to reallocate IP addresses to different DHCP clients.
The number of extra IP addresses required is equal to the expected total number of new DHCP clients encountered during the safe period. This is dependent on the arrival rate of new DHCP clients, not on the total number of outstanding leases on IP addresses.
Even if you can afford only a short safe period, because of a dearth of IP addresses or a very high arrival rate of new DHCP clients, it still provides substantial benefit by allowing the DHCP subsystem to ride through minor problems that can be fixed within an hour. In such cases, there is no possibility of duplicate IP address allocation, and reintegration after the failure is resolved is automatic and requires no operator intervention.
To use failover you need to:
You can configure your network in a variety of ways, from the simplest, in which a server has a backup server, to more complicated arrangements. The following are typical configurations:
In Figure 3-1 there is a main server and its backup server.

In Figure 3-2 there are several main servers and a single backup server.

In Figure 3-3 there are two servers that share the network and the backup responsibilities.

Posted: Thu Feb 3 10:40:59 PST 2000
Copyright © 1989-2000 Cisco Systems Inc.