Level: Introductory
Jay Allen (allen5@us.ibm.com), IBM Linux Competency Center, IBM
Clifford White (ctwhite@us.ibm.com), IBM Linux Competency Center, IBM
29 Nov 2001
As an organization adds applications and services, centralizing authentication and password services can increase security and decrease administrative and developer headaches. However, consolidating any service onto a single server creates reliability concerns. High availability is especially critical for enterprise authentication services, because in many cases the entire enterprise will come to a stop when authentication stops working. This paper describes how we create a reliable, highly available authentication server using open source software.
The open source software we use
We use an LDAP (Lightweight Directory Access Protocol) server to provide authentication services to which various applications can subscribe. To provide a highly available LDAP server, we use the heartbeat package from the Linux-HA initiative (www.linux-ha.org). We also provide an example of setting up the Apache web server to use LDAP authentication.
Some background on LDAP
We use the OpenLDAP package (www.openldap.org), which is part of several Linux distributions. It ships with Red Hat 7.1, and the current download version is 2.0.11.
The LDAP standard is defined in RFCs 2251 and 2253. Several other implementations of LDAP exist, including the University of Michigan and Netscape implementations. The OpenLDAP foundation was created as "a collaborative effort to develop a robust, commercial-grade, fully featured, and open source LDAP suite of applications and development tools." (See www.openldap.org.) OpenLDAP version 1.0 was released in August 1998. The current major version, 2.0, was released on 31 August 2000 and adds LDAPv3 support.
Like any good network service, LDAP is designed to run across multiple servers. Two features of LDAP are used -- replication and referral.
The referral mechanism lets you split the LDAP namespace across multiple servers and arrange LDAP servers in a hierarchy. LDAP allows only one master server for a particular directory namespace.
Replication is driven by the OpenLDAP replication daemon, slurpd. Slurpd periodically wakes up, checks a log file on the master for updates, and pushes any it finds to the slave servers. Read requests can be answered by either server, but updates can be performed only on the master. An update request sent to a slave generates a referral message that gives the address of the master server; it is the client's responsibility to chase the referral and retry the update. OpenLDAP has no built-in way to distribute queries across replicated servers, so to spread the query load you must use an IP sprayer/fanout program, such as balance.
Figure 1
To achieve our reliability goals, we cluster together a pair of servers. We could use shared storage between the servers and maintain one copy of the data, but for simplicity we choose a shared-nothing implementation. LDAP databases are typically small and update frequency is low. (Hint: If your LDAP data set is large, consider dividing the namespace into smaller pieces with referrals.) The shared-nothing setup does require some care when restarting a failed node: any new changes must be added to the database on the failed node before restart. We'll show an example later.
Cluster software and configuration
To start with, let's clear up a minor confusion. Most HA (High Availability) clusters have a system-keepalive function called the "heartbeat." A heartbeat is used by the HA software to monitor the health of the nodes in the cluster. The Linux-HA (www.linux-ha.org) group provides open source clustering software. Their package is named Heartbeat (currently Heartbeat-0.4.9). This can lead to some understandable confusion. (Well, it confuses me sometimes.) In this paper, we will refer to the Linux-HA package as "Heartbeat" and the general concept as "heartbeat".
The Linux-HA project was started in 1998 as an outgrowth of the Linux-HA HOWTO, written by Harald Milz. The project is currently led by Alan Robertson with many other contributors. Version 0.4.9 was released in early 2001.
Heartbeat monitors node health through communication media, usually serial and Ethernet. It is best to have multiple redundant media, so we are using a serial line and an Ethernet link. Each node runs a daemon process (called 'heartbeat'). The master daemon forks child processes to read from and write to each heartbeat medium, plus a status process. When a node death is detected, Heartbeat runs shell scripts to start (or stop) services on the secondary node. By design, these scripts use the same syntax as the system init scripts (normally found in /etc/init.d). Default scripts are furnished for filesystem, web server and virtual IP failovers.
Given two matching LDAP servers, there are several configurations we could use. First, we could do a 'cold standby'. The master node would have a virtual IP and a running server. The secondary node would be sitting idle. On failure of the master node, the server instance and IP would move to the cold node. This is simple to implement, but data synchronization between the master and secondary servers could be a problem. To solve that, we can configure the cluster with live servers on both nodes. The master node runs the master LDAP server, and the secondary node runs a slave instance. Updates to the master are immediately pushed to the slave via slurpd.
Figure 2
Failure of the master node leaves our secondary to respond to queries, but now we cannot update. To accommodate updates, on a failover we'll restart the secondary server and promote it to be the master server.
Figure 3
This gives us full LDAP services, but adds one gotcha -- if updates are made to the secondary server, we'll have to fix up the primary before allowing it to restart. Heartbeat supports a 'nice failback' option, which bars a failed node from re-acquiring resources after a failover and which would be preferable. In this paper, we show a restart by hand.
Our sample configuration will use the Heartbeat-supplied virtual IP facility. If heavy query loads need to be supported, the virtual IP could be replaced with an IP sprayer distributing queries to both master and slave servers. In this case, update requests made to the slave would result in a referral. Follow-up of referrals is not automatic; this functionality must be built into the client application.
The master and slave nodes are identically configured except for the replication directives. The master configuration file indicates the location of the replication log file (line 16) and lists the slave servers that are replication targets, along with their credential information (lines 34-36).
34 replica host=slave5:389
35         binddn="cn=Manager,dc=lcc,dc=ibm,dc=com"
36         bindmethod=simple credentials=secret
The slave configuration file does not indicate the master server; instead, it names the DN whose credentials are used for replication (line 33).
33 updatedn "cn=Manager,dc=lcc,dc=ibm,dc=com"
General Heartbeat preparation
There are several good examples of basic Heartbeat configuration available. (See references at the end of the article.) Here are the relevant bits from our configuration. Our configuration is very simple, so there aren't many bits. By default, all configuration files are kept in /etc/ha.d/.
ha.cf: Contains global definitions for the cluster. We use the default values for all timeouts.
# Timeout intervals
keepalive 2
# keepalive could be set to 1 second here
deadtime 10
initdead 120
# define our communications
# serial serialportname ...
serial /dev/ttyS0
baud 19200
# Ethernet information
udpport 694
udp eth1
# and finally, our node id's
# node nodename ... -- must match uname -n
node slave5
node slave6
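The relationship between keepalive and deadtime is easier to judge as a count of missed heartbeats. The following is a minimal sketch, not part of Heartbeat: the function name missed_beats is hypothetical, and it assumes both values are whole seconds, as in the listing above.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with Heartbeat): read keepalive and
# deadtime from an ha.cf-style file and print how many consecutive
# heartbeats must go missing before the peer is declared dead.
# Assumes both values are whole numbers of seconds.
missed_beats() {
    conf=$1
    # Comment lines start with "#", so their first field never matches.
    keepalive=$(awk '$1 == "keepalive" { print $2; exit }' "$conf")
    deadtime=$(awk '$1 == "deadtime" { print $2; exit }' "$conf")
    if [ -z "$keepalive" ] || [ -z "$deadtime" ]; then
        echo "keepalive or deadtime not set in $conf" >&2
        return 1
    fi
    echo $((deadtime / keepalive))
}
```

Called as missed_beats /etc/ha.d/ha.cf with the values shown above (keepalive 2, deadtime 10), it prints 5: a node is declared dead after five heartbeat intervals pass in silence.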
haresources: This is where the failover is configured. The interesting stuff is at the bottom of the file.
slave6 192.168.10.51 slapd
Here we have indicated three things. The primary owner of the resource is the node 'slave6'. (This name _must_ match the output of 'uname -n' on the machine you intend to be primary.) Our service address (the virtual IP) is '192.168.10.51'. (This example was done on a private lab network, thus the 192.168 address.) The service script is called 'slapd'. Heartbeat will look for scripts in /etc/ha.d/resource.d and /etc/init.d.
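Because a mismatch between 'uname -n' and the configured node names is an easy mistake to make, a quick check before starting Heartbeat can help. This is only a sketch: node_is_listed is a hypothetical helper, and it assumes the 'node' directive syntax shown in the ha.cf listing above.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given host name appears in a "node"
# directive of an ha.cf-style file. Several names on one "node" line are
# handled; comment lines are ignored because their first field is "#".
node_is_listed() {
    host=$1
    conf=$2
    awk -v h="$host" '
        $1 == "node" { for (i = 2; i <= NF; i++) if ($i == h) found = 1 }
        END { exit (found ? 0 : 1) }
    ' "$conf"
}
```

On each node, something like node_is_listed "$(uname -n)" /etc/ha.d/ha.cf || echo "this host is not listed in ha.cf" would flag the problem before the cluster is started.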
The service script
For the simple cold standby case, we could use the standard /etc/init.d/slapd script without modification. We'd like to do some special things, so we created our own slapd script, which is stored in /etc/ha.d/resource.d/. Heartbeat places this directory first in its search path, so we do not have to worry about the /etc/init.d/slapd script being run. However, you should check to be certain slapd is no longer started on boot. (Remove any S*slapd files from your /etc/rc.d tree.) First, on lines 17 and 18 we indicate the startup configuration files for the slapd server.
The script follows standard init.d syntax, so the startup information is contained within the test_start() function, which begins at line 21. First we stop any instances of slapd currently running. On line 39 we start the master server using the master configuration file.
Our design will follow this rule: If both the primary and secondary nodes are up, start slapd as master on the primary, start slapd as slave on the secondary and start the replication daemon. If only one node is up, start slapd as master. The virtual IP is tied to the slapd master.
To accomplish this, we must know which node is executing the script, and if we are the primary node, we need to know the state of the secondary node. The important stuff will be in the 'start' branch of the script. Since we have indicated a primary node in the Heartbeat configuration, we know when the test_start() function runs, it is running on the Heartbeat primary. (Since Heartbeat uses /etc/init.d/-style scripts, every script is called with one of the arguments start, stop, or restart.) When calling a script, Heartbeat sets many environment variables. Here's the one we're interested in:
HA_CURHOST=slave6
We can use the 'HA_CURHOST' value to tell us when we are executing on the primary node (slave6) and when we are in a failover (HA_CURHOST would be 'slave5'). Now we need to know the state of the other node. To find this out, we can ask Heartbeat. We'll use the api_test.c file provided with Heartbeat and create a simple client to ask for node status. (The api_test.c file does a bunch more with the client; we simply removed the bits we didn't need and added one output statement.) Notice line 31 in the program, where the query is performed.
Heartbeat query source listing
After compiling, we install the file in /etc/ha.d/resource.d/. The program is named 'other_state'. Here is a link to the full failover script. Again, we start with the example script supplied with Heartbeat and add a few modifications:
Startup Script
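The start-time rule described above can be sketched as a small shell function. This is an illustration, not the linked script: choose_role and its output strings are hypothetical, and because the exact output of other_state is not shown here, the sketch assumes the peer is reported as 'active' or 'up' when alive and treats anything else as down.

```shell
#!/bin/sh
# Illustrative sketch of the start-time decision; not the authors' script.
#   $1: node running the script (Heartbeat exports this as HA_CURHOST)
#   $2: node named as primary in haresources
#   $3: status of the other node, e.g. from a helper such as other_state
#       (assumed here to be "active" or "up" when the peer is alive)
# Prints "master+slurpd" or "master-only".
choose_role() {
    curhost=$1
    primary=$2
    peer_state=$3
    if [ "$curhost" = "$primary" ]; then
        case $peer_state in
            # Both nodes up: run the master here and replicate to the peer.
            active|up) echo "master+slurpd" ;;
            # Peer is down: nothing to replicate to.
            *)         echo "master-only" ;;
        esac
    else
        # Running on the secondary means a failover: promote to master.
        echo "master-only"
    fi
}
```

A 'start' branch could then call choose_role "$HA_CURHOST" slave6 "$peer_state" and launch slapd, and optionally slurpd, according to the result.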
Testing
We can now start Heartbeat on both servers. The Heartbeat documentation includes some information on testing the basic setup, so we won't repeat it here. With two heartbeat media connected, you should see six heartbeat processes running. To verify failover, we do several tests. To provide a client for testing, we create a simple KDE application which queries the servers and displays the state of the connection. A real client would only query the virtual IP in this instance, but we query all three IP addresses for illustration purposes. We send ten thousand queries per hour for this test.
Figure 3
S6 is our master LDAP server, S5 is the active standby. The Virtual IP is the lower box. In the normal state, both S5 and S6 show green, indicating successful queries.
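For readers without a graphical client, a rough command-line stand-in is sketched below. It is not the KDE application used in these tests. The function names are hypothetical, the base DN is taken from the configuration listings earlier in the article, and the sketch assumes the servers allow an anonymous read of the base entry and that ldapsearch accepts the -H URI option (older setups may need -h host instead).

```shell
#!/bin/sh
# Hypothetical stand-in for the test client: probe each address once and
# print "up" or "down" beside it.
BASE_DN="dc=lcc,dc=ibm,dc=com"

# Read only the base entry, anonymously, with simple authentication.
# Assumes anonymous reads are permitted on the servers being probed.
ldap_probe() {
    ldapsearch -x -H "ldap://$1" -b "$BASE_DN" -s base '(objectclass=*)' \
        >/dev/null 2>&1
}

# $1: command used to probe one host; remaining arguments: hosts to probe.
probe_all() {
    probe=$1
    shift
    for host in "$@"; do
        if "$probe" "$host"; then
            echo "$host up"
        else
            echo "$host down"
        fi
    done
}
```

Running probe_all ldap_probe slave5 slave6 192.168.10.51 in a loop with a short sleep gives a text version of the display described above. Passing the probe command as an argument keeps the loop logic testable without a live server.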
First, we stop the heartbeat process on the master node. In this case the slave machine acquires the resources after the 10 second node timeout occurs. The takeover includes an additional delay of 2 seconds inside the startup script. Here is the log excerpt:
Sep 7 10:28:21 slave5 heartbeat: info: Running /etc/ha.d/rc.d/shutdone shutdone
Sep 7 10:28:32 slave5 heartbeat[3381]: WARN: node slave6: is dead
Sep 7 10:28:32 slave5 heartbeat[3381]: info: Link slave6:/dev/ttyS0 dead.
Sep 7 10:28:32 slave5 heartbeat[3381]: info: Link slave6:eth1 dead.
Sep 7 10:28:32 slave5 heartbeat: info: Running /etc/ha.d/rc.d/status status
Sep 7 10:28:32 slave5 heartbeat: info: Running /etc/ha.d/rc.d/ifstat ifstat
Sep 7 10:28:32 slave5 heartbeat: info: Running /etc/ha.d/rc.d/ifstat ifstat
Sep 7 10:28:32 slave5 heartbeat: info: Taking over resource group 192.168.10.51
Sep 7 10:28:32 slave5 heartbeat: info: Acquiring resource group: slave6 192.168.10.51 slapd
Sep 7 10:28:32 slave5 heartbeat: info: Running /etc/ha.d/resource.d/IPaddr 192.168.10.51 start
Sep 7 10:28:32 slave5 heartbeat: info: ifconfig eth0:0 192.168.10.51 netmask 255.255.255.0 \
broadcast 192.168.10.255
Sep 7 10:28:32 slave5 heartbeat: info: Sending Gratuitous Arp for 192.168.10.51 on eth0:0 [eth0]
Sep 7 10:28:32 slave5 heartbeat: info: Running /etc/ha.d/resource.d/slapd start
Sep 7 10:28:32 slave5 heartbeat: info: /etc/ha.d/resource.d/slapd: Starting
Here is the query flow as seen by our application:
Figure 4
The primary is down, and the virtual IP now is serviced by the secondary. S5 and the virtual IP show green, server S6 is unavailable, and the indicator is red.
After restarting the cluster, we create a failure by removing power from the primary node. Again the resources were acquired by the secondary node after the 10 second timeout expired.
Finally, we simulate a complete failure of the interconnects between the two nodes by unplugging both the serial and Ethernet interfaces. This loss of inter-node communication results in both machines attempting to act as the primary node, a condition known as split-brain. Heartbeat's default behavior in this case (when the partition heals, the node gives up all resources and shuts down) shows why Heartbeat requires multiple, physically separate interconnect media. In a shared-storage setup, the storage interconnect can also be used as a heartbeat medium, which decreases the chance of a split brain. Here is a sample from the ha-log showing the shutdown:
heartbeat: 2001/09/07_14:49:46 info: mach_down takeover complete.
heartbeat: 2001/09/07_14:50:36 ERROR: TTY write timeout on [/dev/ttyS0] (no connection?)
heartbeat: 2001/09/07_14:52:53 WARN: Cluster node slave6 returning after partition
heartbeat: 2001/09/07_14:52:53 info: Heartbeat shutdown in progress.
heartbeat: 2001/09/07_14:52:53 ERROR: 105 lost packet(s) for [slave6] [191:297]
heartbeat: 2001/09/07_14:52:53 ERROR: lost a lot of packets!
heartbeat: 2001/09/07_14:52:53 info: Link slave6:eth1 up.
heartbeat: 2001/09/07_14:52:53 WARN: Late heartbeat: Node slave6: interval 211920 ms
heartbeat: 2001/09/07_14:52:53 info: Node slave6: status active
heartbeat: 2001/09/07_14:52:53 info: Giving up all HA resources.
heartbeat: 2001/09/07_14:52:53 info: Running /etc/ha.d/rc.d/status status
heartbeat: 2001/09/07_14:52:53 info: Running /etc/ha.d/rc.d/ifstat ifstat
heartbeat: 2001/09/07_14:52:53 info: Running /etc/ha.d/rc.d/shutdone shutdone
heartbeat: 2001/09/07_14:52:53 info: Releasing resource group: slave6 192.168.10.51 slapd
heartbeat: 2001/09/07_14:52:53 info: Running /etc/ha.d/resource.d/slapd stop
heartbeat: 2001/09/07_14:52:53 info: /etc/ha.d/resource.d/slapd: Shutting down
heartbeat: 2001/09/07_14:52:53 info: Running /etc/ha.d/resource.d/IPaddr 192.168.10.51 stop
heartbeat: 2001/09/07_14:52:53 info: IP Address 192.168.10.51 released
heartbeat: 2001/09/07_14:52:54 info: All HA resources relinquished.
heartbeat: 2001/09/07_14:52:54 info: Heartbeat shutdown in progress.
heartbeat: 2001/09/07_14:52:54 info: Giving up all HA resources.
heartbeat: 2001/09/07_14:52:54 info: All HA resources relinquished.
heartbeat: 2001/09/07_14:52:55 info: Heartbeat shutdown complete.
This problem should be considered when choosing timeout values. If the timeout is too short, a heavily loaded system may falsely trigger a takeover, resulting in an apparent split-brain shutdown. See the Linux-HA FAQ document for more information on this.
Recovery after a failover
If updates have been made to the LDAP namespace while the master LDAP server is down, the LDAP databases must be re-synchronized prior to restarting the master server. There are two ways to do this. If a service interruption is possible, the databases can be hand-copied after the LDAP server has been stopped. (Datafiles are kept by default in /usr/local/var.) You can also use OpenLDAP replication to restore the database without a service interruption. First, start the LDAP server on the former master node as a slave. Then start the slurpd daemon on the current master. Changes received while the former master was out of service will be pushed from the new master. Finally, stop the slave LDAP server on the former master node, and start Heartbeat. This will result in a failback to the original configuration.
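The replication-based recovery can be written down as an ordered checklist. The sketch below only prints the steps and executes nothing. The function name recovery_plan is hypothetical, and the configuration file paths and the Heartbeat init script location are assumptions to be replaced with your own. It also assumes the current master's configuration names the former master as a replica, since slurpd pushes only to the replicas it is told about.

```shell
#!/bin/sh
# Print the replication-based recovery steps in order, one per line, as
# "node: command". Nothing is executed. The paths below are assumptions,
# not values taken from the article; override them from the environment.
SLAVE_CONF=${SLAVE_CONF:-/etc/openldap/slapd.slave.conf}
MASTER_CONF=${MASTER_CONF:-/etc/openldap/slapd.master.conf}

# $1: former master (the node being recovered)
# $2: current master (the node that took over)
recovery_plan() {
    former_master=$1
    current_master=$2
    # 1. Bring the recovered node up as a slave so it can receive changes.
    echo "$former_master: slapd -f $SLAVE_CONF"
    # 2. Start replication from the node that is currently master.
    echo "$current_master: slurpd -f $MASTER_CONF"
    # 3. Once the backlog has been pushed, stop the temporary slave.
    echo "$former_master: (wait for replication to catch up, then stop slapd)"
    # 4. Starting Heartbeat triggers the failback.
    echo "$former_master: /etc/init.d/heartbeat start"
}
```

For the cluster in this article, recovery_plan slave6 slave5 lists the four steps with slave6 as the node being recovered.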
LDAP configuration for Apache
Here is an example of an application subscribing to an LDAP server. The application is the Apache Web server, using the mod_auth_ldap package.
Conclusions
This is a very simple example of using open source software to create some highly available basic network services. Network services including LDAP seldom require huge servers. The additional reliability provided by clustering and the duplication of servers and datafiles can increase service availability. The system worked under all tests, failing over in less than 15 seconds in all cases. Given a good understanding of system loads and utilization, failover time could be reduced below this threshold.
Disclaimer
DISCLAIMER: The foregoing article is based on laboratory tests undertaken in a laboratory environment. Results in particular customer installations may vary based on a number of factors, including workload and configuration in each particular installation. Therefore, the above information is provided on an AS IS basis. The WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE EXPRESSLY DISCLAIMED. Use of this information is at user's sole risk.
Resources
About the authors
Jay D. Allen works by day on the leading edge of IT for IBM, mostly with Linux. By night, Jay works on the trailing edge of IT, mostly with DEC PDP-11s and other antiques. Contact him at allen5@us.ibm.com.
Clifford White is a solutions engineer for IBM and works for IBM's Linux Competency Center. Contact him at ctwhite@us.ibm.com.