
= How to Troubleshoot Cluster Service Startup Issues =

Article ID: 266274

Article Last Modified on 3/1/2007


APPLIES TO


 * Microsoft Windows 2000 Advanced Server
 * Microsoft Windows 2000 Datacenter Server
 * Microsoft Windows NT Server 4.0 Enterprise Edition




This article was previously published under Q266274



IMPORTANT: This article contains information about modifying the registry. Before you modify the registry, make sure to back it up and make sure that you understand how to restore the registry if a problem occurs. For information about how to back up, restore, and edit the registry, click the following article number to view the article in the Microsoft Knowledge Base:

256986 Description of the Microsoft Windows Registry



SUMMARY
This article describes basic troubleshooting steps you can use to diagnose Cluster service startup issues. Although this is not a comprehensive list of all the issues that can cause the Cluster service not to start, it does address about 90 percent of startup issues.



MORE INFORMATION
When the Cluster service initially starts, it attempts to join an existing cluster. For this to occur, the Cluster service must be able to contact an existing cluster node. If the join procedure does not succeed, the Cluster service continues to the form stage. The main requirement of the form stage is the ability to mount the quorum device.

These are the steps in the startup process in order:
 * Authenticate the Service account.
 * Load the local copy of the cluster database.
 * Use information in the local database to try to contact other nodes to begin the join procedure. If a node is contacted and authentication is successful, the join procedure is successful.
 * If no other node is available, the Cluster service uses the information in the local database to mount the quorum device and updates the local copy of the database by loading the latest checkpoint file and replaying the quorum log.
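
The order of these steps can be illustrated with a small model. The following Python sketch is not Cluster service code; the `NodeState` fields and the `simulate_startup` function are hypothetical names introduced here only to show which step decides whether the service joins, forms, or does not start.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple


class StartupResult(Enum):
    JOINED = "joined an existing cluster"
    FORMED = "formed the cluster"
    FAILED = "service did not start"


@dataclass
class NodeState:
    # Hypothetical inputs for this sketch; they are not real Cluster service settings.
    account_authenticated: bool   # step 1: Service account authenticates
    local_database_valid: bool    # step 2: local cluster database loads
    peer_joined: bool             # step 3: another node was contacted and authentication succeeded
    quorum_mountable: bool        # step 4: quorum device can be mounted


def simulate_startup(state: NodeState) -> Tuple[StartupResult, str]:
    """Walk the four startup steps in order and report where startup ends."""
    if not state.account_authenticated:
        return StartupResult.FAILED, "Service account could not be authenticated"
    if not state.local_database_valid:
        return StartupResult.FAILED, "local cluster database could not be loaded"
    if state.peer_joined:
        return StartupResult.JOINED, "contacted another node and authenticated"
    if state.quorum_mountable:
        return StartupResult.FORMED, "mounted the quorum device and replayed the quorum log"
    return StartupResult.FAILED, "quorum device could not be mounted"
```

For example, `simulate_startup(NodeState(True, True, False, True))` models a node that cannot reach any other node but can mount the quorum device, so it ends in the form stage.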

Troubleshooting Cluster Service Startup Issues
WARNING: If you use Registry Editor incorrectly, you may cause serious problems that may require you to reinstall your operating system. Microsoft cannot guarantee that you can solve problems that result from using Registry Editor incorrectly. Use Registry Editor at your own risk.

 Verify that the cluster node that is having problems can authenticate the Service account. To determine this, log on to the computer with the Cluster service account, or check the System event log for Cluster service logon failure messages.

 Verify that the %SystemRoot%\Cluster folder contains a valid Clusdb file and that the Cluster service attempted to start. Start Registry Editor (Regedt32.exe) and verify that the following registry key is valid and loaded:

HKEY_LOCAL_MACHINE\Cluster

The structure of the cluster hive should be very similar to the structure that is shown in Cluster Administrator. Make note of the network and quorum keys. If the database is not valid, you can copy and use the cluster database from a live node. If no node has a valid cluster database, see the following article in the Microsoft Knowledge Base:

224999 How to Use the Cluster TMP file to Replace a Damaged Clusdb File
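
The file check in this step can be scripted. The following Python sketch only confirms that a non-empty Clusdb file is present in the Cluster folder; it cannot tell whether the database is valid, which still requires the Registry Editor check described above. The `locate_clusdb` function name and the `C:\WINNT` fallback are assumptions made for this example.

```python
import os
from typing import Optional


def locate_clusdb(system_root: str) -> Optional[str]:
    """Return the path of a non-empty Clusdb file under <system_root>\\Cluster, or None."""
    path = os.path.join(system_root, "Cluster", "Clusdb")
    # Presence and size are only a first check; they do not prove the hive is valid.
    if os.path.isfile(path) and os.path.getsize(path) > 0:
        return path
    return None


if __name__ == "__main__":
    # Assumption: fall back to C:\WINNT when %SystemRoot% is not set.
    root = os.environ.get("SystemRoot", r"C:\WINNT")
    found = locate_clusdb(root)
    print(found if found else "No usable Clusdb file found under " + root)
```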

 If the node is not the first node in the cluster, check connectivity to the other cluster nodes across all available networks. Use the Ping.exe tool to verify TCP/IP connectivity, and use Cluster Administrator to verify that the Cluster service can be contacted. In the Connect to dialog box in Cluster Administrator, use the TCP/IP addresses of the network adapters in the other nodes.

 If the node cannot contact any other node, the service continues with the form phase. It attempts to locate information about the quorum in the local cluster database, and then tries to mount the quorum disk. If the quorum disk cannot be mounted, the service does not start. If another node has successfully started and has ownership of the quorum, the service does not start; this situation is usually caused by connectivity or authentication issues. If this is not the case, you can check the status of the quorum device by starting the service with the -fixquorum switch, and then attempt to bring the quorum disk online or change the quorum location for the service. Also, check the System event log for disk errors. If the quorum disk successfully comes online, it is likely that the quorum is corrupted. To correct this issue, see the following Microsoft Knowledge Base articles:

Windows NT 4.0:

172951 How to Recover from a Corrupted Quorum Log

Windows 2000:

245762 Recovering from a Lost or Corrupted Quorum Log
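
When a cluster has several networks, the Ping.exe connectivity check described earlier can be repeated for every address of the other nodes. The following Python sketch is one possible way to do that, assuming it runs on the Windows node that is having problems. The function names and the sample addresses are hypothetical. An exit code of 0 from Ping.exe is treated as reachable, which is a simplification, so confirm any doubtful result by reading the Ping.exe output yourself.

```python
import subprocess
from typing import Callable, Dict, Iterable, List


def build_ping_command(address: str, count: int = 2) -> List[str]:
    """Build a Ping.exe command line; -n sets the number of echo requests on Windows."""
    return ["ping", "-n", str(count), address]


def run_command(command: List[str]) -> int:
    """Run a command, discard its output, and return its exit code."""
    completed = subprocess.run(
        command, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    return completed.returncode


def check_peers(
    addresses: Iterable[str],
    runner: Callable[[List[str]], int] = run_command,
) -> Dict[str, bool]:
    """Ping each address and map it to True when the command exits with code 0."""
    return {addr: runner(build_ping_command(addr)) == 0 for addr in addresses}


if __name__ == "__main__":
    # Hypothetical addresses; replace them with the adapter addresses of the other nodes.
    for address, reachable in check_peers(["192.0.2.10", "192.0.2.11"]).items():
        print(address, "reachable" if reachable else "NOT reachable")
```

The `runner` parameter exists so that the logic can be exercised without sending network traffic, for example by passing a function that returns a fixed exit code.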

 Check the attributes of the Cluster.log file to make sure that it is not read-only, and make sure that no policy is in effect that prevents modification of the Cluster.log file. If either of these conditions exists, the Cluster service cannot start.
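
The read-only attribute can also be checked from a script. The following Python sketch tests only the file's write permission bit, which on Windows reflects the read-only attribute; it does not detect a policy that blocks modification of the file. The `is_read_only` function name, the `%SystemRoot%\Cluster\Cluster.log` location, and the `C:\WINNT` fallback are assumptions made for this example, so adjust the path if your cluster log is stored elsewhere.

```python
import os
import stat


def is_read_only(path: str) -> bool:
    """Return True when the owner write bit of the file is cleared."""
    mode = os.stat(path).st_mode
    return not (mode & stat.S_IWUSR)


if __name__ == "__main__":
    # Assumption: the log is in %SystemRoot%\Cluster, with C:\WINNT as a fallback root.
    root = os.environ.get("SystemRoot", r"C:\WINNT")
    log_path = os.path.join(root, "Cluster", "Cluster.log")
    if not os.path.exists(log_path):
        print("Cluster.log not found at " + log_path)
    elif is_read_only(log_path):
        print("Cluster.log is read-only; clear the attribute before starting the service")
    else:
        print("Cluster.log is writable")
```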

If these steps do not resolve the problem, you should take additional troubleshooting steps. The cluster log file can be valuable in additional troubleshooting. By default, cluster logging is enabled on Windows 2000-based computers that are running the Cluster service. To enable cluster logging on Windows NT 4.0-based computers, see the following Microsoft Knowledge Base article:

168801 How to Enable Cluster Logging in Microsoft Cluster Server

Additional query words: mscs

Keywords: kbhowto kbtshoot kbclustering KB266274


© Microsoft Corporation. All rights reserved.