Microsoft KB Archive/822050

= Cluster Service Stops Responding on a Cluster Node When You Restart the Active Node =

Article ID: 822050

Article Last Modified on 10/30/2006

-

APPLIES TO


 * Microsoft Windows 2000 Advanced Server
 * Microsoft Windows 2000 Datacenter Server
 * Microsoft Windows Server 2003, Datacenter Edition (32-bit x86)
 * Microsoft Windows Server 2003, Enterprise Edition (32-bit x86)

-





SYMPTOMS
When you restart the active node of a server cluster that consists of two or more nodes, you experience all of the following symptoms:

 * If you are running Cluster Administrator on the remaining nodes, you receive the following error message when you try to connect to the cluster:

Cluster ' ' is no longer available.

 * If you try to start Cluster Administrator, Cluster Administrator stops responding, and you may receive the following error message:

An error occurred trying to open the cluster at ' ':

The interface is unknown.

Error ID: 1717 (000006b5).

 * When you view the contents of Cluster.log, you see information similar to the following:

[FM] OnlineGroup: Failed on resource e3f4af72-6454-4199-b9af-fa6f57032a65. Status 70
Microsoft Clustering Service suffered an unexpected fatal error at line 701 of source module D:\nt\private\cluster\service\fm\group.c. The error code was 70.

 * When the restarted cluster node starts successfully, the Cluster Administrator program that is running on the other nodes responds as you expect.
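To check a saved copy of Cluster.log for the failure shown in the symptoms, a short script can search for the OnlineGroup entry. The following is a minimal sketch. The pattern is inferred only from the single sample line in this article, and the function name `find_online_failures` is hypothetical. Real log entries may carry timestamps or other prefixes that this pattern ignores.

```python
import re

# Pattern inferred from the one sample line in this article (an assumption);
# it is searched anywhere in the text, so leading timestamps are tolerated.
FAILED_ONLINE = re.compile(
    r"\[FM\] OnlineGroup: Failed on resource "
    r"(?P<resource>[0-9a-fA-F-]{36})\. Status (?P<status>\d+)"
)


def find_online_failures(log_text):
    """Return (resource_id, status_code) pairs for failed OnlineGroup calls."""
    return [
        (match.group("resource"), int(match.group("status")))
        for match in FAILED_ONLINE.finditer(log_text)
    ]


sample = (
    "[FM] OnlineGroup: Failed on resource "
    "e3f4af72-6454-4199-b9af-fa6f57032a65. Status 70"
)
failures = find_online_failures(sample)
for resource_id, status in failures:
    print(f"resource {resource_id} failed with status {status}")
```

A status of 70 in these entries is the value that this article associates with a paused node.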



CAUSE
This issue occurs if you pause one node of a server cluster and then you restart the active cluster node. When the active node restarts, the paused node tries to bring resource groups online. Because this node is paused, the node cannot make additional connections, and it cannot bring the Quorum disk group online. Error code 70 corresponds to the following error message:

The remote server has been paused or is in the process of being started.

Note This issue can also occur in clusters that have more than two nodes. Even if a non-paused node is in a working state when the active node restarts, the issue occurs if the paused node is the first node that is contacted to take ownership of the quorum disk. In that case, the non-paused node does not have the opportunity to arbitrate for the quorum disk.
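The error codes that appear in this article can be kept in a small lookup table when you triage logs or error dialogs. The following sketch covers only the three codes and messages quoted in this article. It is not a complete Windows error table, and the names `ARTICLE_ERRORS` and `describe` are hypothetical.

```python
# Messages exactly as quoted in this article; not a complete Win32 table.
ARTICLE_ERRORS = {
    70: "The remote server has been paused or is in the process of being started.",
    1717: "The interface is unknown.",
    1753: "There are no more endpoints available from the endpoint mapper.",
}


def describe(code):
    """Return 'decimal (hex): message' for a code mentioned in the article."""
    message = ARTICLE_ERRORS.get(code, "not covered by this article")
    # Eight hex digits, matching the 'Error ID: 1717 (000006b5)' style.
    return f"{code} ({code:08x}): {message}"


print(describe(70))
print(describe(1717))
```

The hexadecimal form is included because Cluster Administrator reports the error ID in both notations.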



RESOLUTION
To resolve this issue, resume the paused cluster node before you restart the active cluster node.

Note Before you resume a paused cluster node, you must first determine whether a cluster node is paused. To do this, follow these steps:

1. Click Start, click Run, type cmd in the Open box, and then click OK.

2. At the command prompt, type cluster node, and then press ENTER. Output that is similar to the following appears.

Note The following sample output is based on a two-node cluster configuration. If you have more than two nodes, the additional nodes also appear in the list.

<pre class="fixed_text">Node          Node ID Status
------------- ------- ------
CLUSTER-1           1 Paused
CLUSTER-2           2 Up</pre>

Note If the only cluster node that is not paused is in the process of restarting, you receive the following error message:

System error 1753 has occurred.

There are no more endpoints available from the endpoint mapper.

3. At the command prompt, type cluster node, followed by the name of the paused cluster node and the /resume switch, and then press ENTER.

For example, to resume the node CLUSTER-1 from the sample output, type cluster node cluster-1 /resume, and then press ENTER. Information that is similar to the following appears:

<pre class="fixed_text">Resuming node 'cluster-1'...

Node          Node ID Status
------------- ------- ------
CLUSTER-1           1 Up</pre>
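The check in the preceding steps can also be scripted against captured output of the cluster node command. The following is a minimal sketch, assuming the three-column layout (node name, numeric node ID, status) shown in the sample output. The function names `parse_node_status` and `resume_commands` are hypothetical. The script only builds the resume commands as text and does not run them.

```python
def parse_node_status(output):
    """Parse `cluster node` output into {node_name: status}.

    Assumes the three-column layout from the sample output in this
    article; header, separator, and blank lines are skipped.
    """
    nodes = {}
    for line in output.splitlines():
        parts = line.split()
        # A data row has at least three fields and a numeric node ID.
        if len(parts) >= 3 and parts[1].isdigit():
            nodes[parts[0]] = " ".join(parts[2:])
    return nodes


def resume_commands(nodes):
    """Build one `cluster node <name> /resume` command per paused node."""
    return [
        f"cluster node {name} /resume"
        for name, status in nodes.items()
        if status.lower() == "paused"
    ]


sample_output = """\
Node          Node ID Status
------------- ------- ------
CLUSTER-1           1 Paused
CLUSTER-2           2 Up
"""

nodes = parse_node_status(sample_output)
for command in resume_commands(nodes):
    print(command)
```

Review the generated commands and run them at the command prompt before you restart the active node. If no node is paused, the list is empty.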

Additional query words: MSCS

Keywords: kbprb KB822050

-


© Microsoft Corporation. All rights reserved.