Microsoft KB Archive/814623

= Cluster Shared Storage Remains Online After Multiple Hard Disks Fail =

Article ID: 814623

Article Last Modified on 2/22/2007

-

APPLIES TO


 * Microsoft Windows 2000 Advanced Server
 * Microsoft Windows 2000 Datacenter Server

-





SYMPTOMS
Assume that you have a cluster services shared storage group that contains a redundant array of independent disks (RAID) array. When multiple hard disks in the RAID array fail, the Physical disk resource that corresponds to the shared storage group may unexpectedly remain online.

Even though the RAID array has failed in this scenario, the Physical disk resource remains online. It passes the small computer system interface (SCSI) reserve checks, and it also passes the LooksAlive and IsAlive polling that the cluster service performs.



CAUSE
This issue occurs because the current Microsoft Cluster Services code determines that if the reservation on a hard disk is successful, the disk is alive even if the write operation to sector 12 on the hard disk is unsuccessful. Therefore, the LooksAlive function is also successful. Additionally, if there is no hard disk activity, the IsAlive function may retrieve data from the in-memory cache instead of from the hard disk itself.
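The decision logic described above can be illustrated with a small model. This is only a sketch and not the actual Clusdisk.sys code: the `DiskProbe` type and both function names are hypothetical. The post-hotfix rule follows the description in the MORE INFORMATION section, and it assumes that a failed reservation still marks the disk as failed.

```python
from dataclasses import dataclass


@dataclass
class DiskProbe:
    """Hypothetical result of one health check against a shared disk."""
    reserve_ok: bool         # the SCSI reservation succeeded
    sector12_write_ok: bool  # the write operation to sector 12 succeeded


def disk_alive_before_hotfix(probe: DiskProbe) -> bool:
    # Behavior described in this article: a successful reservation alone
    # marks the disk as alive; the sector 12 write result is ignored.
    return probe.reserve_ok


def disk_alive_after_hotfix(probe: DiskProbe) -> bool:
    # Behavior described for the hotfix: an unsuccessful write to sector 12
    # marks the disk as failed. Treating a failed reservation as a failure
    # is an assumption of this sketch.
    return probe.reserve_ok and probe.sector12_write_ok


# A RAID array that still accepts reservations but can no longer be written to.
failed_array = DiskProbe(reserve_ok=True, sector12_write_ok=False)

print(disk_alive_before_hotfix(failed_array))  # True: resource stays online
print(disk_alive_after_hotfix(failed_array))   # False: resource is failed
```

In this model, the failed array is reported as alive by the original check and as failed by the updated check, which matches the symptom and the fix that this article describes.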



RESOLUTION
A supported hotfix is now available from Microsoft, but it is intended to correct only the problem that is described in this article. Apply it only to systems that are experiencing this specific problem. This hotfix may receive additional testing. Therefore, if you are not severely affected by this problem, Microsoft recommends that you wait for the next Microsoft Windows 2000 service pack that contains this hotfix.

To resolve this problem immediately, contact Microsoft Product Support Services to obtain the hotfix. For a complete list of Microsoft Product Support Services phone numbers and information about support costs, visit the following Microsoft Web site:

http://support.microsoft.com/contactus/?ws=support

Note In special cases, charges that are ordinarily incurred for support calls may be canceled if a Microsoft Support Professional determines that a specific update will resolve your problem. The usual support costs will apply to additional support questions and issues that do not qualify for the specific update in question.

The English version of this hotfix has the file attributes (or later) that are listed in the following table. The dates and times for these files are listed in coordinated universal time (UTC). When you view the file information, it is converted to local time. To find the difference between UTC and local time, use the Time Zone tab in the Date and Time tool in Control Panel.

   Date         Time   Version        Size    File name
   20-Jun-2003  10:46  5.0.2195.6759  75,568  Clusdisk.sys
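As an alternative to working out the offset by hand, the UTC-to-local conversion can be sketched in a few lines of Python. This is an illustration only: the `to_fixed_offset` helper and the UTC-7 offset are assumptions chosen for the example, and the time stamp is the Clusdisk.sys value listed above.

```python
from datetime import datetime, timedelta, timezone

# Time stamp of Clusdisk.sys as listed in this article (UTC).
FILE_TIME_UTC = datetime(2003, 6, 20, 10, 46, tzinfo=timezone.utc)


def to_fixed_offset(utc_time: datetime, offset_hours: float) -> datetime:
    """Convert an aware UTC time to a fixed-offset zone (hypothetical helper)."""
    return utc_time.astimezone(timezone(timedelta(hours=offset_hours)))


# Convert to the time zone that this computer is configured for.
local_time = FILE_TIME_UTC.astimezone()
print("Local:", local_time.strftime("%Y-%m-%d %H:%M"),
      "UTC offset:", local_time.utcoffset())

# Example with an assumed offset of UTC-7.
example = to_fixed_offset(FILE_TIME_UTC, -7)
print(example.strftime("%Y-%m-%d %H:%M"))  # 2003-06-20 03:46
```

The first conversion depends on the local settings of the computer that runs it, so its output varies. The fixed-offset example is deterministic.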



STATUS
Microsoft has confirmed that this is a problem in the Microsoft products that are listed in the "Applies to" section of this article.



MORE INFORMATION
This update changes the cluster services code so that if a write operation to sector 12 is unsuccessful, the hard disk is determined to have failed. For additional information about how to obtain a hotfix for Windows 2000 Datacenter Server, click the following article number to view the article in the Microsoft Knowledge Base:

265173 The Datacenter Program and Windows 2000 Datacenter Server Product

Keywords: kbqfe kbfix kbbug KB814623

-


© Microsoft Corporation. All rights reserved.