Microsoft KB Archive/325615

From BetaArchive Wiki
Knowledge Base


Article ID: 325615

Article Last Modified on 12/3/2007



APPLIES TO

  • Microsoft Windows 2000 Service Pack 1
  • Microsoft Windows 2000 Service Pack 2
  • Microsoft Windows 2000 Service Pack 3
  • Microsoft Windows 2000 Advanced Server
  • Microsoft Windows Server 2003, Standard Edition (32-bit x86)
  • Microsoft Windows Server 2003, Enterprise Edition (32-bit x86)
  • Microsoft Windows Server 2003, Datacenter Edition (32-bit x86)
  • Microsoft Windows Small Business Server 2003 Premium Edition
  • Microsoft Windows Small Business Server 2003 Standard Edition



This article was previously published under Q325615

SYMPTOMS

When you try to create a new mirror volume or add a mirror to an existing volume on a dynamic disk, you may receive the following error message:

Logical Disk Manager

Operation aborted due to disk I/O error.

Disk Management now displays a yellow exclamation mark on the disk that contained the I/O error. If you right-click the drive and then click Reactivate, this clears the error.

CAUSE

If you look in the system event log, you see event messages similar to the following, posted by DMIO, that report read or write errors from the affected disk:

Event Type: Information
Event Source:   dmio
Event Category: None
Event ID:   29
Computer:       Computer_name
Description:    dmio: Harddisk1 read error at block 3328: status 0xc000009c

Event Type: Information
Event Source:   dmio
Event Category: None
Event ID:   26
Computer:   Computer_name
Description:    dmio: Found a bad block on disk Harddisk1 at block number 3328 
                    

where the status of 0xC000009C is STATUS_DEVICE_DATA_ERROR.
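To make the event text easier to work with, the following minimal Python sketch pulls the disk name, block number, and status code out of a dmio description string. The regular expression is modeled only on the sample event 29 text above; the "write error" variant of the wording, the function name parse_dmio_io_error, and the dictionary layout are assumptions for illustration. Only the one status code named in this article is mapped.

```python
import re

# Only the status code named in the article is mapped; anything else is "unknown".
NTSTATUS_NAMES = {0xC000009C: "STATUS_DEVICE_DATA_ERROR"}

# Pattern modeled on the sample event 29 description. The "write" alternative
# is an assumption; the article only shows the read form.
IO_ERROR_RE = re.compile(
    r"dmio: (?P<disk>\S+) (?P<op>read|write) error at block (?P<block>\d+): "
    r"status (?P<status>0x[0-9a-fA-F]+)"
)


def parse_dmio_io_error(description):
    """Return the fields of a dmio I/O error description, or None if it does not match."""
    match = IO_ERROR_RE.search(description)
    if match is None:
        return None
    status = int(match.group("status"), 16)
    return {
        "disk": match.group("disk"),
        "op": match.group("op"),
        "block": int(match.group("block")),
        "status": status,
        "status_name": NTSTATUS_NAMES.get(status, "unknown"),
    }


sample = "dmio: Harddisk1 read error at block 3328: status 0xc000009c"
print(parse_dmio_io_error(sample))
```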

If you run Chkdsk against the source volume with the /F and /R switches to locate bad sectors and recover readable information, Chkdsk successfully marks those sectors (clusters) as "bad", as seen in the bad sectors line of the Chkdsk output.

EXAMPLE:

C:\>chkdsk D: /f /r
The type of the file system is NTFS.
Volume label is MASTER.

CHKDSK is verifying files (stage 1 of 5)...
File verification completed.
CHKDSK is verifying indexes (stage 2 of 5)...
Index verification completed.
CHKDSK is verifying security descriptors (stage 3 of 5)...
Security descriptor verification completed.
CHKDSK is verifying file data (stage 4 of 5)...
File data verification completed.
CHKDSK is verifying free space (stage 5 of 5)...
Free space verification is complete.
Adding 1 bad clusters to the Bad Clusters File.  <-- bad sectors marked.
Correcting errors in the Volume Bitmap.
Windows has made corrections to the file system.

   2047999 KB total disk space.
        20 KB in 2 files.
         4 KB in 9 indexes.
         2 KB in bad sectors.  <-- note bad sector(s).
     12851 KB in use by the system.
     12288 KB occupied by the log file.
   2035122 KB available on disk.

      2048 bytes in each allocation unit.
   1023999 total allocation units on disk.
   1017561 allocation units available on disk.
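
The figures in the sample output can be cross-checked with simple arithmetic. The short Python sketch below uses only numbers from the example above; the helper name units_to_kb is illustrative. One bad cluster of 2048 bytes accounts for the 2 KB reported in bad sectors, and the available allocation units multiply out to the reported available space. The total computed from whole allocation units comes to 2047998 KB, 1 KB less than the reported total; the sketch does not attempt to explain that difference.

```python
# Values copied from the sample Chkdsk output.
BYTES_PER_ALLOCATION_UNIT = 2048
TOTAL_UNITS = 1023999
AVAILABLE_UNITS = 1017561
BAD_CLUSTERS = 1


def units_to_kb(units, bytes_per_unit=BYTES_PER_ALLOCATION_UNIT):
    """Convert a count of allocation units to whole kilobytes."""
    return units * bytes_per_unit // 1024


print(units_to_kb(BAD_CLUSTERS), "KB in bad sectors")
print(units_to_kb(AVAILABLE_UNITS), "KB available on disk")
print(units_to_kb(TOTAL_UNITS), "KB covered by whole allocation units")
```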
                    

As a result of Chkdsk finding bad blocks, the following event message is also posted:

Event Type: Warning
Event Source:   dmio
Event Category: None
Event ID:   35
Computer:   Computer_name
Description: dmio: Disk Harddisk1 block 3168 (mountpoint D:): Uncorrectable read error 
                    

Although Chkdsk instructs the file system not to use those sectors, when you try to establish the mirror again, you still receive the same error message from Logical Disk Manager (LDM) and the same DMIO system event log messages.

When it establishes software mirrors on dynamic disks, DMIO does a sector-by-sector copy of the source disk to the destination disk. DMIO does not know or care which sectors contain data or which sectors may have been marked "bad" by Chkdsk. Chkdsk marks those bad sectors only in the file system (FAT, FAT32, or NTFS), so that the file system does not try to use them. DMIO operates below the file system, and if it finds I/O errors while reading from a sector on the source disk or while trying to write the data to the destination disk, it aborts the mirroring operation.
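
The difference between the two layers can be illustrated with a toy model. The Python sketch below is not DMIO code; ToyDisk, DiskIOError, and build_mirror are hypothetical names, and the disk is reduced to a list of sectors. It shows only the behavior described above: the copy loop visits every sector in order, never consults the file system's bad cluster information, and aborts on the first read or write error.

```python
class DiskIOError(Exception):
    """Raised by the toy disk when a sector cannot be read or written."""


class ToyDisk:
    def __init__(self, sector_count, bad_sectors=()):
        self.sectors = [b"\x00"] * sector_count
        self.bad_sectors = set(bad_sectors)

    def read(self, n):
        if n in self.bad_sectors:
            raise DiskIOError(f"read error at block {n}")
        return self.sectors[n]

    def write(self, n, data):
        if n in self.bad_sectors:
            raise DiskIOError(f"write error at block {n}")
        self.sectors[n] = data


def build_mirror(source, destination):
    """Copy every sector in order; abort on the first I/O error on either disk."""
    for n in range(len(source.sectors)):
        try:
            destination.write(n, source.read(n))
        except DiskIOError as err:
            return f"Operation aborted due to disk I/O error ({err})"
    return "mirror established"


# What the file system knows after Chkdsk: avoid this location for files.
# The copy loop above never looks at this set.
fs_bad_clusters = {5}

source = ToyDisk(8, bad_sectors={5})
destination = ToyDisk(8)
print(build_mirror(source, destination))
```

In this model, marking a cluster bad in fs_bad_clusters changes nothing for build_mirror, which is why the mirror attempt fails again after Chkdsk has run.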

RESOLUTION

If the disk that contains the bad block is a small computer system interface (SCSI) disk drive, back up the data from the affected disk, and then perform a low-level format of the affected disk. To do this, use a utility supplied by the disk manufacturer, or use the disk controller itself, to mark those sectors "bad" at the hardware level. Then, re-create the volume and try to establish the dynamic mirror. If the mirroring is successful, restore your data to the now mirrored volume.

If the disk that contains bad blocks is an integrated device electronics (IDE) disk, you must back up the data and then replace the disk with a new IDE disk before you establish the mirror.

STATUS

This behavior is by design.

MORE INFORMATION

SECTOR SPARING:

Sector sparing technology permits software to communicate directly with a disk to mark a defective sector (block) as "bad." The SCSI specification defines a command (REASSIGN BLOCKS, operation code 0x07) for reassigning bad blocks. When Windows comes across a bad disk block on a dynamic disk, Dmio.sys may call IOCTL_DISK_REASSIGN_BLOCKS for any bad sectors found.

The IOCTL_DISK_REASSIGN_BLOCKS operation maps defective blocks to a new location on the disk. This request instructs the device to reassign the bad block address to a good block from its spare-block pool, and the SCSI drive then returns a status of either "failed" or "success." If the request succeeds, the SCSI drive remaps that sector to a spare, so that whenever a read or write operation is performed on the original bad sector number (or numbers), the SCSI drive redirects the I/O to the newly remapped sector (or sectors).
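
The remapping described above can be pictured with a small model. The Python sketch below is an illustration under stated assumptions, not drive firmware or the Windows API: SparingDrive and its methods are hypothetical, spare blocks are simply numbered after the last normal block, and a request fails as a whole when the spare pool cannot cover it.

```python
class SparingDrive:
    """Toy drive with a spare-block pool and a remap table (hypothetical model)."""

    def __init__(self, sector_count, spare_count):
        self.data = {}    # physical block -> payload
        self.remap = {}   # original block number -> spare block
        # Assumption: spares are numbered directly after the normal blocks.
        self.spares = list(range(sector_count, sector_count + spare_count))

    def reassign_blocks(self, block_numbers):
        """Remap each block to a spare. Return True for success, False for failed."""
        needed = sorted(b for b in set(block_numbers) if b not in self.remap)
        if len(needed) > len(self.spares):
            return False  # spare pool exhausted; nothing is remapped
        for block in needed:
            self.remap[block] = self.spares.pop(0)
        return True

    def physical(self, block):
        """Resolve a block number to where the I/O actually lands."""
        return self.remap.get(block, block)

    def write(self, block, payload):
        self.data[self.physical(block)] = payload

    def read(self, block):
        return self.data.get(self.physical(block))


drive = SparingDrive(sector_count=4096, spare_count=2)
print(drive.reassign_blocks([3328]))   # request succeeds while spares remain
drive.write(3328, b"payload")          # caller still uses the original number
print(drive.physical(3328), drive.read(3328))
```

The caller keeps addressing block 3328; only the drive knows that the I/O now lands on a spare.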

Although IDE drives do not support this sector sparing functionality and do not accept external software commands to tell the disk that it contains a bad block, some IDE drives do support internal remapping of sectors as they start going bad. This functionality is handled by the internal firmware on the IDE disk itself. If internal remapping is aggressive enough, the operating system (and therefore the end user) never knows that the disk has experienced bad blocks.

Although sector sparing on SCSI disks is supported, it is not performed immediately when a read error occurs, but only during a write operation that follows a read failure.

After a read failure, DMIO records the offset of the bad sector in a bad sector list. The next write that hits one of the recorded bad sectors triggers IOCTL_DISK_REASSIGN_BLOCKS. A sector can stay in the bad sector list for five minutes without being written to; the reassignment is triggered only if a write reaches the sector within that time.

This behavior is most effective for healthy mirrors. When a read from one plex fails, the bad sector is recorded. DMIO then reads from the other plex and writes back to the first plex. The bad sector is reassigned during that write operation to maintain data integrity.

In the Add Mirror case where the bad sector is on the source disk, the bad sector number from the source disk is recorded in the bad sector list but is never written to. As a result, it is never reassigned (unless the user finds a way to force a write to that bad sector within five minutes of the read failure), and the mirror operation is aborted.

In the Add Mirror case where the bad sector is on the destination disk, the write failure does not add the bad sector number to the bad sector list. Therefore, it is not reassigned and the mirror operation is aborted.

Therefore, even though the code dealing with bad sectors is always active, the critical part (calling IOCTL_DISK_REASSIGN_BLOCKS) occurs only when a read operation and then a write-back operation are run on an already established mirror. This current design prevents a mirror from being established if a read or write error occurs on either the source or destination disk.
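
The read-then-write rule and the five-minute window can be sketched as a small state machine. The Python model below is an interpretation of the description above, not DMIO source: BadSectorList and its method names are hypothetical, time is passed in explicitly as seconds, and treating an entry older than five minutes as expired is an assumption about how the timeout behaves.

```python
BAD_SECTOR_TIMEOUT_SECONDS = 5 * 60  # five-minute window from the article


class BadSectorList:
    """Toy model: read failures are recorded, a later write triggers reassignment."""

    def __init__(self, reassign):
        self._entries = {}         # sector -> time of the read failure
        self._reassign = reassign  # stand-in for IOCTL_DISK_REASSIGN_BLOCKS

    def on_read_failure(self, sector, now):
        self._entries[sector] = now

    # Deliberately no on_write_failure: write failures are not recorded.

    def on_write(self, sector, now):
        """Return True if this write triggered a reassignment."""
        recorded = self._entries.pop(sector, None)
        if recorded is None:
            return False           # sector was never recorded as bad
        if now - recorded > BAD_SECTOR_TIMEOUT_SECONDS:
            return False           # assumed: entry expired before the write arrived
        self._reassign(sector)
        return True


reassigned = []
bad_list = BadSectorList(reassign=reassigned.append)

# Healthy mirror: the read from one plex fails, the write-back follows at once.
bad_list.on_read_failure(3328, now=0)
print(bad_list.on_write(3328, now=1))

# Add Mirror, bad sector on the source: recorded, but a write arrives too late
# (or, in practice, never).
bad_list.on_read_failure(3168, now=10)
print(bad_list.on_write(3168, now=10 + 301))

# Add Mirror, bad sector on the destination: the sector was never recorded.
print(bad_list.on_write(9999, now=20))

print(reassigned)
```

Only the healthy-mirror sequence ends with a reassignment, which matches the three cases described above.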

ADDITIONAL NOTE:

When DMIO successfully performs sector sparing (reassigns bad blocks) during a write operation on an already established mirror, the following event messages are posted.

Event Type: Information
Event Source:   dmio
Event Category: None
Event ID:   23
Computer:   Computer_name
Description: dmio: Reassigning bad block number 3328 on disk Harddisk1 

Event Type: Information
Event Source:   dmio
Event Category: None
Event ID:   24
Computer:   Computer_name
Description: dmio: Reassign bad block(s) on disk Harddisk1 succeeded 
                    

By design, Autochk.exe and Chkdsk.exe do not perform sector sparing. These utilities only record the defective sectors (blocks) in the bad cluster tables managed by the file system (FAT, FAT32, or NTFS) on the volume. If Chkdsk finds any clusters that contain one or more bad blocks, the clusters are marked as "bad" so that the file system does not try to use those clusters.
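
The "one or more bad blocks makes the whole cluster bad" rule is a simple integer division. The Python sketch below assumes 512-byte sectors, the 2048-byte allocation unit from the Chkdsk example, and sector numbers counted from the start of the volume; the function name is hypothetical, and real file systems add their own offsets that this sketch ignores. The block numbers in dmio events are disk-relative, so they would need to be translated before being used this way.

```python
def clusters_with_bad_sectors(bad_sectors, bytes_per_sector=512, bytes_per_cluster=2048):
    """Return the sorted cluster numbers that contain at least one bad sector.

    Sector numbers are assumed to be relative to the start of the volume.
    """
    sectors_per_cluster = bytes_per_cluster // bytes_per_sector
    return sorted({sector // sectors_per_cluster for sector in bad_sectors})


# Two bad sectors in the same cluster still produce a single bad cluster.
print(clusters_with_bad_sectors([3168, 3169, 3328]))
```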


Additional query words: LDM

Keywords: kberrmsg kbprb KB325615