Microsoft KB Archive/822400

= Description of disaster recovery options for Microsoft SQL Server =

Article ID: 822400

Article Last Modified on 11/2/2007

-

APPLIES TO


 * Microsoft SQL Server 2005 Standard Edition
 * Microsoft SQL Server 2005 Developer Edition
 * Microsoft SQL Server 2005 Enterprise Edition
 * Microsoft SQL Server 2005 Express Edition
 * Microsoft SQL Server 2005 Workgroup Edition
 * Microsoft SQL Server 2000 Standard Edition

-



SUMMARY
This article discusses various solutions for recovering data from a Microsoft SQL Server database, if a disaster occurs. This article also discusses the advantages and the disadvantages of each solution.

Disaster recovery is a process that you can use to help recover information systems and data, if a disaster occurs.

Some examples of disasters include a natural or a man-made disaster such as a fire, or a technical disaster such as a two-disk failure in a Redundant Array of Independent Disks (RAID) 5 array.

Disaster recovery planning is the work that is devoted to preparing all the actions that must occur in response to a disaster. The planning includes the selection of a strategy to help recover valuable data. The selection of the appropriate disaster recovery strategy depends on your business requirements.

Note The solutions that are discussed in this article only provide general descriptions of the technologies that you can use. These general descriptions are for comparing the various disaster recovery methods and the disaster recovery plans. Before you decide on which disaster recovery solution is best for you, make sure that you look at each of the suggested disaster recovery solutions in more detail. After discussing each disaster recovery solution, this article contains links where you can find additional information about that solution.

Failover clustering
Microsoft SQL Server 2000 failover clustering is designed to failover automatically if a hardware failure or a software failure occurs. You can use SQL Server 2000 failover clustering to create a failover cluster for a single instance of SQL Server 2000 or for multiple instances of SQL Server 2000. Failover clustering allows a database system to automatically switch the processing of an instance of SQL Server from a failed server to a working server. Therefore, failover clustering is helpful if an operating system failure occurs or if you perform a planned upgrade of the database system resources. Also, failover clustering increases server availability with no downtime.

Because failover clustering is designed for high server availability with almost no server downtime, the clustered nodes should be geographically close to each other. Failover clustering may not be useful if a disk array failure occurs.

Note To implement failover clustering, you must install Microsoft SQL Server 2000 Enterprise Edition.

The following operating systems support failover clustering:
 * Microsoft Windows NT 4.0, Enterprise Edition
 * Microsoft Windows 2000 Advanced Server
 * Microsoft Windows 2000 Datacenter Server
 * Microsoft Windows Server 2003, Enterprise Edition
 * Microsoft Windows Server 2003, Datacenter Edition

These operating systems include an installable component, Microsoft Cluster Service (MSCS). To implement failover clustering for SQL Server, you must install MSCS.

For more information about MSCS and its installation, click the following article number to view the article in the Microsoft Knowledge Base:

259267 Microsoft Cluster Service installation resources

Advantage
You have high server availability. Failover clustering automatically occurs if the primary server fails.

Disadvantages

 * You incur a greater expense. The maintenance of two servers is two times the cost of maintaining a single server. Because you have to maintain two servers at the same time, it is more expensive to install and maintain clustered nodes.
 * Servers should be in the same location. If the branches of the organization are across the globe and the Active/Active clusters must be implemented in the branches, the networking and the storage infrastructure that you have to use is very different from a standard quorum device server cluster. Therefore, although it is possible, it is best not to use geographically distant servers.
 * You have no protection against a disk array failure.
 * Failover clustering does not allow you to create failover clusters at the database level or at the database object level, such as the table level.

For more information about failover clustering, visit the following Microsoft Web site:

http://msdn2.microsoft.com/en-us/library/aa174512(SQL.80).aspx

For more information about failover clustering, click the following article numbers to view the articles in the Microsoft Knowledge Base:

243218 Installation order for SQL Server 2000 Enterprise Edition on Microsoft Cluster Server

822250 Support WebCast: Microsoft SQL Server 2000 failover clustering disaster recovery procedures

For more information about the Microsoft Support Policy for a SQL Server failover cluster, click the following article number to view the article in the Microsoft Knowledge Base:

327518 The Microsoft support policy for a SQL Server failover cluster

Database mirroring
Database mirroring is a primarily software solution for increasing database availability. You can only implement mirroring on a per-database basis. Mirroring only works with databases that use the full recovery model. The simple and bulk-logged recovery models do not support database mirroring. Therefore, all bulk operations are always fully logged. Database mirroring works with any supported database compatibility level.

Advantages

 * Database mirroring increases data protection.
 * Database mirroring increases availability of a database.
 * Database mirroring improves the availability of the production database during upgrades.

Disadvantages

 * The mirror database should be identical to the principal database. For example, all objects, logins, and permissions should be identical.
 * Database mirroring involves the transfer of information from one computer to another computer over a network. Therefore, the security of the information that SQL Server transfers is very important.

Peer-to-peer transactional replication
Peer-to-peer transactional replication is designed for applications that might read or might modify the data in any database that participates in replication. Additionally, if any servers that host the databases are unavailable, you can modify the application to route traffic to the remaining servers. The remaining servers contain identical copies of the data.

Advantages

 * Read performance is improved because you can spread activity across all nodes.
 * Aggregate update performance, insert performance, and delete performance for the topology resembles the performance of a single node because all changes are propagated to all nodes.

Disadvantages

 * Peer-to-peer replication is available only in SQL Server 2005 Enterprise Edition.
 * All participating databases must contain identical schemas and data.
 * We recommend that each node use its own distribution database. This configuration eliminates the potential for SQL Server 2005 to have a single point of failure.
 * You cannot include tables and other objects in multiple peer-to-peer publications within a single publication database.
 * You must have a publication enabled for peer-to-peer replication before you create any subscriptions.
 * You must initialize subscriptions by using a backup or by setting the value of the subscription synchronization type to replication support only.
 * Peer-to-peer transactional replication does not provide conflict detection or conflict resolution.
 * We recommend that you do not use identity columns.

Maintenance of a warm standby server
You can create and maintain a warm standby server by using either of the following methods:
 * Log shipping
 * Transactional replication

More information about each of these two methods follows.

Log shipping
Log shipping is included in the resource kit for Microsoft SQL Server 7.0, and it is fully incorporated in the Microsoft SQL Server 2000 Enterprise Edition and in the Microsoft SQL Server 2000 Developer Edition. Log shipping uses a standby server that is not used during regular operations. A standby server is useful to help recover data if a disaster occurs. You can only use log shipping at the database level. You cannot use it at the instance level.

When a standby server is restoring transaction logs, the database is in exclusive mode and it is unusable. However, you can run batch reporting jobs between transaction log restorations or Database Console Commands (DBCC) checks to continuously verify the integrity of the standby server. For applications such as decision support servers that require continuous processing on a database server, log shipping is not an appropriate option.

The latency on the standby server is based on how frequently the transaction log backups are taken at the primary server and then applied at the standby server. If the primary server fails, you may lose the changes that were made by the transactions that occurred after your most recent transaction log backup.

For example, if transaction log backups are taken every 10 minutes, transactions during the most recent 10 minutes may be lost. This does not necessarily mean that the data updates that are made to the primary server during the latency period will be lost. Typically, new updates in the primary transaction log can be recovered and applied at the warm standby server with only a small delay in switching from the primary server to the standby server. The main purpose of log shipping is to maintain a warm standby server. If maintaining a warm standby server is your main objective, log shipping is likely to be more appropriate than the other solutions that this article discusses.

Advantages

 * You can recover all database activities. The recovery includes any objects that were created such as tables and views. It also includes security changes such as the new users who were created and any permission changes.
 * You can restore the database faster. The restoration of the database and the transaction log is based on low-level page formats. Therefore, log shipping speeds up the restoration process and results in the fast recovery of data.

Disadvantages

 * The database is unusable during the restoration process because the database is in exclusive mode on the standby server.
 * There is a lack of granularity. During the restoration process, all the changes in the primary server are applied at the standby server. You cannot use log shipping to apply changes to a few tables and to reject the remaining changes.
 * There is no automatic failover of applications. When the primary server fails because of a disaster, the standby server does not failover automatically. Therefore, you must explicitly redirect the applications that connect to the primary server to the standby (failover) server.

Note If your main purpose is to maintain a warm standby server, Microsoft recommends that you use log shipping. The warm standby server reflects all the transactions that occur on the primary server. However, you cannot use the standby server when the primary server is available.

For more information about how to set up a warm standby server by using log shipping, click the following article numbers to view the articles in the Microsoft Knowledge Base:

323135 Microsoft SQL Server 2000 - How to set up log shipping (white paper)

325220 Support WebCast: Microsoft SQL Server 2000 log shipping

For more information about log shipping, visit the following Microsoft Web sites:

http://msdn2.microsoft.com/en-us/library/aa213785(SQL.80).aspx

http://www.microsoft.com/downloads/details.aspx?familyid=7395ec1b-199f-42bc-a31b-2056adf73f94

Transactional replication
You can also use transactional replication to maintain a warm standby server. Transactional replication replicates the data on one server (the publisher) to another server (the subscriber) with less latency than log shipping. You can implement transactional replication at the database object level such as the table level. Therefore, Microsoft recommends that you use transactional replication when you have less data to protect, and you must have a fast recovery plan.

You can use a push subscription to enforce transactional replication between two servers with the primary server as the publisher and the standby server as the subscriber. Transactional replication ensures data replication. When the publisher fails, the subscriber can be used.

This solution is vulnerable to the failure of the publisher and the subscriber at the same time. In such a scenario, you cannot protect the data. In all other scenarios such as the failure of a distributor or a subscriber, it is best to resynchronize the data in the subscriber with the data in the publisher.

You should use transactional replication to maintain a warm standby server only when you do not implement schema changes or you do not implement other changes to the database such as security changes that replication does not support.

Note Replication is not designed for the maintenance of warm standby servers. With replication, you can use replicated data at the subscriber to generate reports. You can also use replication for other general uses without having to perform processing on the relatively busy publisher.

Advantages

 * You can read data on a subscriber while you apply changes.
 * Changes are applied with less latency.

Note This advantage may not be applicable if either of the following is true:
 * Replication agents are not set to Continuous.
 * Replication agents are stopped because of errors that may occur during replication.

Transactional replication may take more time to apply changes because large batch updates must be performed during the replication.

Disadvantages
 Schema changes or security changes that are performed at the publisher after establishing replication will not be available at the subscriber. The distributor in transactional replication uses an Open Database Connectivity (ODBC) connection or an OLE Database (OLEDB) connection to distribute data. However, log shipping uses the RESTORE TRANSACTION low-level Transact-SQL statement to distribute the transaction logs. A RESTORE TRANSACTION statement is much faster than an ODBC connection or an OLEDB connection. Typically, switching servers erases replication configurations. Therefore, you have to configure replication two times:

When you switch to the subscriber.

When you switch back to the publisher.

 If a disaster occurs, you must manually switch servers by redirecting all the applications to the subscriber.

For more information about replication, click the following article number to view the article in the Microsoft Knowledge Base:

195757 Frequently asked questions - SQL Server 7.0 - Replication



Backup and restore feature
The Backup and Restore feature of SQL Server provides an important safeguard to help protect critical data that you store in SQL Server databases. You can create a copy of a database (a backup copy) by using the Backup and Restore feature, and then store the copy of the database in a location that is protected from the potential failure of the server that runs the instance of SQL Server. If you experience a database system failure or database corruption, you can then use the backup copy to re-create the database or to restore the database.

When you plan disaster recovery by using the Backup and Restore feature, also determine how critical the data in the database is. Additionally, determine the restoration requirements for the database. For example, determine the following restoration requirements:  The point that you restore the database to. You have to decide which of the following two you want to do:

Restore the database to the condition of the night before the failure.

Restore the database to the condition of a point of time as close as possible to the time of failure.

 How long the database can be unavailable. Whether you must restore the database immediately.</li></ul>

After you determine the restoration requirements, you can plan a backup process that maintains a set of backups to meet the requirements

You can only restore a database to the condition of the point of time where you performed the most recent backup. Transactions that occurred after that backup may be lost. Therefore, Microsoft recommends that you use the Backup and Restore feature only for non-mission-critical database applications.

Advantages

 * You can back the database up to removable media to help protect against disk failures.
 * You do not have to depend on the network as you do when you use failover clustering or log shipping.

Disadvantages

 * When you back up the database, you cannot perform operations such as table creation, index creation, database shrinking, or nonlogged operations.
 * If a failure occurs, you may lose your most recent data.
 * If a disaster occurs, you must manually restore the database.

Note Before you use your Backup and Restore procedure in a production environment, it is best to thoroughly test this procedure in a test environment.

For more information about the Backup and Restore feature, click the following article numbers to view the articles in the Microsoft Knowledge Base:

325257 Support WebCast: SQL Server 2000 Database Recovery: Backup and Restore

281122 Description of restoring file and filegroup backups in SQL Server

For more information about the Backup and Restore feature, visit the following Microsoft Web sites:

http://msdn2.microsoft.com/en-us/library/aa196617(SQL.80).aspx

http://msdn2.microsoft.com/en-us/library/aa196685(SQL.80).aspx

http://msdn2.microsoft.com/en-us/library/aa178143(SQL.80).aspx

Disk redundancy of data by using a redundant array of independent disks (RAID)
A RAID stores redundant data on multiple disks to provide greater reliability and less downtime for servers. RAID levels 0, 1, and 5 are generally used as recovery options for SQL Server. The RAID technologies that are mentioned allow for the failure and the consequent replacement of a single disk without the server going offline. If multiple disk failures occur, data may not be recoverable. Therefore, Microsoft recommends that you combine redundant data management with a Backup and Restore procedure to help make sure that you do not lose data if a hardware failure or other disaster occurs.

RAID 0 uses striping technology for faster access whereas RAID 1 uses mirroring technology for data reliability. A common technique used in relational database management involves using RAID 0 and RAID 1 together. In this technique, two identical striped arrays of drives are constantly updated so that the information that is stored on both the arrays is the same. If one array fails, the other array automatically takes over until the original array is brought back online.

RAID 5 (also known as striping with parity) uses a single striped disk array with parity bits written together with the data. When any one disk fails, the parity bits can be used to calculate the missing data until you replace the disk. When you replace the disk, you can use the parity information and the remaining data to re-create the data from the failed disk and to copy the re-created data to the new disk. All these operations occur without database system downtime. A RAID provides many other options and features for you to help make sure that your database systems experience as little downtime as possible.

Advantage
You do not lose data if any one disk fails.

Disadvantages

 * It may take a long time to recover the data.
 * If multiple disks fail, you may not be able to recover valuable data.

For more information about RAID, click the following article number to view the article in the Microsoft Knowledge Base:

100110 Overview of redundant arrays of inexpensive disks (RAID)

<div class="references_section">