Microsoft KB Archive/323247

= Overview of Basic File Replication Service Concepts and Terminology =

PSS ID Number: 323247

Article Last Modified on 4/29/2003


The information in this article applies to:


 * Microsoft Windows 2000 Server SP2
 * Microsoft Windows 2000 Server SP3
 * Microsoft Windows 2000 Advanced Server SP2
 * Microsoft Windows 2000 Advanced Server SP3




This article was previously published under Q323247



== SUMMARY ==
This article contains an overview of basic Windows 2000 File Replication Service (FRS) concepts and terminology.



== MORE INFORMATION ==
FRS is used by Active Directory to synchronize system policies and logon scripts that are stored in the system volume across domain controllers. It is also used by the Microsoft Distributed File System (DFS) to synchronize content between assigned members in a replica set. FRS is a multithreaded replication engine. This means that FRS can copy and maintain shared files and folders on multiple servers simultaneously. When changes occur, content is synchronized immediately within a site, and according to a schedule between sites.

The multi-master replication model that is used by FRS makes it possible for updates to occur independently on any server in the domain. Any domain controller or member server can propagate changes to replicated files and folders on any other domain controller or member server.

FRS does not guarantee the order in which files arrive. Files start replication in sequential order based on when the files are closed, but file size and link speed determine the order of completion. Because FRS replicates whole files, the whole file is replicated even if you only change a single byte in the file.
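
The difference between start order and completion order can be illustrated with a small model. The following Python sketch is not FRS code. The `ReplicatedFile` type, the `completion_order` function, and the file names and sizes are hypothetical, and the model assumes that every transfer begins when the file is closed and receives the full link speed. Because FRS replicates whole files, the transfer cost in the model is the full file size, regardless of how little of the file changed.

```python
from dataclasses import dataclass


@dataclass
class ReplicatedFile:
    name: str
    close_time_s: float  # when the file was closed, in seconds from an arbitrary origin
    size_bytes: int      # whole-file size; the entire file is transferred


def completion_order(files, link_bytes_per_s):
    """Return file names ordered by estimated completion time.

    Assumption (not taken from FRS): each transfer starts at the file's
    close time and is not slowed down by other transfers on the link.
    """
    finish = []
    for f in files:
        done = f.close_time_s + f.size_bytes / link_bytes_per_s
        finish.append((done, f.name))
    return [name for _, name in sorted(finish)]


files = [
    ReplicatedFile("large.iso", close_time_s=0.0, size_bytes=600_000_000),
    ReplicatedFile("logon.bat", close_time_s=5.0, size_bytes=2_000),
]

# Replication starts in the order in which the files were closed ...
start_order = [f.name for f in sorted(files, key=lambda f: f.close_time_s)]
# ... but the small file completes long before the large one.
finish_order = completion_order(files, link_bytes_per_s=1_000_000)

print(start_order)
print(finish_order)
```

In this model the large file starts first but finishes last, so a consumer of replicated data should not rely on files arriving in the order in which they were written.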

FRS is automatically installed on Windows 2000 domain controllers and is configured to start automatically. By default, for member servers, the startup type is set to manual.

=== Key Terms and Concepts ===
FRS uses specific terminology that is generally used to describe its features, activities, properties, and functions. The following is a list of key terms and concepts that are associated with FRS:
 * Change Order: When a change is made to a file or folder on a replica member, information about that change, such as the name of the file and the ID of the member, is used to construct a message that is named a change order. The change order is sent to the member's outbound partners. Each partner that accepts the change requests the associated staging file. After the change is installed on their individual replica trees, the partners each propagate the change order to their own outbound partners.
 * DFS: DFS is a separate service in Windows 2000 that provides a method to construct a global namespace that spans multiple servers and shares on a network. DFS transparently links file servers and shared folders, and then maps them to a single hierarchy so they can be accessed from one location, although the data is actually distributed in different locations. In this way it can provide high availability, load sharing, and reduced latency by referring the client to a server in a close or local site. Domain-based DFS uses FRS to replicate content between Windows 2000–based servers that host DFS roots or replica sets.
 * File Event Time: The time at which a file system modification to a file or folder is made. This may not be the same as the "file create" or "last-write" time. For example, when you restore a file from a backup tape, the "file create" and "last-write" times are preserved, but the file event time is the time when the actual file restoration is performed.
 * File GUID: The file GUID identifies the file or folder. It is created and managed by the replication service. The file GUID, together with the replication version number and event time, is stored in the File ID Table in the FRS database. Corresponding files and folders across all replica set members have the same file GUID.
 * File ID Table: This is a table in the FRS database that contains an entry with the version and identity information for each file and folder in the replica tree.
 * File Object ID: See the File GUID description.
 * File Version Number: Each time an update to a file is "published" by FRS, the file version number is incremented. The file version number is used to resolve concurrent updates that originate from more than one member of the replica set. The version number is only incremented by the member that originated the file update. Other members that propagate the update do not change the version number.
 * Identity-Based Replication: All objects in a replica tree have a unique ID assigned to them. In FRS, the NTFS file system Object ID attribute that contains a 16-byte GUID is used. The same object on all replica members has the same object ID. This makes it possible to locate the object unambiguously by using the object's GUID and the corresponding parent GUID.
 * Inbound Log: The set of change orders that arrive from all inbound replica partners are placed in the inbound replica log in the order that they are received. The inbound log is a table in the FRS database.
 * Inbound Replica Partners of Computer X: The set of computers that provide data to computer X for a replica tree. These are also known as "upstream partners".
 * Initial master: The first member in a replica set that is the starting point for automatic replication. This means that the files and folders in that replica are replicated to other replicas for the first replication cycle.
 * Last Writer Wins: A reconciliation policy that is used to decide the outcome when two or more users on different computers modify the same file in a replica tree. In most cases, FRS uses the file event time that is associated with each user's change to decide the version of the file to keep. The remaining versions are lost.
 * Local Change Order: A change order that is created because of a change to a file or folder on the local computer. The local computer becomes the originator of the change order and constructs a staging file.
 * Loosely Coherent: Data that is coherent is data that is the same across the network. If data is coherent, data on all servers is synchronized. One type of software system that provides data coherency is a revision control system (RCS). Such a system is typically fairly simple, with only one user being able to modify a specified file at a time. Other users can read the file but cannot change it. In FRS, it is not possible to provide coherent data in a multi-master server environment that is composed of hundreds or thousands of members because not all servers may be connected at the same time. Even if they are connected at the same time, the cost to synchronize is prohibitive. In FRS, the contents of a replica tree are loosely coherent. This means that after all objects are replicated, all replica trees on all connected members have the same data.
 * MD5: A one-way hashing algorithm that is cryptographically secure. The MD5 message-digest algorithm takes a message of arbitrary length as input, and then produces as output a 128-bit "fingerprint" or "message digest" of the input.
 * Morphed Directory Name: When two users create a folder that uses the same name on two different replicas, FRS detects the name conflict during replication. One of the "create" operations wins the name and the other loses it. The operation that loses the name uses a modified name that is composed of the original name with NTFRS_ followed by hexadecimal digits appended to the end of the name.
 * Multi-Master Replication: A replication model where any computer accepts and replicates changes to any other computers that are a part of the replication configuration. This differs from other replication models where a single master computer stores the copy of the information that is to be replicated and other computers store backup copies.
 * Originator GUID: Each member of a replica set has a unique GUID that is assigned to it. All change orders that are produced by this member carry the originator GUID that is saved in the File ID Table.
 * Outbound Log: The set of change orders that is generated for a replica tree. The changes can be generated either locally or from an inbound replica partner. These change orders are eventually sent to all outbound replica partners. The outbound log is a table in the FRS database.
 * Outbound Replica Partners of Computer X: The set of computers to which computer X provides data for a replica tree. These are also known as "downstream partners".
 * Pre-Install Area: A hidden subfolder that is located under the replica root. When a newly created file or folder is replicated, it is first created and installed in the pre-install area. When the installation completes, the file or folder is renamed to its target location in the replica tree.
 * Remote Change Order: A change order that is received from an inbound or upstream partner that originated elsewhere in the replica set.
 * Replica Partner: The immediate upstream and downstream partners of a replica member are referred to as its replication partners. Upstream partners are also referred to as inbound partners. Downstream partners are also referred to as outbound partners.
 * Replica Root: The root folder of a replica tree.
 * Replica Set: Two or more copies of a shared folder that participate in replication. Each copy must be located on a different computer.
 * Replica: A member of a replica set that contains a copy of a shared folder or file.
 * Replica Tree: The contents of the folder that is replicated among the members of a replica set.
 * Replication: The process of copying data from one computer to another so that, in the absence of more changes, the data converges to the same content over time. Replication may cover hundreds of computers, not all of which may be accessible at the same time, so data convergence may take several days. Also, there is no guarantee that the order in which replicated updates are delivered is the same from one computer to another. Replication enhances availability and file sharing by duplicating shared files.
 * Replication Latency: The delay between a data action that is performed on one replica member and the same action that is performed through replication on another member. This can range from seconds to days depending on various factors including replication schedules, networking components, and server loads.
 * Replication Schedule: Each member of a replica set has a schedule that controls when it participates in replication. Additionally, each connection between members can have a schedule that controls when replication occurs between those members. By default, replication is always "on" if no schedule is specified.
 * Replication Topology: A description of the interconnections between replica set members. These interconnections determine the path that data takes as it replicates to all replica members.
 * Retry Change Order: A change order that is in a stage of completion but is blocked for some reason, and must be retried at a later time. Both local and remote change orders can be Retry Change Orders.
 * Staging Area: This is a folder on each member of a replica set where propagated files are "staged" before they are installed locally on the member. This is performed so that files are not locked for a long time while FRS moves files over a slow or congested link.
 * Staging File: When a file or folder in a replica tree is changed, a local change order is generated and a staging file that contains the contents of the object is created. The staging file is then replicated between members of a replica set.
 * Update Sequence Number (USN): NTFS maintains a monotonically increasing sequence number for each volume. This is the update sequence number. Each time a modification is made to a file on the volume, the USN is incremented.
 * USN Journal: NTFS maintains a persistent change log that tracks all files on the volume. The Update Sequence Number (USN) journal records each operation that is performed on a file, such as when a file is created, deleted, or modified. FRS uses the data from the journal to monitor local changes that are made to a replica.
 * Version Vector: This is a vector of Update Sequence Numbers (USNs), in which there is one entry per member of a replica set. All change orders carry the Originator GUID of the originating member and the associated USN. As each member of a replica set receives the update, it tracks the USN in a vector slot that is assigned to the originating member. This vector now describes how up-to-date the replica tree is with respect to each member. The version vector is then used to filter out updates from inbound partners that may have already received the update. It is also delivered to the inbound partner when the two members join. When a new connection is created, the version vector is used to scan the File ID Table for more recent updates that are not seen by the new outbound partner.
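
The way several of these terms fit together (change order, originator GUID, version vector, File ID Table, and last writer wins) can be illustrated with a short Python sketch. This is a simplified model and not the FRS implementation. The `ChangeOrder` and `Member` classes and the `accept` method are hypothetical names. The sketch assumes that change orders from a single originator arrive in USN order, and that a tie in file event time is broken by the file version number. The actual FRS reconciliation rules are not described in this article and may differ.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ChangeOrder:
    file_guid: str        # identifies the file across all replica members
    originator_guid: str  # member that originated the change
    usn: int              # USN assigned on the originating member
    event_time: float     # file event time of the change
    version: int          # file version number set by the originator


@dataclass
class Member:
    # originator GUID -> highest USN received from that originator
    version_vector: dict = field(default_factory=dict)
    # file GUID -> change order currently recorded for that file
    id_table: dict = field(default_factory=dict)

    def accept(self, co):
        # Version vector filter: drop updates that were already received.
        if co.usn <= self.version_vector.get(co.originator_guid, -1):
            return "already seen"
        self.version_vector[co.originator_guid] = co.usn

        # Last writer wins: keep the change with the later event time.
        # Assumption: ties are broken by the higher version number.
        current = self.id_table.get(co.file_guid)
        if current is not None and (co.event_time, co.version) <= (
            current.event_time,
            current.version,
        ):
            return "rejected"

        self.id_table[co.file_guid] = co
        return "installed"


member = Member()
from_a = ChangeOrder("file-1", "member-A", usn=10, event_time=100.0, version=1)
from_b = ChangeOrder("file-1", "member-B", usn=7, event_time=90.0, version=1)

# The second delivery of from_a is filtered by the version vector.
# from_b is new to this member but carries an older event time, so it loses.
results = [member.accept(from_a), member.accept(from_a), member.accept(from_b)]
print(results)
```

In this model the losing change is still recorded in the version vector, so the same change order is not reconsidered if another inbound partner delivers it later.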

