Microsoft KB Archive/224829

= Description of Windows 2000 and Windows Server 2003 TCP Features =

Article ID: 224829

Article Last Modified on 10/30/2006

-

APPLIES TO


 * Microsoft Windows 2000 Server
 * Microsoft Windows 2000 Advanced Server
 * Microsoft Windows 2000 Professional Edition
 * Microsoft Windows 2000 Datacenter Server
 * Microsoft Windows Server 2003, Enterprise Edition (32-bit x86)
 * Microsoft Windows Server 2003, Standard Edition (32-bit x86)
 * Microsoft Windows Server 2003, Datacenter Edition (32-bit x86)

-



This article was previously published under Q224829



Important This article contains information about how to modify the registry. Make sure to back up the registry before you modify it. Make sure that you know how to restore the registry if a problem occurs. For more information about how to back up, restore, and modify the registry, click the following article number to view the article in the Microsoft Knowledge Base:

256986 Description of the Microsoft Windows registry



SUMMARY
This article describes the following TCP features in Microsoft Windows 2000 and Microsoft Windows Server 2003:
 * TCP Window Size
 * TCP Options Now Supported
 * Windows Scaling - RFC 1323
 * Timestamp - RFC 1323
 * Protection against Wrapped Sequence Numbers (PAWS)
 * Selective Acknowledgments (SACKS) - RFC 2018
 * TCP Retransmission Behavior and Fast Retransmit

The TCP features can be changed by changing the entries in the registry.



MORE INFORMATION
Warning Serious problems might occur if you modify the registry incorrectly by using Registry Editor or by using another method. These problems might require that you reinstall your operating system. Microsoft cannot guarantee that these problems can be solved. Modify the registry at your own risk.

TCP window size
The TCP receive window size is the amount of receive data (in bytes) that can be buffered during a connection. The sending host can send only that amount of data before it must wait for an acknowledgment and window update from the receiving host. The Windows TCP/IP stack is designed to self-tune itself in most environments, and uses larger default window sizes than earlier versions.

Instead of using a hard-coded default receive window size; TCP adjusts to even increments of the maximum segment size (MSS), which is negotiated during connection setup. Adjusting the receive window to even increments of the MSS increases the percentage of full-sized TCP segments utilized during bulk data transmissions.

The receive window size is determined in the following manner:
 * 1) The first connection request sent to a remote host advertises a receive window size of 16K (16,384 bytes).
 * 2) When the connection is established, the receive window size is rounded up to an even increment of the MSS.
 * 3) The window size is adjusted to 4 times the MSS, to a maximum size of 64K, unless the window scaling option (RFC 1323) is used.

Note See the "Windows scaling" section.

For Ethernet connections, the window size will normally be set to 17,520 bytes (16K rounded up to twelve 1460-byte segments). The window size may decrease when a connection is established to a computer that supports extended TCP head options, such as Selective Acknowledgments (SACKS) and Timestamps. These two options increase the TCP header size to more than 20 bytes, which results in less room for data.

In previous versions of Windows NT, the window size for an Ethernet connection was 8,760 bytes, or six 1460-byte segments.

To set the receive window size to a specific value, add the TcpWindowSize value to the registry subkey specific to your version of Windows. To do this, follow these steps:  Click Start, click Run, type Regedit, and then click OK. Expand the registry subkey specific to your version of Windows:  For Windows 2000, expand the following subkey:

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters\Interfaces

 For Windows Server 2003, expand the following subkey:

  On the Edit menu, point to New, and then click DWORD Value.</li> Type TcpWindowSize in the New Value box, and thne press Enter</li> Click Modify on the Edit menu.</li> Type the desired window size in the Value data box.

Note. The valid range for window size is 0-0x3FFFC000 Hexadecimal.</li></ol>

This value is not present by default. When you add the TcpWindowSize value, it overrides the default window size algorithm discussed above.

Note TcpWindowSize can also be added to the Parameters key to set the window size globally for all interfaces.

TCP options now supported
In the past, TCP options were used primarily for negotiating maximum segment sizes. In Windows, TCP options are used for Window Scaling, Time Stamp, and Selective ACK.

There are two types of TCP options:
 * 1) A single octet TCP option, which is used to indicate a specific option kind.
 * 2) A multiple octet TCP option, which consists of an option kind, an option length and a series of option octets.

The following list shows each TCP option kind, length, name, and description.

Kind: 0

Length: 1

Option: End of Option List

Description: This is used when padding is needed for the last TCP option.

Kind: 1

Length: 1

Option: No Operation

Description: This is used when padding is needed and more TCP options follow within the same packet.

Kind: 2

Length: 4

Option: Maximum Segment Size

Description: This indicates the maximum size for a TCP segment that can be sent across the network.

Kind: 3

Length: 3

Option: Window Scale Option

Description: Identifies the scaling factor to be used when using window sizes larger than 64k.

Kind: 8

Length: 10

Option: Time Stamp Option

Description: Used to help calculate the Round Trip Time (RTT) of packets transmitted.

Kind: 4

Length: 2

Option: TCP SACK permitted

Description: Informs other hosts that Selective Acks are permitted.

Kind: 5

Length: Varies

Option: TCP SACK Option

Description: Used by hosts to identify whether out-of-order packets were received.

Windows scaling
For more efficient use of high bandwidth networks, a larger TCP window size may be used. The TCP window size field controls the flow of data and is limited to 2 bytes, or a window size of 65,535 bytes.

Since the size field cannot be expanded, a scaling factor is used. TCP window scale is an option used to increase the maximum window size from 65,535 bytes to 1 Gigabyte.

The window scale option is used only during the TCP 3-way handshake. The window scale value represents the number of bits to left-shift the 16-bit window size field. The window scale value can be set from 0 (no shift) to 14.

To calculate the true window size, multiply the window size by 2^S where S is the scale value.

For Example:

If the window size is 65,535 bytes with a window scale factor of 3.

True window size = 65535*2^3

True window size = 524280

The following Network Monitor trace shows how the window scale option is used:

TCP: ....S., len:0, seq:725163-725163, ack:0, win:65535, src:1217 dst:139(NBT Session)

TCP: Source Port = 0x04C1

TCP: Destination Port = NETBIOS Session Service

TCP: Sequence Number = 725163 (0xB10AB)

TCP: Acknowledgement Number = 0 (0x0)

TCP: Data Offset = 44 (0x2C)

TCP: Reserved = 0 (0x0000)

+ TCP: Flags = 0x02 : ....S.

TCP: Window = 65535 (0xFFFF)

TCP: Checksum = 0x8565

TCP: Urgent Pointer = 0 (0x0)

TCP: Options

+ TCP: Maximum Segment Size Option

TCP: Option Nop = 1 (0x1)

'''TCP: Window Scale Option

TCP: Option Type = Window Scale

TCP: Option Length = 3 (0x3)

TCP: Window Scale = 3 (0x3)'''

TCP: Option Nop = 1 (0x1)

TCP: Option Nop = 1 (0x1)

+ TCP: Timestamps Option

TCP: Option Nop = 1 (0x1)

TCP: Option Nop = 1 (0x1)

+ TCP: SACK Permitted Option

It's important to note that the window size used in the actual 3-way handshake is NOT the window size that is scaled. This is per RFC 1323 section 2.2, "The Window field in a SYN (for example, a [SYN] or [SYN,ACK]) segment itself is never scaled."

This means that the first data packet sent after the 3-way handshake is the actual window size. If there is a scaling factor, the initial window size of 65,535 bytes is always used. The window size is then multiplied by the scaling factor identified in the 3-way handshake. The table below represents the scaling factor boundaries for various window sizes.

For Example:

If the window size in the registry is entered as 269000000 (269M) in decimal, the scaling factor during the 3-way handshake is 13, because a scaling factor of 12 only allows a window size up to 268,431,360 bytes (268M).

The initial window size in this example would be calculated as follows:

65,535 bytes with a window scale factor of 13.

True window size = 65535*2^13

True window size = 536,862,720

When the value for window size is added to the registry and its size is larger than the default value, Windows attempts to use a scale value that accommodates the new window size.

The Tcp1323Opts value in the following registry key can be added to control scaling windows and timestamp:

HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\Tcpip\Parameters


 * 1) On the toolbar click Start, click Run, and then type Regedit to start the Registry Editor.
 * 2) In the Registry Editor, click Edit, point to New, and then click DWORD Value.
 * 3) In the New Value box, type Tcp1323Opts, press ENTER, and then on the Edit menu, click Modify. Note The valid range is 0,1,2 or 3 where:

0 (disable RFC 1323 options)

1 (window scale enabled only)

2 (timestamps enabled only)

3 (both options enabled)

This registry entry controls RFC 1323 timestamps and window scaling options. Timestamps and Window scaling are enabled by default, but can be manipulated with flag bits. Bit 0 controls window scaling, and bit 1 controls timestamps.

Timestamps
Previously, the TCP/IP stack used one sample per window of data sent to calculate the round trip time (RTT). A timer (retransmit timer) was set when the packet was sent, until the acknowledgment was received. For example, if the window size was 64,240 bytes (44 full segments) on an Ethernet network, only one of every 44 packets were used to recalculate the round trip time. With a maximum window size of 65,535 bytes, this sampling rate was sufficient. Using window scaling, and a maximum window size of 1 Gigabyte, this RTT sampling rate is not sufficient.

The TCP Timestamp option can now be set to be used on segments (data and ACK) deemed appropriate by the stack, to perform operations such as RTT computation, PAWS check, and so on. Using this data, the RTT can be accurately calculated with large window sizes. RTT is used to calculate retransmission intervals. Accurate RTT and retransmission timeouts are needed for optimum throughput.

When TCP time stamp is used in a TCP session, the originator of the session sends the option in its first packet of the TCP three way handshake (SYN packet). Either side can then use the TCP option during the session.

TCP Timestamps Option (TSopt):

The timestamp option field can be viewed in a Network Monitor trace by expanding the TCP options field, as shown below:

TCP: Timestamps Option

TCP: Option Type = Timestamps

TCP: Option Length = 10 (0xA)

TCP: Timestamp = 2525186 (0x268802)

TCP: Reply Timestamp = 1823192 (0x1BD1D8)

Protection against Wrapped Sequence Numbers (PAWS)
The TCP sequence number field is limited to 32 bits, which limits the number of sequence numbers available. With high capacity networks and a large data transfer, it is possible to wrap sequence numbers before a packet traverses the network. If sending data on a 1 Giga-byte per second (Gbps) network, the sequence numbers could wrap in as little as 34 seconds. If a packet is delayed, a different packet could potentially exist with the same sequence number. To avoid confusion in the event of duplicate sequence numbers, the TCP timestamp is used as an extension to the sequence number. Packets have current and progressing time stamps. An old packet has an older time stamp and is discarded.

Selective Acknowledgements (SACKs)
Windows introduces support for a performance feature known as Selective Acknowledgement, or SACK. SACK is especially important for connections that use large TCP window sizes. Prior to SACK, a receiver could only acknowledge the latest sequence number of a contiguous data stream that had been received, or the "left edge" of the receive window. With SACK enabled, the receiver continues to use the ACK number to acknowledge the left edge of the receive window, but it can also acknowledge other blocks of received data individually. SACK uses TCP header options, as shown below.

SACK uses two types of TCP Options.

The TCP Sack-Permitted Option is used only in a SYN packet (during the TCP connection establishment) to indicate that it can do selective ACK.

The second TCP option, TCP Sack Option, contains acknowledgment for one or more blocks of data. The data blocks are identified using the sequence number at the start and at the end of that block of data. This is also known as the left and right edge of the block of data.

Kind 4 is TCP Sack-Permitted Option, Kind 5 is TCP Sack Option. Length is the length in bytes of this TCP option.

Tcp Sack Permitted:

Tcp SACK Option:

With SACK enabled (default), a packet or series of packets can be dropped, and the receiver informs the sender which data has been received, and where there may be "holes" in the data. The sender can then selectively retransmit the missing data without a retransmission of blocks of data that have already been received successfully. SACK is controlled by the SackOpts registry parameter.

The SackOpts value in the following registry key can be edited to control the use of selective acknowledgements:

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters


 * 1) On the toolbar click Start, click Run, and then type Regedit to start the Registry Editor.
 * 2) Locate and click the above key in the Registry Editor, and then click Modify on the Edit menu.
 * 3) Type the desired value in the Value data box.

NOTE: The valid binary value is 0 or 1, the default value is 1. This parameter controls whether or not Selective ACK (SACK - RFC 2018) support is enabled.

The following Network Monitor trace illustrates a host acknowledging all data up to sequence number 54857341, plus the data from sequence number 54858789-54861685. The missing data is from 54857341 to 54858788.

TCP: .A...., len:0, seq:925104-925104, ack:54857341, win:32722, src:1242 dst:139

TCP: Source Port = 0x04DA

TCP: Destination Port = NETBIOS Session Service

TCP: Sequence Number = 925104 (0xE1DB0)

TCP: Acknowledgement Number = 54857341 (0x3450E7D)

TCP: Data Offset = 44 (0x2C)

TCP: Reserved = 0 (0x0000)

+ TCP: Flags = 0x10 : .A....

TCP: Window = 32722 (0x7FD2)

TCP: Checksum = 0x4A72

TCP: Urgent Pointer = 0 (0x0)

TCP: Options

TCP: Option Nop = 1 (0x1)

TCP: Option Nop = 1 (0x1)

+ TCP: Timestamps Option

TCP: Option Nop = 1 (0x1)

TCP: Option Nop = 1 (0x1)

TCP: SACK Option

TCP: Option Type = 0x05

TCP: Option Length = 10 (0xA)

TCP: Left Edge of Block = 54858789 (0x3451425)

TCP: Right Edge of Block = 54861685 (0x3451F75)

TCP retransmission behavior and fast retransmit
TCP Retransmission

As a review of normal retransmission behavior, TCP starts a retransmission timer when each outbound segment is handed down to the Internet Protocol (IP). If no acknowledgment has been received for the data in a given segment before the timer expires, then the segment is retransmitted.

The retransmission timeout (RTO) is adjusted continuously to match the characteristics of the connection using Smoothed Round Trip Time (SRTT) calculations as described in RFC 793. The timer for a given segment is doubled after each retransmission of that segment. Using this algorithm, TCP tunes itself to the normal delay of a connection.

Fast Retransmit

TCP retransmits data before the retransmission timer expires under some circumstances. The most common of these occurs due to a feature known as fast retransmit. When a receiver that supports fast retransmits receives data with a sequence number beyond the current expected one, and then it is likely that some data was dropped. To help inform the sender of this event, the receiver immediately sends an ACK, with the ACK number set to the sequence number that it was expecting. It will continue to do this for each additional TCP segment that arrives. When the sender starts to receive a stream of ACKs that is acknowledging the same sequence number, it is likely that a segment has been dropped. The sender will immediately resend the segment that the receiver is expecting, without waiting for the retransmission timer to expire. This optimization greatly improves performance when packets are frequently dropped.

By default, Windows resends a segment if it receives three ACKs for the same sequence number, (one ACK and 2 duplicates) and that sequence number lags the current one. This is controllable with the TcpMaxDupAcks registry parameter.

The TcpMaxDupAcks value in the following registry key can be edited to control the number of ACKs necessary to start a fast retransmits:

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters


 * 1) On the toolbar click Start, click Run, and then type Regedit to start the Registry Editor.
 * 2) Locate and click the above key in the Registry Editor, and then click Modify on the Edit menu.
 * 3) Type the desired value in the Value data box.

NOTE: The valid range is 1-3, the default value is 2.

This parameter determines the number of duplicate ACKs that must be received for the same sequence number of sent data before "fast retransmit" is triggered to resend the segment that has been dropped in transit.

Keywords: kbenv kbinfo kbnetwork KB224829

-

[mailto:TECHNET@MICROSOFT.COM Send feedback to Microsoft]

© Microsoft Corporation. All rights reserved.