
= Support for the Hong Kong collation in SQL Server 2005 =

Article ID: 917398

Article Last Modified on 3/16/2007


APPLIES TO


 * Microsoft SQL Server 2005 Express Edition
 * Microsoft SQL Server 2005 Standard Edition
 * Microsoft SQL Server 2005 Workgroup Edition
 * Microsoft SQL Server 2005 Developer Edition
 * Microsoft SQL Server 2005 Enterprise Edition




Bug number: 331555 (SQLBUDT)



SUMMARY
This article describes support for the Hong Kong collation in Microsoft SQL Server 2005. It also explains how to work with character data in SQL Server 2005.



MORE INFORMATION
The Hong Kong collation is a new collation that is supported in SQL Server 2005. The Hong Kong collation uses the Microsoft Windows Server 2003 sorting table. Therefore, the Hong Kong collation supports supplementary characters.
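To make the term "supplementary characters" concrete, the following minimal sketch shows how a character outside the Basic Multilingual Plane occupies two UTF-16 code units (a surrogate pair), whereas an ordinary Chinese character occupies one. It is written in Python purely for illustration and does not call SQL Server. The helper names `utf16_code_units` and `is_supplementary` are hypothetical, and the sketch assumes that Unicode character data is held as UTF-16 code units.

```python
def utf16_code_units(text: str) -> list[int]:
    """Return the UTF-16 code units that represent `text`."""
    raw = text.encode("utf-16-le")  # little-endian, no byte-order mark
    return [int.from_bytes(raw[i:i + 2], "little") for i in range(0, len(raw), 2)]


def is_supplementary(ch: str) -> bool:
    """True if the single character `ch` lies outside the Basic Multilingual Plane."""
    return ord(ch) > 0xFFFF


bmp_char = "\u9999"            # a Basic Multilingual Plane character (U+9999)
supplementary = "\U00020000"   # first code point of CJK Extension B (plane 2)

for ch in (bmp_char, supplementary):
    units = [f"{u:04X}" for u in utf16_code_units(ch)]
    print(f"U+{ord(ch):04X} supplementary={is_supplementary(ch)} units={units}")
```

A collation that is not aware of supplementary characters may treat the two code units of a surrogate pair as two separate, unknown characters when it sorts or compares data.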

The Hong Kong Special Administrative Region (SAR) government has defined the Hong Kong Supplementary Character Set (HKSCS). The HKSCS includes Chinese characters that are used in Hong Kong but that are not contained in the Big5 standard character set. The HKSCS has two code allocation schemes: one for the Big5 standard character set and one for ISO 10646/Unicode. The current version of HKSCS is HKSCS-2001.

HKSCS-2001 is not supported natively in Windows XP, Windows 2000, or Windows Server 2003. If you configure a database to use the Hong Kong collation before you install the HKSCS-2001 package, non-Unicode data is stored in Big5 encoding by using code page 950.

For client computers that interact with an instance of SQL Server and that use the Hong Kong collation, download and install the HKSCS package. For instances of SQL Server 2005 that use the Hong Kong collation, you may have to download the HKSCS-2001 package, depending on whether the data is defined as Unicode or as non-Unicode. To download the HKSCS package, visit the following Microsoft Web site:

http://www.microsoft.com/hk/hkscs/default.aspx

Note: The HKSCS-2001 package for Windows 2000 and Windows XP can also be applied to Windows Server 2003.

When you define character data in SQL Server, you can use non-Unicode character data types or Unicode character data types. Non-Unicode character data types are char, varchar, and text. For these types, the collation that you define for a table column, a variable, or a parameter also specifies the code page, and the data is limited to the characters in that code page. For example, if a char column's collation is Chinese_PRC_BIN, the characters of that column are limited to the characters in code page 936. If you want to use a Hong Kong collation, such as Chinese_Hong_Kong_Stroke_90_BIN, on a non-Unicode character column, you must download the HKSCS-2001 package, which contains the full set of characters for that code page. Client computers that interact with SQL Server 2005 in this environment must also have the HKSCS-2001 package installed to recognize all characters and to prevent possible data loss.
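The code page restriction described above can be sketched outside SQL Server. The following Python example uses Python's built-in `cp936` and `cp950` codecs as stand-ins for the Windows code pages. These codecs are an assumption made for illustration: they approximate the Windows tables and do not include the HKSCS-2001 package. The helper names `fits_code_page` and `lossy_store` are hypothetical.

```python
def fits_code_page(text: str, code_page: str) -> bool:
    """True if every character in `text` can be encoded in `code_page`."""
    try:
        text.encode(code_page)
        return True
    except UnicodeEncodeError:
        return False


def lossy_store(text: str, code_page: str) -> str:
    """Simulate a lossy conversion: unrepresentable characters become '?'."""
    return text.encode(code_page, errors="replace").decode(code_page)


simplified = "\u4eec"   # a Simplified Chinese character (U+4EEC)
traditional = "\u5011"  # its Traditional Chinese form (U+5011)

print(fits_code_page(simplified, "cp936"))   # code page 936 (Simplified Chinese)
print(fits_code_page(simplified, "cp950"))   # code page 950 (Big5)
print(lossy_store(simplified + traditional, "cp950"))
```

In this sketch, a character that is missing from the target code page is replaced by a question mark. This is the kind of silent data loss that can occur when a required character set is not available.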

Unicode character data types are nchar, nvarchar, and ntext. The Unicode specification defines a single encoding scheme for most characters that are widely used in business around the world. For example, assume that a Chinese_Hong_Kong_Stroke_90_BIN collation is defined on an nchar, nvarchar, or ntext column. In that case, the collation dictates only the sorting and comparison rules for the data, not the code page. In this environment, you do not have to download the HKSCS-2001 package on the computer that contains the instance of SQL Server 2005. However, a client computer that interacts with SQL Server must still have the HKSCS-2001 package installed.

Note: Versions of Windows that are later than Windows Server 2003 are expected to support the HKSCS natively. However, the support is limited to Unicode encoding only.

We strongly recommend that you use Unicode character data types so that your application receives the best performance and language compatibility. If you store character data that reflects multiple languages, always use Unicode data types instead of the non-Unicode data types. You may experience a significant performance gain by using Unicode data types because fewer code-page conversions will be required. Significant limitations are associated with non-Unicode data types because a non-Unicode computer will be limited to the use of a single code page.
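The single-code-page limitation can be illustrated with a short Python sketch. It checks which of three code pages can hold a Chinese string, a Thai string, and a string that mixes both. The function name `usable_code_pages`, the sample strings, and the choice of Python's `cp950`, `cp936`, and `cp874` codecs are assumptions made for illustration. The sketch does not measure performance and does not involve SQL Server.

```python
def usable_code_pages(text: str, code_pages: list[str]) -> list[str]:
    """Return the code pages from `code_pages` that can represent all of `text`."""
    usable = []
    for code_page in code_pages:
        try:
            text.encode(code_page)
        except UnicodeEncodeError:
            continue
        usable.append(code_page)
    return usable


CODE_PAGES = ["cp950", "cp936", "cp874"]  # Big5, Simplified Chinese, Thai

chinese = "\u9999\u6e2f"        # two Chinese characters
thai = "\u0e44\u0e17\u0e22"     # three Thai characters
mixed = chinese + " " + thai

for sample in (chinese, thai, mixed):
    print(usable_code_pages(sample, CODE_PAGES))

# A Unicode encoding round-trips the mixed-language text unchanged.
assert mixed.encode("utf-16-le").decode("utf-16-le") == mixed
```

Each single-language string fits at least one code page, but the mixed string fits none of them, whereas a Unicode encoding stores it without loss.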

