Microsoft KB Archive/904803

= Character data is represented incorrectly when the code page of the client computer differs from the code page of the database in SQL Server 2005 =

Article ID: 904803

Article Last Modified on 3/16/2007

-

APPLIES TO


 * Microsoft SQL Server 2005 Developer Edition
 * Microsoft SQL Server 2005 Enterprise Edition
 * Microsoft SQL Server 2005 Standard Edition
 * Microsoft SQL Server 2005 Express Edition

-





SYMPTOMS
Consider the following scenario:
 * In Microsoft SQL Server 2005, you use SQL Server Management Studio to query character data from a SQL Server database that uses a non-Unicode data type. For example, the SQL Server database uses the char data type, the varchar data type, or the text data type.
 * The code page of the client computer differs from the code page of the database. The code page is associated with the collation of the database.

In this scenario, character data is represented incorrectly.

For example, you may experience one of the following problems:
 * The character data is represented as a question mark (?). You may see this problem if you inserted or updated the character data as a non-Unicode data type before you queried the character data. This problem occurs if you make this change by using SQL Server Management Studio on a client computer that has a different code page.
 * The character data is represented as corrupted data. The character data of code page  is stored in a non-Unicode column of code page  . Additionally, the character data is not translated. This problem occurs when you query the character data by using SQL Server Management Studio.

Note When you query the character data by using SQL Query Analyzer in Microsoft SQL Server 2000, the character data is represented correctly if the Perform translation for character data setting (the Auto Translate parameter) is disabled. The Auto Translate parameter is a parameter of the ConnectionString property for Microsoft OLE DB Provider for SQL Server and for Microsoft .NET Framework Data Provider for OLE DB.



CAUSE
This problem occurs because the character data of code page  is stored in a non-Unicode column of code page. Additionally, the character data is not translated correctly. We do not support storing the character data of code page  in a column of code page.

In SQL Server 2005, when you use a string literal of a non-Unicode data type, the string literal is converted by using the database's default code page that is derived from the database's collation. Storing the character data of code page  in a column of code page   may cause data loss or data corruption.

If the character data is represented as corrupted data, the data can be represented correctly only if you disable the Auto Translate parameter for Microsoft OLE DB Provider for SQL Server or for Microsoft .NET Framework Data Provider for OLE DB.

Note SQL Server Management Studio uses Microsoft .NET Framework Data Provider for SQL Server to connect to the SQL Server database. This data provider does not support the Auto Translate parameter.



WORKAROUND
To work around this problem, use one of the following methods.

Method 1: Use a Unicode data type instead of a non-Unicode data type
Change the columns to a Unicode data type to avoid all the problems that are caused by code page translation. For example, use the nchar data type, the nvarchar data type, or the ntext data type.

For more information about storing Unicode data, click the following article number to view the article in the Microsoft Knowledge Base:

239530 You must precede all Unicode strings with a prefix N when you deal with Unicode string constants in SQL Server

Method 2: Use an appropriate collation for the database
If you must use a non-Unicode data type, always make sure that the code page of the database and the code page of any non-Unicode columns can store the non-Unicode data correctly. For example, if you want to store code page 949 (Korean) character data, use a Korean collation for the database. For example, use the Korean_Wansung_CI_AS collation for the database.

Method 3: Use the binary data type or the varbinary data type
If you want the database to directly store and retrieve the exact byte values of the characters that are handled without trying to perform appropriate code page translation, use the binary data type or the varbinary data type.

Method 4: Use a different tool to store and retrieve data, and disable the Auto Translate parameter
Warning We do not test or support storing the character data of code page  in a column of code page. This operation may cause linguistically incorrect query results, incorrect string matching or ordering, and unexpected code page translation (data corruption). We encourage you to use one of the other methods to work around this problem.

When you use Microsoft OLE DB Provider for SQL Server to connect to a database that has a different code page and you try to query character data from a non-Unicode data type column, you must make sure that you store the untranslated characters to the database.

Note The following example assumes that the code page of the client computer is Korean (CP949) and that the code page of the SQL Server database is English (CP1252). You must replace the placeholders in the code examples with values that are appropriate for your situation.

To work around this problem, follow these steps:   Manually convert the characters to raw data, and then insert the data into the database by using the code page of the database. To do this, use code that is similar to the following code example. string strsrc=&quot;가&quot;;string strsrc=&quot;가&quot;; string strtag=Encoding.GetEncoding(1252).GetString(Encoding.GetEncoding(949).GetBytes (strsrc)); sql=&quot;insert into values ('&quot; + strtag + &quot;')&quot;; // code for updating the database; Note This code example is written in C#.   When you want to query the data, use Microsoft OLE DB Provider for SQL Server or Microsoft .NET Framework Data Provider for SQL Server to connect to the database, and then set the Auto Translate parameter to False. To do this, use code that is similar to the following code example. OleDbConnection conn=new OleDbConnection(&quot;Provider=SQLOLEDB;&quot; +   &quot; Initial Catalog = ;&quot;+    &quot;User id= ; Password= ;&quot;+    &quot;Auto Translate=False&quot;    ); // code for representing the character data; 



STATUS
Microsoft has confirmed that this is a problem in the Microsoft products that are listed in the &quot;Applies to&quot; section.



Steps to reproduce the problem
 On the client computer that has Korean (CP949) as the default code page, start SQL Server Management Studio. Connect to a database that has English (CP1252) as the default code page.</li>  Create a table in the database by using the following line of code. Create table tbTest (A char(20), NA nchar(10), Comment char(20)) </li>  Insert a Korean character to the database by using the following query. Insert into tbTest (A,NA,Comment) values('가',N'가','SQL2005/INSERT') </li>  Create a select query to retrieve the data by using the following line of code. select * from tbTest </li></ol>

You receive the following results. The value in column A is a question mark. A                   NA         Comment -- ?                   가          SQL2005/INSERT

<div class="references_section">