Microsoft KB Archive/294169

= DOC: Explanation of Length Arguments for Unicode ODBC Functions =

Article ID: 294169

Article Last Modified on 5/12/2003


APPLIES TO


 * Microsoft Open Database Connectivity 3.5
 * Microsoft Data Access Components 2.1
 * Microsoft Data Access Components 2.5
 * Microsoft Data Access Components 2.6
 * Microsoft Data Access Components 2.7




This article was previously published under Q294169



SUMMARY
The ODBC Driver Manager version 3.5 or later supports both ANSI and Unicode versions of all functions that accept pointers to character strings or SQLPOINTER in their arguments. The Unicode functions are implemented as functions with a suffix of "W", such as SQLExecDirectW and SQLGetInfoW.

For many of these Unicode functions, the ODBC Programmer's Reference provides incorrect or ambiguous descriptions for some of the function arguments. Specifically, this problem relates to arguments that are used to specify the length of character string input and output values.



MORE INFORMATION
The problems in the ODBC documentation involve whether a length argument should specify the number of bytes or the number of characters in an input or output string. For the ANSI version of these functions, the count of bytes in a string is equal to the count of characters (because ANSI characters occupy just one byte each), so the terms "bytes" and "characters" can effectively be used interchangeably. However, with Unicode strings, each character occupies two bytes. Therefore, it is very important to distinguish whether length inputs to a Unicode function must specify a count of bytes or of characters.
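The difference between the two counts can be sketched in C. This is a minimal illustration, not ODBC code: `wide_unit` is a hypothetical stand-in for SQLWCHAR (assumed here to be two bytes wide), and the helper function names are invented for this example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for SQLWCHAR; assumed to be a 2-byte unit. */
typedef uint16_t wide_unit;

/* Bytes in an ANSI string, excluding the terminator.
   With one byte per character this is also the character count. */
size_t ansi_byte_count(const char *s)
{
    return strlen(s) * sizeof(char);
}

/* Characters (2-byte units) in a Unicode string, excluding the terminator. */
size_t wide_char_count(const wide_unit *s)
{
    size_t n = 0;
    while (s[n] != 0) {
        n++;
    }
    return n;
}

/* Bytes in a Unicode string: two per character under the assumption above. */
size_t wide_byte_count(const wide_unit *s)
{
    return wide_char_count(s) * sizeof(wide_unit);
}
```

For the six-character string SELECT, the ANSI helper reports 6, while the Unicode helpers report 6 characters and 12 bytes. Passing one value where the other is expected either overstates or understates the buffer by a factor of two.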

In some cases, the documentation uses the word "bytes" when the word "characters" should have been used instead. In other cases, the documentation is not necessarily incorrect, but specifies that a "length" of a string is required, without explicitly documenting whether this value should be a count of bytes or a count of characters.

Regardless of what the documentation says for each ODBC function, the following paragraph from the Unicode section of "Chapter 17: Programming Considerations" in the ODBC Programmer's Reference is the ultimate rule to use for length arguments in Unicode functions:

"Unicode functions that always return or take strings or length arguments are passed as count-of-characters. For functions that return length information for server data, the display size and precision are described in number of characters. When a length (transfer size of the data) could refer to string or nonstring data, the length is described in octet lengths. For example, SQLGetInfoW will still take the length as count-of-bytes, but SQLExecDirectW will use count-of-characters."

This means that if the argument in question describes the length of another argument that is always a string (typically represented as a SQLCHAR), then the length reflects the number of characters in the string. If the length argument describes another argument that could be a string or some other data type (typically represented as a SQLPOINTER), the length is in bytes.

The following table lists all of the ODBC API functions that have Unicode versions. For each function listed, the table highlights the known problems in the ODBC documentation, and provides a more complete description of the length arguments in question.

Using the SQLDriverConnect example from the table above, the InConnectionString and OutConnectionString arguments are both defined as SQLCHAR*, so StringLength1 and BufferLength should indicate the number of characters in the strings. In contrast, consider the SQLGetInfo function. This function returns its result in the buffer that InfoValuePtr points to, and the length of that buffer is passed in BufferLength. Because the buffer can hold either a string or other types of data, InfoValuePtr is a SQLPOINTER. Therefore, applying the rules explained above, BufferLength is a count of bytes rather than characters.
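One way an application might keep the two conventions apart is with a pair of sizing macros. The sketch below is illustrative only: `wide_unit` is a hypothetical stand-in for SQLWCHAR (assumed to be two bytes wide) so that the fragment compiles without the ODBC headers, and the macro names, buffer names, and buffer sizes are invented for this example. The calls to SQLDriverConnectW and SQLGetInfoW appear only in a comment to show where each value would go.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for SQLWCHAR; assumed to be a 2-byte unit. */
typedef uint16_t wide_unit;

/* Count of characters: for length arguments that always describe a string,
   such as BufferLength in SQLDriverConnectW.
   Use on arrays only, not on pointers. */
#define LENGTH_IN_CHARS(buf) (sizeof(buf) / sizeof((buf)[0]))

/* Count of bytes: for length arguments that describe a SQLPOINTER buffer,
   such as BufferLength in SQLGetInfoW. */
#define LENGTH_IN_BYTES(buf) (sizeof(buf))

/* Example output buffers (sizes chosen arbitrarily for illustration). */
wide_unit out_connection_string[1024];
wide_unit info_value[128];

/* Intended use with the real API (not compiled here; hdbc, in_str,
   out_len, and info_len are assumed to be declared by the application):

     SQLDriverConnectW(hdbc, NULL, in_str, SQL_NTS,
                       out_connection_string,
                       (SQLSMALLINT)LENGTH_IN_CHARS(out_connection_string),
                       &out_len, SQL_DRIVER_NOPROMPT);

     SQLGetInfoW(hdbc, SQL_DBMS_NAME, info_value,
                 (SQLSMALLINT)LENGTH_IN_BYTES(info_value), &info_len);
*/
```

With these buffers, the value passed to SQLDriverConnectW would be 1024 (characters), and the value passed to SQLGetInfoW would be 256 (bytes, for a buffer of 128 two-byte characters).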

