Microsoft KB Archive/212704

= BUG: Special Characters Are Getting Converted Inside String =

Article ID: 212704

Article Last Modified on 1/23/2004

-

APPLIES TO


 * Microsoft Internet Explorer 3.0
 * Microsoft Internet Explorer 3.01
 * Microsoft Internet Explorer 3.02
 * Microsoft Internet Explorer 4.0 128-Bit Edition
 * Microsoft Internet Explorer 4.01 Service Pack 1
 * Microsoft Internet Explorer 4.01 Service Pack 2

-



This article was previously published under Q212704



SYMPTOMS
Internet Explorer converts all instances of named entities inside an HTML document, even when common convention dictates that it should not, such as inside tag attribute quoted strings.

For example, when the URL in an opening anchor tag contains an unterminated ampersand-"lt" sequence, Internet Explorer treats the tag as if the URL contained a less-than (<) symbol in the middle.
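
The lenient conversion described above can be illustrated with a minimal Python sketch. It is not Internet Explorer's actual parser: the entity table is a deliberately tiny subset, and the function name `decode_lenient` and the example.com URL are hypothetical, chosen only for illustration.

```python
# Illustrative subset of named entities; real browsers recognize many more.
ENTITIES = {"lt": "<", "gt": ">", "amp": "&", "curren": "\u00a4"}


def decode_lenient(text):
    """Replace every '&name' that starts with a known entity name,
    whether or not a terminating semicolon is present."""
    out = []
    i = 0
    while i < len(text):
        if text[i] == "&":
            rest = text[i + 1:]
            # Try longer names first so a short name cannot shadow a long one.
            for name in sorted(ENTITIES, key=len, reverse=True):
                if rest.startswith(name):
                    out.append(ENTITIES[name])
                    i += 1 + len(name)
                    # Consume an optional terminating semicolon.
                    if i < len(text) and text[i] == ";":
                        i += 1
                    break
            else:
                # No entity name follows; keep the ampersand as-is.
                out.append("&")
                i += 1
        else:
            out.append(text[i])
            i += 1
    return "".join(out)


# Hypothetical URL whose second query parameter is named "lt".
url = "http://example.com/page.asp?a=1&lt=2"
print(decode_lenient(url))  # the "&lt" in the query string becomes "<"
```

Under this reading, a query string parameter that merely begins with an entity name is enough to corrupt the URL.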





RESOLUTION
Change any instances of ampersands in the HTML document to the following:

"&amp;"

If the page intends the ampersand sequence to be converted as a named entity, ensure that the named entity is correctly terminated by a semicolon. For URLs that are not generated by form submissions, rename query string parameters so that they do not match or begin with the name of a common named entity (for example, lt, gt, or curren).
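
When pages are generated by a script, the ampersand replacement can be automated. The following Python sketch uses the standard library function `html.escape`; the helper name `href_attribute` and the example.com URL are hypothetical, and the sketch assumes the URL is written into a double-quoted attribute.

```python
from html import escape


def href_attribute(url):
    """Return an HREF attribute with the URL escaped for use inside HTML."""
    # escape() rewrites & as &amp; and, with quote=True, also escapes
    # <, >, and both quote characters.
    return 'HREF="' + escape(url, quote=True) + '"'


# Hypothetical URL whose second query parameter is named "lt".
url = "http://example.com/page.asp?a=1&lt=2"
print(href_attribute(url))
```

Because the ampersand is written as "&amp;", the browser decodes it back to a single ampersand and the following "lt" is left untouched.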



STATUS
Microsoft has confirmed that this is a bug in the Microsoft products that are listed at the beginning of this article. This problem was corrected in Internet Explorer 5.



Steps to Reproduce Behavior
The following HTML page, titled "Entity Parsing Demonstration", demonstrates this bug. The page displays the instruction "Right-click on the links to see the URL in Properties", followed by two hyperlinks labeled "curren problem" and "lt problem" and by the following button:

<input type=button value="Hello&gt">

When this page is viewed in Internet Explorer 4, the strings inside the HTML tags are parsed as if they contained entities, despite the common convention of parsing only entities that are terminated by semicolons. As a result, the ampersand-"curren" in the middle of the first URL is converted to the currency character, the ampersand-"lt" in the middle of the second URL is converted to a less-than symbol, and the ampersand-"gt" at the end of the button value is converted to a greater-than symbol.

The incorrectly parsed URLs can be viewed in the Internet Explorer status bar when mousing over the hyperlinks, or by right-clicking on the hyperlink and choosing the Properties option.

NOTE: In Internet Explorer 5, numeric entities may still be converted in inconvenient situations, as in the following example:

<A HREF="javascript:dostuff('http://somesite.asp?queryvalue1=%3fstuff%3f');">

NOTE: In Internet Explorer 5, the parser converts a named entity only if the name is an exact match and is followed by a non-alphanumeric character or falls at the end of the string.

For example, this fixes the parsing of the two hyperlinks in the above sample HTML. The button will still read "Hello>" because the ampersand-"gt" entity falls at the end of the string.
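
The Internet Explorer 5 rule can be sketched in Python as follows. This is one reading of the rule as stated in this article, not the browser's actual code: the entity table is a small illustrative subset, the function name `decode_strict` and the sample parameter names "currency" and "lth" are hypothetical, and consuming an optional trailing semicolon is an assumption.

```python
# Illustrative subset of named entities; real browsers recognize many more.
ENTITIES = {"lt": "<", "gt": ">", "amp": "&", "curren": "\u00a4"}


def decode_strict(text):
    """Convert '&name' only when name is an exact entity name that is
    followed by a non-alphanumeric character or ends the string."""
    out = []
    i = 0
    while i < len(text):
        if text[i] == "&":
            # Collect the full run of alphanumeric characters after the '&'.
            j = i + 1
            while j < len(text) and text[j].isalnum():
                j += 1
            name = text[i + 1:j]
            # The run ends at a non-alphanumeric character or at the end of
            # the string, so an exact table match satisfies the rule.
            if name in ENTITIES:
                out.append(ENTITIES[name])
                i = j
                # Assumption: an optional terminating semicolon is consumed.
                if i < len(text) and text[i] == ";":
                    i += 1
                continue
        out.append(text[i])
        i += 1
    return "".join(out)


print(decode_strict("a=1&currency=2"))  # unchanged: "currency" is not an exact match
print(decode_strict("a=1&lth=2"))       # unchanged: "lth" is not an exact match
print(decode_strict("Hello&gt"))        # converted: "gt" falls at the end of the string
```

Note that under this rule a parameter named exactly "lt" (as in "a=1&lt=2") would still be converted, because the name is followed by the non-alphanumeric "=" character. Writing the ampersand as "&amp;" remains the reliable fix.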

Keywords: kbbug kbhtml kbpending KB212704

-

© Microsoft Corporation. All rights reserved.