Microsoft KB Archive/206891

= PRB: Value of outerHTML Does Not Match Original File =

Article ID: 206891

Article Last Modified on 5/11/2006


APPLIES TO


 * Microsoft Internet Explorer 5.0




This article was previously published under Q206891



SYMPTOMS
In the Internet Explorer document object model (DOM), the value of document.documentElement.outerHTML is not byte-for-byte identical to the original HTML file loaded into the browser. Common changes include insertion of missing elements (such as HEAD tags), stripping of quotes from attribute values, capitalization of tag names, and addition of closing tags to elements that were left unclosed.



CAUSE
Internet Explorer converts every document it loads into a canonical format to simplify parsing of HTML documents. The outerHTML property reflects that canonical form rather than the original bytes of the file.



RESOLUTION
Script authors who want to obtain a "pure" version of the page can use the download behavior in Internet Explorer 5. An example of this is available in the MSDN Library.
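
The following is a minimal sketch of that approach, not the MSDN sample itself. It assumes the page declares an element with the default download behavior attached, and it wraps the behavior's startDownload method, which passes the downloaded text to a callback. The helper name fetchOriginalSource and the element ID oDownload are hypothetical names chosen for illustration.

```javascript
// Hypothetical helper (not part of any Microsoft API).
// `downloader` is assumed to be an element with the default download
// behavior attached, declared in the page roughly as:
//   <HTML XMLNS:IE>
//   ...
//   <IE:DOWNLOAD ID="oDownload" STYLE="behavior:url(#default#download)" />
// The behavior exposes startDownload(url, callback); the callback receives
// the downloaded content as a single string argument.
function fetchOriginalSource(downloader, url, onSource) {
  downloader.startDownload(url, function (sourceText) {
    // sourceText is the file as downloaded, not the canonical DOM form.
    onSource(sourceText);
  });
}

// Usage inside Internet Explorer 5 (skipped where no DOM is present).
if (typeof document != "undefined" && document.all && document.all.oDownload) {
  fetchOriginalSource(document.all.oDownload, document.location.href, function (source) {
    alert("The original source of this file is:\n\n" + source);
  });
}
```

The download is asynchronous, so any code that needs the original text belongs inside the callback rather than after the call.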

WebBrowser hosts have several options for retrieving the original document. The quickest is to call the URLDownloadToFile function in Internet Explorer's URLMON library and read the data back in from disk. For additional information, see the following article in the Microsoft Knowledge Base:

244757 HOWTO: Download a File Without Prompting

C++ developers who want to avoid a round-trip to disk can instead call CreateURLMoniker and then IMoniker::BindToStorage to retrieve the bytes through an IStream. The simpler URLOpenStream function encapsulates this functionality. Additional information is available at the following MSDN Online Web Workshop site:

URL Monikers Functions

Visual Basic developers who want to avoid the same round-trip to disk can use WinInet, the networking layer underneath URLMON. A full set of Visual Basic Declare statements for the WinInet functions can be found in the following Knowledge Base article:

185519 FILE: Vbinet.exe WinInet API Declarations for Visual Basic



STATUS
This behavior is by design.



Steps to Reproduce the Behavior
Load and run the following HTML document to see the differences between outerHTML's output and the original HTML file:



<HTML>
<HEAD>
<SCRIPT>
function load() {
    alert("The HTML for this file is:\n\n" + document.documentElement.outerHTML);
}
</SCRIPT>
</HEAD>
<BODY onload="load()">
Hello, world!
</BODY>
</HTML>
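
To see where the two versions first diverge, rather than comparing them by eye, a small string comparison can help. The sketch below is illustrative only: firstDifference is a hypothetical helper, and it assumes the original source text has already been obtained by one of the methods described under RESOLUTION.

```javascript
// Hypothetical helper: returns the index of the first character at which
// the two strings differ, or -1 if they are identical.
function firstDifference(original, serialized) {
  var limit = Math.min(original.length, serialized.length);
  for (var i = 0; i < limit; i++) {
    if (original.charAt(i) != serialized.charAt(i)) {
      return i;
    }
  }
  // No mismatch within the shared length: the strings are either equal
  // or one is a prefix of the other.
  return original.length == serialized.length ? -1 : limit;
}

// Example with literal strings: tag-name capitalization alone is enough
// to make the comparison fail at index 1.
var index = firstDifference("<html>", "<HTML>"); // index is 1
```

In a page, the second argument would be document.documentElement.outerHTML; any result other than -1 confirms that the serialized markup is not the original file.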

