Microsoft KB Archive/206891

Article ID: 206891

Article Last Modified on 5/11/2006



APPLIES TO

  • Microsoft Internet Explorer 5.0



This article was previously published under Q206891

SYMPTOMS

The outerHTML property of document.documentElement in the Internet Explorer document object model (DOM) does not return a byte-for-byte copy of the original HTML file that was loaded into the browser. Common changes include the insertion of missing elements (such as HEAD tags), the stripping of quotes around attribute values, the capitalization of tag names, and the addition of closing tags to elements that were left open.
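The kinds of change listed above can be checked for mechanically. The following is a minimal sketch, not part of the original article: describeNormalization and countMatches are hypothetical helper names, and the checks are rough text heuristics rather than a real HTML comparison. It takes the original source and the outerHTML text as strings and reports which of the common changes appear to have occurred.

```javascript
// Hypothetical helper: counts how many times a global pattern occurs in text.
function countMatches(text, pattern) {
    var found = text.match(pattern);
    return found ? found.length : 0;
}

// Hypothetical helper: compares the original source with outerHTML output
// and lists which common normalizations appear to have been applied.
// These are simple text heuristics, assumed for illustration only.
function describeNormalization(original, serialized) {
    var findings = [];
    if (original === serialized) {
        return findings; // identical text: nothing to report
    }
    // A HEAD tag that exists only in the serialized text was inserted.
    if (!/<head[\s>]/i.test(original) && /<head[\s>]/i.test(serialized)) {
        findings.push("HEAD element inserted");
    }
    // Lowercase opening tags in the original, none left in the output.
    if (/<[a-z]/.test(original) && !/<[a-z]/.test(serialized)) {
        findings.push("tag names capitalized");
    }
    // Fewer double quotes in the output suggests attribute quotes were dropped.
    if (countMatches(original, /"/g) > countMatches(serialized, /"/g)) {
        findings.push("quotes stripped");
    }
    // More closing tags in the output suggests open elements were closed.
    if (countMatches(serialized, /<\//g) > countMatches(original, /<\//g)) {
        findings.push("closing tags added");
    }
    return findings;
}
```

In a page, the second argument would typically be document.documentElement.outerHTML, and the first would be the original source obtained by one of the methods described under RESOLUTION.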

CAUSE

Internet Explorer converts all documents it loads into a canonical format to make parsing of HTML documents simpler.

RESOLUTION

Script authors who want to obtain a "pure" version of the page can use the download behavior in Internet Explorer 5. An example of this is available in the MSDN Library.
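As a rough illustration of the download behavior approach, the sketch below assumes the page declares an element with the behavior attached, for example <IE:DOWNLOAD ID="oDownload" STYLE="behavior:url(#default#download)" />. The id oDownload and the function name showOriginalSource are hypothetical names chosen for this example. The behavior's startDownload method fetches a URL asynchronously and passes the file's text to a callback.

```javascript
// Sketch only: "downloader" is expected to be an element that has the
// Internet Explorer download behavior attached (hypothetical id: oDownload).
// "report" is any function that accepts a string, such as alert.
function showOriginalSource(downloader, url, report) {
    // startDownload retrieves the file and hands its raw text to the
    // callback, so the text has not passed through the DOM's normalization.
    downloader.startDownload(url, function (source) {
        report("The original HTML for this file is:\n\n" + source);
    });
}

// Possible call from the page's onload handler in Internet Explorer:
//   showOriginalSource(oDownload, document.location.href, alert);
```

Passing the downloader and the reporting function as parameters keeps the sketch independent of any particular page layout. The download is asynchronous, so the callback runs some time after showOriginalSource returns.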

Applications that host the WebBrowser control have many options for retrieving the original document. The quickest way is to call the URLDownloadToFile() function in Internet Explorer's URLMON library and then read the data back in from disk. For additional information, see the following article in the Microsoft Knowledge Base:

244757 HOWTO: Download a File Without Prompting


C++ developers who want to avoid a round-trip to disk can call CreateURLMoniker() instead, and then call IMoniker::BindToStorage() to retrieve the bytes through an IStream. The simpler API call URLOpenStream() encapsulates this functionality.

Visual Basic developers who want to avoid the same round-trip to disk can use WinInet, the networking layer underneath URLMON. A full set of Visual Basic Declare statements for the WinInet functions can be found in the following Knowledge Base article:

185519 FILE: Vbinet.exe WinInet API Declarations for Visual Basic


STATUS

This behavior is by design.

MORE INFORMATION

Steps to Reproduce the Behavior

Load and run the following HTML document to see the differences between outerHTML's output and the original HTML file:

<html>
<SCRIPT>
function load() {
    alert("The HTML for this file is:\n\n" + document.documentElement.outerHTML);
}
</SCRIPT>
<body onload="load();">
Hello, world!<p>
</body>
</html>

Keywords: kbieobj kbprb KB206891