Microsoft KB Archive/288215

= Information about Microsoft Excel 2002 and XML =

Article ID: 288215

Article Last Modified on 11/2/2007

-

APPLIES TO


 * Microsoft Excel 2002 Standard Edition

-



This article was previously published under Q288215



SUMMARY
Excel 2002 enables you to open and save files in the Extensible Markup Language (XML) format. This article provides an overview of XML and XML stylesheets and how they can be used with Excel. The following topics are discussed:

What is XML?

What is an XML stylesheet?

How does Excel translate XML files?

XML flattening

Stylesheets

Troubleshooting errors opening your XML files

The XML Spreadsheet (XMLSS) format

Integrating XML in your Excel solutions



What is XML?
XML is a text format for structured data. Because XML is easy to create and interpret, it is used in a wide variety of applications. Proper, or well-formed, XML follows a set of guidelines that dictates the data structure yet gives the XML developer great flexibility to create an unlimited number of customized tags that best describe the data and relationships that might exist within the data.

Like HTML, XML is a markup language in which text is organized with a combination of tags (words enclosed in angle brackets ) and attributes (of the form name = &quot;value&quot;). But whereas HTML specifies what each tag and attribute means and how it should be rendered for display, XML uses the tags only to delineate elements of data. Because you can use any tag names with XML, it is at the sole discretion of an application to interpret the data and its meaning. Although XML developers are free to create whatever tags or hierarchy best fit their data, there is a set of specifications that should be followed to ensure that the XML is considered well-formed. Well-formed XML has the following characteristics:
 * It contains exactly one root element with a unique name that does not appear in any other element in the document.
 * Elements are properly nested so that no tags overlap between elements.
 * All element tags are closed.
 * Element start and end tags use a consistent case (XML is case-sensitive).
 * All element attributes are enclosed in quotation marks, either double or single.
 * Special characters (such as &, <, >) are defined as built-in entities (&amp;, <, >).

The following is an example of well-formed XML:      Energetic Having, exerting, or displaying energy   Happy Enjoying, displaying, or characterized by pleasure or joy</Definition> </Entry>  <Word Type=&quot;2&quot;>Emotion</Word> A complex, strong subjective response</Definition> </Entry> </Entries> </Dictionary> Figure 1 - Well-Formed XML Sample (Dictionary.xml)

The XML in Figure 1 describes entries in a dictionary. The root element is <Dictionary>, and the root contains one child element named <Entries>. <Entries> contains three children, each of which describes the data for an individual entry in the dictionary. For each entry, there are two child elements:  and . The value of each  element is the word itself;  elements also have a Type attribute that indicates whether the word is an adjective (1) or a noun (2).  elements have a text value and no attributes.

What is an XML stylesheet?
XML tags describe the data in a text file, but XML alone does not specify how the data should be presented to the user. Formatting rules for XML data are often contained in XML stylesheets. XML stylesheets are well-formed XML documents that use Extensible Style Language (XSL) to transform XML data for presentation. You can have many stylesheets for the same XML data. While your actual XML data structure might remain the same, stylesheets give you flexibility to present the data in many different ways. <?xml version='1.0'?> <xsl:stylesheet xmlns:xsl=&quot;http://www.w3.org/TR/WD-xsl&quot;> <xsl:template match=&quot;/&quot;> <HTML> <BODY> <UL> <xsl:for-each order-by=&quot;+ Word&quot; select=&quot;Dictionary/Entries/Entry&quot;> <LI> <B><xsl:value-of select=&quot;Word&quot;/></B> <xsl:if test=&quot;Word[@Type='1']&quot;><I>adj.</I></xsl:if> <xsl:if test=&quot;Word[@Type='2']&quot;><I>n.</I></xsl:if> <BR/> <xsl:value-of select=&quot;Definition&quot;/> <BR/><BR/> </LI> </xsl:for-each> </UL> </BODY> </HTML> </xsl:template> </xsl:stylesheet> Figure 2: XML Stylesheet (Dictionary.xsl)

Dictionary.xsl, shown in Figure 2, is a stylesheet that can be used to transform the sample Dictionary.xml. The transformation results in an HTML presentation of the data that can be viewed in a Web browser. The XSL sorts the dictionary entries alphabetically in a bulleted list and formats each component of the dictionary entry. The transformed XML that is rendered in the browser resembles the following:
 * Emotion n.

A complex, strong subjective response


 * Energetic adj.

Having, exerting, or displaying energy


 * Happy adj.

Enjoying, displaying, or characterized by pleasure or joy

Try it out: Copy and paste the XML in Figure 1 into any text editor and save it as Dictionary.xml. Then copy and paste the XML stylesheet in Figure 2 and save it as Dictionary.xsl in the same folder as Dictionary.xml. Start Internet Explorer and browse to Dictionary.xml.

How does Excel translate XML files?
Excel can open any XML file that is well-formed. XML files can be opened directly from the Open command on the File menu. XML files can also be opened programmatically using either the Open or OpenXML methods of the Workbooks collection. Regardless of whether you open the XML through code or through the user interface (UI), the file is marked as Read-only so that you cannot accidentally replace your original source file with one in the XML Spreadsheet (XMLSS) format.

XML flattening

In Excel, when you import XML that is not in the XML Spreadsheet (XMLSS) format, Excel uses a special flattening algorithm to load the data into rows and columns. Excel worksheets are two-dimensional entities comprised of rows and columns; because XML can have more than two dimensions, flattening is needed to interpret the XML so that it can be loaded into cells on a worksheet.

Consider the following two sets of XML, which essentially represent the same data in different ways:

<?xml version=&quot;1.0&quot;?> <Customer> <CustomerID>1234</CustomerID> <FirstName>John</FirstName> <LastName>Doe</LastName> </Customer> Figure 3: Customer XML with Child Elements <?xml version=&quot;1.0&quot;?> <Customer CustomerID=&quot;1234&quot; FirstName=&quot;John&quot; LastName=&quot;Doe&quot;/> Figure 4: Customer XML with Attributes

If you load the XML in Figure 3 into Excel, the data is imported into cells as shown below:

Excel treats attributes exactly like child elements. To eliminate name collision with existing elements, Excel prepends the at sign (@) to the front of the attribute names for consistency with the naming standards for XSL patterns. Therefore, the XML in Figure 4 is loaded into cells in the same way, with the exception that cells A2, B2, and C2 contain &quot;/@CustomerID&quot;, &quot;/@FirstName&quot;, and &quot;/@LastName&quot;, respectively.

The Customer XML in Figures 3 and 4 follows a two-dimensional structure, which is easy to see. But consider the following Customer XML, which cannot be easily interpreted as two-dimensional because of the addition of the <Order> level:

<?xml version=&quot;1.0&quot;?> <Customers> <Customer> <CustomerID>1234</CustomerID> <FirstName>John</FirstName> <LastName>Doe</LastName> <Orders> <Order ProdID=&quot;AAA&quot; Amt=&quot;10&quot;/> <Order ProdID=&quot;BBB&quot; Amt=&quot;3&quot;/> </Orders> </Customer> <Customer> <CustomerID>5678</CustomerID> <FirstName>Alice</FirstName> <LastName>Smith</LastName> <Orders> <Order ProdID=&quot;AAA&quot; Amt=&quot;5&quot;/> </Orders> </Customer> </Customers> Figure 5: Customer XML with Two Levels Beneath the Root Element

In the XML in Figure 5, there is one customer with two orders and another customer with one order. Excel must import the XML in such a way that the relationship between Orders and Customers is maintained. Excel imports the XML into three rows as shown below:

Two additional columns are present: a #id column that associates the row with the <Customer> node, and a #agg column that represents what can be considered a &quot;FillDown&quot;. With #agg, the value of the first node in the level above it is listed once and then omitted for remaining rows.

Also note that the column for <Amt> precedes the column for <ProdID> even though it appears after <ProdID> in the actual XML. The flattener loads the elements in alphabetical order from left to right, not in the natural order in which they appear in the original XML document.

Stylesheets

Using stylesheets can provide significant advantages for opening XML files in Excel. By using a stylesheet, you can better control how data is placed in cells and even provide some formatting for the cells. When you open an XML file in Excel and that XML file contains one or more xml-stylesheet processing instructions, Excel prompts you to select the stylesheets you want to use for transformation.

Once again consider the Customer XML in Figure 5. If you add a processing instruction to the Customer XML (<?xml-stylesheet type=&quot;text/xsl&quot; href=&quot;Customer.xsl&quot; ?>), you can control which columns in the worksheet the data appears in, and apply custom formatting. The following is a stylesheet sample that can be used as a transform for the Customer XML sample.

<?xml version='1.0'?> <xsl:stylesheet version=&quot;1.0&quot; xmlns:xsl=&quot;http://www.w3.org/1999/XSL/Transform&quot;> <xsl:template match=&quot;/&quot;> <HTML> <HEAD> <STYLE> .HDR { background-color:bisque;font-weight:bold } .CustomerRow { border-top:thin solid black } </STYLE> </HEAD> <BODY> </BODY> </HTML> </xsl:template> </xsl:stylesheet> Figure 6 - Customer XML Stylesheet (Customer.xsl)

The stylesheet transforms the XML into valid HTML (which Excel can translate into its native format). When you open the Customer XML with this stylesheet applied, Excel displays the data as a four-column table with headers (Customer ID, Name, Product ID, and Amount). Cell formatting such as font attributes, cell background colors, and borders are also present in the table.

Try it out: Copy and paste the XML in Figure 5 into any text editor and save it as Customer.xml. Then copy the XML stylesheet in Figure 6 and save it as Customer.xsl in the same folder as Customer.xml. Start Excel 2002, and open the file with and without the stylesheet applied to see the differences that the stylesheet makes.

Troubleshooting errors opening your XML files

If you attempt to open an XML file that is not well-formed, you will receive an alert or a &quot;Script Parse Error&quot;. When an alert appears, it provides a description of why the XML could not be opened. The alert description may also provide details on the node that caused the translation to fail. In some cases in which there is a parse error, you may receive a dialog box that lists errors instead of an alert message; script parse errors are logged to a file named XMLErr.log in your Temp folder. To quickly examine the contents of XMLErr.log, follow these steps:
 * 1) At a command prompt, type cd %temp% and press Enter. The current folder changes to your Temp folder.
 * 2) At the command prompt, type type xmlerr.log and press Enter to display the contents of the log file.

Another problem that you might encounter when you open your XML file is that the Text Import Wizard appears. XML files are just text files that are structured in a very specific fashion. For Excel to recognize that the file you are opening is XML and not plain text, the first item in your XML file must be an XML declaration that contains, at a minimum, the reserved name xml and a version number. The XML must also contain at least one element (the root element): <?xml version=&quot;1.0&quot;?> <MyXML> </MyXML>

The XML Spreadsheet (XMLSS) format
Excel can save workbooks in XML. When you save a workbook as XML, Excel saves the workbook in the XML Spreadsheet (XMLSS) format. XMLSS describes the contents of the workbook by persisting cell data and formulas, cell formats, worksheet settings, and workbook settings. However, some Excel features cannot be persisted in XML. These include:


 * Charts
 * OLE Objects
 * Drawing shapes or AutoShapes
 * VBA Projects
 * Group and Outline

XMLSS is a format that is common to both Excel 2002 and the Office XP Spreadsheet component, so files in XMLSS may be shared between the two. You can also create your own stylesheets for XMLSS files to transform the data to another format so that it can be shared with any number of applications.

Integrating XML in your Excel solutions
Using XML data presents the developer with many opportunites for Excel 2002 solutions.

For example, you can create multisheet workbooks in XMLSS at runtime without automating Excel (which requires loading the Excel executable file). Because Excel is an out-of-process Automation server, calls to the Excel object model can be costly in terms of performance. You can create XMLSS completely in-process using only the Microsoft XML Parser (MSXML) and XML stylesheets. Or, you can use the Office XP Spreadsheet component (in-process) to build a spreadsheet and retrieve the XMLSS for the spreadsheet. In either case, the resulting XMLSS can be opened directly in Excel as a native format. By streaming the XMLSS to client browsers using Excel's Multipurpose Internet Mail Extensions (MIME) content type, Web applications such as those that use Active Server Pages (ASP) can create complex workbooks without the overhead of running Excel on the server. For sample code that demonstrates possible uses of XMLSS generation, click the article numbers below to view the articles in the Microsoft Knowledge Base:

288130 How to use ASP to build spreadsheet XML for client-side display

285891 How to use Visual Basic or ASP to create an XML spreadsheet for Excel 2002 and Excel 2003

Another potential use of XML in an Excel solution is to extract data from a range of cells in a format that can be easily shared with other applications. XMLSS can be extracted from cells on a worksheet by calling the Value property of the Range object with the xlRangeValueXMLSpreadsheet parameter. You can use a stylesheet to transform the XMLSS to any XML that can be used by another application. One possible use of this is to transform the XMLSS to custom XML that can be posted and processed by a Web application. For an example of this technique, see the following article in the Microsoft Knowledge Base:

278976 How to use XSL to transform Excel XML spreadsheet for server-side use

<div class="references_section">