Microsoft KB Archive/316063

= How to locate and replace special characters in an XML file with Visual C# .NET =

Article ID: 316063

Article Last Modified on 3/29/2007

-

APPLIES TO


 * Microsoft Visual C# .NET 2003 Standard Edition
 * Microsoft Visual C# .NET 2002 Standard Edition
 * Microsoft .NET Framework 1.0
 * Microsoft .NET Framework 1.1

-



This article was previously published under Q316063



For a Microsoft Visual Basic .NET version of this article, see 308060.

IN THIS TASK

 * SUMMARY
 * Description of the technique
 * Determine whether you must replace a special character
 * Not required: XML file in which the data is retrieved from a database
 * Required: XML file that contains third-party XML data with special characters
 * Replace the special characters
 * Create the XML file
 * Create a Visual C# .NET project
 * REFERENCES



SUMMARY
This article describes how to replace special characters in an Extensible Markup Language (XML) file by using Visual C# .NET.

Description of the technique
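XML predefines the following five entity references for special characters that would otherwise be interpreted as part of the markup language:
 * &lt; (left angle bracket, <)
 * &gt; (right angle bracket, >)
 * &amp; (ampersand, &)
 * &quot; (straight quotation mark, ")
 * &apos; (apostrophe, ')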

You can use entity and character references to escape the left angle bracket, the ampersand, and other delimiters. You can also use numeric character references. Numeric character references are expanded immediately when they are recognized. In addition, because numeric character references are treated as character data, you can use the numeric character references &#60; and &#38; to escape the left angle bracket and the ampersand when they occur in character data.

If you declare either of the following two entities:
 * &lt;
 * &amp;

you must declare them as internal entities whose replacement text is a character reference to the respective character (the left angle bracket or the ampersand) that is being escaped. This double escaping is required for these entities so that references to them produce a well-formed result.

If you declare any of the following three entities:
 * &gt;
 * &apos;
 * &quot;

you must declare them as internal entities whose replacement text is the single character that is being escaped.
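The replacement rules above can be sketched in Visual C# as follows. This is a minimal illustration and not part of the original sample; the class name XmlEscapeSketch and the helper EscapeXml are hypothetical names chosen for this example. The ampersand is replaced first so that the ampersands that the other replacements introduce are not escaped a second time.

```csharp
using System;

public static class XmlEscapeSketch
{
    // Hypothetical helper: replaces each of the five special characters
    // with its predefined entity reference.
    public static string EscapeXml(string text)
    {
        return text
            .Replace("&", "&amp;")    // must run before the other replacements
            .Replace("<", "&lt;")
            .Replace(">", "&gt;")
            .Replace("\"", "&quot;")
            .Replace("'", "&apos;");
    }

    public static void Main()
    {
        // Prints: Split Rail Beer &amp; Ale
        Console.WriteLine(EscapeXml("Split Rail Beer & Ale"));
    }
}
```

The .NET Framework also includes the System.Security.SecurityElement.Escape method, which performs a similar replacement for these characters.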

Not required: XML file in which the data is retrieved from a database
When you use the Microsoft .NET Framework, data that you retrieve from a database is stored in a DataSet object. When you write data from a DataSet to an XML file by using the WriteXml method, the special characters that are listed in the "Description of the technique" section are replaced with the respective entity references. Therefore, when you use a DataSet to write XML files, no special replacement process is required.
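The following sketch illustrates this behavior without a database connection. It is an assumption-laden example rather than part of the original sample: the class name WriteXmlSketch, the helper CustomersToXml, and the in-memory table are hypothetical, and the DataSet is written to a StringWriter instead of a file so that the result can be inspected directly.

```csharp
using System;
using System.Data;
using System.IO;

public static class WriteXmlSketch
{
    // Hypothetical helper: builds a one-row DataSet in memory and returns
    // the XML that WriteXml produces for it.
    public static string CustomersToXml(string companyName)
    {
        DataSet ds = new DataSet("Customers");
        DataTable table = ds.Tables.Add("Customer");
        table.Columns.Add("CompanyName", typeof(string));
        table.Rows.Add(new object[] { companyName });

        StringWriter writer = new StringWriter();
        ds.WriteXml(writer);   // WriteXml escapes the special characters
        return writer.ToString();
    }

    public static void Main()
    {
        // The ampersand in the company name is written as &amp;
        Console.WriteLine(CustomersToXml("Split Rail Beer & Ale"));
    }
}
```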

Required: XML file that contains third-party XML data with special characters
Sometimes the XML file or the XML data that comes from a third party may use these special characters. In this scenario, the data generates errors when you load it into an XmlDocument object or an XmlReader object.

You receive the following error message when the ampersand character is encountered:

An Error occurred while parsing, line #, position #.

In this error message, line # and position # represent the exact position of the special character.

You receive the following error message when a left angle bracket is encountered:

The '<' character, hexadecimal value 0x3C, cannot be included in a name. Line #, position #.

In this error message, line # and position # do not indicate the position of the unescaped left angle bracket. They indicate the position where the parser encounters the next left angle bracket.

If the XML file contains a right angle bracket (>), a straight quotation mark ("), or an apostrophe ('), the XmlReader and the XmlDocument objects handle these characters without error because these characters require only single character replacement.
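The difference between the characters can be checked with a short console sketch such as the following. It is only an illustration: the class name ParserCheckSketch, the helper Loads, and the sample fragments are hypothetical, and the exact wording of the messages depends on the version of the .NET Framework.

```csharp
using System;
using System.Xml;

public static class ParserCheckSketch
{
    // Hypothetical helper: returns true if XmlDocument accepts the fragment.
    public static bool Loads(string xml)
    {
        try
        {
            new XmlDocument().LoadXml(xml);
            return true;
        }
        catch (XmlException ex)
        {
            // LineNumber and LinePosition carry the location from the message.
            Console.WriteLine("Line {0}, position {1}: {2}",
                ex.LineNumber, ex.LinePosition, ex.Message);
            return false;
        }
    }

    public static void Main()
    {
        string[] samples =
        {
            "<a>x & y</a>",    // unescaped ampersand: rejected
            "<a>x <y</a>",     // unescaped left angle bracket: rejected
            "<a>x > y</a>",    // right angle bracket: accepted
            "<a>x \" y</a>",   // straight quotation mark: accepted
            "<a>x ' y</a>"     // apostrophe: accepted
        };

        foreach (string sample in samples)
        {
            Console.WriteLine("{0} -> {1}", sample, Loads(sample) ? "loaded" : "rejected");
        }
    }
}
```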

Replace the special characters
To replace the ampersand and the left angle bracket characters:
 * 1) Create the XML file.
 * 2) Create the Visual C# .NET application, and then insert the code.

Create the XML file
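Copy and paste the following code into Notepad, and then save the file as Customers.xml in the root folder of drive C, which is where the sample code looks for it. The file intentionally contains an unescaped ampersand and an unescaped left angle bracket:

<?xml version="1.0" standalone="yes"?>
<Customers>
  <Customer>
    <CustomerID>BLAUS</CustomerID>
    <CompanyName>Blauer See Delikatessen</CompanyName>
    <ContactName>Hanna Moos</ContactName>
    <Region>test</Region>
  </Customer>
  <Customer>
    <CustomerID>SPLIR</CustomerID>
    <CompanyName>Split Rail Beer & Ale</CompanyName>
    <ContactName>Art <Braunschweiger</ContactName>
    <Region>WY</Region>
  </Customer>
</Customers>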

Create a Visual C# .NET project
 * 1) Create a new Visual C# .NET Windows application as follows:
    * a) Start Microsoft Visual Studio .NET.
    * b) On the File menu, point to New, and then click Project.
    * c) In the New Project dialog box, click Visual C# Projects under Project Types, and then click Windows Application under Templates.
 * 2) Drag a TextBox control, two Button controls, and a DataGrid control from the toolbox to your default form, Form1.cs.
 * 3) Set the Multiline property of the TextBox to True.
 * 4) Import the following namespaces:

    using System.Xml;
    using System.IO;
    using System.Data.SqlClient;

 * 5) Add the following code after the Main function:

    string filepath = "C:\\Customers.xml";

    // Escapes the first special character on the given line of the file.
    // Returns false if the file could not be copied or nothing was replaced.
    private bool ReplaceSpecialChars(long linenumber)
    {
        string strline;
        string strreplace = " ";
        string tempfile = "C:\\Temp.xml";

        // Work from a copy so that the original file can be rewritten.
        try
        {
            System.IO.File.Copy(filepath, tempfile, true);
        }
        catch (Exception ex)
        {
            MessageBox.Show(ex.Message);
            return false;
        }

        StreamWriter strmwriter = new StreamWriter(filepath);
        strmwriter.AutoFlush = true;
        StreamReader strm = new StreamReader(tempfile);

        // Copy the lines that precede the line that caused the error.
        long i = 0;
        while (i < linenumber - 1)
        {
            strline = strm.ReadLine();
            strmwriter.WriteLine(strline);
            i = i + 1;
        }

        // Locate the special character on the line that caused the error.
        strline = strm.ReadLine();
        int lineposition = strline.IndexOf("&");
        if (lineposition >= 0)
        {
            strreplace = "&amp;";
        }
        else
        {
            // Skip the opening tag, and then look for the next left angle bracket.
            lineposition = strline.IndexOf("<", strline.IndexOf(">") + 1);
            if (lineposition >= 0)
            {
                strreplace = "&lt;";
            }
        }

        // Replace the special character with its entity reference.
        if (lineposition >= 0)
        {
            strline = strline.Substring(0, lineposition) + strreplace + strline.Substring(lineposition + 1);
        }
        strmwriter.WriteLine(strline);

        // Copy the rest of the file unchanged.
        strline = strm.ReadToEnd();
        strmwriter.Write(strline);

        strm.Close();
        strmwriter.Flush();
        strmwriter.Close();

        return lineposition >= 0;
    }

    public XmlDocument LoadXMLDoc()
    {
        XmlDocument xdoc;
        try
        {
            xdoc = new XmlDocument();
            xdoc.Load(filepath);
        }
        catch (XmlException ex)
        {
            MessageBox.Show(ex.Message);

            // Repair the reported line, and then load the file again.
            // Stop if nothing was replaced, to avoid retrying forever.
            if (!ReplaceSpecialChars(ex.LineNumber))
            {
                throw;
            }
            xdoc = LoadXMLDoc();
        }
        return xdoc;
    }

 * 6) Add the following code to the Button1_Click event:

    XmlDocument xmldoc = LoadXMLDoc();

    // FirstChild is the XML declaration; its next sibling is the root element.
    XmlNode nextnode = xmldoc.FirstChild.NextSibling;
    this.textBox1.Text = nextnode.OuterXml;

 * 7) Add the following code to the Button2_Click event:

    DataSet ds = new DataSet();
    XmlDocument xdoc = new XmlDocument();
    SqlConnection cnNwind = new SqlConnection("Data source=myServerName;user id=myUser;Password=myPassword;Initial catalog=Northwind;");
    SqlDataAdapter daCustomers = new SqlDataAdapter("Select customerid,companyname,contactname, region from customers where region='WY'", cnNwind);
    string filepath = "C:\\Customers.xml";
    try
    {
        daCustomers.Fill(ds, "Customers");
        this.dataGrid1.DataSource = ds.Tables["Customers"];

        // WriteXml escapes the special characters when it writes the file.
        ds.WriteXml(filepath);
        xdoc.Load(filepath);
        XmlNode nextnode = xdoc.FirstChild.NextSibling;
        textBox1.Text = nextnode.OuterXml;
    }
    catch (Exception ex)
    {
        MessageBox.Show(ex.Message);
    }

 * 8) Change the properties in the SqlConnection connection string as necessary for your environment.
 * 9) Build and run the project.
 * 10) Click Button1.

The errors that you receive are consistent with the errors that are described in the "Required: XML file that contains third-party XML data with special characters" section. After the file is repaired, the XML data appears in the TextBox; the ampersand is replaced with &amp; and the left angle bracket is replaced with &lt;.

 * 11) Click Button2.

In the DataGrid, notice that the companyname value contains an ampersand (&) and that the TextBox displays the XML data with the ampersand written as &amp;.
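The load, repair, and retry cycle from the preceding steps can also be sketched as a console program that works on a string in memory. This is a hedged variation and not the method from the steps above: the names XmlRepairSketch, FixLine, and LoadWithRepair are hypothetical, the retry limit is an added safeguard, and the regular expression treats every ampersand that does not start a predefined entity reference or a numeric character reference as unescaped, so it would also rewrite references to custom entities. Like the Windows sample, it assumes that the reported line starts with an opening tag and that the stray left angle bracket is the first one after that tag.

```csharp
using System;
using System.Text.RegularExpressions;
using System.Xml;

public static class XmlRepairSketch
{
    // Matches an ampersand that does not start a predefined entity
    // reference or a numeric character reference.
    private static readonly Regex BareAmpersand =
        new Regex("&(?!(?:amp|lt|gt|quot|apos|#[0-9]+|#x[0-9A-Fa-f]+);)");

    // Hypothetical helper: escapes one special character on the line.
    // Returns null when it finds nothing to replace.
    public static string FixLine(string line)
    {
        Match match = BareAmpersand.Match(line);
        if (match.Success)
        {
            return line.Substring(0, match.Index) + "&amp;" + line.Substring(match.Index + 1);
        }

        // Look for a left angle bracket after the end of the first tag.
        int position = line.IndexOf('<', line.IndexOf('>') + 1);
        if (position < 0)
        {
            return null;
        }
        return line.Substring(0, position) + "&lt;" + line.Substring(position + 1);
    }

    // Loads the XML, repairing one reported line per failed attempt.
    public static XmlDocument LoadWithRepair(string xml, int maxRepairs)
    {
        string[] lines = xml.Split('\n');
        for (int repairs = 0; ; repairs++)
        {
            try
            {
                XmlDocument doc = new XmlDocument();
                doc.LoadXml(string.Join("\n", lines));
                return doc;
            }
            catch (XmlException ex)
            {
                int index = ex.LineNumber - 1;
                if (repairs >= maxRepairs || index < 0 || index >= lines.Length)
                {
                    throw;   // give up and report the parser error
                }
                string repaired = FixLine(lines[index]);
                if (repaired == null)
                {
                    throw;   // nothing on the reported line can be escaped
                }
                lines[index] = repaired;
            }
        }
    }

    public static void Main()
    {
        string xml =
            "<Customers>\n" +
            "<CompanyName>Split Rail Beer & Ale</CompanyName>\n" +
            "<ContactName>Art <Braunschweiger</ContactName>\n" +
            "</Customers>";

        XmlDocument doc = LoadWithRepair(xml, 10);
        Console.WriteLine(doc.OuterXml);
    }
}
```

Working on a string avoids the temporary file, and the retry limit keeps the loop from running indefinitely when the parser reports an error that escaping cannot fix.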

<div class="references_section">