Microsoft KB Archive/251354

= How To Locate and Replace Special Characters in an XML Document with Visual Basic =

Article ID: 251354

Article Last Modified on 7/1/2004

-

APPLIES TO


 * Microsoft Visual Basic 6.0 Learning Edition
 * Microsoft Visual Basic 6.0 Professional Edition
 * Microsoft Visual Basic 6.0 Enterprise Edition
 * Microsoft ActiveX Data Objects 2.5
 * Microsoft ActiveX Data Objects 2.6
 * Microsoft XML Core Services 4.0
 * Microsoft XML Core Services 4.0

-



This article was previously published under Q251354



SUMMARY
XML predefines five entity references for special characters that would otherwise be interpreted as part of the markup language. These five are listed as follows: Entity Reference               Character

&                       &      &lt:                         <

&gt                        >

"                      "

&apos;                      ' When you create an XML Document, you must input these special characters appropriately to insure proper parsing, interpretation, and data display. When you create an XML Document from Visual Basic, one way to search for these special characters and replace them with the proper entity reference is to use the Replace function.

NOTE: Microsoft Visual Basic 6.0 and later products, and Microsoft Office 98 and later products come with a Replace function that is built-in.



MORE INFORMATION
The following sample opens an ADO Recordset, creates Document Elements for the records, and saves the data as an XML Document using the FileSystemObject object. Before you save the file, however, the content is searched for special characters and replaced with the appropriate XML Entity References for those characters utilizing the Replace Method.

NOTE: For simplicity of demonstration, the instructions are for a search and replace for the Ampersand (&) character in the following sample. The same function can be applied for the other four characters as well.  To run this sample, start a new Visual Basic Standard Exe Project. Form1 is created by default. On the Project menu, under References, select the following: Microsoft ActiveX Data Objects Library (2.1 or later)

Microsoft XML, Version 1.0 or later

Microsoft Scripting Runtime*


 * The FileSystemObject Object Model is contained in the Scripting Type Library.

  Draw the following controls on Form1 and set their Properties as listed:   Control                   Name              Caption Label                 lblFSORecordCount           Record Count: Text Box              txtFSORecordCount Command Button        cmdOpenAdoFsoRs             Open ADO Recordset Command Button        cmdSaveRSasXMLwithFSO       Save RS as XML using FSO Command Button        cmdFsoXmltoTreeView         Place XML Data Into TreeView TreeView Control*     xtvFSO Therefore, you want to check this off under Components in the Project menu.   From the Project menu, add a standard Module and copy the following to create the appropriate variables needed in the project: Option Explicit
 * The TreeView Control is part of the Windows Common Controls.

Public cn As New adodb.Connection Public rs As New adodb.Recordset Public myI As Integer Public StrTmp As String Public Errs1 As Errors Public myFSO As New FileSystemObject Public myTxtStream As TextStream Public myfld As adodb.Field Public domdoc As New MSXML.DOMDocument Public domdoc As New MSXML2.DOMDocument 'For use with XML 2.6

  Copy the following code into the General Declarations Section of the code window.NOTE: Replace your Server Name, User ID, and Password information in the Connection String. Private Sub cmdOpenAdoFsoRs_Click Screen.MousePointer = vbHourglass

'Check to be sure the connection is not already opened: If cn.State = 1 Then cn.Close Set cn = Nothing End If

cn.Open "driver={SQL Server};Server=YourServer;uid=YourUID;pwd=YourPWD;database=Northwind"

'Check to be sure a Recordset is not already opened: If rs.State = 1 Then rs.Close Set rs = Nothing End If

With rs   .CursorLocation = adUseClient .CursorType = adOpenKeyset .LockType = adLockOptimistic .CacheSize = 50 .Source = "Select CustomerID, " & _ "CompanyName, ContactName, Region from Customers" .ActiveConnection = cn   .Open End With

rs.MoveLast rs.MoveFirst

txtFsoRecordCount.Text = rs.RecordCount

Screen.MousePointer = vbNormal

End Sub

Private Sub cmdExit_Click If cn.State = 1 Then cn.Close Set cn = Nothing End If

If rs.State = 1 Then rs.Close Set rs = Nothing End If

Unload Me

End Sub

Private Sub cmdSaveRSasXMLwithFso_Click Screen.MousePointer = vbHourglass

'If the file already exists, first Kill it 'and then create it. 'If it does not already exist, simply create it:

On Error GoTo mdkNoSeeEm Const mdkNoSeeEm = 53 'File Not Found Kill "Customers.xml"

'Note: Unicode data is stored as 2 Bytes per character. This will allow 'proper parsing of international language characters. However, either 'omitting or explicitly setting the last 'argument in the CreateTextFile 'Method to False will create the file as Ascii. Subsequently, a '"An 'Invalid Character Was Found in Text Content" error will occur if these 'characters exist in 'the text. Therefore, you use Unicode (True).

'Create a Text file using the FileSystemObject:

Set myTxtStream = myFSO.CreateTextFile(App.Path & "\Customers.xml", True, True)

'After searching for and replacing special characters with 'their respective entities (via the ReplaceEntXMLSpecChar Function), 'write the data to Text File to compile XML Document using the TextStream:

myTxtStream.WriteLine "" myTxtStream.WriteLine "<Customers>"

rs.MoveFirst

Do While Not rs.EOF myTxtStream.WriteLine "<Customer>" myTxtStream.WriteLine "<CustomerID>" & ReplaceEntXMLSpecChar(rs.Fields("CustomerID") & "", "&", "amp;") & "</CustomerID>" myTxtStream.WriteLine "<CompanyName>" & ReplaceEntXMLSpecChar(rs.Fields("CompanyName") & "", "&", "amp;") & "</CompanyName>" myTxtStream.WriteLine "<ContactName>" & ReplaceEntXMLSpecChar(rs.Fields("ContactName") & "", "&", "amp;") & "</ContactName>" myTxtStream.WriteLine "<Region>" & ReplaceEntXMLSpecChar(rs.Fields("Region") & "", "&", "amp;") & "</Region>" myTxtStream.WriteLine "</Customer>" rs.MoveNext Loop

rs.MoveFirst

myTxtStream.WriteLine "</Customers>"

myTxtStream.Close

Screen.MousePointer = vbNormal

Exit Sub

mdkNoSeeEm: If Err.Number = mdkNoSeeEm Then Resume Next End If Screen.MousePointer = vbNormal

End Sub

'When this function is applied to each record in the recordset, 'simply provide the Source to perform the search on, the item 'to be Searched For and the item to Replace with:

Private Function ReplaceEntXMLSpecChar(ByVal strSource As String, ByVal strSearchFor As String, ByVal strReplace As String) _ As String Dim lngPointer As Long, strNew As String lngPointer = InStr(strSource, strSearchFor) If lngPointer = 0 Then ReplaceEntXMLSpecChar = strSource Else strNew = Left$(strSource, lngPointer - 1) & strReplace & Mid(strSource, lngPointer + 1, Len(strSource)) ReplaceEntXMLSpecChar = strNew End If End Function

Private Sub cmdFsoXmltoTreeView_Click 'Fill The TreeView with data from the XML Document:

With DOMdoc .async = False .Load App.Path & "\customers.xml" End With

If DOMdoc.readyState = 4 Then xtvFSO.Nodes.Clear AddNode DOMdoc.documentElement End If End Sub Private Sub AddNode(ByRef myNode As IXMLDOMNode, Optional ByRef TreeNode As Node)

Dim tvcusterrNode As Node Dim tvcusterrNodeList As IXMLDOMNodeList Dim i As Long If TreeNode Is Nothing Then Set tvcusterrNode = xtvFSO.Nodes.Add Else Set tvcusterrNode = xtvFSO.Nodes.Add(TreeNode, tvwChild) End If      tvcusterrNode.Expanded = True tvcusterrNode.Text = myNode.nodeName If tvcusterrNode.Text = "#text" Then tvcusterrNode.Text = myNode.nodeTypedValue Else tvcusterrNode.Text = "<" & tvcusterrNode.Text & ">" End If                   </li> Run the project.</li> Click the Open ADO Recordset CommandButton to open the Connection and Recordset, displaying a RecordCount.</li> Click the Save RS as XML Using FSO CommandButton to parse the data, replace special characters in the Recordset, and then save it as an XML document through the FileSystemObject object.</li>  Subsequently, click the Place FSO XML Data into TreeView CommandButton to open the XML File and display the data in a TreeView Control.

If you open the XML document produced in Notepad or any other text editor and do a search for the ampersand (&), you find: <pre class="fixed_text">Split Rail Beer & In the treeview, this appears as: <pre class="fixed_text">Split Rail Beer & </li></ol>

If a newer version of MSXML has been installed in side-by-side mode, you must explicitly use the Globally Unique Identifiers (GUIDs) or ProgIDs for that version to run the sample code. For example, MSXML version 4.0 can only be installed in side-by-side mode. For additional information about the code changes that are required to run the sample code with the MSXML 4.0 parser, click the following article number to view the article in the Microsoft Knowledge Base:

305019 INFO: MSXML 4.0 Specific GUIDs and ProgIds

<div class="references_section">