Microsoft KB Archive/843522

= Update is available to configure the code page that is used by the Search HTML filter in SharePoint Portal Server 2003 =

Article ID: 843522

Article Last Modified on 4/24/2007


APPLIES TO


 * Microsoft Office SharePoint Portal Server 2003




Important This article contains information about how to modify the registry. Make sure to back up the registry before you modify it. Make sure that you know how to restore the registry if a problem occurs. For more information about how to back up, restore, and modify the registry, click the following article number to view the article in the Microsoft Knowledge Base:

256986 Description of the Microsoft Windows registry



INTRODUCTION
This article describes an update to the Search HTML filter in Microsoft Office SharePoint Portal Server 2003. When you use this update, you can configure the code page that is used to filter HTML documents.

By default, the Search HTML filter in SharePoint Portal Server 2003 uses a Unicode Transformation Format-8 (UTF-8) code page. If an HTML document does not specify a character set, the Search HTML filter uses a UTF-8 code page to filter the document properties of the HTML document. This behavior differs from the behavior of Microsoft SharePoint Portal Server 2001, where the Search HTML filter uses a code page that is based on the locale of the server. You can use the update to configure the Search HTML filter in SharePoint Portal Server 2003 to use a code page that is based on the locale of the server.



RESOLUTION
This problem is corrected in Microsoft Office SharePoint Portal Server 2003 Service Pack 2.

To resolve this problem, obtain the latest service pack for SharePoint Portal Server 2003. For more information, click the following article number to view the article in the Microsoft Knowledge Base:

889380 How to obtain the latest service pack for SharePoint Portal Server 2003

After you install the service pack, follow the steps that are listed in the "More Information" section to set the HTMLFiltUseLocaleForDefaultCodePage registry entry and to enable the hotfix.



MORE INFORMATION
You may want to configure the Search HTML filter to use a code page that is based on the locale of the server when you want SharePoint Portal Server 2003 to use the same behavior as SharePoint Portal Server 2001. For example, you click Item Details to view the details of an HTML document that is returned in the search results. You notice that the values of certain properties of the document are displayed incorrectly. The HTML document contains high ASCII characters that use ANSI encoding.

In this scenario, configure the Search HTML filter in SharePoint Portal Server 2003 to use a code page that is based on the locale of the server so that the HTML document is filtered correctly. To do this, obtain the hotfix, and then follow the steps in the "Add the HTMLFiltUseLocaleForDefaultCodePage registry entry after you install the hotfix" section.
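To illustrate why the property values in this scenario are displayed incorrectly, the following minimal Python sketch decodes the same ANSI-encoded bytes in two ways. It is not the Search HTML filter itself. The sample string and the choice of Windows-1252 (`cp1252`) as the locale code page are assumptions made for illustration, and the filter may handle invalid bytes differently than Python's `errors="replace"` mode does.

```python
# Hypothetical document property stored as ANSI bytes (Windows-1252 assumed).
raw_title = "Résumé".encode("cp1252")  # contains high ASCII bytes (0xE9)

# Decoding with the code page that matches the server locale recovers the text.
as_locale = raw_title.decode("cp1252")

# Decoding the same bytes as UTF-8 fails on the high ASCII bytes; with
# errors="replace", each invalid byte becomes U+FFFD (the replacement character).
as_utf8 = raw_title.decode("utf-8", errors="replace")

print(as_locale)
print(as_utf8)
```

The second result contains replacement characters where the accented letters should be, which is analogous to the garbled property values that a user might see in Item Details.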

How to obtain the hotfix
This issue is fixed in the SharePoint Portal Server 2003 post-Service Pack 1 Hotfix Package that is dated September 17, 2004. For additional information, click the following article number to view the article in the Microsoft Knowledge Base:

883919 Description of the Office SharePoint Portal Server 2003 post-Service Pack 1 Hotfix Package: September 17, 2004

Add the HTMLFiltUseLocaleForDefaultCodePage registry entry after you install the hotfix
After you install this hotfix, add the HTMLFiltUseLocaleForDefaultCodePage registry entry to the following registry subkey, and then set the registry entry to either 1 or 0 (zero), depending on your situation:

HKEY_LOCAL_MACHINE\Software\Microsoft\SPSSearch\Gathering Manager

The following describes the values that you can use for the HTMLFiltUseLocaleForDefaultCodePage registry entry:
 * If you set the registry entry to 1, the Search HTML filter in SharePoint Portal Server 2003 uses a code page that is based on the locale setting of the server. Therefore, the behavior of the code page is the same as the behavior in SharePoint Portal Server 2001.
 * If you set the registry entry to 0 (zero) or if you delete the registry entry, the Search HTML filter in SharePoint Portal Server 2003 uses a UTF-8 code page. This is the default behavior in SharePoint Portal Server 2003.
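The values described above can be summarized as a small decision function. This is a hedged Python sketch of the documented behavior, not the filter's actual implementation. The function name, its parameters, and the handling of a declared character set are assumptions for illustration. The article states only what happens when a document does not specify a character set, and it does not describe values other than 0 and 1, so this sketch treats any value other than 1 as the default.

```python
from typing import Optional


def default_code_page(
    declared_charset: Optional[str],
    registry_value: Optional[int],
    locale_code_page: str,
) -> str:
    """Pick the code page used to filter an HTML document (illustrative only).

    registry_value models HTMLFiltUseLocaleForDefaultCodePage;
    None means the registry entry does not exist.
    """
    # Assumption: a character set declared by the document takes precedence.
    if declared_charset:
        return declared_charset
    # A value of 1 selects the server locale code page (SharePoint Portal Server 2001 behavior).
    if registry_value == 1:
        return locale_code_page
    # A value of 0 or a missing entry keeps the SharePoint Portal Server 2003 default.
    return "utf-8"


print(default_code_page(None, 1, "cp1252"))
print(default_code_page(None, None, "cp1252"))
```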

To add the HTMLFiltUseLocaleForDefaultCodePage registry entry, follow these steps.

Warning Serious problems might occur if you modify the registry incorrectly by using Registry Editor or by using another method. These problems might require that you reinstall your operating system. Microsoft cannot guarantee that these problems can be solved. Modify the registry at your own risk.

 1. Click Start, click Run, type regedit in the Open box, and then click OK.
 2. Locate and then click the following registry subkey:

    HKEY_LOCAL_MACHINE\Software\Microsoft\SPSSearch\Gathering Manager

 3. On the Edit menu, point to New, and then click DWORD Value.
 4. Type HTMLFiltUseLocaleForDefaultCodePage, and then press ENTER.
 5. On the Edit menu, click Modify.
 6. Type the value that you want in the Value data box, and then click OK.
 7. Quit Registry Editor.
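As an alternative to the manual steps, the same registry entry could be described in a registration (.reg) file and imported on the server. The following Python sketch only builds the text of such a file. The subkey and entry name come from this article, but the helper name `build_reg_file` is hypothetical, and this approach is not part of the original article. Back up the registry before you import any .reg file.

```python
SUBKEY = r"HKEY_LOCAL_MACHINE\Software\Microsoft\SPSSearch\Gathering Manager"
ENTRY = "HTMLFiltUseLocaleForDefaultCodePage"


def build_reg_file(value: int) -> str:
    """Return .reg file text that sets the DWORD entry to 0 or 1 (illustrative helper)."""
    if value not in (0, 1):
        raise ValueError("value must be 0 or 1")
    lines = [
        "Windows Registry Editor Version 5.00",  # standard .reg file header
        "",
        f"[{SUBKEY}]",
        f'"{ENTRY}"=dword:{value:08x}',  # DWORD data is written as 8 hex digits
        "",
    ]
    return "\n".join(lines)


# Text for a file that enables the locale-based code page.
print(build_reg_file(1))
```

Saving the returned text to a file with a .reg extension and importing it on the server would have the same effect as steps 2 through 6 above, assuming that the hotfix or service pack is already installed.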

<div class="references_section">