Microsoft KB Archive/248306


Article ID: 248306

Article Last Modified on 6/12/2001



APPLIES TO

  • Microsoft Site Server 3.0 Standard Edition
  • Microsoft Index Server 2.0



This article was previously published under Q248306

SYMPTOMS

Microsoft Site Server 3.0 Search incorrectly detects Korean documents as Japanese.

CAUSE

Microsoft has confirmed that this is a problem in the Microsoft products that are listed at the beginning of this article.


WORKAROUND

If the documents can be pre-processed, convert them to HTML and add the language and charset tags. Otherwise, the Site Server Search crawl (also known as Gatherer) server must be dedicated to crawling Korean documents so that Korean text documents are handled correctly. Plain text documents cannot be tagged, so document tagging cannot be used to identify their language.
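The pre-processing option can be sketched as a small script that wraps each plain text document in a minimal HTML page declaring its language and character set. This is only an illustration under stated assumptions: the function names, the "ko" language code, the "euc-kr" charset, and the choice of the Content-Type and Content-Language META tags are not taken from this article, and you should confirm which tags your Site Server Search configuration actually honors before relying on it.

```python
import html


def wrap_text_as_html(text, lang="ko", charset="euc-kr", title=""):
    """Return a minimal HTML document that declares language and charset.

    The tag choice (Content-Type and Content-Language META tags) is an
    assumption; the article only says to add "language and charset tags".
    """
    # Escape <, >, & so the original text cannot be read as markup.
    body = html.escape(text)
    return (
        f'<html lang="{lang}">\n'
        "<head>\n"
        f'<meta http-equiv="Content-Type" content="text/html; charset={charset}">\n'
        f'<meta http-equiv="Content-Language" content="{lang}">\n'
        f"<title>{html.escape(title)}</title>\n"
        "</head>\n"
        f"<body><pre>{body}</pre></body>\n"
        "</html>\n"
    )


def convert_file(src_path, dst_path, charset="euc-kr"):
    """Read a text file in the given charset and write the tagged HTML copy.

    Hypothetical helper: assumes the source file is really encoded in
    `charset`; a wrong guess raises UnicodeDecodeError instead of silently
    producing garbled output.
    """
    with open(src_path, "r", encoding=charset) as src:
        text = src.read()
    with open(dst_path, "w", encoding=charset, newline="") as dst:
        dst.write(wrap_text_as_html(text, charset=charset))
```

For example, convert_file("notice.txt", "notice.htm") would produce an HTML copy of a Korean text file that the crawler can be pointed at in place of the original. The file names here are placeholders.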

The following configuration is required on Site Server Service Pack 2 or later:

Regional Settings

Set the region to Korean and select the Set as system default locale option. This installs the Korean character set and makes iso-8959-5 the default character set. Restart the computer to activate the system locale change.

Input Locales

Both Korean and Japanese must be listed, with Korean as the default input locale. The Japanese character set is needed to recognize some of the characters.

Internet Explorer Language Settings

In Internet Explorer, click Internet Options on the Tools menu, click the General tab, and then click Languages. Make sure that Korean is listed, because Site Server Search uses a part of Internet Explorer (WinInet) to crawl the documents.

With the above settings, all Korean and most Japanese text documents are recognized as Korean. English text documents, however, are correctly recognized as English.

Keywords: kbprb KB248306