Microsoft KB Archive/248306
Article ID: 248306
Article Last Modified on 6/12/2001
- Microsoft Site Server 3.0 Standard Edition
- Microsoft Index Server 2.0
This article was previously published under Q248306
Microsoft Site Server 3.0 Search incorrectly detects Korean documents as Japanese.
Microsoft has confirmed that this is a problem in the Microsoft products that are listed at the beginning of this article.
If the documents can be pre-processed, convert them to HTML and add the language and charset tags. Otherwise, the Site Server Search crawl (also known as Gatherer) server must be dedicated to crawling Korean documents so that Korean text documents are handled correctly. Plain text documents cannot be tagged, so document tagging cannot be used to identify their language.
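As a rough illustration of the pre-processing option, the following Python sketch wraps a Korean plain-text file in a minimal HTML page that declares its language and character set. The function names, the cp949 source encoding, the euc-kr output charset, and the choice of meta tags are assumptions made for illustration. The article does not say which tags Site Server Search reads, so verify them against your own catalog before relying on this.

```python
import html


def wrap_text_as_html(text, lang="ko", charset="euc-kr", title=""):
    """Return a minimal HTML page that declares language and charset.

    The meta tags used here are standard HTML declarations; whether the
    crawler honors them is an assumption, not something the article states.
    """
    return (
        "<html>\n<head>\n"
        f'<meta http-equiv="Content-Type" content="text/html; charset={charset}">\n'
        f'<meta http-equiv="Content-Language" content="{lang}">\n'
        f"<title>{html.escape(title)}</title>\n"
        "</head>\n<body>\n"
        # Escape markup characters and keep the original line breaks.
        f"<pre>{html.escape(text)}</pre>\n"
        "</body>\n</html>\n"
    )


def convert_file(src_path, dst_path, src_encoding="cp949",
                 charset="euc-kr", lang="ko"):
    """Read a text file and write it back out as a tagged HTML file."""
    with open(src_path, encoding=src_encoding) as src:
        text = src.read()
    page = wrap_text_as_html(text, lang=lang, charset=charset, title=src_path)
    # Characters that the target charset cannot represent are written as
    # numeric character references instead of raising an error.
    with open(dst_path, "w", encoding=charset,
              errors="xmlcharrefreplace", newline="\n") as dst:
        dst.write(page)
```

A call such as convert_file("notice.txt", "notice.html") would produce an HTML copy to crawl in place of the text file; both file names are hypothetical.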
The following configuration is required on Site Server Service Pack 2 or later:
Set the region to Korean and select the Set as system default locale option. This installs the Korean character set and makes iso-8959-5 the default character set. Restart the computer to activate the system locale change.
Both Korean and Japanese must be listed as input locales, with Korean set as the default input locale. The Japanese character set is needed to recognize some of the characters.
Internet Explorer Language Settings
In Internet Explorer, open Internet Options, click the General tab, and then click Languages. Make sure that Korean is listed, because Site Server Search uses a part of Internet Explorer (WinInet) to crawl the documents.
With these settings, all Korean and most Japanese text documents are recognized as Korean. English text documents are still correctly recognized as English.
Keywords: kbprb KB248306