Microsoft KB Archive/321332

= FIX: Full-Text Searches for Data That is Enclosed in Parenthesis Inside HTML Tags is Not Being Returned =

Article ID: 321332

Article Last Modified on 11/5/2003

-

APPLIES TO


 * Microsoft SQL Server 2000 Standard Edition

-



This article was previously published under Q321332



SYMPTOMS
When you run SQL Server full-text searches, the search may not return the correct results if all of the following conditions are true:


 * The rows contain an HTML document that is stored in a text column.
 * The data that you are searching for is enclosed in parenthesis &quot;&quot;.
 * There is no blank space between the parenthesis and the HTML tag.
 * The full-text search is using an English word breaker.

The following example illustrates the problem: CREATE TABLE [tbl_fts_test] (   [row_num]  int  NOT NULL constraint PK_fts_test primary key,    [field_value] [varchar] (8000) NULL ) go

---insert data insert into tbl_fts_test values (1,' Type: Web Page Title: Microsoft Security Bulletin (MS01-060) Author: Microsoft ') insert into tbl_fts_test values (2,' Type: Web Page Title: Microsoft Security Bulletin (MS01-060) Author: Microsoft ') insert into tbl_fts_test values (3,' Title: Microsoft Security Bulletin (MS01-060) Author: Microsoft ') insert into tbl_fts_test values (4,'Title: Microsoft Security Bulletin (MS01-060) Author: Microsoft') If you run the following query after the full-text index is created and populated, the query only returns rows with row_num 2 and 4: select * from tbl_fts_test where contains(field_value, 'MS01-060')

---Row_Num=2 is returned because there is a blank space between the parenthesis and the HTML tag.

---Row_Num=4 is returned because it does not contain any HTML tags.



CAUSE
The English word breaker does not perform HTML filtering. Instead, it looks at the text and applies the rules of the English language. The English word breaker is designed to tokenize English text (not necessarily HTML).



RESOLUTION
To resolve this problem, obtain the latest service pack for Microsoft SQL Server 2000. For additional information, click the following article number to view the article in the Microsoft Knowledge Base:

290211 INF: How To Obtain the Latest SQL Server 2000 Service Pack

NOTE: The following hotfix was created before the release of Microsoft SQL Server 2000 Service Pack 3.



WORKAROUND
To work around this behavior, you can either:   Use a Neutral word breaker for HTML data that is stored in text columns. To avoid the problem, you can modify the code in the More Information section as follows: --add column to indexing exec sp_fulltext_column @tabname =  'dbo.tbl_fts_test' , @colname = 'field_value' , @action = 'add' , @language = 0    -This is the additional parameter that you must add to force the Neutral word breaker. go -or-

 Store the HTML as a Binary Large Object (BLOB), and then use the HTML IFilter. For more information see the &quot;Full-Text Catalogs and Indexes&quot;, &quot;Using Full-text Predicates to Query image Columns&quot;, and &quot;Filtering Supported File Types&quot; topics in SQL Server Books Online.

NOTE: When you use the Neutral word breaker, linguistic based searches may not work. Linguistic analysis involves finding word boundaries (word-breaking) and conjugating verbs (stemming).

For example, any full-text query that uses the FORMSOF(INFLECTIONAL) may not work with the Neutral word breaker. For more information, see the &quot;Column-Level Linguistic Analysis&quot; topic in SQL Server Books Online.



STATUS
Microsoft has confirmed that this is a problem in the Microsoft products that are listed at the beginning of this article.

This problem was first corrected in Microsoft SQL Server 2000 Service Pack 3.



MORE INFORMATION
The following code segment illustrates the problem. use master go ---Create database drop database testfts go create database testfts go

use testfts go ---Create table if object_id ('tbl_fts_test') is not null drop table tbl_fts_test go CREATE TABLE [tbl_fts_test] (   [row_num]  int  NOT NULL constraint PK_fts_test primary key,    [field_value] [varchar] (8000) NULL ) go

---Insert data insert into tbl_fts_test values (1,' Type: Web Page Title: Microsoft Security Bulletin (MS01-060) Author: Microsoft ') insert into tbl_fts_test values (2,' Type: Web Page Title: Microsoft Security Bulletin (MS01-060) Author: Microsoft ') insert into tbl_fts_test values (3,' Title: Microsoft Security Bulletin (MS01-060) Author: Microsoft ') insert into tbl_fts_test values (4,'Title: Microsoft Security Bulletin (MS01-060) Author: Microsoft') go

---Enable database for full-text exec sp_fulltext_database 'enable' go

---Create catalog exec sp_fulltext_catalog @ftcat = 'fts_test' , @action = 'create' go

---Enable fulltext for the table exec sp_fulltext_table @tabname =  'dbo.tbl_fts_test' , @action =  'create' , @ftcat =  'fts_test' , @keyname =  'PK_fts_test' go

---Add column to indexing exec sp_fulltext_column @tabname =  'dbo.tbl_fts_test' , @colname = 'field_value' , @action = 'add' go

---Start full population exec sp_fulltext_catalog @ftcat =  'fts_test' , @action = 'start_full'

---Check status while (fulltextcatalogproperty ('fts_test', 'populatestatus')) <>0 begin waitfor delay '0:0:2' end --Make sure status column is 0, which means population is completed. exec sp_help_fulltext_catalogs 'fts_test' go

---Query that shows that data does exist, returns 4 rows select * from tbl_fts_test where field_value like '%MS01-060%' order by field_value go

---Run query, which returns 2 rows when it ought to return 4 rows. select * from tbl_fts_test where contains(field_value, '*MS01-060*') select * from tbl_fts_test where contains(field_value, '&quot;*MS01-060*&quot;') select * from tbl_fts_test where contains(field_value, '&quot;MS01-060&quot;') select * from tbl_fts_test where contains(field_value, 'MS01-060') select * from tbl_fts_test where contains(field_value, '(MS01-060)') go

Keywords: kbfix kbprb kbsqlserv2000sp3fix KB321332

-

[mailto:TECHNET@MICROSOFT.COM Send feedback to Microsoft]

© Microsoft Corporation. All rights reserved.