Microsoft KB Archive/302615

= FIX: Incorrect Cardinality Estimate And Bad Plan for Query with OR and AND on Same Column =

Article ID: 302615

Article Last Modified on 10/15/2002

-

APPLIES TO


 * Microsoft SQL Server 7.0 Standard Edition

-



This article was previously published under Q302615



BUG #: 101624 (SQLBUG_70)



SYMPTOMS
When dealing with an expression of the form ((col = a OR col < b) AND col BETWEEN c AND d), the SQL Server optimizer may incorrectly estimate the number of rows that meet the condition. The estimate will be too low.

For more complex queries, this estimate being too low may lead to a bad query plan, such as choosing the wrong join order or the wrong index.



RESOLUTION
To resolve this problem, obtain the latest service pack for Microsoft SQL Server 7.0. For additional information, click the following article number to view the article in the Microsoft Knowledge Base:

301511 INF: How to Obtain the Latest SQL Server 7.0 Service Pack

NOTE: The following hotfix was created prior to Microsoft SQL Server 7.0 Service Pack 4.

The English version of this fix should have the following file attributes or later:   Version      File name       Platform --

7.00.998    s70998i.exe     x86 7.00.998    s70998a.exe     Alpha NOTE: Because of file dependencies, the most recent hotfix or feature that contains the preceding files may also contain additional files.



STATUS
Microsoft has confirmed that this is a problem in the Microsoft products that are listed at the beginning of this article. This problem was first corrected in Microsoft SQL Server 7.0 Service Pack 4.



MORE INFORMATION
You can view the estimated rows for the expression by looking at the EstimateRows column of the SHOWPLAN_ALL or STATISTICS PROFILE output.

In most cases, the estimates for a query using an OR expression should be the sum of the estimates for each individual expression. For example, if you have an expression (col = a OR col < b), the estimates should be calculated as the sum of the estimates for (col = a) and (col < b). If the value b is larger than the value a, estimating the expression individually like this results in too large of an estimate (the rows that match a would are counted twice), so special consideration is given for this case. However, in no situation should the estimate for the OR as a whole be lower than that of each of the individual expressions in the OR.

Following is an example that shows the bug in the pubs database. Also included is sample output from SHOWPLAN_ALL, which has been edited to remove irrelevant columns:

select count(*) from sales where (qty = 20 or qty < 25) and qty between 1 and 50 StmtText                                                                                                                               EstimateRows ---    select count(*) from sales where ((qty = 20) or (qty < 25)) and qty between 1 and 50                                                    1.0 |--Stream Aggregate(DEFINE:([Expr1002]=Count(*)))                                                                                    1.0 |--Filter(WHERE:([sales].[qty]=20 OR [sales].[qty]<25))                                                                          1.0 |--Clustered Index Scan(OBJECT:([pubs].[dbo].[sales].[UPKCL_sales]), WHERE:([sales].[qty]>=1 AND [sales].[qty]<=50))       20.0 Note that in the preceding output, the Filter condition for qty=20 or qty < 25 has an EstimateRows value of 1.0. However, there are 4 rows that qualify for the qty=20 expression, and 12 rows that qualify for qty < 25. Clearly the estimate of 1 row output is too low.

Note that this only occurs if there is an OR on the same column, and an additional, outer expression on the same column.

Keywords: kbbug kbfix KB302615

-

[mailto:TECHNET@MICROSOFT.COM Send feedback to Microsoft]

© Microsoft Corporation. All rights reserved.