Microsoft KB Archive/840167

= Description of crawl behavior and of crawl types in SharePoint Portal Server 2003 =

Article ID: 840167

Article Last Modified on 9/2/2004

-

APPLIES TO


 * Microsoft Office SharePoint Portal Server 2003

-





INTRODUCTION
This article describes the crawl types that are available in Microsoft Office SharePoint Portal Server 2003. It also identifies the conditions under which content is removed from a content index for each crawl type.



MORE INFORMATION
The following table lists conditions under which content may be removed from a content index and indicates, for each crawl type, whether the content is removed.

For conditions where content is removed after the third crawl, three full updates of the content index must occur before any previously crawled pages are removed from the content index. This logic exists for full crawls because it is generally impossible to know exactly why a particular URL was "unvisited" during a full crawl: the crawler in SharePoint Portal Server 2003 does not keep track of all links between URLs. The delay is therefore a precautionary measure to prevent the unintended removal of content from the content index.

For conditions where content is immediately removed, the actual time that it takes for content to be removed may vary. The actual time depends on the time it takes for the crawl operation to complete and the time it takes for the index to propagate from the index management server to the search server.

Customers may see documents removed after fewer than three full crawls. This can happen when another crawl ran between the time that the document was deleted and the time that you started a full crawl, or between full crawls. However, the constant that determines how many crawls a document is kept for before it is deleted is set to three.
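
The retention rule described above can be pictured with a small Python sketch. This is an illustration only and is not SharePoint Portal Server code: the `ContentIndex` class, the `record_crawl` method, and the per-URL miss counter are hypothetical names chosen for this example. The only value taken from this article is the threshold of three. The sketch also assumes that any crawl passed to `record_crawl` advances the counter, which is one possible reading of why removal can appear to happen after fewer than three full crawls.

```python
# Illustrative sketch only; the names below are hypothetical and do not
# correspond to any SharePoint Portal Server 2003 API.

REMOVAL_THRESHOLD = 3  # this article states that the constant is set to three


class ContentIndex:
    def __init__(self):
        # Maps each indexed URL to the number of consecutive crawls
        # in which the crawler did not visit it.
        self.missed_crawls = {}

    def record_crawl(self, visited_urls):
        """Apply the results of one crawl and return the URLs that were removed."""
        visited = set(visited_urls)
        removed = []
        # Iterate over a copy of the keys because entries may be deleted.
        for url in list(self.missed_crawls):
            if url in visited:
                continue
            self.missed_crawls[url] += 1
            if self.missed_crawls[url] >= REMOVAL_THRESHOLD:
                del self.missed_crawls[url]
                removed.append(url)
        # A visited URL is (re)indexed and its miss counter is reset.
        for url in visited:
            self.missed_crawls[url] = 0
        return removed
```

In this sketch, a URL that is indexed by one crawl and then missed by the next three crawls is returned as removed only on the third miss. A URL that is visited again before the third miss has its counter reset to zero and stays in the index.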

