Microsoft KB Archive/164014

{| = Slow Exchange Client Logons Due to Deadlock in LSASS =
 * width="100%"|

Last reviewed: May 19, 1997

Article ID: Q164014

The information in this article applies to:
 * Microsoft Windows NT Workstation versions 3.51 and 4.0
 * Microsoft Windows NT Server versions 3.51 and 4.0
 * Microsoft Exchange Server, version 4.0

SYMPTOMS
Microsoft Exchange clients experience slow logon response times or the inability to log on during peak logon hours because of a deadlock in LSASS. The following errors occur in the event log:

Event ID 7200 - MSExchangeIS Private

Background thread FDoMaintenance halted due to error code 4012.

-or-

Background thread FDoQuotaCheck halted due to error code 4012.

Additionally, the thread counts increase rapidly for Dsamain.exe on the computer running Microsoft Exchange Server and for Lsass.exe on the resource domain controllers. If the computers running Microsoft Exchange Server are domain controllers, thread counts for both DSAMAIN and LSASS increase on the computer running Microsoft Exchange Server. Normally, the thread counts for DSAMAIN and LSASS are under 30 for most servers. While clients are experiencing the slow logon response times associated with this problem, the thread counts for LSASS and DSAMAIN climb rapidly to over 70.
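The thread-count figures above can be turned into a rough check. The following is a minimal sketch, assuming the counts have already been collected by some other means (for example, from Performance Monitor); the function name, the input format, and the "elevated" label for counts between 30 and 70 are illustrative assumptions, not part of any Microsoft tool.

```python
# Thresholds taken from the figures quoted in this article.
NORMAL_MAX = 30    # counts under this are described as normal
PROBLEM_MIN = 70   # counts over this were seen during slow logons


def classify_thread_counts(samples):
    """Label each process's thread count as normal, elevated, or problem.

    `samples` is a hypothetical mapping of process name to thread count.
    The "elevated" band is an assumption; the article does not name it.
    """
    result = {}
    for process, count in samples.items():
        if count > PROBLEM_MIN:
            result[process] = "problem"
        elif count < NORMAL_MAX:
            result[process] = "normal"
        else:
            result[process] = "elevated"
    return result


if __name__ == "__main__":
    print(classify_thread_counts({"lsass": 82, "dsamain": 25}))
```

A high thread count on its own is only a hint; the event log errors listed above should be present as well.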

CAUSE
This problem occurs only when the computers running Microsoft Exchange Server are located in a resource domain and the Windows NT user accounts exist in a trusted domain.

Every Microsoft Exchange Client logon request causes Microsoft Exchange Server to look up the account security identifier (SID) in the user accounts domain. When a computer running Microsoft Exchange Client logs on, it sends an NspiBind request to the computer running Microsoft Exchange Server. This results in a LookupAccountSid call within LSASS on the domain controller in the resource domain, which sends the request to one of the trusted domain controllers in the user accounts domain. When this occurs, one thread enters a critical section that gives it exclusive access to the code that acquires a lock on a protected resource. The resource in this case is a particular work queue of LookupAccountSid operations to be performed.

After acquiring the lock, the thread calculates whether more LookupWorker threads need to be spawned. If more threads are needed, further calculations are done, the lock is released, the threads are created, and the lookups are performed. If not, LookupAccountSid is performed within the current thread itself. The problem is that, in this second case, the lock continues to be held unnecessarily. The deadlock occurs because the other threads are waiting to acquire the lock to access the work queue. The fix releases the lock after the initial calculation once it has been determined that additional LookupWorker threads are not needed.
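The locking pattern described above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the LSASS code: the `LookupQueue` class, the `release_early` flag, and the use of `time.sleep` to stand in for a slow lookup against a trusted domain controller are all hypothetical. It models only the path in which no extra LookupWorker threads are needed.

```python
import threading
import time


class LookupQueue:
    """Toy stand-in for the work queue of pending SID lookups."""

    def __init__(self, release_early):
        self.queue_lock = threading.Lock()   # protects the work queue
        self.release_early = release_early   # True models the fixed behavior
        self.max_concurrent = 0              # most lookups seen in flight
        self._active = 0
        self._stats_lock = threading.Lock()  # protects the two counters

    def _slow_lookup(self, delay):
        # Simulates the remote lookup and records how many run at once.
        with self._stats_lock:
            self._active += 1
            self.max_concurrent = max(self.max_concurrent, self._active)
        time.sleep(delay)
        with self._stats_lock:
            self._active -= 1

    def lookup(self, delay=0.05):
        self.queue_lock.acquire()
        # The "are more worker threads needed?" calculation would go here.
        # This sketch always takes the "not needed" branch.
        if self.release_early:
            # Fixed pattern: release the lock before the slow lookup.
            self.queue_lock.release()
            self._slow_lookup(delay)
        else:
            # Original pattern: the lock stays held during the lookup,
            # so every other caller waits behind this one.
            try:
                self._slow_lookup(delay)
            finally:
                self.queue_lock.release()


def run(release_early, clients=4):
    """Run `clients` simultaneous lookups; return peak concurrency."""
    queue = LookupQueue(release_early)
    threads = [threading.Thread(target=queue.lookup) for _ in range(clients)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return queue.max_concurrent


if __name__ == "__main__":
    print("lock held during lookup:", run(release_early=False))
    print("lock released early:    ", run(release_early=True))
```

With the lock held across the lookup, only one lookup can be in flight at a time, so waiting threads accumulate behind it. Releasing the lock after the initial calculation lets the lookups overlap.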

The fix has to be applied to all resource domain controllers when the user accounts exist in a trusted domain.

STATUS
Microsoft has confirmed this to be a problem in Windows NT versions 3.51 and 4.0. This problem was corrected in the latest Microsoft Windows NT 4.0 U.S. Service Pack. For information on obtaining the service pack, query on the following word in the Microsoft Knowledge Base (without the spaces):

S E R V P A C K

MORE INFORMATION
Windows NT runs into the problem only when the user accounts for Microsoft Exchange exist in a trusted domain. In this case, there is a call to look up the SID in a trusted domain, which is the only code path that leads to the deadlock. The search premise is based on pass-through authentication. To understand it better, follow this simplified version of what happens when a computer running Microsoft Exchange Server is in a resource domain. There are two possible scenarios, and they are similar. In the first, the computer running Microsoft Exchange Server is not a domain controller. In the second, the computer running Microsoft Exchange Server is a domain controller.

Scenario 1 (the computer running Microsoft Exchange Server is not a domain controller):

The computer running Microsoft Exchange Server first attempts to look up the account SID in its local accounts database. If it does not find the SID, it sends the request to a domain controller in its primary domain (the domain it is a member of). The domain controller in the primary domain attempts to locate the SID in its SAM by calling a routine to look up the SID in the local domain. If it still cannot find the SID, which will be the case when the user accounts are in a trusted domain, it sends the request to a trusted domain controller in the user accounts domain. To do so, the resource domain controller calls a routine to look up the SID in a trusted domain, which leads to the problem. The congestion is on the resource domain controller.

Scenario 2 (the computer running Microsoft Exchange Server is a domain controller):

The computer running Microsoft Exchange Server is a domain controller, so it attempts to find the SID in its copy of the SAM by calling a routine to look up the SID in the local domain. If it cannot find the SID, which will be the case when the user accounts are in a trusted domain, it sends the request to a trusted domain controller in the user accounts domain. The resource domain controller/computer running Microsoft Exchange Server calls a routine to look up the SID in a trusted domain, which leads to the problem. The congestion is at the computer running Microsoft Exchange Server because it is a domain controller.
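The search order in the two scenarios can be sketched as follows. This is a simplified illustration, assuming each accounts database is a plain dictionary keyed by SID; the function name, parameter names, and step labels are hypothetical and do not correspond to any Windows NT API.

```python
def resolve_sid(sid, local_accounts, resource_domain_sam,
                trusted_domain_sam, exchange_is_dc):
    """Return (account_name_or_None, list_of_places_searched)."""
    searched = []

    if not exchange_is_dc:
        # Scenario 1 only: a member server checks its own accounts first.
        searched.append("Exchange server local accounts")
        if sid in local_accounts:
            return local_accounts[sid], searched

    # Both scenarios: look in the SAM of the resource (primary) domain.
    searched.append("resource domain SAM")
    if sid in resource_domain_sam:
        return resource_domain_sam[sid], searched

    # Both scenarios: fall through to the trusted user accounts domain.
    # This is the step that reaches the code path with the lock problem.
    searched.append("trusted domain controller")
    return trusted_domain_sam.get(sid), searched


if __name__ == "__main__":
    trusted = {"S-1-5-21-1-2-3-1001": "ACCOUNTS\\jsmith"}  # made-up data
    print(resolve_sid("S-1-5-21-1-2-3-1001", {}, {}, trusted,
                      exchange_is_dc=False))
```

In both scenarios the final step runs on a resource domain controller, which is why the fix has to be applied there.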
 * }