Microsoft KB Archive/171034


Article ID: 171034

Article Last Modified on 11/6/2003



APPLIES TO

  • Microsoft SQL Server 6.5 Standard Edition



This article was previously published under Q171034

SUMMARY

SQL Server provides several tunable parameters that you can use to fine-tune SQL Server and significantly influence the performance of your system. These settings are accessible through isql or ISQL/w (by using sp_configure) and through SQL Enterprise Manager. Additionally, you can monitor your SQL Server system to conduct a general or detailed evaluation of SQL Server performance by using the SQL Server Performance Monitor counters. By default, SQL Server Setup installs SQL Server with the option of providing performance statistics to Performance Monitor.
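
As a point of reference, the configuration options discussed in this article are read and changed with sp_configure followed by RECONFIGURE, run from isql or ISQL/w. The following is a minimal sketch of that pattern; the "user connections" option and the value 50 are used purely as an illustration, and note that several of the tuning parameters discussed below are advanced options that do not appear in the default sp_configure listing.

    -- list the configuration options together with their configured and run values
    sp_configure
    go

    -- example only: change one option and make the change take effect
    -- ("user connections" is illustrative; it is applied the next time SQL Server starts)
    sp_configure 'user connections', 50
    go
    reconfigure with override
    go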

The following are some of the most commonly monitored Performance Monitor objects and counters that you can use to better understand and analyze the dynamics of your SQL Server system. The list also points out the correlation to the relevant SQL Server configuration options, gives tuning tips and guidelines for particular Performance Monitor counters, and notes any corrective actions that you might need to take. Be advised that the performance returns on such fine-tuning can only partially compensate for poor database, index, and query design; it is recommended that you examine these areas before you fine-tune your system. For additional information, please see the following article in the Microsoft Knowledge Base:

110352 INF: Optimizing Microsoft SQL Server Performance


Please be advised that some of the Performance Monitor counters discussed below provide a system-wide view, so it can be difficult to interpret a result and isolate its effect on the SQL Server system. Also note that changing the value of one configuration option might enhance or offset the gains achieved through another. Proceed with fine-tuning these parameters only if you understand the underlying SQL Server architecture very well and you know the idiosyncrasies of your application and system. Work gradually, and do not attempt to change all the parameters at once, because it might then be difficult to troubleshoot and isolate which parameter caused a performance degradation.

MORE INFORMATION

Memory: Page Faults / sec

Some initial paging is seen until the system reaches a steady state, and this counter might momentarily go high when several processes are competing for memory. If it is steadily high, it indicates a memory bottleneck (thrashing).

For excessive paging, add more physical memory or reduce the memory allocated to your processes or applications (including SQL Server). Also consider increasing the initial page file size. If you still see excessive paging, monitor the following counters for each process to see whether you have a memory leak:

  • Private Bytes
  • Virtual Bytes
  • Handle Count
  • Pool Paged Bytes
  • Pool Nonpaged Bytes

A gradual increase in Windows NT paging indicates a possible leak coming from some process on the computer. For additional information, please see the following articles in the Microsoft Knowledge Base:

168778 BUG: Page Fault Statement in Admin Companion Is Incorrect
110983 INF: Recommended SQL Server for WinNT Memory Configurations
168697 INF: Estimating the Initial SQL Server Memory Setting


Process: Working Set

This counter monitors the working set size for a specified instance of a process like SQL Server. The working set size for SQL Server should be very close to the configured SQL Server memory value.

If "set working set size" is specified as 0, Windows NT determines the working set size of SQL Server. Configuring this to 1 forces the working set size for SQL Server memory and might lead to over committing SQL Server memory. This is best left as the default, unless you have lot of physical memory. For additional information, please see the following article in the Microsoft Knowledge Base:

166967 INF: Proper SQL Server Configuration Settings
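
A minimal sketch of leaving the option at its default, run through sp_configure from isql or ISQL/w; the value 0 is the default, and the setting is read the next time SQL Server starts:

    -- leave Windows NT in control of the SQL Server working set (the default)
    sp_configure 'set working set size', 0
    go
    reconfigure with override
    go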


Process:%Processor Time

This counter indicates the CPU utilization of all non-idle threads. If this counter is consistently over 95%, you have a CPU bottleneck.

Consider adding another processor or upgrading to a faster processor. Please note that some SQL Server operations, such as sorting, reformatting, and processing aggregate functions, can be CPU-bound for the duration of the operation. Also note that the most CPU time a process will report on SMP computers is 100% of one processor. Therefore, to get a true reading of processor usage, add up Thread: % Processor Time for all threads of the process, divide by the number of processors, and compare the result to the System: % Processor Time counter. For example, if the threads of the SQL Server process add up to 200% on a four-processor computer, the process is using 200/4 = 50% of the total available CPU.

Processor:%Privileged Time

This corresponds to the percentage of time the processor spends executing Windows NT kernel code, such as processing SQL Server I/O requests.

If you see this counter consistently high in conjunction with the Physical Disk counters, it is likely that your SQL Server application is I/O bound; consider a faster disk subsystem. Placing tempdb in RAM might also decrease this value as a result of the reduction in I/O. Decreasing "max async IO" also reduces I/O, but it increases the duration of a checkpoint, which flushes the dirty pages to disk (see the sketch below). The same is true for the "max lazywrite IO" parameter: lowering its value directly affects the I/O volume generated by the lazywriter thread. Make sure you have enabled the Windows NT DISKPERF setting by issuing "diskperf -Y" from a command prompt and have restarted the system, so that the disk performance information is valid. For additional information, please see the following article in the Microsoft Knowledge Base:

115050 INF: When to Use Tempdb in RAM
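
If you decide to adjust "max async IO", the following is a minimal sketch; the value shown is simply the SQL Server 6.5 default of 8, not a recommendation, and the right number depends on your disk subsystem:

    -- example only: limit the outstanding asynchronous I/O requests that a
    -- checkpoint can issue (8 is the SQL Server 6.5 default)
    sp_configure 'max async IO', 8
    go
    reconfigure with override
    go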


Processor:%User Time

This counter corresponds to the percentage of time the processor spends executing user-mode processes such as SQL Server.

A high value is indicative of CPU-intensive operations, such as sorting and processing aggregate functions. Consider better indexing, simpler joins, or horizontal partitioning of large tables.

Physical Disk:Avg.Disk Queue Length

This shows the average number of both reads and writes queued for the selected disk. The number of waiting I/O requests should be sustained at no more than 1.5 to 2 times the number of spindles making up the physical disk.

If this counter is less than the configured "max async IO", then increasing "max async IO" might improve performance. Consider adding more drives: most disks have one spindle, while RAID arrays usually have more but appear as a single physical disk in Performance Monitor, and they can help improve I/O efficiency. Also consider using faster drives.

SQLServer:Cache - Number of Free Buffers

This shows the instantaneous size of the free buffer page pool. The lazywriter process wakes up more often when needed to flush dirty pages from the data cache and keep the number of free buffers at the specified level. The "max lazywrite IO" parameter also directly affects the rate at which the lazywriter scans used pages and returns them to the free page pool.

If free buffer consumption is fairly constant, this value might be best left at the default level or lowered below the default. Increase the "max lazywrite IO" value if this counter falls below the configured "free buffers" value.
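
The configured level referred to above is the "free buffers" option. As a quick sketch, you can display its current value before deciding whether to change it ("free buffers" is an advanced option, so it is hidden from the default sp_configure listing):

    -- display the minimum, maximum, configured, and run values for this option
    sp_configure 'free buffers'
    go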

SQLServer:Cache Hit Ratio

This counter shows the percentage of time a requested data page was found in the cache instead of being read from disk. The higher the percentage, the more efficient the process.

If this value is consistently under 80%, consider increasing SQL Server memory or reducing the memory used by some of the other SQL Server structures. Some applications will never achieve good cache hits, and adding more memory makes no difference for them. Please note that SQL Server: Cache Hit Ratio is a cumulative percentage calculated from the time SQL Server is started; after a while the counter becomes less useful, because the early hits and misses continue to weight the value up or down.
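
If you do decide to give SQL Server more memory, the "memory" option is the relevant setting. A minimal sketch, assuming the SQL Server 6.5 convention that the value is in 2-KB pages (16384 pages, about 32 MB, is an illustration only):

    -- example only: raise the memory allocated to SQL Server (units are 2-KB pages)
    sp_configure 'memory', 16384
    go
    reconfigure with override
    go
    -- the new memory setting is used the next time SQL Server starts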

SQLServer: I/O - Batch Writes/sec

This counter reflects the number of SQL Server pages written to disk per second by using batch I/O. The normal graph is periodic, with the instants corresponding to checkpoints showing high Batch Writes/sec values and near-zero values otherwise.

If the spikes are frequent and of short duration (implying short checkpoints), consider increasing the "recovery interval" and decreasing "max async IO".
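
A minimal sketch of lengthening the recovery interval; the value is in minutes, the SQL Server 6.5 default is 5, and 10 is shown purely as an example:

    -- example only: allow up to 10 minutes of recovery work per database,
    -- which spaces the automatic checkpoints further apart
    sp_configure 'recovery interval', 10
    go
    reconfigure with override
    go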

SQLServer: I/O - Lazy writes/sec

This shows the number of used SQL Server pages per second that are flushed and freed by the lazywriter process. The desired graph is fairly constant, implying that the lazywriter works evenly over time rather than in spurts that bottleneck the disk subsystem.

If this activity is not fairly constant, consider increasing or decreasing "max lazywrite IO" (see the sketch below). For additional information, please see the following article in the Microsoft Knowledge Base:

112539 INF: Max Async IO Configuration Parameter
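
A minimal sketch of adjusting this parameter; the value shown is illustrative only, and the right setting should come from watching this counter while the system is under a representative load:

    -- example only: limit the outstanding asynchronous I/O requests
    -- that the lazywriter thread can issue
    sp_configure 'max lazywrite IO', 8
    go
    reconfigure with override
    go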


SQLServer: I/O - Trans. Per Log Record

The number of transactions that were packed into a log record before the log record was written to disk.

"logwrite sleep" parameter can directly affect the number of log records that are packed into a log buffer (a temporary storage for the transaction log records) before issuing one transaction log write. Log write increases this value to pack more log records into one transaction. Typically, applications issue transactions that are short and commit frequently. More packing improves the throughput. However, in the case where transactions are long in duration, this value should be kept low to cause less packing of every committed log record to increase the system throughput.

SQLServer:RA Pages Fetched into Cache/sec

This counter counts the pages that are prefetched into cache by the Read Ahead Manager. Queries that perform table scans or return large result sets in a sequential fashion cause this counter to show high values. High values indicate that the Read Ahead Manager is being used effectively.

You can control the efficiency of the Read Ahead strategy of parallel reads by tuning the following configuration options:

  • RA cache hit limit
  • RA cache miss limit
  • RA delay
  • RA worker threads
  • RA slots per thread

Make sure that you take enough readings every time you change any of these parameters, and note the corresponding system performance. Keep in mind that in OLTP environments the Read Ahead strategy might not yield a major benefit, but it might improve performance significantly for an OLAP system with large sequential data retrievals. Also keep in mind that heavy Read Ahead scans can displace the existing buffer pool pages with your data, so choose these parameter values judiciously.
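
All of these are sp_configure advanced options on SQL Server 6.5. As a sketch, each can be displayed (and changed by supplying a second argument) individually, for example:

    -- display the current Read Ahead settings before experimenting with them
    sp_configure 'RA worker threads'
    go
    sp_configure 'RA slots per thread'
    go
    sp_configure 'RA delay'
    go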

SQLServer:RA Pages Found in Cache/sec

This counter counts the pages requested by the Read Ahead Manager that were already in cache. If it remains high, it is likely that Read Ahead is not being used effectively. A high value could be caused by misconfigured Read Ahead parameters, or it might mean that your application cannot make good use of Read Ahead, as in OLTP environments.

If the Read Ahead Manager finds the pages in the buffer pool most of the time, this counter might report high readings and Read Ahead might not be very effective. Check whether "RA cache hit limit" has been configured to an excessively high value; if it has, you could be missing a potential benefit of Read Ahead.

SQLServer:RA Physical Reads/sec

This counts the physical reads (each consisting of a single extent) issued by the Read Ahead Manager.

"RA pre-fetches" could directly affect the number of Read Ahead Physical Reads performed per second. Do not set this to a very high value because you might displace the existing buffer pool of the most recently-used pages by the other threads, thus leading to bad performance on other queries. Experiment with a broad range of queries to arrive at a good value for "RA pre-fetches".

SQLServer-Locks - Exclusive/Shared & Users Blocked

These counters are useful in determining the number of users blocked due to table locks.

If no users are blocked, leave the Lock Escalation (LE) parameters as they are. Please note that if users are blocked by page locks, changing the Lock Escalation parameters will not help; instead, you might consider row-level locking, which is available only for inserts (IRL), to improve concurrency, by enabling IRL for a table with sp_tableoption (as sketched below). However, if users are blocked by table locks, consider tuning the Lock Escalation parameters. "LE threshold maximum" prevents premature table locks on large tables and is primarily useful for batch updates while activity is occurring in other parts of the table. "LE threshold minimum" prevents premature escalation on very small tables, such as a three-page, very active lookup table. "LE threshold percent" determines whether table locks are acquired at "LE threshold maximum," at "LE threshold minimum," or at some defined percentage of the table's pages. For additional information, please see the following articles in the Microsoft Knowledge Base:

151116 INF: How Tunable Lock Escalation Works
162361 INF: Understanding and Resolving SQL Server Blocking Problems
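
As a sketch, insert row locking is enabled per table with sp_tableoption, and the LE thresholds are ordinary sp_configure options; the table name and values below are examples only:

    -- example only: enable insert row locking (IRL) for one table
    -- (authors is the sample table from the pubs database)
    sp_tableoption 'authors', 'insert row lock', 'true'
    go

    -- example only: the page-lock count at which escalation to a table lock
    -- is attempted (200 is the SQL Server 6.5 default)
    sp_configure 'LE threshold maximum', 200
    go
    reconfigure with override
    go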


SQLServer-Procedure Cache Used % & Procedure Cache Active %

Each entry in the procedure cache points to one or more buffers in the procedure buffer pool. Procedure Cache Used % displays the percentage of the procedure cache that is currently occupied, though not necessarily active.

If Procedure Cache Used % is very high, a new procedure that is not already in the cache might not be able to execute. Use DBCC MEMUSAGE to monitor the 12 largest objects in the procedure cache, such as stored procedures, triggers, views, rules, and defaults. If multiple copies of an object are in the procedure cache, DBCC MEMUSAGE sums the total memory they use. You might need to tune the "procedure cache" option (30 percent by default) depending on your system's procedure cache usage.
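
A minimal sketch of the two statements mentioned above; the procedure cache percentage shown is illustrative only:

    -- list the largest consumers of the buffer cache and the procedure cache
    dbcc memusage
    go

    -- example only: lower the procedure cache from the default 30 percent
    sp_configure 'procedure cache', 20
    go
    reconfigure with override
    go
    -- the new split between procedure and data cache applies at the next restart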

SQLServer - Log Space Used (%)

This counter displays the percentage of the transaction log that is in use at any given time. When the transaction log is full, logged transactions fail with error message 1105 (log segment full).

Monitor this counter closely over a typical range of activity against your SQL Server database and determine a good transaction log size. It is very common for the transaction log to run out of space during heavy logging, such as when you populate a large table, or when stuck non-distributed transactions take up a lot of log space. You can set up alerts to monitor the database transaction log.
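
The same figure can also be read from within SQL Server, which is convenient for scripted checks; a minimal sketch using DBCC SQLPERF with the LOGSPACE option, which reports log size and the percentage used for each database:

    -- report log size and percentage of log space used for every database
    dbcc sqlperf(logspace)
    go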

Keywords: kbinfo KB171034