Microsoft KB Archive/828296

= Description of the ZTEST function in Excel =

Article ID: 828296

Article Last Modified on 1/18/2007

-

APPLIES TO


 * Microsoft Office Excel 2007
 * Microsoft Office Excel 2003

-



SUMMARY
This article describes the ZTEST function in Microsoft Office Excel 2003 and in later versions of Excel. This article discusses how the function is used and compares the results of ZTEST in Excel 2003 and in later versions of Excel with the results of ZTEST in earlier versions of Excel.



MORE INFORMATION
The ZTEST(array, µ0, sigma) function is used for hypothesis testing when observations in the sample are assumed to come from a normal distribution with a known standard deviation, sigma. If the null hypothesis is such that the mean of this normal distribution is µ0, then for a sample of size n, under this null hypothesis, the sample mean has a normal distribution with mean µ0 and standard deviation sigma/SQRT(n). ZTEST returns the probability that a typical random variable with this distribution will take on a value higher than the observed sample mean corresponding to the sample in array.

If sigma is omitted, then ZTEST will use STDEV(array) in its place.

Syntax
ZTEST(array, µ0, sigma)

Note The array contains numeric data, µ0 is a real number and sigma (if included) is a positive number. If sigma is omitted, STDEV(array) is used. In this case, the data in the array must be such that STDEV(array) is positive. This is a reasonably innocuous supposition that is satisfied unless each entry in the array has the same value.

Example of usage
To test how the ZTEST function works, assume that intelligence quotient (IQ) scores follow a normal distribution with a standard deviation of 15, and that you review the IQs for a sample of nine students in your local school.

To illustrate this, create a blank Excel worksheet, and copy the following table. Select cell A1 in your blank Excel worksheet, and then paste the entries so that the table fills cells A1:D20 in your worksheet.

After you paste the table in your new Excel worksheet, click the Paste Options button, and then click Match Destination Formatting. With the pasted range still selected, use one of the following procedures, as appropriate for the version of Excel that you are running:
 * In Microsoft Office Excel 2007, click the Home tab, click Format in the Cells group, and then click AutoFit Column Width.
 * In Excel 2003, point to Column on the Format menu, and then click AutoFit Selection.

The IQ data for the nine students are given in A2:A10, µ0 is in B2, and sigma is in C2. First, assuming that you know (and will therefore use) the value of sigma, the value returned by ZTEST for the one-sided (or one-tailed) test is in cell A14. The average IQ value, shown in A13 is 108.333.

ZTEST confirms that if the true mean of the underlying normal distribution from which all nine student IQs were drawn is 100 and the true standard deviation is 15, a sample mean that is higher than 108.333 would occur with probability 0.0478. Therefore, if before drawing the sample you had established a significance level of 0.05 for this one-tailed test, we would reject the null hypothesis in favor of the one-sided alternative that the true mean of the underlying distribution from which our sample was drawn is greater than 100.

Cell A15 shows how ZTEST is computed when you assume that sigma is known. If you call ZTEST(array, µ0, sigma), ZTEST returns

1 - NORMSDIST((AVERAGE(array) - µ0)/(sigma/SQRT(n)))

where  is the number of observations in the sample.

Cells A17:A20 provide results for the case where sigma is assumed to be not known (and the optional third argument to ZTEST is omitted). In this case, Excel calculates STDEV(array) shown in A18, and it uses this quantity instead of sigma. Cell A19 shows the results of ZTEST with the third argument omitted. Notice that the probability under the null hypothesis of a higher value of the sample mean is about one fourth of the value in the case where sigma is known. This occurs because the sample standard deviation was 11.18, lower than 15. Dividing these quantities by SQRT(sample size) = SQRT(9) = 3 gives 3.73 and 5, respectively. The observed sample mean is 8.333 units above the hypothesized µ0 = 100 and 8.333/3.73 is about 2.23, while 8.333/5 is 1.67. The probability that a standard normal random variable exceeds 2.23 is about one-fourth the probability that it exceeds 1.67.

ZTEST is designed to give a one-tailed result, the probability that a sample mean higher than observed would occur under the null hypothesis. ZTEST can be adapted to give a two-tailed result, the probability that a sample mean further from µ0 in either direction than the observed sample mean would occur under the null hypothesis. To answer this question, you can use the following formula involving ZTEST when sigma is assumed known:

= 2 * MIN(ZTEST(array, µ0, sigma); 1 - ZTEST(array, µ0 , sigma))

You can still do a two-tailed test when sigma is assumed unknown by using:

= 2 * MIN(ZTEST(array, µ0); 1 - ZTEST(array, µ0))

Cells D14 and D19 show these two results.

Results in earlier versions of Excel
Code for ZTEST was not changed for Excel 2003 and for later versions of Excel. However, because ZTEST effectively calls NORMSDIST when it computes its result, the accuracy of ZTEST relies primarily on the accuracy of NORMSDIST.

The accuracy of NORMSDIST has been improved in Excel 2003 and in later versions of Excel. In earlier versions of Excel, a single computational procedure was used for all values of z. Results were essentially accurate to seven decimal places. This is more than sufficient for most practical examples.

For more information about NORMSDIST, click the following article number to view the article in the Microsoft Knowledge Base:

827369 Excel statistical functions: NORMSDIST

Results in Excel 2003 and in later versions of Excel
The procedure for NORMSDIST in Excel 2003 and in later versions of Excel uses two different computational procedures, depending on the value of z. The first is for z between -5 and +5. The second is for z values in the extreme left or right tails, below -5 or above +5. Accuracy was improved for all values because, over the range of z values where each was used, these two methods were both superior to the single method used in previous versions of Excel. Typical accuracy is now 14 to 15 decimal places.

Conclusions
There are rare occasions in which you might demand accuracy in ZTEST that is better than seven decimal places. On such occasions, the version of NORMSDIST in Excel 2003 and in later versions of Excel will cause ZTEST to give superior performance. For all other computations involving ZTEST, you should not notice a difference between later versions of Excel and earlier versions of Excel.

When you conduct a hypothesis test, you should establish a significance level in advance of the test. Typical significance levels are 0.05, 0.01, or 0.001. After the data is gathered, you can run ZTEST to determine whether the null hypothesis should be rejected because the value of ZTEST is below the significance level cutoff.

It is hard to imagine that more than seven decimal places of accuracy in ZTEST would be required for this determination unless you started with an extremely small significance level, such as 0.00000001.

Keywords: kbexpertisebeginner kbformula kbfuncstat kbfunctions kbinfo KB828296

-

[mailto:TECHNET@MICROSOFT.COM Send feedback to Microsoft]

© Microsoft Corporation. All rights reserved.