Microsoft KB Archive/828361

= Excel statistical functions: NEGBINOMDIST =

Article ID: 828361

Article Last Modified on 1/17/2007

-

APPLIES TO


 * Microsoft Office Excel 2007
 * Microsoft Office Excel 2003
 * Microsoft Excel 2004 for Mac

-



SUMMARY
The purpose of this article is to describe the NEGBINOMDIST function in Microsoft Office Excel 2003 and in later versions of Excel, to illustrate how the function is used, and to compare the results of the function in Excel 2003 and in later versions of Excel with the results of the function in earlier versions of Excel.

Microsoft Excel 2004 for Macintosh information
The statistical functions in Excel 2004 for Mac were updated by using the same algorithms that were used to update the statistical functions in Excel 2003 and in later versions of Excel. Any information in this article that describes how a function works or how a function was modified for Excel 2003 and for later versions of Excel also applies to Excel 2004 for Mac.



MORE INFORMATION
Note Number_f is a non-negative integer, number_s is a positive integer, and p is a probability with 0 < p < 1.

The NEGBINOMDIST(number_f, number_s, p) function returns the probability of exactly number_f failures before the number of successes reaches number_s in independent Bernoulli trials, each of which has an associated probability p of success and probability 1-p of failure.

Syntax
NEGBINOMDIST(number_f, number_s, p)

Example of usage
In baseball, you can assume that a &quot;.300 hitter&quot; gets a hit (success) with a probability of 0.300 each time the hitter comes to bat (each trial). You can also assume that successive times at bat are independent Bernoulli trials. The table can be used to find the probability that such a hitter gets exactly 0, 1, 2, ..., or 20 non-hits (failures) before the third success. The entry in cell B26 also reveals that there is about a 1.6 percent chance that the hitter will have 21 or more failures before the third success.

To illustrate the NEGBINOMDIST function, create a blank Excel worksheet, and then copy the following table. Click cell A1 in your blank Excel worksheet, and then paste the entries so that the table fills cells A1:D26 in your worksheet.

After you paste the table into your new Excel worksheet, click Paste Options, and then click Match Destination Formatting. While the pasted range is selected, use one of the following procedures, as appropriate for the version of Excel that you are running:
 * In Microsoft Office Excel 2007, click the Home tab, click Format in the Cells group, and then click AutoFit Column Width.
 * In Microsoft Office Excel 2003, point to Column on the Format menu, and then click AutoFit Selection.

Cells B4:B24 show the probabilities of exactly f failures before the number of successes reaches the value in cell B1 (initially set to 3). With this initial value in cell B1, the most likely number of failures before the third success is 3, but any number of failures between 1 and 10 has at least a 5 percent chance of occurring. Chances of specific numbers of failures that are greater than 10 decline rather rapidly, but there is still about a 1.6 percent chance that the batter will endure 21 or more failures before the third success.

The NEGBINOMDIST Help file provides a formula for computing NEGBINOMDIST (where COMBIN(n, k) returns the number of combinations of size k in a population of size n):

COMBIN(number_f + number_s – 1, number_s – 1) * (probability_s ^ number_s) * ((1 – probability_s) ^ number_f)

Another approach is to recognize that the negative binomial and binomial distributions are closely related. For exactly k failures to occur before the r-th success, the following conditions must be true:
 * Exactly k failures and r-1 successes in the first k+r-1 trials must occur.
 * A success on the k+r-th trial must occur.

You can use the following formula to compute NEGBINOMDIST (where the two terms on the right side correspond to 1 and 2, respectively):

NEGBINOMDIST(number_f, number_s, probability_s) = BINOMDIST(number_s, number_f + number_s – 1, probability_s, false) * probability_s

These terms can safely be multiplied together because the outcome on the k+r-th trial is independent of the outcomes on the previous k+r-1 trials. On the worksheet, column C shows this computation. Notice that values in columns B and C are the same.

There is no cumulative version of NEGBINOMDIST, like there is for the BINOMDIST function. However, you can define a function to return the probability of number_f or fewer failures before the number_s-th success. This value is the same as the probability of number_s or more successes in the first number_f + number_s trials. This value is the complement of the probability of number_s-1 or fewer successes in the first number_f + number_s trials. You can use the following formula to compute the probability of number_f or fewer failures before the number_s-th success:

1 - BINOMDIST(number_s - 1, number_f + number_s, probability_s, TRUE)

The results of this formula are shown in column D of the worksheet.

Results in earlier versions of Excel
Knusel (see the &quot;References&quot; section in this article) documented instances where BINOMDIST does not return a numeric answer and yields #NUM! instead because of a numeric overflow. When numeric answers are returned by BINOMDIST, they are correct. The NEGBINOMDIST function exhibits similar behavior. Therefore, NEGBINOMDIST does not suffer from roundoff problems. However, in some extreme cases, NEGBINOMDIST returns #NUM! when it can compute an answer. Additionally, other functions exhibit this same type of behavior in earlier versions of Excel, including the following functions:
 * CRITBINOM
 * HYPGEOMDIST
 * POISSON


 * 1) NUM! is returned only when the first argument of the COMBIN function, number_f + number_s – 1, is greater than or equal to 1,030. There are no computational problems as long as this argument is less than 1,030. In practice, such high values of number_f and number_s are not likely.

Results in Excel 2003 and in later versions of Excel
Microsoft has implemented a conditional algorithm in Excel 2003 and in later versions of Excel because of the overflow issue and because NEGBINOMDIST is well-behaved when the overflow does not occur. The conditional algorithm uses NEGBINOMDIST code from earlier versions of Excel (the computational formula involving COMBIN) when number_f + number_s – 1 < 1030. When number_f + number_s – 1 >= 1030, Microsoft implemented an alternative plan to use the formula that calls BINOMDIST.

For more information about how this plan avoids calling COMBIN, click the following article number to view the article in the Microsoft Knowledge Base:

827459 Excel statistical functions: BINOMDIST

Conclusions
Inaccuracies in earlier versions of Excel occur only when the NEGBINOMDIST formula results in a call to COMBIN with a first argument that is greater than or equal to 1030. In these cases, NEGBINOMDIST returns #NUM! in earlier versions of Excel. This behavior occurs because COMBIN overflows. This behavior has been corrected in Excel 2003 and in later versions of Excel by using an alternative procedure that calls BINOMDIST.

BINOMDIST, in turn, also suffers from overflow problems in earlier versions of Excel. The article about BINOMDIST provides pseudocode that describes the BINOMDIST function's alternative plan. This plan is invoked whenever the number of trials is greater than or equal to 1,030.

The following functions also exhibit similar behavior in earlier versions of Excel:
 * BINOMDIST
 * CRITBINOM
 * HYPGEOMDIST
 * POISSON

These functions also return either correct numeric results or #NUM! or #DIV/0!. An alternative algorithm that is similar to the one for BINOMDIST is implemented in Excel 2003 and in later versions of Excel to return correct answers in cases where earlier versions return #NUM!.

