Microsoft KB Archive/828795

From BetaArchive Wiki

Article ID: 828795

Article Last Modified on 1/16/2007



APPLIES TO

  • Microsoft Office Excel 2007
  • Microsoft Office Excel 2003



SUMMARY

This article describes the modified algorithm that is used in the random number generator function, RAND, in Microsoft Office Excel 2007 and in Microsoft Office Excel 2003.

MORE INFORMATION

The RAND function in earlier versions of Excel used a pseudo-random number generation algorithm whose performance on standard tests of randomness was not sufficient. Although this is likely to affect only those users who have to make a large number of calls to RAND, such as a million or more, and not to be a concern for almost every user, the pseudo-random number generation algorithm that is described here was first implemented for Excel 2003. It passes the same battery of standard tests.

The battery of tests is named Diehard (see note 1). The algorithm that is implemented in Excel 2003 was developed by B.A. Wichman and I.D. Hill (see note 2 and note 3). This random number generator is also used in the RAT-STATS software package that is provided by the Office of the Inspector General, U.S. Department of Health and Human Services. It has been shown by Rotz et al (see note 4) to pass the DIEHARD tests and additional tests developed by the National Institute of Standards and Technology (NIST, formerly National Bureau of Standards).

Notes

  • The tests were developed by Professor George Marsaglia, Department of Statistics, Florida State University and are available at the following Web site:
  • Wichman, B.A. and I.D. Hill, Algorithm AS 183: An Efficient and Portable Pseudo-Random Number Generator, Applied Statistics, 31, 188-190, 1982.
  • Wichman, B.A. and I.D. Hill, Building a Random-Number Generator, BYTE, pp. 127-128, March 1987.
  • Rotz, W. and E. Falk, D. Wood, and J. Mulrow, A Comparison of Random Number Generators Used in Business, presented at Joint Statistical Meetings, Atlanta, GA, 2001.

The basic idea is that if you take three random numbers on [0,1] and sum them, the fractional part of the sum is itself a random number on [0,1]. The critical statements in the Fortran code listing from the original Wichman and Hill article are:

C   IX, IY, IZ SHOULD BE SET TO INTEGER VALUES BETWEEN 1 AND 30000 BEFORE FIRST ENTRY

IX = MOD(171 * IX, 30269)

IY = MOD(172 * IY, 30307)

IZ = MOD(170 * IZ, 30323)

RANDOM = AMOD(FLOAT(IX) / 30269.0 + FLOAT(IY) / 30307.0 + FLOAT(IZ) / 30323.0, 1.0)

Therefore IX, IY, IZ generate integers between 0 and 30268, 0 and 30306, and 0 and 30322 respectively. These are combined in the last statement to implement the simple principle that was expressed earlier: if you take three random numbers on [0,1] and sum them, the fractional part of the sum is itself a random number on [0,1].

Because RAND produces pseudo-random numbers, if a long sequence of them is produced, eventually the sequence will repeat itself. Combining random numbers as in the Wichman-Hill procedure guarantees that more than 10^13 numbers will be generated before the repetition begins. Several of the Diehard tests produced unsatisfactory results with earlier versions of RAND because the cycle before numbers started repeating was unacceptably short.

Results in Earlier Versions of Excel

The RAND function in earlier versions of Excel was fine in practice for users who did not require a lengthy sequence of random numbers (such as a million). It failed several standard tests of randomness, making its performance an issue when a lengthy sequence of random numbers was needed.

Results in Excel 2003

A simple and effective algorithm has been implemented. The new generator passes all standard tests of randomness.

Conclusions

Power users of RAND who require lengthy sequences of random numbers are better off with the new generator of Excel 2003. Other users should be undeterred from using RAND in earlier versions of Excel.

REFERENCES

For more information about an issue that was documented to occur in RAND, click the following article number to view the article in the Microsoft Knowledge Base:

834520 The RAND function returns negative numbers in Excel 2003


Keywords: kbfuncstat kbfunctions kbinfo KB828795