Microsoft KB Archive/12297

= PRB: Rounding Error Casting Double to Long =

Article ID: 12297

Article Last Modified on 12/1/2003

-

APPLIES TO


 * Microsoft C Professional Development System 5.1
 * Microsoft C Professional Development System 6.0
 * Microsoft C Professional Development System 6.0a
 * Microsoft C/C++ Professional Development System 7.0
 * Microsoft Visual C++ 1.0 Professional Edition
 * Microsoft Visual C++ 1.5 Professional Edition
 * Microsoft Visual C++ 1.51
 * Microsoft Visual C++ 1.52 Professional Edition

-



This article was previously published under Q12297





SYMPTOMS
An application produces incorrect results when it casts an expression of type double to type long.



CAUSE
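The error is caused by differences in arithmetic precision. The application converts an intermediate result held as a 10-byte real directly to a long, whereas a result that is first stored in an 8-byte double is rounded before the conversion.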



STATUS
The C and C++ standards specify that when an application converts a floating-point number to an integer, the number is truncated toward zero. When rounding is the preferred behavior, the application can add or subtract 0.5 as appropriate so truncation provides the correct result. The text below presents a macro for this purpose.



MORE INFORMATION
The following code example demonstrates this behavior when compiled with any floating-point math option other than /FPa. The application displays -4049 as the value for Long2 and Long4, which is incorrect. The application displays the correct value, -4050, for Long1 and Long3.

Sample Code
    /*
     * Compile options needed: /FPc or /FPc87 or /FPi or /FPi87
     */

    #include <stdio.h>

    int main(void)
    {
        long    val1, val2, val3;
        double  mul1, mul2;

        val1 = 45000;
        mul1 = 0.09;

        /* Storing the product in a double rounds it to an 8-byte real. */
        mul2 = (double)val1 * mul1 * -1.00;
        printf("%7ld Long1 ", (long)mul2);
        val2 = (long)mul2;

        /* Casting the expression converts the 10-byte intermediate directly. */
        printf("%7ld Long2 ", (long)((double)val1 * mul1 * -1.00));
        printf("%7ld Long3 ", val2);
        val3 = (long)((double)val1 * mul1 * -1.00);
        printf("%ld Long4 \n", val3);
        return 0;
    }

The application produces the incorrect results by converting a 10-byte real value to a long; the application produces correct results by converting a 10-byte real to an 8-byte real and converting that value to a long.

According to the type conversion rules, the conversion from a 10-byte real to an 8-byte real rounds the number from -4049.99999999999985 to -4050.0. When the application converts this value to a long, the value -4050 results. However, when the application converts the 10-byte real value directly to a long, the conversion truncates toward zero. In this example, -4049.99999999999985 becomes -4049.

Many numbers (such as .01) are repeating fractions in the binary numbering system which cannot be represented exactly. Any representation of these numbers is slightly more or less than the "true" value. When a calculation involves one of these values, the representation error propagates and can be magnified. Because the error is present only in the least significant part of the number, errors occur only when a calculation loses precision in an intermediate value.

The conversions always truncate toward zero. The following macro effectively rounds the number by increasing the magnitude of the number by 0.5 and then converting the number to an integer:

    #define ROUNDL( d ) ((long)((d) + ((d) > 0 ? 0.5 : -0.5)))

The problem does not occur with the Microsoft Visual C++ 32-bit Editions because Win32 does not support the long double data type for compatibility reasons.

For more information on converting floating-point numbers to integers, see the type conversions section of the "C Language Reference" manual.

Additional query words: 1.00 1.50 5.10 6.00 6.00a 6.00ax 7.00

Keywords: kb16bitonly kbprb KB12297

-


© Microsoft Corporation. All rights reserved.