# Re: floating point mantissa?

From: Nick (SeaNICK_at_Noemail.nospam)
Date: 12/18/04

Date: Sat, 18 Dec 2004 15:54:12 -0800

> That implementation is fully standards conforming with respect to the
> precision of long-double: it's not required to actually have more
> precision than double.
So why was long double 80 bits before, and 64 bits now? Was it more than
compliant in the days when it compiled to 80-bit numbers, or compliant with a
different standard?
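
A quick way to see what a particular compiler actually does with long double is to ask std::numeric_limits. This is only a minimal sketch, and the two function names are made up for illustration. A result of 64 mantissa bits would indicate the 80-bit x87 extended format, while 53 means long double has the same precision as double.

```cpp
#include <limits>

// Number of mantissa (significand) bits in long double, counting any
// implicit leading bit. 53 matches IEEE double; 64 matches the 80-bit
// x87 extended format.
constexpr int long_double_mantissa_bits() {
    return std::numeric_limits<long double>::digits;
}

// True only if long double carries more precision than double on this
// compiler. The standard allows the two to be identical.
constexpr bool long_double_is_wider_than_double() {
    return std::numeric_limits<long double>::digits >
           std::numeric_limits<double>::digits;
}
```

Printing both values, together with sizeof(long double), from a small test program shows at a glance which format the compiler picked.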

Incidentally, I have begun to think I will go with the proprietary struct
mentioned in my original mail.
struct LargerDouble
{
    // 96-bit mantissa held as three 32-bit words, since a bit-field cannot
    // usefully be wider than its type. No assumed leading 1, so the value
    // doesn't have to be denormalized in order to support smaller numbers
    // with more accuracy.
    unsigned mantissa[3];
    // Biased exponent: the value scales by 2^(exponent - 1073741823). Way too
    // large, but it might be interesting to see how far I could zoom.
    unsigned exponent : 31;
    unsigned sign : 1; // 0 = (+), 1 = (-)
};
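
One way to sanity-check values stored in a layout like this is to convert them to an ordinary double and compare against known results. The sketch below is not a definitive design and rests on several assumptions that are not in the struct above: the mantissa is held as three 32-bit words with the most significant word first, the binary point sits right after the top mantissa bit, and the exponent bias is 1073741823. The names LargerDoubleSketch, kBias and to_double are hypothetical.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical unpacked form of the struct, for experimenting only.
struct LargerDoubleSketch {
    std::uint32_t mantissa[3]; // mantissa[0] is the most significant word
    std::uint32_t exponent;    // biased exponent, only 31 bits used
    bool negative;             // false = (+), true = (-)
};

// Assumed bias, taken from the comment on the exponent field (2^30 - 1).
const std::int64_t kBias = 1073741823;

// Approximate the stored value as a double. Precision beyond 53 bits is
// lost and large exponents overflow or underflow, so this is only useful
// for checking small test values.
double to_double(const LargerDoubleSketch& v) {
    // Treat the 96 bits as one integer m = w0*2^64 + w1*2^32 + w2.
    double m = std::ldexp(static_cast<double>(v.mantissa[0]), 64)
             + std::ldexp(static_cast<double>(v.mantissa[1]), 32)
             + static_cast<double>(v.mantissa[2]);
    // Assumed binary point after bit 95, so divide m by 2^95 as well.
    std::int64_t e = static_cast<std::int64_t>(v.exponent) - kBias - 95;
    double r = std::ldexp(m, static_cast<int>(e));
    return v.negative ? -r : r;
}
```

Under these assumptions a mantissa with only the top bit set and an exponent equal to the bias represents 1.0.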
The difficult pieces now appear to be:
1. Quick verification that the number is not one of the predefined special
numbers.
Probably easily solved, maybe with just a switch statement on the hiword
and/or loword values, doing the majority of the work in the default
section. Not sure whether this will be very optimized or not.
2. Optimized normalization and denormalization methods.
What's the quickest way to find the highest set bit in a value? My initial
attempts come to something like 3 instructions per comparison, and that makes
every normalization a 146 CPU cycle operation: 48 comparisons for 96
possible values, one instruction to subtract the magnitude detected from the
magnitude required, and one to shift the value by the number of bits given by
the result of that subtraction. I obviously don't know assembly yet, but
IMHO 146 cycles just to normalize a single floating point number
seems extremely slow. How many CPU cycles does a regular
floating point normalization take?
3. Addition and subtraction operators seem easy: just use the larger
exponent value of the two operands. But what about multiplication and
division?
4. There are some values that floating point numbers have a very hard time
with. I understand that, for instance, due to the leading one, 0 is not
possible. So if I kept the leading one instead of assuming it, would that
actually make the floating point number work for me, or are there still
difficulties with odd numbers, etc.? What about an alternate method to make
those work, like binary coded decimal? I imagine I would probably be
throwing any concept of performance away at that point, though.
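
On the normalization question in point 2, a binary search for the highest set bit needs far fewer comparisons than testing bit positions one after another. The following is only a sketch under stated assumptions: the 96-bit mantissa is stored as three 32-bit words with the most significant word first, and the names Mantissa96, highest_bit32, highest_bit96, shift_left96 and normalize96 are all hypothetical. x86 also has a bsr instruction that finds the highest set bit in hardware, which would replace highest_bit32 entirely.

```cpp
#include <cstdint>

// Hypothetical 96-bit mantissa: w[0] is the most significant word.
struct Mantissa96 {
    std::uint32_t w[3];
};

// Index (0..31) of the highest set bit of a nonzero 32-bit word, found by
// binary search: five comparisons instead of one per bit position.
int highest_bit32(std::uint32_t x) {
    int pos = 0;
    if (x >> 16) { x >>= 16; pos += 16; }
    if (x >> 8)  { x >>= 8;  pos += 8; }
    if (x >> 4)  { x >>= 4;  pos += 4; }
    if (x >> 2)  { x >>= 2;  pos += 2; }
    if (x >> 1)  { pos += 1; }
    return pos;
}

// Index (0..95) of the highest set bit, or -1 if the mantissa is zero.
int highest_bit96(const Mantissa96& m) {
    if (m.w[0]) return 64 + highest_bit32(m.w[0]);
    if (m.w[1]) return 32 + highest_bit32(m.w[1]);
    if (m.w[2]) return highest_bit32(m.w[2]);
    return -1;
}

// Shift left by n bits (0 <= n < 96); bits shifted past bit 95 are lost.
void shift_left96(Mantissa96& m, int n) {
    for (; n >= 32; n -= 32) {        // whole-word moves first
        m.w[0] = m.w[1];
        m.w[1] = m.w[2];
        m.w[2] = 0;
    }
    if (n > 0) {                      // then the remaining 1..31 bits
        m.w[0] = (m.w[0] << n) | (m.w[1] >> (32 - n));
        m.w[1] = (m.w[1] << n) | (m.w[2] >> (32 - n));
        m.w[2] <<= n;
    }
}

// Shift the mantissa so that bit 95 is set and return the shift count,
// which the caller would subtract from the exponent. A zero mantissa is
// left unchanged and reports a shift of 0.
int normalize96(Mantissa96& m) {
    int top = highest_bit96(m);
    if (top < 0) return 0;
    int shift = 95 - top;
    shift_left96(m, shift);
    return shift;
}
```

With this approach the search costs at most three word tests plus five comparisons, rather than 48 comparisons, before the single subtract and shift.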

I appreciate all your responses so far, and would be grateful for any
further suggestions you might have.
Thanks
NICK

"Carl Daniel [VC++ MVP]" <cpdaniel_remove_this_and_nospam@mvps.org.nospam>
wrote in message news:%23u3giss4EHA.2180@TK2MSFTNGP10.phx.gbl...
> "Nick" <SeaNICK@Noemail.nospam> wrote in message
>> how do I force Visual C++ .NET 2003 to use long doubles as they are
>> originally intended to be used?
>
> Under VC++ the 'long double' type is identical to the 'double' type and
> there's nothing you can do to change that. That implementation is fully
> standards conforming with respect to the precision of long-double: it's
> not required to actually have more precision than double.
>
> -cd
>
>
