
Floating-Point Formats

The IEEE 754 standard defines four floating-point number formats:

Single-Precision
Double-Precision
Single, Extended-Precision
Double, Extended-Precision

Single-Precision

Single-precision floating-point numbers are 32 bits wide. The first bit (bit 31, the MSB) is the sign bit, the next 8 bits (bits 30-23) hold the exponent, and the remaining 23 bits (bits 22-0) hold the significand. Note that even though only 23 bits are stored for the significand, the precision (p) is actually 24 bits. This trick is possible in a normalized floating-point system with base 2: the leading significand bit of every normalized number is always 1, so it does not need to be stored. The exponent is biased by 127, so that negative exponents can be expressed.
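The field layout described above can be checked with a short Python sketch. The function names decode_single and single_value are illustrative only and not part of any library, and the reconstruction covers normalized numbers only (zeros, subnormals, infinities and NaNs are not handled).

```python
import struct

def decode_single(x):
    """Split a value, rounded to single precision, into its three bit fields."""
    # Pack as a big-endian 32-bit float, then reread the same bytes as an unsigned int.
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31                # bit 31
    exponent = (bits >> 23) & 0xFF   # bits 30-23, stored with a bias of 127
    fraction = bits & 0x7FFFFF       # bits 22-0, the 23 stored significand bits
    return sign, exponent, fraction

def single_value(sign, exponent, fraction):
    """Rebuild the value of a normalized single-precision number from its fields."""
    significand = 1 + fraction / 2**23   # restore the implicit leading 1 (24th bit)
    return (-1) ** sign * significand * 2.0 ** (exponent - 127)

sign, exponent, fraction = decode_single(-6.25)
print(sign, exponent, fraction)                  # 1 129 4718592
print(single_value(sign, exponent, fraction))    # -6.25
```

Here -6.25 is -1.5625 x 2^2, so the stored exponent is 2 + 127 = 129, and the stored fraction encodes only the .5625 part of the significand.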

Double-Precision

Double-precision numbers are 64 bits wide. The MSB (bit 63) is the sign bit. The next 11 bits (bits 62-52) are the exponent, and the rest of the bits (bits 51-0) are for the significand. Again, the precision is actually 53 bits (not 52) because of the same normalization trick.
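As a rough illustration of the 53-bit precision, the Python sketch below splits a double into its fields and then probes the first integer that no longer fits. It assumes that Python's float is an IEEE 754 double, which holds on common platforms; the name decode_double is hypothetical, and the bias of 1023 is the standard value for this format rather than something stated in the text above.

```python
import struct

def decode_double(x):
    """Split a double-precision value into its three bit fields."""
    # Pack as a big-endian 64-bit float, then reread the same bytes as an unsigned int.
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    sign = bits >> 63                  # bit 63
    exponent = (bits >> 52) & 0x7FF    # bits 62-52 (bias of 1023 assumed)
    fraction = bits & ((1 << 52) - 1)  # bits 51-0, the 52 stored significand bits
    return sign, exponent, fraction

print(decode_double(1.0))                  # (0, 1023, 0)

# 2**53 - 1 needs exactly 53 significant bits, so it is stored exactly.
print(float(2**53 - 1) == float(2**53))    # False
# 2**53 + 1 needs 54 significant bits, so it is rounded to 2**53.
print(float(2**53) == float(2**53 + 1))    # True
```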

Extended-Precision

Review

Format   Width     Precision             Exponent     Significand
Single   32 bits   24 bits (23 stored)   bits 30-23   bits 22-0
Double   64 bits   53 bits (52 stored)   bits 62-52   bits 51-0
This article is issued from Wikibooks. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.