Friday, 2 September 2011

Floating Point Numbers

          Floating-point notation can be used conveniently to represent both large as well as small fractional
or mixed numbers. This makes the process of arithmetic operations on these numbers relatively much
easier. Floating-point representation greatly increases the range of numbers, from the smallest to the
largest, that can be represented using a given number of digits. Floating-point numbers are in general
expressed in the form
N = m×be (1.1)
where m is the fractional part, called the significant or mantissa, e is the integer part, called the
exponent, and b is the base of the number system or numeration. Fractional part m is a p-digit number
of the form (±d.dddd dd), with each digit d being an integer between 0 and b – 1 inclusive. If the
leading digit of m is nonzero, then the number is said to be normalized.
Equation (1.1) in the case of decimal, hexadecimal and binary number systems will be written as
follows:
Decimal system
N = m×10e (1.2)
Number Systems 13
Hexadecimal system
N = m×16e (1.3)
Binary system
N = m×2e (1.4)
For example, decimal numbers 0.0003754 and 3754 will be represented in floating-point notation
as 3.754 × 10−4 and 3.754 × 103 respectively. A hex number 257.ABF will be represented as
2.57ABF × 162. In the case of normalized binary numbers, the leading digit, which is the most
significant bit, is always ‘1’ and thus does not need to be stored explicitly.
Also, while expressing a given mixed binary number as a floating-point number, the radix point is
so shifted as to have the most significant bit immediately to the right of the radix point as a ‘1’. Both
the mantissa and the exponent can have a positive or a negative value.
The mixed binary number (110.1011)2 will be represented in floating-point notation as .1101011
× 23 = .1101011e+0011. Here, .1101011 is the mantissa and e+0011 implies that the exponent is
+3. As another example, (0.000111)2 will be written as .111e−0011, with .111 being the mantissa
and e−0011 implying an exponent of −3. Also, (−0.00000101)2 may be written as −.101 × 2−5 =
−.101e−0101, where −.101 is the mantissa and e−0101 indicates an exponent of −5. If we wanted
to represent the mantissas using eight bits, then .1101011 and .111 would be represented as .11010110
and .11100000.

No comments:

Post a Comment

Newest