Floating Point Number Representation

Floating point numbers represent numbers with fractional parts using a sign bit, a biased exponent, and a mantissa, as defined by the IEEE 754 standard. IEEE 754 defines 32-bit and 64-bit formats that differ in the number of bits used for the biased exponent and the mantissa. A number is normalized by moving the binary point until a single 1 remains to its left; the bits after the point form the mantissa, and the exponent records how far the point moved.

Floating point number representation

As the name implies, floating point numbers are numbers with a
fractional part, whose point can "float" to any position among the
digits. For example, the numbers 5.5, 0.001, and -2,345.6789 are
floating point numbers. Numbers that have no fractional part are
called integers.

To represent floating point numbers, the IEEE 754 floating point number
representation is used. There are two methods, based on the number of
bits used to represent a floating point number.

Method 1: IEEE 754 32-bit floating point number representation
(single precision floating point number representation)

Method 2: IEEE 754 64-bit floating point number representation
(double precision floating point number representation)
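But 1.111001 is not the original number. The original number is
1111.001. Therefore, we also have to record how far the binary point
was moved.
So how?

1111.001 = 1.111001 x 2^3

Mantissa: 111001 (the bits after the binary point)
Exponent: 3 (the binary point moved 3 places to the left)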

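In IEEE 754 32-bit representation,

Biased exponent = exponent + 127
                = 3 + 127
                = 130

We need to convert this into binary now, by dividing by 2 repeatedly
and recording the remainders:

130 / 2 = 65   remainder 0
 65 / 2 = 32   remainder 1
 32 / 2 = 16   remainder 0
 16 / 2 =  8   remainder 0
  8 / 2 =  4   remainder 0
  4 / 2 =  2   remainder 0
  2 / 2 =  1   remainder 0
  1 / 2 =  0   remainder 1

Reading the remainders from bottom to top: 130 = 10000010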
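
The repeated division above can be sketched in Python. This is only an illustration and not part of the original slides; the function name to_binary_by_division is made up for this example.

```python
def to_binary_by_division(n: int) -> str:
    """Convert a non-negative integer to binary by repeated division by 2."""
    if n == 0:
        return "0"
    remainders = []
    while n > 0:
        n, r = divmod(n, 2)        # quotient carries on, remainder is the next bit
        remainders.append(str(r))
    # Remainders are produced least-significant bit first, so read them in reverse.
    return "".join(reversed(remainders))


biased_exponent = 3 + 127          # exponent 3 with the single-precision bias
print(to_binary_by_division(biased_exponent))
```

The result can be cross-checked against Python's built-in format(130, "b").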
Sign: 0
Biased exponent: 10000010
Mantissa: 111001 (padded with 17 zeros on the right to fill 23 bits)
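Example 2: Represent -10.625 using IEEE 754 32-bit representation

Step 1: The number is negative, therefore the sign bit is 1.
Step 2: Convert the number into binary.

Integer part (10), by repeated division by 2:

10 / 2 = 5   remainder 0
 5 / 2 = 2   remainder 1
 2 / 2 = 1   remainder 0
 1 / 2 = 0   remainder 1

Reading the remainders from bottom to top: 10 = 1010

Fractional part (0.625), by repeated multiplication by 2:

0.625 * 2 = 1.25   integer part 1
0.25  * 2 = 0.5    integer part 0
0.5   * 2 = 1.0    integer part 1

Reading the integer parts from top to bottom: 0.625 = .101

10.625 = 1010.101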
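
The fractional-part conversion can be sketched in Python as follows. This is a minimal illustration, assuming the fraction being converted is 0.625 (the value used in the multiplication steps); the function name fraction_to_binary and the max_bits limit are made up for this example. Fraction is used so that the arithmetic is exact.

```python
from fractions import Fraction


def fraction_to_binary(frac, max_bits: int = 23) -> str:
    """Convert a fraction in [0, 1) to binary digits by repeated multiplication by 2."""
    frac = Fraction(frac)
    bits = []
    while frac != 0 and len(bits) < max_bits:
        frac *= 2
        bit = int(frac)            # the integer part (0 or 1) is the next binary digit
        bits.append(str(bit))
        frac -= bit                # keep only the fractional part for the next step
    return "".join(bits)


integer_bits = format(10, "b")                 # integer part: 10 -> 1010
fraction_bits = fraction_to_binary("0.625")    # fractional part: 0.625 -> 101
print(f"{integer_bits}.{fraction_bits}")
```

The max_bits limit matters because many decimal fractions never terminate in binary; the loop stops once enough bits for a 23-bit mantissa have been produced.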
Normalization:

1010.101 = 1.010101 x 2^3

Mantissa: 010101
Exponent: 3

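In IEEE 754 32-bit representation,

Biased exponent = exponent + 127
                = 3 + 127
                = 130

Let's convert this into binary:

130 / 2 = 65   remainder 0
 65 / 2 = 32   remainder 1
 32 / 2 = 16   remainder 0
 16 / 2 =  8   remainder 0
  8 / 2 =  4   remainder 0
  4 / 2 =  2   remainder 0
  2 / 2 =  1   remainder 0
  1 / 2 =  0   remainder 1

Reading the remainders from bottom to top: 130 = 10000010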
Sign: 1
Biased exponent: 10000010
Mantissa: 010101 (padded with 17 zeros on the right to fill 23 bits)
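
The three fields can be checked with Python's standard struct module, which packs a float in IEEE 754 single precision when given the ">f" format. This is a sketch for verification only, assuming the value being encoded is -10.625 (1010.101 in binary, with a negative sign); the function name float32_fields is made up for this example.

```python
import struct


def float32_fields(x: float):
    """Return (sign, biased exponent, mantissa) bit strings of x in single precision."""
    # Pack as a big-endian 32-bit float, then reinterpret the same bytes as an integer.
    (raw,) = struct.unpack(">I", struct.pack(">f", x))
    bits = format(raw, "032b")
    return bits[0], bits[1:9], bits[9:]    # 1 sign bit, 8 exponent bits, 23 mantissa bits


sign, biased_exponent, mantissa = float32_fields(-10.625)
print("Sign:           ", sign)
print("Biased exponent:", biased_exponent)
print("Mantissa:       ", mantissa)
```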
In 64-bit representation, there are only a few differences:

                   32-bit representation      64-bit representation

Total size         32 bits                    64 bits
1st bit            Sign bit                   Sign bit
Biased exponent    Next 8 bits                Next 11 bits
Mantissa           Next 23 bits               Next 52 bits
Biased exponent =  exponent + 127             exponent + 1023

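Let's see the 64-bit representation with an example.

Example 3: Represent -10.625 using IEEE 754 64-bit representation

Step 1: The number is negative, therefore the sign bit is 1.
Step 2: Convert the number into binary.

Integer part (10), by repeated division by 2:

10 / 2 = 5   remainder 0
 5 / 2 = 2   remainder 1
 2 / 2 = 1   remainder 0
 1 / 2 = 0   remainder 1

Reading the remainders from bottom to top: 10 = 1010

Fractional part (0.625), by repeated multiplication by 2:

0.625 * 2 = 1.25   integer part 1
0.25  * 2 = 0.5    integer part 0
0.5   * 2 = 1.0    integer part 1

Reading the integer parts from top to bottom: 0.625 = .101

10.625 = 1010.101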
Normalization:

1010.101 = 1.010101 x 2^3

Mantissa: 010101
Exponent: 3

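In IEEE 754 64-bit representation,

Biased exponent = exponent + 1023
                = 3 + 1023
                = 1026

Let's convert this into binary:

1026 / 2 = 513   remainder 0
 513 / 2 = 256   remainder 1
 256 / 2 = 128   remainder 0
 128 / 2 =  64   remainder 0
  64 / 2 =  32   remainder 0
  32 / 2 =  16   remainder 0
  16 / 2 =   8   remainder 0
   8 / 2 =   4   remainder 0
   4 / 2 =   2   remainder 0
   2 / 2 =   1   remainder 0
   1 / 2 =   0   remainder 1

Reading the remainders from bottom to top: 1026 = 10000000010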
Sign: 1
Biased exponent: 10000000010
Mantissa: 010101 (padded with 46 zeros on the right to fill 52 bits)
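
Since the two formats differ only in field widths and bias, the assembly of the final bit pattern can be sketched with one parameterized helper. This is an illustration under stated assumptions: the helper name assemble and its parameters are made up for this example, it handles only normalized numbers, and the value checked is assumed to be -10.625. The result is compared with what Python's struct module produces for a 64-bit double (">d").

```python
import struct


def assemble(sign_bit: str, exponent: int, mantissa_bits: str,
             exp_width: int, man_width: int, bias: int) -> str:
    """Build the bit pattern from a sign bit, an unbiased exponent, and mantissa bits."""
    biased = format(exponent + bias, f"0{exp_width}b")   # biased exponent, zero-padded on the left
    mantissa = mantissa_bits.ljust(man_width, "0")       # mantissa, zero-padded on the right
    return sign_bit + biased + mantissa


# 64-bit: 11 exponent bits, 52 mantissa bits, bias 1023
bits64 = assemble("1", 3, "010101", exp_width=11, man_width=52, bias=1023)

# Cross-check against the machine representation of -10.625 as a double.
(raw,) = struct.unpack(">Q", struct.pack(">d", -10.625))
print(bits64 == format(raw, "064b"))
```

Calling the same helper with exp_width=8, man_width=23, bias=127 gives the 32-bit pattern.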
Now let's see how to convert a floating point binary
number into decimal.

Ex 1. The following is a binary number represented in 32
bits. Find the number.

Sign bit    Biased exponent    Mantissa
Step 1: Convert the biased exponent into decimal

Biased exponent: 10000010

Place value:  128   64   32   16    8    4    2    1
Bit:            1    0    0    0    0    0    1    0

(128*1) + (2*1) = 130
Step 2: Find the exponent

As this is a 32-bit representation,

Biased exponent = exponent + 127
130 = exponent + 127
Therefore, exponent = 130 - 127
                    = 3

Now let's see what the number is.

In the mantissa, we store only the bits after the binary point,
ignoring the leading 1.

So now let's put that 1 back:

1.010101

Let's de-normalize now. The exponent is 3, which means we have to
move the binary point 3 places to the right:

1010.101

Now convert this into decimal to get the original value:

(8*1) + (2*1) + (0.5*1) + (0.125*1) = 10.625

10.625 is the number represented.
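
The decoding steps can be sketched in Python. This is a minimal illustration for normalized numbers only; the function name decode is made up for this example, and the sign bit "0" passed below is an assumption, chosen because the result above is given as positive.

```python
def decode(sign_bit: str, biased_exponent_bits: str, mantissa_bits: str,
           bias: int = 127) -> float:
    """Decode a normalized IEEE 754 number from its three bit fields."""
    # Steps 1 and 2: biased exponent to decimal, then subtract the bias.
    exponent = int(biased_exponent_bits, 2) - bias
    # Restore the leading 1 that the mantissa does not store.
    significand = 1.0
    for position, bit in enumerate(mantissa_bits, start=1):
        if bit == "1":
            significand += 2.0 ** -position    # place values 1/2, 1/4, 1/8, ...
    # De-normalize: shifting the binary point is multiplying by 2^exponent.
    value = significand * 2.0 ** exponent
    return -value if sign_bit == "1" else value


print(decode("0", "10000010", "010101"))
```

For a 64-bit pattern, the same function can be called with bias=1023 and an 11-bit biased exponent.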
