# 2.2 Module 3: computer arithmetic


So, in 8-bit two's complement: -(-128) = -128.

This happens because +128 is outside the range of signed 8-bit numbers (-128 to +127), so the negation overflows back to -128.
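The wrap-around can be checked with a minimal sketch of 8-bit negation (invert the bits, add one, keep 8 bits). The helper name `negate8` is an illustrative assumption, not part of the module.

```python
def negate8(x: int) -> int:
    """Negate an 8-bit two's complement value: invert bits, add 1, keep 8 bits."""
    bits = x & 0xFF                 # 8-bit pattern of x
    result = ((~bits) + 1) & 0xFF   # invert, add one, discard any carry out
    # Reinterpret the 8-bit pattern as a signed value in [-128, 127].
    return result - 256 if result & 0x80 else result


print(negate8(5))     # -5
print(negate8(-128))  # -128, because +128 does not fit in 8 signed bits
```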

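Addition and subtraction are done using the following steps:

• Monitor the sign bit for overflow: if two operands with the same sign produce a result with the opposite sign, the result has overflowed
• For subtraction, take the two's complement of the subtrahend and add it to the minuend, i.e. a - b = a + (-b)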
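As a rough illustration of these steps, the sketch below adds and subtracts 8-bit values and reports overflow by checking sign bits. The function names and the 8-bit width are assumptions chosen for the example; inputs are expected to lie in -128..127.

```python
def to_signed8(bits: int) -> int:
    """Interpret an 8-bit pattern as a two's complement value."""
    return bits - 256 if bits & 0x80 else bits


def add8(a: int, b: int):
    """Return (result, overflow) for 8-bit two's complement addition."""
    result = to_signed8(((a & 0xFF) + (b & 0xFF)) & 0xFF)
    # Overflow: operands share a sign but the result's sign differs.
    overflow = (a < 0) == (b < 0) and (result < 0) != (a < 0)
    return result, overflow


def sub8(a: int, b: int):
    """Return (result, overflow) for a - b, computed as a + (-b)."""
    neg_b = ((~b) + 1) & 0xFF  # two's complement of the subtrahend
    result = to_signed8(((a & 0xFF) + neg_b) & 0xFF)
    # Overflow: operands differ in sign and the result's sign differs from a.
    overflow = (a < 0) != (b < 0) and (result < 0) != (a < 0)
    return result, overflow


print(add8(100, 100))  # (-56, True): 200 is out of range
print(sub8(5, 3))      # (2, False)
```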

## 3.3 multiplying positive numbers:

Multiplication is done using the following steps:

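• Work out the partial product for each digit of the multiplier
• Take care with place value (column): each partial product is shifted left according to the position of its digit
• Add the partial products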
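A minimal shift-and-add sketch of this procedure for non-negative integers follows. The function name is hypothetical, and Python's unbounded integers stand in for a double-width product register.

```python
def multiply_unsigned(multiplicand: int, multiplier: int) -> int:
    """Multiply two non-negative integers by summing shifted partial products."""
    if multiplicand < 0 or multiplier < 0:
        raise ValueError("this sketch handles non-negative operands only")
    product = 0
    shift = 0  # place value (column) of the current multiplier bit
    while multiplier:
        if multiplier & 1:
            # Partial product: the multiplicand shifted to the bit's column.
            product += multiplicand << shift
        multiplier >>= 1
        shift += 1
    return product


print(multiply_unsigned(0b1011, 0b1101))  # 143 (11 x 13)
```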

## Solution 1:

• Convert to positive if required
• Multiply as above
• If signs were different, negate answer
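Solution 1 might be sketched as below: record whether the signs differ, multiply the magnitudes with shift-and-add, and negate at the end. The function name is an assumption made for this example.

```python
def multiply_signed(a: int, b: int) -> int:
    """Signed multiply: convert to positive, multiply, fix the sign."""
    negative = (a < 0) != (b < 0)  # signs differ, so the answer is negative
    x, y = abs(a), abs(b)          # convert to positive if required
    product = 0
    while y:                       # shift-and-add on the magnitudes
        if y & 1:
            product += x
        x <<= 1
        y >>= 1
    return -product if negative else product


print(multiply_signed(-7, 3))   # -21
print(multiply_signed(-7, -3))  # 21
```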

## Solution 2:

• Booth’s algorithm:

Example of Booth’s Algorithm:
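One way to trace the algorithm is the following minimal sketch, which assumes the usual register formulation (A, Q, Q-1, M, each n bits wide, with an arithmetic right shift per step). The register names and the default width of 4 bits are assumptions for illustration. The most negative multiplicand is rejected, because negating it overflows an n-bit A register in this simple version.

```python
def booth_multiply(multiplicand: int, multiplier: int, n: int = 4) -> int:
    """Multiply two n-bit two's complement integers with Booth's algorithm."""
    low, high = -(1 << (n - 1)), (1 << (n - 1)) - 1
    if not (low < multiplicand <= high and low <= multiplier <= high):
        raise ValueError("operand out of range for this n-bit sketch")
    mask = (1 << n) - 1
    m = multiplicand & mask  # M register
    a = 0                    # A register (upper half of the product)
    q = multiplier & mask    # Q register (lower half of the product)
    q_1 = 0                  # extra bit to the right of Q
    for _ in range(n):
        pair = (q & 1, q_1)
        if pair == (1, 0):
            a = (a - m) & mask  # start of a run of 1s: subtract M
        elif pair == (0, 1):
            a = (a + m) & mask  # end of a run of 1s: add M
        # Arithmetic shift right of the combined A, Q, Q-1 registers.
        q_1 = q & 1
        q = ((q >> 1) | ((a & 1) << (n - 1))) & mask
        sign = a >> (n - 1)
        a = (a >> 1) | (sign << (n - 1))
    product = (a << n) | q
    # Interpret the 2n-bit result as a signed value.
    if product & (1 << (2 * n - 1)):
        product -= 1 << (2 * n)
    return product


print(booth_multiply(7, 3))   # 21
print(booth_multiply(-3, 7))  # -21
```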

## 3.5 division:

• More complex than multiplication
• Negative numbers are really bad!
• Based on long division
• (For more detail, refer to Computer Organization and Architecture by William Stallings)
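For unsigned operands only, binary long division can be sketched as follows: bring down one dividend bit at a time and subtract the divisor whenever it fits. The function name and the default 8-bit width are assumptions, and signed division is deliberately left out.

```python
def divide_unsigned(dividend: int, divisor: int, n: int = 8):
    """Binary long division of an n-bit unsigned dividend.

    Returns (quotient, remainder).
    """
    if divisor == 0:
        raise ZeroDivisionError("division by zero")
    if not 0 <= dividend < (1 << n) or divisor < 0:
        raise ValueError("operands must be unsigned and fit in n bits")
    quotient = 0
    remainder = 0
    for i in range(n - 1, -1, -1):
        # Bring down the next dividend bit, most significant first.
        remainder = (remainder << 1) | ((dividend >> i) & 1)
        if remainder >= divisor:
            remainder -= divisor  # the divisor fits: quotient bit is 1
            quotient |= 1 << i
    return quotient, remainder


print(divide_unsigned(0b10010011, 0b1011))  # (13, 4): 147 = 11 * 13 + 4
```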

## 4.1 principles

We can represent a real number in the form

$±S×{B}^{±E}$

This number can be stored in a binary word with three fields:

• Sign: plus or minus
• Significand: S
• Exponent: E.

(A fixed value, called the bias, is subtracted from the biased exponent field to get the true exponent value E. Typically, the bias equals ${2}^{k-1}-1$, where k is the number of bits in the binary exponent.)

• The base B is implicit and need not be stored because it is the same for all numbers.
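The bias rule can be sketched in a few lines. The helper names are illustrative assumptions, and no range checking is done on the exponent.

```python
def bias(k: int) -> int:
    """Typical bias for a k-bit exponent field: 2^(k-1) - 1."""
    return 2 ** (k - 1) - 1


def encode_exponent(e: int, k: int) -> int:
    """Stored (biased) exponent field for a true exponent e."""
    return e + bias(k)


def decode_exponent(field: int, k: int) -> int:
    """True exponent recovered from a stored (biased) exponent field."""
    return field - bias(k)


print(bias(8), bias(11))       # 127 1023
print(encode_exponent(-3, 8))  # 124
```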

## 4.2 ieee standard for binary floating-point representation

The most important floating-point representation is defined in IEEE Standard 754 [EEE8]. This standard was developed to facilitate the portability of programs from one processor to another and to encourage the development of sophisticated, numerically oriented programs. The standard has been widely adopted and is used on virtually all contemporary processors and arithmetic coprocessors.

The IEEE standard defines both a 32-bit (single-precision) and a 64-bit (double-precision) format, with 8-bit and 11-bit exponents, respectively. Binary floating-point numbers are stored in a form where the MSB is the sign bit, "exponent" is the biased exponent, and "fraction" is the significand without its most significant bit. The implied base (B) is 2.

Not all bit patterns in the IEEE formats are interpreted in the usual way; instead, some bit patterns are used to represent special values. Three special cases arise:

1. if exponent is 0 and fraction is 0, the number is ±0 (depending on the sign bit),
2. if exponent = ${2}^{e}-1$ (all ones, where e here is the number of exponent bits) and fraction is 0, the number is ±infinity (again depending on the sign bit), and
3. if exponent = ${2}^{e}-1$ and fraction is not 0, the value being represented is not a number (NaN).
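The three cases can be checked with a short sketch. `fields32` and `classify` are hypothetical helper names; the standard `struct` module is used to obtain the single-precision bit pattern of a Python float. Patterns outside the three listed cases are simply labelled "other".

```python
import struct


def fields32(x: float):
    """Split the single-precision encoding of x into (sign, exponent, fraction)."""
    (word,) = struct.unpack(">I", struct.pack(">f", x))
    return (word >> 31) & 1, (word >> 23) & 0xFF, word & 0x7FFFFF


def classify(exponent: int, fraction: int, exp_bits: int = 8) -> str:
    """Apply the three special cases to the stored exponent and fraction."""
    all_ones = (1 << exp_bits) - 1
    if exponent == 0 and fraction == 0:
        return "zero"
    if exponent == all_ones:
        return "infinity" if fraction == 0 else "NaN"
    return "other"  # not one of the three special cases listed above


for value in (0.0, -0.0, float("inf"), float("nan"), 0.15625):
    sign, exponent, fraction = fields32(value)
    print(value, sign, exponent, classify(exponent, fraction))
```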

This can be summarized as:

Single-precision 32 bit

A single-precision binary floating-point number is stored in 32 bits.

The number has value v:

v = s × ${2}^{e}$ × m

Where

s = +1 (positive numbers) when the sign bit is 0

s = −1 (negative numbers) when the sign bit is 1

e = Exp − 127 (in other words the exponent is stored with 127 added to it, also called "biased with 127")

m = 1.fraction in binary (that is, the significand is the binary number 1 followed by the radix point followed by the binary bits of the fraction). Therefore, 1 ≤ m<2.

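In the example with sign bit 0, exponent field 01111100 and fraction 0100...0 (23 bits):

s = +1 (the sign bit is 0)

e = 01111100(2) − 127 = 124 − 127 = −3

m = 1.01 (in binary, which is 1.25 in decimal).

The represented number is: +1.25 × ${2}^{-3}$ = +0.15625.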
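The same decoding can be sketched in code, following v = s × 2^e × m. The function name `decode_single` is an assumption, and the sketch covers only normalized numbers (exponent field neither all zeros nor all ones).

```python
def decode_single(word: int) -> float:
    """Decode a normalized IEEE 754 single-precision bit pattern."""
    sign_bit = (word >> 31) & 1
    exp_field = (word >> 23) & 0xFF
    fraction = word & 0x7FFFFF
    if exp_field in (0, 0xFF):
        raise ValueError("zero, denormalized, infinity and NaN are not handled")
    s = -1 if sign_bit else 1
    e = exp_field - 127        # remove the bias
    m = 1 + fraction / 2**23   # implicit leading 1, then the fraction bits
    return s * m * 2.0**e


# Sign 0, exponent 01111100, fraction 0100...0
word = 0b0_01111100_01000000000000000000000
print(decode_single(word))  # 0.15625
```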

## 5. floating-point arithmetic

The basic operations for two floating-point numbers $\mathrm{X1}=\mathrm{M1}\ast {R}^{\mathrm{E1}}$ and $\mathrm{X2}=\mathrm{M2}\ast {R}^{\mathrm{E2}}$ are:

• $\mathrm{X1}±\mathrm{X2}=\left(\mathrm{M1}\ast {R}^{\mathrm{E1}-\mathrm{E2}}±\mathrm{M2}\right)\ast {R}^{\mathrm{E2}}$ (assume E1 ≤ E2)
• $\mathrm{X1}\ast \mathrm{X2}=\left(\mathrm{M1}\ast \mathrm{M2}\right){R}^{\mathrm{E1}+\mathrm{E2}}$
• $\mathrm{X1}/\mathrm{X2}=\left(\mathrm{M1}/\mathrm{M2}\right){R}^{\mathrm{E1}-\mathrm{E2}}$

For addition and subtraction, it is necessary to ensure that both operands have the same exponent value. This may require shifting the radix point on one of the operands to achieve alignment. Multiplication and division are more straightforward.
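These rules can be tried out on (significand, exponent) pairs with a toy sketch. The function names are assumptions, Python floats stand in for the significands, and the results are left unnormalized and unrounded.

```python
def fp_add(m1: float, e1: int, m2: float, e2: int, r: int = 2):
    """Add M1*R^E1 and M2*R^E2; returns an unnormalized (M, E) pair."""
    if e1 > e2:
        # Make (m1, e1) the operand with the smaller exponent.
        m1, e1, m2, e2 = m2, e2, m1, e1
    aligned = m1 * r ** (e1 - e2)  # shift the smaller operand to exponent E2
    return aligned + m2, e2


def fp_mul(m1: float, e1: int, m2: float, e2: int):
    """Multiply: multiply the significands, add the exponents."""
    return m1 * m2, e1 + e2


def fp_div(m1: float, e1: int, m2: float, e2: int):
    """Divide: divide the significands, subtract the exponents."""
    return m1 / m2, e1 - e2


# 1.5 * 2^2 = 6 and 1.25 * 2^4 = 20
print(fp_add(1.5, 2, 1.25, 4))  # (1.625, 4), i.e. 26
print(fp_mul(1.5, 2, 1.25, 4))  # (1.875, 6), i.e. 120
```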

A floating-point operation may produce one of these conditions:

• Exponent overflow: A positive exponent exceeds the maximum possible exponent value. In some systems, this may be designated as +∞ or −∞.
• Exponent underflow: A negative exponent is less than the minimum possible exponent value (e.g., −200 is less than −127). This means that the number is too small to be represented, and it may be reported as 0.
• Significand underflow: In the process of aligning significands, digits may flow off the right end of the significand. Some form of rounding is required.
• Significand overflow: The addition of two significands of the same sign may result in a carry out of the most significant bit. This can be fixed by realignment.
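These conditions can be observed with ordinary Python floats. The sketch assumes they are IEEE 754 double-precision values, which holds on mainstream platforms, so the limits shown are those of the 64-bit format rather than the 32-bit one.

```python
import math

# Exponent overflow: the result exceeds the largest finite double.
largest = 1.7976931348623157e308
print(largest * 10)        # inf

# Exponent underflow: the result is too small and is reported as 0.
smallest = 5e-324          # smallest positive double
print(smallest / 2)        # 0.0

# Significand underflow: during alignment the 1 falls off the right end.
big = 2.0 ** 53
print(big + 1.0 == big)    # True

# Significand overflow: 1.1b + 1.1b carries out of the top bit, so the
# result is realigned with the exponent increased by one.
print(math.frexp(1.5))     # (0.75, 1)
print(math.frexp(1.5 + 1.5))  # (0.75, 2)
```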