
In the 2's complement fractional representation, an $N$-bit binary word can represent $2^N$ equally spaced numbers from $\frac{-2^{N-1}}{2^{N-1}} = -1$ to $\frac{2^{N-1}-1}{2^{N-1}} = 1 - 2^{-(N-1)}$.

For example, we interpret an 8-bit binary word $b_7 b_6 b_5 b_4 b_3 b_2 b_1 b_0$ as the fractional number

$$x = \frac{-b_7 2^7 + b_6 2^6 + \cdots + b_1 2 + b_0}{2^7} = -b_7 + \sum_{i=0}^{6} 2^{i-7} b_i \in [-1, 1 - 2^{-7}]$$
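The formula above can be checked numerically. The following is a minimal sketch in Python; the function name `q_to_float` and its parameters are illustrative assumptions, not part of the original module.

```python
def q_to_float(word: int, n_bits: int = 8) -> float:
    """Interpret an n_bits-wide two's complement word as a Q-(n_bits-1) fraction."""
    if not 0 <= word < (1 << n_bits):
        raise ValueError("word does not fit in n_bits")
    # The MSB carries weight -1.
    sign_bit = (word >> (n_bits - 1)) & 1
    # The lower n_bits-1 bits carry the positive weights 2^(i-(n_bits-1)).
    fraction_bits = word & ((1 << (n_bits - 1)) - 1)
    return -sign_bit + fraction_bits / (1 << (n_bits - 1))


print(q_to_float(0b01000000))  # 0.5
print(q_to_float(0b11000000))  # -0.5
print(q_to_float(0b10000000))  # -1.0
print(q_to_float(0b01111111))  # 0.9921875
```

The last two calls show the two ends of the 8-bit range, $-1$ and $1 - 2^{-7}$.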

This representation is also referred to as Q-format. We can think of it as having an implied binary point right after the MSB. If we have an $N$-bit binary word with the MSB as the sign bit, we have $N-1$ bits to represent the fraction. We say the number has Q-$(N-1)$ format. For example, in the example above, $x$ is a Q-7 number. On the C6211, it is easiest to handle Q-15 numbers, each represented by a 16-bit binary word, because the multiplication of two Q-15 numbers results in a Q-30 number that can still be stored in a 32-bit wide register of the C6211. The programmer needs to keep track of the implied binary point when manipulating Q-format numbers.
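To illustrate how the implied binary point moves during multiplication, here is a small Python sketch. The helper names `float_to_q15` and `q15_mul` are hypothetical, and Python integers stand in for the 16-bit and 32-bit registers; this is not C6211 code.

```python
Q15_ONE = 1 << 15  # scale factor 2^15


def float_to_q15(x: float) -> int:
    """Quantize x in [-1, 1) to a signed Q-15 integer, rounding to nearest."""
    q = round(x * Q15_ONE)
    if not -Q15_ONE <= q < Q15_ONE:
        raise ValueError("value outside the Q-15 range")
    return q


def q15_mul(a: int, b: int) -> int:
    """Multiply two Q-15 integers and return a Q-15 result (low bits dropped)."""
    product_q30 = a * b       # 15 + 15 fractional bits: a Q-30 number
    return product_q30 >> 15  # discard 15 fractional bits to get back to Q-15


a = float_to_q15(0.5)   # 16384
b = float_to_q15(0.25)  # 8192
p = q15_mul(a, b)       # 4096
print(p / Q15_ONE)      # 0.125
```

The shift by 15 is where the programmer "keeps track" of the binary point: the raw product has 30 fractional bits, and the shift restores 15. Note that $(-1) \times (-1) = +1$ is the one product that does not fit back into Q-15.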

(Q format): What are the decimal fractional numbers corresponding to the Q-7 format binary numbers 01001101, 11100100, 01111001, and 10001011?

Intentionally left blank.

Two's complement arithmetic

The convenience of the 2's complement format comes from the ability to represent negative numbers and to compute subtraction using the same algorithm as binary addition. The C62x processor has instructions to add, subtract, and multiply numbers in the 2's complement format. Because the Q-15 format is the easiest to implement on C62x processors for most digital signal processing algorithms, we focus only on arithmetic operations on Q-15 numbers in the following.
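As an illustration of subtraction reusing the addition algorithm, the sketch below negates the subtrahend (invert all bits, add one) and then adds, all on 16-bit patterns. The helper names are assumptions made for this example, and Python is used only to emulate 16-bit words.

```python
MASK16 = 0xFFFF


def to_signed16(word: int) -> int:
    """Reinterpret a 16-bit pattern as a signed two's complement integer."""
    word &= MASK16
    return word - 0x10000 if word & 0x8000 else word


def negate16(word: int) -> int:
    """Two's complement negation: invert all bits, then add one."""
    return (~word + 1) & MASK16


def sub16(a: int, b: int) -> int:
    """Compute a - b using only addition on 16-bit patterns."""
    return (a + negate16(b)) & MASK16


a = 0x4000  # 0.5 in Q-15
b = 0x6000  # 0.75 in Q-15
diff = sub16(a, b)
print(hex(diff))                  # 0xe000
print(to_signed16(diff) / 32768)  # -0.25
```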

Addition and subtraction

The addition of two binary numbers is computed in the same way as we compute the sum of two decimal numbers. Using the relations $0+0=0$, $0+1=1+0=1$, and $1+1=10$, we can easily compute the sum of two binary numbers. The C62x instruction ADD performs this binary addition on different operands.
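The column-by-column procedure implied by these relations can be sketched as follows. This is an illustrative Python function (the name `add_bits` is hypothetical), not a description of how the ADD instruction is implemented in hardware.

```python
def add_bits(a: str, b: str) -> str:
    """Add two equal-length binary strings, discarding the final carry."""
    if len(a) != len(b):
        raise ValueError("operands must have the same number of bits")
    carry = 0
    out = []
    # Work from the least significant bit (rightmost) to the MSB.
    for bit_a, bit_b in zip(reversed(a), reversed(b)):
        total = int(bit_a) + int(bit_b) + carry  # 0, 1, 2 or 3
        out.append(str(total & 1))               # sum bit for this column
        carry = total >> 1                       # carry into the next column
    return "".join(reversed(out))


print(add_bits("00101100", "00010110"))  # 01000010
```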

However, care must be taken when adding binary numbers. Because each Q-15 number can represent numbers in the range $[-1, 1 - 2^{-15}]$, if the result of summing two Q-15 numbers is not in this range, we cannot represent the result in the Q-15 format. When this happens, we say an overflow has occurred. Unless carefully handled, the overflow makes the result incorrect. Therefore, it is important to prevent overflows from occurring when implementing DSP algorithms. One way of avoiding overflow is to scale all the numbers down by a constant factor, making them small enough that any summation gives a result in the $[-1, 1)$ range. Because scaling reduces the effective number of digits and thereby increases quantization error, we usually need to find the minimum amount of scaling that prevents overflow.
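A small numeric sketch of both effects, assuming 16-bit wraparound addition emulated in Python (the helper names `wrap16` and `q15` are illustrative): adding 0.75 and 0.5 overflows, while scaling both operands by 1/2 first gives a representable result equal to half the true sum.

```python
def wrap16(value: int) -> int:
    """Keep the low 16 bits and reinterpret them as a signed integer."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def q15(x: float) -> int:
    """Convert a fraction to its Q-15 integer representation."""
    return round(x * 32768)


a, b = q15(0.75), q15(0.5)  # 24576, 16384
wrapped = wrap16(a + b)     # the true sum 1.25 is outside the Q-15 range
print(wrapped / 32768)      # -0.75

# Scale both operands by 1/2 (shift right by one bit) before adding.
scaled = wrap16((a >> 1) + (b >> 1))
print(scaled / 32768)       # 0.625
```

The scaled result 0.625 is the true sum 1.25 divided by 2, at the cost of one bit of precision in each operand.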

Another way of handling overflow (and underflow) is saturation. If the result is outside the range that can be represented in the given data size, the value is saturated, meaning that the representable value closest to the true result is taken. Instructions such as SADD and SSUB perform the operation followed by saturation.
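The idea of saturation can be sketched as a clamp applied after the addition. The function `sat_add16` below is a hypothetical 16-bit illustration in Python; it is not a model of the SADD instruction, whose exact operand widths are not stated here.

```python
Q15_MAX = 32767   # 1 - 2^-15 in Q-15
Q15_MIN = -32768  # -1 in Q-15


def sat_add16(a: int, b: int) -> int:
    """Add two signed 16-bit integers and clamp the sum to the 16-bit range."""
    return max(Q15_MIN, min(Q15_MAX, a + b))


print(sat_add16(24576, 16384) / 32768)    # 0.999969482421875
print(sat_add16(-24576, -16384) / 32768)  # -1.0
print(sat_add16(8192, 4096) / 32768)      # 0.375
```

With saturation, 0.75 + 0.5 yields the largest representable value instead of wrapping around to a negative number, so the error stays bounded.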

Source:  OpenStax, Finite impulse response. OpenStax CNX. Feb 16, 2004 Download for free at http://cnx.org/content/col10226/1.1