
Although the first LDW instruction has not yet loaded the A4 register when the ADD executes, the D1 functional unit becomes available in the clock cycle right after the one in which the LDW is issued.

To clarify the execution of instructions with delay slots, consider the following example of the LDW instruction. Assume A10 = 0x0100 and A2 = 1, and that your intent is to load A9 with the 32-bit word at the address 0x0104. The three MV instructions are unrelated to the LDW instruction; they do something else.

    1    LDW .D1  *A10++[A2], A9
    2    MV  .L1  A10, A8
    3    MV  .L1  A1, A10
    4    MV  .L1  A1, A2
    5    ...

We can ask several interesting questions at this point:

  1. What value is loaded into A8? That is, in which clock cycle is the address pointer updated?
  2. Can we load the address offset register A2 before the LDW instruction finishes the actual load?
  3. Is it legal to load A10 before the first LDW finishes loading the memory content into A9? That is, can we change the address pointer before the 4 delay slots elapse?

Here are the answers:

  1. Although it takes an extra 4 clock cycles for the LDW instruction to load the memory content into A9, the address pointer and offset registers (A10 and A2) are read and updated in the clock cycle in which the LDW instruction is issued. Therefore, in line 2, A8 is loaded with the updated A10, that is, A10 = A8 = 0x0104.
  2. Yes. Because the LDW reads the A10 and A2 registers in the first clock cycle, you are free to change these registers afterwards without affecting the operation of the first LDW.
  3. Yes, for the same reason as in answer 2: A10 has already been read and updated in the first clock cycle, so it can be overwritten before the 4 delay slots elapse.
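
The timing described in these answers can be checked with a small register-level model. The Python sketch below is not a C6x simulator. The `run` helper, the placeholder loaded word `0xCAFE`, and the initial value `A1 = 0x0200` are assumptions made purely for illustration. It models only what the answers state: LDW reads and updates its address registers in the issue cycle and writes A9 four delay slots later.

```python
LDW_DELAY_SLOTS = 4  # from the text: the loaded word arrives 4 cycles after issue


def run(loaded_word=0xCAFE):
    """Step through the example, one instruction per clock cycle.

    loaded_word stands in for whatever memory returns (an assumption).
    Returns one register snapshot per cycle, taken at the end of the cycle.
    """
    # A1 = 0x0200 is an arbitrary illustrative value, not from the text.
    regs = {"A1": 0x0200, "A2": 1, "A8": 0, "A9": 0, "A10": 0x0100}
    pending = {}   # cycle number -> (register, value) written at end of cycle
    history = []

    def ldw(cycle):
        # The address registers are read and updated in the issue cycle.
        regs["A10"] += 4 * regs["A2"]   # word offset scaled to bytes
        pending[cycle + LDW_DELAY_SLOTS] = ("A9", loaded_word)

    def mv(src, dst):
        def execute(cycle):
            regs[dst] = regs[src]
        return execute

    def nop(cycle):
        pass

    # Lines 1 to 4 of the example; line 5 ("...") is modeled as a NOP.
    program = [ldw, mv("A10", "A8"), mv("A1", "A10"), mv("A1", "A2"), nop]
    for cycle, instruction in enumerate(program, start=1):
        instruction(cycle)
        if cycle in pending:
            reg, value = pending.pop(cycle)
            regs[reg] = value
        history.append(dict(regs))
    return history


if __name__ == "__main__":
    for cycle, snapshot in enumerate(run(), start=1):
        print(cycle, {name: hex(value) for name, value in snapshot.items()})
```

In this model A8 picks up 0x104 in cycle 2, while A9 keeps its old value through cycle 4 and changes only at the end of cycle 5, even though A10 and A2 were overwritten in cycles 3 and 4.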

A similar principle holds for the MPY and B (when using a register as a branch address) instructions. The MPY reads its source values in the first clock cycle and writes the multiplication result after the 2nd clock cycle. For B, the address pointer is read in the first clock cycle, and the actual branch occurs after the 5 delay slots have elapsed. Thus, after the first clock cycle, you are free to modify the source or the address pointer registers. For more details, refer to Table 3-5 in the instruction set description or read the description of the individual instruction.
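
As a rough aid, the delay-slot counts quoted in this section can be collected in a lookup table. The Python sketch below uses a hypothetical helper, `first_use_cycle`, to compute the earliest cycle in which a following instruction can rely on the result (for B, the cycle in which the branch target starts executing). It assumes one instruction issues per cycle and ignores every other aspect of the pipeline.

```python
# Delay slots as stated in the text.
DELAY_SLOTS = {"ADD": 0, "SUB": 0, "MPY": 1, "LDW": 4, "B": 5}


def first_use_cycle(mnemonic, issue_cycle):
    """Earliest cycle in which the effect of `mnemonic` is visible.

    Source operands are read in `issue_cycle` itself, so they may be
    overwritten from `issue_cycle + 1` onward without changing the result.
    """
    return issue_cycle + DELAY_SLOTS[mnemonic] + 1


if __name__ == "__main__":
    for name in ("ADD", "MPY", "LDW", "B"):
        print(name, first_use_cycle(name, issue_cycle=1))
```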

Addition, subtraction and multiplication

There are several instructions for addition, subtraction and multiplication on the C6x CPU. The basic instructions are ADD, SUB, and MPY. ADD and SUB have 0 delay slots (meaning the result of the operation is available in the very next clock cycle), but MPY has 1 delay slot (the result of the multiplication is valid only after 1 additional clock cycle).

(Add, subtract, and multiply): Write an assembly program to compute (0000 ef35h + 0000 33dch - 0000 1234h) * 0000 0007h.
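
To check the result of your assembly program, the expected value can be computed on a host machine. The Python sketch below does that. It also models a 16 x 16 signed multiply of the low halfwords, which is how TI's instruction set reference describes the basic MPY; this is an assumption here, since the text above does not spell it out. The helper names `to_signed16` and `mpy16` are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def to_signed16(value):
    """Interpret the low 16 bits of value as a signed halfword."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def mpy16(src1, src2):
    """Signed 16 x 16 -> 32 multiply of the low halfwords (assumed MPY model)."""
    return (to_signed16(src1) * to_signed16(src2)) & MASK32


# Full 32-bit arithmetic: the value the exercise asks for.
intermediate = (0x0000EF35 + 0x000033DC - 0x00001234) & MASK32
expected = (intermediate * 0x00000007) & MASK32

# What a single low-halfword multiply would give for the same operands.
low_halfword_product = mpy16(intermediate, 0x00000007)

if __name__ == "__main__":
    print(hex(intermediate), hex(expected), hex(low_halfword_product))
```

Under these assumptions the intermediate sum is 0x110DD, which does not fit in 16 bits, so a single low-halfword multiply does not reproduce the full product. Keep this in mind when choosing the multiply instructions for your solution.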

Branching and conditional operations

Often you need to control the flow of the program execution by branching to another block of code. The B instruction does the job in the C6x CPU. The address of the branch can be specified either by a displacement or by a register used by the B instruction. The B instruction has 5 delay slots, meaning that the branch actually takes effect only after the 5 clock cycles that follow the one in which the instruction is issued.
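
One consequence of the 5 delay slots is that the five instructions following a B still execute before control reaches the branch target. The Python sketch below illustrates this with a toy model. The `trace` function, the `"B <label>"` string format, and the sample program are assumptions for illustration only; the model issues one instruction per cycle and supports at most one branch in flight.

```python
BRANCH_DELAY_SLOTS = 5  # from the text


def trace(program, labels, max_steps=32):
    """Return the instructions in the order this toy model executes them."""
    executed = []
    pc = 0
    countdown = None   # delay slots left before the pending branch is taken
    target = None
    while pc < len(program) and len(executed) < max_steps:
        instruction = program[pc]
        executed.append(instruction)
        pc += 1
        if countdown is not None:
            countdown -= 1
            if countdown == 0:          # last delay slot has just executed
                pc, countdown = target, None
        if instruction.startswith("B "):
            countdown = BRANCH_DELAY_SLOTS
            target = labels[instruction.split()[1]]
    return executed


program = [
    "B done",           # index 0: branch issued here
    "ADD A1, A2, A3",   # delay slot 1
    "SUB A3, A4, A5",   # delay slot 2
    "MV A5, A6",        # delay slot 3
    "NOP",              # delay slot 4
    "NOP",              # delay slot 5
    "MVK 0, A6",        # skipped: the branch is taken before this executes
    "MV A6, A7",        # index 7: branch target "done"
]
labels = {"done": 7}
order = trace(program, labels)

if __name__ == "__main__":
    for step, instruction in enumerate(order, start=1):
        print(step, instruction)
```

In this model the ADD, SUB, and MV in the first three delay slots do useful work. When nothing useful can be scheduled there, the slots are filled with NOPs instead.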

Source:  OpenStax, Dsp lab with ti c6x dsp and c6713 dsk. OpenStax CNX. Feb 18, 2013 Download for free at http://cnx.org/content/col11264/1.6