
In general, the time delay d is equivalent to a clock pulse, and $T_m \gg d$. Suppose that n instructions are processed with no branches.

  • The total time required $T_k$ to execute all n instructions is:

$T_k = [k + (n-1)]\tau$

  • The speedup factor for the instruction pipeline compared to execution without the pipeline is defined as:

$$S_k = \frac{T_1}{T_k} = \frac{nk\tau}{[k + (n-1)]\tau} = \frac{nk}{k + (n-1)}$$

  • An ideal pipeline divides a task into k independent sequential subtasks

– Each subtask requires 1 time unit to complete

– The task itself then requires k time units to complete. For n iterations of the task, the execution times will be:

– With no pipelining: nk time units

– With pipelining: k + (n-1) time units

Speedup of a k-stage pipeline is thus

S = nk / [k + (n-1)] → k (for large n)
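The limit above can be checked numerically. The following is a minimal sketch, assuming the ideal pipeline just described (one time unit per stage, no stalls); the function names and the choice of k = 6 are illustrative only.

```python
def pipeline_time(k: int, n: int) -> int:
    """Time units to run n tasks through an ideal k-stage pipeline."""
    # The first task needs k units; each later task completes one unit after the previous one.
    return k + (n - 1)


def speedup(k: int, n: int) -> float:
    """Speedup over non-pipelined execution, which takes n * k time units."""
    return (n * k) / pipeline_time(k, n)


if __name__ == "__main__":
    k = 6  # illustrative stage count
    for n in (1, 10, 100, 10_000):
        print(f"n={n:>6}: S = {speedup(k, n):.3f}")
```

For a single task the speedup is 1 (pipelining gives no benefit), and as n grows the value approaches k without reaching it.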

2.2 Pipeline limitations

Several factors limit pipeline performance. If the six stages are not of equal duration, there will be some waiting at various pipeline stages. Another difficulty is the conditional branch instruction, or an unpredictable event such as an interrupt. A further problem is that memory conflicts can occur, so the system must contain logic to account for this type of conflict.

  • Pipeline depth

- Data dependencies also factor into the effective length of pipelines

- Logic to handle memory and register use and to control the overall pipeline increases significantly with increasing pipeline depth

– If the speedup is based on the number of stages, why not build lots of stages?

– Each stage uses latches at its input (output) to buffer the next set of inputs

+ If the stage granularity is reduced too much, the latches and their control become a significant hardware overhead

+ The pipeline also suffers a time overhead from the propagation delay through the latches

- This limits the rate at which data can be clocked through the pipeline
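The latch overhead described above can be illustrated with a simple model. This is a sketch under stated assumptions, not a result from the text: the total logic delay is taken to divide evenly across k stages, each stage adds a fixed latch delay, and the numbers used (60 and 1 time units) are hypothetical.

```python
def clock_period(total_logic_delay: float, k: int, latch_delay: float) -> float:
    """Cycle time when the logic is split evenly into k stages (assumed model)."""
    return total_logic_delay / k + latch_delay


def asymptotic_speedup(total_logic_delay: float, k: int, latch_delay: float) -> float:
    """Large-n speedup: unpipelined time per task divided by the pipelined cycle time."""
    return total_logic_delay / clock_period(total_logic_delay, k, latch_delay)


if __name__ == "__main__":
    total, latch = 60.0, 1.0  # hypothetical delays, in arbitrary time units
    for k in (6, 60, 600, 6000):
        print(f"k={k:>5}: speedup = {asymptotic_speedup(total, k, latch):.2f}")
```

Under this model the speedup can never exceed total_logic_delay / latch_delay, however many stages are added, which is one way to see why building lots of stages stops paying off.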

  • Data dependencies

– Pipelining must ensure that computed results are the same as if the computation were performed in strict sequential order

– With multiple stages, two instructions “in execution” in the pipeline may have data dependencies, so the pipeline must be designed to prevent such dependencies from producing incorrect results.

– Data dependency examples:

A = B + C

D = E + A

C = G x H

A = D / H

Data dependencies limit when an instruction can be input to the pipeline.
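The dependencies among the four statements above can be found mechanically. The sketch below compares every pair of statements and labels each conflict with the commonly used names read-after-write (RAW), write-after-read (WAR), and write-after-write (WAW); these labels and the helper name find_dependencies are not from the text and are introduced here for illustration. The check is purely pairwise and ignores any intervening writes.

```python
from itertools import combinations

# (destination, sources) for the four example statements, in program order.
PROGRAM = [
    ("A", {"B", "C"}),  # A = B + C
    ("D", {"E", "A"}),  # D = E + A
    ("C", {"G", "H"}),  # C = G x H
    ("A", {"D", "H"}),  # A = D / H
]


def find_dependencies(program):
    """Return (earlier, later, kind, variable) tuples using 1-based statement numbers."""
    found = []
    numbered = enumerate(program, start=1)
    for (i, (dst_i, src_i)), (j, (dst_j, src_j)) in combinations(numbered, 2):
        if dst_i in src_j:   # later statement reads what the earlier one writes
            found.append((i, j, "RAW", dst_i))
        if dst_j in src_i:   # later statement overwrites what the earlier one reads
            found.append((i, j, "WAR", dst_j))
        if dst_i == dst_j:   # both statements write the same variable
            found.append((i, j, "WAW", dst_i))
    return found


if __name__ == "__main__":
    for earlier, later, kind, var in find_dependencies(PROGRAM):
        print(f"statement {later} depends on statement {earlier}: {kind} on {var}")
```

For example, statement 2 reads A, which statement 1 writes, so statement 2 cannot read its operands until statement 1 has produced its result.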

  • Branching

One of the major problems in designing an instruction pipeline is assuring a steady flow of instructions to the initial stages of the pipeline. However, 15-20% of instructions in an assembly-level stream are (conditional) branches. Of these, 60-70% take the branch to a target address. Until the instruction is actually executed, it is impossible to determine whether the branch will be taken or not.

- The impact of branching is that the pipeline never really operates at its full capacity.

– The average time to complete a pipelined instruction becomes

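$T_{ave} = (1 - p_b)(1) + p_b[p_t(1 + b) + (1 - p_t)(1)]$

where $p_b$ is the probability that an instruction is a branch, $p_t$ is the probability that a branch is taken, and b is the branch penalty in clock cycles.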

– A number of techniques can be used to minimize the impact of the branch instruction (the branch penalty).

- Several approaches have been taken for dealing with conditional branches:

+ Multiple streams

+ Prefetch branch target

+ Loop buffer
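To get a feel for the size of the branch penalty, the average-time formula given earlier in this section can be evaluated directly. This is a minimal sketch, assuming p_b is the fraction of instructions that are branches, p_t is the fraction of branches that are taken, and b is the penalty in cycles for a taken branch; the value b = 3 is hypothetical, while 0.2 and 0.65 fall within the ranges quoted above.

```python
def average_cycles(p_b: float, p_t: float, b: float) -> float:
    """Average cycles per instruction: (1-p_b)*1 + p_b*[p_t*(1+b) + (1-p_t)*1]."""
    non_branch = (1 - p_b) * 1                      # non-branch instructions take 1 cycle
    branch = p_b * (p_t * (1 + b) + (1 - p_t) * 1)  # taken branches pay the penalty b
    return non_branch + branch


if __name__ == "__main__":
    p_b, p_t, b = 0.2, 0.65, 3  # b = 3 is a hypothetical penalty
    print(f"T_ave = {average_cycles(p_b, p_t, b):.2f} cycles")
```

Expanding the formula shows that it reduces to 1 + p_b * p_t * b, so with these numbers each instruction takes 1.39 cycles on average instead of 1.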





Source:  OpenStax, Computer architecture. OpenStax CNX. Jul 29, 2009 Download for free at http://cnx.org/content/col10761/1.1