<< Chapter < Page Chapter >> Page >

A typical corporation is full of frightening examples of overhead. Say your department has prepared a stack of paperwork to be completed by another department. What do you have to do to transfer that work? First, you have to be sure that your portion is completed; you can’t ask them to take over if the materials they need aren’t ready. Next, you need to package the materials — data, forms, charge numbers, and the like. And finally comes the official transfer. Upon receiving what you sent, the other department has to unpack it, do their job, repackage it, and send it back.

A lot of time gets wasted moving work between departments. Of course, if the overhead is minimal compared to the amount of useful work being done, it won’t be that big a deal. But it might be more efficient for small jobs to stay within one department. The same is true of subroutine and function calls. If you only enter and exit modules once in a relative while, the overhead of saving registers and preparing argument lists won’t be significant. However, if you are repeatedly calling a few small subroutines, the overhead can buoy them to the top of the profile. It might be better if the work stayed where it was, in the calling routine.

Additionally, subroutine calls inhibit compiler flexibility. Given the right opportunity, you’d like your compiler to have the freedom to intermix instructions that aren’t dependent upon each other. These are found on either side of a subroutine call, in the caller and callee. But the opportunity is lost when the compiler can’t peer into subroutines and functions. Instructions that might overlap very nicely have to stay on their respective sides of the artificial fence.

It helps if we illustrate the challenge that subroutine boundaries present with an exaggerated example. The following loop runs very well on a wide range of processors:


DO I=1,N A(I) = A(I) + B(I) * CENDDO

The code below performs the same calculations, but look at what we have done:


DO I=1,N CALL MADD (A(I), B(I), C)ENDDO SUBROUTINE MADD (A,B,C)A = A + B * C RETURNEND

Each iteration calls a subroutine to do a small amount of work that was formerly within the loop. This is a particularly painful example because it involves floating- point calculations. The resulting loss of parallelism, coupled with the procedure call overhead, might produce code that runs 100 times slower. Remember, these operations are pipelined, and it takes a certain amount of “wind-up” time before the throughput reaches one operation per clock cycle. If there are few floating-point operations to perform between subroutine calls, the time spent winding up and winding down pipelines figures prominently.

Subroutine and function calls complicate the compiler’s ability to efficiently man- age COMMON and external variables, delaying until the last possible moment actually storing them in memory. The compiler uses registers to hold the “live” values of many variables. When you make a call, the compiler cannot tell whether the subroutine will be changing variables that are declared as external or COMMON . Therefore, it’s forced to store any modified external or COMMON variables back into memory so that the callee can find them. Likewise, after the call has returned, the same variables have to be reloaded into registers because the compiler can no longer trust the old, register-resident copies. The penalty for saving and restoring variables can be substantial, especially if you are using lots of them. It can also be unwarranted if variables that ought to be local are specified as external or COMMON , as in the following code:

Get Jobilize Job Search Mobile App in your pocket Now!

Get it on Google Play Download on the App Store Now




Source:  OpenStax, High performance computing. OpenStax CNX. Aug 25, 2010 Download for free at http://cnx.org/content/col11136/1.5
Google Play and the Google Play logo are trademarks of Google Inc.

Notification Switch

Would you like to follow the 'High performance computing' conversation and receive update notifications?

Ask