
Numerical codes usually spend most of their time in loops, so you don’t want anything inside a loop that doesn’t have to be there, especially an if-statement. Not only do if-statements gum up the works with extra instructions, they can force a strict order on the iterations of a loop. Of course, you can’t always avoid conditionals. Sometimes, though, people place them in loops to process events that could have been handled outside, or even ignored.

To take you back a few years, the following code shows a loop with a test for a value close to zero:


      PARAMETER (SMALL = 1.E-20)
      DO I=1,N
        IF (ABS(A(I)) .GE. SMALL) THEN
          B(I) = B(I) + A(I) * C
        ENDIF
      ENDDO

The idea was that if the multiplier, A(I), were close enough to zero, there would be no reason to perform the math in the center of the loop. Because floating-point operations weren’t pipelined on many machines, a comparison and a branch were cheaper than the multiply and add; the test would save time. On an older CISC or early RISC processor, a comparison and branch is probably still a savings. But on other architectures, it costs a lot less to just perform the math and skip the test. Eliminating the branch eliminates a control dependency and allows the compiler to pipeline more arithmetic operations. Of course, the answer could change slightly if the test is eliminated. It then becomes a question of whether the difference is significant.

Here’s another example where a branch isn’t necessary. The loop finds the absolute value of each element in an array:


      DO I=1,N
        IF (A(I) .LT. 0.) A(I) = -A(I)
      ENDDO

But why perform the test at all? On most machines, it’s quicker to perform the abs() operation on every element of the array.

We do have to give you a warning, though: if you are coding in C, the absolute value, fabs(), is a subroutine call. In this particular case, you are better off leaving the conditional in the loop. The machine representation of a floating-point number starts with a sign bit. If the bit is 0, the number is positive. If it is 1, the number is negative. The fastest absolute value function is one that merely “ands” out the sign bit. See macros in /usr/include/macros.h and /usr/include/math.h.
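To make the sign-bit trick concrete, here is a minimal C sketch of an absolute value that clears the sign bit instead of testing the value. It assumes IEEE 754 64-bit doubles, and the names abs_by_mask and abs_array are hypothetical helpers invented for this illustration, not macros taken from the header files mentioned above. Many current compilers expand fabs() inline, so it is worth measuring before adopting something like this.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption: double is a 64-bit IEEE 754 value with the sign in bit 63. */
static_assert(sizeof(double) == sizeof(uint64_t), "assumes 64-bit double");

/* Hypothetical helper: absolute value computed by clearing the sign bit. */
static double abs_by_mask(double x)
{
    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);        /* copy out the bit pattern */
    bits &= UINT64_C(0x7FFFFFFFFFFFFFFF);  /* "and" out the sign bit */
    memcpy(&x, &bits, sizeof x);           /* copy the pattern back */
    return x;
}

/* Hypothetical helper: apply it to every element, with no test in the loop. */
static void abs_array(double *a, size_t n)
{
    for (size_t i = 0; i < n; i++)
        a[i] = abs_by_mask(a[i]);
}
```

The memcpy calls are there only to reinterpret the bits without breaking C's aliasing rules; a compiler will typically reduce each one to a register move.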

When you can’t throw out the conditional entirely, there are things you can do to minimize the performance penalty. First, we have to learn to recognize which conditionals within loops can be restructured and which cannot. Conditionals in loops fall into several categories:

  • Loop invariant conditionals
  • Loop index dependent conditionals
  • Independent loop conditionals
  • Dependent loop conditionals
  • Reductions
  • Conditionals that transfer control

Let’s look at these types in turn.

Loop invariant conditionals

The following loop contains an invariant test:


      DO I=1,K
        IF (N .EQ. 0) THEN
          A(I) = A(I) + B(I) * C
        ELSE
          A(I) = 0.
        ENDIF
      ENDDO

“Invariant” means that the outcome is always the same. Regardless of what happens to the variables A, B, C, and I, the value of N won’t change, so neither will the outcome of the test.

You can recast the loop by making the test outside and replicating the loop body twice — once for when the test is true, and once for when it is false, as in the following example:
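The recast described here can be sketched as follows. This is an illustration in C rather than the Fortran of the loop above, with zero-based indexing, and the function names update_tested and update_hoisted are hypothetical. The first function mirrors the loop with the invariant test inside; the second makes the test once and replicates the loop body for each outcome.

```c
#include <assert.h>
#include <stddef.h>

/* Original form: the invariant test on n is evaluated on every iteration. */
void update_tested(double *a, const double *b, double c, int n, size_t k)
{
    assert(a != NULL && b != NULL);
    for (size_t i = 0; i < k; i++) {
        if (n == 0)
            a[i] = a[i] + b[i] * c;
        else
            a[i] = 0.0;
    }
}

/* Recast form: test once, outside, and give each outcome its own loop. */
void update_hoisted(double *a, const double *b, double c, int n, size_t k)
{
    assert(a != NULL && b != NULL);
    if (n == 0) {
        for (size_t i = 0; i < k; i++)
            a[i] = a[i] + b[i] * c;
    } else {
        for (size_t i = 0; i < k; i++)
            a[i] = 0.0;
    }
}
```

Both versions produce the same array contents. The recast one costs a little more code space, but neither of its loops contains a branch in the body.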


Source:  OpenStax, High performance computing. OpenStax CNX. Aug 25, 2010 Download for free at http://cnx.org/content/col11136/1.5