<< Chapter < Page Chapter >> Page >

Gprof’s flat profile

As we mentioned previously, gprof also produces a timing profile (also called a “flat” profile, just to confuse things) similar to the one produced by prof . A few of the fields are different from prof , and there is some extra information, so it will help if we explain it briefly. The following example shows the first few lines from a gprof flat profile for stuff . You will recognize the top three routines from the original program. The others are library functions included at link-time.


% cumulative self self total time seconds seconds calls ms/call ms/call name39.9 3.23 3.23 2 1615.07 1615.07 _c_ [6] 39.9 6.46 3.23 1 3230.14 4845.20 _b_ [4]20.0 8.08 1.62 1 1620.07 3235.14 _a_ [5] 0.1 8.09 0.01 3 3.33 3.33 _ioctl [9]0.0 8.09 0.00 64 0.00 0.00 .rem [12] 0.0 8.09 0.00 64 0.00 0.00 _f_clos [177]0.0 8.09 0.00 20 0.00 0.00 _sigblock [178] ... .... .... . . . ......

Here’s what each column means:

  • %time Again, we see a field that describes the runtime for each routine as a percent- age of the overall time taken by the program. As you might expect, all the entries in this column should total 100% (nearly).
  • cumulative seconds For any given routine, the column called “cumulative seconds” tallies a running sum of the time taken by all the preceding routines plus its own time. As you scan towards the bottom, the numbers asymptotically approach the total runtime for the program.
  • self seconds Each routine’s individual contribution to the runtime.
  • calls The number of times this particular routine was called.
  • self ms/call Seconds spent inside the routine, divided by the number of calls. This gives the average length of time taken by each invocation of the routine. The figure is presented in milliseconds.
  • total ms/call Seconds spent inside the routine plus its descendants, divided by the number of calls.
  • name The name of the routine. Notice that the cross-reference number appears here too.

Accumulating the results of several gprof runs

It is possible to accumulate statistics from multiple runs so that you can get a picture of how a program is doing with a variety of data sets. For instance, say that you wanted to profile an application — call it bar — with three different sets of input data. You could perform the runs separately, saving the gmon.out files as you go, and then combine the results into a single profile at the end:


% f77 -pg bar.f -o bar % bar<data1.input % mv gmon.out gmon.1% bar<data2.input % mv gmon.out gmon.2% bar<data3.input % gprof bar -s gmon.1 gmon.2 gmon.out>gprof.summary.out

In the example profile, each run along the way creates a new gmon.out file that we renamed to make room for the next one. At the end, gprof combines the infor- mation from each of the data files to produce a summary profile of bar in the file gprof.summary.out . Additionally (you don’t see it here), gprof creates a file named gmon.sum that contains the merged data from the original three data files. gmon.sum has the same format as gmon.out , so you can use it as input for other merged profiles down the road.

In form, the output from a merged profile looks exactly the same as for an individual run. There are a couple of interesting things you will note, however. For one thing, the main routine appears to have been invoked more than once — one time for each run, in fact. Furthermore, depending on the application, multiple runs tend to either smooth the contour of the profile or exaggerate its features. You can imagine how this might happen. If a single routine is consistently called while others come and go as the input data changes, it takes on increasing importance in your tuning efforts.

A few words about accuracy

For processors running at 600 MHz and more, the time between 60 Hz and 100 Hz samples is a veritable eternity. Furthermore, you can experience quantization errors when the sampling frequency is fixed, as is true of steady 1/100th or 1/60th of a second samples. To take an exaggerated example, assume that the timeline in [link] shows alternating calls to two subroutines, BAR and FOO . The tick marks represent the sample points for profiling.

Quantization errors in profiling

This figure is a line of alternating grey and dark grey sections. The grey sections are wider, and are labeled ..bar.., and the dark grey sections are labeled .foo.

BAR and FOO take turns running. In fact, BAR takes more time than FOO . But because the sampling interval closely matches the frequency at which the two subroutines alternate, we get a quantizing error: most of the samples happen to be taken while FOO is running. Therefore, the profile tells us that FOO took more CPU time than BAR .

We have described the tried and true UNIX subroutine profilers that have been available for years. In many cases, vendors have much better tools available for the asking or for a fee. If you are doing some serious tuning, ask your vendor representative to look into other tools for you.

Questions & Answers

what is mutation
Janga Reply
what is a cell
Sifune Reply
how is urine form
Sifune
what is antagonism?
mahase Reply
classification of plants, gymnosperm features.
Linsy Reply
what is the features of gymnosperm
Linsy
how many types of solid did we have
Samuel Reply
what is an ionic bond
Samuel
What is Atoms
Daprince Reply
what is fallopian tube
Merolyn
what is bladder
Merolyn
what's bulbourethral gland
Eduek Reply
urine is formed in the nephron of the renal medulla in the kidney. It starts from filtration, then selective reabsorption and finally secretion
onuoha Reply
State the evolution relation and relevance between endoplasmic reticulum and cytoskeleton as it relates to cell.
Jeremiah
what is heart
Konadu Reply
how is urine formed in human
Konadu
how is urine formed in human
Rahma
what is the diference between a cavity and a canal
Pelagie Reply
what is the causative agent of malaria
Diamond
malaria is caused by an insect called mosquito.
Naomi
Malaria is cause by female anopheles mosquito
Isaac
Malaria is caused by plasmodium Female anopheles mosquitoe is d carrier
Olalekan
a canal is more needed in a root but a cavity is a bad effect
Commander
what are pathogens
Don Reply
In biology, a pathogen (Greek: πάθος pathos "suffering", "passion" and -γενής -genēs "producer of") in the oldest and broadest sense, is anything that can produce disease. A pathogen may also be referred to as an infectious agent, or simply a germ. The term pathogen came into use in the 1880s.[1][2
Zainab
A virus
Commander
Definition of respiration
Muhsin Reply
respiration is the process in which we breath in oxygen and breath out carbon dioxide
Achor
how are lungs work
Commander
where does digestion begins
Achiri Reply
in the mouth
EZEKIEL
what are the functions of follicle stimulating harmones?
Rashima Reply
stimulates the follicle to release the mature ovum into the oviduct
Davonte
what are the functions of Endocrine and pituitary gland
Chinaza
endocrine secrete hormone and regulate body process
Achor
while pituitary gland is an example of endocrine system and it's found in the Brain
Achor
what's biology?
Egbodo Reply
Biology is the study of living organisms, divided into many specialized field that cover their morphology, physiology,anatomy, behaviour,origin and distribution.
Lisah
biology is the study of life.
Alfreda
Biology is the study of how living organisms live and survive in a specific environment
Sifune
Got questions? Join the online conversation and get instant answers!
Jobilize.com Reply

Get Jobilize Job Search Mobile App in your pocket Now!

Get it on Google Play Download on the App Store Now




Source:  OpenStax, High performance computing. OpenStax CNX. Aug 25, 2010 Download for free at http://cnx.org/content/col11136/1.5
Google Play and the Google Play logo are trademarks of Google Inc.

Notification Switch

Would you like to follow the 'High performance computing' conversation and receive update notifications?

Ask