0.6 Results and discussion

For the benchmarks in this chapter, SFFT used a calibration routine to choose the fastest configuration. The calibration data was collected, along with some data about the machine and the compiler, and used to train a classifier.

The data was processed into instances, with each instance having attributes for the size of the transform and the precision, the size of each level of cache, the architecture and micro-architecture of the machine, the SIMD extensions, the OS, the compiler used, and the CPU frequency. In total there were 3348 instances of data, each of which had 12 attributes.

Weka [link] was used to experiment with several classifiers, and a REPTree classifier with bagging was chosen to train the model. Under 10-fold cross-validation, the model correctly classified 76.1% of the instances, with a weighted average precision of 74.8%. This tends to confirm that there is a relationship between the characteristics of the machine and the performance of a particular FFT configuration.
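
As a rough illustration of how two such figures relate to a classifier's predictions, the following Python sketch computes overall accuracy and a weighted average precision from a confusion matrix. The function names and the 2x2 matrix are hypothetical and are not the SFFT data; the sketch also assumes that the weighted average weights each class's precision by the number of instances that truly belong to that class.

```python
def accuracy(cm):
    """Fraction of instances on the diagonal of confusion matrix cm.

    cm[actual][predicted] holds instance counts.
    """
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total


def weighted_precision(cm):
    """Per-class precision, averaged with weights equal to class support."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    result = 0.0
    for c in range(n):
        predicted_as_c = sum(cm[r][c] for r in range(n))  # column sum
        support = sum(cm[c])                               # row sum
        precision = cm[c][c] / predicted_as_c if predicted_as_c else 0.0
        result += precision * support / total
    return result


# Made-up example: two candidate configurations, 20 instances.
example = [[8, 2],
           [1, 9]]
print(accuracy(example))                      # 0.85
print(round(weighted_precision(example), 4))  # 0.8535
```

Accuracy and weighted precision differ whenever the classes are confused asymmetrically, which is why both figures are reported.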

The accuracy of the classifier is promising, and it has the potential to replace the calibration code in SFFT. It is highly likely that if the noise in the data were reduced through the use of an isolated benchmarking environment, the accuracy of the classifier would increase. The accuracy would also likely benefit from a larger dataset collected from a wider range of benchmark machines.

Split-radix vs. conjugate-pair

To quantify the gain in performance that might be attributable to the use of the conjugate-pair algorithm, SFFT was retrospectively modified to compute the FFT using the ordinary split-radix algorithm as well as the conjugate-pair algorithm. The results of benchmarks between the two algorithms, as well as FFTW and SPIRAL, are plotted in [link].

Unexpectedly, the ordinary split-radix algorithm is faster than the conjugate-pair algorithm for some smaller sizes of transform, but for transforms above a certain size, the conjugate-pair algorithm is faster by a few hundred MFLOPS.
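
To put "a few hundred MFLOPS" in context, the sketch below converts a measured run time into an MFLOPS figure. It assumes the scaled convention commonly used for FFT benchmarks (for example by benchFFT), in which a complex transform of size N is credited with 5 N log2(N) floating-point operations regardless of how many the algorithm actually performs; whether the plotted results use exactly this convention is an assumption. The function name and the timing are hypothetical.

```python
import math


def fft_mflops(n, seconds):
    """Scaled MFLOPS for a complex FFT of size n that took `seconds`.

    Assumes the nominal operation count 5 * n * log2(n).
    """
    nominal_flops = 5.0 * n * math.log2(n)
    return nominal_flops / (seconds * 1e6)


# Hypothetical timing: a 1024-point transform taking 10 microseconds
# scores about 5120 MFLOPS under this convention.
print(fft_mflops(1024, 10e-6))
```

Because the operation count is nominal, a difference in this figure between two algorithms reflects a difference in run time only, not in the number of arithmetic operations executed.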

The performance advantage of the ordinary split-radix algorithm for smaller transforms is likely due to shorter chains of dependent instructions where twiddle factors are loaded and used. The ordinary split-radix algorithm loads two twiddle factors into two registers separately, with no dependencies between those instructions, while the conjugate-pair algorithm must load one twiddle factor and then duplicate it into another register, which creates a dependency between the instructions. Thus the ordinary split-radix algorithm is faster for smaller transforms, where memory bandwidth is not the limiting factor; once memory bandwidth does become the limiting factor, the conjugate-pair algorithm, which loads fewer twiddle factors, is faster.
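
The difference in how the two algorithms obtain their twiddle factors can be seen in a minimal scalar sketch. The recursive Python function below is an illustration only and is not SFFT's implementation, which is vectorized; the function names are hypothetical, and the twiddle factors are computed on the fly where a real implementation would load them from a precomputed table. In the ordinary variant the two factors are obtained independently, while in the conjugate-pair variant the second is derived from the first (here by conjugation), which models the dependency described above.

```python
import cmath


def dft(x):
    """Naive O(n^2) DFT, used only as a reference."""
    n = len(x)
    return [sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n)
                for j in range(n)) for k in range(n)]


def split_radix_fft(x, conjugate_pair=False):
    """Recursive split-radix FFT of a list whose length is a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    if n == 2:
        return [x[0] + x[1], x[0] - x[1]]
    q = n // 4
    u = split_radix_fft(x[0::2], conjugate_pair)   # even samples, size n/2
    z = split_radix_fft(x[1::4], conjugate_pair)   # samples 4m+1, size n/4
    if conjugate_pair:
        # Samples 4m-1, with the index wrapping around cyclically.
        third = [x[(4 * m - 1) % n] for m in range(q)]
    else:
        third = x[3::4]                            # samples 4m+3
    zp = split_radix_fft(third, conjugate_pair)
    out = [0j] * n
    for k in range(q):
        w1 = cmath.exp(-2j * cmath.pi * k / n)     # first twiddle "load"
        if conjugate_pair:
            w2 = w1.conjugate()                    # depends on w1
        else:
            w2 = cmath.exp(-2j * cmath.pi * 3 * k / n)  # independent "load"
        a = w1 * z[k]
        b = w2 * zp[k]
        s = a + b
        d = -1j * (a - b)
        out[k] = u[k] + s
        out[k + 2 * q] = u[k] - s
        out[k + q] = u[k + q] + d
        out[k + 3 * q] = u[k + q] - d
    return out


# Both variants should agree with the reference DFT up to rounding error.
signal = [complex(i % 5, (3 * i) % 7) for i in range(16)]
reference = dft(signal)
for variant in (False, True):
    result = split_radix_fft(signal, conjugate_pair=variant)
    print(variant, max(abs(r - e) for r, e in zip(result, reference)) < 1e-9)
```

The conjugate-pair variant needs only one table entry per butterfly, which reduces memory traffic for large transforms at the cost of the extra dependent step.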

In future, SFFT could exploit the performance advantage of the ordinary split-radix algorithm when computing smaller sizes of transforms.

Source: OpenStax, Computing the fast fourier transform on simd microprocessors. OpenStax CNX. Jul 15, 2012 Download for free at http://cnx.org/content/col11438/1.2
