
Conclusions

Our fastest filter implementation combined the Direct Form II Transposed filter structure, 8 POSIX threads, a data layout reorganized for the SSE3 instruction set, and the O3 compiler optimization flag. This combination gave the best balance of cache hits and CPU utilization while keeping cache pollution to a minimum. Assuming a sampling rate of 25 kHz, we achieved our goal of real-time filtering: one second of simulated data was processed in 14 ms (this figure was calculated using the real-time processing speed formula specified in the Results section).
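For readers unfamiliar with the Direct Form II Transposed structure, the following is a minimal scalar sketch of a single second-order section in C. It is illustrative only and is not the project's threaded, SSE3-vectorized code; the names `biquad_df2t` and `biquad_df2t_process` are hypothetical, and the coefficients are assumed to be normalized so that a0 = 1.

```c
#include <stddef.h>

/* One second-order section (biquad) in Direct Form II Transposed.
 * Hypothetical type for illustration; coefficients assume a0 == 1. */
typedef struct {
    float b0, b1, b2;   /* feed-forward coefficients */
    float a1, a2;       /* feedback coefficients */
    float s1, s2;       /* the two state (delay) variables */
} biquad_df2t;

/* Filter n samples from in[] to out[]. The state in *f carries over
 * between calls, so consecutive buffers of one channel can be streamed. */
void biquad_df2t_process(biquad_df2t *f, const float *in, float *out, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        float x = in[i];
        float y = f->b0 * x + f->s1;            /* output sample */
        f->s1 = f->b1 * x - f->a1 * y + f->s2;  /* update first state */
        f->s2 = f->b2 * x - f->a2 * y;          /* update second state */
        out[i] = y;
    }
}
```

The structure needs only two state variables per section, which keeps the per-channel working set small. A higher-order bandpass filter can be built by cascading several such sections, each with its own state.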

Note that we obtained our results on a single computer architecture. We expect many of our optimization methods, such as SIMD vectorization with SSE and data reordering, to be effective on other architectures that provide comparable vector instructions, since the cache organization of most modern processors is broadly similar to that of our benchmark computer. Modern CPUs differ primarily in the number of physical cores they contain, which changes the number of POSIX threads required for optimal CPU utilization: machines with more cores will make efficient use of more threads, while machines with fewer cores will make efficient use of fewer.
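One way to adapt the thread count to the machine is to query the core count at startup and divide the channels evenly among that many threads. The sketch below assumes this approach; it is not taken from the project's code, and `pick_thread_count`, `channel_range`, and `channels_for_thread` are hypothetical names. Note also that `_SC_NPROCESSORS_ONLN` is a widely available extension rather than strict POSIX, and that it reports logical processors, which can exceed the number of physical cores on CPUs with simultaneous multithreading.

```c
#include <stddef.h>
#include <unistd.h>   /* sysconf */

/* Number of worker threads to launch: one per online logical processor.
 * Assumption: _SC_NPROCESSORS_ONLN is available (Linux, macOS, BSD).
 * Falls back to a single thread if the query fails. */
long pick_thread_count(void)
{
    long n = sysconf(_SC_NPROCESSORS_ONLN);
    return n > 0 ? n : 1;
}

/* Half-open range [first, last) of channels assigned to one thread. */
typedef struct {
    size_t first;
    size_t last;
} channel_range;

/* Split n_channels as evenly as possible across n_threads (n_threads > 0).
 * The first (n_channels % n_threads) threads each take one extra channel. */
channel_range channels_for_thread(size_t n_channels, size_t n_threads, size_t tid)
{
    size_t base = n_channels / n_threads;
    size_t extra = n_channels % n_threads;
    channel_range r;
    r.first = tid * base + (tid < extra ? tid : extra);
    r.last = r.first + base + (tid < extra ? 1 : 0);
    return r;
}
```

Each worker thread would then filter only the channels in its own range, so no two threads write to the same filter state.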

Processing data at this speed is of paramount importance to the Open Ephys project and to neural signal processing in general. The filter bank we have written is designed to apply a generic bandpass filter to all incoming signals, which typically require additional real-time processing afterwards. For example, many neural signal projects need to detect specific phenomena (usually peaks) in real time and deliver impulses that alter the source neuron's behavior.
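As an illustration of the kind of downstream processing mentioned above, the following sketch marks peaks in one buffer of filtered samples using a simple rule: a sample counts as a peak if it exceeds a threshold and is a local maximum. Both the rule and the function name `detect_peaks` are assumptions made for this example; actual detection criteria vary from project to project.

```c
#include <stddef.h>

/* Record the indices of local maxima above `threshold` in x[0..n-1].
 * At most max_peaks indices are written to peaks[]; the return value is
 * the number written. On a flat-topped peak, only the first sample of the
 * plateau is reported. The first and last samples are never reported,
 * because they lack a neighbor on one side. */
size_t detect_peaks(const float *x, size_t n, float threshold,
                    size_t *peaks, size_t max_peaks)
{
    size_t count = 0;
    for (size_t i = 1; i + 1 < n && count < max_peaks; ++i) {
        if (x[i] > threshold && x[i] > x[i - 1] && x[i] >= x[i + 1]) {
            peaks[count++] = i;
        }
    }
    return count;
}
```

Because this version examines one buffer at a time, a streaming implementation would also need to carry the last samples of each buffer into the next call so that peaks on buffer boundaries are not missed.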

Future work

This project focused heavily on optimizing code for CPU architectures. Optimizing the code for GPUs, especially through GPGPU (general-purpose computing on graphics processing units), represents a major area for future performance improvement. GPUs are extremely efficient at graphics and matrix calculations because their highly parallel structure applies the same instructions to many data elements at once. As a result, large blocks of multichannel data can be processed simultaneously, which could alleviate the burden of memory allocation and usage that we commonly encountered throughout the development of our project. Another major advantage of targeting GPUs is that the OpenCL computing framework, which most modern GPUs support, is platform independent. One issue we encountered in writing our code was that different computer architectures necessitated different optimization techniques, and as a result code timings varied significantly between architectures. Solutions built on the OpenCL framework run on a range of architectures without modification, reducing the need for targeted optimization techniques beyond the basic compiler optimizations that we built into our code.

The next step in this project is to integrate our code into the source code of the Open Ephys GUI. This will allow neuroscientists performing experiments with many electrodes to make use of our filter implementation for real-time processing. It will be an invaluable addition to the open source project as Open Ephys grows in popularity and as the community tends towards experiments with large numbers of electrodes.

Source:  OpenStax, Efficient real-time filter design for recording multichannel neural activity. OpenStax CNX. Dec 11, 2012 Download for free at http://cnx.org/content/col11461/1.1