Two algorithms operating in the time domain to change the pitch of solo human voice.

Introduction

Time-domain pitch-shifting algorithms have several advantages over frequency-domain approaches. First, the formants of the original signal can be preserved, so the timbre of the input signal is largely unaffected. Second, the computational complexity is much lower for time-domain algorithms because there is no need to take transforms of the data. We created and used two different algorithms; the major difference between them is the way in which the section of signal to be overlapped and added is selected.

Through this overlap-and-add approach the signal retains most of its original shape. For both algorithms the original signal was broken down into overlapping windows of a specified size and hop size (which should be consistent with the values provided to the detection algorithm). Then, for each window, the detected period (one divided by the detected fundamental frequency) and the target period are computed and used to build up the new data for that window. After the construction of each window, which is described further under the two approaches, the windows themselves were overlapped and added to create the new pitch-corrected output signal. When the detection algorithm decides that a given window is unvoiced (i.e., has no fundamental frequency), both algorithms simply copy that window as is, without any modification. A Hanning window is applied to each output window to smooth out the inconsistencies created by adding together overlapping windows, so that there are no large discontinuities between added segments.
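The framework shared by both algorithms can be sketched in Python roughly as follows. This is a minimal illustration, not the project's code: the function name `overlap_add_correct`, the callback `build_window`, and the convention that an unvoiced window is marked by a detected frequency of 0 are all assumptions made for this example. Periods are expressed in samples (sampling rate divided by frequency). Dividing by the summed Hanning weights at the end is also an assumption, added so that an unmodified signal passes through at its original level.

```python
import numpy as np

def overlap_add_correct(signal, fs, window_size, hop_size,
                        detected_f0, target_f0, build_window):
    """Window the signal, rebuild each voiced window, then overlap and add.

    detected_f0[k] and target_f0[k] are frequencies in Hz for window k;
    a detected value of 0 marks the window as unvoiced (assumed convention).
    build_window(frame, detected_period, target_period) is a hypothetical
    callback that returns window_size samples at the target pitch.
    """
    signal = np.asarray(signal, dtype=float)
    taper = np.hanning(window_size)          # smooths the joins between windows
    output = np.zeros(len(signal))
    weight = np.zeros(len(signal))
    starts = range(0, len(signal) - window_size + 1, hop_size)
    for k, start in enumerate(starts):
        frame = signal[start:start + window_size]
        if detected_f0[k] > 0:
            # Voiced: convert both frequencies to periods in samples.
            detected_period = fs / detected_f0[k]
            target_period = fs / target_f0[k]
            frame = build_window(frame, detected_period, target_period)
        # Unvoiced windows fall through and are copied unchanged.
        output[start:start + window_size] += taper * frame
        weight[start:start + window_size] += taper
    # Normalize by the summed taper wherever at least one window contributes.
    covered = weight > 1e-8
    output[covered] /= weight[covered]
    return output
```

Each of the two approaches would supply its own `build_window`; everything else, including the handling of unvoiced windows, stays the same.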

Sample of human voice

Several periods of a human voice holding a note.

PSOLA: pitch-synchronous overlap-add

The key to PSOLA is the determination and utilization of pitch markers in the original signal. The idea is that these markers should be equally spaced throughout the signal (at intervals equal to the detected fundamental period), but also that each should be placed at a location where the signal has a maximum value (a peak). These two constraints are often in conflict, especially since our assumption that the fundamental period is constant for the entire window is not entirely true. As a result, following the highest peak in the signal from period to period may require relaxing the requirement that the markers be exactly equally spaced. On the other hand, if we only follow the maximum peak without regard for the fundamental period, the markers no longer reflect the pitch of the window and are not useful.
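To make the tension between the two constraints concrete, here is a small Python sketch of a naive greedy placement. It is not the method used in this project; it is only an illustration. The function name `greedy_pitch_marks` and the `tolerance` parameter are assumptions introduced for this example. Each marker is placed one period after the previous one and then allowed to move by at most `tolerance` samples toward the nearest peak.

```python
import numpy as np

def greedy_pitch_marks(frame, period, tolerance):
    """Place one marker per period, snapping each to the largest sample
    within +/- tolerance samples of its strictly periodic position."""
    frame = np.asarray(frame, dtype=float)
    period = int(round(period))
    if not 0 <= tolerance < period:
        raise ValueError("tolerance must be in [0, period)")
    # Start on the highest sample of the first period.
    marks = [int(np.argmax(frame[:period]))]
    while marks[-1] + period < len(frame):
        expected = marks[-1] + period            # strict periodicity
        lo = max(expected - tolerance, 0)
        hi = min(expected + tolerance + 1, len(frame))
        marks.append(lo + int(np.argmax(frame[lo:hi])))  # follow the peak
    return marks
```

With `tolerance=0` the markers are exactly equally spaced and ignore the peaks after the first one; as `tolerance` grows, the markers follow the peaks more freely and drift away from the detected period.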

To strike this compromise, we created a matrix in which each column contains two periods of the signal, and the sample position of the center row starts at 0 and increments by one period with each column. We then used a dynamic path-finding algorithm (created by Vladimir Goncharoff and Patrick Gries from the University of Chicago in Illinois) to find a path that passes through the maximum peak as often as possible, but which does not exceed a given slope as it moves through the matrix. Since a slope of 0 (a horizontal line) means the markers are equally spaced, the slope is the factor that is adjusted to strike the compromise between following peaks and maintaining periodicity. Empirically, we found a suitable value of this slope to be around 4. In the diagram below these pitch marks are labeled m_{i-1}, m_i, and m_{i+1}.

Pitch markers across windows
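The matrix and the slope-limited path search can be sketched in Python as follows. This is a generic dynamic-programming illustration written for this text, not the Goncharoff and Gries algorithm itself, whose details are not given here. The function names, the use of one extra row so that the center row sits exactly on a multiple of the period, and the choice to score a path by the sum of the signal values it passes through are all assumptions.

```python
import numpy as np

def build_period_matrix(frame, period):
    """Column j holds the samples from one period before to one period
    after position j * period; the center row is offset 0."""
    frame = np.asarray(frame, dtype=float)
    n_cols = len(frame) // period
    offsets = np.arange(-period, period + 1)     # row r <-> offsets[r]
    centers = np.arange(n_cols) * period         # 0, P, 2P, ...
    idx = centers[None, :] + offsets[:, None]
    valid = (idx >= 0) & (idx < len(frame))
    matrix = np.full(idx.shape, -np.inf)         # positions outside the frame
    matrix[valid] = frame[idx[valid]]
    return matrix, offsets, centers

def best_path(matrix, max_slope):
    """Pick one row per column, maximizing the summed signal value while
    moving at most max_slope rows between neighboring columns."""
    n_rows, n_cols = matrix.shape
    score = matrix[:, 0].copy()
    back = np.zeros((n_rows, n_cols), dtype=int)
    for j in range(1, n_cols):
        new_score = np.full(n_rows, -np.inf)
        for r in range(n_rows):
            lo = max(r - max_slope, 0)
            hi = min(r + max_slope + 1, n_rows)
            prev = lo + int(np.argmax(score[lo:hi]))   # best reachable row
            back[r, j] = prev
            new_score[r] = score[prev] + matrix[r, j]
        score = new_score
    # Trace back from the best final row to recover the whole path.
    path = [int(np.argmax(score))]
    for j in range(n_cols - 1, 0, -1):
        path.append(int(back[path[-1], j]))
    return path[::-1]

def pitch_marks(frame, period, max_slope=4):
    """Marker positions in samples; the default slope of 4 is the value
    reported in the text."""
    matrix, offsets, centers = build_period_matrix(frame, period)
    rows = best_path(matrix, max_slope)
    return centers + offsets[rows]
```

Setting `max_slope=0` forces a horizontal path and therefore equally spaced markers, while larger values let the path track peaks that drift from one period to the next.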

Source:  OpenStax, Ece 301 projects fall 2003. OpenStax CNX. Jan 22, 2004 Download for free at http://cnx.org/content/col10223/1.5