A description of how formant frequencies were extracted from vowel samples.

For each vowel sample we needed to extract the first and second formant frequencies. To do this we wrote a MATLAB function that we could apply quickly to each speaker's vowel samples. In an ideal world with clear speech this would be a straightforward process, since the frequency spectrum would have two or more peaks with little oscillation, and the formants would simply be the locations of the first two peaks.

However, very few of the samples are this clear. If the formants do not stay constant during the entire clip, then the formant peaks have smaller peaks on them. To solve this problem we did three things. First, we cut each sample into thirds, found the formants in each third, and averaged the three values to get a final formant value. Second, we ignored frequencies below 300 Hz, which correspond to frequencies produced whenever the human vocal tract makes a sound. Finally, we filtered the frequency spectrum data to remove noise from the peaks. We also experimented with cubing the spectrum, but the second formant was generally small and cubing made it harder to find. To gauge the accuracy of our results we used the open source application Praat, which can find the formants accurately using more advanced techniques.
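The three steps above can be sketched roughly as follows. This is a minimal illustration in plain Python, not the authors' MATLAB function, which is not shown here. All function names are hypothetical, the naive DFT is only practical for short clips, and the moving-average smoother is an assumption, since the text does not say which filter was applied to the spectrum.

```python
import cmath
import math


def magnitude_spectrum(samples, fs):
    """Return (freqs, mags) for the non-negative half of a naive DFT."""
    n = len(samples)
    freqs, mags = [], []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(samples))
        freqs.append(k * fs / n)
        mags.append(abs(acc))
    return freqs, mags


def drop_below(freqs, mags, cutoff_hz=300.0):
    """Discard spectrum bins below the cutoff (300 Hz in the text)."""
    kept = [(f, m) for f, m in zip(freqs, mags) if f >= cutoff_hz]
    return [f for f, _ in kept], [m for _, m in kept]


def moving_average(values, width=5):
    """Smooth with a centred moving average (assumed filter type)."""
    half = width // 2
    out = []
    for i in range(len(values)):
        window = values[max(0, i - half):i + half + 1]
        out.append(sum(window) / len(window))
    return out


def split_into_thirds(samples):
    """Cut the clip into three equal-length consecutive segments."""
    n = len(samples) // 3
    return [samples[i * n:(i + 1) * n] for i in range(3)]


def average_over_thirds(samples, estimate):
    """Apply `estimate` (segment -> (f1, f2)) to each third and average."""
    results = [estimate(seg) for seg in split_into_thirds(samples)]
    f1 = sum(r[0] for r in results) / 3
    f2 = sum(r[1] for r in results) / 3
    return f1, f2
```

Here `estimate` stands for whatever per-segment routine turns a segment into a pair of formant frequencies, for example a spectrum computation followed by `drop_below`, `moving_average`, and a peak search.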

According to Praat, the first and second formants should be 569.7 Hz and 930.3 Hz. The unfiltered spectrum has a strong peak just above 300 Hz that does not correspond to a formant; in the filtered spectrum it is removed.

To locate the first formant we started by finding the maximum value in the spectrum. However, the second formant is sometimes stronger than the first, so we then looked below this first guess for another peak above a threshold (1.5 on the normalized scale). If such a peak was found, it was taken as the first formant and the maximum as the second. If no such peak was found, the maximum was the first formant and we had to search for the second formant beyond it. We did this in exactly the same manner as for the first, but looked only at the part of the spectrum above the minimum immediately following the first peak. We found this minimum with the aid of the derivative.
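The search described above might look something like the following sketch, written in plain Python rather than the authors' MATLAB. The function names are hypothetical, and two details are assumptions because the text does not settle them: when several earlier peaks exceed the threshold, the strongest is chosen, and the "derivative" is taken to be the difference between neighbouring spectrum bins. The spectrum is assumed to be already filtered and normalized, whatever normalization that may be.

```python
def local_peaks(mags):
    """Indices where the bin-to-bin difference goes from positive to non-positive."""
    return [i for i in range(1, len(mags) - 1)
            if mags[i] - mags[i - 1] > 0 and mags[i + 1] - mags[i] <= 0]


def next_minimum(mags, start):
    """Index of the first local minimum after `start`, found from the sign
    change of the bin-to-bin difference; falls back to the last bin."""
    for i in range(start + 1, len(mags) - 1):
        if mags[i] - mags[i - 1] < 0 and mags[i + 1] - mags[i] >= 0:
            return i
    return len(mags) - 1


def find_formants(freqs, mags, threshold=1.5):
    """Return (f1, f2) from a filtered, normalized magnitude spectrum."""
    # First guess: the global maximum of the spectrum.
    top = max(range(len(mags)), key=lambda i: mags[i])
    # Peaks below the first guess in frequency that clear the threshold.
    earlier = [i for i in local_peaks(mags) if i < top and mags[i] > threshold]
    if earlier:
        # The maximum was really the second formant (assumed tie-break:
        # take the strongest earlier peak as the first formant).
        first = max(earlier, key=lambda i: mags[i])
        second = top
    else:
        # The maximum is the first formant; repeat the maximum search
        # above the minimum that immediately follows it.
        first = top
        valley = next_minimum(mags, top)
        second = max(range(valley, len(mags)), key=lambda i: mags[i])
    return freqs[first], freqs[second]
```

One limitation of this sketch: if the maximum sits in the last bin, there is nothing above it to search, and both returned values coincide.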

This function was used on each vowel sample to generate an accent profile for each speaker. The profile consisted of the first and second formants of the speaker's 14 vowels in a column vector.
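As a small illustration of the profile, the sketch below flattens 14 formant pairs into a single 28-entry vector. It is written in plain Python with a hypothetical function name, and the ordering (F1 then F2 for each vowel in turn) is an assumption, since the text does not say how the column vector was laid out.

```python
def accent_profile(vowel_formants):
    """Flatten [(f1, f2), ...] for 14 vowels into one 28-entry vector."""
    if len(vowel_formants) != 14:
        raise ValueError("expected formant pairs for 14 vowels")
    profile = []
    for f1, f2 in vowel_formants:
        # Assumed layout: F1 and F2 of each vowel are stored side by side.
        profile.extend([f1, f2])
    return profile
```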

Source:  OpenStax, Elec 301 projects fall 2005. OpenStax CNX. Sep 25, 2007 Download for free at http://cnx.org/content/col10380/1.3