
In contrast, unvoiced speech has more of a noise-like quality. Unvoiced sounds are usually much smaller in amplitude, and oscillate much faster than voiced speech. These sounds are generally produced by turbulence, as air is forced through a constriction at some point in the vocal tract. For example, an h sound comes from a constriction at the vocal cords, and an f is generated by a constriction at the lips.

An illustrative example of voiced and unvoiced sounds contained in the word “erase” is shown in [link]. The original utterance is shown in (2.1). The voiced segment in (2.2) is a time magnification of the “a” portion of the word. Notice the highly periodic nature of this segment. The fundamental period of this waveform, which is about 8.5 ms here, is called the pitch period. The unvoiced segment in (2.3) comes from the “s” sound at the end of the word. This waveform is much more noise-like than the voiced segment, and is much smaller in magnitude.

(2.1) Utterance of the word “erase”. (2.2) Voiced segment. (2.3) Unvoiced segment.

Classification of voiced/unvoiced speech

Download the file start.au for the following sections. Click here for help on how to load and play audio signals.

For many speech processing algorithms, a very important step is to determine the type of sound that is being uttered in a given time frame. In this section, we will introduce two simple methods for discriminating between voiced and unvoiced speech.

Download the file start.au, and use the auread() function to load it into the Matlab workspace.

Do the following:

  • Plot (not stem) the speech signal. Identify two segments of the signal: one segment that is voiced and a second segment that is unvoiced (the zoom xon command is useful for this). Circle the regions of the plot corresponding to these two segments and label them as voiced or unvoiced.
  • Save 300 samples from the voiced segment of the speech into a Matlab vector called VoicedSig.
  • Save 300 samples from the unvoiced segment of the speech into a Matlab vector called UnvoicedSig.
  • Use the subplot() command to plot the two signals, VoicedSig and UnvoicedSig, on a single figure.
Hand in your labeled plots. Explain how you selected your voiced and unvoiced regions.
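The steps above can be sketched roughly as follows. This is only an outline under stated assumptions: start.au is assumed to be in the current folder, and the start indices voicedStart and unvoicedStart are hypothetical placeholders that must be replaced by values read off your own plot. The sketch calls audioread(), which replaces auread() in recent Matlab releases; on an older installation, auread('start.au') can be used in its place.

```matlab
% Load the speech signal (assumes start.au is in the current folder).
% On older Matlab releases, use: speech = auread('start.au');
[speech, fs] = audioread('start.au');
speech = speech(:);                 % force a column vector

% Plot the whole utterance and zoom horizontally to inspect it.
figure;
plot(speech);
xlabel('Sample index n');
ylabel('Amplitude');
title('start.au');
zoom xon;

% Placeholder start indices (hypothetical): pick your own from the plot.
voicedStart   = 1;
unvoicedStart = 301;
N = 300;                            % segment length required by the lab

VoicedSig   = speech(voicedStart   : voicedStart   + N - 1);
UnvoicedSig = speech(unvoicedStart : unvoicedStart + N - 1);

% Show both segments on a single figure.
figure;
subplot(2, 1, 1); plot(VoicedSig);   title('Voiced segment');
subplot(2, 1, 2); plot(UnvoicedSig); title('Unvoiced segment');
```

The placeholder indices are not meant to land on voiced or unvoiced speech; they only show how a 300-sample slice is taken once suitable regions have been identified.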

Estimate the pitch period for the voiced segment. Keep in mind that these speech signals are sampled at 8 kHz, which means that the time between samples is 0.125 milliseconds (ms). Typical values for the pitch period are 8 ms for male speakers, and 4 ms for female speakers. Based on this, would you predict that the speaker is male or female?
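The conversion from a sample count to a pitch period is simple arithmetic, sketched below. The value of samplesBetweenPeaks is a hypothetical reading chosen only to illustrate the calculation; it is not a measurement from start.au.

```matlab
fs = 8000;                          % sampling rate in Hz (from the lab text)
Ts_ms = 1000 / fs;                  % time between samples: 0.125 ms

% Hypothetical reading: number of samples between two successive
% corresponding peaks in the plot of the voiced segment.
samplesBetweenPeaks = 64;

pitchPeriod_ms = samplesBetweenPeaks * Ts_ms;   % 64 * 0.125 ms = 8 ms
pitchFreq_Hz   = 1000 / pitchPeriod_ms;         % 1000 / 8 = 125 Hz
```

With this illustrative count, the period works out to 8 ms. A count of 32 samples would instead give 4 ms.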

One way to categorize speech segments is to compute the average energy, or power. Recall this is defined by the following:

$$ P = \frac{1}{L} \sum_{n=1}^{L} x^{2}(n) $$

where L is the length of the frame x(n). Use [link] to compute the average energy of the voiced and unvoiced segments that you plotted above. For which segment is the average energy greater?
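One possible way to evaluate the average power of a frame in Matlab is sketched here. The helper name avg_power is an assumption introduced for illustration, and the two-sample demo frame is made up so that the result can be checked by hand.

```matlab
% Average power of a real-valued frame: the mean of the squared samples,
% i.e. (1/L) * sum of x(n)^2 for n = 1..L.
% avg_power is a hypothetical helper name, not part of the lab.
avg_power = @(x) mean(x(:).^2);

% Hand-checkable demo frame: (3^2 + 4^2) / 2 = 12.5
P_demo = avg_power([3 4]);

% With the lab vectors in the workspace, the same helper applies:
% P_voiced   = avg_power(VoicedSig);
% P_unvoiced = avg_power(UnvoicedSig);
```

Using mean() on the squared samples folds the 1/L factor and the sum into a single call, and x(:) makes the helper work for both row and column vectors.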

Another method for discriminating between voiced and unvoiced segments is to determine the rate at which the waveform oscillates by counting the number of zero-crossings that occur within a frame (the number of times the signal changes sign). Write a function zero_cross that will compute the number of zero-crossings that occur within a vector, and apply this to the two vectors VoicedSig and UnvoicedSig. Which segment has more zero-crossings?
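A minimal sketch of such a function, saved as zero_cross.m, might look like the following. It makes one assumption that the lab text does not specify: samples that are exactly zero are discarded before counting, so that a signal passing through zero is counted as a single sign change. Other conventions are possible.

```matlab
function count = zero_cross(x)
%ZERO_CROSS  Count the sign changes in the vector x.
%   Assumption (not specified by the lab): samples equal to zero are
%   ignored, so a pass through zero counts as one crossing.

    s = sign(x(:));         % +1, 0, or -1 for each sample
    s(s == 0) = [];         % drop exact zeros
    count = nnz(diff(s));   % a nonzero difference marks a sign change
end
```

For example, zero_cross([1 -1 1 -1]) counts three sign changes, and zero_cross([1 2 3]) counts none. With the lab vectors in the workspace, zero_cross(VoicedSig) and zero_cross(UnvoicedSig) can then be compared directly.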

Source:  OpenStax, Purdue digital signal processing labs (ece 438). OpenStax CNX. Sep 14, 2009 Download for free at http://cnx.org/content/col10593/1.4