
Database parsing

This database was selected because it contained labeled emotional samples. The annotation files followed the general format

Start Stop A: Emotion,Phrase

Consequently, a parsing script was written that scanned each line for the marker 'A:'. If all of the sample descriptors (start time, stop time, emotion, and phrase) could be identified, the sample was added to the database, provided that:

  • The start and stop times were less than 3 seconds apart
  • The emotion matched one of the designated 15 categories
  • The phrase did not contain any non-alphabetic characters (e.g., [ or ])
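
The filtering rules above can be sketched in MATLAB as follows. This is a minimal illustration rather than the original parsing script: the example lines, the three-entry emotion list (a stand-in for the 15 designated categories, which are not listed here), the variable names, and the choice to allow spaces inside a phrase are all assumptions.

```matlab
% Hypothetical annotation lines in the "Start Stop A: Emotion,Phrase" layout.
lines = { ...
    '1.20 2.75 A: happy,good morning', ...        % kept
    '0.00 4.10 A: sad,too long a sample', ...     % rejected: 3 s or longer
    '5.00 6.00 A: bored,not a listed label', ...  % rejected: emotion not in list
    '7.10 8.00 A: sad,[laughter] oh no', ...      % rejected: non-alphabetic phrase
    '9.00 9.50 B: happy,wrong marker'};           % rejected: no 'A:' marker

emotions = {'happy', 'sad', 'angry'};  % placeholder for the 15 categories
maxDur = 3;                            % maximum sample duration in seconds

% Tokens: start time, stop time, emotion, phrase.
pattern = '^\s*([0-9.]+)\s+([0-9.]+)\s+A:\s*([^,]+),(.*)$';
db = struct('start', {}, 'stop', {}, 'emotion', {}, 'phrase', {});

for k = 1:numel(lines)
    tok = regexp(lines{k}, pattern, 'tokens', 'once');
    if isempty(tok)
        continue;                      % a descriptor could not be identified
    end
    t0 = str2double(tok{1});
    t1 = str2double(tok{2});
    emotion = strtrim(tok{3});
    phrase = strtrim(tok{4});

    okTime = ~isnan(t0) && ~isnan(t1) && t1 > t0 && (t1 - t0) < maxDur;
    okEmotion = any(strcmp(emotion, emotions));
    % Assumption: spaces between words are allowed, everything else must be a letter.
    okPhrase = ~isempty(phrase) && all(isletter(phrase) | phrase == ' ');

    if okTime && okEmotion && okPhrase
        db(end+1) = struct('start', t0, 'stop', t1, ...
                           'emotion', emotion, 'phrase', phrase);
    end
end
```

With these example lines only the first entry passes all three checks, so db ends up holding a single sample.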

With the database assembled, the files were read individually and each sample extracted one at a time.

Silence removal

Although the annotations were accurate, they were not precise: a padding of silence surrounded each sample and biased our feature vectors. Consequently, we developed a method to detect and remove this padding. The idea is to apply a threshold above which the speaker is deemed to be active. Applying such a threshold directly to the raw signal is risky because of the inherent background noise, so we first computed the amplitude envelope. The Hilbert transform was applied to each sample to obtain the analytic signal, whose magnitude tracks the instantaneous amplitude. This magnitude was then smoothed with a lowpass FIR filter with a cutoff frequency of 20 Hz, designed in MATLAB. The filter was applied with a zero-phase technique, which filters the signal in the forward direction and then again in the reverse direction. The phase shift introduced by the forward pass is cancelled by the reverse pass, so the result is a smooth amplitude envelope with no offset in the time domain.

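% Analytic signal of the sample; its magnitude is the raw amplitude envelope
analyt = hilbert(trace);
% Order-200 lowpass FIR filter, 20 Hz cutoff normalized by the Nyquist frequency fs/2
b = fir1(200,20/(fs/2),'low');
% Zero-phase (forward and reverse) filtering smooths the envelope without delay
env = filtfilt(b,1,abs(analyt));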
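
To illustrate why zero-phase filtering matters here, the sketch below passes a unit impulse through the same filter design twice: once with an ordinary single-pass filter and once with filtfilt. The sampling rate of 8000 Hz and the impulse position are assumptions chosen for the example; they are not taken from the database.

```matlab
fs = 8000;                          % assumed sampling rate (Hz)
b = fir1(200, 20/(fs/2), 'low');    % same filter design as in the text

x = zeros(2001, 1);
x(1001) = 1;                        % unit impulse in the middle of the record

ySingle = filter(b, 1, x);          % ordinary causal filtering
yZero = filtfilt(b, 1, x);          % forward pass followed by reverse pass

[~, iSingle] = max(ySingle);
[~, iZero] = max(yZero);

delaySingle = iSingle - 1001;       % peak offset of the single-pass output
delayZero = iZero - 1001;           % peak offset of the zero-phase output
```

The single-pass output peaks 100 samples late (half the filter order), while the zero-phase output peaks at the position of the impulse. An envelope computed with filtfilt therefore stays aligned with the waveform, so threshold crossings found on the envelope can be used directly as indices into the original sample.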

Finally, we applied a generous threshold, found the first and last crossings of that threshold, and deleted the silence padding.
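
The trimming step can be sketched on a synthetic signal as shown below. The threshold used for the database is not stated, so the value here (10% of the peak envelope) is an assumption, as are the sampling rate, the test signal, and the variable names thresh, first, last, and trimmed.

```matlab
fs = 8000;                                  % assumed sampling rate (Hz)
t = (0:2*fs-1)'/fs;                         % 2 s of synthetic data
trace = 0.01*randn(size(t));                % low-level background noise
active = t >= 0.5 & t < 1.5;                % "speech" occupies 0.5 s to 1.5 s
trace(active) = trace(active) + sin(2*pi*200*t(active));

% Amplitude envelope, computed as described in the text.
b = fir1(200, 20/(fs/2), 'low');
env = filtfilt(b, 1, abs(hilbert(trace)));

% Assumed threshold: a fraction of the peak envelope.
thresh = 0.1*max(env);
first = find(env > thresh, 1, 'first');     % first crossing of the threshold
last = find(env > thresh, 1, 'last');       % last crossing of the threshold

trimmed = trace(first:last);                % sample with the silence padding removed
```

Because the threshold is deliberately low, the crossings should fall slightly outside the true onset and offset, so a short margin of silence is kept instead of clipping the start or end of the utterance. Note that filtfilt requires the input to be longer than three times the filter order, which is 600 samples for this filter.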





Source:  OpenStax, Robust classification of highly-specific emotion in human speech. OpenStax CNX. Dec 14, 2012 Download for free at http://cnx.org/content/col11465/1.1