Summarizes our experimental procedures and our results in relation to Voice Recognition through Formant analysis.

Adrian A. Galindo Experimental Methodology, Conclusions, and Future Work

Experimental methodology, conclusions, and future work

We now turn our attention to the work that went into testing our model to ensure it worked as designed.

Experimental methodology

To test whether our system works, we had to ensure that the algorithm would produce the correct result (allow or deny) regardless of which recording of the template speaker or of the two intruders we used.

  1. The template user records a series of isolated English vowels to train the algorithm. This recording is made on a separate day to better simulate the common day-to-day variability in vocal resonance. See the results for our template speaker in [link].
  2. Each of the 3 users (template, intruder 1, intruder 2) records each of the 12 HVD words (see [link]), which encapsulate the entirety of the English vowel space.
  3. We run the algorithm, comparing each of the 3 recordings against the originally stored vowels of the template user. A percent error is computed for each user (including the template speaker), and a decision of allow or deny is made based on the minimum error.
  4. Steps 2 through 3 are repeated a total of 3 times to ensure some measure of repeatability.
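The comparison described in the steps above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' MATLAB implementation: the formant values, the three-word vowel subset, and the function names (`percent_error`, `average_percent_error`, `decide`) are all hypothetical, and the sketch assumes that "minimum error" means only the speaker with the lowest average error is allowed.

```python
# Hypothetical stored template: (F1, F2) in Hz for three HVD words.
# These numbers are invented for illustration; they are not measured data.
TEMPLATE = {
    "heed": (270.0, 2290.0),
    "hod": (730.0, 1090.0),
    "who'd": (300.0, 870.0),
}

# Hypothetical formants extracted from a new recording of each speaker.
RECORDINGS = {
    "template": {"heed": (280.0, 2250.0), "hod": (700.0, 1120.0), "who'd": (310.0, 900.0)},
    "intruder_1": {"heed": (310.0, 2500.0), "hod": (800.0, 1200.0), "who'd": (350.0, 950.0)},
    "intruder_2": {"heed": (350.0, 2700.0), "hod": (850.0, 1300.0), "who'd": (380.0, 1000.0)},
}


def percent_error(measured, reference):
    """Absolute percent error of one measured formant against its reference."""
    return abs(measured - reference) / reference * 100.0


def average_percent_error(measured_formants, template_formants):
    """Mean percent error over every formant of every vowel in the template."""
    errors = []
    for word, reference in template_formants.items():
        for m, r in zip(measured_formants[word], reference):
            errors.append(percent_error(m, r))
    return sum(errors) / len(errors)


def decide(recordings, template_formants):
    """Score every speaker and allow only the one with the minimum error."""
    scores = {
        name: average_percent_error(formants, template_formants)
        for name, formants in recordings.items()
    }
    best = min(scores, key=scores.get)
    decisions = {name: ("allow" if name == best else "deny") for name in scores}
    return decisions, scores


decisions, scores = decide(RECORDINGS, TEMPLATE)
for name in scores:
    print(f"{name}: {scores[name]:.2f}% -> {decisions[name]}")
```

Averaging over the whole vowel space, rather than deciding vowel by vowel, is what lets a speaker lose on an individual vowel and still come out with the lowest overall error.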
HVD Words
Our Template Speaker's Average Formant Frequencies


When we ran our experiments, we found that while the template speaker was occasionally not the person with the lowest percent error on a per-vowel basis, he had the lowest percent error across the entire vowel space. We found this result to be consistent across several different trials. In each trial a new recording of the same three people was made, often on different days, to allow for maximum day-to-day variability in voice and recording positioning. We wanted to make sure that it was not just one particular set of data that looked good against the template speaker. In addition, we wanted the template speaker to make several recordings to ensure that the system could cope with differences in both the template speaker and the potential intruders.

Experimental Results
| Subject | Average % Error, Trial 1 | Average % Error, Trial 2 | Average % Error, Trial 3 |
|---|---|---|---|
| Template Speaker | 7.673 | 8.902 | 7.747 |
| Speaker A | 11.787 | 11.301 | 11.102 |
| Speaker B | 20.676 | 16.933 | 12.389 |
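As a quick check on the table, the following Python sketch finds the lowest-error subject in each trial and the margin separating the template speaker from the closest other speaker. The numbers are copied from the table above; the variable names are our own and purely illustrative.

```python
# Average percent error per subject for Trials 1-3, copied from the results table.
RESULTS = {
    "Template Speaker": [7.673, 8.902, 7.747],
    "Speaker A": [11.787, 11.301, 11.102],
    "Speaker B": [20.676, 16.933, 12.389],
}
NUM_TRIALS = 3

# Subject with the minimum average error in each trial.
winners = [
    min(RESULTS, key=lambda subject: RESULTS[subject][t])
    for t in range(NUM_TRIALS)
]

# Gap between the template speaker and the closest other speaker, per trial.
margins = [
    min(RESULTS[s][t] for s in RESULTS if s != "Template Speaker")
    - RESULTS["Template Speaker"][t]
    for t in range(NUM_TRIALS)
]

# Mean of the three trial averages for each subject.
overall = {subject: sum(vals) / len(vals) for subject, vals in RESULTS.items()}

for t in range(NUM_TRIALS):
    print(f"Trial {t + 1}: lowest error = {winners[t]}, margin = {margins[t]:.3f}")
for subject, mean_error in overall.items():
    print(f"{subject}: mean over trials = {mean_error:.3f}")
```

The template speaker has the lowest average error in all three trials. The narrowest margin over the closest other speaker is about 2.4 percentage points, in Trial 2.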

More detail on each of the trials can be found in Figures [link], [link], and [link].

Trial 1. Note: the X-axis in all 3 trials is labeled according to the HVD listings in [link].
Trial 2
Trial 3


In conclusion, we found that the system we built does what we intended: in our trials it was able to tell the difference between the user it was tuned to and the other speakers.

There were a few caveats. In particular, we noticed that when the template speaker got a little sick, the system was no longer able to grant him access. The altered resonant characteristics of his vocal tract, no doubt caused by mucus and swelling, made it quite difficult for the system to match his voice to the stored template formants. This is a sticky problem, because we want to tune the system as sharply as we can to the template speaker's resonant characteristics so that even a slight change causes the system to reject the user. Seen through a security lens, these slight changes are an added opportunity to take advantage of the entire variability space to increase "passphrase" entropy and make the entire system more secure. Seen from a usability perspective, the same slight changes are a formidable annoyance for the sick template user attempting to gain access to his environment.

This balance between lenience and security is one that most biometric security systems must weigh. Anyone who has attempted to use a fingerprint reader after a long shower (raisined fingers) knows which way the company that made that system chose to lean. In the same manner, we figured that while the balance could definitely use some fine tuning, it is better to produce a false negative than to produce a false positive and allow a rogue user access to the secure environment.

Interestingly, we noticed that when a user changed the pitch of his voice to try to match the template speaker, the position of the formants in the frequency domain changed very little. This result indicates that the system is indeed tuned to a user's resonant characteristics and not just the pitch of the voice. It also makes it extremely difficult for a would-be attacker to gain access to the system, even if he knows the exact passphrase and what the template user sounds like.

Future work

In its current state, the formant detection system makes use of several very convenient MATLAB features, such as the filter design tool, to rapidly create a computable filter. It also leverages the large amount of computational power available on modern computer platforms to implement some rather sharp filters and decisions. While we did not set out to build the most efficient mechanism for identifying a user based on formants, we would truly like to see this system implemented in an embedded hardware environment. This would require trimming the program and its requisite filters considerably just to fit the algorithm within the restricted memory space available in most embedded environments. In addition, this type of embedded security system is of little use if it cannot run at real-time or near-real-time speeds. A powerful embedded platform such as an FPGA could support this type of implementation without a drastic reduction in filter sharpness; however, such an implementation would also require considerable effort to port the filter and surrounding decision rules into a hardware description language such as Verilog or VHDL.
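To make the trade-off between filter size and filter sharpness concrete, here is a rough Python sketch. It is not the filter produced by the MATLAB design tool: it assumes a generic Hamming-windowed sinc low-pass FIR filter, and the sample rate, cutoff, tap counts, and 4-bytes-per-coefficient figure are all illustrative choices of ours.

```python
import cmath
import math


def windowed_sinc_lowpass(num_taps, cutoff_hz, sample_rate_hz):
    """Hamming-windowed sinc low-pass FIR coefficients, normalized to unity DC gain."""
    fc = cutoff_hz / sample_rate_hz          # cutoff in cycles per sample
    mid = (num_taps - 1) / 2                 # center of the impulse response
    taps = []
    for n in range(num_taps):
        x = n - mid
        # Ideal (infinite) low-pass impulse response, sampled at offset x.
        ideal = 2 * fc if x == 0 else math.sin(2 * math.pi * fc * x) / (math.pi * x)
        # Hamming window tapers the truncated response.
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    total = sum(taps)
    return [t / total for t in taps]


def gain_at(taps, freq_hz, sample_rate_hz):
    """Magnitude of the filter's frequency response at freq_hz."""
    w = 2 * math.pi * freq_hz / sample_rate_hz
    return abs(sum(t * cmath.exp(-1j * w * n) for n, t in enumerate(taps)))


FS = 8000.0        # assumed sample rate in Hz
CUTOFF = 1000.0    # assumed cutoff in Hz
PROBE = 1200.0     # frequency just above the cutoff that we want rejected

short_taps = windowed_sinc_lowpass(31, CUTOFF, FS)
long_taps = windowed_sinc_lowpass(255, CUTOFF, FS)

for taps in (short_taps, long_taps):
    memory_bytes = 4 * len(taps)  # assuming 32-bit coefficients
    print(f"{len(taps)} taps: {memory_bytes} bytes, gain at {PROBE:.0f} Hz = "
          f"{gain_at(taps, PROBE, FS):.4f}")
```

Under these assumptions the shorter filter needs roughly one eighth of the coefficient storage and multiplications per sample, but it lets through far more energy just above the cutoff. That is the kind of sharpness an embedded port would have to give up or buy back with more capable hardware.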

Hidden feature: vowel recognition

Before computing a percent error against the template speaker, we need to know whether the right vowel is even being spoken; otherwise the error could be computed from the wrong vowel. We therefore had to build in a vowel recognition feature. The implementation is rather crude, but we found it to be quite accurate in our tests. We never tested the vowel recognition on its own; instead, we observed its results in the streaming debug statements our program can output. Based solely on this, we believe that the subsystem could be expanded and refined both to aid the voice recognition process and to increase security by fully checking the phrase spoken.
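The source does not describe how the vowel recognition works, so the following Python sketch is only one plausible way such a check could be done: a nearest-template match on the first two formants. The formant values are illustrative figures roughly typical of an adult male voice, not the template speaker's measurements, and the names `relative_distance`, `recognize_vowel`, and `spoke_expected_vowel` are hypothetical.

```python
# Illustrative (F1, F2) values in Hz for four HVD words.
# These are NOT the template speaker's stored formants.
VOWEL_TEMPLATES = {
    "heed": (270.0, 2290.0),
    "had": (660.0, 1720.0),
    "hod": (730.0, 1090.0),
    "who'd": (300.0, 870.0),
}


def relative_distance(measured, reference):
    """Sum of the relative differences between measured and reference formants."""
    return sum(abs(m - r) / r for m, r in zip(measured, reference))


def recognize_vowel(measured, templates=VOWEL_TEMPLATES):
    """Return the word whose stored formants are closest to the measured ones."""
    return min(templates, key=lambda word: relative_distance(measured, templates[word]))


def spoke_expected_vowel(measured, expected_word, templates=VOWEL_TEMPLATES):
    """True if the recognized vowel matches the one the speaker was asked to say."""
    return recognize_vowel(measured, templates) == expected_word


# A measurement near the "heed" template is recognized as "heed", so its
# percent error can meaningfully be compared against the stored "heed" entry.
print(recognize_vowel((290.0, 2200.0)))
print(spoke_expected_vowel((720.0, 1100.0), "heed"))
```

Running a check like this first means a speaker who says the wrong word is rejected for that reason, rather than being scored against a vowel they never produced.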

Source:  OpenStax, Voice recognition. OpenStax CNX. Dec 19, 2011 Download for free at http://cnx.org/content/col11389/1.1