GSoC Project Status Update 04: Speech Recognition in Openmoko

steve steve at openmoko.com
Sun Jun 29 22:45:09 CEST 2008


Very cool. As voice reco is one of my long-time passions, I'm glad to see
someone take this up.

  _____  

From: community-bounces at lists.openmoko.org
[mailto:community-bounces at lists.openmoko.org] On Behalf Of saurabh gupta
Sent: Sunday, June 29, 2008 9:38 AM
To: List for Openmoko community discussion
Cc: community-repository at lists.openmoko.org.
Subject: GSoC Project Status Update 04: Speech Recognition in Openmoko


Hello everyone,

    I finally got my Neo Freerunner on Friday and spent some time
playing with it :). Here is this week's status update. I have finished the
codebook design code, which uses vector quantization, and the code is now
in its testing phase. I recorded several samples of words such as "hello"
as .wav files and then used Scilab to convert them into text files
containing arrays of numbers. The most challenging part of testing is
scaling the values properly and testing each subroutine separately in
fixed-point notation. I also made an important change to the fixed-point
format: I now use 8:8 notation instead of 16:16, as suggested by Erwin
Lewin. This meant keeping track of the ranges of all data types used in the
various subroutines, which was also interesting. Checking each subroutine
separately, I found that most give correct results, but some still need to
be modified to handle underflow and overflow.
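
The 8:8 format and the overflow/underflow issue mentioned above can be
illustrated with a minimal sketch. It is written in Python for readability
and is not the project's code: the helper names (to_q8_8, q_mul, and so on)
are hypothetical, and a signed 16-bit container with saturation on overflow
is an assumption.

```python
# Illustrative 8:8 fixed-point helpers (assumed: signed 16-bit storage,
# 8 integer bits and 8 fractional bits, saturating on overflow).
Q = 8                                   # number of fractional bits
ONE = 1 << Q                            # the value 1.0 in 8:8 format
INT16_MIN, INT16_MAX = -32768, 32767    # range of the assumed container


def saturate(raw):
    """Clamp a raw integer to 16 bits instead of letting it wrap around."""
    return max(INT16_MIN, min(INT16_MAX, raw))


def to_q8_8(value):
    """Convert a real number to its raw 8:8 representation."""
    return saturate(int(round(value * ONE)))


def from_q8_8(raw):
    """Convert a raw 8:8 value back to a float."""
    return raw / ONE


def q_add(a, b):
    """Add two 8:8 values, saturating on overflow."""
    return saturate(a + b)


def q_mul(a, b):
    """Multiply two 8:8 values.

    The raw product carries 16 fractional bits, so half an LSB is added
    for rounding before shifting back down to 8 fractional bits.
    """
    return saturate((a * b + (1 << (Q - 1))) >> Q)
```

In this sketch 8:8 covers roughly -128 to +128 in steps of 1/256, so a
product such as 100.0 * 2.0 saturates, and a value such as 0.001 underflows
to zero. That is why the range of every intermediate value has to be tracked.
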
    Besides this, I am modifying the noise rejection part, since poor noise
rejection can badly degrade performance. I will use the zero-crossing rate
and short-term energy for endpoint detection. My model will also use a
left-to-right HMM. However, training an HMM properly requires more than one
training sequence. This means that in speaker-dependent recognition, the
user has to utter each word two or three times during training so that the
HMM parameters are modeled properly. When more than one training sequence
is used, the Baum-Welch or segmental K-means method gives better estimates
of the HMM parameters.
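
Endpoint detection with zero-crossing rate and short-term energy can be
sketched as follows. This is only an illustration in Python, not the code
from the project: the function names, the frame size, and all thresholds
are hypothetical and would need tuning on real recordings.

```python
def split_frames(samples, size, hop):
    """Cut the signal into overlapping frames of `size` samples."""
    return [samples[i:i + size]
            for i in range(0, len(samples) - size + 1, hop)]


def short_term_energy(frame):
    """Sum of squared sample values in one frame."""
    return sum(s * s for s in frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:])
                    if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def detect_endpoints(samples, size=80, hop=40,
                     energy_high=1e6, energy_low=1e5, zcr_min=0.25):
    """Return (start, end) sample indices of the utterance, or None.

    A frame counts as speech if its energy is clearly high, or if its
    energy is moderate and its zero-crossing rate is high (a rough cue
    for weak unvoiced sounds). All thresholds are placeholder values.
    """
    active = []
    for index, frame in enumerate(split_frames(samples, size, hop)):
        energy = short_term_energy(frame)
        if energy > energy_high or (
                energy > energy_low
                and zero_crossing_rate(frame) > zcr_min):
            active.append(index)
    if not active:
        return None
    return active[0] * hop, active[-1] * hop + size
```

The returned range is aligned to frame boundaries, so it may include a
little silence on either side of the word.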

Next To Do:
1) Port the whole code to the Openmoko platform
2) Test with the real ADC channel of the Freerunner
3) Properly test noise handling and recognition on the Freerunner


-- 
Saurabh Gupta
Electronics and Communication Engg.
NSIT,New Delhi, India
I blog here: http://saurabh1403.wordpress.com

