speech -> text on FR?
saurabhgupta1403 at gmail.com
Mon Jun 16 20:32:38 CEST 2008
On Mon, Jun 16, 2008 at 6:04 AM, Dan Staley <dlstal2 at uky.edu> wrote:
> I actually just interfaced with the Sphinx project at one of the
> research positions I hold. It is a very well written interface for the
> most part, although a few things were poorly documented and/or
> implemented. I found the Java version of the project (Sphinx-4,
> http://cmusphinx.sourceforge.net/sphinx4/ ) pretty easy to build and
> interface with.
It's great, Dan, that you got the Sphinx packages working. I tried them but
ran into some errors. Lately I have been concentrating on understanding some
of their libraries and trying to write my own optimized code. I will
definitely ping you if I need any help.
> The benefit of using the HMMs, models, and methods that Sphinx
> implements is that anyone should be able to specify, in their program, a
> grammar (similar to a simplified regex) that they want recognized, and
> the interpreter should then be user independent, meaning anyone can
> speak the phrase into the phone and get the desired output. Speech
> training wouldn't be required. I found that once you set it up
> correctly, the Sphinx engine is very powerful and usually identifies
> the spoken words no matter who says them (we found it even seemed to
> work decently well with a variety of different accents).
This is good, and in fact I will also try to implement this in my model. I
will obtain the HMMs for words by training them on recordings from different
speakers. I have covered this in my design document.
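
As a rough illustration of the grammar idea Dan describes: Sphinx-4 reads
grammars in JSGF format, and the plain-Python sketch below only mimics what a
tiny command grammar accepts. It does not call Sphinx at all. The GRAMMAR
contents and the names expand and accepts are made up for this example, and
the phrase is assumed to be already decoded to text by the recognizer.

```python
from itertools import product

# Hypothetical command grammar: one tuple of alternatives per slot.
# Roughly what the JSGF rule (call | dial) (home | office | voicemail) says.
GRAMMAR = [
    ("call", "dial"),
    ("home", "office", "voicemail"),
]


def expand(grammar):
    """Return every phrase the grammar accepts, as a set of strings."""
    return {" ".join(words) for words in product(*grammar)}


def accepts(grammar, phrase):
    """True if an already-decoded phrase belongs to the grammar."""
    return phrase.lower().strip() in expand(grammar)


if __name__ == "__main__":
    print(sorted(expand(GRAMMAR)))
    print(accepts(GRAMMAR, "Dial office"))
    print(accepts(GRAMMAR, "call pizza"))
```

Restricting the recognizer to a small set of phrases like this is what makes
speaker-independent recognition practical without per-user training.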
Thanks in advance...
> -Dan Staley
> On Sun, 2008-06-15 at 19:07 -0400, Ajit Natarajan wrote:
> > Hello,
> > I know nothing about speech recognition, so if the following won't work,
> > please let me know (gently :) ).
> > I understand that there is a project called Sphinx in CMU which attempts
> > speech recognition. It seems pretty complex. I couldn't get it to work
> > on my Linux desktop. I'm not sure if it would work on an FR since it
> > may need a lot of CPU horsepower and memory.
> > I see a speech project on the OM projects page. To me, it seems like
> > the project is attempting command recognition, e.g., voice dialing.
> > However, it would be great if the FR can function as a rudimentary
> > dictation machine, i.e., allow the user to speak and convert to text.
> > Perhaps the following may work.
> > 1. Ask the user to speak some standard words. Record the speech and
> >    establish the mapping from the words to the corresponding speech.
> >    It may even be good to maintain separate databases for different
> >    purposes, e.g., one for UNIX command lines, one for emails, and a
> >    third for technical documents.
> > 2. The speech recognizer then functions similar to a keyboard in that it
> >    converts speech to text which it then enters into the application
> >    that has focus.
> > 3. The user must speak word by word. The speech recognizer finds the
> >    closest match for the speech by checking against the recordings made
> >    in step 1 (and step 4). The user may need to set the database from
> >    which the match must be made.
> > 4. If there is no close match, or if the user is unhappy with the
> >    selection made in step 3, the user can type in the correct word. A
> >    new record can be added to the appropriate database.
> > The process may be frustrating for the user at first, but over time, the
> > speech recognition should become better and better.
> > The separate databases may be needed, for example, because the word
> > period should usually translate to the symbol `.' except when writing
> > about time periods when it should translate to the word `period'.
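
The separate-database idea can be sketched as one vocabulary per purpose,
each mapping a spoken word to the text that should be typed. This is only an
illustration in Python: the database names, their entries, and the function
to_text are assumptions made up here, not part of any existing project.

```python
# Hypothetical per-purpose vocabularies: spoken word -> text to type.
VOCABULARIES = {
    "email": {"period": ".", "comma": ","},
    "time_periods": {"period": "period", "comma": ","},
}


def to_text(spoken_words, database):
    """Translate recognized words using the currently selected database."""
    vocab = VOCABULARIES[database]
    # Words without a special mapping are typed exactly as spoken.
    return [vocab.get(word, word) for word in spoken_words]


if __name__ == "__main__":
    words = ["the", "last", "period"]
    print(to_text(words, "email"))
    print(to_text(words, "time_periods"))
```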
> > I do not know what the storage requirements would be to maintain this
> > database. I do not know if the closest match algorithm in step 3 is
> > even possible. But if we could get a good dictation engine, that would
> > be a killer app, in my opinion. No more typing! No more carpal tunnel
> > injuries. No more having to worry about small on screen keyboards that
> > challenge finger typing.
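
On whether the closest match in step 3 is possible: one classic technique for
matching isolated words against recorded templates is dynamic time warping
(DTW), which tolerates the same word being spoken faster or slower. Note that
this is a different approach from the HMMs Sphinx uses, and it is shown only
as a sketch. A real system would compare sequences of feature vectors
extracted from the audio; here each frame is a single number to keep things
short. The function names, the toy templates, and the max_distance threshold
are all assumptions for illustration.

```python
INF = float("inf")


def dtw_distance(a, b):
    """Dynamic time warping distance between two 1-D feature sequences."""
    n, m = len(a), len(b)
    # cost[i][j] = cheapest alignment of a[:i] with b[:j]
    cost = [[INF] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # a advances, b repeats a frame
                cost[i][j - 1],      # b advances, a repeats a frame
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


def closest_word(utterance, templates, max_distance):
    """Step 3: best-matching word, or None if nothing is close enough."""
    best_word, best_dist = None, INF
    for word, recordings in templates.items():
        for recording in recordings:
            dist = dtw_distance(utterance, recording)
            if dist < best_dist:
                best_word, best_dist = word, dist
    return best_word if best_dist <= max_distance else None


def learn(templates, word, recording):
    """Step 4: store a corrected recording under the word the user typed."""
    templates.setdefault(word, []).append(list(recording))


if __name__ == "__main__":
    # Toy templates standing in for the recordings made in step 1.
    templates = {"period": [[1, 3, 4, 3, 1]], "comma": [[5, 5, 2, 2]]}
    slow_period = [1, 1, 3, 4, 4, 3, 1]  # same shape, spoken more slowly
    print(closest_word(slow_period, templates, max_distance=2.0))
    print(closest_word([9, 9, 9, 9], templates, max_distance=2.0))
```

Storage and CPU cost grow with the number of recordings, since every
utterance is compared against every template, so this kind of matching suits
a small vocabulary better than open dictation.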
> > Thanks.
> > Ajit
> Openmoko community mailing list
> community at lists.openmoko.org
Electronics and Communication Engg.