GSOC and Accelerometer Gestures Idea

Paul-Valentin Borza paulvalentin at borza.ro
Tue Mar 25 18:58:01 CET 2008


Hi,

I have registered for GSoC today. I'll only have time to submit the student
application tomorrow evening, as Tuesday and Wednesday are busy school days.
I hope that's early enough for you to look over the application and send
back comments on how to refine it. I'm glad that someone is interested in my
solution. Thanks.

You're definitely right about the sources that trigger the accelerometer.
It's pointless to keep the accelerometer running in the background; besides,
the recognition algorithm would have to run constantly, and it's
computationally intensive enough to drain the battery quickly. The incoming
call is the trigger in this case.
However, we need online recognition, because the other two (isolated and
connected) work in offline mode, meaning the procedure takes an array of
observations of fixed size. Once the trigger has activated the
accelerometer, online recognition kicks in and starts recognizing. It
eventually stops after the online recognizer has detected an answer or
ignore gesture. Online recognition doesn't need to know when to stop; with
the answer gesture, there is no stop trigger.
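Roughly, that trigger-driven loop has this shape (a C++ sketch with illustrative stubs, not my actual implementation; the real classifier runs the HMM forward scores over the growing observation sequence, and the made-up 2.0 threshold just stands in for a gesture decision):

```cpp
#include <deque>

// Trigger-driven online recognition: sampling starts on an external
// trigger (the incoming call) and stops as soon as one of the terminal
// gestures is detected. All names here are illustrative stubs.
enum class Gesture { None, Answer, Ignore };

// Stub classifier: in the real system this would evaluate HMM scores
// over the observation sequence seen so far. Here we just flag
// "Answer" once a sample exceeds a made-up threshold.
Gesture classify(const std::deque<double>& window) {
    if (!window.empty() && window.back() > 2.0) return Gesture::Answer;
    return Gesture::None;
}

// Consume samples one by one until a terminal gesture appears; the
// caller would feed real accelerometer readings here. Note there is
// no end-of-gesture trigger: the recognizer decides when to stop.
Gesture recognize_online(const double* samples, int n) {
    std::deque<double> window;
    for (int t = 0; t < n; ++t) {
        window.push_back(samples[t]);
        Gesture g = classify(window);
        if (g != Gesture::None) return g;
    }
    return Gesture::None;
}
```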

Here's something about the accuracy of my HMMs. I'm using continuous density
HMMs because the signal is continuous (±3 g). I could also use codebooks to
quantize the vector, but that would seriously degrade the recognition
accuracy. Because my observation vector has length 3 (X, Y, Z), I have to
use a finite mixture that models the vector against a Gaussian probability
density function. I've chosen the trivariate normal distribution (
http://mathworld.wolfram.com/TrivariateNormalDistribution.html).
To ease my work, I'm generating MATLAB files from my C++ program to plot the
parameters of the HMM. Because the trivariate density has 3 inputs plus 1
output, I can't even plot it on the screen and had to decompose it into 3
univariate normal distribution diagrams.
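For one 3-axis sample, evaluating a single Gaussian component comes down to the multivariate normal pdf. A minimal C++ sketch (I assume a diagonal covariance here just to keep it short; the full trivariate density uses a 3x3 covariance matrix):

```cpp
#include <cmath>

// Emission density of one 3-axis accelerometer sample (ax, ay, az)
// under a single Gaussian component. Diagonal covariance is assumed
// for brevity; the full trivariate normal uses a 3x3 covariance.
double emission_density(const double x[3],      // observation
                        const double mean[3],   // component mean
                        const double var[3])    // per-axis variance
{
    const double two_pi = 6.283185307179586;
    double p = 1.0;
    for (int i = 0; i < 3; ++i) {
        double d = x[i] - mean[i];
        p *= std::exp(-0.5 * d * d / var[i]) / std::sqrt(two_pi * var[i]);
    }
    return p;
}
```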

HMMs are mainly used in speech recognition, with more than 20,000 models
(words), and they've proven to be great at that job.
However, my current solution is not that great yet, because I haven't
implemented all the "tweaks" that make them so good.
For example, there is so-called "state tying", where certain parameters
are re-estimated in a consistent way. I've designed an HMM with 27 states
(3 to the power of 3): each axis can be in a negative, null or positive
sub-state. There is really too much to discuss here, and I don't think I can
in one email :)
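To make the 27-state layout concrete, here's how a sample could be mapped to one of the 3^3 sub-state combinations (the 0.5 g dead zone is just an illustrative value, not the one I actually use):

```cpp
// Map a 3-axis sample to one of 27 = 3^3 states: each axis is
// classified as negative (-1), null (0) or positive (+1), with a
// dead zone around zero. The 0.5 g threshold is illustrative only.
int axis_substate(double a, double dead_zone = 0.5) {
    if (a > dead_zone)  return  1;   // positive
    if (a < -dead_zone) return -1;   // negative
    return 0;                        // null
}

int state_index(double ax, double ay, double az) {
    // Base-3 encoding: shift each sub-state from {-1,0,1} to {0,1,2}.
    return (axis_substate(ax) + 1) * 9
         + (axis_substate(ay) + 1) * 3
         + (axis_substate(az) + 1);
}
```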
Another optimization is "state duration" modeling, meaning that a transition
to another state can only be made after the appropriate number of
observations has occurred in the current state.
I'll have to implement these.

I'm currently testing in small groups. I've tested digits 0-4 and saw that
2 and 3 weren't recognized very well; 2 was often recognized as 3. 2 is made
of half a cycle and one horizontal line, while 3 is made of two half cycles.
This can be corrected by manually re-estimating the parameters. The
algorithms behind HMMs are not that hard to understand; the most important
aspect is how to compute the parameters and re-estimate them. I can correct
3 by giving the state in which the Wii/Neo is accelerating on +Z (the
horizontal line) probability 0, thereby decreasing the probability of model
3 when it encounters that state (the straight line). In the end, it's all
about the probabilities of the parameters inside each HMM.
Remember that you shouldn't train each model (gesture) individually: you
want to increase the probability for one gesture and decrease the
probability of the same observation sequence for the others.

I'll try to write the GSoC application as soon as possible.

Thanks,
Paul-Valentin Borza

On Mon, Mar 24, 2008 at 10:42 PM, Daniel Willmann <daniel at openmoko.org>
wrote:

> Hello,
>
> On Wed, 19 Mar 2008 21:49:59 +0200
> "Paul-Valentin Borza" <paulvalentin at borza.ro> wrote:
>
> > My name is Paul-Valentin Borza (http://www.borza.ro) and I'm working
> > on my Bachelor of Computer Science Thesis – Motion Gestures: An
> > approach with continuous density hidden Markov models. I've designed
> > and implemented in c++ continuous density hidden Markov models for
> > the data measured by a 3-Axis ±3G accelerometer (the Nintendo Wii
> > Remote over Bluetooth).
> >
> > There are several alternatives to motion (accelerometer) gestures
> > like:
> >
> [...]
>
> it certainly looks like you did your homework already. :-)
>
> > Once the models are created and trained, there are three types of
> > recognitions:
> >
> > Isolated recognition (the user presses a button, makes a gesture,
> > releases the button and the gesture is recognized) – Viterbi Beam
> > Search
>
> This is definitely interesting, could also be triggered through an
> event other than button press, i.e. incoming call
>
> > Connected recognition (the user presses a button, makes several
> > gestures, releases the button and those several gestures are
> > recognized) – 2-Level
>
> Don't know about the usefulness of this.
>
> > Online recognition (the user just makes a gesture as the
> > accelerometer is monitored constantly and the gestures are recognized
> > on the fly) – this is the one that should be used on mobile devices
>
> That would be the coolest, but I see several problems, especially
> falsely detecting gestures while you are moving around and battery
> lifetime. As long as you monitor for gestures the CPU cannot go into
> suspend which will dramatically reduce battery lifetime. Maybe we can
> achieve almost the same level of seamless recognition by choosing our
> trigger sources wisely.
>
> > I'll stop here with the theory. If someone needs further
> > clarification, please ask.
> >
> > I believe I have the required skills to build the Accelerometer
> > Gestures Idea for the Google Summer of Code.
> >
> > Gestures will be an innovation in mobile phones. Just imagine the
> > scenario where your phone is on the table and it's ringing... You
> > pick the phone, see who's calling and you take your phone to your ear
> > to talk (the phone answers the call automatically). The answer mobile
> > gesture is exactly this: you take your phone to your ear. Remember
> > that this is an involuntary action that you always perform. Pressing
> > the green answer button will no longer be needed.
>
> Really cool!
>
> > It's almost like the phone is reading your mind!
> >
> > Plus, the user can create his/her own custom gestures.
>
> That's definitely a must.
>
> > I already have a working solution for the Nintendo Wii Remote as
> > described earlier and it should be easy to port on OpenMoko.
>
> Cool, how accurate is your detection? Could you, say, hold the Wii/Neo
> as a pen and "write" stuff which will then be digitized?
>
> > What do you think?
>
> By all means please submit your application to GSoC. The earlier you
> publish your application and timeline the more time we have for giving
> feedback in order to refine the application.
>
> Regards,
> Daniel Willmann
>