<br><br><div><span class="gmail_quote">On 11/28/06, <b class="gmail_sendername">Joel Newkirk</b> &lt;<a href="mailto:moko@newkirk.us">moko@newkirk.us</a>&gt; wrote:</span><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
Robert Michel wrote:<br>> Salve Richard!<br><br>>> I like your ideas here,<br>> ;)<br>><br>>> it definitely looks feasible to support a<br>>> small subset of voice-commands.. "Yes/No/Again/Next" could be
<br>>> standardised and available to all applications.<br>><br>> AFAIK a small number of different voice commands will have a good<br>> recognition rate.<br>><br><br>IBM released a modified multimodal Opera web browser for the older-style
<br>Zaurus (Embedix Linux) that supports voice interaction tags - "WebSphere<br>Everyplace Multimodal Environment". I've played around with it, and it<br>works pretty well.<br><br>By using XML (XHTML plus VoiceXML, actually) and defining limited-domain
<br>voice tags within a document, it can distinguish spoken numbers, names,<br>pizza toppings, etc. without training. The engine should be able to<br>handle a screenful of 9-16 icons by name plus basic menus, for example.<br>
As long as each item consists of a distinct series of phonemes, it's<br>smooth. (It doesn't need to hear the difference between 'whiter' and<br>'writer' - it's not speech-to-text.)<br><br>I for one find 'voice tags' on my cells to have been irritating, but
<br>have always wanted to be able to just recite a number and store or dial,<br>or fire up the calculator and run some calculations, without pressing<br>buttons or navigating menus. Between that and FLite (Festival Lite<br>
speech synthesis engine, available for the Zaurus and various ARM-Linux<br>distros), you have the underpinnings of some very interesting possibilities.<br><br>j</blockquote><div><br>Hey! And OpenMoko is supposed to be built compatible with Zaurus apps, right? So we're halfway there
<br><br>--Jeff<br></div><br></div><br>
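[Editor's note] The quoted post describes defining limited-domain voice tags by mixing XHTML with VoiceXML. A minimal sketch of what such a page can look like, assuming the XHTML+Voice (X+V) profile that IBM's multimodal browsers implemented; the specific form, field, and grammar below are illustrative assumptions, not taken from the original browser:

```html
<!-- Illustrative X+V fragment. Element names follow the XHTML+Voice
     profile; the form id, field name, and grammar contents are
     assumptions made up for this example. -->
<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:vxml="http://www.w3.org/2001/vxml"
      xmlns:ev="http://www.w3.org/2001/xml-events">
  <head>
    <title>Pizza toppings</title>
    <!-- A small fixed vocabulary: the recognizer only has to pick one
         of a handful of phoneme sequences - no speech-to-text, no
         per-user training. -->
    <vxml:form id="topping-form">
      <vxml:field name="topping">
        <vxml:prompt>Which topping?</vxml:prompt>
        <vxml:grammar>
          #JSGF V1.0;
          grammar toppings;
          public <topping> = pepperoni | mushrooms | onions | olives;
        </vxml:grammar>
      </vxml:field>
    </vxml:form>
  </head>
  <!-- Activate the voice dialog when the visual page loads -->
  <body ev:event="load" ev:handler="#topping-form">
    <p>Say a topping, or pick one below.</p>
  </body>
</html>
```

The same pattern scales to the "screenful of 9-16 icons by name" case: one grammar rule per icon name, activated while that screen is visible.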