ASU keyboards, again

"Marco Trevisan (Treviño)" mail at 3v1n0.net
Wed Aug 27 23:12:59 CEST 2008


Carsten Haitzler (The Rasterman) wrote:
> On Wed, 27 Aug 2008 16:00:50 +0200 "Marco Trevisan (Treviño)" <mail at 3v1n0.net>
> babbled:
>> Well, generally for small words there's a correction list, but it's not 
>> always complete and often there are words very different from the one 
>> I'd like to write, but not that one. So maybe it doesn't search in all 
>> the dictionary. I could I try that?
>> However my fingers are not so great...
>> If you want I can send you my dictionaries, so you'll be able to test 
>> them in a better way.
> 
> hmm. is this english?i am wondering if non-ascii chars are messing it up or
> not. your dictionary may be useful - i have just been going off my 98,000 or so
> entry dict from /usr/share/dict/words which seems to be big enough for me it
> seems and has pretty much everything in it... for english anyway. as its used
> for spellchecking i kind of assumed it'd be good enough for typing up sms's and
> emails :) at least in my tests it is listing all the completions i'd expect it
> to. did you sort -f the illume dict? (non-case-sensitive sort)?

Yes, it's sorted, and it's an Italian dictionary (so only a few non-ASCII
chars); that's why it has so many words. Consider that an Italian
dictionary has about 120000 words to be inflected.
So from a verb in the infinitive form I can derive about 50 different
words, and from nouns and adjectives about 3 each.
But here (as in most of the common Western languages), in most cases
only the suffix differs.

Imho, a way to reduce the size would be to allow rules that attach a
list of suffixes (and prefixes, for compound words) to a common stem.
So, for example, in my dictionary instead of using 50 lines for each
verb I would use only one line per verb; i.e.:

Italian verb "parlare" (to talk) would be (not complete)
  parl{o,i,a,iamo,ate,ano,avo,avi,ava,avamo,avate,avano,ai,asti,ò,ammo, \
       aste,arono,erò,erai,erà,eremo,erete,eranno,erei,eresti,erebbe, \
       eremmo,ereste,erebbero,ii,iamo,iate,ino,assi,asse,assimo, \
       assero,ino,ando,ante,ato,ata,ati}

Italian noun "casa" (house) would be
  cas{a,e}

Italian adjective "libero" (free [as freedom]) would be
  liber{a,i,o}
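
To make the idea a bit more concrete, here is a minimal Python sketch of
how such rules could be expanded back into the flat, sorted word list.
This is only an illustration: the stem{suffixes} syntax is the one from
my examples above, the function names (expand_rule, expand_dictionary)
are made up, and none of this is how illume currently reads its
dictionary.

```python
import re

# Hypothetical rule syntax, taken from the examples above: stem{suf1,suf2,...}
RULE = re.compile(r"^(?P<stem>[^{}\s]*)\{(?P<suffixes>[^{}]*)\}$")


def expand_rule(line):
    """Expand one 'stem{a,b,c}' rule into full words; plain words pass through."""
    # Join continuation lines (trailing backslash), then drop all whitespace.
    text = re.sub(r"\\\s*\n", "", line)
    text = "".join(text.split())
    match = RULE.match(text)
    if match is None:
        # Not a rule: treat it as an ordinary one-word dictionary line.
        return [text] if text else []
    stem = match.group("stem")
    words = []
    for suffix in match.group("suffixes").split(","):
        word = stem + suffix  # an empty suffix yields the bare stem
        if word not in words:  # skip suffixes listed twice
            words.append(word)
    return words


def expand_dictionary(lines):
    """Expand every rule and return a case-insensitively sorted word list."""
    words = set()
    for line in lines:
        words.update(expand_rule(line))
    # Roughly what `sort -f` does; ties are broken by the exact spelling.
    return sorted(words, key=lambda w: (w.casefold(), w))


if __name__ == "__main__":
    for word in expand_dictionary(["liber{a,i,o}", "cas{a,e}"]):
        print(word)
```

The compact file would be what gets shipped; whether the expansion is
done once at load time or lazily during lookup is a separate choice that
this sketch doesn't try to answer.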


BTW I don't know if this would improve the keyboard's typo fixing
(maybe yes, if the suffixes [and prefixes?] are sorted too).

Anyway, let me know if I should send you the dict I have.

> illume's dict is 6mb? hmm i guess the raw text there has a lot of redundancy :)
Yes, and this happens because of the things shown above. And I've done
only part of the work; I guess the final dictionary will be double
this size. And it won't contain any proper names (city names, acronyms...).

The standard Italian Linux dictionary (/usr/share/dict/italian) "weighs"
1.2 MB, but it's quite incomplete.

> i tried to keep the dictionary simple in illume but am always willing to look
> at other ways to improve it. though the keyboard is not really a focus of mine
> - it's something along the way so there may come a time when i go "well- you
> want it better.. please.. send a patch!"... but its fresh on my plate now, so
> it's active :)

And this is a great thing, since without a great virtual keyboard
(like the one you're doing) this phone won't be as usable/cool as it
should be. Imho this is the killer tool of illume.

>> Another thing I'd like to suggest you is that imho the backspace/space 
>> right-left/left-right dragging is too long. If you try writing using 
>> your thumbs you can notice that is hard deleting a word... Imho they 
>> should be more sensible.
> 
> from illume's TODO file (in svn):
> 
> * kbd needs drag for backspace/next word etc. to be shorter
> 
> :) already there. :) well - as with accent normalising - there is a marker that
> i realise something needs to be done.

Nice! :P

-- 
Treviño's World - Life and Linux
http://www.3v1n0.net/




