[SHR] illume predictive keyboard is too slow

Carsten Haitzler (The Rasterman) raster at rasterman.com
Fri Jan 30 04:15:38 CET 2009

On Thu, 29 Jan 2009 12:19:38 +0100 "arne anka" <openmoko at ginguppin.de> said:

> > This dictionary would have hundreds of millions of rows even if you take
> > only reasonable user inputs.
> why would that be? colloquial language (and that's what is to be
> considered) contains only several thousand words, still a lot but far
> away from millions.
> > But what to do if the user inputs something
> > that's not in the dictionary?
> but that's a problem with every dictionary -- you never can contain every
> possible word.
> i don't use the keyboard and i do not follow the discussion closely, but
> what always struck me as odd was the use of a text file.
> why not use a db? it would enable learning, too.

sheer simplicity and dependencies. a db would mean selecting one. gdbm is gpl.
libdb is fine - but they love to break the db format every few releases and
that'd royally suck. also these lean toward key/value pairs - and that means you
need to GENERATE all possible permutations (which is prohibitively expensive),
whereas here the dict also affects the lookup, as you simply avoid generating
permutations you know will never have any matches (ie nothing starts with qz...
so never worry about all the qz* permutations). the best suggestion is a trie -
but i need a format i can access really quickly - and a library that isn't
license or otherwise restricted, is easy to use, doesn't eat much ram at all,
and is fast.
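
to make the qz* point concrete, here is a minimal sketch of prefix pruning in
python. it is not the illume code, just an illustration: the names has_prefix,
has_word and candidates are made up for this example, and it assumes the dict
is a sorted in-memory list of lowercase words and that each keypress is given
as the set of letters it might have meant.

```python
from bisect import bisect_left


def has_prefix(sorted_words, prefix):
    """True if at least one word in the sorted list starts with prefix."""
    i = bisect_left(sorted_words, prefix)
    return i < len(sorted_words) and sorted_words[i].startswith(prefix)


def has_word(sorted_words, word):
    """True if word itself is in the sorted list."""
    i = bisect_left(sorted_words, word)
    return i < len(sorted_words) and sorted_words[i] == word


def candidates(sorted_words, taps):
    """taps is a list of strings; each string holds the letters one
    keypress might have meant (hypothetical input format)."""
    prefixes = [""]
    for letters in taps:
        # extend only the prefixes that some dictionary word still begins
        # with; dead ends such as "qz" are dropped here and never expanded
        prefixes = [p + ch for p in prefixes for ch in letters
                    if has_prefix(sorted_words, p + ch)]
    # of the surviving strings, keep those that are complete words
    return [p for p in prefixes if has_word(sorted_words, p)]
```

with a dict of bat, car, cat and quiz, the taps "cv", "as", "tr" only ever
expand the "c" and "ca" branches; everything starting with "v" or "cs" is
thrown away after a single lookup instead of being generated in full.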

invariably you never get that - it either eats ram or it is slow, or something
else. so what i did is just use a simple format that is easy to generate with a
small one-line shell command, and index it on the fly for quick lookups in a
tiny 2 level index. it of course is not incredibly fast - but it uses a tiny
amount of precious ram.
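
as a rough illustration of the idea (not the actual illume code or its file
format), a 2 level index over a sorted text dict could look like the python
sketch below. the assumptions here are mine: one word per line, lines sorted in
byte order, and the two levels being the first and second letter of the word.
build_index and lookup are hypothetical names.

```python
def build_index(data):
    """Map first letter -> second letter -> byte offset of the first line
    starting with that pair. data is the raw content of the dict file."""
    index = {}
    pos = 0
    for line in data.splitlines(keepends=True):
        word = line.strip()
        if len(word) >= 2:
            # setdefault keeps the first offset seen for each letter pair
            index.setdefault(word[0:1], {}).setdefault(word[1:2], pos)
        pos += len(line)
    return index


def lookup(data, index, prefix):
    """Return the words starting with prefix (2 bytes or more) by jumping
    to the indexed offset and scanning only that bucket."""
    pos = index.get(prefix[0:1], {}).get(prefix[1:2])
    if pos is None:
        return []  # no word starts with this pair, nothing to scan
    found = []
    while pos < len(data):
        end = data.find(b"\n", pos)
        if end == -1:
            end = len(data)
        word = data[pos:end]
        if word[:2] != prefix[:2]:
            break  # walked past the end of the bucket
        if word.startswith(prefix):
            found.append(word.decode())
        pos = end + 1
    return found
```

the index only holds one offset per letter pair, so it stays tiny no matter how
big the dict gets; the price is a linear scan inside each bucket, which is why
this kind of scheme is cheap on ram but not incredibly fast.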

making it a text file opens the gate to easy generation of new dicts - and i
wanted to keep that as easy as possible.

------------- Codito, ergo sum - "I code, therefore I am" --------------
The Rasterman (Carsten Haitzler)    raster at rasterman.com
