[SHR] illume predictive keyboard is too slow

Carsten Haitzler (The Rasterman) raster at rasterman.com
Fri Jan 30 04:10:39 CET 2009


On Thu, 29 Jan 2009 14:32:48 +0100 Helge Hafting <helge.hafting at hist.no> said:

> Carsten Haitzler (The Rasterman) wrote:
> 
> > i was hoping to be able to keep a SIMPLE ascii qwerty keyboard for as much as
> > possible - so you can just type and it will work and offer the selections as
> > it's trying to guess anyway - it can present the multiple accented versions
> > too. this limits the need for special keyboards - doesn't obviate it, but
> > allows more functionality out of the box. in the event users explicitly select
> > an accented char - ie a non-ascii character, it should not "decimate". it
> > should try match exactly that char.
> > 
> We will still need to select the correct dictionary for the language 
> somewhere. It is no more work if this also selects a keyboard layout 
> adapted to that language.
> 
> I can see why you want a simple keyboard with fewer keys - the keys can 
> be bigger and so there will be fewer finger-misses. I don't see any 
> reason why it should be limited to ascii though - that division does not 
> seem natural to me.
> 
> An example from the Norwegian language: The letter ô is rarely used, and 
> everybody thinks about it as an "o" with a "hat" on it. So this one 
> fits your scheme - type "o" and "ô" will be suggested in the few cases 
> where it is appropriate.  But the three vowels "æøå" are different. They 
> are letters of their own, they are not seen as "modifications of a/o", 
> even if that may be historically correct. These three have their own 
> names and their own places in the alphabet (after z). An "å" is not 
> merely an "a with ring", no more than the "E" is an "F" with an extra 
> line attached. The "ø" is not merely an "o" with a slash either. Many 
> people don't know that "æ" originated as an "ae" ligature. "æ" and "ae" 
> can both occur in words, but the pronunciation is different and they are 
> not interchangeable.
> 
> So when Norwegians type, they expect to see the 29 letters of their 
> alphabet: abcdefghijklmnopqrstuvwxyzæøå. "ô" and "é" are sometimes 
> useful too, but these are just "o" and "e" with modifications. "æøå" 
> however, are parts of the base alphabet. Just like "abc". A keyboard 
> without "æøå" is assumed not to support Norwegian.
> 
> I hope things like this will be possible, if a new dictionary format is 
> realized. It is ok if typing "for" suggests "fôr" as an alternative, but 
> "før" should not come up unless the user types "f" "ø" "r". In which 
> case "o" must not be suggested...

ok - how do you romanise norwegian then? for example, in german ö -> oe, ü -> ue,
ß -> ss, etc. - there is a set of romanisation rules that can convert any such
char to 1 or more roman letters. i was hoping to be even more lenient with ö ->
o being valid too for the lazy :) japanese has romanisation rules - so does
chinese... norwegian must have them too (eg æ -> ae).
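
a minimal sketch of such rules in python, just to make the idea concrete. only
ö -> oe, ü -> ue, ß -> ss, æ -> ae and the lenient ö -> o come from the text
above; the lenient ü -> u entry and all the names (STRICT, LENIENT, romanise,
romanised_forms) are assumptions made up for illustration, not anything illume
does today.

```python
# Strict romanisation rules named in the message above.
STRICT = {"ö": "oe", "ü": "ue", "ß": "ss", "æ": "ae"}

# "Lazy" variants: ö -> o is from the message, ü -> u is an assumed extension.
LENIENT = {"ö": "o", "ü": "u"}


def romanise(word, rules):
    """Replace each char that has a rule; leave every other char untouched."""
    return "".join(rules.get(ch, ch) for ch in word)


def romanised_forms(word):
    """Return the set of plain-keyboard spellings that should match `word`."""
    strict = romanise(word, STRICT)
    # Lenient rules override the strict ones where both exist.
    lenient = romanise(word, {**STRICT, **LENIENT})
    return {strict, lenient}
```

with these tables, romanised_forms("schön") gives both "schoen" and "schon", so
either spelling typed on a plain qwerty layout could lead to the same word.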

if something can be romanised - it can have a romanised match in a dictionary
and thus suggest the appropriate matches. of course now the dictionary
determines these rules implicitly by content, not by code specifically
enforcing such rules. :)
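
to illustrate "rules by content, not by code", here is a rough python sketch.
the entry format (typed key, word to suggest) is hypothetical and is not the
real illume dict format; ENTRIES, build_index and suggest are made-up names.
the only logic in code is the one rule stated earlier in the thread: if the
user explicitly types a non-ascii char, match exactly that char.

```python
from collections import defaultdict

# Hypothetical dictionary content: (what the user types, word to suggest).
# Which keys lead to which words is decided here, in the data, not in code.
ENTRIES = [
    ("for", "for"),
    ("for", "fôr"),      # ô is offered when "o" is typed
    ("før", "før"),      # ø has no ascii key, so "for" never suggests it
    ("schoen", "schön"),
    ("schon", "schön"),  # lenient spelling for the lazy
    ("schon", "schon"),
]


def build_index(entries):
    """Group suggested words under the key the user would type."""
    index = defaultdict(list)
    for key, word in entries:
        index[key].append(word)
    return index


def suggest(typed, index):
    """Return suggestions for a fully typed word."""
    if typed.isascii():
        return list(index.get(typed, []))
    # An explicitly typed non-ascii char is matched exactly, never loosened.
    words = {w for ws in index.values() for w in ws}
    return [typed] if typed in words else []
```

with this content, typing "for" offers "for" and "fôr", while "før" only comes
up when "ø" is typed explicitly. prefix matching and frequency ordering are
left out to keep the sketch short.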

but yes - selecting dictionary is needed so selecting a keyboard for that
language as well as dictionary is useful. it still adds a few keys - thus
squashing the keyboard some more :( i was hoping to avoid that.

note - the keyboard is by no means limited to ascii at all - it's perfectly
able to have accented/other keys added to layouts - so i'm considering this
problem "solved" as it's simply a matter of everyone agreeing to make a .kbd for
their language - should they need one other than the default qwerty (ascii)
one. so from this point of view - that's solved. what isn't done yet is:

1. a kbd being able to hint at wanting a specific dictionary language (or
vice-versa).
2. dictionary itself being able to hint to have a specific kbd layout.
3. applications being able to hint for a specific language for input (and
thus dictionary and/or kbd).

so there needs to be a tie-in between language, dict and kbd - which one drives
what... is the question. it needs to not BREAK things like terminal kbd etc. -
ie i can stay with norwegian as my language but if i select the terminal kbd -
it will stay there and not suddenly flip back to the simple kbd layout.
number/symbol entry similarly. this bit of things is currently undefined and
unimplemented.
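
one possible shape for that tie-in, sketched in python. this assumes the
language drives both dict and kbd, which is only one answer to the open
question above, not a decision. the file names, LANGUAGE_DEFAULTS and the
InputState class are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical defaults: language -> (dictionary, keyboard layout).
LANGUAGE_DEFAULTS = {
    "en": ("English.dic", "Default.kbd"),
    "no": ("Norwegian.dic", "Norwegian.kbd"),
}


@dataclass
class InputState:
    language: str = "en"
    # Set only when the user explicitly picks a layout (terminal, numbers...).
    layout_override: Optional[str] = None

    def set_language(self, language):
        """Change language; an explicit layout choice is left alone."""
        self.language = language

    def select_layout(self, layout):
        """User explicitly picked a layout; it sticks until cleared."""
        self.layout_override = layout

    def clear_layout(self):
        """Go back to the layout the current language asks for."""
        self.layout_override = None

    def active(self):
        """Return the (dictionary, layout) pair currently in effect."""
        dictionary, layout = LANGUAGE_DEFAULTS[self.language]
        return dictionary, self.layout_override or layout
```

here norwegian stays the language (and dict) while the terminal kbd is
selected, and the layout does not flip back until the user clears it.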

the other is improved dictionary format. the problem is - if we go make the
dict smarter... how on earth do you GENERATE such a dictionary. i sure as hell
am not hand-writing a whole dictionary... and i doubt anyone here will - it
could be a large community effort to build a full one for each language - but
that will take time. you need to enter all words, all matches, conjugations,
and then frequency info too. the simple dict english can use is much easier -
it can be auto-generated from input text. just throw a (text version of a) book
- or newspaper or documentation - at it and it can just index every word it
finds and even count frequency usage. that makes it easy to automate the
production of such a dict (and that is why the dict is as it is now - sheer
simplicity).
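
the auto-generation described above can be sketched in a few lines of python.
the "word count" output format and the names build_dict and format_dict are
assumptions for illustration, not the format illume actually reads.

```python
import re
from collections import Counter


def build_dict(text):
    """Index every word found in `text` and count how often it is used."""
    # [^\W\d_]+ matches runs of letters, including non-ascii ones like ø.
    words = re.findall(r"[^\W\d_]+", text.lower())
    return Counter(words)


def format_dict(counts):
    """One "word count" line per word, most frequent first (assumed format)."""
    return "\n".join(f"{word} {n}" for word, n in counts.most_common())
```

feeding it the text of a book or a pile of documentation gives a usable
word-frequency list with no hand-editing, which is exactly why this kind of
dict is cheap to produce and a smarter one is not.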

-- 
------------- Codito, ergo sum - "I code, therefore I am" --------------
The Rasterman (Carsten Haitzler)    raster at rasterman.com




