Illume keyboard dictionary sorting and normalization

Olof Sjobergh olofsj at gmail.com
Tue Jan 6 11:49:55 CET 2009


Hi,

I'm working on a Swedish dictionary and keyboard for Illume, but I'm
having trouble sorting UTF-8 characters in the dictionary. Looking at
the code, Illume sorts the dictionary after first normalizing the
strings according to its internal normalization table. Is there any way
to reproduce this sorting with the sort command? I've tried a few
different locales (C, en_US.utf8), each of which makes the Unix sort
command order the words differently, but no matter what I try, words
don't show up correctly.
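
To make the question concrete, here is a minimal Python sketch of
"normalize first, then sort". It is not Illume's code: the normalize()
function below strips accents via Unicode decomposition as a stand-in
for Illume's internal table, which may map characters differently, and
the tie-break on the original word is my own assumption.

```python
import unicodedata


def normalize(word):
    """Stand-in for Illume's normalization table (an assumption):
    lowercase the word and strip combining accents, so that
    'här' and 'hår' both become 'har'."""
    decomposed = unicodedata.normalize("NFD", word.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def sort_dictionary(words):
    """Sort by the normalized form; ties are broken by the original
    word so the result is deterministic."""
    return sorted(words, key=lambda w: (normalize(w), w))


if __name__ == "__main__":
    for word in sort_dictionary(["hår", "bil", "har", "här", "hat"]):
        print(word, "->", normalize(word))
```

A locale-based sort cannot be expected to give this order in general,
because it compares the original strings, not the normalized ones.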

Another issue I found is that the built-in normalization table is not
very good for typing Swedish text. On a standard Swedish qwerty
layout, we have three additional letters (å, ä and ö). These are used
very frequently in Swedish, and there are many common words that have
different meanings if spelled with a, å or ä (for example har, här and
hår are all very common words). But in Illume these are all normalized
to a. Writing Swedish with a US qwerty layout and then having to
choose between a, å and ä manually after the dictionary lookup is a
pain, since many common words have to be picked from the lookup list
each time.

Instead, what you want is a Swedish qwerty layout (which is very
simple to implement as a .kbd file) and a dictionary lookup that does
not normalize å, ä and ö for the Swedish dictionary. So the
normalization table would really need to be configurable, either as
part of the dictionary or of the .kbd file. I suppose this problem
exists for other languages as well. If I were to work on such a
change, what would be the best approach?
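
As a rough illustration of what I mean by a configurable table, here is
a small Python sketch. Everything in it is hypothetical: DEFAULT_TABLE
is a tiny made-up excerpt, not Illume's real table, and make_normalizer
and its keep parameter are names I invented. Where the per-language
setting would actually live (dictionary or .kbd file) is exactly the
open question.

```python
import unicodedata

# Hypothetical excerpt of a default normalization table (not Illume's).
DEFAULT_TABLE = {
    "\u00e5": "a",  # å
    "\u00e4": "a",  # ä
    "\u00f6": "o",  # ö
    "\u00e9": "e",  # é
}


def make_normalizer(table, keep=""):
    """Build a normalizer from `table`, leaving every character listed
    in `keep` untouched (for example the Swedish letters å, ä and ö)."""
    keep = unicodedata.normalize("NFC", keep)
    effective = {src: dst for src, dst in table.items() if src not in keep}

    def normalize(word):
        # Compose first so accented letters are single code points.
        composed = unicodedata.normalize("NFC", word.lower())
        return "".join(effective.get(ch, ch) for ch in composed)

    return normalize


default_norm = make_normalizer(DEFAULT_TABLE)
swedish_norm = make_normalizer(DEFAULT_TABLE, keep="\u00e5\u00e4\u00f6")

if __name__ == "__main__":
    for word in ["har", "h\u00e4r", "h\u00e5r", "id\u00e9"]:
        print(word, default_norm(word), swedish_norm(word))
```

With the Swedish setting, har, här and hår stay distinct in the lookup,
while other accented letters such as é are still folded.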

Best regards,

Olof Sjobergh



