[SHR] illume predictive keyboard is too slow

Laszlo KREKACS laszlo.krekacs.list at gmail.com
Wed Feb 4 16:37:56 CET 2009


Hi!

> ok - so if a young person typed:
> Öt szép szűz
> it'd be:
> Ot szep szuz

((btw, the meaning of "Öt szép szűz lány őrült írót nyúz" is
"Five virgins tire a crazy writer".
It is the Hungarian counterpart of "The quick brown fox jumps over the lazy dog"))


Yes, and in that specific case it works
(because none of the words above (Ot, szep, szuz) has a meaning in
Hungarian, so you can understand that example without
accents).

But there are other cases where it is not that clear:
ólt - pound (accusative)
ölt - he killed ...
olt - to graft

So when you see "olt" in the text you can't be sure whether it is "olt",
"ólt" or "ölt" without analysing the whole sentence.
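To make the collision concrete, here is a small sketch in Python. It
only uses the standard unicodedata module; the helper names
(strip_accents, collisions) are made up for this mail and are not part
of illume or of any keyboard code.

```python
import unicodedata
from collections import defaultdict


def strip_accents(text):
    """Drop combining marks, so 'ö', 'ó' and 'ő' all become plain 'o'."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def collisions(words):
    """Group words that become identical once the accents are removed."""
    groups = defaultdict(set)
    for word in words:
        word = unicodedata.normalize("NFC", word)
        groups[strip_accents(word)].add(word)
    # Keep only the unaccented forms that stand for more than one word.
    return {plain: sorted(forms) for plain, forms in groups.items()
            if len(forms) > 1}


print(strip_accents("Öt szép szűz"))
print(collisions(["olt", "ólt", "ölt", "szép"]))
```

The first call gives back the accent-free text from the quoted example,
and the second one reports that "olt" stands for three different words.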

The German example is a two-way conversion: ü - ue, ß - ss. You can
switch back and forth without losing information.

>> A simple word based dictionary is limited anyway for the hungarian
>> language, where you can create a word as long as this:
>> "elkelkáposztástalaníthatatlanságoskodásaitokért".
>
> ugh. so its like german. compound words get created a lot by just stringing
> multiple words together without a space. that's ok- as long as there arent a
> massive set of them... :)
>

But there are, because Hungarian is an "agglutinative" language.
Let me explain the difficulty a bit.

In German you can create the following word:
wood [en] - Holz [de] - fa [hu]
house [en] - Haus [de] - ház [hu]

wood house [en] - Holzhaus [de] - faház [hu]

So you glued house and wood together into one word.
(This is your example: stringing words together without a space.)

In German you can even create words from one verb plus a prefix, like:
to work [en], arbeiten [de], dolgoz [hu]
to work on [en], bearbeiten (be+arbeiten) [de], megdolgoz (meg+dolgoz) [hu]

It is the same process ;) There are many examples of this:
to link together [en], anschliessen (an+schliessen) [de], összekapcsol (össze+kapcsol) [hu]
to buy up [en], aufkaufen (auf+kaufen) [de], felvásárol (fel+vásárol) [hu]

But in Hungarian we glue everything together. Some examples:
in house [en], im Haus [de], házban (ház+ban) [hu]
car [en], Wagen [de], kocsi [hu]
our car (accusative) [en], unseren Wagen (unser+en Wagen) [de], kocsinkat (kocsi+(u/ü)nk+(a/á/e/é)t) [hu]

So the possibilities are nearly infinite.
Without analysing the sentence and the word, you can't find the root
word with the correct accents.

And finding the root word requires a spell checker (the best one
available for Hungarian is hunspell).
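Just to illustrate what "finding the root word" means, here is a toy
sketch in Python. The stem set and the suffix list are tiny and
hand-picked for the examples in this mail (an assumption, not real
dictionary data), and it ignores vowel harmony and everything else a
real analyser such as hunspell has to handle.

```python
# Toy data, hand-picked for this mail; a real dictionary is far larger.
STEMS = {"ház", "kocsi", "fa"}
SUFFIXES = ["ban", "ben", "unk", "ünk", "nk", "at", "et", "ot", "öt", "t"]


def segmentations(word, stems=STEMS, suffixes=SUFFIXES):
    """Return every split of `word` into a known stem plus listed suffixes."""
    results = []
    if word in stems:
        results.append([word])
    for suffix in suffixes:
        # Peel one suffix off the end and analyse what is left.
        if word.endswith(suffix) and len(word) > len(suffix):
            for head in segmentations(word[:-len(suffix)], stems, suffixes):
                results.append(head + [suffix])
    return results


print(segmentations("kocsinkat"))  # stem plus two suffixes
print(segmentations("házban"))     # stem plus one suffix
print(segmentations("hazban"))     # accents lost: no stem matches
```

With the accents in place the stem is found, but the accent-free
"hazban" gets no analysis at all from this toy stem set, because "haz"
is not in it.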

Summary:
- Losing the accents (in Hungarian) results in ambiguity most of the time.
- A spell checker is needed to suggest the correctly accented word.
(see: http://hunspell.sourceforge.net/)
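As a rough idea of what "suggesting the right accented word" could look
like, here is a sketch in Python. It does not use hunspell; it simply
tries every accented variant of each vowel against a word list. The
word list (LEXICON) and the function name are assumptions for
illustration only, and the input is expected in lowercase.

```python
from itertools import product

# Each plain vowel and the Hungarian letters it may stand for.
VARIANTS = {
    "a": "aá", "e": "eé", "i": "ií",
    "o": "oóöő", "u": "uúüű",
}

# Hypothetical word list; a real one would come from a dictionary file.
LEXICON = {"olt", "ólt", "ölt", "öt", "szép", "szűz"}


def accent_candidates(word, lexicon=LEXICON):
    """Return the lexicon words that `word` could be once accents are restored."""
    choices = [VARIANTS.get(ch, ch) for ch in word]
    candidates = ("".join(combo) for combo in product(*choices))
    return sorted(c for c in candidates if c in lexicon)


for typed in ["ot", "szep", "szuz", "olt"]:
    print(typed, "->", accent_candidates(typed))
```

For "szep" and "szuz" there is only one candidate, so the keyboard could
fix them silently. For "olt" there are three, and picking one needs the
context of the sentence, which is where a real spell checker comes in.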

So creating an architecture for a spell checker is not a bad idea (for
future extensibility).
It could be handy for English too, but for other languages (e.g.
Hungarian) it may be essential.

Sorry for being so tiresome.

Best regards,
 Khiraly



