Input Method Development

Sébastien Lorquet squalyl at gmail.com
Wed Feb 6 17:54:12 CET 2008


This is the first time I'm publishing a bit of my personal work... wow! :)

Here are my notes on implementing the algorithm that turns jamo into Hangul
syllables the way IME 2002 does, reverse-engineered from a number of keystrokes ;)

Consider it to be under CC-BY-SA, and not totally "public domain", as I've spent
a bit of time on it ;)

-----8<-----------------------------------------------------------------------------------------------
Notes on understanding the IME 2002 algorithm
Sébastien Lorquet, July 2006
Published under CC-BY-SA
=======
Now I have

(22:14)
[root@devhost core]# make run
./core ../../data/hantree.dat
File size is 58226 bytes , mapped @ 0xb7f25000
Init ok, please type chars, exit to quit

>d
stack len=1, contents=[d]
currnode=0 , 33 childs
searching char [d]...qQwWeErRtTyuioOpPasd->found
state=3147 nxtoff=36828 out=0000
temp disp: [3147] ㅇ

>h
you typed h, len is 1
stack len=2, contents=[dh]
currnode=36828 , 14 childs
searching char [h]...yuioOpPh->found
state=C624 nxtoff=37890 out=0000
temp disp: [C624] 오

>t
you typed t, len is 1
stack len=3, contents=[dht]
currnode=37890 , 19 childs
searching char [t]...qwerRt->found
state=C637 nxtoff=0 out=C637
temp disp: [C637] 옷
output: [C637] 옷

>h
you typed h, len is 1
stack len=4, contents=[dhth]
currnode=0 , 33 childs
searching char [h]...qQwWeErRtTyuioOpPasdfgh->found
state=3157 nxtoff=0 out=3157
temp disp: [3157] ㅗ
output: [3157] ㅗ

>
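
A side note on the state values in this trace: they are Unicode code points, and the precomposed ones (C624, C637) can be cross-checked with the standard Hangul syllable arithmetic from the Unicode Standard. This is only a sanity check on the numbers, not a description of how hantree.dat is built or how IME 2002 works internally; the name hangul_compose and its index arguments are made up for this example.

```c
#include <assert.h>

/* Constants of the Unicode Hangul syllable block. */
enum { S_BASE = 0xAC00, L_COUNT = 19, V_COUNT = 21, T_COUNT = 28 };

/*
 * Compose a precomposed syllable from jamo indices:
 *   lead  = index of the initial consonant (0..18, ㅇ is 11)
 *   vowel = index of the medial vowel      (0..20, ㅗ is 8)
 *   tail  = index of the final consonant   (0..27, 0 = none, ㅅ is 19)
 */
static unsigned hangul_compose(unsigned lead, unsigned vowel, unsigned tail)
{
    assert(lead < L_COUNT && vowel < V_COUNT && tail < T_COUNT);
    return S_BASE + (lead * V_COUNT + vowel) * T_COUNT + tail;
}
```

With ㅇ as lead 11, ㅗ as vowel 8 and ㅅ as tail 19, hangul_compose(11, 8, 0) gives 0xC624 (오) and hangul_compose(11, 8, 19) gives 0xC637 (옷), matching the trace.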

Here is the main bug of my basic method: I'm not using the stack correctly.
A lone vowel is not allowed as output. I should have cut the syllable earlier,
so that the preceding consonant starts the next syllable.
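
To make the bug concrete, here is a minimal sketch of that basic method in C. Everything about the layout is an assumption: the real hantree.dat format is not described in these notes, so struct edge, struct node, demo_tree and naive_step are made-up names, nodes are array indices instead of file offsets, and the tree is trimmed to the few keys of the trace. Only the field names (state, nxtoff, out, currnode, childs) are borrowed from the trace output.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical in-memory layout, NOT the real hantree.dat format. */
struct edge {
    char key;        /* key that selects this child */
    unsigned state;  /* code point shown as "temp disp" */
    size_t nxtoff;   /* index of the next node, 0 = back to the root */
    unsigned out;    /* code point to emit, 0 = nothing yet */
};

struct node {
    const struct edge *childs;
    size_t nchilds;
};

/* Trimmed demo tree: just enough edges to replay d, h, t, h. */
static const struct edge root_edges[] = {
    { 'd', 0x3147, 1, 0 },       /* ㅇ, continue in node 1 */
    { 'h', 0x3157, 0, 0x3157 },  /* ㅗ, emitted at once */
};
static const struct edge d_edges[]  = { { 'h', 0xC624, 2, 0 } };       /* 오 */
static const struct edge dh_edges[] = { { 't', 0xC637, 0, 0xC637 } };  /* 옷 */

static const struct node demo_tree[] = {
    { root_edges, 2 }, { d_edges, 1 }, { dh_edges, 1 },
};

/*
 * One step of the naive walk: scan the children of the current node
 * linearly, follow nxtoff, and return whatever the edge says to emit.
 * Returns 0 when nothing is emitted; an unknown key restarts at the root.
 */
static unsigned naive_step(size_t *currnode, char key)
{
    const struct node *n;
    size_t i;

    assert(*currnode < sizeof demo_tree / sizeof demo_tree[0]);
    n = &demo_tree[*currnode];
    for (i = 0; i < n->nchilds; i++) {
        if (n->childs[i].key == key) {
            *currnode = n->childs[i].nxtoff;
            return n->childs[i].out;
        }
    }
    *currnode = 0;
    return 0;
}
```

Feeding d, h, t, h through naive_step emits 0xC637 (옷) on the t and then 0x3157 (ㅗ) on the second h, which is exactly the lone vowel seen in the trace.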

What to do


PUSH d    -> [d]
    -> sequence found, possible output ㅇ

PUSH h    -> [dh]
    -> sequence found, possible output 오

PUSH t    -> [dht]
    -> sequence found, possible output 옷

PUSH h    -> [dhth]
    -> sequence NOT FOUND
    -> last (ㅗ) is a vowel, can't be alone.
    -> output 오
    -> keep "th" in the stack
    -> [th]
    -> sequence found, possible output 소

PUSH f    -> [thf]
    -> sequence found, possible output 솔

PUSH d    -> [thfd]
    -> sequence NOT FOUND
    -> last (ㅇ) is NOT a vowel, CAN be alone.
    -> output 솔
    -> keep "d" in the stack
    -> [d]
    -> sequence found, possible output ㅇ

PUSH l    -> [dl]
    -> sequence found, possible output 이

PUSH v    -> [dlv]
    -> sequence found, possible output 잎

etc...
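
The rule above fits in a few lines of C. This is a sketch under assumptions, not the himecore.c source: demo_table stands in for the hantree.dat lookup and only holds the sequences of this walkthrough, is_vowel_key assumes the standard two-set (dubeolsik) key layout, and struct composer, composer_push and composer_state are names made up for the example. Sequences the walkthrough does not cover (two vowels in a row, for instance) are not handled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the hantree.dat lookup. */
struct seq_entry { const char *keys; unsigned code; };

static const struct seq_entry demo_table[] = {
    { "d",   0x3147 },  /* ㅇ */
    { "dh",  0xC624 },  /* 오 */
    { "dht", 0xC637 },  /* 옷 */
    { "t",   0x3145 },  /* ㅅ */
    { "th",  0xC18C },  /* 소 */
    { "thf", 0xC194 },  /* 솔 */
    { "dl",  0xC774 },  /* 이 */
    { "dlv", 0xC78E },  /* 잎 */
};

/* Code point for a key sequence of the given length, 0 if unknown. */
static unsigned demo_lookup(const char *keys, size_t len)
{
    size_t i;
    for (i = 0; i < sizeof demo_table / sizeof demo_table[0]; i++) {
        if (strlen(demo_table[i].keys) == len &&
            memcmp(demo_table[i].keys, keys, len) == 0)
            return demo_table[i].code;
    }
    return 0;
}

/* Vowel keys of the two-set layout (assumption, see lead-in). */
static int is_vowel_key(char c)
{
    return c != '\0' && strchr("yuiopOPhjklbnm", c) != NULL;
}

enum { STACK_MAX = 8 };

struct composer {
    char stack[STACK_MAX];
    size_t len;
};

/* Code point of the syllable being composed, 0 if none. */
static unsigned composer_state(const struct composer *c)
{
    return demo_lookup(c->stack, c->len);
}

/*
 * Push one key. Returns the code point of a finished syllable,
 * or 0 while the current syllable is still open.
 */
static unsigned composer_push(struct composer *c, char key)
{
    size_t keep;
    unsigned out;

    assert(c->len < STACK_MAX);
    c->stack[c->len++] = key;

    if (demo_lookup(c->stack, c->len) != 0)
        return 0;  /* sequence found, keep composing */

    /* Not found: a vowel takes the previous key with it, a consonant
     * starts the next syllable on its own. */
    keep = is_vowel_key(key) ? 2 : 1;
    if (keep > c->len)
        keep = c->len;
    out = demo_lookup(c->stack, c->len - keep);
    memmove(c->stack, c->stack + (c->len - keep), keep);
    c->len = keep;
    return out;
}
```

Pushing d, h, t, h, f, d, l, v in turn returns 0xC624 (오) on the second h and 0xC194 (솔) on the second d, and leaves 0xC78E (잎) as the current state.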

YAY IT'S OK
eat something, then implement that

here is the result (00:16)

[root@devhost core]# make run
cc -I../../include   -c -o himecore.o himecore.c
cc himecore.o posix_main.o -o core
./core ../../data/hantree.dat
File size is 58226 bytes, mapped at 0xb7f61000
Init ok, please type chars, exit to quit
>d
stack len=1, contents=[d]
sequence found
current state:ㅇ

>h
you typed h, len is 1
stack len=2, contents=[dh]
sequence found
current state:오

>t
you typed t, len is 1
stack len=3, contents=[dht]
sequence found
current state:옷

>h
you typed h, len is 1
stack len=4, contents=[dhth]
last entered char is h
This is a vowel
2 chars remaining on stack
new stack:
stack len=2, contents=[th]
current state:소
                output: 오
>f
you typed f, len is 1
stack len=3, contents=[thf]
sequence found
current state:솔

>d
you typed d, len is 1
stack len=4, contents=[thfd]
last entered char is d
This is not a vowel
1 chars remaining on stack
new stack:
stack len=1, contents=[d]
current state:ㅇ
                output: 솔
>l
you typed l, len is 1
stack len=2, contents=[dl]
sequence found
current state:이

>v
you typed v, len is 1
stack len=3, contents=[dlv]
sequence found
current state:잎

>

8<-----------------------------------------------------------------------------------------------