Input Method Development
Sébastien Lorquet
squalyl at gmail.com
Wed Feb 6 17:54:12 CET 2008
This is the first time I publish a bit of my personal work... wow! :)
Here are my notes about implementing the algorithm that IME 2002 uses to turn
jamos into Hangul syllables, reverse engineered from a number of keystrokes ;)
Consider it to be under CC-BY-SA, and not totally "public domain", as I've spent
a bit of time on it ;)
-----8<-----------------------------------------------------------------------------------------------
Notes on understanding the IME 2002 algorithm
Sébastien Lorquet, July 2006
Published under CC-BY-SA
=======
Now I have
(22:14)
[root at devhost core]# make run
./core ../../data/hantree.dat
File size is 58226 bytes , mapped @ 0xb7f25000
Init ok, please type chars, exit to quit
>d
stack len=1, contents=[d]
currnode=0 , 33 childs
searching char [d]...qQwWeErRtTyuioOpPasd->found
state=3147 nxtoff=36828 out=0000
temp disp: [3147] ㅇ
>h
you typed h, len is 1
stack len=2, contents=[dh]
currnode=36828 , 14 childs
searching char [h]...yuioOpPh->found
state=C624 nxtoff=37890 out=0000
temp disp: [C624] 오
>t
you typed t, len is 1
stack len=3, contents=[dht]
currnode=37890 , 19 childs
searching char [t]...qwerRt->found
state=C637 nxtoff=0 out=C637
temp disp: [C637] 옷
output: [C637] 옷
>h
you typed h, len is 1
stack len=4, contents=[dhth]
currnode=0 , 33 childs
searching char [h]...qQwWeErRtTyuioOpPasdfgh->found
state=3157 nxtoff=0 out=3157
temp disp: [3157] ㅗ
output: [3157] ㅗ
>
Here is the main bug of my basic method: I'm not using the stack correctly.
A single vowel is not allowed on its own. I should have cut the syllable one
keystroke earlier, so that the vowel takes the preceding consonant with it.
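The first log can be read as a walk through a tree of key sequences. Below is a minimal sketch of that naive lookup, not the real code: the struct layout, the names (entry, node, find_child, feed) and the use of array indices instead of file offsets are all assumptions, and the tiny tree only holds the sequences seen in the log (d, dh, dht, h). The real layout of hantree.dat is not described here.

```c
#include <assert.h>
#include <stddef.h>

/* One child entry, modelled on the fields printed in the log.
   This layout is an assumption, not the hantree.dat format. */
struct entry {
    char key;        /* keystroke that selects this child */
    unsigned state;  /* code point for the temporary display */
    int next;        /* node to continue from; 0 = root, like nxtoff=0 */
    unsigned out;    /* code point to emit; 0 = nothing, like out=0000 */
};

struct node {
    size_t count;
    const struct entry *children;
};

/* Hypothetical miniature tree: only d, dh, dht and h. */
static const struct entry root_children[] = {
    {'d', 0x3147, 1, 0},        /* ㅇ, can still grow */
    {'h', 0x3157, 0, 0x3157},   /* ㅗ, emitted at once */
};
static const struct entry d_children[]  = { {'h', 0xC624, 2, 0} };      /* 오 */
static const struct entry dh_children[] = { {'t', 0xC637, 0, 0xC637} }; /* 옷 */

static const struct node tree[] = {
    {2, root_children},
    {1, d_children},
    {1, dh_children},
};

/* Linear scan over the children, as in "searching char [d]...->found". */
const struct entry *find_child(const struct node *n, char key)
{
    for (size_t i = 0; i < n->count; i++)
        if (n->children[i].key == key)
            return &n->children[i];
    return NULL;
}

/* Feed one key from node *curr. Returns the matched entry, or NULL when
   the key is not a child of the current node. Moves *curr on a match. */
const struct entry *feed(int *curr, char key)
{
    assert(*curr >= 0 && (size_t)*curr < sizeof tree / sizeof tree[0]);
    const struct entry *e = find_child(&tree[*curr], key);
    if (e != NULL)
        *curr = e->next;
    return e;
}
```

Feeding d, h, t, h to this sketch emits 옷 (C637) and then a lone ㅗ (3157), which is exactly the bug: the tree is restarted from the root after 옷 instead of giving the ㅅ back to the next syllable.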
What to do
PUSH d -> [d]
-> sequence found, possible output ㅇ
PUSH h -> [dh]
-> sequence found, possible output 오
PUSH t -> [dht]
-> sequence found, possible output 옷
PUSH h -> [dhth]
-> sequence NOT FOUND
-> last (ㅗ) is a vowel, can't be alone.
-> output 오
-> keep "th" in the stack
-> [th]
-> sequence found, possible output 소
PUSH f -> [thf]
-> sequence found, possible output 솔
PUSH d -> [thfd]
-> sequence NOT FOUND
-> last (ㅇ) is NOT a vowel, CAN be alone.
-> output 솔
-> keep "d" in the stack
-> [d]
-> sequence found, possible output ㅇ
PUSH l -> [dl]
-> sequence found, possible output 이
PUSH v -> [dlv]
-> sequence found, possible output 잎
etc...
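The steps above can be sketched in C roughly as follows. This is not the himecore.c code: the names (ime, ime_push, ime_current, lookup, is_vowel_key) are made up for illustration, and lookup() is a small hard-coded table standing in for the tree, holding only the sequences of this walk-through. The vowel test assumes the usual 2-set Korean keyboard layout.

```c
#include <assert.h>
#include <string.h>

#define STACK_MAX 8

struct seq { const char *keys; unsigned cp; };

/* Stand-in for the tree lookup (assumption): only the sequences from
   the walk-through, mapped to their Unicode code points. */
static const struct seq table[] = {
    {"d",  0x3147}, {"dh",  0xC624}, {"dht", 0xC637},  /* ㅇ 오 옷 */
    {"th", 0xC18C}, {"thf", 0xC194},                   /* 소 솔 */
    {"dl", 0xC774}, {"dlv", 0xC78E},                   /* 이 잎 */
};

/* Returns the code point for a key sequence, or 0 if it is not found. */
unsigned lookup(const char *keys, size_t len)
{
    for (size_t i = 0; i < sizeof table / sizeof table[0]; i++)
        if (strlen(table[i].keys) == len &&
            memcmp(table[i].keys, keys, len) == 0)
            return table[i].cp;
    return 0;
}

/* Vowel keys of the 2-set layout (assumed). */
int is_vowel_key(char c)
{
    return c != '\0' && strchr("yuiopOPhjklbnm", c) != NULL;
}

struct ime { char stack[STACK_MAX]; size_t len; };

/* Push one key. Returns the code point to output, or 0 when nothing is
   committed yet (or when the committed part is unknown to lookup). */
unsigned ime_push(struct ime *s, char key)
{
    assert(s->len < STACK_MAX);
    s->stack[s->len++] = key;
    if (lookup(s->stack, s->len) != 0)
        return 0;                       /* sequence found, keep composing */

    /* Sequence NOT FOUND: a vowel can't be alone, so it keeps the
       previous key with it; a consonant stays on the stack alone. */
    size_t keep = is_vowel_key(key) ? 2 : 1;
    if (keep > s->len)
        keep = s->len;
    unsigned out = lookup(s->stack, s->len - keep);
    memmove(s->stack, s->stack + s->len - keep, keep);
    s->len = keep;
    return out;
}

/* Code point of what is currently being composed, 0 if unknown. */
unsigned ime_current(const struct ime *s)
{
    return lookup(s->stack, s->len);
}
```

With this sketch, pushing d, h, t, h returns C624 (오) on the last push and leaves [th] on the stack, and pushing f, d afterwards returns C194 (솔) and leaves [d].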
YAY IT'S OK
Eat something, then implement that.
here is the result (00:16)
[root at devhost core]# make run
cc -I../../include -c -o himecore.o himecore.c
cc himecore.o posix_main.o -o core
./core ../../data/hantree.dat
File size is 58226 bytes, mapped at 0xb7f61000
Init ok, please type chars, exit to quit
>d
stack len=1, contents=[d]
sequence found
current state:ㅇ
>h
you typed h, len is 1
stack len=2, contents=[dh]
sequence found
current state:오
>t
you typed t, len is 1
stack len=3, contents=[dht]
sequence found
current state:옷
>h
you typed h, len is 1
stack len=4, contents=[dhth]
last entered char is h
This is a vowel
2 chars remaining on stack
new stack:
stack len=2, contents=[th]
current state:소
output: 오
>f
you typed f, len is 1
stack len=3, contents=[thf]
sequence found
current state:솔
>d
you typed d, len is 1
stack len=4, contents=[thfd]
last entered char is d
This is not a vowel
1 chars remaining on stack
new stack:
stack len=1, contents=[d]
current state:ㅇ
output: 솔
>l
you typed l, len is 1
stack len=2, contents=[dl]
sequence found
current state:이
>v
you typed v, len is 1
stack len=3, contents=[dlv]
sequence found
current state:잎
>
8<-----------------------------------------------------------------------------------------------