This is the first time I am publishing a bit of my personal work... wow! :)<br><br>Here are my notes on implementing the algorithm that turns jamo into Hangul syllables the way IME 2002 does, reverse-engineered from a number of keystrokes ;)<br><br>
Consider it released under CC-BY-SA, and not quite "public domain", as I've spent a bit of time on it ;)<br><br>-----8<-----------------------------------------------------------------------------------------------<br>
Notes on understanding the IME 2002 algorithm<br>Sébastien Lorquet, July 2006<br>Published under CC-BY-SA<br>=======<br>Here is what I have now:<br><br>(22:14)<br>[root@devhost core]# make run<br><br>[root@devhost core]# make run<br>./core ../../data/hantree.dat<br>
File size is 58226 bytes , mapped @ 0xb7f25000<br>Init ok, please type chars, exit to quit<br><br>>d<br>stack len=1, contents=[d]<br>currnode=0 , 33 childs<br>searching char [d]...qQwWeErRtTyuioOpPasd->found<br>state=3147 nxtoff=36828 out=0000<br>
temp disp: [3147] ㅇ<br><br>>h<br>you typed h, len is 1<br>stack len=2, contents=[dh]<br>currnode=36828 , 14 childs<br>searching char [h]...yuioOpPh->found<br>state=C624 nxtoff=37890 out=0000<br>temp disp: [C624] 오<br>
<br>>t<br>you typed t, len is 1<br>stack len=3, contents=[dht]<br>currnode=37890 , 19 childs<br>searching char [t]...qwerRt->found<br>state=C637 nxtoff=0 out=C637<br>temp disp: [C637] 옷<br>output: [C637] 옷<br><br>>h<br>
you typed h, len is 1<br>stack len=4, contents=[dhth]<br>currnode=0 , 33 childs<br>searching char [h]...qQwWeErRtTyuioOpPasdfgh->found<br>state=3157 nxtoff=0 out=3157<br>temp disp: [3157] ㅗ<br>output: [3157] ㅗ<br><br>><br>
<br>Here is the main bug of my basic method: I'm not using the stack correctly.<br>A lone vowel is not allowed, so I should have cut the syllable earlier, before its final consonant.<br><br>What to do:<br><br><br>PUSH d -> [d]<br> -> sequence found, possible output ㅇ<br>
<br>PUSH h -> [dh]<br> -> sequence found, possible output 오<br><br>PUSH t -> [dht]<br> -> sequence found, possible output 옷<br><br>PUSH h -> [dhth]<br> -> sequence NOT FOUND<br> -> last (ㅗ) is a vowel, can't be alone: it takes the preceding consonant with it.<br>
-> output 오<br> -> keep "th" in the stack<br> -> [th]<br> -> sequence found, possible output 소<br><br>PUSH f -> [thf]<br> -> sequence found, possible output 솔<br><br>PUSH d -> [thfd]<br>
-> sequence NOT FOUND<br> -> last (ㅇ) is NOT a vowel, CAN be alone.<br> -> output 솔<br> -> keep "d" in the stack<br> -> [d]<br> -> sequence found, possible output ㅇ<br><br>PUSH l -> [dl]<br>
-> sequence found, possible output 이<br><br>PUSH v -> [dlv]<br> -> sequence found, possible output 잎<br><br>etc...<br><br>YAY, IT WORKS<br>Eat something, then implement that.<br><br>Here is the result (00:16):<br>
<br>[root@devhost core]# make run<br>cc -I../../include -c -o himecore.o himecore.c<br>cc himecore.o posix_main.o -o core<br>./core ../../data/hantree.dat<br>File size is 58226 bytes, mapped at 0xb7f61000<br>Init ok, please type chars, exit to quit<br>
>d<br>stack len=1, contents=[d]<br>sequence found<br>current state:ㅇ<br><br>>h<br>you typed h, len is 1<br>stack len=2, contents=[dh]<br>sequence found<br>current state:오<br><br>>t<br>you typed t, len is 1<br>stack len=3, contents=[dht]<br>
sequence found<br>current state:옷<br><br>>h<br>you typed h, len is 1<br>stack len=4, contents=[dhth]<br>last entered char is h<br>This is a vowel<br>2 chars remaining on stack<br>new stack:<br>stack len=2, contents=[th]<br>
current state:소<br> output: 오<br>>f<br>you typed f, len is 1<br>stack len=3, contents=[thf]<br>sequence found<br>current state:솔<br><br>>d<br>you typed d, len is 1<br>stack len=4, contents=[thfd]<br>last entered char is d<br>
This is not a vowel<br>1 chars remaining on stack<br>new stack:<br>stack len=1, contents=[d]<br>current state:ㅇ<br> output: 솔<br>>l<br>you typed l, len is 1<br>stack len=2, contents=[dl]<br>sequence found<br>
current state:이<br><br>>v<br>you typed v, len is 1<br>stack len=3, contents=[dlv]<br>sequence found<br>current state:잎<br><br>><br><br>8<-----------------------------------------------------------------------------------------------
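<br><br>For reference, the splitting rule from the "What to do" walkthrough can be sketched in C roughly as follows. This is only a minimal sketch, not the actual himecore.c: the tree search in hantree.dat is replaced by a hypothetical `lookup()` that covers just the six keys used above and builds syllables with the standard Unicode formula 0xAC00 + (lead * 21 + vowel) * 28 + tail. The names `key_info`, `composer`, `push_key` and `current_state` are made up for the illustration, and compound vowels and compound finals are ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define STACK_MAX 8

/* Jamo indices for the Unicode syllable formula, -1 when the key cannot
 * play that role. Assumption: only the six keys from the notes are listed;
 * this table stands in for the real data in hantree.dat. */
struct key_info { char key; int lead, vowel, tail; unsigned compat; };

static const struct key_info KEYS[] = {
    { 'd', 11, -1, 21, 0x3147 },  /* ㅇ */
    { 't',  9, -1, 19, 0x3145 },  /* ㅅ */
    { 'f',  5, -1,  8, 0x3139 },  /* ㄹ */
    { 'v', 17, -1, 26, 0x314D },  /* ㅍ */
    { 'h', -1,  8, -1, 0x3157 },  /* ㅗ */
    { 'l', -1, 20, -1, 0x3163 },  /* ㅣ */
};

static const struct key_info *find_key(char c)
{
    for (size_t i = 0; i < sizeof KEYS / sizeof KEYS[0]; i++)
        if (KEYS[i].key == c)
            return &KEYS[i];
    return NULL;
}

static int is_vowel(char c)
{
    const struct key_info *k = find_key(c);
    return k != NULL && k->vowel >= 0;
}

/* Hypothetical stand-in for the tree search: code point for the first n
 * keys of seq, or 0 when the sequence is NOT FOUND. */
static unsigned lookup(const char *seq, size_t n)
{
    const struct key_info *k[3];
    if (n == 0 || n > 3)
        return 0;
    for (size_t i = 0; i < n; i++)
        if ((k[i] = find_key(seq[i])) == NULL)
            return 0;
    if (n == 1)
        return k[0]->compat;                  /* lone jamo */
    if (k[0]->lead < 0 || k[1]->vowel < 0)
        return 0;                             /* must be consonant + vowel */
    if (n == 3 && k[2]->tail < 0)
        return 0;                             /* third key must be a final */
    unsigned tail = (n == 3) ? (unsigned)k[2]->tail : 0u;
    return 0xAC00u
         + ((unsigned)k[0]->lead * 21u + (unsigned)k[1]->vowel) * 28u + tail;
}

struct composer { char stack[STACK_MAX]; size_t len; };

/* Push one key. Returns the finished code point to output, or 0 when the
 * syllable on the stack is still being composed. */
unsigned push_key(struct composer *c, char key)
{
    if (find_key(key) == NULL)
        return 0;                             /* sketch: ignore unknown keys */
    assert(c->len < STACK_MAX);
    c->stack[c->len++] = key;
    if (lookup(c->stack, c->len) != 0)
        return 0;                             /* sequence found, keep going */

    /* Sequence NOT FOUND: a vowel cannot stand alone, so it keeps the
     * preceding consonant with it; a consonant starts over by itself. */
    size_t keep = 1;
    if (is_vowel(key) && c->len >= 3 && !is_vowel(c->stack[c->len - 2]))
        keep = 2;
    size_t done = c->len - keep;
    unsigned out = lookup(c->stack, done);
    memmove(c->stack, c->stack + done, keep);
    c->len = keep;
    return out;
}

/* Code point currently shown as the temporary display. */
unsigned current_state(const struct composer *c)
{
    return lookup(c->stack, c->len);
}
```

Starting from a zeroed `struct composer` and pushing d, h, t, h, f, d, l, v one key at a time, `push_key` returns 0xC624 (오) on the second h and 0xC194 (솔) on the second d, and `current_state` ends at 0xC78E (잎), which matches the walkthrough and the second log.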