2009/1/29 Olof Sjobergh <span dir="ltr"><<a href="mailto:olofsj@gmail.com">olofsj@gmail.com</a>></span><br><div class="gmail_quote"><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
<div><div></div><div class="Wj3C7c"><br>
</div></div>I think most problems could be solved by using a dictionary format<br>
similar to what you describe above, i.e. something like:<br>
<br>
match : candidate1 candidate2; frequency<br>
for example:<br>
vogel : Vogel Vögel; 123<br>
<br>
That would mean you can search on the normalised word where simple<br>
strcmp works fine and will be fast enough.</blockquote><div><br>Such a dictionary would have hundreds of millions of rows, even if you include only reasonable user inputs. And what happens if the user types something that's not in the dictionary? I'm assuming, of course, that you want to keep correcting typos, as it does now. Every misspelling would need its own row, for example:<br>
</div></div><br>vogel: Vogel, Vögel<br>vigel: Vogel, Vögel<br>vpgel: Vogel, Vögel<br>wogel: Vogel, Vögel<br>wigel: Vogel, Vögel<br>vigem: Vogel, Vögel<br>vigwl: Vogel, Vögel<br>...<br>...<br>
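<br>To give a rough feel for how fast such a list grows, here is a minimal sketch that enumerates the misspellings of a single word. It assumes a deliberately simplified typo model that is not taken from the keyboard code: only letter substitutions, only the letters a-z, and at most two wrong letters per word. The function name <code>substitution_variants</code> is made up for this illustration. Real typos also include inserted, dropped and swapped letters, so the true count would be higher.<br>

```python
from itertools import combinations, product
import string


def substitution_variants(word, max_subs, alphabet=string.ascii_lowercase):
    """Return every string made by replacing 1..max_subs letters of `word`.

    Assumption for illustration only: typos are plain letter substitutions
    drawn from `alphabet`; insertions and deletions are ignored.
    """
    variants = set()
    for k in range(1, max_subs + 1):
        # Choose which k positions are mistyped.
        for positions in combinations(range(len(word)), k):
            # At each chosen position, try every letter except the correct one.
            choices = [[c for c in alphabet if c != word[i]] for i in positions]
            for letters in product(*choices):
                chars = list(word)
                for i, c in zip(positions, letters):
                    chars[i] = c
                variants.add("".join(chars))
    return variants


one_typo = substitution_variants("vogel", 1)
two_typos = substitution_variants("vogel", 2)
print(len(one_typo), len(two_typos))  # prints "125 6375"
```

Under these assumptions a single five-letter key like "vogel" already needs 6375 extra rows (5 x 25 with one wrong letter, plus 10 x 25 x 25 with two), and every one of them maps to the same two candidates.<br>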