Illume dictionary for Dutch (Nederlands)

Pander pander at users.sourceforge.net
Fri Nov 28 00:20:38 CET 2008


Is it possible to put comments in the .dic file? If so, in what format?
For example, could the first couple of lines be comments if they start with a #?
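
Until that is answered, a fallback would be to keep the comments in a source file and strip them before installing the dictionary. The following is only a minimal sketch under assumptions: the function name and the sample entries are hypothetical, each entry is treated as an opaque line, and nothing here says whether Illume itself accepts lines starting with a #.

```python
from itertools import dropwhile


def strip_header_comments(lines, marker="#"):
    """Drop the leading run of lines that start with `marker`.

    Only the header is removed; a later line that happens to start
    with the marker is kept, since it could be a real entry.
    """
    body = dropwhile(lambda line: line.startswith(marker), lines)
    return [line.rstrip("\n") for line in body]


if __name__ == "__main__":
    # Hypothetical source file content: two header comments, two entries.
    source = ["# Dutch word list\n", "# generated 2008\n", "aap\n", "noot\n"]
    for entry in strip_header_comments(source):
        print(entry)
```

The installed file then contains entries only, so it does not matter whether the parser understands comments.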

Carsten Haitzler (The Rasterman) wrote:
> On Thu, 20 Nov 2008 10:55:02 +0100 (CET) "Pander"
> <pander at users.sourceforge.net> babbled:
> 
> any dictionary should not care about gsm encodings. it should be just a utf8
> dictionary file. it is the job of the sms app to convert normal utf8 unicode to
> whatever encoding used by the network, and back. :)
> 
>> Small correction to my text:
>>
>> "Note that more characters" must be "Note that certain special characters
>> are in GSM 03.38 which are not in extended ASCII"
>>
>>
>> Nevertheless, one complete utf-8 dictionary could be used by most
>> applications, also SMS. The conversion I do for GSM 03.38 could also be
>> done later just before sending the SMS.
>>
>> On Thu, November 20, 2008 10:44, Rui Miguel Silva Seabra wrote:
>>> I have no idea... I might only make a new version with utf-8 encoded
>>> characters. :)
>>>
>>>
>>> On Thu, Nov 20, 2008 at 10:40:46AM +0100, Pander wrote:
>>>> Hi all,
>>>>
>>>> I intend to generate the following:
>>>> - a full list utf-8 (for 8 bit SMS and regular use, default)
>>>> - b full list utf-8 GSM 03.38[1] (for 7 bit SMS)
>>>> - c truncated list utf-8 (for 8 bit SMS and regular use)
>>>> - d truncated list utf-8 GSM 03.38[1] (for 7 bit SMS, default)
>>>>
>>>> [1] These utf-8 characters in this list are within the 7-bit range of
>>>> GSM 03.38, see http://en.wikipedia.org/wiki/Short_message_service#GSM
>>>> Note that more characters
>>>>
>>>> a and b will both have 250,000 words
>>>> b will be a conversion, remapping and normalisation of a
>>>> c and d are truncations and normalisations of a and b respectively
>>>>
>>>> For utf-16, a simple conversion of the utf-8 files can be used, but I'll
>>>> leave this for now. This could result in two extra files.
>>>>
>>>> Note that neither extended nor non-extended ASCII is available. Is this
>>>> desirable? This can result in four extra files.
>>>>
>>>> So, I can come up with 10 different files. Which do you think are the
>>>> most useful?
>>>>
>>>> Regards,
>>>>
>>>> Pander
>>>>
>>>> On Thu, November 20, 2008 08:58, Rui Miguel Silva Seabra wrote:
>>>>> On Thu, Nov 20, 2008 at 03:02:41AM +0100, "Marco Trevisan (Treviño)"
>>>>> wrote:
>>>>>> Pander wrote:
>>>>>>> Of course this particular word list is very long and contains about
>>>>>>> 250,000 words and has a typical loooong tail. Many words or
>>>>>>> compounds occur only seldom in everyday use.
>>>>>>>
>>>>>>> What would be a good cut-off point in number of words, also in terms
>>>>>>> of performance?
>>>>>>>
>>>>>>> The Portuguese list contains 56,609 words. Is this workable? How many
>>>>>>> does the English contain?
>>>>>> The Italian one can also count 500'000 words (to be short), but I can
>>>>>> only get a well-working dictionary by using a smaller one (with about
>>>>>> 150'000 words that I've selected by their Google popularity).
>>>>>>
>>>>>> Btw I've written more complete posts about this on the list...
>>>>> Well, since my basis was a million words taken from the most
>>>>> printed daily newspaper in Portugal (I didn't count, but I still removed
>>>>> a lot of non-words like numbers, etc...) already with frequency data, my
>>>>> job was so much easier... :)
>>>>>
>>>>> As for writing SMS/text messages... I haven't found yet a word that
>>>>> wasn't there (in fact my problem is that it so often is the first of
>>>>> several matches so I have to use the menu on the left) but I must
>>>>> confess to not be one of those whose primary use of the phone is
>>>>> SMS/text!
>>>>>
>>>>> Rui
>>>>>
>>>>> --
>>>>> Frink!
>>>>> Today is Prickle-Prickle, the 32nd day of The Aftermath in the YOLD 3174
>>>>> + No matter how much you do, you never do enough -- unknown
>>>>> + Whatever you do will be insignificant,
>>>>> | but it is very important that you do it -- Gandhi
>>>>> + So let's do it...?
>>>>>
>>>>> _______________________________________________
>>>>> Openmoko community mailing list
>>>>> community at lists.openmoko.org
>>>>> http://lists.openmoko.org/mailman/listinfo/community
>>>>>
>>> --
>>> You are what you see.
>>> Today is Prickle-Prickle, the 32nd day of The Aftermath in the YOLD 3174
>>> + No matter how much you do, you never do enough -- unknown
>>> + Whatever you do will be insignificant,
>>> | but it is very important that you do it -- Gandhi
>>> + So let's do it...?
> 
> 
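
The idea from the quoted thread, one UTF-8 word list with the GSM 03.38 handling done separately, can be illustrated with a check of which words fit the 7-bit alphabet. This is a rough sketch, not the conversion used for the lists discussed above: the function name and sample words are hypothetical, and the character table is typed from the GSM 03.38 default alphabet plus its extension table, so it should be verified against the specification before use.

```python
import unicodedata

# GSM 03.38 default alphabet (the ESC code point is left out on purpose).
GSM_BASIC = (
    "@£$¥èéùìòÇ\nØø\rÅåΔ_ΦΓΛΩΠΨΣΘΞÆæßÉ"
    " !\"#¤%&'()*+,-./0123456789:;<=>?"
    "¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§"
    "¿abcdefghijklmnopqrstuvwxyzäöñüà"
)
# Characters reachable through the ESC extension table.
GSM_EXTENSION = "^{}\\[~]|€\f"
GSM_CHARS = frozenset(GSM_BASIC + GSM_EXTENSION)


def fits_gsm7(text):
    """Return True if every character of `text` is in the GSM 03.38 set.

    The text is normalised to NFC first so that a base letter followed
    by a combining accent is compared as one precomposed character.
    """
    composed = unicodedata.normalize("NFC", text)
    return all(ch in GSM_CHARS for ch in composed)


if __name__ == "__main__":
    # Hypothetical sample words; a real run would read the UTF-8 list.
    words = ["café", "reünie", "ruïne"]
    print([w for w in words if fits_gsm7(w)])
```

Words that fail the check would still need the remapping step (for example dropping the diaeresis) if they are to appear in a 7-bit list.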