[wikireader] Error parsing the Spanish Wikipedia

David Reyes Samblas Martinez david at tuxbrain.com
Fri Oct 30 16:22:51 CET 2009

Are you uploading these changes to git? Can I take a look?

David Reyes Samblas Martinez
Open ultraportable & embedded solutions
Openmoko, Openpandora,  Arduino
Hey, watch out!!! There's a linux in your pocket!!!

2009/10/30 Sean Moss-Pultz <sean at openmoko.com>:
> On Fri, Oct 30, 2009 at 4:50 AM, David Reyes Samblas Martinez
> <david at tuxbrain.com> wrote:
>> Hi, I'm trying to generate the file for a Spanish Wikipedia on the WR.
>> I compiled the source from git successfully and then had to solve some
>> annoyances with UTF-8 encoding in Python. The error was something like this:
>> UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in
>> position....: ordinal not in range(128)
>> This was solved by changing the default encoding from "ascii" to "utf8" in the
>> /usr/lib/python2.6/site.py file.
>> After this I was able to run the following command without errors:
>> make DESTDIR=image WORKDIR=work
>> XML_FILES=xml-file-samples/eswiki-latest-pages-articles.xml index
>> parse render combine
>> Everything seemed fine for several hours (about 6-7 h) of parsing the
>> 700000 articles in Spanish, but then ... the horror
>> Count: 380000
>> Traceback (most recent call last):
>>  File "./ArticleParser.py", line 224, in <module>
>>    main()
>>  File "./ArticleParser.py", line 172, in main
>>    process_article_text(title.encode('utf-8'),  f.read(length), newf)
>>  File "./ArticleParser.py", line 218, in process_article_text
>>    newf.write(text + '\n')
>> IOError: [Errno 32] Broken pipe
>> make[1]: *** [parse] Error 1
>> make[1]: Leaving directory
>> `/OE/Proyectos/tuxbrain/productos/wikireader/wikireader/host-tools/offline-renderer'
>> make: *** [parse] Error 2
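
For anyone reading the traceback above: errno 32 (EPIPE) on a write usually means the process at the reading end of a pipe has gone away. The sketch below is not the ArticleParser.py code. It assumes, without confirmation from the thread, that the output file object is a pipe to a child process, and it uses a hypothetical helper feed_child to show how a writer could notice the broken pipe and report the child's exit status. It is written for Python 3, where the error surfaces as BrokenPipeError; on Python 2.6 the same condition is an IOError with errno 32.

```python
import subprocess


def feed_child(lines, child_cmd):
    """Write lines to a child's stdin (hypothetical helper, not from the project).

    Returns (lines_written, pipe_broke, child_returncode).
    """
    # bufsize=0 makes each write go straight to the pipe, so a broken
    # pipe is reported at the write that hit it rather than at a later flush.
    proc = subprocess.Popen(child_cmd, stdin=subprocess.PIPE, bufsize=0)
    written = 0
    broke = False
    try:
        for line in lines:
            proc.stdin.write(line.encode("utf-8") + b"\n")
            written += 1
    except BrokenPipeError:
        # The reader closed its end, typically because it exited or crashed.
        broke = True
    finally:
        try:
            proc.stdin.close()
        except BrokenPipeError:
            pass
    # The child's exit status is often the more useful clue.
    return written, broke, proc.wait()
```

Calling feed_child with a command that exits before reading its input returns pipe_broke as True together with the child's exit code, which points at the reader rather than the writer as the place to look.
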
> OK that's fixed now. Chris already checked in the code. Our build
> worked fine. We need to do a few more tweaks and then we can post a
> (super) early test image. Give us until early this coming week.
>  -Sean
> _______________________________________________
> Openmoko community mailing list
> community at lists.openmoko.org
> http://lists.openmoko.org/mailman/listinfo/community
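
A note on the UnicodeDecodeError quoted above: byte 0xc3 is the first byte of many accented Latin characters in UTF-8, so it shows up quickly in Spanish text. Editing site.py changes the default for every Python program on the machine. A narrower option is to decode with an explicit codec at the point where the bytes are read. The following is a minimal sketch of that idea, not the fix that was checked in; decode_text and ascii_decode_fails are hypothetical names used only for illustration. It is written for Python 3, and the same decode call exists on Python 2 byte strings.

```python
def decode_text(raw_bytes, encoding="utf-8"):
    """Turn bytes read from the dump into text using an explicit codec."""
    return raw_bytes.decode(encoding)


def ascii_decode_fails(raw_bytes):
    """Return True if the 'ascii' codec raises UnicodeDecodeError on the input."""
    try:
        raw_bytes.decode("ascii")
    except UnicodeDecodeError:
        return True
    return False


# "España" encoded as UTF-8; 0xc3 is the first byte of the two-byte "ñ".
sample = b"Espa\xc3\xb1a"
title = decode_text(sample)
```

Here ascii_decode_fails(sample) reproduces the class of error from the traceback, while decode_text(sample) yields the text, which can be encoded back to UTF-8 when it is written out.
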
