yaouh! 0.4 is out - 10x speedup hack

Helge Hafting helge.hafting at hist.no
Mon Feb 9 13:42:47 CET 2009

Carlo Minucci wrote:
> download from http://wiki.openmoko.org/wiki/Yaouh!
> little bufgix and add support for multiple wget download
> i think now is more fast
> please, test and feedback

Seems to work fine, but it looks like only the downloading happens in 
parallel. A common case seems to be only 5%-10% new tiles. The rest
just need checking. This is done with sequential use of "curl", i.e. there
is no parallel checking, although this is a big part of a yaouh run.

I am no expert in Python threading, so I did a much simpler hack.
I run 10 instances of yaouh. yaouh0.py only checks files matching
*0.png, yaouh1.py only checks files matching *1.png, and so on.
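
For illustration only, a minimal sketch of the same split done inside one
script instead of ten edited copies. The function names shard_of and
split_into_shards are made up for this example and are not part of yaouh;
the only assumption is that tile file names end in a digit followed by
".png", as the *0.png ... *9.png patterns above imply.

```python
import os

DIGITS = "0123456789"

def shard_of(filename):
    """Return the digit just before '.png', or None if the name does not fit."""
    stem, ext = os.path.splitext(filename)
    if ext != ".png" or not stem or stem[-1] not in DIGITS:
        return None
    return int(stem[-1])

def split_into_shards(filenames):
    """Group file names into 10 lists, one per trailing digit (0..9)."""
    buckets = [[] for _ in range(10)]
    for name in filenames:
        digit = shard_of(name)
        if digit is not None:
            buckets[digit].append(name)
    return buckets
```

Each of the ten lists then corresponds to the files one yaouhN.py copy
would look at.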

This gives a tremendous speedup because curl transfers very little data. 
Serialized curl spends most of the time waiting for an answer, while one 
request and a very small answer move through the network to a distant 
server and back again. The bandwidth of the connection is nowhere near 
fully utilized, not even when using USB networking.

So my 10 processes fire off 10 curls in parallel, giving a 10x speedup 
if the network can handle the load. USB networking seems to handle it 
fine when there are few updates.
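
The same overlap of waiting times could in principle be had inside a single
process. The following is only a sketch under assumptions, not how yaouh
does it: check_all is a made-up name, and check_tile stands for whatever
function performs one check (for example by calling curl). It uses the
standard concurrent.futures thread pool, which suits this case because the
workers spend nearly all their time waiting on the network.

```python
from concurrent.futures import ThreadPoolExecutor

def check_all(filenames, check_tile, workers=10):
    """Call check_tile on every file name, up to `workers` at a time.

    check_tile is a placeholder for the real per-tile check.
    Results come back in the same order as filenames.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(check_tile, filenames))
```

With workers=10 this keeps up to ten requests in flight, which is the
effect the ten separate processes achieve.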

This is a hack and not yet a real solution, because there is much room
for improvement. First, I have to press the start button in all 10 
windows, which is excessive. There shouldn't be 10 windows in a real
solution. Then all my 10 yaouh processes run the same "find"
in the beginning, in order to count the number of files. That is 
unnecessary - such a startup is 10 times heavier than it needs to be. 
Still, this doesn't take much time compared to the rest. Finally, there 
are now 10 progress bars being updated in those 10 windows.

Still, this approach has checked 5000 files (out of 50.000) in the 20 
minutes it took to write this mail, and downloaded about 300 outdated 
tiles. There is the hope that my 50.000 tiles will be up to date in 3.5 
hours. :-)
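
The estimate can be checked with a few lines of arithmetic. This is a
sketch that assumes the checking rate stays constant for the whole run;
the function name is made up for the example.

```python
def estimate_total_minutes(checked, elapsed_minutes, total):
    """Extrapolate total run time from progress so far, assuming a constant rate."""
    rate = checked / elapsed_minutes  # files per minute
    return total / rate

minutes = estimate_total_minutes(5000, 20, 50000)
print("%.0f minutes, about %.1f hours" % (minutes, minutes / 60))
```

At 250 files per minute the full 50.000 tiles take 200 minutes, so 3.5
hours leaves a small margin for the downloads.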

A patch for yaouh 0.4 is attached, if anyone wants to test this, or 
improve it further. The patched file is available here:

To use it, make 10 copies named yaouh0.py, yaouh1.py, ... yaouh9.py.
In each script, edit line 159, so yaouh0.py has "0" in the if-test,
yaouh1.py has "1" in the if-test, yaouh2.py has "2" in the test, and so on.

Then, start everything with a command like:
$ yaouh0.py & yaouh1.py & yaouh2.py & yaouh3.py & yaouh4.py &
  yaouh5.py & yaouh6.py & yaouh7.py & yaouh8.py & yaouh9.py &

Running 10 scripts eats some memory, so some may not be able to run all 10 
at the same time. Having a swap partition may help with that.

Helge Hafting

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: diff.yaouh4
Url: http://lists.openmoko.org/pipermail/community/attachments/20090209/eb2bc855/attachment.txt 
