[wikireader] Fully automated generation script

Thomas HOCEDEZ thomas.hocedez at free.fr
Tue Feb 9 08:28:35 CET 2010


(This message was first sent to the wrong list, sorry.)

Hi wikireaders,

As I mentioned before, my automated script is ready! And it is totally
"cronable"!
It manages every step, from downloading the Wikipedia archive in your
language to the installation.
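
Since it is cronable, you could for example run it weekly with a crontab
entry like this (the path and schedule are just placeholders, adjust them
to where you saved the script):

# Hypothetical crontab entry: rebuild the French image every Sunday at 03:00
0 3 * * 0 /home/user/bin/autowiki.sh fr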

It is available here: http://freerunner.daily.free.fr/files/autowiki.sh
Just set your FTP parameters (for uploading the images) and your mail
address (to be notified at each step), then launch:

$ autowiki.sh fr

for the French Wikipedia, or

$ autowiki.sh be

for the Belarusian Wikipedia.
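
For reference, the variables at the top of the script could be filled in
like this (the values here are placeholders, of course):

ftp_host="ftp.example.org"
ftp_login="myuser"
ftp_passd="s3cret"
mail="you@example.org"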


Some of you will say that it could also check whether the wikireader
folder exists and install or upgrade the sources from git. Well, I'm
leaving you some work to do ;-) (a rough sketch follows)
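
A minimal sketch of that check could look like this (the git repository
URL is an assumption, point it at wherever you track the wikireader
sources):

if [ -d "$wr_folder/.git" ]; then
    # Already installed: upgrade the sources from git
    (cd "$wr_folder" && git pull)
else
    # Not installed yet: fetch a fresh copy
    git clone git://github.com/wikireader/wikireader.git "$wr_folder"
fi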


I'm still available for any suggestions or anything else.

The code follows.

#!/bin/sh
# Automation script for rendering a Wikipedia image for the FR
# Written by AstHrO / openmoko-fr.org / thomas.hocedez at free.fr
# V 1.0
# Usage:
#  param1 : language (fr, en, be, lu, ...)

# =========================
# Ftp host configuration :

ftp_host=""
ftp_login=""
ftp_passd=""
mail=""

# Folder where the wikireader tools are installed

wr_folder="/media/stocks/wikireader"

# =========================

# Language extension of the WP (fr,en, nl ...)
if [ -z "$1" ]; then
  lang="fr"
else
  lang="$1"
fi


# You don't have to change the following settings, but you can ...
cfile="${lang}wiki-latest-pages-articles.xml.bz2"
ufile="${lang}wiki-latest-pages-articles.xml"
mail_msg="message.txt"
sep="=============================================================="

echo "---">  $mail_msg

# Going to the working folder :
cd "$wr_folder" || exit 1

# cleaning old treatments :
echo $sep
echo "Step 1 : CLeaning..."
make clean&>/dev/null

# downloading interesting WP :
echo $sep
echo "Step 2 : Downloading dump ..."
rm -f "$cfile" > /dev/null 2>&1
wget http://download.wikimedia.org/${lang}wiki/latest/$cfile
ls -l "$cfile" | mail -s "[WR] Wikipedia dump downloaded" "$mail"

# uncompressing :
echo $sep
echo "Step 3 : Uncompressing ..."
rm -f "$ufile" > /dev/null 2>&1
bzip2 -d "./$cfile" 2> log.txt 1> /dev/null

#creating some folders :
mkdir -p work
mkdir -p image

echo $sep
echo "Step 4 : Indexing Articles..."
# Creating index of articles :
make index XML_FILES="$ufile" DESTDIR=image WORKDIR=work 2> log.txt 1> /dev/null

# Parsing : (30-60 sec / 1000)
echo $sep
echo "Step 5 : Parsing Articles ..."
make parse -j3 XML_FILES="$ufile" DESTDIR=image WORKDIR=work 2> log.txt
#mail -s "[WR] Parsing of $ufile complete !" "$mail" < $mail_msg

# Rendering the file : (60-200 sec / 1000)
echo $sep
echo "Step 6 : Rendering ..."
make render -j3 XML_FILES="$ufile" DESTDIR=image WORKDIR=work 2> log.txt
#mail -s "[WR] Rendering of $ufile complete !" "$mail" < $mail_msg

echo $sep
echo "Step 7 : Finalizing ..."
# Combining articles indexes (few seconds)
make combine -j3 DESTDIR=image WORKDIR=work > $mail_msg

# Generating a hash (lasts a few seconds, dude!)
make hash -j3 DESTDIR=image WORKDIR=work >> $mail_msg

# Going to output folder
cd image || exit 1
echo $sep
echo "Step 8 : Compressing files..."
# Compressing data files
tar -czvf wr_${lang}_$(date '+%d-%m-%Y').tar.gz pedia*.*

# a little HASH to be sure ...
shasum wr_${lang}_$(date '+%d-%m-%Y').tar.gz > sha_${lang}.txt
#mail -s "[WR] Hash of your file..." "$mail" < $mail_msg
echo $sep
echo "Step 9 : Let's FTP all this !"
# Let's FTP all this stuff
ftp -n << EOF
open $ftp_host
user $ftp_login $ftp_passd
binary
put wr_${lang}_$(date '+%d-%m-%Y').tar.gz
EOF
echo $sep
echo "Step 10 : Enjoy !"
#mail -s "[WR] All done, WP image ready to use !"  "$mail"

# That's it, you can now send a mail to your friends.
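
# A usage note: the script only uploads the tarball; if you also publish
# sha_${lang}.txt next to it, whoever downloads the archive can verify and
# unpack it like this (file names assume the French image of 09-02-2010):
#
#   shasum -c sha_fr.txt                # should report the tarball as OK
#   tar -xzvf wr_fr_09-02-2010.tar.gz   # extracts the pedia*.* data files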



-- 
Thomas HOCEDEZ
