fascinating bug to do with apm and child processes...

Carsten Haitzler (The Rasterman) raster at openmoko.org
Wed May 14 13:58:18 CEST 2008


i found a fascinating (what appears to be) kernel bug...

ok.
1. e (enlightenment) executes the "apm -s" command to suspend when it decides
u've been idle long enough. so funnily enough from ps:

root      1402  1375  0 May12 ?        00:01:22 enlightenment -profile illume
...
root      1463  1402  0 May12 ?        00:00:00 sh -c apm -s

ok - pid 1463 is a shell running apm -s - and the parent process is...
enlightenment. ok - so what is this shell process doing:

root@om-gta02:~# strace -p 1463
Process 1463 attached - interrupt to quit
wait4(-1, 

look at that. it's hung in an eternal wait for its child proc (apm), which is
over here:

root      1464  1463  0 May12 ?        00:00:00 apm -s

interestingly enough e is also hung on a wait:

root@om-gta02:~# strace -p 1402
Process 1402 attached - interrupt to quit
wait4(-1,  <unfinished ...>

which i KNOW is the following line of code:

        while ((pid = waitpid(-1, &status, WNOHANG)) > 0)

which... should NEVER HANG. EVER. if no child exited - return immediately - as
per the manual page:

       WNOHANG     return immediately if no child has exited.

so under no circumstances should this ever hang... but oooh. it does. now
interestingly i attached to apm to see what it was doing.. and lo-and-behold,
it woke up, continued to execute and then exited - with sh reaping apm, then
e reaping the sh, and e waking up again:

root@om-gta02:~# strace -p 1464
Process 1464 attached - interrupt to quit
dup(2)                                  = 5
fcntl64(5, F_GETFL)                     = 0x20001 (flags O_WRONLY|O_LARGEFILE)
close(5)                                = 0
write(2, "apm: Interrupted system call\n", 29) = 29
close(4)                                = 0
io_submit(0, 0, 0xfbad2088Process 1464 detached
root@om-gta02:~# 

so somewhere along the way apm was stuck in a forever-hung syscall - which one,
i don't know... but the chain of SIGCHLD pain back from this is just
wrong. thar be dragons!

-- 
Carsten Haitzler (The Rasterman) <raster at openmoko.org>




More information about the openmoko-kernel mailing list