[PATCH 0/4] Re: NULL pointer dereference at s3cmci

Cesar Eduardo Barros cesarb at cesarb.net
Sun Aug 10 01:38:29 CEST 2008


Cesar Eduardo Barros wrote:
> Cesar Eduardo Barros wrote:
>> [21474554.520000] Unable to handle kernel NULL pointer dereference at 
>> virtual address 00000004
[...]

Finally fixed this:

root@om-gta01:~# mount
rootfs on / type rootfs (rw)
/dev/root on / type jffs2 (rw,noatime)
proc on /proc type proc (rw)
tmpfs on /mnt/.exquisite type tmpfs (rw,size=40k)
sysfs on /sys type sysfs (rw)
/dev/root on /dev/.static/dev type jffs2 (rw)
udev on /dev type tmpfs (rw,size=2048k,mode=755)
/dev/mmcblk0p1 on /media/card type vfat (rw,fmask=0022,dmask=0022,codepage=cp437,iocharset=iso8859-1)
tmpfs on /var/volatile type tmpfs (rw,mode=755)
tmpfs on /dev/shm type tmpfs (rw,mode=777)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
root@om-gta01:~# uname -a
Linux om-gta01 2.6.26-mokodev #9 PREEMPT Sat Aug 9 19:34:28 BRT 2008 armv4tl unknown

This was not a bug, it was a series of bugs.

> Looking at the assembly code, the oops happens at the first 
> mmc_set_ios(host) within mmc_power_up(). For some reason, host->ops is 
> NULL.
> 
> The only possible call path I can imagine for that is s3cmci_irq_cd 
> getting called before host->ops is set, thus calling mmc_detect_change() 
> which will schedule host->detect which is mmc_rescan.

The first bug was this one, and it was obvious from the oops output. But
why had it never triggered before, given that the code had always been
there? As one would suspect, that code usually isn't preempted until
well after everything is set up. The real reason it was triggering
became obvious once that initialization ordering bug was fixed (first
two patches of this series): the oops disappeared, but still nothing
happened. The driver had been failing its initialization the whole time!
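
To make the ordering hazard concrete, here is a small userspace model of
it. This is only a sketch, not the driver code: every name starting with
model_ or probe_ is hypothetical, the rescan runs synchronously instead
of as scheduled work, and it returns -1 where the real kernel would
oops.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are not the kernel's definitions. */
struct model_ops {
    int (*set_ios)(void);
};

struct model_host {
    const struct model_ops *ops; /* NULL until setup has finished */
    int irq_enabled;             /* nonzero once card-detect may fire */
};

static int model_set_ios(void)
{
    return 0;
}

static const struct model_ops model_default_ops = {
    .set_ios = model_set_ios,
};

/* Plays the role of the rescan work: -1 marks the would-be oops. */
static int model_rescan(const struct model_host *host)
{
    assert(host != NULL);
    if (host->ops == NULL)
        return -1;
    return host->ops->set_ios();
}

/* Plays the role of the card-detect interrupt handler. */
static int model_card_detect(const struct model_host *host)
{
    if (!host->irq_enabled)
        return 0; /* interrupt not requested yet, nothing runs */
    return model_rescan(host);
}

/* Buggy ordering: the interrupt is live before ops is assigned. */
static int probe_irq_first(struct model_host *host)
{
    int result;

    host->ops = NULL;
    host->irq_enabled = 1;
    result = model_card_detect(host); /* fires inside the window */
    host->ops = &model_default_ops;
    return result;
}

/* Fixed ordering: ops is assigned before the interrupt can fire. */
static int probe_ops_first(struct model_host *host)
{
    host->ops = &model_default_ops;
    host->irq_enabled = 1;
    return model_card_detect(host);
}
```

With a struct model_host h, probe_irq_first(&h) reports the failure and
probe_ops_first(&h) does not. The window in the first variant is
normally too short to hit, which matches the observation that the code
is rarely preempted there.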

What happened was that, due to a change in the return value of
s3c2410_dma_request (see commit
3886ff5f63f33c801ed3af265ac0df20d3a8dcf5, cherry picked as the third
patch of this series), s3cmci_probe was erroneously treating a
successful return as a failure and going through the error path.
However, by that time host->detect had already been scheduled. Another
mistake (fixed by commit 2de5f79d4dfcb1be16f0b873bc77d6ec74b0426d,
cherry picked as the fourth patch of this series) made the delay before
it finally executed longer, so that it ran during the long pause just
before "VFS: Mounted root (jffs2 filesystem)." (the real bug occurred
before that pause, as can be seen in the attached dmesg). When it
finally executed, it was not only following a NULL pointer, it was
following a NULL pointer in a structure which had already been freed!
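
The return value mix-up can be illustrated with another small userspace
sketch. It rests on an assumption that is not spelled out above: that
the request call reports success as a non-negative channel number and
failure as a negative error code. All names here (model_dma_request,
MODEL_EBUSY, probe_check_*) are hypothetical and exist only for this
example.

```c
#include <assert.h>

#define MODEL_EBUSY 16 /* hypothetical error code for this sketch */

/*
 * Hypothetical stand-in for a DMA request call. Assumption: success is
 * the (non-negative) channel number, failure is a negative error code.
 */
static int model_dma_request(int channel, int busy)
{
    assert(channel >= 0);
    if (busy)
        return -MODEL_EBUSY;
    return channel;
}

/* Stale check: any nonzero return is taken to be a failure. */
static int probe_check_nonzero(int channel, int busy)
{
    int ret = model_dma_request(channel, busy);

    if (ret)
        return -1; /* error path, taken even on success */
    return 0;
}

/* Updated check: only negative returns are failures. */
static int probe_check_negative(int channel, int busy)
{
    int ret = model_dma_request(channel, busy);

    if (ret < 0)
        return ret;
    return 0;
}
```

Under that assumption, probe_check_nonzero(3, 0) takes the error path
although the request succeeded, while probe_check_negative(3, 0)
returns 0 and only fails when the channel is actually busy.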

The series has been very lightly tested (it boots, 2007.2 automounts the
card, and an ls -la /media/card shows the expected values). I haven't
tried writing to the card or stress-testing it yet.


Given all that, I wonder whether it would be better to keep the current 
driver or to backport the 2.6.27 driver (applying whatever extra patches 
are needed; the first two patches of this series, for instance, should 
still be needed in some form).

-- 
Cesar Eduardo Barros
cesarb at cesarb.net
cesar.barros at gmail.com
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: dmesg
Url: http://lists.openmoko.org/pipermail/openmoko-kernel/attachments/20080809/a41291a8/attachment.txt 

