[PATCH 0/4] Re: NULL pointer dereference at s3cmci
Cesar Eduardo Barros
cesarb at cesarb.net
Sun Aug 10 01:38:29 CEST 2008
Cesar Eduardo Barros wrote:
> Cesar Eduardo Barros wrote:
>> [21474554.520000] Unable to handle kernel NULL pointer dereference at
>> virtual address 00000004
[...]
Finally fixed this:
root@om-gta01:~# mount
rootfs on / type rootfs (rw)
/dev/root on / type jffs2 (rw,noatime)
proc on /proc type proc (rw)
tmpfs on /mnt/.exquisite type tmpfs (rw,size=40k)
sysfs on /sys type sysfs (rw)
/dev/root on /dev/.static/dev type jffs2 (rw)
udev on /dev type tmpfs (rw,size=2048k,mode=755)
/dev/mmcblk0p1 on /media/card type vfat (rw,fmask=0022,dmask=0022,codepage=cp437,iocharset=iso8859-1)
tmpfs on /var/volatile type tmpfs (rw,mode=755)
tmpfs on /dev/shm type tmpfs (rw,mode=777)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
root@om-gta01:~# uname -a
Linux om-gta01 2.6.26-mokodev #9 PREEMPT Sat Aug 9 19:34:28 BRT 2008 armv4tl unknown
This was not a bug, it was a series of bugs.
> Looking at the assembly code, the oops happens at the first
> mmc_set_ios(host) within mmc_power_up(). For some reason, host->ops is
> NULL.
>
> The only possible call path I can imagine for that is s3cmci_irq_cd
> getting called before host->ops is set, thus calling mmc_detect_change()
> which will schedule host->detect which is mmc_rescan.
The first bug was this one, which was obvious from the oops output. But
why didn't it happen before, since the code was always there? The
answer, as one would suspect, is that usually that code isn't preempted
until well after everything is set up. The real reason it was happening
became obvious once that initialization ordering bug was fixed (first
two patches of this series): the oops disappeared, but still nothing
happened. The driver had been failing its initialization the whole time!
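The ordering problem can be sketched in plain userspace C. This is only an illustration under stated assumptions: order_host, order_ops, order_irq_cd and the two probe functions are hypothetical stand-ins, not the real s3cmci or MMC core code, and the "interrupt" is simulated by a direct call at the point where it could fire.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures (not the real driver). */
struct order_ops {
    int (*set_ios)(int power);
};

struct order_host {
    const struct order_ops *ops;  /* NULL until probe assigns it */
    int irq_enabled;              /* non-zero once card detect is armed */
};

static int order_set_ios(int power)
{
    return power;
}

static const struct order_ops order_ops_table = { .set_ios = order_set_ios };

/* Simulated card-detect handler: returns -1 where real code would oops. */
static int order_irq_cd(struct order_host *host)
{
    if (!host->irq_enabled)
        return 0;                 /* interrupt cannot fire yet */
    if (host->ops == NULL)
        return -1;                /* would dereference a NULL ops pointer */
    return host->ops->set_ios(1);
}

/* Buggy order: the interrupt is armed before ops is set. */
static int order_probe_buggy(struct order_host *host)
{
    int early;

    host->irq_enabled = 1;
    early = order_irq_cd(host);   /* simulate the IRQ firing right here */
    host->ops = &order_ops_table;
    return early;
}

/* Fixed order: everything the handler needs is set up first. */
static int order_probe_fixed(struct order_host *host)
{
    host->ops = &order_ops_table;
    host->irq_enabled = 1;
    return order_irq_cd(host);
}
```

Starting from a zeroed order_host, order_probe_buggy reports the simulated NULL dereference, while order_probe_fixed reaches set_ios normally. In the real driver the window is much narrower, which fits the observation that the code usually isn't preempted at that point.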
What happened was that, due to a change in the return value of
s3c2410_dma_request (see commit
3886ff5f63f33c801ed3af265ac0df20d3a8dcf5, cherry picked as the third
patch of this series), s3cmci_probe was erroneously treating a
successful return as a failure and going through the error path.
However, by that time host->detect had already been scheduled. Another
mistake (fixed by commit 2de5f79d4dfcb1be16f0b873bc77d6ec74b0426d,
cherry picked as the fourth patch of this series) made the delay before
it finally executed longer, so that it ran during the long pause just
before "VFS: Mounted root (jffs2 filesystem)." (the real bug was before
that pause, as can be seen in the attached dmesg). When it finally
executed, it was not only following a NULL pointer, it was following a
NULL pointer in a structure which had already been freed!
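A rough userspace model of that chain of events follows. Everything in it is an assumption made for illustration: the names (probe_host, probe_dma_request and so on) are hypothetical, and the return convention shown (a non-negative channel number on success, a negative value on error) is my reading of the kind of change involved, not a quote of the real s3c2410_dma_request. Dropping the pending work in the error path is likewise just one plausible way to close the hole, not necessarily what the patches do.

```c
#include <stdbool.h>

/* Hypothetical model of the host state; not the real driver structure. */
struct probe_host {
    bool detect_pending;  /* delayed detect work has been queued */
    bool freed;           /* host structure has been released */
};

/* Assumed convention: channel number (>= 0) on success, negative on error. */
static int probe_dma_request(int channel)
{
    return channel >= 0 ? channel : -1;
}

/* Old-style check: any non-zero return is treated as a failure. */
static bool probe_old_check(struct probe_host *host, int channel)
{
    host->detect_pending = true;             /* card detect already queued work */
    if (probe_dma_request(channel) != 0) {   /* success on channel 2 lands here */
        host->freed = true;                  /* error path frees the host */
        return false;
    }
    return true;
}

/* Check matching the assumed convention; the error path drops the work too. */
static bool probe_new_check(struct probe_host *host, int channel)
{
    host->detect_pending = true;
    if (probe_dma_request(channel) < 0) {
        host->detect_pending = false;        /* cancel before freeing */
        host->freed = true;
        return false;
    }
    return true;
}

/* True when pending work would later run against a freed host. */
static bool probe_use_after_free(const struct probe_host *host)
{
    return host->freed && host->detect_pending;
}
```

With a successful request on a non-zero channel, probe_old_check takes the error path and leaves work pending against a freed host, which is the situation described above. probe_new_check succeeds in that case, and on a genuine failure it leaves nothing pending.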
The series has been very lightly tested (it boots, 2007.2 automounts the
card, and an ls -la /media/card shows the expected values). I haven't
tried writing or stress-testing it yet.
Given all that, I wonder whether it would be better to keep the current
driver or to backport the 2.6.27 driver (applying whatever extra patches
are needed; the first two patches of this series, for instance, should
still be needed in some form).
--
Cesar Eduardo Barros
cesarb at cesarb.net
cesar.barros at gmail.com
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: dmesg
Url: http://lists.openmoko.org/pipermail/openmoko-kernel/attachments/20080809/a41291a8/attachment.txt