Openmoko Bug #1841: white screen of death (WSOD) after resume
Openmoko Public Trac
bugs at docs.openmoko.org
Tue Dec 9 14:50:28 CET 2008
#1841: white screen of death (WSOD) after resume
-----------------------+----------------------------------------------------
Reporter:   Rorschach  | Owner:        openmoko-devel
Type:       defect     | Status:       new
Priority:   highest    | Milestone:
Component:  unknown    | Version:      GTA02v5
Severity:   critical   | Keywords:     wsod,resume
Haspatch:   0          | Blockedby:
Estimated:             | Patchreview:
Blocking:              | Reproducible: always
-----------------------+----------------------------------------------------
Comment(by joerg):
Results of our tests so far:
First we found two devices that show WSOD relatively frequently and
reproducibly: #51 from https://docs.openmoko.org/trac/ticket/1621
and an A7 PP model.

We verified the temperature dependency by warming up the whole device
(-> no WSOD), then cooling down the LCM while keeping the rest of the
device warmed up -> WSOD on the first try.

We applied the no_deep_suspend patch to the recent stable branch 2.6.24
and found (on #51) that it reduces the probability of WSOD but doesn't
fix it. There are other reports
[http://docs.openmoko.org/trac/ticket/2115] of WSOD not depending on
going to deep_suspend mode at all (so this patch shouldn't be able to
help there).

It seems deep_suspend can trigger WSOD very easily, but the mechanism
behind WSOD seems to be something other than simply a failure during
deep_suspend or during the resume from it.

WSOD depends on how long the device is suspended, i.e. it sometimes
seems to take quite a few minutes of suspend before WSOD is triggered.
This seems somewhat paradoxical given the paragraph above.

We patched the JBT6K74.c driver, increasing the existing mdelay() calls
and inserting new ones at every reasonable point of the communication
flow, and even lowered the GLAMO-SPI clock frequency, so that the LCM
should be quite comfortable with every aspect of the timing of the
control communication. Result: none. The randomness of WSOD seems
unchanged.

We added printk() calls and logged a resume that was OK and a resume
with WSOD that followed immediately after it. Comparing both sequences,
we didn't notice any significant difference, neither in the sequence of
function calls nor in timing.

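A comparison of this kind can be automated. The following is only a
minimal sketch, assuming dmesg-style lines with a "[seconds.micros]"
timestamp prefix; the sample messages, the function names and the 5 ms
tolerance are hypothetical and not taken from the actual logs.

```python
import re

# Assumed line format: "[  123.456789] message" (printk timestamps enabled).
LINE_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s*(.*)$")


def parse_trace(text):
    """Return a list of (seconds_since_first_line, message) tuples."""
    events = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            events.append((float(match.group(1)), match.group(2)))
    if not events:
        return []
    start = events[0][0]
    return [(stamp - start, msg) for stamp, msg in events]


def compare_traces(good, bad, tolerance=0.005):
    """Compare two parsed traces.

    Returns (same_sequence, deviations). `deviations` lists
    (message, good_offset, bad_offset) for matching steps whose relative
    timestamps differ by more than `tolerance` seconds.
    """
    same_sequence = [m for _, m in good] == [m for _, m in bad]
    deviations = []
    for (t_good, m_good), (t_bad, m_bad) in zip(good, bad):
        if m_good == m_bad and abs(t_good - t_bad) > tolerance:
            deviations.append((m_good, t_good, t_bad))
    return same_sequence, deviations


# Hypothetical sample data, for illustration only.
RESUME_OK = """\
[  100.000000] jbt6k74: resume enter
[  100.120000] jbt6k74: sleep out
[  100.250000] jbt6k74: display on
"""
RESUME_WSOD = """\
[  200.000000] jbt6k74: resume enter
[  200.121000] jbt6k74: sleep out
[  200.251000] jbt6k74: display on
"""

same, deviations = compare_traces(parse_trace(RESUME_OK),
                                  parse_trace(RESUME_WSOD))
print("same call sequence:", same)
print("timing deviations:", deviations)
```

Timestamps are made relative to the first line of each trace, so two
resumes recorded at different uptimes can be compared step by step.
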
Two or three times #51 completely refused to produce WSOD. After taking
out the battery for 10 min it was back to normal (meaning 95% immediate
WSOD after 20 sec of suspend).

We swapped the LCM of #51 with the one from a known-good device. Result:
40 suspend/resume cycles, as well as placing #51 with the new LCM in the
fridge for 30 min and then resuming, didn't show any WSOD.

We attached the #51 LCM to a known-good device, and it didn't show WSOD
in 6 cycles. So obviously the issue isn't located entirely on the LCM.

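To judge how much weight such clean runs carry, here is a rough sketch.
It assumes every suspend/resume cycle is an independent try with a fixed
WSOD probability, which is only an assumption (the battery-removal
observation above suggests the cycles are not fully independent). The
function names are made up for illustration.

```python
def prob_no_wsod(p_wsod, cycles):
    """Probability of seeing zero WSODs in `cycles` independent tries."""
    return (1.0 - p_wsod) ** cycles


def max_rate_consistent(cycles, confidence=0.95):
    """Largest per-cycle WSOD rate still compatible, at the given
    confidence level, with zero WSODs in `cycles` tries."""
    return 1.0 - (1.0 - confidence) ** (1.0 / cycles)


for cycles in (6, 40):
    print(cycles,
          prob_no_wsod(0.95, cycles),
          max_rate_consistent(cycles))
```

Under that assumption, 6 clean cycles would be extremely unlikely if the
swapped combination still failed at the 95% rate, but on their own they
only bound the rate to below roughly 39%; 40 clean cycles bound it to
below roughly 7%.
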
We have never seen a WSOD recover on subsequent suspend/resume cycles.
It always needed a reboot to recover. *)

So far we haven't seen a single WSOD on boot.

So we are wondering what the difference is between
a) switching LCM power down via LDO6 while keeping *all* lines to the
   LCM low (to stop reverse powering by sneak currents, and not to
   violate the JBT6K74 electrical specs), then powering up and resetting
and
b) a usual boot bringing up the LCM in a sane state.
Maybe it's pure coincidence that we have never seen a WSOD on boot so
far?

*) Further results:
We attached a debug board and reset the device to reboot it without a
power-down: WSOD recovered.

We probed the signals on the LCM FPC using a GTA03 debug board (task not
completed yet): with an old image and kernel (2008.08) there was 3.2V on
the power supply and on some of the data lines. We didn't find
differences in the probed signals between WSOD and a clear display.

We didn't see an LCM-RESET on resume, though.

While messing around with probing the signals, we got a recovery from
WSOD once, but it wasn't reproducible and only *might* be connected with
shorting reset to GND.

Removing a WSODed LCM from the device during suspend, then reconnecting
it, then resuming: WSOD recovered at least on the second resume after
that (the first one probably got confused because reconnecting the FPC
is not a clean switch and causes bounces on the lines and wrong
power-up sequences). On the first resume the LCM usually faded from
white to black.

Conclusion: the root cause of WSOD is some 'analog' thing depending on
both LCM and device. We cannot provide a good clue as to the nature of
the issue.

By first(! Vio <= VDD) switching all glamo->lcm IOs to 0V/high-Z, then
disabling LDO6 for suspend, and on resume first powering up the LCM via
LDO6 and then initializing it (incl. activating the glamo interface), we
should achieve zero power consumption of the LCM during suspend, and be
able to recover from or avoid WSOD.

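The ordering proposed above can be written down as a small model. This
is only a sketch in Python, not kernel code: the class, its method names
and the 3.2V levels are hypothetical (3.2V is borrowed from the probe
reading mentioned earlier purely as an illustrative value), and the only
thing it checks is the Vio <= VDD rule.

```python
IO_LEVEL = 3.2  # volts driven on the glamo->lcm lines (illustrative)
SUPPLY = 3.2    # volts from LDO6 when enabled (illustrative)


class LcmPowerModel:
    """Toy model of the LCM rails; not a real driver interface."""

    def __init__(self):
        self.vdd = SUPPLY
        self.vio = IO_LEVEL
        self.initialized = True
        self.violations = []

    def _check(self, step):
        # Rule from the text: IO lines must never sit above the supply.
        if self.vio > self.vdd:
            self.violations.append(step)

    def io_lines_low(self):
        self.vio = 0.0
        self._check("io_lines_low")

    def ldo6_off(self):
        self.vdd = 0.0
        self.initialized = False
        self._check("ldo6_off")

    def ldo6_on(self):
        self.vdd = SUPPLY
        self._check("ldo6_on")

    def init_lcm(self):
        # Includes re-activating the glamo interface, which drives the lines.
        self.vio = IO_LEVEL
        self.initialized = True
        self._check("init_lcm")


def suspend(lcm):
    lcm.io_lines_low()  # first: all IOs to 0V/high-Z
    lcm.ldo6_off()      # then: cut the LCM supply


def resume(lcm):
    lcm.ldo6_on()       # first: power up via LDO6
    lcm.init_lcm()      # then: initialize, incl. glamo interface


lcm = LcmPowerModel()
suspend(lcm)
resume(lcm)
print("violations (proposed order):", lcm.violations)

# For comparison: cutting LDO6 while the IO lines are still driven.
bad = LcmPowerModel()
bad.ldo6_off()
bad.io_lines_low()
print("violations (wrong order):", bad.violations)
```

The model says nothing about whether this ordering actually cures WSOD;
it only makes the intended sequence and the reverse-powering rule
explicit.
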
As Andy is much more savvy at meddling in kernel space, and he has
announced the LDO6 switch-off anyway, we didn't try to implement this
plus the needed pulldown of the glamo lines here in TPE.

jOERG
--
Ticket URL: <https://docs.openmoko.org/trac/ticket/1841#comment:119>
docs.openmoko.org <http://docs.openmoko.org/trac/>
openmoko trac