[Bug 788] Starting or stopping gsmd completely locks up the Neo

bugzilla-daemon at bugzilla.openmoko.org
Tue Jan 22 03:25:31 CET 2008


http://bugzilla.openmoko.org/cgi-bin/bugzilla/show_bug.cgi?id=788





------- Additional Comments From mwester at dls.net  2008-01-22 03:25 -------
> If I had to choose between the two patches, I'd prefer Fabien's, because it's
> less intrusive.
Yes, but see comment #20 -- that approach still leaves a vulnerability where a
lockup can be triggered by something as simple as someone messing with startup
scripts or boot-time ordering issues.

> However, I'd be even happier with a patch that just clears the darn flow
> control bit (in the platform-specific code).
I tried.  The way the tty stuff is handled at the highest levels in the kernel
means that we just can't find all the possible places where flow control might
be set.  The fact that the current tty structures say that it isn't set is only
good for as long as the tty is held open; it reverts as soon as the tty is
closed.  And then, of course, another process can open it and change the flow
control at any time.  Overall, it meant mucking about in code several levels
higher in the tty subsystem, and I decided that was not the place to try to fix
this.
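
To illustrate why clearing the bit once is not enough, here is a minimal
user-space sketch of how any process holding the port open can flip hardware
flow control through the standard termios interface.  The helper names
(set_hw_flow, hw_flow_enabled, apply_hw_flow) are made up for this example and
are not part of the driver or of either patch; CRTSCTS is the usual Linux/BSD
flag and is assumed to be available.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE          /* assumption: needed to expose CRTSCTS on glibc */
#endif
#include <stdbool.h>
#include <termios.h>

/* Hypothetical helper: report whether RTS/CTS flow control is requested. */
bool hw_flow_enabled(const struct termios *t)
{
    return (t->c_cflag & CRTSCTS) != 0;
}

/* Hypothetical helper: set or clear the RTS/CTS bit in a termios copy. */
void set_hw_flow(struct termios *t, bool on)
{
    if (on)
        t->c_cflag |= (tcflag_t)CRTSCTS;
    else
        t->c_cflag &= ~(tcflag_t)CRTSCTS;
}

/*
 * Hypothetical helper: apply the setting to an already-open tty.
 * Returns 0 on success, -1 on failure (errno set by tcgetattr/tcsetattr).
 */
int apply_hw_flow(int fd, bool on)
{
    struct termios t;

    if (tcgetattr(fd, &t) != 0)
        return -1;
    set_hw_flow(&t, on);
    return tcsetattr(fd, TCSANOW, &t);
}
```

Any later opener of the same device can call the equivalent of
apply_hw_flow(fd, true), so the kernel side cannot assume the bit stays clear.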

> Getting rid of the potential endless loop is a different issue, and certainly
> a worthwhile endeavour.

Agreed.  But that doesn't solve the problem, really.  It just replaces the
kernel lockup with a total lack of any console output on the serial device; might
as well just disable the console (a la Fabien's fix) if you're going to do that.

I remain firmly convinced that the correct solution is one that:
a) ensures that no combination of setting flow control and diddling with the
GTA01 serial port mux can *ever* result in a crash of the kernel -- this is
even more important now that efforts are underway to streamline the boot time.
b) ensures that, for any combination of setting flow control and diddling with
the GTA01 serial port mux, as few kernel messages as possible are lost on the
serial port.  This too is important as work continues on boot time issues, and
as effort continues to simply debug 2.6.24.
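
As a rough sketch of what (a) and (b) mean together (this is NOT the patch
under discussion, just a self-contained simulation), a console write path can
bound its wait for the transmitter and drop a character only when the wait
expires.  Everything here is hypothetical: struct fake_uart, TX_SPIN_LIMIT and
console_putc_bounded are invented for the example and do not exist in the
driver.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the UART state; not the real driver structure. */
struct fake_uart {
    bool flow_control_on;   /* hardware (CTS) flow control enabled          */
    bool cts_asserted;      /* peer signals it is ready to receive          */
    size_t sent;            /* characters handed to the transmitter         */
    size_t dropped;         /* characters given up on after the bounded wait */
};

/* Assumed upper bound on polling iterations; a real value needs tuning. */
#define TX_SPIN_LIMIT 1000

/* The transmitter may send if flow control is off or the peer is ready. */
bool tx_ready(const struct fake_uart *u)
{
    return !u->flow_control_on || u->cts_asserted;
}

/*
 * Try to emit one console character.  The loop is bounded, so a peer that
 * never asserts CTS (e.g. the mux points elsewhere) costs one lost character
 * instead of hanging forever.  Returns true if sent, false if dropped.
 */
bool console_putc_bounded(struct fake_uart *u, char c)
{
    (void)c;    /* the simulation only counts characters */

    for (int spins = 0; spins < TX_SPIN_LIMIT; spins++) {
        if (tx_ready(u)) {
            u->sent++;
            return true;
        }
    }
    u->dropped++;
    return false;
}
```

With flow control on and CTS never asserted, each call returns after at most
TX_SPIN_LIMIT polls and counts a drop; as soon as CTS comes back, output
resumes, so only the messages written during the blocked window are lost.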

Perhaps the problem is the way the patch is written?  Does anyone have any
concrete suggestions on how to rewrite it such that it's more palatable?
Someone had concerns about the use of the "unused[]" elements in the
data structure to record state (although there is precedent for this elsewhere
in the driver).  Would it make it better if elements were added to the structure,
or if the data were recorded elsewhere?  I guess I'm still not really clear on
the objections to the solution.




------- You are receiving this mail because: -------
You are on the CC list for the bug, or are watching someone who is.