andy git 06/15 suspend/resume observations

Sean McNeil sean at mcneil.com
Thu Jun 19 00:21:07 CEST 2008


Mike,

First I'd like to apologize for the tone of the previous email. I
re-read it and realized it did not come across the way I meant it. I do
have a strong opinion on the subject, but "Gross Hack" was too strong.

Mike Westerhof wrote:
> Sean McNeil wrote:
>   
>> Mike (mwester) wrote:
>>     
>>>  Andy Green wrote:
>>> Somebody in the thread at some point said:
>>> | Somebody in the thread at some point said:
>>> | | Andy,
>>> | |
>>> | | I've now confirmed it is from GSM wakeup. If I do not initialize
>>> | | the GSM then the phone never locks up.
>>> |
>>> | EXCELLENT, thanks a lot.
>>> |
>>> | Mike can this plug into the serial resume problems?
>>>
>>>       
>>>> I haven't taken a thorough look at the GTA02; with the console safely
>>>> out of the way on port 2 on this device there should be absolutely no
>>>> reason for any suspend/resume ordering issue to cause lockup/hang
>>>> problems.
>>>
>>> |
>>> | How can one provoke GSM wakes then?  Although I am in runlevel 3 I do
>>> | actually have a SIM card in and am running gsmd -- last night before I
>>> | went to bed though I put it in suspend, and it woke 100% perfect
>>> | this morning after 7 - 8 hours suspended.  But it didn't wake before
>>> | that from GSM.... you really have to ring the phone?
>>>
>>>       
>>>> Depends on your rootfs; the other emails on this thread outline the
>>>> common things.  One can enable the "nspy" feature to find out.  Echo
>>>> "1" to the nspy_enable sysfs file to turn it on.  Then when the phone
>>>> wakes, repeatedly "cat" the nspy_buffer sysfs file to dump out the
>>>> event buffer.  You'll see the suspend/resume events from the point of
>>>> view of the serial and neo1973_pm_gsm drivers, and you'll see the data
>>>> stream from the UART identifying the reason for the wakeup.
>>>>         
>>>> There is one other possibility -- when the GSM first powers up, it
>>>> always issues a GSM wakeup interrupt, although it has no data to send.
>>>> Is there a possibility that the GSM is being unpowered and powering
>>>> back up?
>>>
>>> Hm what's going on here... in the resume:
>>>
>>>     /* We must defer the auto flowcontrol because we resume before
>>>      * the serial driver */
>>>     if (!schedule_work(&gsmwork))
>>>         dev_err(&pdev->dev,
>>>             "Unable to schedule GSM wakeup work\n");
>>>
>>> but in the work function there
>>>
>>> static void gsm_resume_work(struct work_struct *w)
>>> {
>>>     printk(KERN_INFO "%s: waiting...\n", __FUNCTION__);
>>>     nspy_add(NSPY_TYPE_RESUME, 'W', jiffies);
>>>     if (gsm_autounlock_delay)                    <=== zero on GTA02
>>>         msleep(gsm_autounlock_delay);        <=== no delay
>>>
>>>       
>>>> User-space on the GTA02 is expected to explicitly manage the
>>>> flow-control by means of the sysfs flag (preferred), or by means of
>>>> changing the modem control lines on the serial port (deprecated, IMO),
>>>> or as a last resort (and highly discouraged) to set the auto-unlock
>>>> delay to some non-zero value.
>>>>         
>> This is a gross hack. The whole point of device drivers having
>> suspend/resume callbacks is to put them in a proper state so that they
>> are there when you come back up. I, for one, do not want to use sysfs
>> flags at all. I already set the hardware flow control on opening the
>> device. Are you saying that the user-space must now monitor that it has
>> been resumed and then write to some sysfs file? I repeat: Gross Hack.
>>     
>
> Feel free to write something better.  There are wheels within wheels
> here, though.
>
> I explained the entire process some time ago (please search the archives
> to find those series of multi-page emails describing this in absurd
> detail!), but let me address your concerns in as brief a manner as is
> possible:
>
> 1 - I agree in principle, and have repeatedly suggested that we need a
> different device driver to support interfacing to the GSM, because the
> semantics do not quite match with what the serial driver actually does.
>  I've not had very good (!) response to that proposal, hence workarounds
> have been applied.
>
> 2 - You say that setting hardware flow-control should suffice - and I
> think you have a right to expect that it remains honored.  I was rather
> surprised, to say the least, to find the code in the low-level
> suspend/resume stuff that wraps saving the UARTs state in #ifdefs.  But
> to modify the serial driver to do this in what I think would be a more
> rational way would make it very specific to the GTA0x and would probably
> impact the higher layers that are not specific to the GTA0x devices.
>
> 3 - However, even if the previous points are addressed, it does not
> really solve the problem at a higher level.  Oh, sure you've not lost
> any data anymore -- it's sitting there in the UART and the kernel
> buffers waiting for you to read it.  So when your phone wakes up for
> some other reason hours later, it'll be able to start ringing in
> response to the SMS message that actually arrived hours earlier, but had
> the misfortune to be sent from the GSM to the UART in the window between
> the user-space process being frozen and the serial driver being suspended.
>
> So, since the latter point (#3) requires that your application do
> something to ensure that *somebody* is watching the GSM in order to
> generate the wakeup interrupt anyway, then it simply seems more
> reasonable to do the "Gross Hack", as you call it.  I really can't see
> that rewriting the entire S3C24xx serial driver, and modifying the layer
> above it, just to solve this problem, is going to result in code that's
> going upstream.  So all we have left to argue about (unless you submit a
> solution that is less of a "Gross Hack", of course) is what the means is
> that we use for the application to signal that it is preparing to
> suspend, and that the GSM should shut up and interrupt instead of
> blasting data.
>
>   

This is where I disagree. User-space has no reason to know or care that
a message arrived while it was sleeping. The radio interface layer works
just fine on a gta02 as things stand. Maybe there is some window that
needs to be closed, where serial data can come in during suspend and so
no wakeup interrupt occurs. In that case, the suspend path should set up
the wakeup interrupt and then check the serial FIFO; if something is
there, deny the suspend. As for a call, you might lose the first +CRING
or whatever, but you'll wake up on the next. So I guess you are only
worried about something like an SMS message. Also, nothing in user-space
can generate a wakeup, so I'm unclear on this logic. Finally, the
application can't do what you are proposing anyway. It is timing
critical, as you point out, so having the application say "prepare to
suspend, and interrupt me instead of blasting data" is not possible.
Come to think of it, this is exactly what suspend/resume in the driver
is supposed to do.
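
To make the "arm the wakeup, then check the FIFO, then deny" idea
concrete, here is a rough user-space sketch of the ordering only. It is
not the s3c24xx driver code: struct fake_uart and
gsm_uart_suspend_check() are hypothetical names made up for
illustration, and the FIFO count stands in for whatever the real
hardware status register reports. The one thing it relies on from the
kernel side is that a suspend callback returning a nonzero error such
as -EBUSY aborts the suspend.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the UART state; NOT the real driver structures. */
struct fake_uart {
    size_t rx_fifo_count;  /* bytes currently waiting in the receive FIFO */
    bool   wake_irq_armed; /* true once the GSM wakeup interrupt is enabled */
};

/*
 * Sketch of the proposed suspend-time check.
 * Returns 0 if suspend may proceed, -EBUSY if it should be denied.
 */
int gsm_uart_suspend_check(struct fake_uart *u)
{
    /* Arm the wakeup source first, so anything arriving from now on
     * raises a wakeup instead of being silently missed. */
    u->wake_irq_armed = true;

    /* Anything that slipped in before the wakeup was armed is sitting
     * in the FIFO: refuse to suspend so user-space gets to read it. */
    if (u->rx_fifo_count > 0) {
        u->wake_irq_armed = false; /* undo; we stay awake */
        return -EBUSY;
    }

    return 0;
}
```

With an empty FIFO the check returns 0 and leaves the wakeup armed; with
pending bytes it returns -EBUSY and the suspend attempt would be
abandoned, so the pending data is handled right away rather than hours
later.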

Now, on the gta01 things are very different, I gather. This is because
the UART is multiplexed as a console as well, no? I'm guessing that when
it is switched over to the GSM, more work should be done to make that
handover complete so that there is no longer any interference with the
console.

> All said, you don't have to do it if it's so repugnant for user-space to
> take the initiative to manage the GSM in this fashion.  I sense intense
> resistance on the part of many people on this point, frankly, and I'm
> utterly perplexed as to why this is the case.  Am I missing something
> here?  Is there a specification that outlines a permissible percentage
> for missed GSM events?
>
> (I should also mention that the above is the GTA02; if you call the
> software solution a "Gross Hack" I'm truly curious what words you have
> for the hardware challenge the GTA01 poses in regard to this!)
>
>   

Again, please accept my apology for such strong wording. So far, I am
unconvinced that this is a problem that should be pushed to user-space.

>>>> The GTA01 had different defaults because I thought (at the time) that
>>>> by using the auto-unlock technique and the horrible hack in the serial
>>>> driver, we could avoid the overrun problem.  It turns out that just
>>>> defers it, so there's no longer any point to having GTA01 handled any
>>>> differently than GTA02.
>>>>
>>>> Which (as I mentioned in an earlier email) leaves us with three means
>>>> to handle flowcontrol of the GSM, which is two too many.  This
>>>> particular bit of code will be removed, along with the code in the
>>>> serial driver that does this function from the modem-control code.
>>>>         
>>>     if (gsm_auto_flowcontrolled) {
>>>         nspy_add(NSPY_TYPE_SPECIAL, '+', jiffies);
>>>         if (machine_is_neo1973_gta01())
>>>             s3c24xx_fake_rx_interrupt(10000);
>>>         s3c2410_gpio_cfgpin(S3C2410_GPH1, S3C2410_GPH1_nRTS0);
>>>         gsm_auto_flowcontrolled = 0;
>>>     }
>>>     nspy_add(NSPY_TYPE_RESUME, 'Z', jiffies);
>>>     printk(KERN_INFO "%s: done.\n", __FUNCTION__);
>>> }
>>>
>>> There's no schedule_delayed_work, no msleep, this could execute right
>>> away, and yet it says in the comment we need to wait for serial driver
>>> !?!?
>>>
>>>       
>>>> We need to wait for the serial driver in order to avoid data loss --
>>>> not because of any hangs or lockups that I've ever observed.  The
>>>> issue is that depending on whether the low-level debug for
>>>> suspend/resume is enabled in the defconfig, the UART registers may be
>>>> restored on wake, or they may be reset to the default boot-time
>>>> settings and then reset to the settings specified in the termio
>>>> structures.  Because of the shared console on the GTA01, that device
>>>> actually sets the port to function as a console, then immediately sets
>>>> it to the desired settings -- it is this "diddling about" with the
>>>> UART status that requires that we keep the GSM from sending anything
>>>> until after things have stabilized.
>>>>         
>> OK, then it should be stabilized in the resume process.
>>     
>
> That's fine with me.  It looks like Andy has put some code together to
> do that more elegantly.  But IMO it doesn't matter -- the application
> managing communications with the GSM needs to manage flow-control of the
> GSM during suspend/resume, so it becomes a moot point.
>
>   

Yes, it is looking promising at this point. God, how I wish Linux had a
generic suspend/resume dependency mechanism. So many issues here.
>>>  I check the resume ordering
>>>
>>> [ 7187.755000] neo1973-pm-gsm neo1973-pm-gsm.0: resuming
>>>
>>> [ 7187.755000] gsm_resume_work: waiting...
>>>
>>> [ 7187.755000] gsm_resume_work: done.
>>>
>>> ...
>>> [ 7187.755000] s3c2440-uart s3c2440-uart.0: resuming
>>>
>>> [ 7187.755000] s3c24xx_serial_set_mctrl: GSM mctrl=0x00000000
>>>
>>> [ 7187.755000] s3c24xx_serial_set_mctrl: GSM mctrl=0x00000006
>>>
>>> [ 7187.755000] s3c2440-uart s3c2440-uart.1: resuming
>>>
>>> [ 7187.755000] s3c2440-uart s3c2440-uart.2: resuming
>>>
>>> Hum what happens when that completes and the uarts aren't up?
>>>
>>> -Andy
>>>       
>>>  Mike (mwester)
>>>       
> Mike (mwester)
>
>   




