Status of bug #1024 (periodic signal lost and re-registration)
spaar at openmoko.org
Mon Dec 22 13:06:27 CET 2008
Here is an update about the status of bug #1024.
First, some background on why it is so hard for us to
solve this bug:
- We (or rather, those who work on the GSM stuff) cannot reproduce it.
- OM only has a small part of the GSM firmware as source code.
Basically it's the AT command interface and some drivers. The rest
is delivered by TI as binary libraries only, especially the GSM
protocol stack and Layer 1. So we cannot just have a look at the
source code and search for errors.
- To get an impression of what we are talking about, here are some C
metric numbers from a comparable GSM firmware:
                        Files    Lines      Statements
  GSM protocol stack:   700      400,000    127,000
  Layer 1:              130      130,000     31,000
- The actual low-level RF work of decoding the GSM frames is done
by the DSP in the Calypso (there is an ARM and a DSP core inside).
The DSP has its code in ROM and OM has no documentation about it.
- The Calypso chipset has been "end of life" for quite some time,
and there is not much support from TI for it any more.
The above is not meant as an excuse; it is meant to show why it is
rather difficult for us to fix this problem.
What we know so far about #1024:
- We have some PCO2 traces (PCO2 is an internal TI tool) which show
that in Idle Mode (the phone is registered to the cell but there is
no voice or data traffic) the periodic reading of the BCCH (Broadcast
Control Channel) of the serving cell at some point fails. We don't
know yet what exactly fails, just that an error flag is set. Once this
happens, the error no longer goes away and, almost certainly after
some timeout, causes the "signal lost" indication and finally the
re-registration in the cell.
- In traces where bug #1024 does not occur, this error flag is only
set very rarely. And if it is set, it usually goes away within the
next few readings. The same holds when the "AT%SLEEP" workaround is
applied: the error flag is almost never set.
- This periodic reading of the BCCH occurs about every two seconds,
whether or not #1024 occurs.
- This periodic reading basically works like this: A special
timer ("special" because it is designed to support the
GSM frame timing very well) is programmed to wake up the
chip at the correct time so that the GSM frame of interest
can be received. Then the chip goes to sleep and waits
for the interrupt of the timer. There are two different
sleep modes, "Big Sleep" and "Deep Sleep".
- #1024 only occurs if "Deep Sleep" mode is active (this is the
standard behaviour, AT%SLEEP=2 disables it and only "Big Sleep"
is used). The special thing about "Deep Sleep" mode is that the
fast oscillator of the Calypso is turned off and it relies on the
32kHz oscillator only.
- "Big Sleep" draws more current than "Deep Sleep", so it's not a
perfect workaround to disable "Deep Sleep" completely. We have not
yet measured exactly how much the standby time of the phone is
affected if "Deep Sleep" is turned off. I assume that the effect
should not be neglected.
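
To make the difference between a "healthy" trace and a #1024 trace concrete, here is a minimal sketch of how a sequence of per-reading error flags could be classified. It assumes the PCO2 trace has already been reduced to one boolean per BCCH reading (True = error flag set); that reduction, the function names and the threshold of 5 consecutive readings are hypothetical and not taken from any TI tool.

```python
from typing import List, Tuple


def error_runs(flags: List[bool]) -> List[Tuple[int, int]]:
    """Return (start_index, length) for every run of consecutive error readings."""
    runs = []
    start = None
    for i, flag in enumerate(flags):
        if flag and start is None:
            start = i                      # a new run of errors begins here
        elif not flag and start is not None:
            runs.append((start, i - start))
            start = None                   # the error went away again
    if start is not None:                  # trace ends while the error is still set
        runs.append((start, len(flags) - start))
    return runs


def looks_stuck(flags: List[bool], min_run: int = 5) -> bool:
    """True if the trace ends in an error run of at least min_run readings.

    min_run is an assumed threshold for illustration, not a firmware value.
    """
    runs = error_runs(flags)
    if not runs:
        return False
    start, length = runs[-1]
    return start + length == len(flags) and length >= min_run


# Rare error that clears within a few readings (no #1024)
healthy = [False] * 10 + [True, True] + [False] * 10
# Error that appears and never clears again (the #1024 pattern)
stuck = [False] * 10 + [True] * 8
```

With readings about two seconds apart, the run length also gives a rough idea of how long the error persisted before the "signal lost" indication.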
There are several open questions:
- The problem could come from "drifting away" in "Deep Sleep" mode from
the right point of time to receive the frame. The firmware does some
adjusting of the 32kHz oscillator, but there are several things which
could go wrong (software and/or hardware issue).
- We should check the 32kHz oscillator, especially have a look at
the 220k resistor R1050. In one of the Calypso docs and in the TI
reference implementation this resistor is 100k. TI is very picky
about the 32kHz resonator; they mention quite a lot of things to
take care of. Is there a reason why we chose 220k?
- Is there a regular pattern when bug #1024 occurs? For example,
does it depend on temperature? Or does it depend on the charge
level of the battery?
- Is there a way to reproduce #1024? Does it only occur with certain
phones? Or does it depend on the cell where the phone is registered?
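
To give a feeling for the "drifting away" question above, here is a rough back-of-the-envelope sketch. It assumes the slow clock is a nominal 32.768 kHz watch crystal and that the chip sleeps for the whole two-second interval between BCCH readings with no recalibration; both are simplifying assumptions, and the ppm figure in the example is purely illustrative, not a measurement. The GSM bit period of 48/13 microseconds is the standard value.

```python
GSM_BIT_US = 48.0 / 13.0      # GSM bit period in microseconds (about 3.69 us)
SLOW_CLOCK_HZ = 32768.0       # assumed nominal frequency of the "32kHz" oscillator


def drift_us(sleep_s: float, error_ppm: float) -> float:
    """Timing error in microseconds after sleep_s seconds on a clock that is
    off by error_ppm parts per million (seconds * ppm gives microseconds)."""
    return sleep_s * error_ppm


def drift_bits(sleep_s: float, error_ppm: float) -> float:
    """The same timing error expressed in GSM bit periods."""
    return drift_us(sleep_s, error_ppm) / GSM_BIT_US


def slow_tick_bits() -> float:
    """Length of one slow-clock tick in GSM bit periods."""
    return (1e6 / SLOW_CLOCK_HZ) / GSM_BIT_US


# Illustrative only: an uncorrected 20 ppm error over a 2 s sleep
example_us = drift_us(2.0, 20.0)
example_bits = drift_bits(2.0, 20.0)
```

Under these assumptions an uncorrected error of a few tens of ppm already amounts to several bit periods per sleep interval, and a single slow-clock tick is itself about eight bit periods long, which is why the calibration of the 32kHz oscillator matters so much in "Deep Sleep" mode.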
Please feel free to add your comments and thoughts. We are really trying
to fix this problem, but we need your help: please report as many details
as possible about the circumstances of bug #1024. Thank you very much.