GTA02 u-boot and kernel status

Werner Almesberger werner at openmoko.org
Tue Mar 4 20:44:52 CET 2008


Here's a quick status update on open issues and problems I know of in
the lower reaches of the system (in no particular order):

- we currently don't use the MAC addresses assigned to each machine

  - for BT, there is already a process for using a different set of
    MAC addresses, provided to us (by the maker of the module ?).
    We can use this for now, and phase in our own MAC addresses at
    a later point in time, through a rootfs/package update.

  - for EoUSB, this will require an environment update, which we can
    make as simple as running a script (which in turn can be put into
    a package). EoUSB uses randomized MAC addresses by default, so we
    don't break existing functionality by not using ours.

  - WLAN, where MAC addresses are probably most visible, is not
    affected by any of this

  Suggested resolution: no action before MP. Think about phasing in
  use of our own MAC addresses when the more pressing issues are
  under control.
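
To illustrate why randomized EoUSB addresses and our assigned ones can
coexist: randomized addresses conventionally set the "locally
administered" bit and clear the multicast bit in the first octet, which
keeps them out of the vendor-assigned range. The sketch below assumes
the gadget driver follows that convention; the function names are
hypothetical and this is not the driver's actual code.

```python
import random


def random_local_mac(rng=random):
    """Return a random unicast, locally administered MAC address."""
    octets = [rng.randrange(256) for _ in range(6)]
    # Clear bit 0 (multicast) and set bit 1 (locally administered).
    octets[0] = (octets[0] & 0xFE) | 0x02
    return ":".join(f"{o:02x}" for o in octets)


def is_locally_administered(mac):
    """True if the address has the locally administered bit set."""
    return bool(int(mac.split(":")[0], 16) & 0x02)
```

A script that switches a device over to its assigned address could use
a check like is_locally_administered() to tell whether the address
currently in use is still a randomized one.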

- the LCM sometimes stays dark when we bring up u-boot. This one is
  a bit nasty because it's confusing if the machine just doesn't
  respond. (Even though the hardware is perfectly healthy.)

  Furthermore, since this happens in u-boot, the "safe" u-boot in NOR
  is affected as well. As a work-around, the NOR setup just cycles the
  LCM backlight again. This has so far never failed to bring the
  backlight to life, so the common recovery path (i.e., boot from
  NOR) is not affected by this problem.

  AFAIK, nobody is working on this at the moment, and it may take a
  bit of time to figure out what's really happening. (The apparent
  cause is that the overvoltage/overcurrent protection of the LED
  boost converter trips. I think Andy may have details on this.)

  Suggested resolution: since it may take a while to determine the
  exact root cause, it's probably better to defer this to a u-boot
  upgrade we strongly recommend for any new devices (similar to what
  we did for GTA01).

- the PMUs driving the LEDs seem to have a bunch of issues, including
  failure to survive suspend, failure to light up, and failure to turn
  off completely.

  This one may also take a while to sort out. It's nothing too horrible,
  since we don't make much use of the LEDs yet.

  Suggested resolution: defer this to a strongly recommended kernel
  upgrade for new devices.

- JFFS2 suffers gradual corruption

  The corruption manifests itself mainly in an increasing number of
  complaints from JFFS2, but it may also be possible that it causes
  a performance degradation or data loss/corruption (although I
  haven't observed any of the latter).

  I believe this is a mismatch between mkfs.jffs2 parameters and the
  actual NAND geometry, but I don't have a good way to predictably
  reproduce this yet (and thus couldn't tell if an attempt at solving
  it is really effective or not).

  Suggested resolution: I'll give this one more try later today, and
  when I get back home. In the worst case, this may not get resolved
  before we have to provide an image to the factory. Since we'll want
  to do a complete application update for new devices anyway, that
  update could take the form of a new rootfs (maybe this is the plan
  anyway ?), which would then have the correct settings.

  (Related to this is also the use of hardware-accelerated ECCs. I've
  turned them off until the corruption is resolved, just to make sure
  we don't add yet another potentially destabilizing factor. HW ECCs
  are probably perfectly fine, though, so we'll also have a
  performance win once all this is solved.)
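
If the mismatch theory holds, a sanity check along the following lines
could catch it before an image goes out. This is only a sketch: it
compares the erase block size given to mkfs.jffs2 (its --eraseblock
option) with what the kernel reports in /proc/mtd. The partition names
and sizes in SAMPLE are made-up values, not the real GTA02 layout, and
the function names are hypothetical.

```python
def parse_proc_mtd(text):
    """Parse /proc/mtd-style text into {name: (size, erasesize)}."""
    parts = {}
    for line in text.splitlines():
        if not line.startswith("mtd"):
            continue  # skip the "dev: size erasesize name" header
        _dev, size, erasesize, name = line.split(None, 3)
        parts[name.strip().strip('"')] = (int(size, 16), int(erasesize, 16))
    return parts


def check_jffs2_geometry(partition, eraseblock, image_size):
    """Return a list of problems found; an empty list means no mismatch."""
    size, erasesize = partition
    problems = []
    if eraseblock != erasesize:
        problems.append(
            f"mkfs.jffs2 eraseblock 0x{eraseblock:x} != NAND 0x{erasesize:x}")
    if image_size > size:
        problems.append("image is larger than the partition")
    return problems


# Hypothetical example data, for illustration only.
SAMPLE = """dev:    size   erasesize  name
mtd0: 00040000 00020000 "u-boot"
mtd1: 0f000000 00020000 "rootfs"
"""
```

On a running device, the text would come from reading /proc/mtd rather
than from SAMPLE.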

- initial power settings

  So far, we don't have a comprehensive strategy for determining how
  the available power sources affect if and how we can start the
  system. And in particular, the dreaded 500mA USB issue still
  exists.

  Matt and I have discussed an algorithm that should take care of all
  this and Matt's working on it now. (Well, I hope not _right_now_,
  since it's already kinda late ;-)

  Suggested resolution: get this fixed this week. If there are
  unexpected major difficulties, I'd just revert the 500mA change
  (which only sometimes achieves its goal of letting us boot from USB
  power alone anyway), and we'll do a proper fix in the u-boot update
  for new devices.

- PMU interrupts may not work

  The I2C unbusy change may have introduced a condition that would
  make the PMU unable to bring the system out of suspend. The fix
  should be trivial, but I first have to bring my test machine back to
  a state where it resumes at all. Then I can verify whether a fix is
  needed and, if so, whether it works.

  Suggested resolution: fix this later today. In case of undue
  difficulties, that would be yet another item for a later kernel
  update.

- GPIOs still float

  We've identified lots of improperly configured GPIOs, but we
  haven't actually fixed them. I haven't observed them causing any
  actual problems, but then floating inputs are unpredictable. (And
  we have seen floating inputs cause serious trouble in other areas.)

  I actually think that the way the initialization of GPIOs is
  currently implemented makes any changes there unnecessarily
  difficult and can easily lead to new bugs. So I think we should
  first bring this into a more programmer-friendly form, and then
  make the necessary changes.

  Suggested resolution: get this done for the great update.
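
One possible shape for a more programmer-friendly form is a single
declarative table with one entry per pin, which a checker can scan for
floating inputs. The sketch below is purely illustrative: the real
initialization lives in C in u-boot and the kernel, and the pin names,
fields, and the floating_inputs() helper are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pin:
    name: str
    mode: str              # "input", "output", or "function"
    pull: str = "none"     # "none", "up", or "down"
    driven: bool = False   # True if an external circuit always drives it


def floating_inputs(table):
    """Return the names of inputs with neither a pull nor a driver."""
    return [p.name for p in table
            if p.mode == "input" and p.pull == "none" and not p.driven]


# Hypothetical pin table, not the real GTA02 assignment.
TABLE = [
    Pin("GPA0", "output"),
    Pin("GPB1", "input", pull="down"),
    Pin("GPC2", "input"),
    Pin("GPD3", "input", driven=True),
]
```

With every pin described in one place, a review or a build-time check
only has to look at the table, not at scattered register writes.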

- AUX can crash u-boot booting from NAND

  This is the NOR-kills-NAND equivalent of the NAND-kills-NOR problem
  fixed recently. I need to have a closer look at the details of
  exception handling. As far as I remember, it should be possible to
  just remap the exception vector table, but the kernel may need some
  adjusting for this as well.

  Since u-boot for NOR and NAND will initially be identical, this
  problem will not exist in devices we ship from the factory.

  Suggested resolution: fix this for good in the great update.

So I think we should plan to have a complete system software overhaul
along with the UI update that's already planned. Most of the issues
are more annoying than truly critical. What's more important is that
none point to as-yet undiscovered major hardware issues or break
recovery from NOR.

I don't particularly like the JFFS2 corruption, so I hope we can make
progress there before MP.

I'll have a peek at Andy's whiteboard later today to see if he's got
some more monsters lined up.

- Werner



