nWAITing for Glamo
balrogg at gmail.com
Tue Jul 15 20:16:00 CEST 2008
2008/7/14 The Rasterman Carsten Haitzler <raster at openmoko.org>:
> On Mon, 14 Jul 2008 14:26:59 +0100 Andy Green <andy at openmoko.com> babbled:
> aye. this gives the nitties of what the glamo is causing in performance issues.
> basically this is the crux of the "bus bandwidth to the glamo" issue. while
> reading or writing to the glamo the cpu gets stalled waiting for the glamo -
> limiting throughput to about 7m/sec. unfortunately for us the cpu is hung
> waiting on the slow glamo when it could be off doing something more useful with
> its time (even if we accepted limited write/read rates, if they could be async
> we'd be much better off). this was the crux behind the DMA experiment dodji
> did. the problem was that the DMA was on-soc and memory to memory and would hold
> up the cpu in wait states anyway - so you don't win over using the cpu.
Ouch. If the DMA has to block the whole SoC when waiting for the
nWAIT then it's really bad, and it seems to be confirmed by what I saw
(i.e. if I disabled the clock and tried to write to glamo, the whole
SoC would hang, even the JTAG wouldn't respond). I'm pretty sure the
S3C series must be special in this regard, and someone should really
consider a completely different SoC for the next models. There are so
many things wrong with them (the timers (16-bit!), the DMA (see
below), the power management, the documentation silently taken away -
this should be alarming). Their only advantage I see is the hw team
is already familiar with them. I don't believe the cost is a blocker
for switching to, say, OMAP.
But here's an idea: we already know that we will have to "nWAIT" for a
moment before every write to Glamo - we can spend that time just
nWAITing or perhaps we can do something else on the CPU, and then
start our transfer at just the right moment for the write to be
instant or almost instant. The lengths of periods we spend nWAITing
may have some non-trivial pattern but it's easy to research.
How to do that: the OMAP dma can be told to wait a couple of clocks
before every atomic transfer (element, in omap speak), the s3c can't
do that, I just checked - but it can be synchronised with a timer. In
Dodji's test the transfers were not synchronised - they were like
memcpy/memset, i.e. a new transfer starts as soon as the previous one
finishes. Instead the DMA can stop after every 8, 16 or 32 bits or N
times that, and wait for the PWM timer. I'll try that some weekend.
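
To make the timer-paced idea a bit more concrete, here is a toy cycle-count model in C. It is only a sketch, not S3C or Glamo driver code: the names simulate_transfer, free_cycles and struct xfer_stats are made up for illustration, and it assumes (without measurement) that the Glamo needs a fixed number of cycles to recover after each write and that a write issued too early holds the bus in nWAIT for the remainder of that time.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model only: no real S3C or Glamo registers are touched. */
struct xfer_stats {
    uint64_t total_cycles;   /* cycles until the last element is written */
    uint64_t stalled_cycles; /* cycles the bus sat in nWAIT */
};

/*
 * elements:     number of atomic writes to the device
 * busy_cycles:  assumed recovery time of the device after each write
 * timer_period: cycles between timer ticks that trigger a write;
 *               0 means back-to-back writes (memcpy-style DMA)
 */
struct xfer_stats simulate_transfer(uint32_t elements,
                                    uint32_t busy_cycles,
                                    uint32_t timer_period)
{
    struct xfer_stats s = { 0, 0 };
    uint64_t now = 0;      /* cycle at which the previous write completed */
    uint64_t ready_at = 0; /* cycle at which the device accepts a write */

    for (uint32_t i = 0; i < elements; i++) {
        /* A write starts at its timer tick, or later if we are behind. */
        uint64_t tick = (uint64_t)i * timer_period;
        uint64_t issue = now > tick ? now : tick;

        /* Issued too early: the bus is held until the device is ready. */
        uint64_t stall = ready_at > issue ? ready_at - issue : 0;

        now = issue + stall + 1; /* the write itself costs one cycle */
        ready_at = now + busy_cycles;
        s.stalled_cycles += stall;
    }
    s.total_cycles = now;
    return s;
}

/* Cycles in which the bus was neither stalled nor writing an element. */
uint64_t free_cycles(struct xfer_stats s, uint32_t elements)
{
    assert(s.total_cycles >= s.stalled_cycles + elements);
    return s.total_cycles - s.stalled_cycles - elements;
}
```

In this model, simulate_transfer(1000, 9, 0) and simulate_transfer(1000, 9, 10) finish in the same number of cycles, but the unpaced run spends all of the recovery time stalled while the paced run has no stalled cycles and leaves that time free. Whether the real bus is actually free between timer ticks is exactly what the experiment would have to show.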