[qtmoko] New significant speedups coming to FreeRunner

Carsten Haitzler (The Rasterman) raster at rasterman.com
Tue Feb 16 17:18:27 CET 2010


On Tue, 16 Feb 2010 16:49:30 +0100 Thomas White <taw at bitwiz.org.uk> said:

> On Tue, 16 Feb 2010 16:19:08 +0100
> David Garabana Barro <david at garabana.com> wrote:
> 
> > >Now i just change a few kernel config options and few line patch (thanks
> > >to Thomas White) and the graphics speed is very nice. In QVGA it can
> > >probably match iPhone or any Android device.
> > 
> > No, it can't, at least until we have an OpenGL driver. But it's true that
> > using VGA resolution is a handicap for such a slow graphics chip, and it
> > would be better QVGA for this hardware.
> 
> A small point, but there are things we can do along the way to a full
> GL driver which speed things up, and I don't think we've found them all
> just yet.  For instance, adding proper fencing in the DRM driver
> unclogs things by a fairly noticeable amount: fullscreen (VGA) blits at
> 100fps with 0% CPU usage, anyone?

indeed nice.. if there is 100fps of data to blit TO the screen usefully.
something has to generate that data... :) technically if you tried to
implement full xrender accel - evas could be partly accelerated by the 2d hw -
but.. it's my guess (and still is - but if you ever get that far - prove me
wrong :)) that for every win u get from using the 2d hw accel, you will post a
loss by falling back to software ops going across the bus from cpu <-> glamo as
the 2d is only partly able to implement xrender and the kind of ops you need.

of course.. i'm open to be proven wrong, but.. it's my guess that after all the
work and effort, you'll have spent that effort standing still (i.e. gaining on
one hand, losing on the other).
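
to put a rough number on the 100fps figure: a small sketch of how much pixel data something would have to generate per second to keep such a blit rate fed. the 480x640 panel size is the freerunner's vga mode; the 16 bits per pixel is an assumption for illustration, and `blit_bandwidth` is a hypothetical helper, not part of any driver.

```python
def blit_bandwidth(width, height, bytes_per_pixel, fps):
    """Bytes per second needed to fill a width x height screen fps times a second."""
    return width * height * bytes_per_pixel * fps

# Assumed: 480x640 VGA panel, 16bpp (2 bytes per pixel), 100 frames per second.
rate = blit_bandwidth(480, 640, 2, 100)
print(rate / 1e6)  # megabytes per second of freshly rendered pixels
```

the blit itself may cost 0% cpu, but that many bytes of new content still have to come from somewhere every second.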

as for GL.. there is gl-es1.1 - but only minimally useful. can't do vga
rendering - so u need to drop to qvga anyway. max texture size of 256x256? not
useful for 2d anymore. evas has a full opengl-es2.0 engine - but 2.0 and 1.x in
gl-es are completely incompatible api's so you choose one or the other to work
on - i chose 2.0 as it's friendly to 2d much more than 1.1. given just the 2
limits above i suspect it will be marginally useful at best.
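
a quick sketch of what the 256x256 texture limit means for 2d: any surface larger than that has to be cut into tiles. the screen sizes are the freerunner's vga and qvga modes in portrait; `tiles_needed` is a hypothetical helper for illustration only.

```python
import math

def tiles_needed(width, height, max_tex=256):
    """Number of max_tex x max_tex textures needed to cover a width x height surface."""
    return math.ceil(width / max_tex) * math.ceil(height / max_tex)

VGA = (480, 640)
QVGA = (240, 320)

print(tiles_needed(*VGA))   # 6 tiles for one fullscreen image
print(tiles_needed(*QVGA))  # 2 tiles even at qvga
```

so even a single fullscreen background cannot live in one texture, which is part of why the limit hurts 2d use.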

also note - i've been working on soc's with full gl-es2.0 gpu's and fast shared
buses where cpu and gpu are living in the same memory and there is no system ->
video ram bottleneck... and software can equal or beat hw gl in a lot of cases.
in others gl can beat software - but it's not immensely common. i've pushed
things to minimise gl state transitions a lot - minimise re-binds, use texture
atlases - and i've pushed shaders to take a lot of the weight of things like
enabling and disabling blending and more. of course u'd need 2.0 to have
shaders... but in general gl-es is really good at 3d. ie rotations and
perspective transforms. for simpler things like plain alpha blending, blits,
fills etc. software can equal or beat even the best gpu's (i'm talking s3c6410,
omap3 - sgx 530 and even an sgx540 clocking in at 200mhz and 4 cores).
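
to illustrate the point about re-binds and texture atlases, here is a minimal sketch that only counts how often the bound texture would change for a given draw order. it is not evas code; `count_binds` and the texture names are made up for the example.

```python
def count_binds(draws):
    """Count texture binds needed when drawing the given texture ids in order."""
    binds = 0
    current = None
    for tex in draws:
        if tex != current:  # a state transition: a different texture must be bound
            binds += 1
            current = tex
    return binds

# Hypothetical frame: icons and glyphs drawn alternately.
draws = ["icon", "font", "icon", "font", "icon", "font"]

print(count_binds(draws))                   # 6: every draw re-binds
print(count_binds(sorted(draws)))           # 2: draws grouped by texture
print(count_binds(["atlas"] * len(draws)))  # 1: everything packed into one atlas
```

grouping by texture changes the paint order, so it is only safe where the draws do not overlap; an atlas avoids that problem because the order can stay as it is.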

example:

http://www.rasterman.com/files/other-vs-gl.html

notice gl really works well on the 3d stuff (rotations/perspective), but can
lose badly on many other paths. and that's one of the top-of-the-line gpu's in
embedded.

and for s3c6410 (this is what was once considered for gta03/04 long ago, and
by now this SoC is considered old/legacy):

http://www.rasterman.com/files/s3c-gl-vs-soft.html

the amount of work needed to get it up to snuff was non-trivial with full docs
and sample driver code and a few people working on it full time for many months
from both ends (kernel driver, xserver, userspace libgl and application sides).

you just need to know that - you may get it to work in the end. sure. that may
happen. but... your results may be sorely disappointing :( just beware.

> > Fact is that glamo is a graphics "decelerator". It's known that Neo1973 was 
> > faster than FreeRunner on graphics (even on VGA), despite of slower
> > processor.
> 
> Yes, the bus speed is a fundamental limitation, and it does suck.
> But there are other reasons (see below) why the current driver and
> rendering model is a bad match for the hardware.  In fact, it's a bad
> match for almost all hardware, it's just that normally the overall
> speed is high enough to get away with it.  We haven't yet allowed
> ourselves to make meaningful use of the acceleration features, and I'm
> absolutely convinced that if we did so then the GTA02's UI could fly
> along.  It's a fact that to get to this state we're going to have to
> write a lot of hardware-specific code, and each developer who would
> potentially work on this stuff has to make their own decision about
> whether they want to do that for a GPU which won't be found elsewhere.

that's one of the saddest things. if glamo had a future.. a glamo2,3,4 that
would be found.. it'd be worth effort - but all of this will be for a 1-off gpu
that is for a dead platform (freerunner is no longer produced and there is no
successor coming that has the glamo or successors as part of it). how much
effort do you put into something like that?

me - i am a pragmatist here - i'd put as little effort as possible that is not
going to port onto newer soc's and platforms. that's what i've been doing - and
it's paid off - as above. gl-es2.0 engine - does video textures, rotations..
hell e17 even has an opengl-es capable compositor module now that works quite
decently. and there is a definite future. :)

> Extract from
> http://lists.shr-project.org/pipermail/shr-devel/2009-December/001702.html
> - see the thread for context.
> --->
> If you're only talking about the X protocol overhead, then that's true
> - although I haven't yet seen any numbers...
> 
> However, it's not the driver's fault.  By the time (say) GTK's rendering
> instructions get to our driver (i.e. xf86-video-glamo), they've been
> turned into a series of tiny rectangle operations which are almost
> impossible to accelerate in any useful way.  In this sense, the way X
> requires programs to send their rendering commands, and the way
> GTK/Cairo sends its commands, and the way the X server core communicates
> with the driver, are hurting us.
> 
> Essentially, that's why E is so much faster: it prepares larger chunks
> of data at a higher level where acceleration can be much more
> meaningful, then sends them to the server in one big block.  The price
> of this is that the acceleration done by the driver is hardly used in
> most cases, so we still don't get the best out of our hardware.

well that depends what engine - software - yes. cpu generates everything - all
pixels and blasts them to x the fastest way it can. for xrender everything is
prepared and then blasted as a series of xrender ops. for gl - same. prepared
and evas blasts lots of gldrawarrays worth of triangles. it lets the gpu
pipeline take it from there. as such - this is pretty much how most modern hw
likes it - it likes to be given large batches of commands in a pipeline, not
piecemeal ones-ies and twos-ies. :)
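
a minimal sketch of the batching idea, assuming plain rectangles as input: instead of submitting each rectangle on its own, they are all turned into triangles and collected into one vertex list that could be handed over in a single draw call. this is an illustration, not how evas actually builds its arrays; `rect_to_triangles` and `build_batch` are hypothetical names.

```python
def rect_to_triangles(x, y, w, h):
    """Return two triangles (6 vertices) covering the rectangle."""
    x2, y2 = x + w, y + h
    return [(x, y), (x2, y), (x2, y2),
            (x, y), (x2, y2), (x, y2)]

def build_batch(rects):
    """Concatenate the triangles of all rectangles into one vertex list."""
    verts = []
    for rect in rects:
        verts.extend(rect_to_triangles(*rect))
    return verts

# Three made-up rectangles as (x, y, width, height).
rects = [(0, 0, 10, 10), (10, 0, 10, 10), (0, 10, 20, 5)]
batch = build_batch(rects)

print(len(rects), "rectangles ->", len(batch), "vertices in one submission")
```

with a real gl context the whole list would go out as one triangle draw rather than three separate ones, which is the "large batches" shape the pipeline prefers.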

> A more fundamental redesign could potentially allow such pitfalls
> to be side-stepped, but this also comes at a price:  Hardware-dependent
> code would end up existing at a higher level in the software [1],
> reducing the reusability of code.
> 
> [1] In the extreme case, hardware-dependent code can be moved all the
> way up to the individual client program, abstracted by a library.
> This is what DRI does, in which case that abstraction library is
> usually Mesa, providing an OpenGL API.
> <---
> 
> My decision about this was simple:  Since I enjoy the development work,
> it doesn't make any difference to me that the hardware will go away in
> time.  Nothing is forever, and this is a perfect opportunity to learn
> about driver development on a relatively tame piece of hardware.  I
> don't have any immediate plans for world domination [2]..

and that's totally fair enough. good to know you know what you're looking at
and its usefulness etc. etc. i'm just happy that your expectations are
realistic. me - i'm less about the exercise and more about the end result...
but that's me :)

/me goes back to whooshing his smooth lists around his screen

> Tom
> 
> [2] ... or is that just what I want you to believe?  Mwahahahaha...
> 
> -- 
> Thomas White <taw at bitwiz.org.uk>
> 
> _______________________________________________
> Openmoko community mailing list
> community at lists.openmoko.org
> http://lists.openmoko.org/mailman/listinfo/community
> 


-- 
------------- Codito, ergo sum - "I code, therefore I am" --------------
The Rasterman (Carsten Haitzler)    raster at rasterman.com



