Graphics Performance

Carsten Haitzler (The Rasterman) raster at rasterman.com
Fri Apr 3 02:30:20 CEST 2009


On Thu, 02 Apr 2009 11:17:42 -0400 "Iain B. Findleton"
<ifindleton at videotron.ca> said:

> A significant issue for me is the performance of the graphics display on 
> the FR. I recall some discussions a while back about making use of the 
> XGlamo acceleration features. Has any progress been made here? It 
> appears to me that the graphics performance on the FR is poor compared 
> to, for instance, the iPhone or iTouch, both of which have slower CPUs.

totally incorrect.

1. iphone and itouch have 500-600mhz (depending on version) s3c6400 cpu cores.
these are about 2x the speed of the 2442 in the gta02. (armv6 vs armv4, faster
memory etc.).
2. they both have half the pixels to draw (320x480 vs 480x640).
3. they both use the mbx 3d accel for 2d (the fr can't sanely do this because
its 3d unit a) is limited to 256x256 for textures, b) has no render-to-texture,
and c) has a max 3d buffer size of 511x511 - that is dimensions, so it can't
even do fullscreen drawing).
 
> When applications running on the FR have their X output routed to a 
> machine with accelerated graphics, it is apparent that the FR processor 
> can deliver the X events fast enough, but the FR graphics chip interface 
> can't keep up.

correct. this is simply a hardware limitation. the chip in the FR was never
designed to work well with VGA resolution - it was designed for QVGA. it CAN do
VGA (just like your average cheap car CAN do 200km/h - but it's not going to
take a corner at 200km/h the way a proper sports car can).

the glamo can accelerate some things - but cannot do others. in the
end you will accelerate some things on it - only to be left doing others in
software on the cpu. this is where hell kicks in. then you end up transferring
data from video ram back to system ram, then writing back to video ram for more
accelerated ops, and so on. in the end you transfer data back and forth 1, 2 or
3 times to do an operation when falling back to software. even if glamo can do
the accelerated bits 5x faster than the cpu can - you will end up with a slower
overall pipeline as you keep transferring data back and forth over the
incredibly slow video bus (which can push at best about 7MB/sec of data from
system to video ram and back - about 1/6th the speed of transferring data
around system ram).
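
to put rough numbers on the paragraph above, here is a minimal sketch of the
cost model. it is not a measurement of the glamo: the 16 bits per pixel, the
cost of one cpu op and the reading of "7MB/sec" are all assumptions, and the
function names are made up for illustration.

```python
# all numbers below are illustrative assumptions, not glamo measurements
BUS_BYTES_PER_SEC = 7_000_000   # "about 7MB/sec" taken as roughly 7 million bytes/s
FRAME_BYTES = 480 * 640 * 2     # one vga frame, assuming 16 bits per pixel


def transfer_seconds(nbytes):
    """time to move nbytes across the video bus in one direction."""
    return nbytes / BUS_BYTES_PER_SEC


def software_only_seconds(n_ops, cpu_op_seconds):
    """every op runs on the cpu in system ram, then one upload at the end."""
    return n_ops * cpu_op_seconds + transfer_seconds(FRAME_BYTES)


def mixed_seconds(n_accel_ops, n_fallback_ops, cpu_op_seconds, accel_speedup=5.0):
    """accelerated ops run in video ram; each fallback reads the frame back,
    runs the op on the cpu, and writes the frame to video ram again."""
    accel = n_accel_ops * cpu_op_seconds / accel_speedup
    fallback = n_fallback_ops * (cpu_op_seconds + 2 * transfer_seconds(FRAME_BYTES))
    return accel + fallback


if __name__ == "__main__":
    op = 0.020  # hypothetical cost of one full-screen op on the cpu, in seconds
    print("software only, 4 ops:   ", software_only_seconds(4, op))
    print("3 accelerated + 1 fallback:", mixed_seconds(3, 1, op))
```

under these assumptions a single software fallback is enough to make the mixed
pipeline slower than doing all four ops on the cpu, because the two bus
crossings cost more than the accelerated ops save.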

so in the end... you will spend a lot of work on accelerated routines - getting
them to work, and end up... where you started - still just as slow. i bet on
this early on. i've read the glamo specs and played with it. this was the
conclusion i came to. if you didn't have the video bus transfer slowness then
software fallbacks wouldn't have as big an impact - but even then they need to
be optimised and there is still overhead if you can't do the operation
in-place. but that is not the case here.

so the other side of that is to do everything with the cpu in system ram and
transfer to the glamo when done - so you only deal with the slow write once, at
the end. but remember that the write, while it is being done, holds the cpu
hostage: the cpu is slowed down to 1/6th its normal speed during this write -
so you lose even more cpu power.
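
a rough sketch of what that single write costs per frame. it assumes 16 bits
per pixel and about 7 million bytes/sec over the bus, and it simplifies the
"1/6th speed" point by treating the cpu as fully busy for the whole upload.
the function names are hypothetical.

```python
BUS_BYTES_PER_SEC = 7_000_000  # assumption: "7MB/sec" taken as ~7 million bytes/s
BYTES_PER_PIXEL = 2            # assumption: 16bpp framebuffer


def upload_seconds(width, height):
    """time to push one full frame from system ram to video ram."""
    return width * height * BYTES_PER_PIXEL / BUS_BYTES_PER_SEC


def render_budget_seconds(width, height, target_fps):
    """cpu time left per frame for drawing once the upload is paid for.
    simplification: the cpu does no useful work while the upload runs."""
    return 1.0 / target_fps - upload_seconds(width, height)


if __name__ == "__main__":
    print("vga upload (s):", upload_seconds(480, 640))
    for fps in (5, 10, 15):
        print(fps, "fps leaves (s):", render_budget_seconds(480, 640, fps))
```

a negative budget means the target frame rate is out of reach at that
resolution no matter how fast the drawing code is, since the upload alone
already takes longer than one frame.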

the solution is to just update less of the screen, make drawing simple so the
cpu has to do less when software-rendering, and/or drop down to qvga.
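
to illustrate why updating less or dropping to qvga helps, here is a small
sketch comparing the bytes that must cross the bus in each case. 16 bits per
pixel and the 7 million bytes/sec figure are assumptions, and the 480x64 strip
is a made-up example of a partial update.

```python
BUS_BYTES_PER_SEC = 7_000_000  # assumption: "7MB/sec" taken as ~7 million bytes/s


def region_bytes(width, height, bytes_per_pixel=2):
    """bytes to upload for a width x height region (16bpp assumed by default)."""
    return width * height * bytes_per_pixel


def region_upload_seconds(width, height, bytes_per_pixel=2):
    """time to push that region over the video bus."""
    return region_bytes(width, height, bytes_per_pixel) / BUS_BYTES_PER_SEC


if __name__ == "__main__":
    cases = {
        "full vga (480x640)": (480, 640),
        "full qvga (240x320)": (240, 320),
        "partial update (480x64 strip)": (480, 64),  # hypothetical dirty region
    }
    for name, (w, h) in cases.items():
        print(name, region_bytes(w, h), "bytes,", region_upload_seconds(w, h), "s")
```

qvga is a quarter of the data of vga, and the hypothetical strip is a tenth,
so both cut the time spent on the bus by the same factors.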

please dig over the archives of this list. this has been gone over in gory
detail before :)

-- 
------------- Codito, ergo sum - "I code, therefore I am" --------------
The Rasterman (Carsten Haitzler)    raster at rasterman.com




