What I'm getting at is that a memset() to Glamo will go a lot faster
than a memcpy().  If the first test was of the memset() variety, that
would explain why DMA seems to run at half the speed.

If both tests were memcpy()-type operations, then it's still a mystery.
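
One way to check would be to time both kinds of access against the same
destination.  Here's a rough sketch in plain C, not tested on Glamo:
time_fill() and time_copy() are names I made up for illustration, and
dst is assumed to point at whatever mapping of Glamo memory you already
have (an ordinary buffer works for a dry run).

```c
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Convert a clock() tick difference to seconds of CPU time. */
static double ticks_to_seconds(clock_t start, clock_t end)
{
    return (double)(end - start) / (double)CLOCKS_PER_SEC;
}

/* Write-only test: fill len bytes at dst with a constant byte.
 * Returns the CPU time taken, in seconds. */
double time_fill(void *dst, size_t len)
{
    clock_t start = clock();
    memset(dst, 0xAA, len);
    return ticks_to_seconds(start, clock());
}

/* Read-and-write test: copy len bytes from src to dst.
 * Returns the CPU time taken, in seconds. */
double time_copy(void *dst, const void *src, size_t len)
{
    clock_t start = clock();
    memcpy(dst, src, len);
    return ticks_to_seconds(start, clock());
}
```

Call each one a few times with the same len and compare bytes per
second.  memcpy() has to read its source as well as write, so if the
first test looks like time_fill() and the DMA test looks like
time_copy(), the gap may not be DMA's fault at all.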

