ARM FCSE - random programs segfaults

Gilles Chanteperdrix gilles.chanteperdrix at xenomai.org
Sun Apr 18 11:42:25 CEST 2010


Michael Trimarchi wrote:
> Hi,
> 
> Martix wrote:
>> Hi,
>> I've compiled latest 2.6.29-andy-tracking kernel with ARM Fast Context
>> Switch Extension patch [1], using gta02_moredrivers_defconfig kernel
>> config and CONFIG_ARM_FCSE=y and CONFIG_ARM_FCSE_BEST_EFFORT=y
>> options. I've tested this kernel with latest SHR-unstable. System with
>> FCSE enabled is really faster. However, some (mostly Python)
>> applications randomly segfault during loading[3]. But when an
>> application loads properly, it runs rock stable[4].
>>
>> It seems this isn't related to the 32 MB virtual memory limit and best
>> effort mode, because applications running above this limit run properly.
>>
>> I also tested FCSE on Qt Moko v16 with nodebug v17 kernel, but Qte
>> always restarts when loading desktop.
>>
>> ARM FCSE support is already included in the Android on FreeRunner
>> distribution kernel. I would like to see it included in Openmoko
>> distribution kernels too, once it is stable and ready.
>>   

Hi,

The problem you had is basically a stack overflow: the script was
loading so many libraries that their mappings left no room for the
stack to grow. This issue was fixed by two changes:
- first, we start mmapping libraries at 8MB instead of 16MB. This gives
us more room for library mappings at the cost of a smaller heap, but
that is OK, since glibc starts creating anonymous mappings when it
runs out of heap (so, whereas the space between 8MB and 16MB was only
for the heap, that is malloc, it is now shared between malloc and mmap).
- second, the kernel now enforces the stack size limit, that is
RLIMIT_STACK. In a vanilla kernel, the stack mapping starts very small
and grows on demand. This is what allowed the failing script to load
libraries into the space the stack later needed, which prevented the
stack from growing. The kernel will now not use the space reserved for
the stack for other mappings. As this reserved space is 8MB by default
in a standard kernel, we reduced it to 1MB, which is still large enough
for most applications, but more reasonable in a 32MB address space. And
anyway, you can change that limit with the ulimit -s command (when
running ulimit in a shell, all processes subsequently created by that
shell will use the new limit).
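
For programs that are started from a script rather than from a shell,
the same effect as ulimit -s can be obtained by setting RLIMIT_STACK in
the child just before it is executed. Here is a minimal sketch in Python,
not part of the patch: the helper name run_with_stack_limit and the 2MB
value are only illustrations, and it assumes a POSIX system with
Python 3.7 or later.

```python
import resource
import subprocess
import sys

ONE_MB = 1024 * 1024


def run_with_stack_limit(argv, stack_bytes):
    """Run argv in a child whose soft RLIMIT_STACK is stack_bytes.

    Hypothetical helper for illustration; it only lowers or raises the
    soft limit and leaves the hard limit untouched.
    """

    def set_limit():
        # Runs in the child between fork() and exec(), much like typing
        # `ulimit -s` in a shell right before launching the program.
        _soft, hard = resource.getrlimit(resource.RLIMIT_STACK)
        new_soft = stack_bytes
        if hard != resource.RLIM_INFINITY:
            new_soft = min(stack_bytes, hard)  # soft cannot exceed hard
        resource.setrlimit(resource.RLIMIT_STACK, (new_soft, hard))

    return subprocess.run(argv, preexec_fn=set_limit,
                          capture_output=True, text=True, check=True)


if __name__ == "__main__":
    # The child simply reports the soft stack limit it was started with.
    child = [sys.executable, "-c",
             "import resource;"
             "print(resource.getrlimit(resource.RLIMIT_STACK)[0])"]
    result = run_with_stack_limit(child, 2 * ONE_MB)
    print("child soft RLIMIT_STACK (bytes):", result.stdout.strip())
```

Only the child is affected; the calling process keeps its own limit,
which mirrors the way a ulimit call in one shell does not change other
shells.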

All this is available in version 4 of the FCSE patch, available only
for 2.6.29 for now (patches for other versions of Linux should follow
next week), which also adds the following new features:
- more precise tracking of which processes may have entries in the
cache; this allowed us to gain 6% in hackbench runtime, see:
http://sisyphus.hd.free.fr/~gilles/pub/fcse/hackbench-fcse-v4.png
- optional dynamic pid allocation; this reduces the number of cache
flushes, at the expense of a higher context switch overhead. It is not
a win with the hackbench test, but may help with more realistic
workloads;
- optional preemptible cache flushes; this may help reduce latencies,
but at the expense of higher CPU usage;
- optional help messages; they will tell you (in the kernel console)
when a process goes over 32MB, and when a process may have had a stack
overflow. This should help you tune the stack size limit if the default
1MB is not enough, or too much, for you.
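
The kernel help messages are the authoritative signal, and their exact
text is not reproduced here. As a rough userspace cross-check, one can
compare a process's peak virtual size with the 32MB window. The sketch
below is an approximation under stated assumptions: it reads the VmPeak
field of /proc/<pid>/status, which counts total mapped size rather than
the highest address in use, and the function names are made up for this
example.

```python
import re
import sys

FCSE_LIMIT_KB = 32 * 1024  # the 32MB per-process window discussed above


def vm_peak_kb(status_text):
    """Return VmPeak in kB from /proc/<pid>/status text, or None."""
    match = re.search(r"^VmPeak:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def exceeds_fcse_window(status_text, limit_kb=FCSE_LIMIT_KB):
    """True if the peak virtual size is larger than the 32MB window.

    This is only an approximation: a fragmented address space can cross
    32MB with a smaller VmPeak.
    """
    peak = vm_peak_kb(status_text)
    return peak is not None and peak > limit_kb


if __name__ == "__main__":
    pid = sys.argv[1] if len(sys.argv) > 1 else "self"
    try:
        with open("/proc/%s/status" % pid) as status_file:
            text = status_file.read()
    except OSError as err:
        print("cannot read status for %s: %s" % (pid, err))
    else:
        print(pid, vm_peak_kb(text), exceeds_fcse_window(text))
```

Run with a pid as argument to inspect another process, or without
arguments to inspect the script itself.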

The patched kernel has been tested with LTP and shows no difference from
an unpatched kernel, but running this new patch with real applications
is of course the best test. So, I am really interested in your feedback
(but please reply to my personal address).

Finally, the URL:
http://sisyphus.hd.free.fr/~gilles/pub/fcse/downloads/fcse-2.6.29-v4.patch.bz2

-- 
					    Gilles.



More information about the openmoko-kernel mailing list