Tue, 29 Apr 2008
GCC Tail-call Bug
Today I spent several hours debugging one of my distributed applications. The bug appeared to be in the sorting routine, which sometimes left the data not completely sorted. The strange thing was that when I added a debugging test at the end of each quick-sort recursive call (thus making sure I catch the unsorted data as early as possible), the problem has disappeared.
So, another Heisenbergish bug, I think. The source code is here: it is a straightforward recursive quick-sort, sorting entries with 32-bit key and 32-bit value (written to work on 64-bit systems only, do not bother to report that it is broken on legacy systems).
Now try to compile it and run with gcc -O2
. For me
(gcc 4.1.2 from Fedora 8) it works. However, when using -O3
(or even -O2 -finline-functions
) it doesn't. Also when
you uncomment the debug printout at the end of qsort_data()
,
it should work even with -O3
. I think the tail-call recursion
is not enabled when the debugging code is present.
Does it work for you? Is my code broken in a way I don't see? If this is not only a problem of my system, I will try to report it as a gcc bug.