[svlug] Where do I start debugging core dumps?

Akkana Peck akkana at shallowsky.com
Wed Oct 31 20:56:59 PST 2012

Dan Mashal writes:
> Don't try to debug core dumps unless you are deving.

I strongly disagree. If you have a program that's crashing
repeatedly, you can help the developers a lot by filing a bug
and including a stack trace of where it's crashing. It may be
a crash that they can't reproduce, so without your stack trace
it'll be impossible for anyone to fix it. You can even read it
yourself, without being a programmer -- the important thing is
** Don't Panic **.

(I'm assuming here we're actually talking about core dumps. As
Marco mentioned, a kernel crash is a different thing: it doesn't
leave a core file and isn't debugged with gdb.)

If you find a core file sitting in a directory and aren't sure
where it came from, you can find out with the "file" command.
For instance, I seem to have one in my home directory right now:

$ file core
core: ELF 32-bit LSB core file Intel 80386, version 1 (SYSV), SVR4-style, from 'xchat'
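
If you want to script that lookup, here is a minimal sketch. The helper
name core_origin is made up for illustration, and it assumes the "file"
output ends with a from '...' part like the line above. Newer versions of
"file" may print extra fields, or include the program's arguments inside
the quotes.

```shell
# Hypothetical helper: pull the program name out of a line of "file" output.
# Reads the line on stdin and prints whatever sits between the quotes
# after "from"; prints nothing if no such part is present.
core_origin() {
    sed -n "s/.*from '\([^']*\)'.*/\1/p"
}

# Usage (assumes a file named "core" in the current directory):
#   file core | core_origin
```
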

Now that I know it's from xchat, I can find out where xchat is:

$ which xchat
/usr/bin/xchat

and then I can use gdb to get a stack trace:

$ gdb /usr/bin/xchat core
GNU gdb (Ubuntu/Linaro 7.4-2012.04-0ubuntu2) 7.4-2012.04
  ... (lots more boring startup chatter edited out) ...
Core was generated by `xchat'.
Program terminated with signal 11, Segmentation fault.
#0  0xb35ba3a5 in _gperl_remove_mg () from /usr/lib/perl5/auto/Glib/Glib.so
(gdb) where
#0  0xb35ba3a5 in _gperl_remove_mg () from /usr/lib/perl5/auto/Glib/Glib.so
#1  0xb35ba46f in ?? () from /usr/lib/perl5/auto/Glib/Glib.so
#2  0xb6d72cf3 in g_datalist_clear () from /lib/i386-linux-gnu/libglib-2.0.so.0
#3  0xb718ec8e in ?? () from /usr/lib/i386-linux-gnu/libgobject-2.0.so.0
#4  0xb74918c4 in ?? () from /usr/lib/i386-linux-gnu/libgtk-x11-2.0.so.0
#5  0xb75a4777 in ?? () from /usr/lib/i386-linux-gnu/libgtk-x11-2.0.so.0
#6  0xb73eab34 in ?? () from /usr/lib/i386-linux-gnu/libgtk-x11-2.0.so.0
#7  0xb779ec27 in ?? () from /usr/lib/libsexy.so.2
#8  0xb718f288 in g_object_unref ()
   from /usr/lib/i386-linux-gnu/libgobject-2.0.so.0
#9  0xb71905cf in g_object_run_dispose ()
   from /usr/lib/i386-linux-gnu/libgobject-2.0.so.0
#10 0xb749272e in gtk_object_destroy ()
   from /usr/lib/i386-linux-gnu/libgtk-x11-2.0.so.0
#11 0xb739c36b in ?? () from /usr/lib/i386-linux-gnu/libgtk-x11-2.0.so.0
#12 0xb73d7837 in gtk_container_foreach ()
   from /usr/lib/i386-linux-gnu/libgtk-x11-2.0.so.0
#13 0xb73d89fe in ?? () from /usr/lib/i386-linux-gnu/libgtk-x11-2.0.so.0
#14 0xb718c1ec in g_cclosure_marshal_VOID__VOID ()
   from /usr/lib/i386-linux-gnu/libgobject-2.0.so.0
#15 0xb71892fd in ?? () from /usr/lib/i386-linux-gnu/libgobject-2.0.so.0
#16 0xb718a3d2 in g_closure_invoke ()
   from /usr/lib/i386-linux-gnu/libgobject-2.0.so.0
#17 0xb719bfa3 in ?? () from /usr/lib/i386-linux-gnu/libgobject-2.0.so.0
#18 0xb71a42dc in g_signal_emit_valist ()
   from /usr/lib/i386-linux-gnu/libgobject-2.0.so.0
#19 0xb71a4453 in g_signal_emit ()
   from /usr/lib/i386-linux-gnu/libgobject-2.0.so.0
#20 0xb7491981 in ?? () from /usr/lib/i386-linux-gnu/libgtk-x11-2.0.so.0
#21 0xb75a7e34 in ?? () from /usr/lib/i386-linux-gnu/libgtk-x11-2.0.so.0
#22 0xb71905c7 in g_object_run_dispose ()
---Type <return> to continue, or q <return> to quit---

Now, this may not mean much to you, and it's true that unless you're
an xchat (or glib or gobject) developer, you still won't know why
it's crashing.

But that's okay! Don't panic! Because this is still a perfectly
good stack trace that you can paste into a bug report ... and then
someone who IS a developer might be able to look at it and say "Oh!
That place where we're calling _gperl_remove_mg() from
g_datalist_clear() -- look, we're not checking the return value to
make sure it succeeded!" (Or whatever.)

The important thing is that you type "where" in response to the gdb
prompt to get a stack trace. (When you're done, "quit" quits gdb.)
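
If you'd rather not copy the trace out of an interactive session, gdb can
also run a command and exit. The sketch below is one way to do that. The
function name save_backtrace and the file name trace.txt are made up for
illustration; the gdb options used are -batch (run the commands, then
exit) and -ex (run one gdb command). "bt" is gdb's short name for
"backtrace", which prints the same trace as "where".

```shell
# Hypothetical helper: write a core file's stack trace to a text file
# without starting an interactive gdb session.
save_backtrace() {
    program=$1    # path to the executable, e.g. /usr/bin/xchat
    corefile=$2   # path to the core file, e.g. core
    outfile=$3    # where to write the trace, e.g. trace.txt

    # Capture both stdout and stderr so gdb's warnings land in the file too.
    gdb -batch -ex bt "$program" "$corefile" > "$outfile" 2>&1
}

# Usage:
#   save_backtrace /usr/bin/xchat core trace.txt
```

The resulting file can be attached to a bug report as-is.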

If you want to learn to read stack traces yourself, "Don't panic"
applies again. It looks like gobbledygook but you can usually figure
some things out.  The first line in the stack trace is the place
where the crash actually occurred. The second line is the place that
called that, and so forth up the chain. In this case, I don't know
much about xchat code but it looks pretty clear that it's crashing
because of something to do with perl ... xchat isn't written in perl
but some of its plug-ins are, so it's probably something to do with
a plug-in. So I've already learned quite a bit, without knowing
anything about xchat's code or even having the source code on my system.
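
As a rough illustration of that kind of reading, the sketch below lists
the libraries named in a trace, in the order they first appear, so the
one nearest the crash comes first. The name trace_libraries is made up,
and it assumes the trace has been pasted into a text file and that each
frame names its library after the word "from", as in the trace above.
(grep -o is available in GNU and BSD grep but isn't strictly POSIX.)

```shell
# Hypothetical helper: read a gdb stack trace on stdin and print each
# library once, in order of first appearance (innermost frame first).
trace_libraries() {
    # Pick out "from <path>", drop the "from ", then keep first occurrences.
    grep -o 'from [^ ]*' | sed 's/^from //' | awk '!seen[$0]++'
}

# Usage (trace.txt is an assumed file holding the pasted trace):
#   trace_libraries < trace.txt
```

For the xchat trace, the first library listed would be perl's Glib.so,
which is the same clue that points at a perl plug-in.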

I have a talk on this, "Bug Fixing for Everybody (even if you're
not a programmer!)" that I gave at a couple of conferences several
years ago.  I'd be happy to present it at SVLUG some time if there
are enough people interested.

