[svlug] Box frozen, then won't reboot.

Jean-Marc Libs libs at noos.fr
Mon Jul 15 00:49:29 PDT 2002


On Sun, 14 Jul 2002, Karsten M. Self wrote:
> on Sun, Jul 14, 2002, Jeremy Zawodny (Jeremy at Zawodny.com) wrote:
> > On Mon, Jul 15, 2002 at 01:34:02AM +0200, Jean-Marc Libs wrote:
> > > 
> > > I guess the HD is mostly OK (maybe some fsck required), and either
> > > the RAM burned in the excessive heat, or the motherboard prevents
> > > proper access to it. Or something else.
> > > 
> > > Trying to boot from a Linux BBC with choice 1 (simple boot, no framebuffer)
> > > displays the following:
> > > 
> > > ----------------------
> > > Partition check:
> > >   hda: .... <everything OK here> ...
> > > RAMDISK: Compressed image found at block 0
> > > Unable to handle kernel paging request at virtual address 0b413bbc
> > > current->tss.cr3 = 00101000, %cr3 = 00101000
> > 
> > Yeah, you've got bad ram now.  I had something like that hit a few
> > months ago.  I replaced the 512MB DIMM and it was back to normal.  Of
> > course, it was Micron/Crucial RAM, so I can get it replaced under
> > warranty. :-)
> 
> One way to troubleshoot this without yanking DIMMs is to use the MEM=
> boot option to restrict memory to a fraction of the total available.
> It's quite likely that if memory is your problem, there's a certain
> point at which it goes bad.

Thanks, I didn't know that. It's a very good thing to know.

>  If your problems are elsewhere (e.g.:
> thermal limits exceeded as suggested by Ira), you won't be able to
> isolate the problem this way.

It does not seem to be the problem described by Ira: the computer shows exactly
the same behaviour when it's all open in fresh air, and the (only) fan looks
like it whirs quite happily.

> My suggestion would be to specify 1/2 your available RAM.  Then
> double-and-halve the difference until you isolate the point at which you
> do/don't have problems.  You can step through a large space quickly this
> way.
> 
> E.g.:  256 MiB on board:
> 
>   - MEM = 128                         OK
>   - MEM = 192 ( 128 + (256-128)/2 )   OK
>   - MEM = 224                         FAIL
>   - MEM = 208                         ...etc.

That was the correct general idea.
For the record, I found a BootPrompt-HOWTO.html in /usr/doc/HOWTO
and the exact syntax that worked is:
lilo prompt: <my kernel> mem=128M

and apparently everything works fine, if a bit low on RAM.
Great news :-D
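
For anyone who does want to chase the boundary, Karsten's halving
procedure can be sketched roughly as below. This is only an illustration
in Python: boots_ok() is a hypothetical stand-in for "reboot with
mem=<N>M and see whether the box survives", the 200 MiB threshold is made
up for the simulation, and it assumes that everything below the bad spot
is fine.

```python
def find_bad_boundary(total_mib, boots_ok, step_mib=16):
    """Bisect the mem= limit; return (last_good, first_bad) in MiB.

    boots_ok(mib) is a hypothetical callback that should return True if
    the machine ran cleanly with mem=<mib>M. On real hardware each call
    is a manual reboot and test.
    """
    # Assumption: a tiny limit works and the full amount fails.
    good, bad = 0, total_mib
    while bad - good > step_mib:
        mid = (good + bad) // 2
        if boots_ok(mid):
            good = mid   # bad region lies above mid
        else:
            bad = mid    # bad region starts at or below mid
    return good, bad


# Simulated machine (made-up numbers): limits above 200 MiB reach bad RAM.
def simulated_boot(mib, first_bad_mib=200):
    return mib <= first_bad_mib


tried = []

def logged_boot(mib):
    # Record each limit that was tried, then ask the simulated machine.
    tried.append(mib)
    return simulated_boot(mib)


good, bad = find_bad_boundary(256, logged_boot)
print("limits tried:", tried)
print(f"last good: mem={good}M, first bad: mem={bad}M")
```

With the made-up threshold, the sketch tries the same sequence as
Karsten's example (128, 192, 224, 208) and stops once the gap between
the last good and first bad limit is down to step_mib.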

Figuring out which part of the RAM is faulty is not that interesting:
it's all in one DIMM, so I'll pull it out and try to figure out if
it's covered by my warranty (it should be, but I bought the PC 15 km
to the east of where I currently live, and I'm not *that* fluent in
German).
That's going to be the next interesting part :-D

> ...actually, if you can reliably repeat the error at a certain point,
> you've almost certainly got bad memory.

Thanks to everybody who helped; it really did help.
Jean-Marc Libs

-- 
And that is called paying the Dane-geld;
  But we've proved it again and again,
That if once you have paid him the Dane-geld
  You never get rid of the Dane.  -- Kipling on MS Enterprise Licensing



