Date:      Wed, 15 Nov 2000 03:29:50 -0800
From:      Mike Smith <msmith@freebsd.org>
To:        Richard Hodges <rh@matriplex.com>
Cc:        freebsd-hackers@FreeBSD.ORG
Subject:   Re: page fault question 
Message-ID:  <200011151129.eAFBToF02993@mass.osd.bsdi.com>
In-Reply-To: Your message of "Tue, 14 Nov 2000 10:58:30 PST." <Pine.BSF.4.10.10011141036500.38382-100000@mail.matriplex.com> 

> I have been having a great time :-) debugging a device driver,
> and have run into a really fun way to panic.  With one type 
> of traffic, [something] happens and the kernel drops into
> DDB, just the way I want.

8)

> Well, actually DDB seems to get trapped in some kind of loop
> that spews messages faster than a human can read them.  When
> I finally got a piece of a clue, I booted with serial console
> and captured the following (also an endless loop):
>
>   Fatal trap 12: page fault while in kernel mode
>   fault virtual address   = 0x8
>   fault code              = supervisor read, page not present
>   instruction pointer     = 0x8:0xc014ed6b
>   stack pointer           = 0x10:0xc02b1360
>   frame pointer           = 0x10:0xc02b1388
>   code segment            = base 0x0, limit 0xfffff, type 0x1b
>                           = DPL 0, pres 1, def32 1, gran 1
>   processor eflags        = interrupt enabled, resume, IOPL = 0
>   current process         = Idle
>   interrupt mask          = net tty bio cam
>         kernel: type 12 trap, code=0
>   Stopped at 
> 
> The PC seems to have died in the DDB, that's odd (or maybe not?) 
>   ts7# nm /kernel | grep c014ed
>   c014ed38 T linker_ddb_search_symbol
>   c014edbc T linker_ddb_symbol_values

This is pretty normal; ddb is a little fragile sometimes.  You want to go
back and look at the very first trap; it will probably be different and 
will be the *real* trap.  All the rest are just ddb exploding.
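
If you have the whole serial console capture in a buffer, picking out the
first report can be sketched as below.  This is only an illustration: the
helper names first_fatal_trap() and count_fatal_traps() are made up for this
example, and it assumes every report begins with the literal text
"Fatal trap", as the one quoted above does.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Return a pointer to the first "Fatal trap" report in a captured
 * console log, or NULL if there is none.  Later reports are assumed
 * to be ddb faulting again, so only the first one is of interest.
 * (Hypothetical helper, not part of any FreeBSD tool.)
 */
static const char *
first_fatal_trap(const char *log)
{
	assert(log != NULL);
	return strstr(log, "Fatal trap");
}

/* Count every trap report, to see how many were follow-on noise. */
static size_t
count_fatal_traps(const char *log)
{
	const char *p = log;
	size_t n = 0;

	assert(log != NULL);
	while ((p = strstr(p, "Fatal trap")) != NULL) {
		n++;
		p += strlen("Fatal trap");
	}
	return n;
}
```

Printing from the pointer that first_fatal_trap() returns shows the report
worth reading; a count greater than one suggests the rest came from ddb
itself.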

> Now looking back at the panic message, it looks like the stack has
> pushed into the "frame pointer".  Is this an actual problem, or
> just some side effect of the page fault?

The frame pointer is a pointer into the stack, so no, it's not a problem.
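
To make that concrete with the two values from the trap report: on i386 the
stack grows toward lower addresses, so a frame pointer sitting a little
above the stack pointer is what a normal frame looks like.  A minimal
sketch of the arithmetic follows; frame_bytes_in_use() and the TRAP_*
macros are names invented for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the trap report quoted above. */
#define TRAP_ESP 0xc02b1360u	/* stack pointer */
#define TRAP_EBP 0xc02b1388u	/* frame pointer */

/*
 * In a normal i386 frame EBP (the frame base) is at or above ESP (the
 * current top of the stack).  Return the number of bytes between them.
 * (Hypothetical helper for illustration only.)
 */
static uint32_t
frame_bytes_in_use(uint32_t esp, uint32_t ebp)
{
	assert(ebp >= esp);
	return ebp - esp;
}
```

With the reported values, frame_bytes_in_use(TRAP_ESP, TRAP_EBP) comes to
0x28, i.e. 40 bytes of locals and saved state in the current frame.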

> Should I start spending my time looking for kernel stack hogs in
> the device driver?  I can very easily add code to log ESP & EBP;
> would that be productive?

Typically stack overruns lead to double faults (because there's no stack 
on which to handle the fault) and a spontaneous reboot.  This just sounds 
like there's something about your first trap that kills DDB (e.g. an
invalid instruction pointer).

> Is there a maximum size for a softc?  Maybe I'm accidentally ignoring
> some "code of the west" and am getting punished for it?  (It wouldn't
> be the first time).

Softc structures should never be allocated on the stack; they're
malloc'ed by the newbus infrastructure, so you should be OK there.

Hope this helps; let us know if the first trap isn't any more 
illuminating.  You might also try using remote gdb instead of ddb.

Regards,
Mike

-- 
... every activity meets with opposition, everyone who acts has his
rivals and unfortunately opponents also.  But not because people want
to be opponents, rather because the tasks and relationships force
people to take different points of view.  [Dr. Fritz Todt]
           V I C T O R Y   N O T   V E N G E A N C E







