Date:      Fri, 31 May 1996 12:28:50 +0200 (MET DST)
From:      grog@lemis.de (Greg Lehey)
To:        dyson@dyson.iquest.net (John Dyson)
Cc:        FreeBSD-current@FreeBSD.ORG (FreeBSD current users)
Subject:   Re: VM problems 05/29/96
Message-ID:  <199605311028.MAA25926@allegro.lemis.de>
In-Reply-To: <199605301816.NAA24087@dyson.iquest.net> from "John Dyson" at May 30, 96 01:16:37 pm

John Dyson writes:
>
>>> 	Sorry I was not able to send you the output over the weekend.
>>> But, I just got done testing your VM changes you submitted on 5/29 and
>>> things are getting better.
>>
>> I'm still having the same problems with Emacs 8-(
>
> Here is a fairly detailed status report of the solution to the
> problems that you and some others have been having:
>
> There have been several bugs that both DG and I have found.
>
> 1)	DG has found that there is a case that can block, which can allow
> 	the page queue to change in the deactivate-->free loop in
> 	the pageout daemon (just fixed.)
>
> The following were found last night by me, but not fixed in the tree yet:
> 2)	The inactive, free, active AND cache queues can be modified
> 	by vfs_bio at splbio interrupt time :-(.  (Have a fix on
> 	my machine.)
>
> 3)	The active queue can get modified in the active->deactive/cache
> 	loop in the pageout daemon by vm_page_protect(m, VM_PROT_NONE).
> 	(Have a fix on my machine.)
>
> All of the above have caused queue corruption in the pageout daemon,
> and since DG and I both have at least 32MByte of ram, the problem does
> not manifest often enough for us to have trouble.  Each fix has made my
> system running in 4-8MBytes "better", and I have often been "tricked"
> by the problem appearing to go away with each successive fix.

This may have a bearing.  I started getting my problems when I reduced
memory from 32 MB to 16 MB.

> There is still a problem when using 4MByte memory of my system locking up under
> X windows, but the queue corruption is gone now.  ((1) and (2) above could be
> the instability that some people have seen running their system as an MMAPped
> news server.)  (3) above only manifests itself when the page that is unmapped
> from a process is just prior to its pagetable page in the active queue
> (believe it or not, it happens often...)
>
> DG and I have found *numerous* bugs in the pageout daemon recently.
>
> I plan to commit the above fixes on the night of 30May.
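
As an illustration of the failure mode in (1)-(3) quoted above, here is a
toy model in Python, not FreeBSD code.  It shows a scan loop that keeps
walking a stale view of a queue after something else has removed a page,
next to one common defensive pattern (a generation counter plus restart).
PageQueue, naive_scan, restarting_scan and interrupting_process are all
hypothetical names, and nothing here is claimed to match the actual fixes.

```python
class PageQueue:
    """Toy stand-in for a VM page queue (hypothetical, not kernel code)."""

    def __init__(self, pages):
        self.pages = list(pages)
        self.generation = 0  # bumped on every modification

    def remove(self, page):
        self.pages.remove(page)
        self.generation += 1


def interrupting_process(queue, page):
    """Pretend that handling page "b" blocks and lets someone remove "c"."""
    if page == "b" and "c" in queue.pages:
        queue.remove("c")


def naive_scan(queue, process):
    """Walk a snapshot of the queue: the analogue of a stale next pointer."""
    visited = []
    for page in list(queue.pages):
        visited.append(page)      # may visit a page that already left the queue
        process(queue, page)
    return visited


def restarting_scan(queue, process):
    """Visit each queued page once; rescan from the head if the queue changed."""
    visited, seen = [], set()
    i = 0
    while i < len(queue.pages):
        page = queue.pages[i]
        if page in seen:
            i += 1
            continue
        generation = queue.generation
        seen.add(page)
        visited.append(page)
        process(queue, page)      # may "block" and let the queue change
        # If the queue changed underneath us, our position is no longer
        # trustworthy, so start again from the head.
        i = 0 if queue.generation != generation else i + 1
    return visited
```

With pages a, b, c, d, naive_scan still hands back c after it has been
removed, while restarting_scan notices the change and never touches it.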

I've just installed the latest kernel (ctm 2057), and since then
things seem to be much worse.  I've had the same problems starting
emacs, even on an idle machine with 16 MB, including a SIGILL.  After
that, the system hung, and in ddb I discovered it was looping in
vm_pageout_scan.  I took a dump, which unfortunately lost the
top-level stack frame.  Here's a trace:


(kgdb) bt
#0  boot (howto=256) at ../../i386/i386/machdep.c:940
#1  0xf01201a7 in panic (fmt=0xf0101328 "from debugger") at ../../kern/subr_prf.c:127
#2  0xf0101345 in db_panic (dummy1=-266591533, dummy2=0, dummy3=-1, dummy4=0xefbffd68 "") at ../../ddb/db_command.c:395
#3  0xf010122e in db_command (last_cmdp=0xf01f3b34, cmd_table=0xf01f3994) at ../../ddb/db_command.c:288
#4  0xf01013ad in db_command_loop () at ../../ddb/db_command.c:417
#5  0xf0103718 in db_trap (type=3, code=0) at ../../ddb/db_trap.c:73
#6  0xf01c20aa in kdb_trap (type=3, code=0, regs=0xefbffe64) at ../../i386/i386/db_interface.c:136
#7  0xf01ca8e0 in trap (frame={tf_es = 16, tf_ds = 16, tf_edi = 1, tf_esi = 134, tf_ebp = -272630104, 
      tf_isp = -272630132, tf_ebx = 6, tf_edx = -266591579, tf_ecx = 0, tf_eax = 38, tf_trapno = 3, tf_err = 0, 
      tf_eip = -266591533, tf_cs = -272695288, tf_eflags = 582, tf_esp = -266591595, tf_ss = -266430773})
    at ../../i386/i386/trap.c:399
#8  0xf01c2921 in calltrap ()
#9  0xf01ea3e2 in scgetc (noblock=1) at ../../i386/isa/syscons.c:2659
#10 0xf01e5dec in scintr (unit=0) at ../../i386/isa/syscons.c:562
#11 0xf01c32be in Xresume1 ()
#12 0xf01bb21f in vm_pageout () at ../../vm/vm_pageout.c:917
#13 0xf01112b6 in kproc_start (udata=0xf01f9660) at ../../kern/init_main.c:255
#14 0xf0111254 in main (framep=0xefbfffb8) at ../../kern/init_main.c:205
(kgdb) f 10
#10 0xf01e5dec in scintr (unit=0) at ../../i386/isa/syscons.c:562
562         c = scgetc(1);
(kgdb) x/20x $ebp
0xefbffee8:     0xefbfff64      0xf01c32be      0x00000000      0x80000000
0xefbffef8:     0xf0a10010      0x00000010      0xf0a13980      0x00000000
0xefbfff08:     0xefbfff64      0xefbfff20      0xf026cd70      0x7fffffff
0xefbfff18:     0x0069e000      0x00000000      0x00000000      0x00000000
0xefbfff28:     0xf01ba9ff      0xf01b0008      0x00000246      0x80000000
		^^^^^^^^^^
Could this be the return address?

(kgdb) 
0xefbfff38:     0x00000058      0x0024b000      0xf011c731      0x00000000
0xefbfff48:     0x00000025      0x00000000      0x00000134      0x0000001a
0xefbfff58:     0xf0264610      0xf01bb605      0x80000000      0xefbfff7c
0xefbfff68:     0xf01bb21f      0xf01f9660      0xf01fe314      0xefbfff88
0xefbfff78:     0x000009b0      0xefbfff90      0xf01112b6      0xf094b7cf
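
A sketch, in Python, of following the saved frame pointer chain through the
words dumped above.  It assumes the conventional i386 frame layout (saved
%ebp at [ebp], return address at [ebp+4]).  WORDS is transcribed from the
x/20x output; read_word and walk_frames are hypothetical helpers.

```python
DUMP_BASE = 0xefbffee8  # value of $ebp in frame 10, first word of the dump

WORDS = [
    0xefbfff64, 0xf01c32be, 0x00000000, 0x80000000,
    0xf0a10010, 0x00000010, 0xf0a13980, 0x00000000,
    0xefbfff64, 0xefbfff20, 0xf026cd70, 0x7fffffff,
    0x0069e000, 0x00000000, 0x00000000, 0x00000000,
    0xf01ba9ff, 0xf01b0008, 0x00000246, 0x80000000,
    0x00000058, 0x0024b000, 0xf011c731, 0x00000000,
    0x00000025, 0x00000000, 0x00000134, 0x0000001a,
    0xf0264610, 0xf01bb605, 0x80000000, 0xefbfff7c,
    0xf01bb21f, 0xf01f9660, 0xf01fe314, 0xefbfff88,
    0x000009b0, 0xefbfff90, 0xf01112b6, 0xf094b7cf,
]


def read_word(addr):
    """Return the dumped 32-bit word at addr, or None if outside the dump."""
    index, remainder = divmod(addr - DUMP_BASE, 4)
    if remainder or not 0 <= index < len(WORDS):
        return None
    return WORDS[index]


def walk_frames(ebp):
    """Collect return addresses by following saved %ebp values upwards."""
    returns = []
    while True:
        saved_ebp = read_word(ebp)
        return_addr = read_word(ebp + 4)
        if saved_ebp is None or return_addr is None:
            break                 # ran off the end of the dump
        returns.append(return_addr)
        if saved_ebp <= ebp:
            break                 # chain must move towards higher addresses
        ebp = saved_ebp
    return returns


for addr in walk_frames(DUMP_BASE):
    print(hex(addr))
```

Started at 0xefbffee8, the chain yields 0xf01c32be, 0xf01bb21f and
0xf01112b6, which match frames #11 to #13 of the backtrace.  0xf01ba9ff is
not on that chain.  The two words after it (0xf01b0008 and 0x00000246) would
fit a saved %cs and %eflags if this is the interrupt frame, but that is
only a guess.
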
(kgdb) i addr vm_pageout_scan
Symbol "vm_pageout_scan" is a function at address 0xf01ba8b4.
(kgdb) x 0xf01ba9ff
0xf01ba9ff <vm_pageout_scan+331>:       0x8318438b
(kgdb) i line * 0xf01ba9ff
Line 581 of "../../vm/vm_pageout.c" starts at address 0xf01ba9ff <vm_pageout_scan+331>
   and ends at 0xf01baa08 <vm_pageout_scan+340>.
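
The <vm_pageout_scan+331> annotation can be checked by hand from the
addresses kgdb printed above.  A small Python sketch of the arithmetic;
symbol_offset is a hypothetical helper, the constants are copied from the
session.

```python
VM_PAGEOUT_SCAN = 0xf01ba8b4   # from "i addr vm_pageout_scan"
SUSPECT_ADDR = 0xf01ba9ff      # word found on the stack
LINE_581_END = 0xf01baa08      # from "i line * 0xf01ba9ff"


def symbol_offset(addr, symbol_start):
    """Byte offset of addr from the start of a symbol."""
    return addr - symbol_start


offset = symbol_offset(SUSPECT_ADDR, VM_PAGEOUT_SCAN)
end_offset = symbol_offset(LINE_581_END, VM_PAGEOUT_SCAN)
print(f"<vm_pageout_scan+{offset}> .. <vm_pageout_scan+{end_offset}>")
print("bytes of code for line 581:", end_offset - offset)
```

The offsets come out as 331 and 340, agreeing with kgdb, so line 581
accounts for 9 bytes of code starting exactly at the suspect address.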

I don't know if this will point you any closer.  I'll keep the dump
for a while, so if you want any other information, let me know.

Greg


