Date:      Mon, 6 Mar 1995 11:17:46 -0500
From:      starkhome!gene@sbstark.cs.sunysb.edu (Gene Stark)
To:        davidg@Root.COM
Cc:        current@FreeBSD.org, dyson@Root.COM
Subject:   Page fault panics during make world in -current 
Message-ID:  <199503061617.LAA04199@starkhome.cs.sunysb.edu>
In-Reply-To: David Greenman's message of Mon, 06 Mar 1995 07:34:32 -0800 <199503061534.HAA00614@corbin.Root.COM>

>   The code in vfs_bio.c is quite complex. John and I have each gone through
>this several times trying to find problems like you've mentioned. We're pretty
>sure that the page in question is always made 'busy' or 'bmapped' before any
>calls to VM_WAIT (or any other sleep) could otherwise lose the page. I'm not
>saying that we might not have missed something...but we have looked at this
>specific potential problem more than once. The object itself can't go away
>because a reference is held to it.

OK, I understand, but the current instability of the system seems to indicate
some sort of subtle problem, so I figure having a fresh eye take a look at
the code might stand a chance of finding something.  I hope you'll pardon me
if I "find" stuff that isn't a problem, as the assumptions, invariants, etc.
that are inherent in this code take a while to tease out by reading the code
over and over.

I am still concerned about line 1046 of vfs_bio.c, though.  At line 1031,
m is determined to be either invalid or busy.  At line 1046 there is a
possibility of sleeping in the VM_WAIT.  If m is invalid, then I don't think
there is anything stopping a pager from replacing m in the object with
another page during the sleep, so that when we wake up again, m isn't a
reference to the proper page in this object any more.  If m is busy,
of course, this can't happen, because the pagers respect the busy flag
and don't replace the page in that case.
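
To make the scenario concrete, here is a toy user-space model in C.  It is
only a sketch of the concern, not the actual vfs_bio.c code: the toy_page and
toy_object types and both functions are invented for illustration, and the
direct call to toy_pager_replace() merely stands in for whatever a pager
might do while we are asleep in VM_WAIT.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for a VM page and the object that holds it. */
struct toy_page   { bool valid; bool busy; };
struct toy_object { struct toy_page *slot; };  /* one page slot at a fixed offset */

/*
 * Model of a pager that wants to install a fresh page in the slot.
 * It respects the busy flag: a busy page is never replaced.
 * Returns true if the replacement happened.
 */
static bool toy_pager_replace(struct toy_object *obj, struct toy_page *fresh)
{
    if (obj->slot != NULL && obj->slot->busy)
        return false;
    obj->slot = fresh;
    return true;
}

/*
 * Look up m, "sleep" (the pager runs in the meantime), then report
 * whether m still refers to the page that is in the object.
 */
bool toy_pointer_survives_sleep(bool mark_busy)
{
    struct toy_page old   = { .valid = false, .busy = mark_busy };
    struct toy_page fresh = { .valid = true,  .busy = false };
    struct toy_object obj = { .slot = &old };

    struct toy_page *m = obj.slot;     /* lookup before the sleep */
    toy_pager_replace(&obj, &fresh);   /* assumed activity during the sleep */
    return m == obj.slot;              /* is m still the object's page? */
}
```

In this model toy_pointer_survives_sleep(true) reports that m is still the
object's page, while toy_pointer_survives_sleep(false) reports that m has
gone stale.  The usual defence would be either to mark the page busy before
sleeping or to look the page up again after waking.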

I have the feeling a good test to exercise some of these potential problems
would be to mmap() a file, then start accessing it via the mapped addresses,
concurrently with another process that repeatedly truncates and rewrites it.
Do you have a test like this?
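
Something along the following lines is what I have in mind.  This is a rough
sketch in portable POSIX C rather than a tuned stress test: the function
name, file size and chunk size are arbitrary choices of mine, and the
SIGBUS/SIGSEGV handler is there only because touching a mapped page that
lies beyond the (momentarily truncated) end of file is expected to raise a
signal in the reader.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <fcntl.h>
#include <setjmp.h>
#include <signal.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define FILE_SIZE (64 * 1024)   /* arbitrary size for the test file */
#define CHUNK     4096          /* arbitrary write/touch granularity */

static sigjmp_buf bus_jmp;

static void on_fault(int sig) { (void)sig; siglongjmp(bus_jmp, 1); }

/* Read one mapped byte; return 0 if the access faulted, 1 otherwise. */
static int touch(const volatile char *p)
{
    if (sigsetjmp(bus_jmp, 1) != 0)
        return 0;
    char c = *p;
    (void)c;
    return 1;
}

/* Child: repeatedly truncate the file to zero length and rewrite it. */
static void writer(int fd, int rounds)
{
    char buf[CHUNK];
    memset(buf, 'x', sizeof buf);
    for (int r = 0; r < rounds; r++) {
        if (ftruncate(fd, 0) != 0)
            _exit(1);
        for (off_t off = 0; off < FILE_SIZE; off += CHUNK)
            if (pwrite(fd, buf, CHUNK, off) != CHUNK)
                _exit(1);
    }
    _exit(0);
}

/* Returns 0 if the writer finished all rounds cleanly, -1 otherwise. */
int mmap_truncate_stress(const char *path, int rounds)
{
    struct sigaction sa, dfl;
    int status = 0;
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    if (ftruncate(fd, FILE_SIZE) != 0) { close(fd); return -1; }
    void *p = mmap(NULL, FILE_SIZE, PROT_READ, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) { close(fd); return -1; }
    const volatile char *map = p;

    /* Catch faults from pages that are past end of file at the moment. */
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_fault;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGBUS, &sa, NULL);
    sigaction(SIGSEGV, &sa, NULL);

    pid_t pid = fork();
    if (pid == 0)
        writer(fd, rounds);            /* never returns */

    /* Parent: keep touching every chunk until the writer exits. */
    pid_t done = pid;
    while (pid > 0 && (done = waitpid(pid, &status, WNOHANG)) == 0)
        for (size_t off = 0; off < FILE_SIZE; off += CHUNK)
            touch(map + off);

    /* Restore default signal handling and clean up. */
    memset(&dfl, 0, sizeof dfl);
    dfl.sa_handler = SIG_DFL;
    sigemptyset(&dfl.sa_mask);
    sigaction(SIGBUS, &dfl, NULL);
    sigaction(SIGSEGV, &dfl, NULL);
    munmap(p, FILE_SIZE);
    close(fd);
    unlink(path);
    if (pid < 0 || done != pid)
        return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

Calling, say, mmap_truncate_stress("/tmp/mmap_trunc.dat", 1000) from a small
main() and running several copies in parallel during a make world would be
the sort of load I am thinking of.  The path and round count are placeholders.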

							- Gene


