Date:      Fri, 19 Apr 2013 15:49:42 +0300
From:      Konstantin Belousov <kostikbel@gmail.com>
To:        Carl Shapiro <carl.shapiro@gmail.com>
Cc:        FreeBSD Hackers <freebsd-hackers@freebsd.org>
Subject:   Re: MADV_FREE and wait4 EFAULT
Message-ID:  <20130419124942.GA67273@kib.kiev.ua>
In-Reply-To: <CANVK_QgRBO5ZU=NHCr1XTvtxYpWk6LjWEv8Q-70mY6CzqHO2TA@mail.gmail.com>
References:  <CANVK_QgKRkpzWjA=H2u2HTp_vpxFhNLBGTVuFZmMEpBLTbzeaA@mail.gmail.com> <20130417082143.GW2930@kib.kiev.ua> <CANVK_QgRBO5ZU=NHCr1XTvtxYpWk6LjWEv8Q-70mY6CzqHO2TA@mail.gmail.com>


On Thu, Apr 18, 2013 at 02:51:43PM -0700, Carl Shapiro wrote:
> On Wed, Apr 17, 2013 at 1:21 AM, Konstantin Belousov <kostikbel@gmail.com> wrote:
>
> > Did you ensure with e.g. ktrace and procstat -v that your assumptions
> > hold, i.e. the addresses supplied as wait4(2) arguments are valid?
> > Please provide the minimal test case demonstrating the behaviour.
> >
>
> Yes.  I instrumented my code to check for a wait4 failure, print the
> addresses of the status and rusage arguments, and dump the contents of
> /proc/curproc/map.  The addresses of the status and rusage arguments are
> always in the range of a mapping and marked as read write.
It would be of some interest to see the evidence.

Is your code multithreaded?

>
> I have yet to distill the failure to a minimal test case.  The test case I
> do have is the test harness for the Go language.  After running for about
> 45 minutes I can observe a failure.  I have been working to produce
> something smaller and faster.
The test case is required to decide whether the bug is in the application
or in the OS.

>
>
> > MADV_FREE should only result in the possible loss of the previous
> > content of the page, not in the faulting of the page access. From the
> > inspection of the code, I do not see how MADV_FREE could result in
> > the memory address becoming invalid.
> >
>
> I see.  What has led us to believe this might be an issue with page faults
> is that writing zeroes to the page with memset before passing it to wait4
> makes the error go away.
There is no difference between the access performed by copyout and the
access caused by a usermode write.
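
To illustrate the kind of stand-alone check that could separate the two cases, here is a minimal sketch. It is not code from the report: the helper name wait4_after_madv_free() and the single-buffer layout are assumptions made for illustration. It dirties an anonymous page, marks it MADV_FREE, and then lets wait4(2) write the status and rusage into that page without touching it from usermode first.

```c
/* Exposes wait4() and MAP_ANON under strict -std= modes on glibc;
 * assumed to be harmless on FreeBSD. */
#define _DEFAULT_SOURCE

#include <sys/types.h>
#include <sys/mman.h>
#include <sys/time.h>
#include <sys/resource.h>
#include <sys/wait.h>

#include <assert.h>
#include <errno.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical layout: both wait4(2) output buffers live in one page. */
struct wait_buf {
	int status;
	struct rusage ru;
};

/*
 * Hypothetical helper, not taken from the original report.
 * Returns 0 if wait4(2) could copy out into a page that was just
 * marked MADV_FREE, otherwise the errno of the first failing call.
 */
int
wait4_after_madv_free(void)
{
	long pagesz = sysconf(_SC_PAGESIZE);
	struct wait_buf *buf;
	pid_t pid;
	int error = 0;

	/* Sanity check: the buffers must fit in the single mapped page. */
	assert(pagesz > 0 && sizeof(*buf) <= (size_t)pagesz);

	buf = mmap(NULL, (size_t)pagesz, PROT_READ | PROT_WRITE,
	    MAP_ANON | MAP_PRIVATE, -1, 0);
	if (buf == MAP_FAILED)
		return (errno);

	/* Dirty the page, then tell the kernel its content may be dropped. */
	memset(buf, 0xa5, (size_t)pagesz);
	if (madvise(buf, (size_t)pagesz, MADV_FREE) != 0) {
		error = errno;
		goto out;
	}

	pid = fork();
	if (pid == -1) {
		error = errno;
		goto out;
	}
	if (pid == 0)
		_exit(0);	/* child: exit at once so the parent can reap it */

	/* No usermode write to the page here: copyout is the first access. */
	if (wait4(pid, &buf->status, 0, &buf->ru) == -1)
		error = errno;	/* EFAULT here would be the reported symptom */
out:
	munmap(buf, (size_t)pagesz);
	return (error);
}
```

A single call is unlikely to show anything, given that the reported failure takes about 45 minutes to appear. The helper is meant to be called in a long loop from a small main(), possibly from several threads at once if the failing program is multithreaded.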

>
> Do you have any advice about how one might go about instrumenting wait4 to
> generate more information about a failed copyout?  Are tools such as dtrace
> useful in these situations or might it be too invasive?  Because of the
> protracted test cycle and my lack of knowledge in this area, conducting
> experiments is quite painful at the moment.

No, I cannot give any advice yet. I think we should first decide which
code is to blame.

BTW, you could try enabling sysctl machdep.uprintf_signal. Oh, you did not
specify the architecture and version of the system.




Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20130419124942.GA67273>