Date:      Tue, 21 May 2013 14:40:38 +0200
From:      dennis berger <db@bsdsystems.de>
To:        Jeremy Chadwick <jdc@koitsu.org>
Cc:        Steven Hartland <killing@multiplay.co.uk>, FreeBSD stable <freebsd-stable@freebsd.org>, Ronald Klop <ronald-freebsd8@klop.yi.org>
Subject:   Re: still mbuf leak in 9.0 / 9.1?
Message-ID:  <387AEDDC-114C-40D6-B7C4-3AF1708F7CDF@bsdsystems.de>
In-Reply-To: <20130518153538.GA9228@icarus.home.lan>
References:  <FDFFFCCB-BDF8-4E27-AF9D-D14D7E0D426D@nipsi.de> <CAFOYbcmF5WybuyJ9DuotcJf_u1FxwBKOLtHvpnT-05cVG6ES=A@mail.gmail.com> <004BC6EA-D8E6-473E-851C-9CDA7578510A@nipsi.de> <20130515211436.GA42790@icarus.home.lan> <696B5622-A95D-4187-A027-07ECC9B5AD1F@nipsi.de> <F3B040438E014E958372DCD64566CED4@multiplay.co.uk> <4F319A22-E611-4EE6-A970-98315B15C12F@nipsi.de> <1186B7CE-EC84-42F6-8904-EDD0C4A5FFBD@bsdsystems.de> <20130517173101.GB87223@icarus.home.lan> <op.ww9yqee88527sy@ronaldradial> <20130518153538.GA9228@icarus.home.lan>

Hi Jeremy,
thanks for your detailed explanation.


-dennis

On 18.05.2013 at 17:35, Jeremy Chadwick wrote:

> On Sat, May 18, 2013 at 12:14:28PM +0200, Ronald Klop wrote:
>> On Fri, 17 May 2013 19:31:01 +0200, Jeremy Chadwick <jdc@koitsu.org> wrote:
>>
>>> On Fri, May 17, 2013 at 11:37:23AM +0200, dennis berger wrote:
>>>> Hi List,
>>>> I can confirm that it is the bug you mentioned, Steven.
>>>> Here is how I found it.
>>>>
>>>> I recorded hourly zfskern and nfsd stats like this:
>>>>
>>>> echo "PROCSTAT" >> $reportname
>>>> pgrep -S "(zfskern|nfsd)" | xargs procstat -kk >> $reportname
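For anyone who wants to reproduce this kind of recording, here is a minimal sketch that wraps the two commands above in a function suitable for an hourly cron job. The function name record_kstacks and the report-path argument are assumptions made for illustration; only the pgrep and procstat invocations come from the original setup. It loops over the PIDs instead of using xargs, which should give the same output.

```shell
# Hypothetical helper: append the kernel stacks of all zfskern and nfsd
# threads to the report file given as the first argument.
record_kstacks() {
    reportname=$1
    echo "PROCSTAT" >> "$reportname"
    # pgrep -S also matches system (kernel) processes on FreeBSD.
    for pid in $(pgrep -S "(zfskern|nfsd)"); do
        # procstat -kk prints one line per thread, including function offsets.
        procstat -kk "$pid" >> "$reportname"
    done
}
```

Called as `record_kstacks /var/log/kstacks.txt` (the path is only an example), it appends one snapshot per run.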
>>>>
>>>> Luckily it crashed last night and logged this:
>>>>
>>>> 1910 101508 nfsd             nfsd: service    mi_switch+0x186
>>>> sleepq_wait+0x42 _sleep+0x376 arc_lowmem+0x77 kmem_malloc+0xc1
>>>> uma_large_malloc+0x4a malloc+0xd9 arc_get_data_buf+0xb5
>>>> arc_read_nolock+0x1ec arc_read+0x93 dbuf_prefetch+0x12c
>>>> dmu_zfetch_dofetch+0x10b dmu_zfetch+0xaf8 dbuf_read+0x4a7
>>>> dmu_buf_hold_array_by_dnode+0x16b dmu_buf_hold_array+0x67
>>>> dmu_read_uio+0x3f zfs_freebsd_read+0x3e3
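The telltale frame in the trace above is arc_lowmem. As a rough sketch, a recorded report could be scanned for threads sleeping there. The function name count_lowmem_waiters is hypothetical, and the sketch assumes the report still holds procstat's unwrapped output, one line per thread (the trace above was wrapped by the mail client).

```shell
# Hypothetical helper: count the threads in a procstat -kk report whose
# kernel stack contains the arc_lowmem frame.
count_lowmem_waiters() {
    # grep -c prints the number of matching lines (0 if there are none).
    grep -c 'arc_lowmem' "$1"
}
```

A non-zero count for nfsd threads would be consistent with the deadlock discussed in this thread, though it does not prove it on its own.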
>>>>
>>>> Maybe it would be good to merge this fix into RELENG_9_1 and
>>>> distribute it via freebsd-update. What do you think?
>>>>
>>>> best,
>>>> -dennis
>>>>
>>>>
>>>> On 16.05.2013 at 11:42, dennis berger wrote:
>>>>
>>>>> This is indeed a ZFS+NFS system, and I can see that istgt and
>>>>> nfs are stuck in some ZIO state. Maybe it's this.
>>>>> Thanks for pointing it out.
>>>>>
>>>>> Is it this ZFS+NFS deadlock?
>>>>>
>>>>> --- a/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c
>>>>> +++ b/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c
>>>>> @@ -3720,8 +3720,16 @@ arc_lowmem(void *arg __unused, int howto __unused)
>>>>> 	mutex_enter(&arc_reclaim_thr_lock);
>>>>> 	needfree = 1;
>>>>> 	cv_signal(&arc_reclaim_thr_cv);
>>>>> -	while (needfree)
>>>>> -		msleep(&needfree, &arc_reclaim_thr_lock, 0, "zfs:lowmem", 0);
>>>>> +
>>>>> +	/*
>>>>> +	 * It is unsafe to block here in arbitrary threads, because we can come
>>>>> +	 * here from ARC itself and may hold ARC locks and thus risk a deadlock
>>>>> +	 * with ARC reclaim thread.
>>>>> +	 */
>>>>> +	if (curproc == pageproc) {
>>>>> +		while (needfree)
>>>>> +			msleep(&needfree, &arc_reclaim_thr_lock, 0, "zfs:lowmem", 0);
>>>>> +	}
>>>>> 	mutex_exit(&arc_reclaim_thr_lock);
>>>>> 	mutex_exit(&arc_lowmem_lock);
>>>>> }
>>>>>
>>>>> I'll try to crash our test system. I assume that stressing
>>>>> NFS backed by ZFS a lot might trigger this bug?
>>>>>
>>>>> -dennis
>>>>>
>>>>>
>>>>> On 16.05.2013 at 00:03, Steven Hartland wrote:
>>>>>
>>>>>> ----- Original Message ----- From: "dennis berger" <db@nipsi.de>
>>>>>>> FreeBSD  9.1-RELEASE FreeBSD 9.1-RELEASE #0 r243825: Tue Dec 4 09:23:10 UTC 2012
>>>>>>>
>>>>>>>> 3. Regarding this:
>>>>>>>>>> A clean shutdown isn't possible though. It hangs after vnode
>>>>>>>>>> cleaning, normally you would see detaching of usb devices here, or
>>>>>>>>>> other devices maybe?
>>>>>>>> Please don't conflate this with your above issue.  This is almost
>>>>>>>> certainly unrelated.  Please start a new thread about that if desired.
>>>>>>>
>>>>>>> Maybe this is a misunderstanding: normally this system shuts down cleanly, of course.
>>>>>>> This hang only appears after the network problem above.
>>>>>>
>>>>>> If this is a ZFS system, it's a known issue which is fixed in current,
>>>>>> stable-9, stable-8 and the upcoming 8.4 release.
>>>>>>
>>>>>> If not and you have USB devices see if the following sysctl helps:
>>>>>> hw.usb.no_shutdown_wait=1
>>>
>>> I'm sorry to say it won't happen.  The only updates that the -RELEASE
>>> branches get are for security.  If you want fixes for other things, you
>>> need to follow/run stable branches (i.e. stable/9), otherwise you will
>>> need to wait until 9.2-RELEASE comes out.
>>>
>>
>> And errata notices? Are they for security?
>
> Example case:
>
> http://www.freebsd.org/releases/9.1R/errata.html
>
> Only the items in section "Security Advisories" would get actual updates
> pushed out to the 9.1-RELEASE branch (e.g. RELENG_9_1); the items in
> sections "Open Issues" and "Late-breaking News" are purely FYIs.  There
> are always hundreds of bugs that never show up in either of those
> sections but are mentioned in the next official version's Release Notes.
> I can speculate all day and night as to why this is, but it's easier for
> me to just say "that's just the way it is".
>
> For example, compare the "Open Issues" in the 9.0-RELEASE errata to all
> the bugfixes in the 9.1-RELEASE Release Notes (you'll have to go through
> each item by hand and read it):
>
> http://www.freebsd.org/releases/9.0R/errata.html
> http://www.freebsd.org/releases/9.1R/relnotes-detailed.html
>
> ...and you'll see what I mean.
>
> So to recap: when you run a -RELEASE branch, you should only expect
> fixes related to security.  For any other problems, you are expected to
> run stable/X (e.g. stable/9) or backport the fix yourself.
>
> And because I am certain someone will bring it up: no, the fixes done in
> stable/X cannot necessarily be turned into a patch file for a -RELEASE
> branch.  The reason is that there are often other commits to stable/X
> branches which are for things other than bugfixes (i.e.
> re-engineering/refactoring of code, semantics changes, or entire
> portions nuked altogether).  Sometimes "backported" patches can be made,
> but it isn't always the case -- it is not always as simple as "the patch
> applied cleanly".  ZFS and NFS are two (of many) things which have been
> undergoing constant change.
>
> -- 
> | Jeremy Chadwick                                   jdc@koitsu.org |
> | UNIX Systems Administrator                http://jdc.koitsu.org/ |
> | Mountain View, CA, US                                            |
> | Making life hard for others since 1977.             PGP 4BD6C0CB |
> _______________________________________________
> freebsd-stable@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-stable
> To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org"

Dipl.-Inform. (FH)
Dennis Berger

email:   db@bsdsystems.de
mobile: +491791231509
fon: +494054001817



