Date:      Sat, 18 May 2013 12:14:28 +0200
From:      "Ronald Klop" <ronald-freebsd8@klop.yi.org>
To:        "dennis berger" <db@bsdsystems.de>, "Jeremy Chadwick" <jdc@koitsu.org>
Cc:        Steven Hartland <killing@multiplay.co.uk>, FreeBSD stable <freebsd-stable@freebsd.org>
Subject:   Re: still mbuf leak in 9.0 / 9.1?
Message-ID:  <op.ww9yqee88527sy@ronaldradial>
In-Reply-To: <20130517173101.GB87223@icarus.home.lan>
References:  <FDFFFCCB-BDF8-4E27-AF9D-D14D7E0D426D@nipsi.de> <CAFOYbcmF5WybuyJ9DuotcJf_u1FxwBKOLtHvpnT-05cVG6ES=A@mail.gmail.com> <004BC6EA-D8E6-473E-851C-9CDA7578510A@nipsi.de> <20130515211436.GA42790@icarus.home.lan> <696B5622-A95D-4187-A027-07ECC9B5AD1F@nipsi.de> <F3B040438E014E958372DCD64566CED4@multiplay.co.uk> <4F319A22-E611-4EE6-A970-98315B15C12F@nipsi.de> <1186B7CE-EC84-42F6-8904-EDD0C4A5FFBD@bsdsystems.de> <20130517173101.GB87223@icarus.home.lan>

On Fri, 17 May 2013 19:31:01 +0200, Jeremy Chadwick <jdc@koitsu.org> wrote:

> On Fri, May 17, 2013 at 11:37:23AM +0200, dennis berger wrote:
>> Hi List,
>> I can confirm that it is the bug you mentioned, Steven.
>> Here is how I found it.
>>
>> I recorded hourly zfskern and nfsd stats, like this:
>>
>> echo "PROCSTAT" >> $reportname
>> pgrep -S "(zfskern|nfsd)" | xargs procstat -kk >> $reportname
>>
>> Luckily it crashed last night and logged this:
>>
>>  1910 101508 nfsd             nfsd: service    mi_switch+0x186  
>> sleepq_wait+0x42 _sleep+0x376 arc_lowmem+0x77 kmem_malloc+0xc1  
>> uma_large_malloc+0x4a malloc+0xd9 arc_get_data_buf+0xb5  
>> arc_read_nolock+0x1ec arc_read+0x93 dbuf_prefetch+0x12c  
>> dmu_zfetch_dofetch+0x10b dmu_zfetch+0xaf8 dbuf_read+0x4a7  
>> dmu_buf_hold_array_by_dnode+0x16b dmu_buf_hold_array+0x67  
>> dmu_read_uio+0x3f zfs_freebsd_read+0x3e3
>>
>> Maybe it would be good to merge this fix into RELENG_9_1 and distribute
>> it via freebsd-update. What do you think?
>>
>> best,
>> -dennis
>>
>>
>> On 16.05.2013 at 11:42, dennis berger wrote:
>>
>> > This is indeed a ZFS+NFS system and I can see that istgt and nfs are
>> > stuck in some ZIO state. Maybe it's this.
>> > Thanks for pointing it out.
>> >
>> > Is it this ZFS+NFS deadlock?
>> >
>> > --- a/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c
>> > +++ b/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c
>> > @@ -3720,8 +3720,16 @@ arc_lowmem(void *arg __unused, int howto __unused)
>> > 	mutex_enter(&arc_reclaim_thr_lock);
>> > 	needfree = 1;
>> > 	cv_signal(&arc_reclaim_thr_cv);
>> > -	while (needfree)
>> > -		msleep(&needfree, &arc_reclaim_thr_lock, 0, "zfs:lowmem", 0);
>> > +
>> > +	/*
>> > +	 * It is unsafe to block here in arbitrary threads, because we can come
>> > +	 * here from ARC itself and may hold ARC locks and thus risk a deadlock
>> > +	 * with ARC reclaim thread.
>> > +	 */
>> > +	if (curproc == pageproc) {
>> > +		while (needfree)
>> > +			msleep(&needfree, &arc_reclaim_thr_lock, 0, "zfs:lowmem", 0);
>> > +	}
>> > 	mutex_exit(&arc_reclaim_thr_lock);
>> > 	mutex_exit(&arc_lowmem_lock);
>> > }
>> >
>> > I'll try to crash our test system. I assume that heavily stressing NFS
>> > backed by ZFS might trigger this bug?
>> >
>> > -dennis
>> >
>> >
>> > On 16.05.2013 at 00:03, Steven Hartland wrote:
>> >
>> >> ----- Original Message ----- From: "dennis berger" <db@nipsi.de>
>> >>> FreeBSD  9.1-RELEASE FreeBSD 9.1-RELEASE #0 r243825: Tue Dec  4 09:23:10 UTC 2012
>> >>>
>> >>>> 3. Regarding this:
>> >>>>>> A clean shutdown isn't possible though. It hangs after vnode
>> >>>>>> cleaning, normally you would see detaching of usb devices here, or
>> >>>>>> other devices maybe?
>> >>>> Please don't conflate this with your above issue.  This is almost
>> >>>> certainly unrelated.  Please start a new thread about that if desired.
>> >>>
>> >>> Maybe this is a misunderstanding. Normally this system will shut down
>> >>> cleanly, of course.
>> >>> This hang only appears after the network problem above.
>> >>
>> >> If this is a ZFS system, it's a known issue which is fixed in current,
>> >> stable-9, stable-8 and the upcoming 8.4 release.
>> >>
>> >> If not and you have USB devices see if the following sysctl helps:
>> >> hw.usb.no_shutdown_wait=1
>
> I'm sorry to say it won't happen.  The only updates that the -RELEASE
> branches get are for security.  If you want fixes for other things, you
> need to follow/run stable branches (i.e. stable/9), otherwise you will
> need to wait until 9.2-RELEASE comes out.
>

And errata notices? Are they for security?

Ronald.


