From: Tom Curry <thomasrcurry@gmail.com>
To: David Adam
Cc: FreeBSD Filesystems <freebsd-fs@freebsd.org>
Date: Mon, 8 Feb 2016 09:15:31 -0500
Subject: Re: Poor ZFS+NFSv3 read/write performance and panic

On Sun, Feb 7, 2016 at 11:58 AM, David Adam wrote:

> Just wondering if anyone has any idea how to identify which devices are
> implicated in ZFS' vdev_deadman(). I have updated the firmware on the
> mps(4) card that has our disks attached, but that hasn't helped.
>
> Thanks
>
> David
>
> On Fri, 29 Jan 2016, David Adam wrote:
>
> > We have a FreeBSD 10.2 server sharing some ZFS datasets over NFSv3. It's
> > worked well until recently, but has started to routinely perform
> > exceptionally poorly, eventually panicking in vdev_deadman() (which I
> > understand is a feature).
> >
> > Initially after booting, things are fine, but performance rapidly begins
> > to degrade.
> > Both read and write performance is terrible, with many operations
> > either hanging indefinitely or timing out.
> >
> > When this happens, I can break into DDB and see lots of nfsd processes
> > stuck waiting for a lock:
> >
> >   Process 784 (nfsd) thread 0xfffff80234795000 (100455)
> >   shared lockmgr zfs (zfs) r = 0 (0xfffff8000b91f548) locked @
> >   /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c:2196
> >
> > and the backtrace looks like this:
> >
> >   sched_switch() at sched_switch+0x495/frame 0xfffffe04677740b0
> >   mi_switch() at mi_switch+0x179/frame 0xfffffe04677740f0
> >   turnstile_wait() at turnstile_wait+0x3b2/frame 0xfffffe0467774140
> >   __mtx_lock_sleep() at __mtx_lock_sleep+0x2c0/frame 0xfffffe04677741c0
> >   __mtx_lock_flags() at __mtx_lock_flags+0x102/frame 0xfffffe0467774210
> >   vmem_size() at vmem_size+0x5a/frame 0xfffffe0467774240
> >   arc_reclaim_needed() at arc_reclaim_needed+0xd2/frame 0xfffffe0467774260
> >   arc_get_data_buf() at arc_get_data_buf+0x157/frame 0xfffffe04677742a0
> >   arc_read() at arc_read+0x68b/frame 0xfffffe0467774350
> >   dbuf_read() at dbuf_read+0x7ed/frame 0xfffffe04677743f0
> >   dmu_tx_check_ioerr() at dmu_tx_check_ioerr+0x8b/frame 0xfffffe0467774420
> >   dmu_tx_count_write() at dmu_tx_count_write+0x17e/frame 0xfffffe0467774540
> >   dmu_tx_hold_write() at dmu_tx_hold_write+0xba/frame 0xfffffe0467774580
> >   zfs_freebsd_write() at zfs_freebsd_write+0x55d/frame 0xfffffe04677747b0
> >   VOP_WRITE_APV() at VOP_WRITE_APV+0x193/frame 0xfffffe04677748c0
> >   nfsvno_write() at nfsvno_write+0x13e/frame 0xfffffe0467774970
> >   nfsrvd_write() at nfsrvd_write+0x496/frame 0xfffffe0467774c80
> >   nfsrvd_dorpc() at nfsrvd_dorpc+0x66b/frame 0xfffffe0467774e40
> >   nfssvc_program() at nfssvc_program+0x4e6/frame 0xfffffe0467774ff0
> >   svc_run_internal() at svc_run_internal+0xbb7/frame 0xfffffe0467775180
> >   svc_run() at svc_run+0x1db/frame 0xfffffe04677751f0
> >   nfsrvd_nfsd() at nfsrvd_nfsd+0x1f0/frame 0xfffffe0467775350
> >   nfssvc_nfsd() at nfssvc_nfsd+0x124/frame 0xfffffe0467775970
> >   sys_nfssvc() at sys_nfssvc+0xb7/frame 0xfffffe04677759a0
> >   amd64_syscall() at amd64_syscall+0x278/frame 0xfffffe0467775ab0
> >   Xfast_syscall() at Xfast_syscall+0xfb/frame 0xfffffe0467775ab0
> >
> > Is this likely to be due to bad hardware? I can't see any problems in
> > the SMART data, and `camcontrol tags da0 -v` etc. does not reveal any
> > particularly long queues. Are there other useful things to check?
> >
> > If not, do you have any other ideas? I can make the full DDB information
> > available if that would be helpful.
> >
> > The pool is configured thus:
> >
> >   NAME                   STATE     READ WRITE CKSUM
> >   space                  ONLINE       0     0     0
> >     mirror-0             ONLINE       0     0     0
> >       da0                ONLINE       0     0     0
> >       da1                ONLINE       0     0     0
> >     mirror-1             ONLINE       0     0     0
> >       da2                ONLINE       0     0     0
> >       da3                ONLINE       0     0     0
> >     mirror-2             ONLINE       0     0     0
> >       da4                ONLINE       0     0     0
> >       da6                ONLINE       0     0     0
> >     mirror-3             ONLINE       0     0     0
> >       da7                ONLINE       0     0     0
> >       da8                ONLINE       0     0     0
> >   logs
> >     mirror-4             ONLINE       0     0     0
> >       gpt/molmol-slog    ONLINE       0     0     0
> >       gpt/molmol-slog0   ONLINE       0     0     0
> >
> > where the da? devices are WD Reds and the SLOG partitions are on Samsung
> > 840s.
> >
> > Many thanks,
> >
> > David Adam
> > zanchey@ucc.gu.uwa.edu.au
> >
> > _______________________________________________
> > freebsd-fs@freebsd.org mailing list
> > https://lists.freebsd.org/mailman/listinfo/freebsd-fs
> > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"
>
> Cheers,
>
> David Adam
> zanchey@ucc.gu.uwa.edu.au
> Ask Me About Our SLA!
> _______________________________________________
> freebsd-fs@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-fs
> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"

I too ran into this problem and spent quite some time troubleshooting
hardware. For me it turned out not to be hardware at all, but software:
specifically, the ZFS ARC. Looking at your stack I see some ARC reclaim near
the top, so it's possible you're running into the same issue. There is a
monster of a PR that details this here:
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=187594

If you would like to test this theory, the fastest way is to limit the ARC
by adding the following to /boot/loader.conf and rebooting:

  vfs.zfs.arc_max="24G"

Replace 24G with whatever makes sense for your system; aim for about 3/4 of
total memory as a starting point. If this solves the problem there are more
scientific routes to a permanent fix: one is applying the patch in the PR
above, another is a more finely tuned arc_max value.
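For what it's worth, here is a minimal sketch of how I check that the cap
took effect after rebooting and keep an eye on ARC growth against it. The
sysctl names are as on FreeBSD 10.x, and the 24G figure is only an example
value, not a recommendation for your hardware:

  # /boot/loader.conf
  vfs.zfs.arc_max="24G"    # example cap: roughly 3/4 of this host's RAM

  # after reboot, confirm the limit (reported in bytes) and compare it to
  # the current ARC size
  sysctl vfs.zfs.arc_max
  sysctl kstat.zfs.misc.arcstats.size

If arcstats.size keeps pressing up against arc_max right when the NFS stalls
appear, that is a good hint the ARC is the culprit rather than the disks.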