From owner-freebsd-fs@FreeBSD.ORG Sun Aug 25 15:50:01 2013 Return-Path: Delivered-To: freebsd-fs@smarthost.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id CAF79409 for ; Sun, 25 Aug 2013 15:50:01 +0000 (UTC) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:1900:2254:206c::16:87]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 9B21A244C for ; Sun, 25 Aug 2013 15:50:01 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.7/8.14.7) with ESMTP id r7PFo1ZK017788 for ; Sun, 25 Aug 2013 15:50:01 GMT (envelope-from gnats@freefall.freebsd.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.7/8.14.7/Submit) id r7PFo1Rj017787; Sun, 25 Aug 2013 15:50:01 GMT (envelope-from gnats) Date: Sun, 25 Aug 2013 15:50:01 GMT Message-Id: <201308251550.r7PFo1Rj017787@freefall.freebsd.org> To: freebsd-fs@FreeBSD.org Cc: From: Dmitry Sivachenko Subject: Re: kern/181226: [ufs] Writes to almost full FS eat 100% CPU and speed drops below 1MB/sec [regression] X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list Reply-To: Dmitry Sivachenko List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 25 Aug 2013 15:50:01 -0000 The following reply was made to PR kern/181226; it has been noted by GNATS. 
From: Dmitry Sivachenko To: bug-followup@FreeBSD.org Cc: Subject: Re: kern/181226: [ufs] Writes to almost full FS eat 100% CPU and speed drops below 1MB/sec [regression] Date: Sun, 25 Aug 2013 19:45:08 +0400 I found the exact revision number which broke that: Author: mckusick Date: Mon Apr 22 23:59:00 2013 New Revision: 249782 URL: http://svnweb.freebsd.org/changeset/base/249782 From owner-freebsd-fs@FreeBSD.ORG Sun Aug 25 17:56:23 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id 0639C212; Sun, 25 Aug 2013 17:56:23 +0000 (UTC) (envelope-from to.my.trociny@gmail.com) Received: from mail-ee0-x22e.google.com (mail-ee0-x22e.google.com [IPv6:2a00:1450:4013:c00::22e]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 510DD2996; Sun, 25 Aug 2013 17:56:22 +0000 (UTC) Received: by mail-ee0-f46.google.com with SMTP id c13so1189496eek.5 for ; Sun, 25 Aug 2013 10:56:20 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=sender:date:from:to:cc:subject:message-id:references:mime-version :content-type:content-disposition:in-reply-to:user-agent; bh=+gKGI9+MCR712c+jNMQ3Vc6MJh2LctSWkrJ2aRhLw2g=; b=ecwVJAHEY0I2E8UZGU2ItgcQw67mqyHi+ldC/E7svb6gU99IVbhu9sCfC6kGciy4x8 Btzvp/L90GWWctrw6bwB6l0rzR2XI9RuEdZtukfslxyCbBx31jNmP9wd/YagBXIcv8x8 5FF3NfFbRu2B+u/OFhalx754sHYiwYTnQEQOe38hFgQrGK0XDQscb9phdslhKu3Hocuz 6qXRTbqXvzdN+E6daPGoD+ppxkz/L93hoKa2dQExbqN/sSWGaaDWYFWoTtBXbQC+0yXR mO0ZKp+bu+yG8QCU1QTtIw0M2pJ+gsvWvtekiMzk0O7zkNTpqfhx5bj8zTXgi319zFQf srgw== X-Received: by 10.14.177.199 with SMTP id d47mr18601549eem.14.1377453380665; Sun, 25 Aug 2013 10:56:20 -0700 (PDT) Received: from localhost ([178.150.115.244]) by mx.google.com with ESMTPSA id 
n48sm15331126eeg.17.1969.12.31.16.00.00 (version=TLSv1.2 cipher=ECDHE-RSA-RC4-SHA bits=128/128); Sun, 25 Aug 2013 10:56:19 -0700 (PDT)
Sender: Mikolaj Golub
Date: Sun, 25 Aug 2013 20:56:17 +0300
From: Mikolaj Golub
To: Yamagi Burmeister
Subject: Re: 9.2-RC1: LORs / Deadlock with SU+J on HAST in "memsync" mode
Message-ID: <20130825175616.GA3472@gmail.com>
References: <20130819115101.ae9c0cf788f881dc4de464c5@yamagi.org> <20130822121341.0f27cb5e372d12bab8725654@yamagi.org>
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="uAKRQypu60I7Lcqm"
Content-Disposition: inline
In-Reply-To: <20130822121341.0f27cb5e372d12bab8725654@yamagi.org>
User-Agent: Mutt/1.5.21 (2010-09-15)
Cc: freebsd-fs@freebsd.org
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Sun, 25 Aug 2013 17:56:23 -0000

--uAKRQypu60I7Lcqm
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

On Thu, Aug 22, 2013 at 12:13:41PM +0200, Yamagi Burmeister wrote:

> After having some systems upgraded to FreeBSD 9.2-RC1/RC2 and switched
> HAST to the new "memsync" mode I've seen processes getting stuck when
> accessing files on UFS filesystems with SU+J enabled. Testing showed
> that this only seems to happen (while I couldn't reproduce it in other
> combinations I'm not quite sure if it's really the case) when HAST is
> running in "memsync" mode and the UFS filesystem on HAST has SU+J
> enabled. It can be reproduced easily with the instructions below.

I think I found (and reproduced) a scenario in which the primary might
leak a HAST IO request (hio), resulting in IO getting stuck. This may
happen when the secondary is disconnecting and there are pending WRITE
requests in the primary's hio_recv list.
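The scenario in this message can be modelled in a few lines of C. This
is only an illustrative sketch under stated assumptions, not hastd code:
struct hio_model, enum hio_fate and done_queue_after_disconnect() are
hypothetical names, the countdown is a plain int rather than a refcount,
and the "patched" branch only mirrors the ENOTCONN handling for a request
whose memsync ack was never received.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for hastd's struct hio. */
struct hio_model {
	int countdown;	/* models hio_countdown */
};

/* Where the request ends up after one modelled pass through done_queue. */
enum hio_fate { HIO_ON_DONE_QUEUE, HIO_LEAKED };

/*
 * Model of one done_queue pass for a memsync WRITE that was taken from
 * the recv list after the secondary disconnected, so no memsync ack was
 * read for it on this pass (memsyncack == false).
 */
enum hio_fate
done_queue_after_disconnect(struct hio_model *hio, bool patched)
{
	/* Assumed precondition: local write done, remote replies pending. */
	assert(hio->countdown == 2);

	hio->countdown--;		/* models refcnt_release(): 2 -> 1 */

	if (!patched) {
		/* !memsyncack => continue: the request lands on no queue. */
		return (HIO_LEAKED);
	}

	/*
	 * Patched behaviour (assumption drawn from the patch): after a
	 * disconnect with no ack recorded, drop the reference the ack
	 * would have dropped and hand the request to the done queue.
	 */
	hio->countdown--;		/* 1 -> 0 */
	return (HIO_ON_DONE_QUEUE);
}
```

Called with patched == false on a request whose countdown is 2, the
function returns HIO_LEAKED and leaves the countdown at 1; with
patched == true it returns HIO_ON_DONE_QUEUE with the countdown at 0.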
In primary, remote_recv_thread():

  * hast_proto_recv_hdr() returns "Unable to receive reply header";
  * continue (restart loop);
  * memsyncack = false;
  * take hio from hio_recv list;
  * goto done_queue
  * hio has:
      hio_replication == MEMSYNC
      hio_countdown == 2 (local write complete, not in local_send queue)
    thus,
  * refcnt_release(&hio->hio_countdown) => hio_countdown == 1
  * !memsyncack => continue;

As a result the hio is not put in any queue and the request is leaked.

I am attaching a patch intended to fix this. In done_queue, it checks
whether the request is being handled after a disconnect, and if it is
and has completed locally, it is put on the done queue. To disambiguate
requests I had to add a flag to hio that records whether the memsync ack
from the secondary has already been received.

Yamagi, I don't know if this is your case, but could you try the patch?

If it does not help, please, after the hang, get core images of the
worker processes (both primary and secondary) using gcore(1) and
provide them together with the hastd binary and the libraries it is
linked with (from the `ldd /sbin/hastd' list). Note that core files
might expose sensitive information from your host; if this worries you,
you can send them to me privately. Also, the output of
grep hastd /var/log/all.log from both the primary and the secondary
might be useful.

-- 
Mikolaj Golub

--uAKRQypu60I7Lcqm
Content-Type: text/x-diff; charset=us-ascii
Content-Disposition: inline; filename="primary.c.memsync.hio_leack.1.patch"

Index: sbin/hastd/primary.c
===================================================================
--- sbin/hastd/primary.c (revision 254760)
+++ sbin/hastd/primary.c (working copy)
@@ -90,9 +90,9 @@ struct hio {
 	 */
 	struct g_gate_ctl_io hio_ggio;
 	/*
-	 * Request was already confirmed to GEOM Gate.
+	 * HIOF_* flags.
 	 */
-	bool hio_done;
+	int hio_flags;
 	/*
 	 * Remember replication from the time the request was initiated,
 	 * so we won't get confused when replication changes on reload.
@@ -104,6 +104,12 @@ struct hio {
 #define	hio_done_next	hio_next[0]
 
 /*
+ * Flags kept in hio_flags.
+ */
+#define	HIOF_DONE		0x00000001	/* Confirmed to GEOM Gate. */
+#define	HIOF_MEMSYNC_ACK	0x00000002	/* Memsync acknowledged. */
+
+/*
  * Free list holds unused structures. When free list is empty, we have to wait
  * until some in-progress requests are freed.
  */
@@ -1119,7 +1125,7 @@ write_complete(struct hast_resource *res, struct h
 	struct g_gate_ctl_io *ggio;
 	unsigned int ncomp;
 
-	PJDLOG_ASSERT(!hio->hio_done);
+	PJDLOG_ASSERT((hio->hio_flags & HIOF_DONE) == 0);
 
 	ggio = &hio->hio_ggio;
 	PJDLOG_ASSERT(ggio->gctl_cmd == BIO_WRITE);
@@ -1143,7 +1149,7 @@ write_complete(struct hast_resource *res, struct h
 	rw_unlock(&hio_remote_lock[ncomp]);
 	if (ioctl(res->hr_ggatefd, G_GATE_CMD_DONE, ggio) == -1)
 		primary_exit(EX_OSERR, "G_GATE_CMD_DONE failed");
-	hio->hio_done = true;
+	hio->hio_flags |= HIOF_DONE;
 }
 
 /*
@@ -1174,7 +1180,7 @@ ggate_recv_thread(void *arg)
 		ggio->gctl_unit = res->hr_ggateunit;
 		ggio->gctl_length = MAXPHYS;
 		ggio->gctl_error = 0;
-		hio->hio_done = false;
+		hio->hio_flags = 0;
 		hio->hio_replication = res->hr_replication;
 		pjdlog_debug(2,
 		    "ggate_recv: (%p) Waiting for request from the kernel.",
@@ -1710,6 +1716,7 @@ remote_recv_thread(void *arg)
 			TAILQ_REMOVE(&hio_recv_list[ncomp], hio,
 			    hio_next[ncomp]);
 			mtx_unlock(&hio_recv_list_lock[ncomp]);
+			hio->hio_errors[ncomp] = ENOTCONN;
 			goto done_queue;
 		}
 		if (hast_proto_recv_hdr(res->hr_remotein, &nv) == -1) {
@@ -1811,6 +1818,13 @@ done_queue:
 				PJDLOG_ASSERT(!memsyncack);
 				break;
 			case 1:
+				if (hio->hio_errors[ncomp] == ENOTCONN) {
+					if ((hio->hio_flags & HIOF_MEMSYNC_ACK)
+					    == 0) {
+						refcnt_release(&hio->hio_countdown);
+					}
+					break;
+				}
 				if (memsyncack) {
 					/*
 					 * Local request already finished, so we
@@ -1822,6 +1836,7 @@ done_queue:
 					 * We still need to wait for final
 					 * remote reply.
 					 */
+					hio->hio_flags |= HIOF_MEMSYNC_ACK;
 					pjdlog_debug(2,
 					    "remote_recv: (%p) Moving request back to the recv queue.",
 					    hio);
@@ -1838,6 +1853,10 @@ done_queue:
 				}
 				continue;
 			case 2:
+				if (hio->hio_errors[ncomp] == ENOTCONN) {
+					refcnt_release(&hio->hio_countdown);
+					break;
+				}
 				/*
 				 * We received remote memsync reply even before
 				 * local write finished.
@@ -1847,6 +1866,7 @@ done_queue:
 				pjdlog_debug(2,
 				    "remote_recv: (%p) Moving request back to the recv queue.",
 				    hio);
+				hio->hio_flags |= HIOF_MEMSYNC_ACK;
 				mtx_lock(&hio_recv_list_lock[ncomp]);
 				TAILQ_INSERT_TAIL(&hio_recv_list[ncomp], hio,
 				    hio_next[ncomp]);
@@ -1931,7 +1951,7 @@ ggate_send_thread(void *arg)
 			if (range_sync_wait)
 				cv_signal(&range_sync_cond);
 			mtx_unlock(&range_lock);
-			if (!hio->hio_done)
+			if ((hio->hio_flags & HIOF_DONE) == 0)
 				write_complete(res, hio);
 		} else {
 			if (ioctl(res->hr_ggatefd, G_GATE_CMD_DONE, ggio) == -1) {
@@ -2114,7 +2134,7 @@ sync_thread(void *arg __unused)
 		ggio->gctl_offset = offset;
 		ggio->gctl_length = length;
 		ggio->gctl_error = 0;
-		hio->hio_done = false;
+		hio->hio_flags = 0;
 		hio->hio_replication = res->hr_replication;
 		for (ii = 0; ii < ncomps; ii++)
 			hio->hio_errors[ii] = EINVAL;

--uAKRQypu60I7Lcqm--

From owner-freebsd-fs@FreeBSD.ORG Sun Aug 25 19:33:32 2013 Return-Path: Delivered-To: freebsd-fs@smarthost.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id 749C9505; Sun, 25 Aug 2013 19:33:32 +0000 (UTC) (envelope-from delphij@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:1900:2254:206c::16:87]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 4876C2E29; Sun, 25 Aug 2013 19:33:32 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.7/8.14.7) with ESMTP
id r7PJXWlT061733; Sun, 25 Aug 2013 19:33:32 GMT (envelope-from delphij@freefall.freebsd.org) Received: (from delphij@localhost) by freefall.freebsd.org (8.14.7/8.14.7/Submit) id r7PJXWMI061732; Sun, 25 Aug 2013 19:33:32 GMT (envelope-from delphij) Date: Sun, 25 Aug 2013 19:33:32 GMT Message-Id: <201308251933.r7PJXWMI061732@freefall.freebsd.org> To: delphij@FreeBSD.org, freebsd-fs@FreeBSD.org, mckusick@FreeBSD.org From: delphij@FreeBSD.org Subject: Re: kern/181226: [ufs] Writes to almost full FS eat 100% CPU and speed drops below 1MB/sec [regression] X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 25 Aug 2013 19:33:32 -0000 Synopsis: [ufs] Writes to almost full FS eat 100% CPU and speed drops below 1MB/sec [regression] Responsible-Changed-From-To: freebsd-fs->mckusick Responsible-Changed-By: delphij Responsible-Changed-When: Sun Aug 25 19:33:07 UTC 2013 Responsible-Changed-Why: Over to UFS maintainer. 
http://www.freebsd.org/cgi/query-pr.cgi?pr=181226 From owner-freebsd-fs@FreeBSD.ORG Mon Aug 26 04:06:23 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id D1CDED46; Mon, 26 Aug 2013 04:06:23 +0000 (UTC) (envelope-from varanasisai@gmail.com) Received: from mail-vb0-x235.google.com (mail-vb0-x235.google.com [IPv6:2607:f8b0:400c:c02::235]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 569592618; Mon, 26 Aug 2013 04:06:23 +0000 (UTC) Received: by mail-vb0-f53.google.com with SMTP id i3so1652318vbh.26 for ; Sun, 25 Aug 2013 21:06:22 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=nN3B7dMMvkjRdC/YcUrIEeVj99c9ANCcayP3XXZh1Ks=; b=aHMVif+i8uqfaltDV8xKahFA0s89mt1z20RqHc5EfpqmPoZeYaB7Vo3Uuus67SBckS TE5HuRHKimwQslrG8QC+a+uWXtaG6w7Q9L/sergs1Yb+M0r/ILppbIXiAcFClScNGrQ6 t4C834Phl8RJMrDUYBsD6E8YGv1jA79fNW7KtnxeRmRw5WhcnEMRel3IJqBLOt0OsFPc 6iGiVn/llg3XYOMeVLVmKCMdYxjlul9hX9xEZFSC6UCSxTpzeGISL/km3nlTRI2UOQWn JsHUJ6kR1zcUBhdOvLl0AH+FEEtLJdi7PPtkrXqI0/SL9R75Az64qztEU+ZDvCnEuVG4 HPwA== MIME-Version: 1.0 X-Received: by 10.58.100.234 with SMTP id fb10mr12965808veb.5.1377489981141; Sun, 25 Aug 2013 21:06:21 -0700 (PDT) Received: by 10.52.233.4 with HTTP; Sun, 25 Aug 2013 21:06:21 -0700 (PDT) In-Reply-To: <201308231444.15353.jhb@freebsd.org> References: <503E443D-BC48-4284-8FC4-22B0A50DF147@bsdimp.com> <201308231444.15353.jhb@freebsd.org> Date: Mon, 26 Aug 2013 09:36:21 +0530 Message-ID: Subject: Re: UUID in fstab. 
From: varanasi sainath
To: John Baldwin
Content-Type: text/plain; charset=ISO-8859-1
X-Content-Filtered-By: Mailman/MimeDel 2.1.14
Cc: freebsd-fs@freebsd.org, freebsd-drivers@freebsd.org, Warner Losh , freebsd-questions@freebsd.org
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Mon, 26 Aug 2013 04:06:23 -0000

Thanks John,

I tried booting a Live CD as you suggested, and yes, the partitions'
UUIDs are present in /dev/gptid.

With the UUIDs in /dev/gptid, how do I determine which UUID corresponds
to which partition (ufs, swap or boot)? (I used glabel status and, after
some trial and error, found them.)

I edited the fstab accordingly and everything is working now.

Is there a way to have both /dev/XXXpYY and /dev/gptid/ present in /dev?

Thanks again for your support.

On Sat, Aug 24, 2013 at 12:14 AM, John Baldwin wrote:

> On Wednesday, August 21, 2013 4:38:00 pm varanasi sainath wrote:
> > Thanks for the support.
> >
> > I want to use the uuid's found using sysctl -a in fstab.
> > /dev/gptid/ has only uuid for boot partition.
>
> You probably have the other GPT partitions already mounted via
> another name which removes the names in /dev/gptid. Try
> booting an install CD or USB stick such that you use an
> alternate root fs and don't mount any of the partitions on
> your drive. Then you should be able to see the entries in
> /dev/gptid and update your fstab appropriately. If you have
> console access you could also try to update your fstab to
> use /dev/gptid/ directly instead of /dev/XXXpYY and
> reboot. If it works I believe the /dev/XXXpYY names will
> now be gone from /dev and the /dev/gptid names present
> instead.
>
> --
> John Baldwin
>

-- 
Sainath Varanasi
Hyderabad
09000855250

*My Website : http://s21embedded.webs.com
**Linked In Profile : http://in.linkedin.com/pub/sainathvaranasi .. ..
* From owner-freebsd-fs@FreeBSD.ORG Mon Aug 26 07:03:02 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id 316347E7; Mon, 26 Aug 2013 07:03:02 +0000 (UTC) (envelope-from to.my.trociny@gmail.com) Received: from mail-ea0-x22c.google.com (mail-ea0-x22c.google.com [IPv6:2a00:1450:4013:c01::22c]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 8F1AB2CFD; Mon, 26 Aug 2013 07:03:01 +0000 (UTC) Received: by mail-ea0-f172.google.com with SMTP id r16so1403401ead.31 for ; Mon, 26 Aug 2013 00:02:59 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=sender:date:from:to:cc:subject:message-id:references:mime-version :content-type:content-disposition:in-reply-to:user-agent; bh=+pxLpHAQ5yAOxtdW2pVDHmQbZoc9dl6NUV5VH/reYOk=; b=zUDNVP4dJVuP7h1fooo+00b2xMcGh27P2mCMY2Iyi9F2QVK9y+ZGnEsldSeOr59toc 9jT4KQH1XcttQ8M+qmIOXE88i2vWnUru+u3xxbJshLexgdAObIJ5jLbtJJYpWC1wwN97 T2sQ0x1tR63DBrdf4iF8qKmYgjbRR363mXM5Nm6nB7B2rBHVxMrCJ8w582t8Kd+PQ+YV g+cy93C9bzW6STKQlUtTyyNyoczje1FvryWUtqUPgrdtb739pwJy/ZSHwkM00p+YTlZH g2bDr+w5vK67i5XMTcpwpeY5cYLoZMD3CzQwYZZ9GBShvGAxUTl8uK4zPHQVaCKymTUp mRzg== X-Received: by 10.15.27.133 with SMTP id p5mr798277eeu.65.1377500579759; Mon, 26 Aug 2013 00:02:59 -0700 (PDT) Received: from localhost ([178.150.115.244]) by mx.google.com with ESMTPSA id f49sm19091289eec.7.1969.12.31.16.00.00 (version=TLSv1.2 cipher=ECDHE-RSA-RC4-SHA bits=128/128); Mon, 26 Aug 2013 00:02:58 -0700 (PDT) Sender: Mikolaj Golub Date: Mon, 26 Aug 2013 10:02:56 +0300 From: Mikolaj Golub To: Yamagi Burmeister Subject: Re: 9.2-RC1: LORs / Deadlock with SU+J on HAST in "memsync" mode Message-ID: <20130826070255.GA39140@gmail.com> References: 
<20130819115101.ae9c0cf788f881dc4de464c5@yamagi.org> <20130822121341.0f27cb5e372d12bab8725654@yamagi.org> <20130825175616.GA3472@gmail.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20130825175616.GA3472@gmail.com> User-Agent: Mutt/1.5.21 (2010-09-15) Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 26 Aug 2013 07:03:02 -0000 On Sun, Aug 25, 2013 at 08:56:17PM +0300, Mikolaj Golub wrote: > If it does not help, please, after the hang, get core images of the > worker processes (both primary and secondary) using gcore(1) I forgot to mention, before reproducing the hang, hastd should be built with '-O0 -g'. -- Mikolaj Golub From owner-freebsd-fs@FreeBSD.ORG Mon Aug 26 09:50:44 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id A0AF2B99 for ; Mon, 26 Aug 2013 09:50:44 +0000 (UTC) (envelope-from maurizio.vairani@cloverinformatica.it) Received: from smtpdg9.aruba.it (smtpdg8.aruba.it [62.149.158.238]) by mx1.freebsd.org (Postfix) with ESMTP id 054D82E53 for ; Mon, 26 Aug 2013 09:50:43 +0000 (UTC) Received: from cloverinformatica.it ([188.10.129.202]) by smtpcmd03.ad.aruba.it with bizsmtp id HMpV1m01b4N8xN401MpW1Y; Mon, 26 Aug 2013 11:49:31 +0200 Received: from [192.168.0.81] (ASUS-TERMINATOR [192.168.0.81]) by cloverinformatica.it (Postfix) with ESMTP id 88DB913E88; Mon, 26 Aug 2013 11:49:30 +0200 (CEST) Message-ID: <521B24AA.50200@cloverinformatica.it> Date: Mon, 26 Aug 2013 11:49:30 +0200 From: Maurizio Vairani User-Agent: Mozilla/5.0 (Windows NT 5.1; rv:7.0.1) Gecko/20110929 Thunderbird/7.0.1 MIME-Version: 1.0 To: 
Ronald Klop
Subject: Re: [SOLVED] Re: Shutdown problem with an USB memory stick as ZFS cache device
References: <201307171529.r6HFT4EK063849@fire.js.berklix.net> <51E79EAD.5040602@cloverinformatica.it>
In-Reply-To:
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Cc: freebsd-fs@freebsd.org, freebsd-stable@freebsd.org
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Mon, 26 Aug 2013 09:50:44 -0000

On 24/07/2013 13.19, Ronald Klop wrote:
> On Thu, 18 Jul 2013 09:52:13 +0200, Maurizio Vairani
> wrote:
>
>> On 17/07/2013 17:29, Julian H. Stacey wrote:
>>> Maurizio Vairani wrote:
>>>> On 17/07/2013 11:50, Ronald Klop wrote:
>>>>> On Wed, 17 Jul 2013 10:27:09 +0200, Maurizio Vairani
>>>>> wrote:
>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> on a Compaq Presario laptop I have just installed the latest stable
>>>>>>
>>>>>> #uname -a
>>>>>>
>>>>>> FreeBSD presario 9.2-PRERELEASE FreeBSD 9.2-PRERELEASE #0: Tue
>>>>>> Jul 16 16:32:39 CEST 2013
>>>>>> root@presario:/usr/obj/usr/src/sys/GENERIC amd64
>>>>>>
>>>>>> For speed up the compilation I have added to the pool, tank0, a
>>>>>> SanDisk memory stick as cache device with the command:
>>>>>>
>>>>>> # zpool add tank0 cache /dev/da0
>>>>>>
>>>>>> But when I shutdown the laptop the process will halt with this
>>>>>> screen shot:
>>>>>>
>>>>>> http://www.dump-it.fr/freebsd-screen-shot/2f9169f18c7c77e52e873580f9c2d4bf.jpg.html
>>>>>>
>>>>>> and I need to press the power button for more than 4 seconds to
>>>>>> switch off the laptop.
>>>>>>
>>>>>> The problem is always reproducible.
>>>>> Does sysctl hw.usb.no_shutdown_wait=1 help?
>>>>>
>>>>> Ronald.
>>>> Thank you Ronald it works !
>>>>
>>>> In /boot/loader.conf added the line
>>>> hw.usb.no_shutdown_wait=1
>>>>
>>>> Maurizio
>>> I wonder (from ignorance as I dont use ZFS yet),
>>> if that merely masks the symptom or cures the fault ?
>>>
>>> Presumably one should use a ZFS command to disassociate whatever
>>> might have the cache open ? (in case something might need to be
>>> written out from cache, if it was a writeable cache ?)
>>>
>>> I too had a USB shutdown problem (non ZFS, now solved)& several people
>>> made useful comments on shutdown scripts etc, so I'm cross referencing:
>>>
>>> http://lists.freebsd.org/pipermail/freebsd-mobile/2013-July/012803.html
>>>
>>> Cheers,
>>> Julian
>> Probably it masks the symptom. Andriy Gapon hypothesizes a bug in the
>> ZFS clean up code:
>> http://lists.freebsd.org/pipermail/freebsd-fs/2013-July/017857.html
>>
>> Surely one can use a startup script with the command:
>> zpool add tank0 cache /dev/da0
>> and a shutdown script with:
>> zpool remove tank0 /dev/da0
>> but this mask the symptom too.
>>
>> I prefer the Ronald solution because:
>> - is simpler: it adds only one line (hw.usb.no_shutdown_wait=1) to
>> one file (/boot/loader.conf).
>> - is fastest: the zpool add/remove commands take time and
>> “hw.usb.no_shutdown_wait=1” in /boot/loader.conf speeds up the
>> shutdown process.
>> - is cleaner: the zpool add/remove commands pair will fill up the
>> tank0 pool history.
>>
>> Regards
>> Maurizio
>
> Keep an eye on this commit when it is merged to 9-stable.
> http://svnweb.freebsd.org/changeset/base/253606
> It might be the fix of the problem.
>
> Ronald.

It works! I have just upgraded the laptop to r254783. Shutdown and
reboot work fine, regardless of the hw.usb.no_shutdown_wait value.
Thanks
Maurizio

From owner-freebsd-fs@FreeBSD.ORG Mon Aug 26 11:06:43 2013
Return-Path:
Delivered-To: freebsd-fs@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id 8B357143 for ; Mon, 26 Aug 2013 11:06:43 +0000 (UTC) (envelope-from owner-bugmaster@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:1900:2254:206c::16:87]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 767E42858 for ; Mon, 26 Aug 2013 11:06:43 +0000 (UTC)
Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.7/8.14.7) with ESMTP id r7QB6hTm065912 for ; Mon, 26 Aug 2013 11:06:43 GMT (envelope-from owner-bugmaster@FreeBSD.org)
Received: (from gnats@localhost) by freefall.freebsd.org (8.14.7/8.14.7/Submit) id r7QB6gU0065910 for freebsd-fs@FreeBSD.org; Mon, 26 Aug 2013 11:06:42 GMT (envelope-from owner-bugmaster@FreeBSD.org)
Date: Mon, 26 Aug 2013 11:06:42 GMT
Message-Id: <201308261106.r7QB6gU0065910@freefall.freebsd.org>
X-Authentication-Warning: freefall.freebsd.org: gnats set sender to owner-bugmaster@FreeBSD.org using -f
From: FreeBSD bugmaster
To: freebsd-fs@FreeBSD.org
Subject: Current problem reports assigned to freebsd-fs@FreeBSD.org
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Mon, 26 Aug 2013 11:06:43 -0000

Note: to view an individual PR, use:
  http://www.freebsd.org/cgi/query-pr.cgi?pr=(number).

The following is a listing of current problems submitted by FreeBSD users.
These represent problem reports covering all versions including
experimental development code and obsolete releases.

S Tracker      Resp.      Description
--------------------------------------------------------------------------------
o kern/181377  fs         [zfs] zfs recv causes an inconsistant pool
o kern/181281  fs         [msdosfs] stack trace after successfull 'umount /mnt'
o kern/181082  fs         [fuse] [ntfs] Write to mounted NTFS filesystem using F
o kern/180979  fs         [netsmb][patch]: Fix large files handling
o kern/180876  fs         [zfs] [hast] ZFS with trim,bio_flush or bio_delete loc
o kern/180678  fs         [NFS] succesfully exported filesystems being reported
o kern/180438  fs         [smbfs] [patch] mount_smbfs fails on arm because of wr
p kern/180236  fs         [zfs] [nullfs] Leakage free space using ZFS with nullf
o kern/178854  fs         [ufs] FreeBSD kernel crash in UFS
o kern/178713  fs         [nfs] [patch] Correct WebNFS support in NFS server and
s kern/178467  fs         [zfs] [request] Optimized Checksum Code for ZFS
o kern/178412  fs         [smbfs] Coredump when smbfs mounted
o kern/178388  fs         [zfs] [patch] allow up to 8MB recordsize
o kern/178387  fs         [zfs] [patch] sparse files performance improvements
o kern/178349  fs         [zfs] zfs scrub on deduped data could be much less see
o kern/178329  fs         [zfs] extended attributes leak
o kern/178238  fs         [nullfs] nullfs don't release i-nodes on unlink.
f kern/178231  fs         [nfs] 8.3 nfsv4 client reports "nfsv4 client/server pr
o kern/178103  fs         [kernel] [nfs] [patch] Correct support of index files
o kern/177985  fs         [zfs] disk usage problem when copying from one zfs dat
o kern/177971  fs         [nfs] FreeBSD 9.1 nfs client dirlist problem w/ nfsv3,
o kern/177966  fs         [zfs] resilver completes but subsequent scrub reports
o kern/177658  fs         [ufs] FreeBSD panics after get full filesystem with uf
o kern/177536  fs         [zfs] zfs livelock (deadlock) with high write-to-disk
o kern/177445  fs         [hast] HAST panic
o kern/177240  fs         [zfs] zpool import failed with state UNAVAIL but all d
o kern/176978  fs         [zfs] [panic] zfs send -D causes "panic: System call i
o kern/176857  fs         [softupdates] [panic] 9.1-RELEASE/amd64/GENERIC panic
o bin/176253   fs         zpool(8): zfs pool indentation is misleading/wrong
o kern/176141  fs         [zfs] sharesmb=on makes errors for sharenfs, and still
o kern/175950  fs         [zfs] Possible deadlock in zfs after long uptime
o kern/175897  fs         [zfs] operations on readonly zpool hang
o kern/175449  fs         [unionfs] unionfs and devfs misbehaviour
o kern/175179  fs         [zfs] ZFS may attach wrong device on move
o kern/175071  fs         [ufs] [panic] softdep_deallocate_dependencies: unrecov
o kern/174372  fs         [zfs] Pagefault appears to be related to ZFS
o kern/174315  fs         [zfs] chflags uchg not supported
o kern/174310  fs         [zfs] root point mounting broken on CURRENT with multi
o kern/174279  fs         [ufs] UFS2-SU+J journal and filesystem corruption
o kern/173830  fs         [zfs] Brain-dead simple change to ZFS error descriptio
o kern/173718  fs         [zfs] phantom directory in zraid2 pool
f kern/173657  fs         [nfs] strange UID map with nfsuserd
o kern/173363  fs         [zfs] [panic] Panic on 'zpool replace' on readonly poo
o kern/173136  fs         [unionfs] mounting above the NFS read-only share panic
o kern/172942  fs         [smbfs] Unmounting a smb mount when the server became
o kern/172348  fs         [unionfs] umount -f of filesystem in use with readonly
o kern/172334  fs         [unionfs] unionfs permits recursive union mounts; caus
o kern/171626  fs         [tmpfs] tmpfs should be noisier when the requested siz
o kern/171415  fs         [zfs] zfs recv fails with "cannot receive incremental
o kern/170945  fs         [gpt] disk layout not portable between direct connect
o bin/170778   fs         [zfs] [panic] FreeBSD panics randomly
o kern/170680  fs         [nfs] Multiple NFS Client bug in the FreeBSD 7.4-RELEA
o kern/170497  fs         [xfs][panic] kernel will panic whenever I ls a mounted
o kern/169945  fs         [zfs] [panic] Kernel panic while importing zpool (afte
o kern/169480  fs         [zfs] ZFS stalls on heavy I/O
o kern/169398  fs         [zfs] Can't remove file with permanent error
o kern/169339  fs         panic while " : > /etc/123"
o kern/169319  fs         [zfs] zfs resilver can't complete
o kern/168947  fs         [nfs] [zfs] .zfs/snapshot directory is messed up when
o kern/168942  fs         [nfs] [hang] nfsd hangs after being restarted (not -HU
o kern/168158  fs         [zfs] incorrect parsing of sharenfs options in zfs (fs
o kern/167979  fs         [ufs] DIOCGDINFO ioctl does not work on 8.2 file syste
o kern/167977  fs         [smbfs] mount_smbfs results are differ when utf-8 or U
o kern/167688  fs         [fusefs] Incorrect signal handling with direct_io
o kern/167685  fs         [zfs] ZFS on USB drive prevents shutdown / reboot
o kern/167612  fs         [portalfs] The portal file system gets stuck inside po
o kern/167272  fs         [zfs] ZFS Disks reordering causes ZFS to pick the wron
o kern/167260  fs         [msdosfs] msdosfs disk was mounted the second time whe
o kern/167109  fs         [zfs] [panic] zfs diff kernel panic Fatal trap 9: gene
o kern/167105  fs         [nfs] mount_nfs can not handle source exports wiht mor
o kern/167067  fs         [zfs] [panic] ZFS panics the server
o kern/167065  fs         [zfs] boot fails when a spare is the boot disk
o kern/167048  fs         [nfs] [patch] RELEASE-9 crash when using ZFS+NULLFS+NF
o kern/166912  fs         [ufs] [panic] Panic after converting Softupdates to jo
o kern/166851  fs         [zfs] [hang] Copying directory from the mounted UFS di
o kern/166477  fs         [nfs] NFS data corruption.
o kern/165950  fs         [ffs] SU+J and fsck problem
o kern/165521  fs         [zfs] [hang] livelock on 1 Gig of RAM with zfs when 31
o kern/165392  fs         Multiple mkdir/rmdir fails with errno 31
o kern/165087  fs         [unionfs] lock violation in unionfs
o kern/164472  fs         [ufs] fsck -B panics on particular data inconsistency
o kern/164370  fs         [zfs] zfs destroy for snapshot fails on i386 and sparc
o kern/164261  fs         [nullfs] [patch] fix panic with NFS served from NULLFS
o kern/164256  fs         [zfs] device entry for volume is not created after zfs
o kern/164184  fs         [ufs] [panic] Kernel panic with ufs_makeinode
o kern/163801  fs         [md] [request] allow mfsBSD legacy installed in 'swap'
o kern/163770  fs         [zfs] [hang] LOR between zfs&syncer + vnlru leading to
o kern/163501  fs         [nfs] NFS exporting a dir and a subdir in that dir to
o kern/162944  fs         [coda] Coda file system module looks broken in 9.0
o kern/162860  fs         [zfs] Cannot share ZFS filesystem to hosts with a hyph
o kern/162751  fs         [zfs] [panic] kernel panics during file operations
o kern/162591  fs         [nullfs] cross-filesystem nullfs does not work as expe
o kern/162519  fs         [zfs] "zpool import" relies on buggy realpath() behavi
o kern/161968  fs         [zfs] [hang] renaming snapshot with -r including a zvo
o kern/161864  fs         [ufs] removing journaling from UFS partition fails on
o kern/161579  fs         [smbfs] FreeBSD sometimes panics when an smb share is
o kern/161533  fs         [zfs] [panic] zfs receive panic: system ioctl returnin
o kern/161438  fs         [zfs] [panic] recursed on non-recursive spa_namespace_
o kern/161424  fs         [nullfs] __getcwd() calls fail when used on nullfs mou
o kern/161280  fs         [zfs] Stack overflow in gptzfsboot
o kern/161205  fs         [nfs] [pfsync] [regression] [build] Bug report freebsd
o kern/161169  fs         [zfs] [panic] ZFS causes kernel panic in dbuf_dirty
o kern/161112  fs         [ufs] [lor] filesystem LOR in FreeBSD 9.0-BETA3
o kern/160893  fs         [zfs] [panic] 9.0-BETA2 kernel panic
f kern/160860  fs         [ufs] Random UFS root filesystem corruption with SU+J
o kern/160801  fs         [zfs] zfsboot on 8.2-RELEASE fails to boot from root-o
o kern/160790  fs         [fusefs] [panic] VPUTX: negative ref count with FUSE
o kern/160777  fs         [zfs] [hang] RAID-Z3 causes fatal hang upon scrub/impo
o kern/160706  fs         [zfs] zfs bootloader fails when a non-root vdev exists
o kern/160591  fs         [zfs] Fail to boot on zfs root with degraded raidz2 [r
o kern/160410  fs         [smbfs] [hang] smbfs hangs when transferring large fil
o kern/160283  fs         [zfs] [patch] 'zfs list' does abort in make_dataset_ha
o kern/159930  fs         [ufs] [panic] kernel core
o kern/159402  fs         [zfs][loader] symlinks cause I/O errors
o kern/159357  fs         [zfs] ZFS MAXNAMELEN macro has confusing name (off-by-
o kern/159356  fs         [zfs] [patch] ZFS NAME_ERR_DISKLIKE check is Solaris-s
o kern/159351  fs         [nfs] [patch] - divide by zero in mountnfs()
o kern/159251  fs         [zfs] [request]: add FLETCHER4 as DEDUP hash option
o kern/159077  fs         [zfs] Can't cd .. with latest zfs version
o kern/159048  fs         [smbfs] smb mount corrupts large files
o kern/159045  fs         [zfs] [hang] ZFS scrub freezes system
o kern/158839  fs         [zfs] ZFS Bootloader Fails if there is a Dead Disk
o kern/158802  fs         amd(8) ICMP storm and unkillable process.
o kern/158231 fs [nullfs] panic on unmounting nullfs mounted over ufs o f kern/157929 fs [nfs] NFS slow read o kern/157399 fs [zfs] trouble with: mdconfig force delete && zfs strip o kern/157179 fs [zfs] zfs/dbuf.c: panic: solaris assert: arc_buf_remov o kern/156797 fs [zfs] [panic] Double panic with FreeBSD 9-CURRENT and o kern/156781 fs [zfs] zfs is losing the snapshot directory, p kern/156545 fs [ufs] mv could break UFS on SMP systems o kern/156193 fs [ufs] [hang] UFS snapshot hangs && deadlocks processes o kern/156039 fs [nullfs] [unionfs] nullfs + unionfs do not compose, re o kern/155615 fs [zfs] zfs v28 broken on sparc64 -current o kern/155587 fs [zfs] [panic] kernel panic with zfs p kern/155411 fs [regression] [8.2-release] [tmpfs]: mount: tmpfs : No o kern/155199 fs [ext2fs] ext3fs mounted as ext2fs gives I/O errors o bin/155104 fs [zfs][patch] use /dev prefix by default when importing o kern/154930 fs [zfs] cannot delete/unlink file from full volume -> EN o kern/154828 fs [msdosfs] Unable to create directories on external USB o kern/154491 fs [smbfs] smb_co_lock: recursive lock for object 1 p kern/154228 fs [md] md getting stuck in wdrain state o kern/153996 fs [zfs] zfs root mount error while kernel is not located o kern/153753 fs [zfs] ZFS v15 - grammatical error when attempting to u o kern/153716 fs [zfs] zpool scrub time remaining is incorrect o kern/153695 fs [patch] [zfs] Booting from zpool created on 4k-sector o kern/153680 fs [xfs] 8.1 failing to mount XFS partitions o kern/153418 fs [zfs] [panic] Kernel Panic occurred writing to zfs vol o kern/153351 fs [zfs] locking directories/files in ZFS o bin/153258 fs [patch][zfs] creating ZVOLs requires `refreservation' s kern/153173 fs [zfs] booting from a gzip-compressed dataset doesn't w o bin/153142 fs [zfs] ls -l outputs `ls: ./.zfs: Operation not support o kern/153126 fs [zfs] vdev failure, zpool=peegel type=vdev.too_small o kern/152022 fs [nfs] nfs service hangs with linux client [regression] o 
kern/151942 fs [zfs] panic during ls(1) zfs snapshot directory o kern/151905 fs [zfs] page fault under load in /sbin/zfs o bin/151713 fs [patch] Bug in growfs(8) with respect to 32-bit overfl o kern/151648 fs [zfs] disk wait bug o kern/151629 fs [fs] [patch] Skip empty directory entries during name o kern/151330 fs [zfs] will unshare all zfs filesystem after execute a o kern/151326 fs [nfs] nfs exports fail if netgroups contain duplicate o kern/151251 fs [ufs] Can not create files on filesystem with heavy us o kern/151226 fs [zfs] can't delete zfs snapshot o kern/150503 fs [zfs] ZFS disks are UNAVAIL and corrupted after reboot o kern/150501 fs [zfs] ZFS vdev failure vdev.bad_label on amd64 o kern/150390 fs [zfs] zfs deadlock when arcmsr reports drive faulted o kern/150336 fs [nfs] mountd/nfsd became confused; refused to reload n o kern/149208 fs mksnap_ffs(8) hang/deadlock o kern/149173 fs [patch] [zfs] make OpenSolaris installa o kern/149015 fs [zfs] [patch] misc fixes for ZFS code to build on Glib o kern/149014 fs [zfs] [patch] declarations in ZFS libraries/utilities o kern/149013 fs [zfs] [patch] make ZFS makefiles use the libraries fro o kern/148504 fs [zfs] ZFS' zpool does not allow replacing drives to be o kern/148490 fs [zfs]: zpool attach - resilver bidirectionally, and re o kern/148368 fs [zfs] ZFS hanging forever on 8.1-PRERELEASE o kern/148138 fs [zfs] zfs raidz pool commands freeze o kern/147903 fs [zfs] [panic] Kernel panics on faulty zfs device o kern/147881 fs [zfs] [patch] ZFS "sharenfs" doesn't allow different " o kern/147420 fs [ufs] [panic] ufs_dirbad, nullfs, jail panic (corrupt o kern/146941 fs [zfs] [panic] Kernel Double Fault - Happens constantly o kern/146786 fs [zfs] zpool import hangs with checksum errors o kern/146708 fs [ufs] [panic] Kernel panic in softdep_disk_write_compl o kern/146528 fs [zfs] Severe memory leak in ZFS on i386 o kern/146502 fs [nfs] FreeBSD 8 NFS Client Connection to Server o kern/145750 fs [unionfs] [hang] unionfs 
locks the machine s kern/145712 fs [zfs] cannot offline two drives in a raidz2 configurat o kern/145411 fs [xfs] [panic] Kernel panics shortly after mounting an f bin/145309 fs bsdlabel: Editing disk label invalidates the whole dev o kern/145272 fs [zfs] [panic] Panic during boot when accessing zfs on o kern/145246 fs [ufs] dirhash in 7.3 gratuitously frees hashes when it o kern/145238 fs [zfs] [panic] kernel panic on zpool clear tank o kern/145229 fs [zfs] Vast differences in ZFS ARC behavior between 8.0 o kern/145189 fs [nfs] nfsd performs abysmally under load o kern/144929 fs [ufs] [lor] vfs_bio.c + ufs_dirhash.c p kern/144447 fs [zfs] sharenfs fsunshare() & fsshare_main() non functi o kern/144416 fs [panic] Kernel panic on online filesystem optimization s kern/144415 fs [zfs] [panic] kernel panics on boot after zfs crash o kern/144234 fs [zfs] Cannot boot machine with recent gptzfsboot code o kern/143825 fs [nfs] [panic] Kernel panic on NFS client o bin/143572 fs [zfs] zpool(1): [patch] The verbose output from iostat o kern/143212 fs [nfs] NFSv4 client strange work ... o kern/143184 fs [zfs] [lor] zfs/bufwait LOR o kern/142878 fs [zfs] [vfs] lock order reversal o kern/142597 fs [ext2fs] ext2fs does not work on filesystems with real o kern/142489 fs [zfs] [lor] allproc/zfs LOR o kern/142466 fs Update 7.2 -> 8.0 on Raid 1 ends with screwed raid [re o kern/142306 fs [zfs] [panic] ZFS drive (from OSX Leopard) causes two o kern/142068 fs [ufs] BSD labels are got deleted spontaneously o kern/141950 fs [unionfs] [lor] ufs/unionfs/ufs Lock order reversal o kern/141897 fs [msdosfs] [panic] Kernel panic. 
msdofs: file name leng o kern/141463 fs [nfs] [panic] Frequent kernel panics after upgrade fro o kern/141091 fs [patch] [nullfs] fix panics with DIAGNOSTIC enabled o kern/141086 fs [nfs] [panic] panic("nfs: bioread, not dir") on FreeBS o kern/141010 fs [zfs] "zfs scrub" fails when backed by files in UFS2 o kern/140888 fs [zfs] boot fail from zfs root while the pool resilveri o kern/140661 fs [zfs] [patch] /boot/loader fails to work on a GPT/ZFS- o kern/140640 fs [zfs] snapshot crash o kern/140068 fs [smbfs] [patch] smbfs does not allow semicolon in file o kern/139725 fs [zfs] zdb(1) dumps core on i386 when examining zpool c o kern/139715 fs [zfs] vfs.numvnodes leak on busy zfs p bin/139651 fs [nfs] mount(8): read-only remount of NFS volume does n o kern/139407 fs [smbfs] [panic] smb mount causes system crash if remot o kern/138662 fs [panic] ffs_blkfree: freeing free block o kern/138421 fs [ufs] [patch] remove UFS label limitations o kern/138202 fs mount_msdosfs(1) see only 2Gb o kern/137588 fs [unionfs] [lor] LOR nfs/ufs/nfs o kern/136968 fs [ufs] [lor] ufs/bufwait/ufs (open) o kern/136945 fs [ufs] [lor] filedesc structure/ufs (poll) o kern/136944 fs [ffs] [lor] bufwait/snaplk (fsync) o kern/136873 fs [ntfs] Missing directories/files on NTFS volume o kern/136865 fs [nfs] [patch] NFS exports atomic and on-the-fly atomic p kern/136470 fs [nfs] Cannot mount / in read-only, over NFS o kern/135546 fs [zfs] zfs.ko module doesn't ignore zpool.cache filenam o kern/135469 fs [ufs] [panic] kernel crash on md operation in ufs_dirb o kern/135050 fs [zfs] ZFS clears/hides disk errors on reboot o kern/134491 fs [zfs] Hot spares are rather cold... 
o kern/133676 fs [smbfs] [panic] umount -f'ing a vnode-based memory dis p kern/133174 fs [msdosfs] [patch] msdosfs must support multibyte inter o kern/132960 fs [ufs] [panic] panic:ffs_blkfree: freeing free frag o kern/132397 fs reboot causes filesystem corruption (failure to sync b o kern/132331 fs [ufs] [lor] LOR ufs and syncer o kern/132237 fs [msdosfs] msdosfs has problems to read MSDOS Floppy o kern/132145 fs [panic] File System Hard Crashes o kern/131441 fs [unionfs] [nullfs] unionfs and/or nullfs not combineab o kern/131360 fs [nfs] poor scaling behavior of the NFS server under lo o kern/131342 fs [nfs] mounting/unmounting of disks causes NFS to fail o bin/131341 fs makefs: error "Bad file descriptor" on the mount poin o kern/130920 fs [msdosfs] cp(1) takes 100% CPU time while copying file o kern/130210 fs [nullfs] Error by check nullfs o kern/129760 fs [nfs] after 'umount -f' of a stale NFS share FreeBSD l o kern/129488 fs [smbfs] Kernel "bug" when using smbfs in smbfs_smb.c: o kern/129231 fs [ufs] [patch] New UFS mount (norandom) option - mostly o kern/129152 fs [panic] non-userfriendly panic when trying to mount(8) o kern/127787 fs [lor] [ufs] Three LORs: vfslock/devfs/vfslock, ufs/vfs o bin/127270 fs fsck_msdosfs(8) may crash if BytesPerSec is zero o kern/127029 fs [panic] mount(8): trying to mount a write protected zi o kern/126973 fs [unionfs] [hang] System hang with unionfs and init chr o kern/126553 fs [unionfs] unionfs move directory problem 2 (files appe o kern/126287 fs [ufs] [panic] Kernel panics while mounting an UFS file o kern/125895 fs [ffs] [panic] kernel: panic: ffs_blkfree: freeing free s kern/125738 fs [zfs] [request] SHA256 acceleration in ZFS o kern/123939 fs [msdosfs] corrupts new files o bin/123574 fs [unionfs] df(1) -t option destroys info for unionfs (a o kern/122380 fs [ffs] ffs_valloc:dup alloc (Soekris 4801/7.0/USB Flash o bin/122172 fs [fs]: amd(8) automount daemon dies on 6.3-STABLE i386, o bin/121898 fs [nullfs] 
pwd(1)/getcwd(2) fails with Permission denied o kern/121385 fs [unionfs] unionfs cross mount -> kernel panic o bin/121072 fs [smbfs] mount_smbfs(8) cannot normally convert the cha o kern/120483 fs [ntfs] [patch] NTFS filesystem locking changes o kern/120482 fs [ntfs] [patch] Sync style changes between NetBSD and F o kern/118912 fs [2tb] disk sizing/geometry problem with large array o kern/118713 fs [minidump] [patch] Display media size required for a k o kern/118318 fs [nfs] NFS server hangs under special circumstances o bin/118249 fs [ufs] mv(1): moving a directory changes its mtime o kern/118126 fs [nfs] [patch] Poor NFS server write performance o kern/118107 fs [ntfs] [panic] Kernel panic when accessing a file at N o kern/117954 fs [ufs] dirhash on very large directories blocks the mac o bin/117315 fs [smbfs] mount_smbfs(8) and related options can't mount o kern/117158 fs [zfs] zpool scrub causes panic if geli vdevs detach on o bin/116980 fs [msdosfs] [patch] mount_msdosfs(8) resets some flags f o conf/116931 fs lack of fsck_cd9660 prevents mounting iso images with o kern/116583 fs [ffs] [hang] System freezes for short time when using o bin/115361 fs [zfs] mount(8) gets into a state where it won't set/un o kern/114955 fs [cd9660] [patch] [request] support for mask,dirmask,ui o kern/114847 fs [ntfs] [patch] [request] dirmask support for NTFS ala o kern/114676 fs [ufs] snapshot creation panics: snapacct_ufs2: bad blo o bin/114468 fs [patch] [request] add -d option to umount(8) to detach o kern/113852 fs [smbfs] smbfs does not properly implement DFS referral o bin/113838 fs [patch] [request] mount(8): add support for relative p o bin/113049 fs [patch] [request] make quot(8) use getopt(3) and show o kern/112658 fs [smbfs] [patch] smbfs and caching problems (resolves b o kern/111843 fs [msdosfs] Long Names of files are incorrectly created o kern/111782 fs [ufs] dump(8) fails horribly for large filesystems s bin/111146 fs [2tb] fsck(8) fails on 6T filesystem o 
bin/107829 fs [2TB] fdisk(8): invalid boundary checking in fdisk / w o kern/106107 fs [ufs] left-over fsck_snapshot after unfinished backgro o kern/104406 fs [ufs] Processes get stuck in "ufs" state under persist o kern/104133 fs [ext2fs] EXT2FS module corrupts EXT2/3 filesystems o kern/103035 fs [ntfs] Directories in NTFS mounted disc images appear o kern/101324 fs [smbfs] smbfs sometimes not case sensitive when it's s o kern/99290 fs [ntfs] mount_ntfs ignorant of cluster sizes s bin/97498 fs [request] newfs(8) has no option to clear the first 12 o kern/97377 fs [ntfs] [patch] syntax cleanup for ntfs_ihash.c o kern/95222 fs [cd9660] File sections on ISO9660 level 3 CDs ignored o kern/94849 fs [ufs] rename on UFS filesystem is not atomic o bin/94810 fs fsck(8) incorrectly reports 'file system marked clean' o kern/94769 fs [ufs] Multiple file deletions on multi-snapshotted fil o kern/94733 fs [smbfs] smbfs may cause double unlock o kern/93942 fs [vfs] [patch] panic: ufs_dirbad: bad dir (patch from D o kern/92272 fs [ffs] [hang] Filling a filesystem while creating a sna o kern/91134 fs [smbfs] [patch] Preserve access and modification time a kern/90815 fs [smbfs] [patch] SMBFS with character conversions somet o kern/88657 fs [smbfs] windows client hang when browsing a samba shar o kern/88555 fs [panic] ffs_blkfree: freeing free frag on AMD 64 o bin/87966 fs [patch] newfs(8): introduce -A flag for newfs to enabl o kern/87859 fs [smbfs] System reboot while umount smbfs. o kern/86587 fs [msdosfs] rm -r /PATH fails with lots of small files o bin/85494 fs fsck_ffs: unchecked use of cg_inosused macro etc. 
o kern/80088 fs [smbfs] Incorrect file time setting on NTFS mounted vi o bin/74779 fs Background-fsck checks one filesystem twice and omits o kern/73484 fs [ntfs] Kernel panic when doing `ls` from the client si o bin/73019 fs [ufs] fsck_ufs(8) cannot alloc 607016868 bytes for ino o kern/71774 fs [ntfs] NTFS cannot "see" files on a WinXP filesystem o bin/70600 fs fsck(8) throws files away when it can't grow lost+foun o kern/68978 fs [panic] [ufs] crashes with failing hard disk, loose po o kern/67326 fs [msdosfs] crash after attempt to mount write protected o kern/65920 fs [nwfs] Mounted Netware filesystem behaves strange o kern/65901 fs [smbfs] [patch] smbfs fails fsx write/truncate-down/tr o kern/61503 fs [smbfs] mount_smbfs does not work as non-root o kern/55617 fs [smbfs] Accessing an nsmb-mounted drive via a smb expo o kern/51685 fs [hang] Unbounded inode allocation causes kernel to loc o kern/36566 fs [smbfs] System reboot with dead smb mount and umount o bin/27687 fs fsck(8) wrapper is not properly passing options to fsc o kern/18874 fs [2TB] 32bit NFS servers export wrong negative values t o kern/9619 fs [nfs] Restarting mountd kills existing mounts 334 problems total. 
From owner-freebsd-fs@FreeBSD.ORG Mon Aug 26 14:05:17 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id 81861219; Mon, 26 Aug 2013 14:05:17 +0000 (UTC) (envelope-from lists@yamagi.org) Received: from mail.yamagi.org (mail.yamagi.org [IPv6:2a01:4f8:121:2102:1::7]) by mx1.freebsd.org (Postfix) with ESMTP id 1032E2682; Mon, 26 Aug 2013 14:05:16 +0000 (UTC) Received: from lennart.pwag-local.de (unknown [212.48.125.109]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mail.yamagi.org (Postfix) with ESMTPSA id B20B31666312; Mon, 26 Aug 2013 16:05:04 +0200 (CEST) Date: Mon, 26 Aug 2013 16:04:58 +0200 From: Yamagi Burmeister To: freebsd-fs@freebsd.org Subject: Re: 9.2-RC1: LORs / Deadlock with SU+J on HAST in "memsync" mode Message-Id: <20130826160458.515280dfa7d12de0e7c29cd5@yamagi.org> In-Reply-To: <20130825175616.GA3472@gmail.com> References: <20130819115101.ae9c0cf788f881dc4de464c5@yamagi.org> <20130822121341.0f27cb5e372d12bab8725654@yamagi.org> <20130825175616.GA3472@gmail.com> X-Mailer: Sylpheed 3.3.0 (GTK+ 2.24.19; amd64-portbld-freebsd9.2) Mime-Version: 1.0 Content-Type: multipart/signed; protocol="application/pgp-signature"; micalg="PGP-SHA1"; boundary="Signature=_Mon__26_Aug_2013_16_04_58_+0200_fE3bWcp4ikXT1V5+" X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 26 Aug 2013 14:05:17 -0000 --Signature=_Mon__26_Aug_2013_16_04_58_+0200_fE3bWcp4ikXT1V5+ Content-Type: text/plain; charset=US-ASCII Content-Disposition: inline Content-Transfer-Encoding: quoted-printable Hello :) On Sun, 25 Aug 2013 20:56:17 +0300 Mikolaj Golub wrote: > On Thu, Aug 22, 2013 at 
12:13:41PM +0200, Yamagi Burmeister wrote:
>
> > After having some systems upgraded to FreeBSD 9.2-RC1/RC2 and switched
> > HAST to the new "memsync" mode I've seen processes getting stuck when
> > accessing files on UFS filesystems with SU+J enabled. Testing showed
> > that this only seems to happen (while I couldn't reproduce it in other
> > combinations I'm not quite sure if it's really the case) when HAST is
> > running in "memsync" mode and the UFS filesystem on HAST has SU+J
> > enabled. It can be reproduced easily with the instructions below.
>
> I think I found (and reproduced) a scenario, when the primary might
> leak HAST IO request (hio), resulting in IO getting stuck.

- snip -

> I am attaching the patch aimed to fix this. In done_queue, it checks
> if the request is after disconnection, and if it is and is completed
> locally it is put to the done queue. To disambiguate requests I had to
> add a flag to hio, telling if memsync ack from the secondary is
> already received.
>
> Yamagi, I don't know if this is your case, but could you try the
> patch?

I'm sorry but the patch doesn't change anything. Processes accessing the UFS on top of HAST still deadlock within a couple of minutes. trasz@ suggested that all "buf" may be exhausted, which would result in an IO deadlock, but at least increasing their number by four times by "kern.nbuf" doesn't change anything.

> If it does not help, please, after the hang, get core images of the
> worker processes (both primary and secondary) using gcore(1) and
> provide them together with hastd binary and libraries it is linked
> with (from `ldd /sbin/hastd' list). Note, core files might expose
> secure information from your host, if this worries you, you can send
> them to me privately.

No problem, it's a test setup without any production data.
You can find a tar archive with the binary and libs (all with debug symbols) here: http://deponie.yamagi.org/freebsd/debug/lor_hast/hast_cores.tar.xz

I have two HAST providers, therefore two core dumps for each host:

hast_deadlocked.core -> worker for the provider on which the processes deadlocked.
hast_not_deadlocked.core -> worker for the other provider

While all processes accessing the UFS filesystem on top of the provider deadlocked, HAST still seemed to transfer data to the secondary. At least the process generated CPU load, the switch LEDs were blinking and the harddrive LEDs showed activity on both sides.

> Also, grep hastd /var/log/all.log from both the primary and the
> secondary might be useful.

Nothing on the primary. The secondary aborted as soon as I reset the primary. Of course.

Aug 26 13:47:14 helene hastd[4237]: [rechts] (secondary) Unable to receive request header: Operation timed out.
Aug 26 13:47:19 helene hastd[1123]: [rechts] (secondary) Worker process exited ungracefully (pid=4237, exitcode=75).
Aug 26 13:47:34 helene hastd[4236]: [links] (secondary) Unable to receive request header: Operation timed out.
Aug 26 13:47:39 helene hastd[1123]: [links] (secondary) Worker process exited ungracefully (pid=4236, exitcode=75)

Ciao,
Yamagi

--Signature=_Mon__26_Aug_2013_16_04_58_+0200_fE3bWcp4ikXT1V5+ Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.20 (FreeBSD) iEYEARECAAYFAlIbYJAACgkQWTjlg++8y8thzACdFhWQqZuwRgPCRfZTgsYa15SA taEAn0ghphZOkU6jDvtrFS1OagF9WIC3 =Gwvh -----END PGP SIGNATURE----- --Signature=_Mon__26_Aug_2013_16_04_58_+0200_fE3bWcp4ikXT1V5+--

From owner-freebsd-fs@FreeBSD.ORG Mon Aug 26 18:00:06 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id EE33D396 for ; Mon, 26 Aug 2013 18:00:05 +0000 (UTC) (envelope-from slivnik@tomaz.name) Received: from p3plsmtpa09-09.prod.phx3.secureserver.net (p3plsmtpa09-09.prod.phx3.secureserver.net [173.201.193.238]) by mx1.freebsd.org (Postfix) with ESMTP id CB3DE2487 for ; Mon, 26 Aug 2013 18:00:05 +0000 (UTC) Received: from [10.0.1.162] ([212.159.115.195]) by p3plsmtpa09-09.prod.phx3.secureserver.net with id HVyR1m00i4D1Akc01VyTep; Mon, 26 Aug 2013 10:58:29 -0700 From: =?windows-1252?Q?=22Dr_Slivnik_Toma=9E_MA_=28Cantab=29_MMath_=28?= =?windows-1252?Q?Cantab=29_PhD_=28Cantab=29_FTICA=22?= Subject: extattr(2) Date: Mon, 26 Aug 2013 18:58:24 +0100 Message-Id: <02F402A6-8E33-4504-8634-1362EF8C75AF@tomaz.name> To: freebsd-fs@freebsd.org Mime-Version: 1.0 (Apple Message framework v1085) X-Mailer: Apple Mail (2.1085) Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: quoted-printable X-Content-Filtered-By: Mailman/MimeDel 2.1.14 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 26 Aug 2013 18:00:06 -0000

I posted this on FreeBSD
forums (http://forums.freebsd.org/showthread.php?t=41552) and was advised by the moderator that this mailing list may be the right forum.

-------------------------------------------------

There seems to me to be a race condition in the design of the extattr(2) interface. I can't imagine I am the first person to notice this, but I have not been able to find any discussion of it, so I mention it.

To read an attribute, I have to first call len = extattr_get_file (path, attrnamespace, attrname, NULL, 0) to obtain its length, then call extattr_get_file (path, attrnamespace, attrname, buf, len) to read it. If the attribute changes between the two calls to a longer value, I will read data which is in an inconsistent state, and not be aware of it.

One way to work around it would be to always call extattr_get_file (path, attrnamespace, attrname, buf, len+1) and repeat if length-extension is detected, but it seems like a clunky way of doing it.

The issue could easily be resolved in one of many ways, by some locking or snapshotting mechanism, e.g. by adding a call like this to the interface:

Code:
int extattr_snapshot_and_getlen_file (const char *path, int attrnamespace, const char *attrname);

A subsequent read of the attribute would release the snapshot.
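For illustration, the len+1 workaround mentioned above might be sketched roughly as follows. This is only a sketch under stated assumptions, not code from the FreeBSD tree: attr_reader, read_attr_retry, fake_attr and fake_read are hypothetical names, and the callback is assumed to follow the extattr_get_file(2) convention of returning the attribute size when the buffer is NULL and the number of bytes read otherwise. The in-memory fake_read only stands in for a real attribute so that the retry loop can be exercised without extended attribute support.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/*
 * Hypothetical reader callback (not a FreeBSD API). It is assumed to follow
 * the same convention as extattr_get_file(2): with buf == NULL it returns the
 * attribute's current size; otherwise it copies at most nbytes into buf and
 * returns the number of bytes copied. It returns -1 on error.
 */
typedef ssize_t (*attr_reader)(void *ctx, void *buf, size_t nbytes);

/*
 * Read an attribute whose size may change between the size probe and the
 * read. One spare byte is requested; if the read fills it, the value grew
 * after the probe, so the result is discarded and the probe is repeated.
 * On success returns a malloc'ed buffer (not NUL-terminated) and stores its
 * length in *out_len; returns NULL on error.
 */
static void *
read_attr_retry(attr_reader rd, void *ctx, size_t *out_len)
{
	assert(rd != NULL && out_len != NULL);
	for (;;) {
		ssize_t len = rd(ctx, NULL, 0);      /* probe current size */
		if (len < 0)
			return NULL;
		size_t cap = (size_t)len + 1;        /* one spare byte */
		char *buf = malloc(cap);
		if (buf == NULL)
			return NULL;
		ssize_t got = rd(ctx, buf, cap);
		if (got < 0) {
			free(buf);
			return NULL;
		}
		if ((size_t)got < cap) {             /* spare byte untouched */
			*out_len = (size_t)got;
			return buf;
		}
		free(buf);                           /* value grew: try again */
	}
}

/* In-memory stand-in for an attribute, used only to exercise the loop. */
struct fake_attr {
	const char *value;  /* current contents */
	const char *next;   /* contents after a simulated concurrent write */
};

static ssize_t
fake_read(void *ctx, void *buf, size_t nbytes)
{
	struct fake_attr *a = ctx;
	size_t len = strlen(a->value);

	if (buf == NULL) {
		/* Simulate a writer extending the value right after the probe. */
		if (a->next != NULL) {
			a->value = a->next;
			a->next = NULL;
		}
		return (ssize_t)len;
	}
	size_t n = len < nbytes ? len : nbytes;
	memcpy(buf, a->value, n);
	return (ssize_t)n;
}
```

On FreeBSD the callback could presumably wrap extattr_get_file(path, attrnamespace, attrname, buf, nbytes), with the path, namespace and name carried in ctx. Note that the loop only detects growth; it still assumes that a single read returns a self-consistent value, which is exactly the guarantee the message asks about.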
Existing code would not be affected.

From owner-freebsd-fs@FreeBSD.ORG Mon Aug 26 18:27:47 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id E0BBBCBB; Mon, 26 Aug 2013 18:27:47 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from bigwig.baldwin.cx (bigwig.baldwin.cx [IPv6:2001:470:1f11:75::1]) (using TLSv1 with cipher ADH-CAMELLIA256-SHA (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id AC026263C; Mon, 26 Aug 2013 18:27:47 +0000 (UTC) Received: from jhbbsd.localnet (unknown [38.105.238.108]) by bigwig.baldwin.cx (Postfix) with ESMTPSA id 528FFB941; Mon, 26 Aug 2013 14:27:46 -0400 (EDT) From: John Baldwin To: varanasi sainath Subject: Re: UUID in fstab. Date: Mon, 26 Aug 2013 10:24:03 -0400 User-Agent: KMail/1.13.5 (FreeBSD/8.2-CBSD-20110714-p28; KDE/4.5.5; amd64; ; ) References: <201308231444.15353.jhb@freebsd.org> In-Reply-To: MIME-Version: 1.0 Content-Type: Text/Plain; charset="iso-8859-15" Content-Transfer-Encoding: 7bit Message-Id: <201308261024.03553.jhb@freebsd.org> X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.7 (bigwig.baldwin.cx); Mon, 26 Aug 2013 14:27:46 -0400 (EDT) Cc: freebsd-fs@freebsd.org, freebsd-drivers@freebsd.org, Warner Losh , freebsd-questions@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 26 Aug 2013 18:27:48 -0000

On Monday, August 26, 2013 12:06:21 am varanasi sainath wrote:
> Thanks John, I have tried as you suggested using a Live CD and yes the
> partitions uuid's are present in gptid ..
> I found the UUID's in /dev/gptid - how do I determine which uid corresponds > to which partition (ufs or swap or boot) (I used glabel status and after > some trial and error I found them) edited the fstab accordingly and > everything is working now .. The other way would be to examine the kern.geom.confxml output directly as I think you can probably use that to map between them. > Is there a way to have both the /dev/XXXpYY and /dev/gptid/ present > in /dev/ Not currently. freebsd-geom@ is probably the best place to ask that question. -- John Baldwin From owner-freebsd-fs@FreeBSD.ORG Mon Aug 26 19:10:15 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id 31ED4A31; Mon, 26 Aug 2013 19:10:15 +0000 (UTC) (envelope-from dan@dan.emsphone.com) Received: from email2.allantgroup.com (email2.emsphone.com [199.67.51.116]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id EAC4E2929; Mon, 26 Aug 2013 19:10:14 +0000 (UTC) Received: from dan.emsphone.com (dan.emsphone.com [172.17.17.101]) by email2.allantgroup.com (8.14.5/8.14.5) with ESMTP id r7QJ28BZ024646 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Mon, 26 Aug 2013 14:02:08 -0500 (CDT) (envelope-from dan@dan.emsphone.com) Received: from dan.emsphone.com (smmsp@localhost [127.0.0.1]) by dan.emsphone.com (8.14.7/8.14.6) with ESMTP id r7QJ28hI051455 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Mon, 26 Aug 2013 14:02:08 -0500 (CDT) (envelope-from dan@dan.emsphone.com) Received: (from dan@localhost) by dan.emsphone.com (8.14.7/8.14.7/Submit) id r7QJ28ge051454; Mon, 26 Aug 2013 14:02:08 -0500 (CDT) (envelope-from dan) Date: Mon, 26 Aug 2013 14:02:08 -0500 From: Dan Nelson To: John 
Baldwin Subject: Re: UUID in fstab. Message-ID: <20130826190208.GA15654@dan.emsphone.com> References: <201308231444.15353.jhb@freebsd.org> <201308261024.03553.jhb@freebsd.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <201308261024.03553.jhb@freebsd.org> X-OS: FreeBSD 9.1-STABLE User-Agent: Mutt/1.5.21 (2010-09-15) X-Virus-Scanned: clamav-milter 0.97.8 at email2.allantgroup.com X-Virus-Status: Clean X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.2.7 (email2.allantgroup.com [172.17.19.78]); Mon, 26 Aug 2013 14:02:09 -0500 (CDT) X-Spam-Status: No, score=-3.9 required=5.0 tests=ALL_TRUSTED,BAYES_00, RP_MATCHES_RCVD autolearn=ham version=3.3.2 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on email2.allantgroup.com X-Scanned-By: MIMEDefang 2.73 Cc: freebsd-fs@freebsd.org, freebsd-drivers@freebsd.org, Warner Losh , freebsd-questions@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 26 Aug 2013 19:10:15 -0000 In the last episode (Aug 26), John Baldwin said: > On Monday, August 26, 2013 12:06:21 am varanasi sainath wrote: > > Thanks John, I have tried as you suggested using a Live CD and yes the > > partitions uuid's are present in gptid .. > > I found the UUID's in /dev/gptid - how do I determine which uid > > corresponds to which partition (ufs or swap or boot) (I used glabel > > status and after some trial and error I found them) edited the fstab > > accordingly and everything is working now .. "gpart list" will show detailed info for each provider, including the uuid for each GPT partition. 
-- Dan Nelson dnelson@allantgroup.com From owner-freebsd-fs@FreeBSD.ORG Tue Aug 27 01:49:46 2013 Return-Path: Delivered-To: freebsd-fs@smarthost.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id B5D96DA8; Tue, 27 Aug 2013 01:49:46 +0000 (UTC) (envelope-from linimon@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:1900:2254:206c::16:87]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 889AE2FFD; Tue, 27 Aug 2013 01:49:46 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.7/8.14.7) with ESMTP id r7R1nkDk051143; Tue, 27 Aug 2013 01:49:46 GMT (envelope-from linimon@freefall.freebsd.org) Received: (from linimon@localhost) by freefall.freebsd.org (8.14.7/8.14.7/Submit) id r7R1nk4W051142; Tue, 27 Aug 2013 01:49:46 GMT (envelope-from linimon) Date: Tue, 27 Aug 2013 01:49:46 GMT Message-Id: <201308270149.r7R1nk4W051142@freefall.freebsd.org> To: linimon@FreeBSD.org, freebsd-bugs@FreeBSD.org, freebsd-fs@FreeBSD.org From: linimon@FreeBSD.org Subject: Re: kern/181565: [swap] Problem with vnode-backed swap space. X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 27 Aug 2013 01:49:46 -0000 Synopsis: [swap] Problem with vnode-backed swap space. Responsible-Changed-From-To: freebsd-bugs->freebsd-fs Responsible-Changed-By: linimon Responsible-Changed-When: Tue Aug 27 01:49:31 UTC 2013 Responsible-Changed-Why: Make a random assignment. 
http://www.freebsd.org/cgi/query-pr.cgi?pr=181565 From owner-freebsd-fs@FreeBSD.ORG Tue Aug 27 07:19:10 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id A8BD6726; Tue, 27 Aug 2013 07:19:10 +0000 (UTC) (envelope-from varanasisai@gmail.com) Received: from mail-vc0-x22e.google.com (mail-vc0-x22e.google.com [IPv6:2607:f8b0:400c:c03::22e]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 2DF822149; Tue, 27 Aug 2013 07:19:10 +0000 (UTC) Received: by mail-vc0-f174.google.com with SMTP id gd11so2686031vcb.5 for ; Tue, 27 Aug 2013 00:19:09 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=xVRbdNYq6HH8EFMX/SRn/kLoCJxJKJbKby+BXxbYK0A=; b=c6z4j5wm13LOmtT+gkX9O1OQhCUWu3hiOlUonUBC1p2OMq/PgEQ9nrcuuOvzJSrPBr mkH95tH9X1PEkQ/3cBHdl2DF6GhkrWeUB0vaXn0oBFJi11nxnODCAKuTzbUvxJ2ru7Hj WgkPUIdmKsmSd4lc2Ri3VWY9iElK3LJ8+oHrQzf2hLT8YReD4ncG6MQYR3O9PrzhXKVc 47AMiSvV0D9o4E234lOBtvbNAyEF9CDJpK7IMY6bqfzHJLTcuVDR2MKoePsxDsIY9udA f67tIGpZcK3mv/hKTen0z6m44pDGZTrHCUHWkMoTaSY5Mt+mViQZonqssOHEcOICfu2i ZxjA== MIME-Version: 1.0 X-Received: by 10.220.16.73 with SMTP id n9mr83412vca.24.1377587949289; Tue, 27 Aug 2013 00:19:09 -0700 (PDT) Received: by 10.52.233.4 with HTTP; Tue, 27 Aug 2013 00:19:09 -0700 (PDT) In-Reply-To: <20130826190208.GA15654@dan.emsphone.com> References: <201308231444.15353.jhb@freebsd.org> <201308261024.03553.jhb@freebsd.org> <20130826190208.GA15654@dan.emsphone.com> Date: Tue, 27 Aug 2013 12:49:09 +0530 Message-ID: Subject: Re: UUID in fstab. 
From: varanasi sainath To: Dan Nelson Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.14 Cc: freebsd-fs@freebsd.org, Warner Losh , freebsd-drivers@freebsd.org, freebsd-questions@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 27 Aug 2013 07:19:10 -0000 Hi Dan, Thank you, gpart list does exactly what I needed. I will write a script which will list uuids, type and name of the device. Thanks. On Tue, Aug 27, 2013 at 12:32 AM, Dan Nelson wrote: > In the last episode (Aug 26), John Baldwin said: > > On Monday, August 26, 2013 12:06:21 am varanasi sainath wrote: > > > Thanks John, I have tried as you suggested using a Live CD and yes the > > > partitions uuid's are present in gptid .. > > > I found the UUID's in /dev/gptid - how do I determine which uid > > > corresponds to which partition (ufs or swap or boot) (I used glabel > > > status and after some trial and error I found them) edited the fstab > > > accordingly and everything is working now .. > > "gpart list" will show detailed info for each provider, including the uuid > for each GPT partition. > > -- > Dan Nelson > dnelson@allantgroup.com > -- Sainath Varanasi Hyderabad 09000855250 *My Website : http://s21embedded.webs.com **Linked In Profile : http://in.linkedin.com/pub/sainathvaranasi .. .. 
* From owner-freebsd-fs@FreeBSD.ORG Tue Aug 27 10:40:54 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id C63BFC1 for ; Tue, 27 Aug 2013 10:40:54 +0000 (UTC) (envelope-from florent@peterschmitt.fr) Received: from peterschmitt.fr (peterschmitt.fr [5.135.177.31]) by mx1.freebsd.org (Postfix) with ESMTP id 8ECE12D8A for ; Tue, 27 Aug 2013 10:40:54 +0000 (UTC) Received: from [192.168.0.128] (4ab54-4-88-163-248-31.fbx.proxad.net [88.163.248.31]) by peterschmitt.fr (Postfix) with ESMTPSA id 7FDD06690 for ; Tue, 27 Aug 2013 12:40:46 +0200 (CEST) Message-ID: <521C822A.70703@peterschmitt.fr> Date: Tue, 27 Aug 2013 12:40:42 +0200 From: Florent Peterschmitt User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130806 Thunderbird/17.0.8 MIME-Version: 1.0 To: freebsd-fs@freebsd.org Subject: Re: failok not honnored? References: <52187BA5.4020801@peterschmitt.fr> In-Reply-To: <52187BA5.4020801@peterschmitt.fr> X-Enigmail-Version: 1.5.2 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="q1LsXkO7wNgAoffnwafXoAI3xobu6Bqdw" X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 27 Aug 2013 10:40:54 -0000 This is an OpenPGP/MIME signed message (RFC 4880 and 3156) --q1LsXkO7wNgAoffnwafXoAI3xobu6Bqdw Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable I've just seen that it is fixed in HEAD. Why wasn't this MFC'ed? -- Florent Peterschmitt | Please: florent@peterschmitt.fr | * Avoid HTML/RTF in E-mail. +33 (0)6 64 33 97 92 | * Send PDF for documents.
http://florent.peterschmitt.fr | Thank you :) --q1LsXkO7wNgAoffnwafXoAI3xobu6Bqdw Content-Type: application/pgp-signature; name="signature.asc" Content-Description: OpenPGP digital signature Content-Disposition: attachment; filename="signature.asc" -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.21 (GNU/Linux) Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/ iQIbBAEBAgAGBQJSHIIqAAoJEFr01BkajbiBkHgP92CyXPszKQ600M1y8PlkPi8G HoSCYbqRaA5Ohrw0hKsqEfXhb+uuLJBbTPuOqcuSDdDm2VTMTzwW8rnKce8rrNKg P7hheQzSRm5M0PpfxhPNEdh8nDBJ18DiO8gCYiZxHurUX0tl5a0q9hOCLFRlgmm5 Xa8kTDBv/00e7umQbWFb3odY8HyhjSb3ybdwgurxBy5+mre7kXQTLRrfbu8pF8XD 979B8x7OoL5+S/0qtPuOXDrav1/RQzucMaZpkyPZIHHEJZedkGAOmQm/clbjz/ha Q/XZO2/SCyGM+jYwUWGpFHOvvIU9RdIzax5xVGmZt9JnNTkF8DQwMKtDfv8UYSeD 5sFtAXiD9W6IMgD71Zy/lfgK9BXGn+zv/IFh1/roNzp8+3OD736a9PdfWIWITOJt iPZpXVosyOxunXH9cOpqCelPQTR6Rn9tecXWAb53ja0nlobdiJlXYNMK+AKlt0rV tK+UMgk1cxy3avF+cn5yJoGBf8cKLl6YQ+7MU3nVnWRFxsE3kGhsJL4CNKU8Qnvm b5BXKTWWsa5Rv/z6zY2EqXkLkjp2vs5kuYIUDlL+8stHGLuDa2cRx1sVa/aTHnq4 ++cCPR6fB8wZq4fPEfkVbybaQWsZr0IqEm6PTRDqZ6NvEM4Of+ARqTKFeHLUEIr3 O1jLbFjR+ZrmB6cs+S0= =eQrb -----END PGP SIGNATURE----- --q1LsXkO7wNgAoffnwafXoAI3xobu6Bqdw-- From owner-freebsd-fs@FreeBSD.ORG Tue Aug 27 13:56:05 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id 0C31F6B3 for ; Tue, 27 Aug 2013 13:56:05 +0000 (UTC) (envelope-from Albert.Shih@obspm.fr) Received: from smtp-int-m.obspm.fr (smtp-int-m.obspm.fr [145.238.187.15]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 99DD62918 for ; Tue, 27 Aug 2013 13:56:04 +0000 (UTC) Received: from pcjas.obspm.fr (pcjas.obspm.fr [145.238.184.233]) by smtp-int-m.obspm.fr (8.14.3/8.14.3/SIO Observatoire de Paris - 07/2009) with ESMTP 
id r7RDs306030723 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NOT); Tue, 27 Aug 2013 15:54:04 +0200 Date: Tue, 27 Aug 2013 15:54:03 +0200 From: Albert Shih To: Steven Hartland Subject: Re: Big problem with zfs....or not Message-ID: <20130827135403.GA36938@pcjas.obspm.fr> References: <20130728153316.GA94100@pcjas.obspm.fr> <33A524DDB4E84BE5A4F9E3BEA0ACE125@multiplay.co.uk> MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <33A524DDB4E84BE5A4F9E3BEA0ACE125@multiplay.co.uk> User-Agent: Mutt/1.5.21 (2010-09-15) X-Miltered: at smtp-int-m.obspm.fr with ID 521CAF7B.000 by Joe's j-chkmail (http : // j-chkmail dot ensmp dot fr)! X-j-chkmail-Enveloppe: 521CAF7B.000/145.238.184.233/pcjas.obspm.fr/pcjas.obspm.fr/ Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 27 Aug 2013 13:56:05 -0000

On 28/07/2013 21:12:27+0100, Steven Hartland wrote:

I'm terribly sorry for not answering your mail.

> Sounds like you may either have a failing disk (failing to return) or
> failing controller.

Exactly. In fact the problem came from one disk. This disk reported that it was OK, so zpool said everything was OK, but it never returned any data, and that is why zfs list froze.

After changing that disk everything returned to normal.

Thank you for helping me, and again sorry for not answering (vacation...)

> Subject: Big problem with zfs....or not
>
> Hi
>
> I've been running a Dell R610 server (with LSI-9200-8E) + 5 x MD1200 for
> almost 2 years without any problem (under FreeBSD 9.0).
>
> Without any change to the software or the hardware, zfs status hangs. And
> when I try to reboot the server the kernel gets stuck in
>
> run_interrupt_driven_hooks: still waiting after 60 seconds for xpt_config
> (and 120 sec etc.)
>
> What do you think?
>
> Because I didn't change anything I think it's a hardware problem, but I
> don't see what kind of problem it is or how to track it down.
>
> If someone thinks it's a software problem (or both), do you have any idea how I
> can fix it?
>

Regards.

JAS
--
Albert SHIH
DIO bâtiment 15
Observatoire de Paris
5 Place Jules Janssen
92195 Meudon Cedex
France
Téléphone : +33 1 45 07 76 26/+33 6 86 69 95 71
xmpp: jas@obspm.fr
Heure local/Local time:
mar 27 août 2013 15:52:11 CEST
From owner-freebsd-fs@FreeBSD.ORG Tue Aug 27 15:16:18 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id 5E175AEC for ; Tue, 27 Aug 2013 15:16:18 +0000 (UTC) (envelope-from zeus@ibs.dn.ua) Received: from relay.ibs.dn.ua (relay.ibs.dn.ua [91.216.196.25]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id C7CF42E7C for ; Tue, 27 Aug 2013 15:16:17 +0000 (UTC) Received: from ibs.dn.ua (relay.ibs.dn.ua [91.216.196.25]) by relay.ibs.dn.ua with ESMTP id r7RFBRSW024762 for ; Tue, 27 Aug 2013 18:11:27 +0300 (EEST) Message-ID: <20130827181127.24761@relay.ibs.dn.ua> Date: Tue, 27 Aug 2013 18:11:27 +0300 From: "Zeus Panchenko" To: cc: Subject: exictent zroot re-alignment to 4K Organization: I.B.S.
LLC X-Mailer: MH-E 8.3.1; GNU Mailutils 2.99.98; GNU Emacs 24.0.93 X-Face: &sReWXo3Iwtqql1[My(t1Gkx; y?KF@KF`4X+'9Cs@PtK^y%}^.>Mtbpyz6U=,Op:KPOT.uG )Nvx`=er!l?WASh7KeaGhga"1[&yz$_7ir'cVp7o%CGbJ/V)j/=]vzvvcqcZkf; JDurQG6wTg+?/xA go`}1.Ze//K; Fk&/&OoHd'[b7iGt2UO>o(YskCT[_D)kh4!yY'<&:yt+zM=A`@`~9U+P[qS:f; #9z~ Or/Bo#N-'S'!'[3Wog'ADkyMqmGDvga?WW)qd=?)`Y&k=o}>!ST\ MIME-Version: 1.0 Content-Type: text/plain Content-Transfer-Encoding: quoted-printable X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list Reply-To: Zeus Panchenko List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 27 Aug 2013 15:16:18 -0000

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

hi all,

please, I really need help ...

recently I noticed that my netbook HDD has 4K physical sectors and I think it is worth taking advantage of that (is it really worth it?)

I have FreeBSD 9-STABLE running well on it already, but when it was installed the alignment was not done properly ...

- --------------------------------------------------------------------
what I have:

> uname -a
FreeBSD 9.1-STABLE #17 r250823: Mon May 20 19:39:19 EEST 2013 amd64

> smartctl -a /dev/ada0
Model Family: Western Digital Scorpio Black (AF)
Device Model: WDC WD5000BPKT-60PK4T0
Serial Number: WD-WXJ1A61P0560
User Capacity: 500,107,862,016 bytes [500 GB]
Sector Sizes: 512 bytes logical, 4096 bytes physical

> gpart show ada0
=>       34  976773101  ada0  GPT  (465G)
         34          6        - free -  (3.0k)
         40        128     1  freebsd-boot  (64k)
        168   33554432     2  freebsd-swap  (16G)
   33554600  943218528     3  freebsd-zfs  (449G)
  976773128          7        - free -  (3.5k)

> zdb
zroot:
    version: 5000
    name: 'zroot'
    ...
    vdev_tree:
        ...
        children[0]:
            type: 'disk'
            id: 0
            path: '/dev/ada0p3'
            phys_path: '/dev/ada0p3'
            whole_disk: 1
            ashift: 9
            ...

> cat /boot/loader.conf
...
vfs.root.mountfrom="zfs:zroot"
...

- --------------------------------------------------------------------
what I am thinking about:

1.
to backup whole zroot

#> zfs send -r zroot@2align > file

2. to plug HDD (let it be detected as adaX) on other system and to:

#> gpart destroy adaX
#> gpart create -s gpt adaX
#> gpart add -b 34 -s 94 -t freebsd-boot -l bootX adaX
#> gpart add -s 4G -t freebsd-swap -l swapX adaX
#> gpart add -a 1m -t freebsd-zfs -l diskX adaX
#> gpart bootcode -b /path/to/backuped/pmbr -p /path/to/backuped/gptzfsboot -i 1 adaX
#> gnop create -S 4096 gpt/diskX
#> zpool create zroot_z-hp gpt/diskX.nop
#> cat file | zfs receive -v zroot_z-hp
#> zpool export zroot_z-hp
#> zpool import -f -o cachefile=/tmp/zpool.cache -o altroot=/mnt zroot_z-hp
#> zfs set mountpoint=/root zroot_z-hp
#> cp /tmp/zpool.cache /mnt/root/boot/zfs/zpool.cache
#> vi /mnt/root/boot/loader.conf
   s/zroot/zroot_z-hp/
#> zfs umount /mnt/root
#> zfs set mountpoint=legacy zroot_z-hp
#> zpool export zroot_z-hp
#> gnop destroy gpt/diskX.nop
#> zpool import -d /dev/gpt/diskX
#> zpool export zroot_z-hp

3. return HDD to the netbook

so, am I right: is this the correct way to re-align an existing ZFS pool to 4K, and will it work?

does the ZFS version have to be the same on both systems (the pool to be aligned is shown as 5000 (after `zpool upgrade -a'), while the system where all of this is to be performed is at 28)?

- --
Zeus V. Panchenko jid:zeus@im.ibs.dn.ua IT Dpt., I.B.S.
LLC GMT+2 (EET) =2D----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.19 (FreeBSD) iEYEARECAAYFAlIcwZ4ACgkQr3jpPg/3oyorGgCghK0E/Gl9pAi73V2jFVRk77mU 784AnivFBHqy00i4t+qvTHM1vZdIWqFf =3DFsRY =2D----END PGP SIGNATURE----- From owner-freebsd-fs@FreeBSD.ORG Tue Aug 27 17:24:18 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id 2E0BF948 for ; Tue, 27 Aug 2013 17:24:18 +0000 (UTC) (envelope-from ericbrowning@skaggscatholiccenter.org) Received: from mail-pb0-f44.google.com (mail-pb0-f44.google.com [209.85.160.44]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 097C02647 for ; Tue, 27 Aug 2013 17:24:17 +0000 (UTC) Received: by mail-pb0-f44.google.com with SMTP id xa7so5125197pbc.3 for ; Tue, 27 Aug 2013 10:24:17 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20120113; h=x-gm-message-state:mime-version:date:message-id:subject:from:to :content-type; bh=T0m3+R4QEt925Efw84ImpnrETjVgXV6dam0yEYBR4Mk=; b=nBxkFrHCnMVfSp0081mKBPp+5X1wvlDSv74uwYmR3oaWsDLw7pJT8SN/1rpHUNcWlK U/yCQxcKoNhv9dsnI2hnSL5bkc8b0Bor1G16v7f4OwOyTTB5BPzxDJxLn7vNQddh5LJv gX6XbVKYzooW3L1tgKMW4DMud4QuAjT+9G08jRAbaTffMvTsfB7WmJ4pFpeXyu2JmILY 842CTmrugFwrURJqPYxhX1kZbMHnwwEnErqZ5wh8X1/ux7jJxcCaXNi1U9SDPEMmQstK Eakpp8nsggkCj2SjF7GuNKTGHXPjKu4mZjEEZFkFMdEzNY61jm8/9si/fJrFfRaxEq7u vwJg== X-Gm-Message-State: ALoCoQlpsfhbAkF5Zj+8JmJdHuTyrcWnnq9SHMiBHnTqgQtzYYYYoybf0zqz+HkhR8Djzd7C+QlN MIME-Version: 1.0 X-Received: by 10.67.30.225 with SMTP id kh1mr9438436pad.148.1377624257216; Tue, 27 Aug 2013 10:24:17 -0700 (PDT) Received: by 10.70.26.4 with HTTP; Tue, 27 Aug 2013 10:24:17 -0700 (PDT) Date: Tue, 27 Aug 2013 11:24:17 -0600 Message-ID: Subject: NFS on ZFS pure SSD pool From: Eric 
Browning To: freebsd-fs@freebsd.org Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.14 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 27 Aug 2013 17:24:18 -0000

Hello, first time posting to this list. I have a new server that is not living up to the promise of SSD speeds and NFS is maxing out the CPU. I'm new to FreeBSD but I've been reading up on it as much as I can. I have obscured my IP addresses and hostname with x's so just ignore that. Server has about 200 users on it, each drawing under 50Mb/s peak and sustaining around 1-2Mb/s.

I've followed some network tuning guides for our I350t4 nic and that has helped with network performance somewhat, but the server is still experiencing heavy load, pegging the CPU at 1250% on average with only 50Mb/s of traffic in/out of the machine. All of the network tuning came from https://calomel.org/freebsd_network_tuning.html since it was relevant to the same nic that I have.

Server Specs:
FreeBSD 9.1
16 cores AMDx64
64GB of ram
ZFS v28 with four Intel DC S3700 drives (800GB) as a zfs stripe
Intel DC S3500 for ZIL and enabling/disabling has made no difference
Used a spare DC S3700 for the ZIL and that made no difference either.
NFS v3 & v4 for Mac home folders whose Cache folder is redirected.

I've tried:
Compression on/off <-- no appreciable difference
Deduplication on/off <-- no appreciable difference
sync=disabled and sync=standard <-- no appreciable difference
setting arc cache to 56GB and also to 32GB <-- no difference in performance in terms of kern.

I've tried to follow the freebsd tuning guide: https://wiki.freebsd.org/ZFSTuningGuide to no avail either. I've read everything I can find on NFS on ZFS and nothing has helped. Where am I going wrong?
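For reference, the two ARC sizes and the CPU figure mentioned above can be turned into concrete numbers with a minimal sketch like the one below. It assumes the sizes are meant as binary gigabytes (GiB) and that the 1250% figure is a top-style percentage summed over the 16 cores; the helper names are hypothetical and exist only for illustration, so treat this as a rough check rather than a tuning recommendation.

```python
GIB = 1024 ** 3  # bytes per binary gigabyte (assumption: "GB" above means GiB)


def arc_max_bytes(gib):
    """Return the byte count for an ARC cap given in GiB."""
    return gib * GIB


def loader_line(gib):
    """Format a loader.conf-style line for the given ARC cap (illustrative only)."""
    return 'vfs.zfs.arc_max="%d"' % arc_max_bytes(gib)


def cpu_share(summed_percent, cores):
    """Convert a percentage summed over all cores into a fraction of the machine."""
    return summed_percent / (100.0 * cores)


if __name__ == "__main__":
    # The two ARC sizes that were tried, expressed in bytes.
    for gib in (32, 56):
        print(loader_line(gib))
    # 1250% summed over 16 cores, as a share of total CPU capacity.
    print("CPU busy: %.0f%% of 16 cores" % (100 * cpu_share(1250, 16)))
```

Read this way, 1250% on a 16-core machine means roughly three quarters of all CPU capacity is in use while moving only about 50Mb/s.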

Here's /boot/loader:
[quote]
# ZFS tuning tweaks
aio_load="YES" # Async IO system calls
autoboot_delay="10" # reduce boot menu delay time from 10 to 3 seconds
vfs.zfs.arc_max="56868864000" # Reserves 10GB or ram for system, leaves 56GB for ZFS
vfs.zfs.cache_flush_disable="1"
#vfs.zfs.prefetch_disble="1"
vfs.zfs.write_limit_override="429496728"

kern.ipc.nmbclusters="264144" # increase the number of network mbufs
kern.maxfiles="65535"
net.inet.tcp.syncache.hashsize="1024" # Size of the syncache hash table
net.inet.tcp.syncache.bucketlimit="100" # Limit the number of entries permitted in each bucket of the hash table.
net.inet.tcp.tcbhashsize="32768"

# Link Aggregation loader tweaks. see: https://calomel.org/freebsd_network_tuning.html
hw.igb.enable_msix="1"
hw.igb.num_queues="0"
hw.igb.enable_aim="1"
hw.igb.max_interrupt_rate="32000"
hw.igb.rxd="2048"
hw.igb.txd="2048"
hw.igb.rx_process_limit="4096"
if_lagg_load="YES"
[/quote]

Here's etc/sysctl.conf:
[quote]
# $FreeBSD: release/9.1.0/etc/sysctl.conf 112200 2003-03-13 18:43:50Z mux $
#
# This file is read when going to multi-user and its contents piped thru
# ``sysctl'' to adjust kernel values. ``man 5 sysctl.conf'' for details.
#

# Uncomment this to prevent users from seeing information about processes that
# are being run under another UID.
#security.bsd.see_other_uids=0
kern.ipc.somaxconn=1024
kern.maxusers=272
#kern.maxvnodes=1096848 #increase this if necessary
kern.ipc.maxsockbuf=8388608
net.inet.tcp.mssdflt=1460
net.inet.ip.forwarding=1
net.inet.ip.fastforwarding=1
dev.igb.2.fc=0
dev.igb.3.fc=0
dev.igb.4.fc=0
dev.igb.5.fc=0
dev.igb.2.rx_procesing_limit=10000
dev.igb.3.rx_procesing_limit=10000
dev.igb.4.rx_procesing_limit=10000
dev.igb.5.rx_procesing_limit=10000
net.inet.ip.redirect=0
net.inet.icmp.bmcastecho=0 # do not respond to ICMP packets sent to IP .255
net.inet.icmp.maskfake=0 # do not fake reply to ICMP Address Mask Request packets
net.inet.icmp.maskrepl=0 # replies are not sent for ICMP address mask
net.inet.icmp.log_redirect=0 # do not log redirected ICMP packet attempts
net.inet.icmp.drop_redirect=1 # no redirected ICMP packets
net.inet.tcp.drop_synfin=1 # SYN/FIN packets get dropped on initial connection
net.inet.tcp.ecn.enable=1 # explicit congestion notification (ecn) warning: some ISP routers abuse it
net.inet.tcp.icmp_may_rst=0 # icmp may not send RST to avoid spoofed icmp/udp floods
net.inet.tcp.maxtcptw=15000 # max number of tcp time_wait states for closing connections
net.inet.tcp.msl=5000 # 5 second maximum segment life waiting for an ACK in reply to a SYN-ACK or FIN-ACK
net.inet.tcp.path_mtu_discovery=0 # disable MTU discovery since most ICMP packets are dropped by others
net.inet.tcp.rfc3042=0 # disable the limited transmit mechanism which can slow burst transmissions
net.inet.ip.rtexpire=60 # 3600 secs
net.inet.ip.rtminexpire=2 # 10 secs
net.inet.ip.rtmaxcache=1024 # 128 entries
[/quote]

Here's /etc/rc.conf
[quote]
#ifconfig_igb2=" inet xxx.xx.x.xx netmask 255.255.248.0"
hostname="xxxxxxxxxxxxxxxxxxx"
#
# Set dumpdev to "AUTO" to enable crash dumps, "NO" to disable
dumpdev="NO"
#
### LACP config
ifconfig_igb2="up"
ifconfig_igb3="up"
ifconfig_igb4="up"
ifconfig_igb5="up"
cloned_interfaces="lagg0"
ifconfig_lagg0="laggproto lacp laggport igb2 laggport igb3 laggport igb4 laggport
igb5 xxx.xx.x.xx netmask 255.255.248.0" ipvr_addrs_lagg0="xxx.xx.x.xx" defaultrouter="xxx.xx.x.xx" # ### Defaults for SSH, NTP, ZFS sshd_enable="YES" ntpd_enable="YES" zfs_enable="YES" # ## NFS Server rpcbind_enable="YES" nfs_server_enable="YES" mountd_flags="-r -l" nfsd_enable="YES" mountd_enable="YES" rpc_lockd_enable="NO" rpc_statd_enable="NO" nfs_server_flags="-u -t -n 128" nfsv4_server_enable="YES" nfsuserd_enable="YES" [/quote] Thanks in advance, -- Eric Browning From owner-freebsd-fs@FreeBSD.ORG Tue Aug 27 19:29:55 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id 13A61230 for ; Tue, 27 Aug 2013 19:29:55 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from esa-jnhn.mail.uoguelph.ca (esa-jnhn.mail.uoguelph.ca [131.104.91.44]) by mx1.freebsd.org (Postfix) with ESMTP id 9E8012D1F for ; Tue, 27 Aug 2013 19:29:54 +0000 (UTC) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AqQEAM39HFKDaFve/2dsb2JhbABZFoMmUYMnvQCBOXSCJAEBAQMBAQEBICsgCwUWGAICDRkCKQEJJgYIBwQBGAQEh1oGDKY+ki2BKYx8AQWBBTQHgmiBMQOVJoN2kDODPCAyegEIFyI X-IronPort-AV: E=Sophos;i="4.89,970,1367985600"; d="scan'208";a="47699850" Received: from muskoka.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca) ([131.104.91.222]) by esa-jnhn.mail.uoguelph.ca with ESMTP; 27 Aug 2013 15:29:52 -0400 Received: from zcs3.mail.uoguelph.ca (localhost.localdomain [127.0.0.1]) by zcs3.mail.uoguelph.ca (Postfix) with ESMTP id 59540B40D9; Tue, 27 Aug 2013 15:29:52 -0400 (EDT) Date: Tue, 27 Aug 2013 15:29:52 -0400 (EDT) From: Rick Macklem To: Eric Browning Message-ID: <2008996797.14358576.1377631792358.JavaMail.root@uoguelph.ca> In-Reply-To: Subject: Re: NFS on ZFS pure SSD pool MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-Originating-IP: 
[172.17.91.202] X-Mailer: Zimbra 7.2.1_GA_2790 (ZimbraWebClient - FF3.0 (Win)/7.2.1_GA_2790) Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 27 Aug 2013 19:29:55 -0000 Eric Browning wrote: > Hello, first time posting to this list. I have a new server that is > not > living up to the promise of SSD speeds and NFS is maxing out the CPU. > I'm > new to FreeBSD but I've been reading up on it as much as I can. I > have > obscured my IP addresses and hostname with x's so just ignore that. > Server has about 200 users on it each draing under 50Mb/s peak > sustained > around 1-2Mb/s > > I've followed some network tuning guides for our I350t4 nic and that > has > helped with network performance somewhat but the server is still > experiencing heavy load with pegging the CPU at 1250% on average with > only > 50Mb/s of traffin in/out of the machine. All of the network tuning > came > from https://calomel.org/freebsd_network_tuning.html since it was > relevant > to the same nic that I have. > > Server Specs: > FreeBSD 9.1 > 16 cores AMDx64 > 64GB of ram > ZFS v28 with four Intel DC S3700 drives (800GB) as a zfs stripe > Intel DC S3500 for ZIL and enabling/disabling has made no > difference > Used a spare DC S3700 for the ZIL and that made no difference > either. > NFS v3 & v4 for Mac home folders whose Cache fodler is redirected. > > I've tried: > Compression on/of <-- no appreciable difference > Deduplication on/off <-- no appreciable difference > sync=disabled and sync=standard <-- no appreciable difference > setting arc cache to 56GB and also to 32GB <-- no difference in > performance > in terms of kern. > > I've tried to follow the freebsd tuning guide: > https://wiki.freebsd.org/ZFSTuningGuide to no avail either. I've > read > everything I can find on NFS on ZFS and nothing has helped. 
WHere am > I > going wrong? > You could try this patch: http://people.freebsd.org/~rmacklem/drc4-stable9.patch - After applying the patch and booting a kernel built from the patched sources, you need to increase the value of vfs.nfsd.tcphighwater. (Try something like 5000 for it as a starting point.) Although this patch is somewhat different code, it should be semantically the same as r254337 in head, that is scheduled to be MFC'd to stable/9 in a couple of weeks. rick > Here's /boot/loader: > [quote] > # ZFS tuning tweaks > aio_load="YES" # Async IO system calls > autoboot_delay="10" # reduce boot menu delay time from 10 to 3 > seconds > vfs.zfs.arc_max="56868864000" # Reserves 10GB or ram for system, > leaves > 56GB for ZFS > vfs.zfs.cache_flush_disable="1" > #vfs.zfs.prefetch_disble="1" > vfs.zfs.write_limit_override="429496728" > > kern.ipc.nmbclusters="264144" # increase the number of network > mbufs > kern.maxfiles="65535" > net.inet.tcp.syncache.hashsize="1024" # Size of the syncache hash > table > net.inet.tcp.syncache.bucketlimit="100" # Limit the number of entries > permitted in each bucket of the hash table. > net.inet.tcp.tcbhashsize="32768" > > # Link Aggregation loader tweaks. see: > https://calomel.org/freebsd_network_tuning.html > hw.igb.enable_msix="1" > hw.igb.num_queues="0" > hw.igb.enable_aim="1" > hw.igb.max_interrupt_rate="32000" > hw.igb.rxd="2048" > hw.igb.txd="2048" > hw.igb.rx_process_limit="4096" > if_lagg_load="YES" > [/quote] > > Here's etc/sysctl.conf: > [quote] > # $FreeBSD: release/9.1.0/etc/sysctl.conf 112200 2003-03-13 18:43:50Z > mux $ > # > # This file is read when going to multi-user and its contents piped > thru > # ``sysctl'' to adjust kernel values. ``man 5 sysctl.conf'' for > details. > # > > # Uncomment this to prevent users from seeing information about > processes > that > # are being run under another UID. 
> #security.bsd.see_other_uids=0 > kern.ipc.somaxconn=1024 > kern.maxusers=272 > #kern.maxvnodes=1096848 #increase this if necessary > kern.ipc.maxsockbuf=8388608 > net.inet.tcp.mssdflt=1460 > net.inet.ip.forwarding=1 > net.inet.ip.fastforwarding=1 > dev.igb.2.fc=0 > dev.igb.3.fc=0 > dev.igb.4.fc=0 > dev.igb.5.fc=0 > dev.igb.2.rx_procesing_limit=10000 > dev.igb.3.rx_procesing_limit=10000 > dev.igb.4.rx_procesing_limit=10000 > dev.igb.5.rx_procesing_limit=10000 > net.inet.ip.redirect=0 > net.inet.icmp.bmcastecho=0 # do not respond to ICMP packets sent > to IP > .255 > net.inet.icmp.maskfake=0 # do not fake reply to ICMP Address > Mask > Request packets > net.inet.icmp.maskrepl=0 # replies are not sent for ICMP > address mask > net.inet.icmp.log_redirect=0 # do not log redirected ICMP packet > attempts > net.inet.icmp.drop_redirect=1 # no redirected ICMP packets > net.inet.tcp.drop_synfin=1 # SYN/FIN packets get dropped on > initial > connection > net.inet.tcp.ecn.enable=1 # explicit congestion notification > (ecn) > warning: some ISP routers abuse it > net.inet.tcp.icmp_may_rst=0 # icmp may not send RST to avoid > spoofed > icmp/udp floods > net.inet.tcp.maxtcptw=15000 # max number of tcp time_wait states > for > closing connections > net.inet.tcp.msl=5000 # 5 second maximum segment life > waiting for > an ACK in reply to a SYN-ACK or FIN-ACK > net.inet.tcp.path_mtu_discovery=0 # disable MTU discovery since most > ICMP > packets are dropped by others > net.inet.tcp.rfc3042=0 # disable the limited transmit > mechanism > which can slow burst transmissions > net.inet.ip.rtexpire=60 # 3600 secs > net.inet.ip.rtminexpire=2 # 10 secs > net.inet.ip.rtmaxcache=1024 # 128 entries > [/quote] > > Here's /etc/rc.conf > [quote] > #ifconfig_igb2=" inet xxx.xx.x.xx netmask 255.255.248.0" > hostname="xxxxxxxxxxxxxxxxxxx" > # > # Set dumpdev to "AUTO" to enable crash dumps, "NO" to disable > dumpdev="NO" > # > ### LACP config > ifconfig_igb2="up" > ifconfig_igb3="up" > 
ifconfig_igb4="up" > ifconfig_igb5="up" > cloned_interfaces="lagg0" > ifconfig_lagg0="laggproto lacp laggport igb2 laggport igb3 laggport > igb4 > laggport igb5 xxx.xx.x.xx netmask 255.255.248.0" > ipvr_addrs_lagg0="xxx.xx.x.xx" > defaultrouter="xxx.xx.x.xx" > # > ### Defaults for SSH, NTP, ZFS > sshd_enable="YES" > ntpd_enable="YES" > zfs_enable="YES" > # > ## NFS Server > rpcbind_enable="YES" > nfs_server_enable="YES" > mountd_flags="-r -l" > nfsd_enable="YES" > mountd_enable="YES" > rpc_lockd_enable="NO" > rpc_statd_enable="NO" > nfs_server_flags="-u -t -n 128" > nfsv4_server_enable="YES" > nfsuserd_enable="YES" > [/quote] > > Thanks in advance, > -- > Eric Browning > _______________________________________________ > freebsd-fs@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-fs > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" > From owner-freebsd-fs@FreeBSD.ORG Tue Aug 27 19:40:56 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id A0CA4365 for ; Tue, 27 Aug 2013 19:40:56 +0000 (UTC) (envelope-from berend@pobox.com) Received: from smtp.pobox.com (b-pb-sasl-quonix.pobox.com [208.72.237.35]) by mx1.freebsd.org (Postfix) with ESMTP id 5C0CC2DC0 for ; Tue, 27 Aug 2013 19:40:56 +0000 (UTC) Received: from smtp.pobox.com (unknown [127.0.0.1]) by b-sasl-quonix.pobox.com (Postfix) with ESMTP id B7E433C2EF; Tue, 27 Aug 2013 19:40:48 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=pobox.com; h=date :message-id:from:to:cc:subject:in-reply-to:references :mime-version:content-type:content-transfer-encoding; s=sasl; bh=ZY7bXaJ0h5yoHUDVSlq0heAbfgw=; b=B9m9sLQSulWdeGflpe7fh+5VNQre w4YHa9+wO0Nv3PDzNHh3pY66IWoonSahRYlenK/In5HBhgp6oEbGx8lkXAq1ioBX N47Fj03kdfRhiXKoBshSz+sP9Np416xP67F+Ie7DypZ+c2cMOw1L1yejddXwYssq 
InyQEUOHvjE7mDM= DomainKey-Signature: a=rsa-sha1; c=nofws; d=pobox.com; h=date:message-id :from:to:cc:subject:in-reply-to:references:mime-version :content-type:content-transfer-encoding; q=dns; s=sasl; b=fdvjZf UANo3hwueaYn/8i1OCgtaXXya8QCfkI+SdT5BfAA2UB4F6+N1pIk5SjKzIqdYu29 jExVAuEZLlq/Fqz2DVP6bWlWC0mfppZIM3KyTEpAs1p10jO1FB4cBcPN2eRSijql n2fnvAZJuhIfE6ODy3PBgWyGSK8p/2hhiN47E= Received: from b-pb-sasl-quonix.pobox.com (unknown [127.0.0.1]) by b-sasl-quonix.pobox.com (Postfix) with ESMTP id ABD5A3C2EE; Tue, 27 Aug 2013 19:40:48 +0000 (UTC) Received: from bmach.nederware.nl (unknown [27.252.231.117]) by b-sasl-quonix.pobox.com (Postfix) with ESMTPA id ED9733C2EB; Tue, 27 Aug 2013 19:40:47 +0000 (UTC) Received: from quadrio.nederware.nl (quadrio.nederware.nl [192.168.33.13]) by bmach.nederware.nl (Postfix) with ESMTP id B7E175D88; Wed, 28 Aug 2013 07:40:45 +1200 (NZST) Received: from quadrio.nederware.nl (quadrio.nederware.nl [127.0.0.1]) by quadrio.nederware.nl (Postfix) with ESMTP id 8CFE9423595E; Wed, 28 Aug 2013 07:40:45 +1200 (NZST) Date: Wed, 28 Aug 2013 07:40:45 +1200 Message-ID: <8738pugbwi.wl%berend@pobox.com> From: Berend de Boer To: Eric Browning Subject: Re: NFS on ZFS pure SSD pool In-Reply-To: References: User-Agent: Wanderlust/2.15.9 (Almost Unreal) SEMI-EPG/1.14.7 (Harue) FLIM/1.14.9 (=?UTF-8?B?R29qxY0=?=) APEL/10.8 EasyPG/1.0.0 Emacs/24.3 (i686-pc-linux-gnu) MULE/6.0 (HANACHIRUSATO) Organization: Xplain Technology Ltd MIME-Version: 1.0 (generated by SEMI-EPG 1.14.7 - "Harue") Content-Type: multipart/signed; boundary="pgp-sign-Multipart_Wed_Aug_28_07:40:45_2013-1"; micalg=pgp-sha256; protocol="application/pgp-signature" Content-Transfer-Encoding: 7bit X-Pobox-Relay-ID: 92700018-0F50-11E3-833B-CA9B8506CD1E-48001098!b-pb-sasl-quonix.pobox.com Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , 
X-List-Received-Date: Tue, 27 Aug 2013 19:40:56 -0000 --pgp-sign-Multipart_Wed_Aug_28_07:40:45_2013-1 Content-Type: text/plain; charset=US-ASCII >>>>> "Eric" == Eric Browning writes: Eric> Hello, first time posting to this list. I have a new server Eric> that is not living up to the promise of SSD speeds and NFS Eric> is maxing out the CPU. Switch to nfs3. -- All the best, Berend de Boer ------------------------------------------------------ Awesome Drupal hosting: https://www.xplainhosting.com/ --pgp-sign-Multipart_Wed_Aug_28_07:40:45_2013-1 Content-Type: application/pgp-signature Content-Transfer-Encoding: 7bit Content-Description: OpenPGP Digital Signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.17 (GNU/Linux) iQIcBAABCAAGBQJSHQC9AAoJEKOfeD48G3g5hWQP/jNB7XLDefqd9kiwUEDQsQmC g3zuLS+SjJDU1FdJpLZ5g/1BsBm41e/Evjti7sPD83cz+474K7zhR/ENeWWrNcY3 loOrVIgXtr71aEllhTWaJNVyER/+Ueu+wJ0lQdKdqIcM4MLfrWPuvf/RqjaG7/98 ILSDRnDwHgGzb8UbAR4nklN9s1l2el2ZG6z6xJLWxaPbBD91ZIWrWjp5WLN57RI4 fUGYSjIxk+ti752kRjqspVMcOL9fGh9LoQR3UHvDOlQknE4lNkS3l2Ee8vOs1aPO xsW0NprL17hWJKKNy4JFs6EJnuOHIh6U0qVHTTikVndRJq2OOSy3BzL4Y8IwXP0i OlPQjNb1woQ7PYwt1kVy6S08p+FDeGBtJZzohHsIN7sdRQpLy1CemtO8siD+PHZV cTe+/7YsVG4sfWHiVK5jwlON7MvW4tjFq+g9B+Hi8waTa6Od53qrylyX7QEMZg75 xP2E7Mu/1PJh1stiMKsqasEIn3oHCJoMTi6gq9PvdJ/nL1vDaaiFad5kwTeInRQ0 DajCe3v7x0gsdE015S6GviMfLShNzEtFF2gyFc0CllvfNsFhey7BwEcS2+x3TcKI OR1PAYQjzjOCB4t6EnnOut8QbvxuDyYnaH/ktn73SKHsVOeYW1xEE+34fLmMee+y 3TLkMtiw/I7QGRn17l8w =8Mmq -----END PGP SIGNATURE----- --pgp-sign-Multipart_Wed_Aug_28_07:40:45_2013-1-- From owner-freebsd-fs@FreeBSD.ORG Tue Aug 27 21:59:25 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id 59CBFF71 for ; Tue, 27 Aug 2013 21:59:25 +0000 (UTC) (envelope-from outbackdingo@gmail.com) Received: from 
mail-ob0-x22c.google.com (mail-ob0-x22c.google.com [IPv6:2607:f8b0:4003:c01::22c]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 22E3C2607 for ; Tue, 27 Aug 2013 21:59:25 +0000 (UTC) Received: by mail-ob0-f172.google.com with SMTP id er7so5766811obc.31 for ; Tue, 27 Aug 2013 14:59:24 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=M7pD7EMz1Vup6uLeewWfsXUMRmL5YwYMZNvvQVScNKE=; b=ihVziNt9TwAh0Phhbv2QJ7hJuNhMp0KT2Fo5fqlGaRkMUrYtcJ/beNA3Lt8tUcfupN df2wUoHwnf8B7mIt3KxuMHZhphq6+sCEDnE+y+m7wsuky5mkWgObU5g5oZl4aF8b2naR IpERqqP5qcFdUg+PfJziFk3hB0+sLWCy3gzjlvYYgnfSGK15n/p5oWkTuRrEuKNi4ola /RcShsrVruKO+uxygL3luPMHKw6IFame6bUmP5DOtnKUNOHL0P7V2mkNj330Jvt1HAW2 cQr02HmBwiQN40vIKnlZAgtmd8BdV5EilEPsNux94aLYs2vh9hPC2xnogDt785yihqih UgMQ== MIME-Version: 1.0 X-Received: by 10.182.51.132 with SMTP id k4mr3992316obo.101.1377640764352; Tue, 27 Aug 2013 14:59:24 -0700 (PDT) Received: by 10.76.2.110 with HTTP; Tue, 27 Aug 2013 14:59:24 -0700 (PDT) In-Reply-To: <2008996797.14358576.1377631792358.JavaMail.root@uoguelph.ca> References: <2008996797.14358576.1377631792358.JavaMail.root@uoguelph.ca> Date: Tue, 27 Aug 2013 17:59:24 -0400 Message-ID: Subject: Re: NFS on ZFS pure SSD pool From: Outback Dingo To: Rick Macklem Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.14 Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 27 Aug 2013 21:59:25 -0000 On Tue, Aug 27, 2013 at 3:29 PM, Rick Macklem wrote: > Eric Browning wrote: > > Hello, first time posting to this list. 
I have a new server that is not
> > living up to the promise of SSD speeds and NFS is maxing out the CPU. I'm
> > new to FreeBSD but I've been reading up on it as much as I can. I have
> > obscured my IP addresses and hostname with x's so just ignore that.
> > Server has about 200 users on it, each drawing under 50Mb/s peak,
> > sustained around 1-2Mb/s.
> >
> > I've followed some network tuning guides for our I350t4 nic and that has
> > helped with network performance somewhat, but the server is still
> > experiencing heavy load, pegging the CPU at 1250% on average with only
> > 50Mb/s of traffic in/out of the machine. All of the network tuning came
> > from https://calomel.org/freebsd_network_tuning.html since it was relevant
> > to the same nic that I have.
> >
> > Server Specs:
> > FreeBSD 9.1
> > 16 cores AMDx64
> > 64GB of ram
> > ZFS v28 with four Intel DC S3700 drives (800GB) as a zfs stripe
> > Intel DC S3500 for ZIL and enabling/disabling has made no difference
> > Used a spare DC S3700 for the ZIL and that made no difference either.
> > NFS v3 & v4 for Mac home folders whose Cache folder is redirected.
> >
> > I've tried:
> > Compression on/off <-- no appreciable difference
> > Deduplication on/off <-- no appreciable difference
> > sync=disabled and sync=standard <-- no appreciable difference
> > setting arc cache to 56GB and also to 32GB <-- no difference in performance
> > in terms of kern.
> >
> > I've tried to follow the freebsd tuning guide:
> > https://wiki.freebsd.org/ZFSTuningGuide to no avail either. I've read
> > everything I can find on NFS on ZFS and nothing has helped. Where am I
> > going wrong?
> >
> You could try this patch:
> http://people.freebsd.org/~rmacklem/drc4-stable9.patch
> - After applying the patch and booting a kernel built from the patched
> sources, you need to increase the value of vfs.nfsd.tcphighwater.
> (Try something like 5000 for it as a starting point.)
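A minimal sketch of the step described above, assuming the kernel has already been rebuilt with the DRC patch so that the vfs.nfsd.tcphighwater sysctl exists. The value 5000 is only the suggested starting point, not a tuned figure, and the helper name tcphighwater_line is made up for illustration:

```sh
#!/bin/sh
# Sketch only: assumes a kernel built from sources with the DRC patch
# applied, which is what provides vfs.nfsd.tcphighwater.

# Print the "name=value" line for a given high-water mark.
# Defaults to 5000, the starting point suggested in the thread.
tcphighwater_line() {
    printf 'vfs.nfsd.tcphighwater=%s\n' "${1:-5000}"
}

# To apply at runtime (as root) and keep the setting across reboots,
# the printed line could be used like this:
#   sysctl "$(tcphighwater_line 5000)"
#   tcphighwater_line 5000 >> /etc/sysctl.conf
tcphighwater_line 5000
```

The commands that change system state are left as comments so the sketch is safe to run anywhere; on the patched server they would be run by hand after checking that `sysctl vfs.nfsd.tcphighwater` reports a value.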
>
>
can we get a brief on what this is supposed to improve upon ?

> Although this patch is somewhat different code, it should be semantically
> the same as r254337 in head, that is scheduled to be MFC'd to stable/9 in
> a couple of weeks.
>
> rick
>
> > Here's /boot/loader:
> > [quote]
> > # ZFS tuning tweaks
> > aio_load="YES" # Async IO system calls
> > autoboot_delay="10" # reduce boot menu delay time from 10 to 3 seconds
> > vfs.zfs.arc_max="56868864000" # Reserves 10GB or ram for system, leaves 56GB for ZFS
> > vfs.zfs.cache_flush_disable="1"
> > #vfs.zfs.prefetch_disble="1"
> > vfs.zfs.write_limit_override="429496728"
> >
> > kern.ipc.nmbclusters="264144" # increase the number of network mbufs
> > kern.maxfiles="65535"
> > net.inet.tcp.syncache.hashsize="1024" # Size of the syncache hash table
> > net.inet.tcp.syncache.bucketlimit="100" # Limit the number of entries permitted in each bucket of the hash table.
> > net.inet.tcp.tcbhashsize="32768"
> >
> > # Link Aggregation loader tweaks. see: https://calomel.org/freebsd_network_tuning.html
> > hw.igb.enable_msix="1"
> > hw.igb.num_queues="0"
> > hw.igb.enable_aim="1"
> > hw.igb.max_interrupt_rate="32000"
> > hw.igb.rxd="2048"
> > hw.igb.txd="2048"
> > hw.igb.rx_process_limit="4096"
> > if_lagg_load="YES"
> > [/quote]
> >
> > Here's etc/sysctl.conf:
> > [quote]
> > # $FreeBSD: release/9.1.0/etc/sysctl.conf 112200 2003-03-13 18:43:50Z mux $
> > #
> > # This file is read when going to multi-user and its contents piped thru
> > # ``sysctl'' to adjust kernel values. ``man 5 sysctl.conf'' for details.
> > #
> >
> > # Uncomment this to prevent users from seeing information about processes that
> > # are being run under another UID.
> > #security.bsd.see_other_uids=0
> > kern.ipc.somaxconn=1024
> > kern.maxusers=272
> > #kern.maxvnodes=1096848 #increase this if necessary
> > kern.ipc.maxsockbuf=8388608
> > net.inet.tcp.mssdflt=1460
> > net.inet.ip.forwarding=1
> > net.inet.ip.fastforwarding=1
> > dev.igb.2.fc=0
> > dev.igb.3.fc=0
> > dev.igb.4.fc=0
> > dev.igb.5.fc=0
> > dev.igb.2.rx_procesing_limit=10000
> > dev.igb.3.rx_procesing_limit=10000
> > dev.igb.4.rx_procesing_limit=10000
> > dev.igb.5.rx_procesing_limit=10000
> > net.inet.ip.redirect=0
> > net.inet.icmp.bmcastecho=0 # do not respond to ICMP packets sent to IP .255
> > net.inet.icmp.maskfake=0 # do not fake reply to ICMP Address Mask Request packets
> > net.inet.icmp.maskrepl=0 # replies are not sent for ICMP address mask
> > net.inet.icmp.log_redirect=0 # do not log redirected ICMP packet attempts
> > net.inet.icmp.drop_redirect=1 # no redirected ICMP packets
> > net.inet.tcp.drop_synfin=1 # SYN/FIN packets get dropped on initial connection
> > net.inet.tcp.ecn.enable=1 # explicit congestion notification (ecn) warning: some ISP routers abuse it
> > net.inet.tcp.icmp_may_rst=0 # icmp may not send RST to avoid spoofed icmp/udp floods
> > net.inet.tcp.maxtcptw=15000 # max number of tcp time_wait states for closing connections
> > net.inet.tcp.msl=5000 # 5 second maximum segment life waiting for an ACK in reply to a SYN-ACK or FIN-ACK
> > net.inet.tcp.path_mtu_discovery=0 # disable MTU discovery since most ICMP packets are dropped by others
> > net.inet.tcp.rfc3042=0 # disable the limited transmit mechanism which can slow burst transmissions
> > net.inet.ip.rtexpire=60 # 3600 secs
> > net.inet.ip.rtminexpire=2 # 10 secs
> > net.inet.ip.rtmaxcache=1024 # 128 entries
> > [/quote]
> >
> > Here's /etc/rc.conf
> > [quote]
> > #ifconfig_igb2=" inet xxx.xx.x.xx netmask 255.255.248.0"
> > hostname="xxxxxxxxxxxxxxxxxxx"
> > #
> > # Set dumpdev to "AUTO" to enable crash dumps, "NO" to disable
> > dumpdev="NO"
> > #
> > ### LACP config
> > ifconfig_igb2="up"
> > ifconfig_igb3="up"
> > ifconfig_igb4="up"
> > ifconfig_igb5="up"
> > cloned_interfaces="lagg0"
> > ifconfig_lagg0="laggproto lacp laggport igb2 laggport igb3 laggport igb4 laggport igb5 xxx.xx.x.xx netmask 255.255.248.0"
> > ipvr_addrs_lagg0="xxx.xx.x.xx"
> > defaultrouter="xxx.xx.x.xx"
> > #
> > ### Defaults for SSH, NTP, ZFS
> > sshd_enable="YES"
> > ntpd_enable="YES"
> > zfs_enable="YES"
> > #
> > ## NFS Server
> > rpcbind_enable="YES"
> > nfs_server_enable="YES"
> > mountd_flags="-r -l"
> > nfsd_enable="YES"
> > mountd_enable="YES"
> > rpc_lockd_enable="NO"
> > rpc_statd_enable="NO"
> > nfs_server_flags="-u -t -n 128"
> > nfsv4_server_enable="YES"
> > nfsuserd_enable="YES"
> > [/quote]
> >
> > Thanks in advance,
> > --
> > Eric Browning
> > _______________________________________________
> > freebsd-fs@freebsd.org mailing list
> > http://lists.freebsd.org/mailman/listinfo/freebsd-fs
> > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"
> >
> _______________________________________________
> freebsd-fs@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"
>

From owner-freebsd-fs@FreeBSD.ORG Tue Aug 27 23:02:21 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id 5126419A for ; Tue, 27 Aug 2013 23:02:21 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from esa-annu.net.uoguelph.ca (esa-annu.mail.uoguelph.ca [131.104.91.36]) by mx1.freebsd.org (Postfix) with ESMTP id D224429CB for ; Tue, 27 Aug 2013 23:02:20 +0000 (UTC) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: 
AqQEAIMvHVKDaFve/2dsb2JhbABZFoMmUYMnvQKBOHSCJAEBAQMBAQEBICsgCxsYAgINGQIpAQkmBggHBAEYBASHWgYMpjOSKoEpjHwBBYEFNAeCaIExA5Umg3aQM4M8IDJ6AQgXIg X-IronPort-AV: E=Sophos;i="4.89,971,1367985600"; d="scan'208";a="47006919" Received: from muskoka.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca) ([131.104.91.222]) by esa-annu.net.uoguelph.ca with ESMTP; 27 Aug 2013 19:02:19 -0400 Received: from zcs3.mail.uoguelph.ca (localhost.localdomain [127.0.0.1]) by zcs3.mail.uoguelph.ca (Postfix) with ESMTP id 1D135B3F1B; Tue, 27 Aug 2013 19:02:19 -0400 (EDT) Date: Tue, 27 Aug 2013 19:02:19 -0400 (EDT) From: Rick Macklem To: Outback Dingo Message-ID: <1115794974.14452056.1377644539106.JavaMail.root@uoguelph.ca> In-Reply-To: Subject: Re: NFS on ZFS pure SSD pool MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-Originating-IP: [172.17.91.201] X-Mailer: Zimbra 7.2.1_GA_2790 (ZimbraWebClient - FF3.0 (Win)/7.2.1_GA_2790) Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 27 Aug 2013 23:02:21 -0000 Outback Dingo wrote: > > > > > > > On Tue, Aug 27, 2013 at 3:29 PM, Rick Macklem < rmacklem@uoguelph.ca > > wrote: > > > > > Eric Browning wrote: > > Hello, first time posting to this list. I have a new server that is > > not > > living up to the promise of SSD speeds and NFS is maxing out the > > CPU. > > I'm > > new to FreeBSD but I've been reading up on it as much as I can. I > > have > > obscured my IP addresses and hostname with x's so just ignore that. 
> > Server has about 200 users on it each draing under 50Mb/s peak > > sustained > > around 1-2Mb/s > > > > I've followed some network tuning guides for our I350t4 nic and > > that > > has > > helped with network performance somewhat but the server is still > > experiencing heavy load with pegging the CPU at 1250% on average > > with > > only > > 50Mb/s of traffin in/out of the machine. All of the network tuning > > came > > from https://calomel.org/freebsd_network_tuning.html since it was > > relevant > > to the same nic that I have. > > > > Server Specs: > > FreeBSD 9.1 > > 16 cores AMDx64 > > 64GB of ram > > ZFS v28 with four Intel DC S3700 drives (800GB) as a zfs stripe > > Intel DC S3500 for ZIL and enabling/disabling has made no > > difference > > Used a spare DC S3700 for the ZIL and that made no difference > > either. > > NFS v3 & v4 for Mac home folders whose Cache fodler is redirected. > > > > I've tried: > > Compression on/of <-- no appreciable difference > > Deduplication on/off <-- no appreciable difference > > sync=disabled and sync=standard <-- no appreciable difference > > setting arc cache to 56GB and also to 32GB <-- no difference in > > performance > > in terms of kern. > > > > I've tried to follow the freebsd tuning guide: > > https://wiki.freebsd.org/ZFSTuningGuide to no avail either. I've > > read > > everything I can find on NFS on ZFS and nothing has helped. WHere > > am > > I > > going wrong? > > > You could try this patch: > http://people.freebsd.org/~rmacklem/drc4-stable9.patch > - After applying the patch and booting a kernel built from the > patched > sources, you need to increase the value of vfs.nfsd.tcphighwater. > (Try something like 5000 for it as a starting point.) > > > > > can we get a brief on what this is supposed to improve upon ? > It was developed for and tested by wollman@ to reduce mutex lock contention and CPU overheads for the duplicate request cache, mainly for NFS over TCP. 
(For the CPU overheads case, it allows the cache to grow larger,
reducing the frequency and, therefore, overhead of trimming out
stale entries.)
Here is the commit message, which I think covers it:

Fix several performance related issues in the new NFS server's
DRC for NFS over TCP.
- Increase the size of the hash tables.
- Create a separate mutex for each hash list of the TCP hash table.
- Single thread the code that deletes stale cache entries.
- Add a tunable called vfs.nfsd.tcphighwater, which can be increased
  to allow the cache to grow larger, avoiding the overhead of frequent
  scans to delete stale cache entries.
  (The default value will result in frequent scans to delete stale cache
  entries, analogous to what the pre-patched code does.)
- Add a tunable called vfs.nfsd.cachetcp that can be used to disable
  DRC caching for NFS over TCP, since the old NFS server didn't DRC cache
  TCP.
It also adjusts the size of nfsrc_floodlevel dynamically, so that it is
always greater than vfs.nfsd.tcphighwater.

For UDP the algorithm remains the same as the pre-patched code, but the
tunable vfs.nfsd.udphighwater can be used to allow the cache to grow
larger and reduce the overhead caused by frequent scans for stale entries.
UDP also uses a larger hash table size than the pre-patched code.

Reported by: wollman
Tested by: wollman (earlier version of patch)
Submitted by: ivoras (earlier patch)
Reviewed by: jhb (earlier version of patch)

>
> Although this patch is somewhat different code, it should be semantically
> the same as r254337 in head, that is scheduled to be MFC'd to stable/9 in
> a couple of weeks.
>
> rick

From owner-freebsd-fs@FreeBSD.ORG Tue Aug 27 23:05:24 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id 9CDFD377 for ; Tue, 27 Aug 2013 23:05:24 +0000 (UTC) (envelope-from outbackdingo@gmail.com) Received: from mail-oa0-x22e.google.com (mail-oa0-x22e.google.com [IPv6:2607:f8b0:4003:c02::22e]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 62F1829F0 for ; Tue, 27 Aug 2013 23:05:24 +0000 (UTC) 
Received: by mail-oa0-f46.google.com with SMTP id j10so6746870oah.33 for ; Tue, 27 Aug 2013 16:05:23 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=XtjaKFRytuvj8uW53qXhyttuV9UcXueIxnq5V0Ov1tc=; b=BKJx4Bb2SAMkZQwWLAipm2oObQWKxiGwCSFwQJthLWeulDNtK6Q06CT/+h5K3aCmXt peeUpmc0VMXiHnGIPqOM8en5S2C9udTHQRVzKdjckgmWIKSMnb6vk3MpyULNDg6YwAf1 nexJds6pW3PxnJ3eF5K30M6zurEb+aqfKKXqeWZK7FdepVfwUJebcnRQ9ag+jlt73mh7 TTPYZV4O0FwpiJ/D7PYmCkqV/usdgzJpr6euMdrplLyLD1Kk3yLPjI7gfTdTl9L7EJvR s7qaHYrdncAZ7CVm96v5+Aj1anOJp+r7UgixAZrMcSeBX7Q+oi6SYWtjx0rvb2j9pfK8 pOog== MIME-Version: 1.0 X-Received: by 10.60.93.67 with SMTP id cs3mr21120908oeb.12.1377644723344; Tue, 27 Aug 2013 16:05:23 -0700 (PDT) Received: by 10.76.2.110 with HTTP; Tue, 27 Aug 2013 16:05:23 -0700 (PDT) In-Reply-To: <1115794974.14452056.1377644539106.JavaMail.root@uoguelph.ca> References: <1115794974.14452056.1377644539106.JavaMail.root@uoguelph.ca> Date: Tue, 27 Aug 2013 19:05:23 -0400 Message-ID: Subject: Re: NFS on ZFS pure SSD pool From: Outback Dingo To: Rick Macklem Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.14 Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 27 Aug 2013 23:05:24 -0000 On Tue, Aug 27, 2013 at 7:02 PM, Rick Macklem wrote: > Outback Dingo wrote: > > > > > > > > > > > > > > On Tue, Aug 27, 2013 at 3:29 PM, Rick Macklem < rmacklem@uoguelph.ca > > > wrote: > > > > > > > > > > Eric Browning wrote: > > > Hello, first time posting to this list. I have a new server that is > > > not > > > living up to the promise of SSD speeds and NFS is maxing out the > > > CPU. > > > I'm > > > new to FreeBSD but I've been reading up on it as much as I can. 
I > > > have > > > obscured my IP addresses and hostname with x's so just ignore that. > > > Server has about 200 users on it each draing under 50Mb/s peak > > > sustained > > > around 1-2Mb/s > > > > > > I've followed some network tuning guides for our I350t4 nic and > > > that > > > has > > > helped with network performance somewhat but the server is still > > > experiencing heavy load with pegging the CPU at 1250% on average > > > with > > > only > > > 50Mb/s of traffin in/out of the machine. All of the network tuning > > > came > > > from https://calomel.org/freebsd_network_tuning.html since it was > > > relevant > > > to the same nic that I have. > > > > > > Server Specs: > > > FreeBSD 9.1 > > > 16 cores AMDx64 > > > 64GB of ram > > > ZFS v28 with four Intel DC S3700 drives (800GB) as a zfs stripe > > > Intel DC S3500 for ZIL and enabling/disabling has made no > > > difference > > > Used a spare DC S3700 for the ZIL and that made no difference > > > either. > > > NFS v3 & v4 for Mac home folders whose Cache fodler is redirected. > > > > > > I've tried: > > > Compression on/of <-- no appreciable difference > > > Deduplication on/off <-- no appreciable difference > > > sync=disabled and sync=standard <-- no appreciable difference > > > setting arc cache to 56GB and also to 32GB <-- no difference in > > > performance > > > in terms of kern. > > > > > > I've tried to follow the freebsd tuning guide: > > > https://wiki.freebsd.org/ZFSTuningGuide to no avail either. I've > > > read > > > everything I can find on NFS on ZFS and nothing has helped. WHere > > > am > > > I > > > going wrong? > > > > > You could try this patch: > > http://people.freebsd.org/~rmacklem/drc4-stable9.patch > > - After applying the patch and booting a kernel built from the > > patched > > sources, you need to increase the value of vfs.nfsd.tcphighwater. > > (Try something like 5000 for it as a starting point.) > > > > > > > > > > can we get a brief on what this is supposed to improve upon ? 
> > > It was developed for and tested by wollman@ to reduce mutex lock > contention and CPU overheads for the duplicate request cache, mainly > for NFS over TCP. (For the CPU overheads case, it allows the cache > to grow larger, reducing the frequency and, therefore, overhead of > trimming out stale entries.) > Here is the commit message, which I think covers it: > > Fix several performance related issues in the new NFS server's > DRC for NFS over TCP. > - Increase the size of the hash tables. > - Create a separate mutex for each hash list of the TCP hash table. > - Single thread the code that deletes stale cache entries. > - Add a tunable called vfs.nfsd.tcphighwater, which can be increased > to allow the cache to grow larger, avoiding the overhead of frequent > scans to delete stale cache entries. > (The default value will result in frequent scans to delete stale cache > entries, analagous to what the pre-patched code does.) > - Add a tunable called vfs.nfsd.cachetcp that can be used to disable > DRC caching for NFS over TCP, since the old NFS server didn't DRC cache > TCP. > It also adjusts the size of nfsrc_floodlevel dynamically, so that it is > always greater than vfs.nfsd.tcphighwater. > > For UDP the algorithm remains the same as the pre-patched code, but the > tunable vfs.nfsd.udphighwater can be used to allow the cache to grow > larger and reduce the overhead caused by frequent scans for stale entries. > UDP also uses a larger hash table size than the pre-patched code. > > Reported by: wollman > Tested by: wollman (earlier version of patch) > Submitted by: ivoras (earlier patch) > Reviewed by: jhb (earlier version of patch) > Thanks, much appreciated > > > > > Although this patch is somewhat different code, it should be > > semantically > > the same as r254337 in head, that is scheduled to be MFC'd to > > stable/9 in > > a couple of weeks. 
> >
> > rick

From owner-freebsd-fs@FreeBSD.ORG Tue Aug 27 23:08:12 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id 37DB640E for ; Tue, 27 Aug 2013 23:08:12 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from esa-annu.net.uoguelph.ca 
(esa-annu.mail.uoguelph.ca [131.104.91.36]) by mx1.freebsd.org (Postfix) with ESMTP id DD9B12A0A for ; Tue, 27 Aug 2013 23:08:11 +0000 (UTC) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AqQEALwwHVKDaFve/2dsb2JhbABZFoMmUYMnvQKBOHSCJAEBAQMBAQEBICsgCwUWGAICDRkCKQEJJgYIBwQBGAQEh1oGDKY1kkuBKYx8AQWBBTQHgmiBMQOVJoN2kDODPCAyegEIFyI X-IronPort-AV: E=Sophos;i="4.89,971,1367985600"; d="scan'208";a="47007925" Received: from muskoka.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca) ([131.104.91.222]) by esa-annu.net.uoguelph.ca with ESMTP; 27 Aug 2013 19:08:10 -0400 Received: from zcs3.mail.uoguelph.ca (localhost.localdomain [127.0.0.1]) by zcs3.mail.uoguelph.ca (Postfix) with ESMTP id 3495AB3F48; Tue, 27 Aug 2013 19:08:10 -0400 (EDT) Date: Tue, 27 Aug 2013 19:08:10 -0400 (EDT) From: Rick Macklem To: Outback Dingo Message-ID: <220720097.14453943.1377644890208.JavaMail.root@uoguelph.ca> In-Reply-To: <1115794974.14452056.1377644539106.JavaMail.root@uoguelph.ca> Subject: Re: NFS on ZFS pure SSD pool MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-Originating-IP: [172.17.91.203] X-Mailer: Zimbra 7.2.1_GA_2790 (ZimbraWebClient - FF3.0 (Win)/7.2.1_GA_2790) Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 27 Aug 2013 23:08:12 -0000 I wrote: > Outback Dingo wrote: > > > > > > > > > > > > > > On Tue, Aug 27, 2013 at 3:29 PM, Rick Macklem < > > rmacklem@uoguelph.ca > > > wrote: > > > > > > > > > > Eric Browning wrote: > > > Hello, first time posting to this list. I have a new server that > > > is > > > not > > > living up to the promise of SSD speeds and NFS is maxing out the > > > CPU. > > > I'm > > > new to FreeBSD but I've been reading up on it as much as I can. 
I > > > have > > > obscured my IP addresses and hostname with x's so just ignore > > > that. > > > Server has about 200 users on it each draing under 50Mb/s peak > > > sustained > > > around 1-2Mb/s > > > > > > I've followed some network tuning guides for our I350t4 nic and > > > that > > > has > > > helped with network performance somewhat but the server is still > > > experiencing heavy load with pegging the CPU at 1250% on average > > > with > > > only > > > 50Mb/s of traffin in/out of the machine. All of the network > > > tuning > > > came > > > from https://calomel.org/freebsd_network_tuning.html since it was > > > relevant > > > to the same nic that I have. > > > > > > Server Specs: > > > FreeBSD 9.1 > > > 16 cores AMDx64 > > > 64GB of ram > > > ZFS v28 with four Intel DC S3700 drives (800GB) as a zfs stripe > > > Intel DC S3500 for ZIL and enabling/disabling has made no > > > difference > > > Used a spare DC S3700 for the ZIL and that made no difference > > > either. > > > NFS v3 & v4 for Mac home folders whose Cache fodler is > > > redirected. > > > > > > I've tried: > > > Compression on/of <-- no appreciable difference > > > Deduplication on/off <-- no appreciable difference > > > sync=disabled and sync=standard <-- no appreciable difference > > > setting arc cache to 56GB and also to 32GB <-- no difference in > > > performance > > > in terms of kern. > > > > > > I've tried to follow the freebsd tuning guide: > > > https://wiki.freebsd.org/ZFSTuningGuide to no avail either. I've > > > read > > > everything I can find on NFS on ZFS and nothing has helped. WHere > > > am > > > I > > > going wrong? > > > > > You could try this patch: > > http://people.freebsd.org/~rmacklem/drc4-stable9.patch > > - After applying the patch and booting a kernel built from the > > patched > > sources, you need to increase the value of vfs.nfsd.tcphighwater. > > (Try something like 5000 for it as a starting point.) 
> > > > > > > > > > can we get a brief on what this is supposed to improve upon ? > > > It was developed for and tested by wollman@ to reduce mutex lock > contention and CPU overheads for the duplicate request cache, mainly > for NFS over TCP. (For the CPU overheads case, it allows the cache > to grow larger, reducing the frequency and, therefore, overhead of > trimming out stale entries.) Oh, and I should also mention that ivoras@ developed a similar patch which had better code structure than mine. I did use some of his code in the patch that went into head, but not as much as I would have liked, because I wanted to get it into head before code slush for 10.0. (I had already missed the 9.2 release.) I think I did convince ivoras@ that global LRU was appropriate for UDP and that using a single list/mutex was the simplest coding of this. rick > Here is the commit message, which I think covers it: > > Fix several performance related issues in the new NFS server's > DRC for NFS over TCP. > - Increase the size of the hash tables. > - Create a separate mutex for each hash list of the TCP hash table. > - Single thread the code that deletes stale cache entries. > - Add a tunable called vfs.nfsd.tcphighwater, which can be increased > to allow the cache to grow larger, avoiding the overhead of > frequent > scans to delete stale cache entries. > (The default value will result in frequent scans to delete stale > cache > entries, analagous to what the pre-patched code does.) > - Add a tunable called vfs.nfsd.cachetcp that can be used to disable > DRC caching for NFS over TCP, since the old NFS server didn't DRC > cache TCP. > It also adjusts the size of nfsrc_floodlevel dynamically, so that it > is > always greater than vfs.nfsd.tcphighwater. > > For UDP the algorithm remains the same as the pre-patched code, but > the > tunable vfs.nfsd.udphighwater can be used to allow the cache to grow > larger and reduce the overhead caused by frequent scans for stale > entries. 
> UDP also uses a larger hash table size than the pre-patched code. > > Reported by: wollman > Tested by: wollman (earlier version of patch) > Submitted by: ivoras (earlier patch) > Reviewed by: jhb (earlier version of patch) > > > > > Although this patch is somewhat different code, it should be > > semantically > > the same as r254337 in head, that is scheduled to be MFC'd to > > stable/9 in > > a couple of weeks. > > > > rick > > > > > > > > > Here's /boot/loader: > > > [quote] > > > # ZFS tuning tweaks > > > aio_load="YES" # Async IO system calls > > > autoboot_delay="10" # reduce boot menu delay time from 10 to 3 > > > seconds > > > vfs.zfs.arc_max="56868864000" # Reserves 10GB or ram for system, > > > leaves > > > 56GB for ZFS > > > vfs.zfs.cache_flush_disable="1" > > > #vfs.zfs.prefetch_disble="1" > > > vfs.zfs.write_limit_override="429496728" > > > > > > kern.ipc.nmbclusters="264144" # increase the number of network > > > mbufs > > > kern.maxfiles="65535" > > > net.inet.tcp.syncache.hashsize="1024" # Size of the syncache hash > > > table > > > net.inet.tcp.syncache.bucketlimit="100" # Limit the number of > > > entries > > > permitted in each bucket of the hash table. > > > net.inet.tcp.tcbhashsize="32768" > > > > > > # Link Aggregation loader tweaks. see: > > > https://calomel.org/freebsd_network_tuning.html > > > hw.igb.enable_msix="1" > > > hw.igb.num_queues="0" > > > hw.igb.enable_aim="1" > > > hw.igb.max_interrupt_rate="32000" > > > hw.igb.rxd="2048" > > > hw.igb.txd="2048" > > > hw.igb.rx_process_limit="4096" > > > if_lagg_load="YES" > > > [/quote] > > > > > > Here's etc/sysctl.conf: > > > [quote] > > > # $FreeBSD: release/9.1.0/etc/sysctl.conf 112200 2003-03-13 > > > 18:43:50Z > > > mux $ > > > # > > > # This file is read when going to multi-user and its contents > > > piped > > > thru > > > # ``sysctl'' to adjust kernel values. ``man 5 sysctl.conf'' for > > > details. 
> > > # > > > > > > # Uncomment this to prevent users from seeing information about > > > processes > > > that > > > # are being run under another UID. > > > #security.bsd.see_other_uids=0 > > > kern.ipc.somaxconn=1024 > > > kern.maxusers=272 > > > #kern.maxvnodes=1096848 #increase this if necessary > > > kern.ipc.maxsockbuf=8388608 > > > net.inet.tcp.mssdflt=1460 > > > net.inet.ip.forwarding=1 > > > net.inet.ip.fastforwarding=1 > > > dev.igb.2.fc=0 > > > dev.igb.3.fc=0 > > > dev.igb.4.fc=0 > > > dev.igb.5.fc=0 > > > dev.igb.2.rx_procesing_limit=10000 > > > dev.igb.3.rx_procesing_limit=10000 > > > dev.igb.4.rx_procesing_limit=10000 > > > dev.igb.5.rx_procesing_limit=10000 > > > net.inet.ip.redirect=0 > > > net.inet.icmp.bmcastecho=0 # do not respond to ICMP packets sent > > > to IP > > > .255 > > > net.inet.icmp.maskfake=0 # do not fake reply to ICMP Address > > > Mask > > > Request packets > > > net.inet.icmp.maskrepl=0 # replies are not sent for ICMP > > > address mask > > > net.inet.icmp.log_redirect=0 # do not log redirected ICMP packet > > > attempts > > > net.inet.icmp.drop_redirect=1 # no redirected ICMP packets > > > net.inet.tcp.drop_synfin=1 # SYN/FIN packets get dropped on > > > initial > > > connection > > > net.inet.tcp.ecn.enable=1 # explicit congestion notification > > > (ecn) > > > warning: some ISP routers abuse it > > > net.inet.tcp.icmp_may_rst=0 # icmp may not send RST to avoid > > > spoofed > > > icmp/udp floods > > > net.inet.tcp.maxtcptw=15000 # max number of tcp time_wait states > > > for > > > closing connections > > > net.inet.tcp.msl=5000 # 5 second maximum segment life > > > waiting for > > > an ACK in reply to a SYN-ACK or FIN-ACK > > > net.inet.tcp.path_mtu_discovery=0 # disable MTU discovery since > > > most > > > ICMP > > > packets are dropped by others > > > net.inet.tcp.rfc3042=0 # disable the limited transmit > > > mechanism > > > which can slow burst transmissions > > > net.inet.ip.rtexpire=60 # 3600 secs > > > 
net.inet.ip.rtminexpire=2 # 10 secs > > > net.inet.ip.rtmaxcache=1024 # 128 entries > > > [/quote] > > > > > > Here's /etc/rc.conf > > > [quote] > > > #ifconfig_igb2=" inet xxx.xx.x.xx netmask 255.255.248.0" > > > hostname="xxxxxxxxxxxxxxxxxxx" > > > # > > > # Set dumpdev to "AUTO" to enable crash dumps, "NO" to disable > > > dumpdev="NO" > > > # > > > ### LACP config > > > ifconfig_igb2="up" > > > ifconfig_igb3="up" > > > ifconfig_igb4="up" > > > ifconfig_igb5="up" > > > cloned_interfaces="lagg0" > > > ifconfig_lagg0="laggproto lacp laggport igb2 laggport igb3 > > > laggport > > > igb4 > > > laggport igb5 xxx.xx.x.xx netmask 255.255.248.0" > > > ipvr_addrs_lagg0="xxx.xx.x.xx" > > > defaultrouter="xxx.xx.x.xx" > > > # > > > ### Defaults for SSH, NTP, ZFS > > > sshd_enable="YES" > > > ntpd_enable="YES" > > > zfs_enable="YES" > > > # > > > ## NFS Server > > > rpcbind_enable="YES" > > > nfs_server_enable="YES" > > > mountd_flags="-r -l" > > > nfsd_enable="YES" > > > mountd_enable="YES" > > > rpc_lockd_enable="NO" > > > rpc_statd_enable="NO" > > > nfs_server_flags="-u -t -n 128" > > > nfsv4_server_enable="YES" > > > nfsuserd_enable="YES" > > > [/quote] > > > > > > Thanks in advance, > > > -- > > > Eric Browning > > > _______________________________________________ > > > freebsd-fs@freebsd.org mailing list > > > http://lists.freebsd.org/mailman/listinfo/freebsd-fs > > > To unsubscribe, send any mail to " > > > freebsd-fs-unsubscribe@freebsd.org " > > > > > _______________________________________________ > > freebsd-fs@freebsd.org mailing list > > http://lists.freebsd.org/mailman/listinfo/freebsd-fs > > To unsubscribe, send any mail to " > > freebsd-fs-unsubscribe@freebsd.org > > " > > > > > _______________________________________________ > freebsd-fs@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-fs > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" > From owner-freebsd-fs@FreeBSD.ORG Wed Aug 28 07:33:43 2013 
Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id B34BE598 for ; Wed, 28 Aug 2013 07:33:43 +0000 (UTC) (envelope-from kraduk@gmail.com) Received: from mail-qc0-x235.google.com (mail-qc0-x235.google.com [IPv6:2607:f8b0:400d:c01::235]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 7609A2198 for ; Wed, 28 Aug 2013 07:33:43 +0000 (UTC) Received: by mail-qc0-f181.google.com with SMTP id i17so1135007qcy.12 for ; Wed, 28 Aug 2013 00:33:42 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=Yf4lLDHwyz0mxb8G420TUBYI7rikCXF/8AATXy/JpAU=; b=bav9Tpw2jLRHTfTzv4HUL/vuo3YHLApd0r0vV/Dd1UN+YIEgd542y2y05Xkm4DNKax z1/LuTLfv+iue2+xr98I7fKdnGnwPJVsTqqHetVvBxmao7VwVuXEjgoCcdPRBwD7PEGv dxXoZnIymMiB1XgJsRp14yIquJVnEAmaAZw/ipgaECjfwyghbtKMdZhfi+iID3A7/f2R dE30wWT2JeFX90FvzWMlCZkKBUqTgtEuH1Dpv/GcO8oOU9ATxCUJSzrnCl7CeQ4YOFeG b4aCIkjfSn2Rzdjv64kBNrJPSFYcQo34N0ZlZijKRrZisa+UxUrxJ+IPMg3Y6YaXyaDI NMfA== MIME-Version: 1.0 X-Received: by 10.224.166.197 with SMTP id n5mr125593qay.98.1377675222576; Wed, 28 Aug 2013 00:33:42 -0700 (PDT) Received: by 10.224.189.195 with HTTP; Wed, 28 Aug 2013 00:33:42 -0700 (PDT) In-Reply-To: <20130827172727.GA73465@neutralgood.org> References: <20130827181127.24761@relay.ibs.dn.ua> <20130827172727.GA73465@neutralgood.org> Date: Wed, 28 Aug 2013 08:33:42 +0100 Message-ID: Subject: Re: exictent zroot re-alignment to 4K From: krad To: kpneal@pobox.com Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.14 Cc: FreeBSD FS X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 28 Aug 2013 07:33:43 -0000

is it just an alignment problem or are you ashift=9 (zdb zroot | grep ashift) as well? If you are already ashift=12, you can probably skip all the zfs send and receive bits and just attach it as a mirror, then drop out the original drive, repartition, install boot blocks and reattach.

On 27 August 2013 18:27, wrote:

> On Tue, Aug 27, 2013 at 06:11:27PM +0300, Zeus Panchenko wrote:
> > -----BEGIN PGP SIGNED MESSAGE-----
> > Hash: SHA1
> >
> > hi all,
> >
> > please, I really need help ...
> >
> > recently I noticed that my netbook HDD has 4K phys sectors and I think
> > it is worth to use that (is it really worth indeed?)
> >
> > I have FreeBSD 9-STABLE running well on it already but when it was
> > installed, the alignment was not done properly ...
> >
> > - --------------------------------------------------------------------
> > what I have:
> >
> > > uname -a
> > FreeBSD 9.1-STABLE #17 r250823: Mon May 20 19:39:19 EEST 2013 amd64
> >
> > > smartctl -a /dev/ada0
> > Model Family: Western Digital Scorpio Black (AF)
> > Device Model: WDC WD5000BPKT-60PK4T0
> > Serial Number: WD-WXJ1A61P0560
> > User Capacity: 500,107,862,016 bytes [500 GB]
> > Sector Sizes: 512 bytes logical, 4096 bytes physical
> >
> > > gpart show ada0
> > =>       34  976773101  ada0  GPT  (465G)
> >          34          6        - free -  (3.0k)
> >          40        128     1  freebsd-boot  (64k)
> >         168   33554432     2  freebsd-swap  (16G)
> >    33554600  943218528     3  freebsd-zfs  (449G)
> >   976773128          7        - free -  (3.5k)
>
> Unless I'm doing the math wrong it looks like your swap and zfs partitions
> are already 4k aligned. So there's no need to redo your partitions. Just
> skip ahead to the part where you do the ZFS stuff.
>
> I've never setup a machine with a ZFS root so I can't say if that part is
> correct.
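The alignment check above is plain arithmetic on the partition start sectors. A minimal sketch, assuming the start offsets from the gpart output and the 512-byte logical / 4096-byte physical sector sizes that smartctl reports; the helper name is_aligned is made up for illustration:

```python
# Sector sizes as reported by smartctl in the quoted message.
LOGICAL_SECTOR = 512    # bytes per logical sector
PHYSICAL_SECTOR = 4096  # bytes per physical sector

# (start sector, partition type) copied from the quoted gpart output.
partitions = [
    (40, "freebsd-boot"),
    (168, "freebsd-swap"),
    (33554600, "freebsd-zfs"),
]


def is_aligned(start_sector, logical=LOGICAL_SECTOR, physical=PHYSICAL_SECTOR):
    """Hypothetical helper: True if the byte offset is a multiple of the physical sector."""
    return (start_sector * logical) % physical == 0


for start, label in partitions:
    state = "aligned" if is_aligned(start) else "NOT aligned"
    print(f"{label:13s} start={start:<9d} {state}")
```

With 512-byte logical sectors this reduces to "start sector divisible by 8", which holds for 40, 168 and 33554600. Partition alignment is a separate question from the pool's ashift, which is checked with zdb as mentioned at the top of this message.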
> > > is ZFS version have to be the same on both systems (the one to be > > aligned is shown as 5000 (after `zpool upgrade -a') while the one all > > that stuff to be performed at is 28)? > > I assume you are doing a zfs send to a file on a remote machine? Or, you > are doing the send+receive on the helper machine? In either of those cases > you just need the machine that you do the restore on to be new enough to > handle ZFS version 5000. The version of the pool holding the file you are > restoring from does not matter. > > -- > "A method for inducing cats to exercise consists of directing a beam of > invisible light produced by a hand-held laser apparatus onto the floor ... > in the vicinity of the cat, then moving the laser ... in an irregular way > fascinating to cats,..." -- US patent 5443036, "Method of exercising a cat" > _______________________________________________ > freebsd-fs@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-fs > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" > From owner-freebsd-fs@FreeBSD.ORG Wed Aug 28 08:24:49 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id A5E8AE73 for ; Wed, 28 Aug 2013 08:24:49 +0000 (UTC) (envelope-from zeus@ibs.dn.ua) Received: from relay.ibs.dn.ua (relay.ibs.dn.ua [91.216.196.25]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 26299250B for ; Wed, 28 Aug 2013 08:24:48 +0000 (UTC) Received: from ibs.dn.ua (relay.ibs.dn.ua [91.216.196.25]) by relay.ibs.dn.ua with ESMTP id r7S8OiX2005125; Wed, 28 Aug 2013 11:24:44 +0300 (EEST) Message-ID: <20130828112444.5123@relay.ibs.dn.ua> Date: Wed, 28 Aug 2013 11:24:44 +0300 From: "Zeus Panchenko" To: "krad" Subject: Re: exictent zroot 
re-alignment to 4K In-reply-to: Your message of Wed, 28 Aug 2013 08:33:42 +0100 References: <20130827181127.24761@relay.ibs.dn.ua> <20130827172727.GA73465@neutralgood.org> Organization: I.B.S. LLC X-Mailer: MH-E 8.3.1; GNU Mailutils 2.99.98; GNU Emacs 24.0.93 X-Face: &sReWXo3Iwtqql1[My(t1Gkx; y?KF@KF`4X+'9Cs@PtK^y%}^.>Mtbpyz6U=,Op:KPOT.uG )Nvx`=er!l?WASh7KeaGhga"1[&yz$_7ir'cVp7o%CGbJ/V)j/=]vzvvcqcZkf; JDurQG6wTg+?/xA go`}1.Ze//K; Fk&/&OoHd'[b7iGt2UO>o(YskCT[_D)kh4!yY'<&:yt+zM=A`@`~9U+P[qS:f; #9z~ Or/Bo#N-'S'!'[3Wog'ADkyMqmGDvga?WW)qd=?)`Y&k=o}>!ST\ MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable Cc: FreeBSD FS X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list Reply-To: Zeus Panchenko List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 28 Aug 2013 08:24:49 -0000

krad wrote:

> is it just an alignment problem or are you ashift=9 (zdb zroot | grep
> ashift) as well? If you are already ashift=12,

yes, I am ashift=9 and it is the very cause I decided to re-allocate the
partition ...

> all the zfs send and receive bits and just attach it as a mirror,
> then drop out the original drive, repartition, install boot blocks
> and reattach.

hm ... great! I've missed that variant

--
Zeus V. Panchenko jid:zeus@im.ibs.dn.ua
IT Dpt., I.B.S.
LLC GMT+2 (EET) From owner-freebsd-fs@FreeBSD.ORG Wed Aug 28 10:47:53 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id E2DD7DAD for ; Wed, 28 Aug 2013 10:47:52 +0000 (UTC) (envelope-from kraduk@gmail.com) Received: from mail-qe0-x22b.google.com (mail-qe0-x22b.google.com [IPv6:2607:f8b0:400d:c02::22b]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id A49A32D2B for ; Wed, 28 Aug 2013 10:47:52 +0000 (UTC) Received: by mail-qe0-f43.google.com with SMTP id t7so3304525qeb.16 for ; Wed, 28 Aug 2013 03:47:51 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=UGJSLRfb7Bcm8lDVnWzDxunF/YyLG6umu3XdX6umMDY=; b=MlGWONkEi20ArL+MKVkgX0ul3zaXoJ7zHBT9oLF6DcDKtY3pnSBsBda/vkXk2LN77W wZmUUYjK8XS9yoPN5HoAV9d0nNHWns2GvZfu7SSPUzLbDWsCM5+IWL/9N78xxCY7h7ka q1sXU0JI5sGy+I+wg5/UD/ycaQY/dODYxoNFjhv6i0bCu6pKvDT/+eOA22zr4enAcCDV GhAKmLZBSkoN0hKj1eUze98OBkDzdBwvonp7UIZNbZnBo+PkqLifgZKqiYYt1SznMMXO 5cJVOxlhXHm4nseigv4CFHfbQU3PpeZGq5WEKfutKVmVJPld+ywCxd4VUibVpMwPx8EC 1ERA== MIME-Version: 1.0 X-Received: by 10.224.127.196 with SMTP id h4mr27797581qas.59.1377686871829; Wed, 28 Aug 2013 03:47:51 -0700 (PDT) Received: by 10.224.189.195 with HTTP; Wed, 28 Aug 2013 03:47:51 -0700 (PDT) In-Reply-To: <20130828112444.5123@relay.ibs.dn.ua> References: <20130827181127.24761@relay.ibs.dn.ua> <20130827172727.GA73465@neutralgood.org> <20130828112444.5123@relay.ibs.dn.ua> Date: Wed, 28 Aug 2013 11:47:51 +0100 Message-ID: Subject: Re: exictent zroot re-alignment to 4K From: krad To: Zeus Panchenko Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 
2.1.14 Cc: FreeBSD FS X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 28 Aug 2013 10:47:53 -0000

you can't mirror if you're ashift=9, as it's a pool-level thing

On 28 August 2013 09:24, Zeus Panchenko wrote:

> krad wrote:
>
> > is it just an alignment problem or are you ashift=9 (zdb zroot | grep
> > ashift) as well? If you are already ashift=12,
>
> yes, I am ashift=9 and it is the very cause I decided to re-allocate the
> partition ...
>
> > all the zfs send and receive bits and just attach it as a mirror,
> > then drop out the original drive, repartition, install boot blocks
> > and reattach.
>
> hm ... great! I've missed that variant
>
> --
> Zeus V. Panchenko jid:zeus@im.ibs.dn.ua
> IT Dpt., I.B.S. LLC GMT+2 (EET)
>
From owner-freebsd-fs@FreeBSD.ORG Wed Aug 28 13:56:44 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id 1EB648FF for ; Wed, 28 Aug 2013 13:56:44 +0000 (UTC) (envelope-from freebsd-fs@m.gmane.org) Received: from plane.gmane.org (plane.gmane.org [80.91.229.3]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id CD8BE291D for ; Wed, 28 Aug 2013 13:56:43 +0000 (UTC) Received: from list by plane.gmane.org with local (Exim 4.69) (envelope-from ) id 1VEgEj-0001iv-1f for freebsd-fs@freebsd.org; Wed, 28 Aug 2013 15:56:41 +0200 Received: from lara.cc.fer.hr ([161.53.72.113]) by main.gmane.org with esmtp (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Wed, 28 Aug 2013 15:56:41 +0200 Received: from ivoras by lara.cc.fer.hr with local (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Wed, 28 Aug 2013 15:56:41 +0200
X-Injected-Via-Gmane: http://gmane.org/ To: freebsd-fs@freebsd.org From: Ivan Voras Subject: Call fo comments - raising vfs.ufs.dirhash_reclaimage? Date: Wed, 28 Aug 2013 15:56:30 +0200 Lines: 76 Message-ID: Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="----enig2KORNWEXCTJDANISUNQEX" X-Complaints-To: usenet@ger.gmane.org X-Gmane-NNTP-Posting-Host: lara.cc.fer.hr User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:17.0) Gecko/20130322 Thunderbird/17.0.4 X-Enigmail-Version: 1.5.1 Cc: freebsd-hackers@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 28 Aug 2013 13:56:44 -0000 This is an OpenPGP/MIME signed message (RFC 4880 and 3156) ------enig2KORNWEXCTJDANISUNQEX Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable

Hi,

Prodded by davide@, I'd like to collect opinions about raising the vfs.ufs.dirhash_reclaimage sysctl from 5 to 60, committed at:

http://svnweb.freebsd.org/changeset/base/254986

What it does: It is used in the lowmem handler at http://fxr.watson.org/fxr/source/ufs/ufs/ufs_dirhash.c#L1247 when determining which cache entries to evict; the handler skips (keeps in the cache) entries which are younger than this number of seconds. This lowmem handler only frees up to 10% of the dirhash cache at a time.

A real-life example: a PHP server with session files on local storage. Even a moderately busy server will usually have several GB of these small files sharded in directories containing a few hundred files each, up to several hundred thousand files total. It is standard practice for admins to increase the dirhash memory limit (maxmem) if they notice PHP/apache processes stuck in the "ufs" state for too long.

Why: Without ufs_dirhashreclaimage, it is possible that UFS dirhash entries will get evicted immediately after they are created if a lowmem situation occurs. The previous ufs_dirhashreclaimage value was 5, keeping the entries for at least 5 seconds before allowing them to get evicted (Reclaim Age).

Why the change: Dirhash is a cache. Keeping as many entries as possible for as long as possible is a good thing. It can be done either by increasing maxmem or by increasing reclaimage. A scenario where increasing reclaimage helps is a load spike, where the following happens:
1) memory gets used up,
2) lowmem is signalled,
3) dirhash entries get evicted,
4) processes get slower (this can be observed in "top" as being in the "ufs" wchan) waiting for the data which could have been cached.
Increasing reclaimage during a spike situation has the effect of clearing up processes faster.

Why not leave it for sysadmins to tune it themselves if they want it:
1) They usually don't know about it until it's too late.
2) Dirhash is typically minuscule compared to today's memory sizes - a few dozen MBs even on very busy systems, and there are no typical situations where a large number of entries are filled in at the same time which block eviction of a large-ish amount of memory, so having reclaimage higher will automatically help in file-system intensive spikes without harming other uses.
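The eviction policy described above can be sketched as a toy model. This is only an illustration of the behaviour as described in this message (keep entries younger than reclaimage, free at most about 10% of the cache per lowmem event), not the code in ufs_dirhash.c; the Entry type, the lowmem_evict function and the sample ages and sizes are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    name: str    # directory the hash belongs to (made-up names below)
    size: int    # bytes of dirhash memory held by this entry
    age: float   # seconds since the entry was last used


def lowmem_evict(entries, reclaimage, fraction=0.10):
    """Toy lowmem handler: evict oldest entries first, skipping any entry
    younger than `reclaimage` seconds, and stop once roughly `fraction`
    of the total cache size has been freed."""
    budget = sum(e.size for e in entries) * fraction
    freed = 0
    kept, evicted = [], []
    for e in sorted(entries, key=lambda e: e.age, reverse=True):
        if freed < budget and e.age >= reclaimage:
            evicted.append(e)
            freed += e.size
        else:
            kept.append(e)
    return kept, evicted


# Three cached directories, last used 2, 10 and 30 seconds ago.
cache = [Entry("sess/a", 100, 2), Entry("sess/b", 100, 10), Entry("sess/c", 100, 30)]

for reclaimage in (5, 60):
    kept, evicted = lowmem_evict(cache, reclaimage)
    print(f"reclaimage={reclaimage}: evicted {[e.name for e in evicted]}")
```

In this model a reclaimage of 5 lets the 30-second-old entry go on the first lowmem event, while a reclaimage of 60 keeps all three entries cached, which is the effect the proposed default is after during a short load spike.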
------enig2KORNWEXCTJDANISUNQEX Content-Type: application/pgp-signature; name="signature.asc" Content-Description: OpenPGP digital signature Content-Disposition: attachment; filename="signature.asc" -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.19 (FreeBSD) Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/ iEYEARECAAYFAlIeAY8ACgkQ/QjVBj3/HSzEfACgjc4tpCc9AiqrPej/QFki0vH8 u+YAoI67IgN3kTH2GptadAKQgQ5TYTnT =aryY -----END PGP SIGNATURE----- ------enig2KORNWEXCTJDANISUNQEX-- From owner-freebsd-fs@FreeBSD.ORG Wed Aug 28 15:11:12 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id AB82EB4B for ; Wed, 28 Aug 2013 15:11:12 +0000 (UTC) (envelope-from aurfalien@gmail.com) Received: from mail-oa0-x22f.google.com (mail-oa0-x22f.google.com [IPv6:2607:f8b0:4003:c02::22f]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 75ADB2F9A for ; Wed, 28 Aug 2013 15:11:12 +0000 (UTC) Received: by mail-oa0-f47.google.com with SMTP id g12so7812986oah.20 for ; Wed, 28 Aug 2013 08:11:11 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=subject:mime-version:content-type:from:in-reply-to:date:cc :content-transfer-encoding:message-id:references:to; bh=lOpwCcyuzKTTNo0umqV1XCzZik8D7kau0CN1gYQYZ18=; b=nJ45Dy6AUpxDknVbbrGLnejZGjxmIRtX36ub48Kwlt5eC2Ygrm62e3qGMq2w0UtGTs 7CAP4SdqDjfFaXGwXYrCQm9LQfuPwuny17qtqRWi7pTFknRRA4KVuAK0QkVAaBPfPWE0 58P77AfSHMhczq1r068Ni0vk2ntHPSfzB6We9+/3ew7lWJQa1QOFoO5faCP2Vse5QE78 sKAcfUxi+QZE+PMxjjMzf4S2WEVzFZP7EUpWGyQdwFCMyauqK7VInrYjqebhzhxCeRsh pHYGHZ74tX8j2S42eURXq32Z9PfA1GMRhGSKcU9C+z8EYlTmbkxAoZxJh3WzkJAOmpcU Al1A== X-Received: by 10.60.96.131 with SMTP id ds3mr6902880oeb.50.1377702671841; Wed, 28 Aug 2013 08:11:11 -0700 (PDT) 
Received: from [192.168.1.74] (75-63-29-182.lightspeed.irvnca.sbcglobal.net. [75.63.29.182]) by mx.google.com with ESMTPSA id ps5sm27121195oeb.8.1969.12.31.16.00.00 (version=TLSv1 cipher=ECDHE-RSA-RC4-SHA bits=128/128); Wed, 28 Aug 2013 08:11:10 -0700 (PDT) Subject: Re: NFS on ZFS pure SSD pool Mime-Version: 1.0 (Apple Message framework v1085) Content-Type: text/plain; charset=us-ascii From: aurfalien In-Reply-To: <220720097.14453943.1377644890208.JavaMail.root@uoguelph.ca> Date: Wed, 28 Aug 2013 08:11:08 -0700 Content-Transfer-Encoding: quoted-printable Message-Id: References: <220720097.14453943.1377644890208.JavaMail.root@uoguelph.ca> To: Rick Macklem X-Mailer: Apple Mail (2.1085) Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 28 Aug 2013 15:11:12 -0000 On Aug 27, 2013, at 4:08 PM, Rick Macklem wrote: > I wrote: >> Outback Dingo wrote: >>> >>> >>> >>> >>> >>> >>> On Tue, Aug 27, 2013 at 3:29 PM, Rick Macklem < >>> rmacklem@uoguelph.ca >>>> wrote: >>> >>> >>> >>> >>> Eric Browning wrote: >>>> Hello, first time posting to this list. I have a new server that >>>> is >>>> not >>>> living up to the promise of SSD speeds and NFS is maxing out the >>>> CPU. >>>> I'm >>>> new to FreeBSD but I've been reading up on it as much as I can. I >>>> have >>>> obscured my IP addresses and hostname with x's so just ignore >>>> that. >>>> Server has about 200 users on it each draing under 50Mb/s peak >>>> sustained >>>> around 1-2Mb/s >>>> >>>> I've followed some network tuning guides for our I350t4 nic and >>>> that >>>> has >>>> helped with network performance somewhat but the server is still >>>> experiencing heavy load with pegging the CPU at 1250% on average >>>> with >>>> only >>>> 50Mb/s of traffin in/out of the machine.
All of the network >>>> tuning >>>> came >>>> from https://calomel.org/freebsd_network_tuning.html since it was >>>> relevant >>>> to the same nic that I have. >>>> >>>> Server Specs: >>>> FreeBSD 9.1 >>>> 16 cores AMDx64 >>>> 64GB of ram >>>> ZFS v28 with four Intel DC S3700 drives (800GB) as a zfs stripe >>>> Intel DC S3500 for ZIL and enabling/disabling has made no >>>> difference >>>> Used a spare DC S3700 for the ZIL and that made no difference >>>> either. >>>> NFS v3 & v4 for Mac home folders whose Cache fodler is >>>> redirected. >>>> >>>> I've tried: >>>> Compression on/of <-- no appreciable difference >>>> Deduplication on/off <-- no appreciable difference >>>> sync=disabled and sync=standard <-- no appreciable difference >>>> setting arc cache to 56GB and also to 32GB <-- no difference in >>>> performance >>>> in terms of kern. >>>> >>>> I've tried to follow the freebsd tuning guide: >>>> https://wiki.freebsd.org/ZFSTuningGuide to no avail either. I've >>>> read >>>> everything I can find on NFS on ZFS and nothing has helped. WHere >>>> am >>>> I >>>> going wrong? >>>> >>> You could try this patch: >>> http://people.freebsd.org/~rmacklem/drc4-stable9.patch >>> - After applying the patch and booting a kernel built from the >>> patched >>> sources, you need to increase the value of vfs.nfsd.tcphighwater. >>> (Try something like 5000 for it as a starting point.) >>> >>> >>> >>> >>> can we get a brief on what this is supposed to improve upon ? >>> >> It was developed for and tested by wollman@ to reduce mutex lock >> contention and CPU overheads for the duplicate request cache, mainly >> for NFS over TCP. (For the CPU overheads case, it allows the cache >> to grow larger, reducing the frequency and, therefore, overhead of >> trimming out stale entries.) > Oh, and I should also mention that ivoras@ developed a similar patch which > had better code structure than mine.
I did use some of his code in the patch > that went into head, but not as much as I would have liked, because I wanted > to get it into head before code slush for 10.0. (I had already missed the 9.2 > release.)

I take it that this patch can be applied to 9.2 when it's officially released?

You mentioned

- aurf

From owner-freebsd-fs@FreeBSD.ORG Wed Aug 28 15:23:56 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id AD80C20C for ; Wed, 28 Aug 2013 15:23:56 +0000 (UTC) (envelope-from sfourman@gmail.com) Received: from mail-vb0-x233.google.com (mail-vb0-x233.google.com [IPv6:2607:f8b0:400c:c02::233]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 6C9B320F5 for ; Wed, 28 Aug 2013 15:23:56 +0000 (UTC) Received: by mail-vb0-f51.google.com with SMTP id x16so4080938vbf.38 for ; Wed, 28 Aug 2013 08:23:54 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=Vmt/7Gz58sxG0u9odrAyTl8BI5NZ8YlKjRMA5zvUn48=; b=bjZN2ltXGpZbPlqWUkgRVFFVumdWkJWp4ABnIbahEnGGHtQtx/iEbpSNKmHaN1wEm1 QVNMFlzYNdIqwVjOOJgr5s68hLwxtBYerRCiG13WeZLve3Nrwo8hltTHA6d8dH53XTrM qjGqHJ9/G3e19x0V/9/P3NhAOrjB2my49RUI8m1TmwAjXRrx2yO5HqCcb7k1a+Hoz4kO H4055/OKuSmg0/+WhM9Hr19sccIXsd6aO9OM55/C7Ojh3MDV5/R6a6GgoTMkFQNcL2Bx GTLANXf1lB1TVhGL7f5a/iC4fdvBBThG6WRVLsmjJ0+qJYQRIgAlKC9g/JgYCOsipOIj kkVg== MIME-Version: 1.0 X-Received: by 10.220.10.194 with SMTP id q2mr25915241vcq.2.1377703434631; Wed, 28 Aug 2013 08:23:54 -0700 (PDT) Received: by 10.220.96.78 with HTTP; Wed, 28 Aug 2013 08:23:54 -0700 (PDT) In-Reply-To: References: <220720097.14453943.1377644890208.JavaMail.root@uoguelph.ca>
Date: Wed, 28 Aug 2013 11:23:54 -0400 Message-ID: Subject: Re: NFS on ZFS pure SSD pool From: "Sam Fourman Jr." To: aurfalien Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.14 Cc: FreeBSD FS X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 28 Aug 2013 15:23:56 -0000 > I take it that this patch can be applied to 9.2 when its officially > released? > > You mentioned > > - aurf > Yes, it applies to stable/9 just fine > -- > Sam Fourman Jr. From owner-freebsd-fs@FreeBSD.ORG Wed Aug 28 16:06:18 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id 1E41CF46 for ; Wed, 28 Aug 2013 16:06:18 +0000 (UTC) (envelope-from rainer@ultra-secure.de) Received: from mail.ultra-secure.de (mail.ultra-secure.de [78.47.114.122]) (using TLSv1 with cipher ECDHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 5FCDA2434 for ; Wed, 28 Aug 2013 16:06:17 +0000 (UTC) Received: (qmail 15863 invoked by uid 89); 28 Aug 2013 16:06:07 -0000 Received: by simscan 1.4.0 ppid: 15858, pid: 15860, t: 0.1366s scanners: attach: 1.4.0 clamav: 0.97.3/m:54/d:17761 Received: from unknown (HELO suse3) (rainer@ultra-secure.de@212.71.117.1) by mail.ultra-secure.de with ESMTPA; 28 Aug 2013 16:06:07 -0000 Date: Wed, 28 Aug 2013 18:06:06 +0200 From: Rainer Duffner To: FreeBSD FS Subject: Re: NFS on ZFS pure SSD pool Message-ID: <20130828180606.5fb105dd@suse3> In-Reply-To: References: <220720097.14453943.1377644890208.JavaMail.root@uoguelph.ca> X-Mailer: Claws Mail 3.8.1 (GTK+ 2.24.10; x86_64-suse-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII 
Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 28 Aug 2013 16:06:18 -0000 Am Wed, 28 Aug 2013 11:23:54 -0400 schrieb "Sam Fourman Jr." : > > I take it that this patch can be applied to 9.2 when its officially > > released? > > > > You mentioned > > > > - aurf > > > Yes, it applies to stable/9 just fine > I think he meant if one can take the source of 9.2-RELEASE and apply that patch. Some companies don't want to run anything that does not say "RELEASE".... From owner-freebsd-fs@FreeBSD.ORG Wed Aug 28 16:12:34 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id CB85C1EF; Wed, 28 Aug 2013 16:12:34 +0000 (UTC) (envelope-from gljennjohn@googlemail.com) Received: from mail-bk0-x236.google.com (mail-bk0-x236.google.com [IPv6:2a00:1450:4008:c01::236]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 0903A24EF; Wed, 28 Aug 2013 16:12:33 +0000 (UTC) Received: by mail-bk0-f54.google.com with SMTP id mz12so2303987bkb.13 for ; Wed, 28 Aug 2013 09:12:31 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=googlemail.com; s=20120113; h=date:from:to:cc:subject:message-id:in-reply-to:references:reply-to :mime-version:content-type:content-transfer-encoding; bh=Q0UFBFG3OQNSfIVBxQ8BX4LdUN5McWVDaf4Dnx2WrqI=; b=c8xMzHl0G+Fn3wODqLRuLBFKLiniTwzjpks5wM9E/TIoQ/lAEHpGn2R2QzeA9mfKl3 TMKJHNZAnJ/JTtjYBAag7S65Pe7eLHVa0IpnHuMsEDKcjA7kUYXlbeRgn9B95HOZIv+B 3k5mIouz9LzVrfcf+ECPUK1G1uBMmYGUQszNrGnry9n5YUkt2CTgW9jh4QoUj4GdGnZG llh24dTP2PhSzr/nbU7GiZ0qAjDwUxWIiesCfjghB2hqAuZipWeG+mreRLQzSW1dsLNa 
NGYjEVj6/2MvJQP6ilbLHf0kapWSAOMTPv7H7TpmqGSUfiAiwVdQlbY/NoPJL4h6Big8 V5DA==
X-Received: by 10.204.62.70 with SMTP id w6mr101436bkh.43.1377706351474; Wed, 28 Aug 2013 09:12:31 -0700 (PDT)
Received: from ernst.home (p578E0A66.dip0.t-ipconnect.de. [87.142.10.102]) by mx.google.com with ESMTPSA id jt14sm6184548bkb.0.1969.12.31.16.00.00 (version=TLSv1.2 cipher=RC4-SHA bits=128/128); Wed, 28 Aug 2013 09:12:30 -0700 (PDT)
Date: Wed, 28 Aug 2013 18:12:28 +0200
From: Gary Jennejohn
To: Ivan Voras
Subject: Re: Call fo comments - raising vfs.ufs.dirhash_reclaimage?
Message-ID: <20130828181228.0d3618dd@ernst.home>
In-Reply-To:
References:
X-Mailer: Claws Mail 3.9.2 (GTK+ 2.24.17; amd64-portbld-freebsd10.0)
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Cc: freebsd-fs@freebsd.org, freebsd-hackers@freebsd.org
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
Reply-To: gljennjohn@googlemail.com
List-Id: Filesystems
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Wed, 28 Aug 2013 16:12:34 -0000

On Wed, 28 Aug 2013 15:56:30 +0200
Ivan Voras wrote:

[jump to the chase]

> Why not leave it for sysadmins to tune it themselves if they want it:
>
> 1) They usually don't know about it until it's too late.
>
> 2) Dirhash is typically minuscule compared to today's memory sizes - a
> few dozen MBs even on very busy systems, and there are no typical
> situations where a large number of entries are filled in at the same
> time which block eviction of a large-ish amount of memory, so having
> reclaimage higher will automatically help in file-system intensive
> spikes without harming other uses.
>

So, if I understand this correctly, a normal desktop user won't notice any real change, except that buildworld might get faster, and big servers will benefit?

But could this negatively impact small, embedded systems, which usually have only small memory footprints?
Although I suppose one could argue that they usually don't have large numbers of files cached in memory at any given time. -- Gary Jennejohn From owner-freebsd-fs@FreeBSD.ORG Wed Aug 28 16:40:57 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id 70AD89AE; Wed, 28 Aug 2013 16:40:57 +0000 (UTC) (envelope-from ivoras@gmail.com) Received: from mail-vc0-x232.google.com (mail-vc0-x232.google.com [IPv6:2607:f8b0:400c:c03::232]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 1FD9B270F; Wed, 28 Aug 2013 16:40:57 +0000 (UTC) Received: by mail-vc0-f178.google.com with SMTP id ha12so4428788vcb.9 for ; Wed, 28 Aug 2013 09:40:56 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:in-reply-to:references:from:date:message-id :subject:to:cc:content-type; bh=0pUJUXgWtnJQUozJSPQTWj89DGpC1GMmt9qnXPY4x2w=; b=tWne3NFlSrHmxKcgtqdth5eXpHIC/50CIAYQhHYD5nkYfuV6s/JcrgLE9wfY1mt1Qi 7q0AiychkTJdpW5A1ZALdkzVaq2Cjjh3anQTT0dLuj0tFmZwkIk7UDhSVBknzVRKz6TB qE2/qMdALJtlC9BpIZVXDkzxxAftBQKQ7115t7Djuaysw8k3Y9IWGAq35dVcieFGUhJ1 /Hk8+YTisTYkIm1flsHvsFYuZh1OMIoC7FJMadaDVHpUK9/w3ClgYsbuAalcxhNfMoO7 PU3KF6tTe877+HmxRi4CJd8sjAJiRX8Nak6UKhqnLWZXGoJVMG0rFFA7Ad82K9C8WmCp isqw== X-Received: by 10.52.165.111 with SMTP id yx15mr1099455vdb.33.1377708056260; Wed, 28 Aug 2013 09:40:56 -0700 (PDT) MIME-Version: 1.0 Sender: ivoras@gmail.com Received: by 10.58.229.167 with HTTP; Wed, 28 Aug 2013 09:40:15 -0700 (PDT) In-Reply-To: <20130828181228.0d3618dd@ernst.home> References: <20130828181228.0d3618dd@ernst.home> From: Ivan Voras Date: Wed, 28 Aug 2013 18:40:15 +0200 X-Google-Sender-Auth: U9LZAgQfiX3G_bOqoiVEpR_JU6g Message-ID: Subject: Re: Call fo comments - raising 
vfs.ufs.dirhash_reclaimage?
To: gljennjohn@googlemail.com
Content-Type: text/plain; charset=UTF-8
Cc: freebsd-fs , freebsd-hackers
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Wed, 28 Aug 2013 16:40:57 -0000

On 28 August 2013 18:12, Gary Jennejohn wrote:

> So, if I understand this correctly, a normal desktop user won't
> notice any real change, except that buildworld might get faster,
> and big servers will benefit?

Basically, yes, but read on...

> But could this negatively impact small, embedded systems, which
> usually have only small memory footprints? Although I suppose
> one could argue that they usually don't have large numbers of
> files cached in memory at any given time.

Unless I'm wrong, the only pathological case coming from this change would be the following sequence of events:

1) Memory is scarce [*]
2) There's a sudden surge of requests for a huge number of different directories
3) There's an urgent lowmem event which is observed by dirhash, which attempts to free memory but is prevented from doing so for the next 60 seconds because all entries are young (the idea behind dirhash being that if a directory is accessed, it will probably soon be accessed again - think "ls" then "fopen", so we won't evict it until reclaimage seconds have passed)
4) The kernel runs out of memory, game over.

Note that this sequence of events could still happen right now, only over a span of 5 seconds, not 60 seconds. Note also that all of this has nothing to do with the regular file cache; dirhash is a very specific corner-case of UFS.

[*] Keep in mind that dirhash cache even on large and busy systems is usually ~15-25 MB; on 16 GB machines the auto-tuning code caps it at 25 MB. As an illustration of how tiny dirhash is: a "du -c" on /usr/ports increases dirhash_mem on my desktop from 103945 to 501507 bytes.
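
As a rough illustration of the sequence above, here is a toy model of the behaviour as described in this message. It is not code from the kernel. It assumes that a lowmem event can only free dirhash entries that have been idle for at least reclaimage seconds. The DirhashEntry class, the reclaimable_bytes helper, and all paths, sizes and timestamps are made up for the example.

```python
from dataclasses import dataclass


@dataclass
class DirhashEntry:
    """Hypothetical stand-in for one cached directory hash."""
    path: str
    size_bytes: int
    last_used: float  # seconds, on the same clock as `now`


def reclaimable_bytes(entries, now, reclaimage):
    """Sum the sizes of entries idle for at least `reclaimage` seconds.

    Assumption: this mirrors only the rule described in the thread
    (young entries are protected), not the real kernel logic.
    """
    return sum(e.size_bytes for e in entries if now - e.last_used >= reclaimage)


# A burst that touched three directories within the last 30 seconds.
entries = [
    DirhashEntry("/usr/ports/a", 4096, last_used=70.0),
    DirhashEntry("/usr/ports/b", 8192, last_used=90.0),
    DirhashEntry("/usr/ports/c", 2048, last_used=98.0),
]
now = 100.0

freed_at_5 = reclaimable_bytes(entries, now, reclaimage=5)    # current default
freed_at_60 = reclaimable_bytes(entries, now, reclaimage=60)  # proposed default
print(freed_at_5, freed_at_60)
```

With these made-up numbers, a lowmem event at a 5-second reclaimage can free 12288 bytes, while at 60 seconds it frees nothing until the burst has aged out. That is the window described in step 3.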

One of the issues raised by davide is that the benefits from this are also minuscule and hard to prove. A simple buildworld is not a big enough load. I've seen first-hand how increasing reclaimage helped, but that was under such specific circumstances that I'm still trying to figure out how to create a self-sustained benchmark (basically - how to provoke lowmem events?).

Basically, this change will have no effect for 99.9% of users, but could save that 0.1% from going crazy.

From owner-freebsd-fs@FreeBSD.ORG Wed Aug 28 18:27:41 2013
Return-Path:
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id 883D793F for ; Wed, 28 Aug 2013 18:27:41 +0000 (UTC) (envelope-from ericbrowning@skaggscatholiccenter.org)
Received: from mail-pa0-f51.google.com (mail-pa0-f51.google.com [209.85.220.51]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 6008A2D5E for ; Wed, 28 Aug 2013 18:27:41 +0000 (UTC)
Received: by mail-pa0-f51.google.com with SMTP id lf1so6574359pab.24 for ; Wed, 28 Aug 2013 11:27:40 -0700 (PDT)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20120113; h=x-gm-message-state:mime-version:in-reply-to:references:date :message-id:subject:from:to:content-type; bh=SAvYCzPXaxefiM8Pg5rxO5ymd/RcnHLI1Ib0eWm7zIU=; b=UjDWfjxFNrXdTS/X18xcGooSKWdFc+SpjlOI997dt+thWIO1fxcY53Ve5H5+TPPElJ kCz9YEaUNtaRpw9q7bVBxz4lmzt+mISmgOX/R1m1YHx7T5z+pj5ryVcmoK1cLkcm6JJ1 SFBdtGgKB/pJzUHhYpYh1Zhh37IXuf5itSVy6ryFLyc+R58ai1DRfN0OW5FXZcLgdccx +EvdJAj2pu1FFPK5hDQOhlKe7Y7mOOWitNDFCUzhBMs7EhK/9/YxqoIecqXto3Wf4zlg wySfuMZwNRZpkFZClI8/rDYAgJxZ+IBJIDtAs7NqHdgGl89F0gdT50D/mDIUili2Zhtz qIWQ==
X-Gm-Message-State: ALoCoQmJKx7f2ZuBoSd0OhlwEV7KvnpkVfV+xXBVy5VoKNYowIm3Zx+IVcanmbuoD21oISnMV+9x
MIME-Version: 1.0
X-Received: by 10.66.154.169 with SMTP id vp9mr3577192pab.190.1377714460375; Wed, 28 Aug 2013 11:27:40 -0700 (PDT)
Received: by 10.70.26.4 with HTTP; Wed, 28 Aug 2013 11:27:40 -0700 (PDT)
In-Reply-To: <2008996797.14358576.1377631792358.JavaMail.root@uoguelph.ca>
References: <2008996797.14358576.1377631792358.JavaMail.root@uoguelph.ca>
Date: Wed, 28 Aug 2013 12:27:40 -0600
Message-ID:
Subject: Re: NFS on ZFS pure SSD pool
From: Eric Browning
To: Rick Macklem , freebsd-fs@freebsd.org
Content-Type: text/plain; charset=ISO-8859-1
X-Content-Filtered-By: Mailman/MimeDel 2.1.14
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Wed, 28 Aug 2013 18:27:41 -0000

Rick,

Sam and I applied the patch (kernel now at r254983M) and set vfs.nfsd.tcphighwater=5000 in sysctl.conf and my CPU is still slammed. Should I up it to 10000?

On Tue, Aug 27, 2013 at 1:29 PM, Rick Macklem wrote:

> Eric Browning wrote:
> > Hello, first time posting to this list. I have a new server that is not
> > living up to the promise of SSD speeds and NFS is maxing out the CPU. I'm
> > new to FreeBSD but I've been reading up on it as much as I can. I have
> > obscured my IP addresses and hostname with x's so just ignore that.
> > Server has about 200 users on it each drawing under 50Mb/s peak sustained
> > around 1-2Mb/s
> >
> > I've followed some network tuning guides for our I350t4 nic and that has
> > helped with network performance somewhat but the server is still
> > experiencing heavy load with pegging the CPU at 1250% on average with only
> > 50Mb/s of traffic in/out of the machine. All of the network tuning came
> > from https://calomel.org/freebsd_network_tuning.html since it was relevant
> > to the same nic that I have.
> >
> > Server Specs:
> > FreeBSD 9.1
> > 16 cores AMDx64
> > 64GB of ram
> > ZFS v28 with four Intel DC S3700 drives (800GB) as a zfs stripe
> > Intel DC S3500 for ZIL and enabling/disabling has made no difference
> > Used a spare DC S3700 for the ZIL and that made no difference either.
> > NFS v3 & v4 for Mac home folders whose Cache folder is redirected.
> >
> > I've tried:
> > Compression on/off <-- no appreciable difference
> > Deduplication on/off <-- no appreciable difference
> > sync=disabled and sync=standard <-- no appreciable difference
> > setting arc cache to 56GB and also to 32GB <-- no difference in performance
> > in terms of kern.
> >
> > I've tried to follow the freebsd tuning guide:
> > https://wiki.freebsd.org/ZFSTuningGuide to no avail either. I've read
> > everything I can find on NFS on ZFS and nothing has helped. Where am I
> > going wrong?
> >
> You could try this patch:
> http://people.freebsd.org/~rmacklem/drc4-stable9.patch
> - After applying the patch and booting a kernel built from the patched
> sources, you need to increase the value of vfs.nfsd.tcphighwater.
> (Try something like 5000 for it as a starting point.)
>
> Although this patch is somewhat different code, it should be semantically
> the same as r254337 in head, that is scheduled to be MFC'd to stable/9 in
> a couple of weeks.
> > rick > > > Here's /boot/loader: > > [quote] > > # ZFS tuning tweaks > > aio_load="YES" # Async IO system calls > > autoboot_delay="10" # reduce boot menu delay time from 10 to 3 > > seconds > > vfs.zfs.arc_max="56868864000" # Reserves 10GB or ram for system, > > leaves > > 56GB for ZFS > > vfs.zfs.cache_flush_disable="1" > > #vfs.zfs.prefetch_disble="1" > > vfs.zfs.write_limit_override="429496728" > > > > kern.ipc.nmbclusters="264144" # increase the number of network > > mbufs > > kern.maxfiles="65535" > > net.inet.tcp.syncache.hashsize="1024" # Size of the syncache hash > > table > > net.inet.tcp.syncache.bucketlimit="100" # Limit the number of entries > > permitted in each bucket of the hash table. > > net.inet.tcp.tcbhashsize="32768" > > > > # Link Aggregation loader tweaks. see: > > https://calomel.org/freebsd_network_tuning.html > > hw.igb.enable_msix="1" > > hw.igb.num_queues="0" > > hw.igb.enable_aim="1" > > hw.igb.max_interrupt_rate="32000" > > hw.igb.rxd="2048" > > hw.igb.txd="2048" > > hw.igb.rx_process_limit="4096" > > if_lagg_load="YES" > > [/quote] > > > > Here's etc/sysctl.conf: > > [quote] > > # $FreeBSD: release/9.1.0/etc/sysctl.conf 112200 2003-03-13 18:43:50Z > > mux $ > > # > > # This file is read when going to multi-user and its contents piped > > thru > > # ``sysctl'' to adjust kernel values. ``man 5 sysctl.conf'' for > > details. > > # > > > > # Uncomment this to prevent users from seeing information about > > processes > > that > > # are being run under another UID. 
> > #security.bsd.see_other_uids=0 > > kern.ipc.somaxconn=1024 > > kern.maxusers=272 > > #kern.maxvnodes=1096848 #increase this if necessary > > kern.ipc.maxsockbuf=8388608 > > net.inet.tcp.mssdflt=1460 > > net.inet.ip.forwarding=1 > > net.inet.ip.fastforwarding=1 > > dev.igb.2.fc=0 > > dev.igb.3.fc=0 > > dev.igb.4.fc=0 > > dev.igb.5.fc=0 > > dev.igb.2.rx_procesing_limit=10000 > > dev.igb.3.rx_procesing_limit=10000 > > dev.igb.4.rx_procesing_limit=10000 > > dev.igb.5.rx_procesing_limit=10000 > > net.inet.ip.redirect=0 > > net.inet.icmp.bmcastecho=0 # do not respond to ICMP packets sent > > to IP > > .255 > > net.inet.icmp.maskfake=0 # do not fake reply to ICMP Address > > Mask > > Request packets > > net.inet.icmp.maskrepl=0 # replies are not sent for ICMP > > address mask > > net.inet.icmp.log_redirect=0 # do not log redirected ICMP packet > > attempts > > net.inet.icmp.drop_redirect=1 # no redirected ICMP packets > > net.inet.tcp.drop_synfin=1 # SYN/FIN packets get dropped on > > initial > > connection > > net.inet.tcp.ecn.enable=1 # explicit congestion notification > > (ecn) > > warning: some ISP routers abuse it > > net.inet.tcp.icmp_may_rst=0 # icmp may not send RST to avoid > > spoofed > > icmp/udp floods > > net.inet.tcp.maxtcptw=15000 # max number of tcp time_wait states > > for > > closing connections > > net.inet.tcp.msl=5000 # 5 second maximum segment life > > waiting for > > an ACK in reply to a SYN-ACK or FIN-ACK > > net.inet.tcp.path_mtu_discovery=0 # disable MTU discovery since most > > ICMP > > packets are dropped by others > > net.inet.tcp.rfc3042=0 # disable the limited transmit > > mechanism > > which can slow burst transmissions > > net.inet.ip.rtexpire=60 # 3600 secs > > net.inet.ip.rtminexpire=2 # 10 secs > > net.inet.ip.rtmaxcache=1024 # 128 entries > > [/quote] > > > > Here's /etc/rc.conf > > [quote] > > #ifconfig_igb2=" inet xxx.xx.x.xx netmask 255.255.248.0" > > hostname="xxxxxxxxxxxxxxxxxxx" > > # > > # Set dumpdev to "AUTO" to enable 
crash dumps, "NO" to disable > > dumpdev="NO" > > # > > ### LACP config > > ifconfig_igb2="up" > > ifconfig_igb3="up" > > ifconfig_igb4="up" > > ifconfig_igb5="up" > > cloned_interfaces="lagg0" > > ifconfig_lagg0="laggproto lacp laggport igb2 laggport igb3 laggport > > igb4 > > laggport igb5 xxx.xx.x.xx netmask 255.255.248.0" > > ipvr_addrs_lagg0="xxx.xx.x.xx" > > defaultrouter="xxx.xx.x.xx" > > # > > ### Defaults for SSH, NTP, ZFS > > sshd_enable="YES" > > ntpd_enable="YES" > > zfs_enable="YES" > > # > > ## NFS Server > > rpcbind_enable="YES" > > nfs_server_enable="YES" > > mountd_flags="-r -l" > > nfsd_enable="YES" > > mountd_enable="YES" > > rpc_lockd_enable="NO" > > rpc_statd_enable="NO" > > nfs_server_flags="-u -t -n 128" > > nfsv4_server_enable="YES" > > nfsuserd_enable="YES" > > [/quote] > > > > Thanks in advance, > > -- > > Eric Browning > > _______________________________________________ > > freebsd-fs@freebsd.org mailing list > > http://lists.freebsd.org/mailman/listinfo/freebsd-fs > > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" > > > -- Eric Browning Systems Administrator 801-984-7623 Skaggs Catholic Center Juan Diego Catholic High School Saint John the Baptist Middle Saint John the Baptist Elementary From owner-freebsd-fs@FreeBSD.ORG Wed Aug 28 19:20:57 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id 5C4D9B7F; Wed, 28 Aug 2013 19:20:57 +0000 (UTC) (envelope-from mckusick@mckusick.com) Received: from chez.mckusick.com (chez.mckusick.com [IPv6:2001:5a8:4:7e72:4a5b:39ff:fe12:452]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id ED7882118; Wed, 28 Aug 2013 19:20:56 +0000 (UTC) Received: from chez.mckusick.com (localhost 
[127.0.0.1]) by chez.mckusick.com (8.14.3/8.14.3) with ESMTP id r7SJKqQJ021893; Wed, 28 Aug 2013 12:20:52 -0700 (PDT) (envelope-from mckusick@chez.mckusick.com)
Message-Id: <201308281920.r7SJKqQJ021893@chez.mckusick.com>
To: Ivan Voras
Subject: Re: Call fo comments - raising vfs.ufs.dirhash_reclaimage?
In-reply-to:
Date: Wed, 28 Aug 2013 12:20:52 -0700
From: Kirk McKusick
Cc: freebsd-fs
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Wed, 28 Aug 2013 19:20:57 -0000

I am in favor of your change to increase reclaimage from 5 to 60 seconds. As you note, the cost is small and, when applicable, the benefits are large. You are also correct that most system administrators do not know about these tuning knobs, so do not use them. We should do our best to auto-tune and set sensible defaults.

Kirk McKusick

From owner-freebsd-fs@FreeBSD.ORG Wed Aug 28 20:49:51 2013
Return-Path:
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id 3DDD4A7A; Wed, 28 Aug 2013 20:49:51 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca)
Received: from esa-jnhn.mail.uoguelph.ca (esa-jnhn.mail.uoguelph.ca [131.104.91.44]) by mx1.freebsd.org (Postfix) with ESMTP id CE7D526AF; Wed, 28 Aug 2013 20:49:50 +0000 (UTC)
X-IronPort-Anti-Spam-Filtered: true
X-IronPort-Anti-Spam-Result: AqQEAMxhHlKDaFve/2dsb2JhbABYAxaDJlGDJ7x8gTd0giQBAQEEAQEBICsgCxsYAgINGQIpAQkmBggHBAEYBASHYAynIpIzgSmMfAEFgQUkEAcRgleBMQOVKIN2kDSDPCAyegEIFyI
X-IronPort-AV: E=Sophos;i="4.89,977,1367985600"; d="scan'208";a="47951292"
Received: from muskoka.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca) ([131.104.91.222]) by esa-jnhn.mail.uoguelph.ca with ESMTP; 28 Aug 2013 16:49:43 -0400
Received: from
zcs3.mail.uoguelph.ca (localhost.localdomain [127.0.0.1]) by zcs3.mail.uoguelph.ca (Postfix) with ESMTP id 3B29CB4036; Wed, 28 Aug 2013 16:49:43 -0400 (EDT) Date: Wed, 28 Aug 2013 16:49:43 -0400 (EDT) From: Rick Macklem To: Eric Browning Message-ID: <1342658741.14983067.1377722983208.JavaMail.root@uoguelph.ca> In-Reply-To: Subject: Re: NFS on ZFS pure SSD pool MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-Originating-IP: [172.17.91.202] X-Mailer: Zimbra 7.2.1_GA_2790 (ZimbraWebClient - FF3.0 (Win)/7.2.1_GA_2790) Cc: freebsd-fs@freebsd.org, Garrett Wollman X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 28 Aug 2013 20:49:51 -0000 Eric Browning wrote: > > Rick, > > > Sam and I applied the patch (kernel now at r254983M) and set > vfs.nfsd.tcphighwater=5000 in sysctl.conf and my CPU is still > slammed. SHould I up it to 10000? > You can try. I have no insight into where this goes, since I can't produce the kind of server/load where it makes any difference. (I have single core i386 (P4 or similar) to test with and I don't use ZFS at all.) I've cc'd Garrett Wollman, since he runs rather large servers and may have some insight into appropriate tuning, etc. rick
From owner-freebsd-fs@FreeBSD.ORG Wed Aug 28 21:55:24 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id 958FAEC5 for ; Wed, 28 Aug 2013 21:55:24 +0000 (UTC) (envelope-from wollman@hergotha.csail.mit.edu) Received: from hergotha.csail.mit.edu (wollman-1-pt.tunnel.tserv4.nyc4.ipv6.he.net [IPv6:2001:470:1f06:ccb::2]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 20BD42A8A for ; Wed, 28 Aug 2013 21:55:23 +0000
(UTC) Received: from hergotha.csail.mit.edu (localhost [127.0.0.1]) by hergotha.csail.mit.edu (8.14.5/8.14.5) with ESMTP id r7SLtK53033771; Wed, 28 Aug 2013 17:55:20 -0400 (EDT) (envelope-from wollman@hergotha.csail.mit.edu) Received: (from wollman@localhost) by hergotha.csail.mit.edu (8.14.5/8.14.4/Submit) id r7SLtK17033768; Wed, 28 Aug 2013 17:55:20 -0400 (EDT) (envelope-from wollman) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-ID: <21022.29128.557471.157078@hergotha.csail.mit.edu> Date: Wed, 28 Aug 2013 17:55:20 -0400 From: Garrett Wollman To: Rick Macklem Subject: Re: NFS on ZFS pure SSD pool In-Reply-To: <1342658741.14983067.1377722983208.JavaMail.root@uoguelph.ca> References: <1342658741.14983067.1377722983208.JavaMail.root@uoguelph.ca> X-Mailer: VM 7.17 under 21.4 (patch 22) "Instant Classic" XEmacs Lucid X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.4.3 (hergotha.csail.mit.edu [127.0.0.1]); Wed, 28 Aug 2013 17:55:21 -0400 (EDT) X-Spam-Status: No, score=-1.0 required=5.0 tests=ALL_TRUSTED autolearn=disabled version=3.3.2 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on hergotha.csail.mit.edu Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 28 Aug 2013 21:55:24 -0000 < said: > Eric Browning wrote: >> Sam and I applied the patch (kernel now at r254983M) and set >> vfs.nfsd.tcphighwater=5000 in sysctl.conf and my CPU is still >> slammed. SHould I up it to 10000? >> > You can try. I have no insight into where this goes, since I can't > produce the kind of server/load where it makes any difference. (I have > single core i386 (P4 or similar) to test with and I don't use ZFS at all.) 
> I've cc'd Garrett Wollman, since he runs rather large servers and may > have some insight into appropriate tuning, etc. 10,000 is probably way too small. We run high-peformance servers with vfs.nfsd.tcphighwater set between 100k and 150k, and we crank vfs.nfsd.tcpcachetimeo down to five minutes or less. Just to give you an idea of how rarely this cache is actually hit: my two main production file servers have both been up for about three months now, and have answered billions of requests (enough for the 32-bit signed statistics counters to wrap). One server shows 63 hits, with a peak TCP cache size of 150k and the other shows zero, with a peak cache size of 64k. Another server, which serves scratch space, has been up for a little more than a month, and in nearly two billion accesses has yet to see a single cache hit (peak cache size 131k, which was actually hitting the configured limit, which I've since raised). -GAWollman From owner-freebsd-fs@FreeBSD.ORG Wed Aug 28 23:48:37 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id 140E48EC for ; Wed, 28 Aug 2013 23:48:37 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from esa-annu.net.uoguelph.ca (esa-annu.mail.uoguelph.ca [131.104.91.36]) by mx1.freebsd.org (Postfix) with ESMTP id CE7B92168 for ; Wed, 28 Aug 2013 23:48:36 +0000 (UTC) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AqAEAEmLHlKDaFve/2dsb2JhbABahA2DJ7x+gTh0giQBAQUjVhsYAgINGQJZBhOIAacUkiuBKYx9gQo0B4JogTEDlBeVO4M8IIEtQQ X-IronPort-AV: E=Sophos;i="4.89,978,1367985600"; d="scan'208";a="47227100" Received: from muskoka.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca) ([131.104.91.222]) by esa-annu.net.uoguelph.ca with ESMTP; 28 Aug 2013 19:48:29 -0400 Received: from zcs3.mail.uoguelph.ca (localhost.localdomain 
[127.0.0.1]) by zcs3.mail.uoguelph.ca (Postfix) with ESMTP id A1A53B3FAE; Wed, 28 Aug 2013 19:48:29 -0400 (EDT) Date: Wed, 28 Aug 2013 19:48:29 -0400 (EDT) From: Rick Macklem To: Garrett Wollman Message-ID: <461209820.15034260.1377733709648.JavaMail.root@uoguelph.ca> In-Reply-To: <21022.29128.557471.157078@hergotha.csail.mit.edu> Subject: Re: NFS on ZFS pure SSD pool MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-Originating-IP: [172.17.91.203] X-Mailer: Zimbra 7.2.1_GA_2790 (ZimbraWebClient - FF3.0 (Win)/7.2.1_GA_2790) Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 28 Aug 2013 23:48:37 -0000 Garrett Wollman wrote: > < said: > > > Eric Browning wrote: > >> Sam and I applied the patch (kernel now at r254983M) and set > >> vfs.nfsd.tcphighwater=5000 in sysctl.conf and my CPU is still > >> slammed. SHould I up it to 10000? > >> > > You can try. I have no insight into where this goes, since I can't > > produce the kind of server/load where it makes any difference. (I > > have > > single core i386 (P4 or similar) to test with and I don't use ZFS > > at all.) > > I've cc'd Garrett Wollman, since he runs rather large servers and > > may > > have some insight into appropriate tuning, etc. > > 10,000 is probably way too small. We run high-peformance servers > with > vfs.nfsd.tcphighwater set between 100k and 150k, and we crank > vfs.nfsd.tcpcachetimeo down to five minutes or less. > > Just to give you an idea of how rarely this cache is actually hit: my > two main production file servers have both been up for about three > months now, and have answered billions of requests (enough for the > 32-bit signed statistics counters to wrap). 
One server shows 63 > hits, > with a peak TCP cache size of 150k and the other shows zero, with a > peak cache size of 64k. Another server, which serves scratch space, > has been up for a little more than a month, and in nearly two billion > accesses has yet to see a single cache hit (peak cache size 131k, > which was actually hitting the configured limit, which I've since > raised). > > -GAWollman > Yes. The cache is only hit if a client is network partitioned for long enough that it does an RPC retry over TCP. Most clients only do this now (this behaviour is required for NFSv4) when the client establishes a new TCP connection after giving up on the old one. (How quickly this occurs will depend on the client, but I am not surprised it is rare in a well maintained LAN environment.) You should get your users to do their mounts over flaky WiFi links and such, in order to make better use of the cache;-) By the way Garrett, what do you have kern.ipc.nmbclusters set to, since cache entries will use mbuf clusters normally. 
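To illustrate why such a cache is hit so rarely, here is a minimal sketch of a duplicate request cache, assuming entries are keyed only by the RPC transaction id (xid). The real server cache presumably matches on more than that, and every name below (drc_init, drc_seen, DRC_SLOTS) is hypothetical rather than taken from the nfsd sources. A new request always misses and is recorded; only a retransmission of a request already seen produces a hit.

```c
#include <stdint.h>
#include <string.h>

#define DRC_SLOTS 64	/* tiny on purpose; the limits discussed above are far larger */

/* One remembered request (hypothetical layout). */
struct drc_entry {
	uint32_t xid;	/* RPC transaction id of the remembered request */
	int in_use;	/* non-zero once the slot holds a request */
};

struct drc {
	struct drc_entry slot[DRC_SLOTS];
	unsigned long hits;	/* retransmissions recognised */
	unsigned long misses;	/* first-time requests recorded */
};

void
drc_init(struct drc *d)
{
	memset(d, 0, sizeof(*d));
}

/*
 * Return 1 if this xid is already cached (a retry), otherwise record it
 * and return 0.  A colliding xid simply overwrites the slot.
 */
int
drc_seen(struct drc *d, uint32_t xid)
{
	struct drc_entry *e = &d->slot[xid % DRC_SLOTS];

	if (e->in_use && e->xid == xid) {
		d->hits++;
		return (1);
	}
	e->xid = xid;
	e->in_use = 1;
	d->misses++;
	return (0);
}
```

With clients that never retransmit, every call to drc_seen() in this sketch is a miss, which matches the near-zero hit counts reported above.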
And Garrett, thanks for your input, rick From owner-freebsd-fs@FreeBSD.ORG Thu Aug 29 00:03:00 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id B9A38E86 for ; Thu, 29 Aug 2013 00:03:00 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from esa-jnhn.mail.uoguelph.ca (esa-jnhn.mail.uoguelph.ca [131.104.91.44]) by mx1.freebsd.org (Postfix) with ESMTP id 814FA2252 for ; Thu, 29 Aug 2013 00:03:00 +0000 (UTC) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AqIEAJOOHlKDaFve/2dsb2JhbABaFoMmUYMnvH6BOHSCTgSBBwINGQJfiBQMmASOf5IrgSmMcoEVgyOBMQOZHpA0gzwggTU5 X-IronPort-AV: E=Sophos;i="4.89,978,1367985600"; d="scan'208";a="47973851" Received: from muskoka.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca) ([131.104.91.222]) by esa-jnhn.mail.uoguelph.ca with ESMTP; 28 Aug 2013 20:02:59 -0400 Received: from zcs3.mail.uoguelph.ca (localhost.localdomain [127.0.0.1]) by zcs3.mail.uoguelph.ca (Postfix) with ESMTP id 7A07BB3F1B for ; Wed, 28 Aug 2013 20:02:59 -0400 (EDT) Date: Wed, 28 Aug 2013 20:02:59 -0400 (EDT) From: Rick Macklem To: freebsd-fs Message-ID: <1332572251.15040105.1377734579493.JavaMail.root@uoguelph.ca> Subject: rpc.lockd kernel RPC over UDP patch for testing/review MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-Originating-IP: [172.17.91.201] X-Mailer: Zimbra 7.2.1_GA_2790 (ZimbraWebClient - FF3.0 (Win)/7.2.1_GA_2790) X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 29 Aug 2013 00:03:00 -0000 Hi, Doug White posted this to me via email some time ago (I hope he doesn't mind me reposting it here): > First, we have a installed client system 
doing heavy NFS lock traffic that occasionally > experiences lockd lockups that require a system reboot to clear. Diagnosis of > the most recent hang identified corruption of one of the tracking variables > (cu->cu_send specifically) in the congestion control in clnt_dg_call() as the culprit. > Since lockd only uses one thread, no congestion control is really necessary. We are > going to make a local patch to avoid the if() that leads to the msleep() if > cu->threads = 1 so we don't run into that again, though the corruption of > cu_send is still a bit troubling. The corruption might stem from repeated retries allowing > cu_send to grow without bound, or some other bizarre code path that causes underflow. After inspecting the code, I found two places where cu_sent (Doug called it cu_send just to try and confuse me. It worked for a while;-) wasn't incremented when a request was re-inserted in the send queue. Since it is always decremented when a request is dequeued, I think this could have resulted in a bogus cu_sent value. The simple patch at: http://people.freebsd.org/~rmacklem/rpcudp.patch adds increments for cu_sent for these two places. If anyone is using rpc.lockd and can test/review this patch, it would be appreciated. 
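The accounting problem described above can be sketched roughly as follows. This is not the clnt_dg.c code and not the patch itself; the struct and function names are hypothetical, and the only assumption carried over from the description is that the counter is always decremented on dequeue, so a re-insert that skips the increment lets the counter drift by one per retry.

```c
#include <assert.h>

/* Hypothetical stand-in for the per-socket state, not the real structure. */
struct chan {
	int sent;	/* requests counted against the congestion window */
	int queued;	/* requests actually sitting in the send queue */
};

void chan_enqueue(struct chan *c) { c->queued++; c->sent++; }
void chan_dequeue(struct chan *c) { c->queued--; c->sent--; }

/* Re-insert without touching the counter: the suspected bug. */
void chan_requeue_unbalanced(struct chan *c) { c->queued++; }

/* Re-insert with the matching increment: the shape of the proposed fix. */
void chan_requeue_balanced(struct chan *c) { c->queued++; c->sent++; }

/*
 * Send one request, retry it `retries` times, then complete it.
 * Returns the value left in the counter, which should be zero.
 */
int
sent_after_retries(int retries, int balanced)
{
	struct chan c = { 0, 0 };
	int i;

	chan_enqueue(&c);
	for (i = 0; i < retries; i++) {
		chan_dequeue(&c);
		if (balanced)
			chan_requeue_balanced(&c);
		else
			chan_requeue_unbalanced(&c);
	}
	chan_dequeue(&c);		/* final reply arrives */
	assert(c.queued == 0);		/* the queue itself is always consistent */
	return (c.sent);
}
```

In this sketch the queue length stays correct either way; only the counter goes wrong, which is consistent with a bogus value that shows up only after many retries.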
Thanks, rick From owner-freebsd-fs@FreeBSD.ORG Thu Aug 29 00:15:28 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id BB7742B2; Thu, 29 Aug 2013 00:15:28 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from esa-jnhn.mail.uoguelph.ca (esa-jnhn.mail.uoguelph.ca [131.104.91.44]) by mx1.freebsd.org (Postfix) with ESMTP id 6F142230D; Thu, 29 Aug 2013 00:15:27 +0000 (UTC) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AqIEABqSHlKDaFve/2dsb2JhbABahA2DJ7txgQ2BOXSCTgRSNQINGQJfiBSnCZIrgSmOBzSCb4ExA6lSgzwggW4 X-IronPort-AV: E=Sophos;i="4.89,978,1367985600"; d="scan'208";a="47975341" Received: from muskoka.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca) ([131.104.91.222]) by esa-jnhn.mail.uoguelph.ca with ESMTP; 28 Aug 2013 20:15:27 -0400 Received: from zcs3.mail.uoguelph.ca (localhost.localdomain [127.0.0.1]) by zcs3.mail.uoguelph.ca (Postfix) with ESMTP id 4F9B8B4042; Wed, 28 Aug 2013 20:15:27 -0400 (EDT) Date: Wed, 28 Aug 2013 20:15:27 -0400 (EDT) From: Rick Macklem To: freebsd-fs Message-ID: <2057976640.15042430.1377735327317.JavaMail.root@uoguelph.ca> Subject: fixing "umount -f" for the NFS client MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-Originating-IP: [172.17.91.201] X-Mailer: Zimbra 7.2.1_GA_2790 (ZimbraWebClient - FF3.0 (Win)/7.2.1_GA_2790) Cc: Kostik Belousov X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 29 Aug 2013 00:15:28 -0000 I've been doing a little more testing of "umount -f" for NFS mounts and they seem to be working unless some other process/thread has busied the file system via vfs_busy(). 
Unfortunately, it is pretty easy to vfs_busy() the file system by using a command like "df" that is stuck on the unresponsive NFS server. The problem seems to be that dounmount() msleep()s while mnt_lockref != 0 before calling VFS_UNMOUNT(). If some call into the NFS client was done before this while (mp->mnt_lockref) loop with msleep() in it, it can easily kill off RPCs in progress. (It currently does this in nfs_unmount() using the newnfs_nmcancelreqs() call. In summary: - Would it be appropriate to add a new vfs_XXX method that dounmount() would call before the while() loop for the forced dismount case? (The default would be a no-op and I have no idea if any file system other than NFS would have a use for it?) Alternately, there could be a function pointer set non-NULL that would specifically be used by the NFS client for this. This would avoid adding a vfs_XXX() method, but would mean an NFS specific call ends up in the generic dounmount() code. Anyone have comments on this? Thanks, rick From owner-freebsd-fs@FreeBSD.ORG Thu Aug 29 00:56:27 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id 2099278 for ; Thu, 29 Aug 2013 00:56:27 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from kib.kiev.ua (kib.kiev.ua [IPv6:2001:470:d5e7:1::1]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id B38A32535 for ; Thu, 29 Aug 2013 00:56:26 +0000 (UTC) Received: from tom.home (kostik@localhost [127.0.0.1]) by kib.kiev.ua (8.14.7/8.14.7) with ESMTP id r7T0uGwA044085; Thu, 29 Aug 2013 03:56:16 +0300 (EEST) (envelope-from kostikbel@gmail.com) DKIM-Filter: OpenDKIM Filter v2.8.3 kib.kiev.ua r7T0uGwA044085 Received: (from kostik@localhost) by tom.home (8.14.7/8.14.7/Submit) id 
r7T0uGas044084; Thu, 29 Aug 2013 03:56:16 +0300 (EEST) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: tom.home: kostik set sender to kostikbel@gmail.com using -f Date: Thu, 29 Aug 2013 03:56:16 +0300 From: Konstantin Belousov To: Rick Macklem Subject: Re: fixing "umount -f" for the NFS client Message-ID: <20130829005616.GH4972@kib.kiev.ua> References: <2057976640.15042430.1377735327317.JavaMail.root@uoguelph.ca> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="huwNgOI5TI5AwA3T" Content-Disposition: inline In-Reply-To: <2057976640.15042430.1377735327317.JavaMail.root@uoguelph.ca> User-Agent: Mutt/1.5.21 (2010-09-15) X-Spam-Status: No, score=-2.0 required=5.0 tests=ALL_TRUSTED,BAYES_00, DKIM_ADSP_CUSTOM_MED,FREEMAIL_FROM,NML_ADSP_CUSTOM_MED autolearn=no version=3.3.2 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on tom.home Cc: freebsd-fs X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 29 Aug 2013 00:56:27 -0000 --huwNgOI5TI5AwA3T Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: 7bit On Wed, Aug 28, 2013 at 08:15:27PM -0400, Rick Macklem wrote: > I've been doing a little more testing of "umount -f" for NFS > mounts and they seem to be working unless some other process/thread > has busied the file system via vfs_busy(). > > Unfortunately, it is pretty easy to vfs_busy() the file system > by using a command like "df" that is stuck on the unresponsive > NFS server. > > The problem seems to be that dounmount() msleep()s while > mnt_lockref != 0 before calling VFS_UNMOUNT(). > > If some call into the NFS client was done before this > while (mp->mnt_lockref) loop with msleep() in it, it > can easily kill off RPCs in progress.
(It currently > does this in nfs_unmount() using the newnfs_nmcancelreqs() > call. > > In summary: > - Would it be appropriate to add a new vfs_XXX method that > dounmount() would call before the while() loop for the > forced dismount case? > (The default would be a no-op and I have no idea if any > file system other than NFS would have a use for it?) > Alternately, there could be a function pointer set non-NULL > that would specifically be used by the NFS client for this. > This would avoid adding a vfs_XXX() method, but would mean > an NFS specific call ends up in the generic dounmount() code. > > Anyone have comments on this? > Yes, I do. I agree with adding the pre-unmount vfs method. This seems to be the cleanest solution possible. --huwNgOI5TI5AwA3T Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.21 (FreeBSD) iQIcBAEBAgAGBQJSHpwvAAoJEJDCuSvBvK1Bu9EP/i4ulJS03WTiV8Nu2rQJWFgM EXOJco5g/AeUjDIi0xo+Wk1BTfErnHIWFv7erCeDULtLEu4rj5zq4vOSbqPrqW9w mgD4J6FOuqda4rdMtWZYq/lwaXYGiKcu6frxtVxsag1yhZwjaZzavyBK7EburagX utVEOcizXXQtiL0ZgRoeyVejVAyJYt7xz/032+nghc9WKp0Zc0fyC2TWOiIsmHRq h09J9iYIL0mxj2r7mM6GiHaclPrl9q2e/wYV1Civ117NYI/E2G/z4CwW9/gtpFnM qnsaNHWbKJFfJGwMpfG4icLgwXbBgsmH0TI4lBKx9yk/sRaIDNmLrvpBwoQxZADH 8JsU+K8sV17ESnHnSKNWFvwWi69+BYl3eNE1EGo4nYZOD1AusP951xt3hU9Qd8d+ uTsD6KMnjlOS4L6Gnii1Gp4N00eaJHjO21Zj++CtYhfF24/i73zrutzD2axo2hAf tK9o37V5/Urnoa1pDuJLn/CI51XdbnesVma7mkQlTl6rDFdF8LPxfE/1QarxDl6w GVw+4oLYpA3RREiKvre8Te9tN6lY+9CPV87JOefb/Fo9aS/bwphcdyNxCmh4FnAk baLvf+27PzIqMBA2EV/pOAFCdFBWPH1QwSjeAIbFUGVjtBVWftWWxDQT72n2IgtV uICNAIQEMfYfXUcDh3at =RcCt -----END PGP SIGNATURE----- --huwNgOI5TI5AwA3T-- From owner-freebsd-fs@FreeBSD.ORG Thu Aug 29 01:13:51 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id
A3A9A589 for ; Thu, 29 Aug 2013 01:13:51 +0000 (UTC) (envelope-from sfourman@gmail.com) Received: from mail-ve0-x230.google.com (mail-ve0-x230.google.com [IPv6:2607:f8b0:400c:c01::230]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 61E912648 for ; Thu, 29 Aug 2013 01:13:51 +0000 (UTC) Received: by mail-ve0-f176.google.com with SMTP id b10so4685159vea.21 for ; Wed, 28 Aug 2013 18:13:50 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=sZbP02pFrd8v5S4QlI2OSjsW/M0tQOSmgd4T5J2ODco=; b=s2+0Pf8WNachucdFPRivBbyaqBghFxq0Gl9TFWfnYKK10XNC4s1pOD5n/Neje9rgnO JJcsm0fB7AMs8WnE/bVvurPCTW0jKS5mcj1H0zlCsbHxoqRIMapUzaQLCw695/0vRA9X P3fRr6B2JXVJ1dp9dRlZ5l5+0CDhAo7vv+vPiMu6BEiponWoeUp+SiZzX+6zIc2KkV9M 1fUbPUDyEUWkC1rftf0s3j0smkhDaU5GytlviAdyWiLWvaXZKYttrM0idQnIhHUWPP4R ykB5jvIQgEm9Pevo/2RmTg7w0FRk6LSZEMQiwqq7NpTJsoKz5D9c0FZjn7rBvN000Zqe psyA== MIME-Version: 1.0 X-Received: by 10.58.100.234 with SMTP id fb10mr537527veb.5.1377738830497; Wed, 28 Aug 2013 18:13:50 -0700 (PDT) Received: by 10.220.96.78 with HTTP; Wed, 28 Aug 2013 18:13:50 -0700 (PDT) In-Reply-To: References: <2008996797.14358576.1377631792358.JavaMail.root@uoguelph.ca> Date: Wed, 28 Aug 2013 21:13:50 -0400 Message-ID: Subject: Re: NFS on ZFS pure SSD pool From: "Sam Fourman Jr." 
To: Eric Browning Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.14 Cc: FreeBSD FS X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 29 Aug 2013 01:13:51 -0000 On Wed, Aug 28, 2013 at 2:27 PM, Eric Browning < ericbrowning@skaggscatholiccenter.org> wrote: > Rick, > > Sam and I applied the patch (kernel now at r254983M) and set > vfs.nfsd.tcphighwater=5000 > in sysctl.conf and my CPU is still slammed. SHould I up it to 10000? > > > Hello list, I am helping Eric debug and test this situation as much as I can. To clarify and recap, here is the situation: This is a production setting in a school with 200+ students using a mix of systems, the primary client being OS X 10.8 and the primary function being NFS. From what I can see there should be plenty of disk I/O; these are Intel SSD disks. The server is running FreeBSD 9-STABLE r254983 (we patched it last night) with this patch: http://people.freebsd.org/~rmacklem/drc4-stable9.patch Here is a full dmesg for reference (it states FreeBSD 9.1, but we have since upgraded and applied the above patch): https://gist.github.com/sfourman/6373059 The main problem is that we need better performance from NFS, but the server appears to be starved for CPU cycles. With only a few clients the server is lightning fast, but with 25 users logging in this morning (students in class) the server went right to 1200% CPU load, with about 300% more going to "intr", and it pretty much stayed there all day until they logged out between classes. So that works out to somewhere between 2 and 4 users per core during today's classes. Different settings for vfs.nfsd.tcphighwater, ranging from 5,000 up to 50,000, were tested while a load was present, but the processor load didn't change.
Garrett stated that he uses values upwards of 100,000; this can be tested tomorrow. It would be helpful if we could get some direction on other things we might try tomorrow. One idea: the server has several igb Ethernet interfaces with 8 queues per interface. Is it worth forcing the interfaces down to one queue? Is NFS even set up to understand multi-queue network devices, or doesn't it matter? Any thoughts are appreciated. -- Sam Fourman Jr. From owner-freebsd-fs@FreeBSD.ORG Thu Aug 29 01:31:33 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id 041CC9E7 for ; Thu, 29 Aug 2013 01:31:33 +0000 (UTC) (envelope-from wollman@hergotha.csail.mit.edu) Received: from hergotha.csail.mit.edu (wollman-1-pt.tunnel.tserv4.nyc4.ipv6.he.net [IPv6:2001:470:1f06:ccb::2]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 96751271C for ; Thu, 29 Aug 2013 01:31:32 +0000 (UTC) Received: from hergotha.csail.mit.edu (localhost [127.0.0.1]) by hergotha.csail.mit.edu (8.14.5/8.14.5) with ESMTP id r7T1VUKQ036585; Wed, 28 Aug 2013 21:31:30 -0400 (EDT) (envelope-from wollman@hergotha.csail.mit.edu) Received: (from wollman@localhost) by hergotha.csail.mit.edu (8.14.5/8.14.4/Submit) id r7T1VUSk036582; Wed, 28 Aug 2013 21:31:30 -0400 (EDT) (envelope-from wollman) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-ID: <21022.42098.291440.900505@hergotha.csail.mit.edu> Date: Wed, 28 Aug 2013 21:31:30 -0400 From: Garrett Wollman To: Rick Macklem Subject: Re: NFS on ZFS pure SSD pool In-Reply-To: <461209820.15034260.1377733709648.JavaMail.root@uoguelph.ca> References: <21022.29128.557471.157078@hergotha.csail.mit.edu>
<461209820.15034260.1377733709648.JavaMail.root@uoguelph.ca> X-Mailer: VM 7.17 under 21.4 (patch 22) "Instant Classic" XEmacs Lucid X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.4.3 (hergotha.csail.mit.edu [127.0.0.1]); Wed, 28 Aug 2013 21:31:30 -0400 (EDT) X-Spam-Status: No, score=-1.0 required=5.0 tests=ALL_TRUSTED autolearn=disabled version=3.3.2 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on hergotha.csail.mit.edu Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 29 Aug 2013 01:31:33 -0000 < said: > You should get your users to do their mounts over flaky WiFi links > and such, in order to make better use of the cache;-) We don't support NFS use by such clients -- it's purely for compute cluster type applications. Anything that can use AFS is supposed to use AFS. > By the way Garrett, what do you have kern.ipc.nmbclusters set to, > since cache entries will use mbuf clusters normally. I have it at 2**20, which is actually only important because it causes kern.ipc.nmbjumbop to be set as a side effect. We also set maxusers (to match the new calculation in 10-current) so that other kernel data structures will be sized appropriately. 
This server's pretty idle right now:

36907/150098/187005 mbufs in use (current/cache/total)
948/22794/23742/1048576 mbuf clusters in use (current/cache/total/max)
0/4352 mbuf+clusters out of packet secondary zone in use (current/cache)
24583/36548/61131/524288 4k (page size) jumbo clusters in use (current/cache/total/max)
0/0/0/262144 9k jumbo clusters in use (current/cache/total/max)
0/0/0/131072 16k jumbo clusters in use (current/cache/total/max)
109454K/229304K/338759K bytes allocated to network (current/cache/total)

On a machine without jumbo frames, it looks like this:

10829/230836/241665 mbufs in use (current/cache/total)
8268/93146/101414/1048576 mbuf clusters in use (current/cache/total/max)
8190/80641 mbuf+clusters out of packet secondary zone in use (current/cache)
0/1993/1993/524288 4k (page size) jumbo clusters in use (current/cache/total/max)
0/0/0/262144 9k jumbo clusters in use (current/cache/total/max)
0/0/0/131072 16k jumbo clusters in use (current/cache/total/max)
19243K/251973K/271216K bytes allocated to network (current/cache/total)

-GAWollman From owner-freebsd-fs@FreeBSD.ORG Thu Aug 29 03:50:01 2013 Return-Path: Delivered-To: freebsd-fs@smarthost.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id D346EDE1 for ; Thu, 29 Aug 2013 03:50:01 +0000 (UTC) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:1900:2254:206c::16:87]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id A588E2EFC for ; Thu, 29 Aug 2013 03:50:01 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.7/8.14.7) with ESMTP id r7T3o1Gt000669 for ; Thu, 29 Aug 2013 03:50:01 GMT (envelope-from
gnats@freefall.freebsd.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.7/8.14.7/Submit) id r7T3o1rd000668; Thu, 29 Aug 2013 03:50:01 GMT (envelope-from gnats) Date: Thu, 29 Aug 2013 03:50:01 GMT Message-Id: <201308290350.r7T3o1rd000668@freefall.freebsd.org> To: freebsd-fs@FreeBSD.org Cc: From: Brent Welch Subject: Re: kern/122380: [ffs] ffs_valloc:dup alloc (Soekris 4801/7.0/USB Flash mem) X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list Reply-To: Brent Welch List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 29 Aug 2013 03:50:01 -0000 The following reply was made to PR kern/122380; it has been noted by GNATS. From: Brent Welch To: bug-followup@FreeBSD.org, freebsd-current-panic@t-b-o-h.net Cc: Subject: Re: kern/122380: [ffs] ffs_valloc:dup alloc (Soekris 4801/7.0/USB Flash mem) Date: Wed, 28 Aug 2013 20:45:22 -0700 --e89a8ff1ceb6858c3a04e50defdd Content-Type: text/plain; charset=ISO-8859-1 I'm hitting this frequently on 9.1, freshly installed from the ISO image. I have many ffs partitions (40) and am running a simple load generation script against each mount point. The load generator just does a random collection of create, read, write, hardlink, trunc, delete from a script. There is only a single application thread running against each mount point, but there are 40 processes running against 40 different mount points. This has happened both after a hard crash + FSCK, and fairly soon after a newfs. 36 of the devices are SAS drives. 4 are SATA SSD. I'm not sure which mount point is involved, yet. Brent Welch welch@acm.org --e89a8ff1ceb6858c3a04e50defdd Content-Type: text/html; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable
--e89a8ff1ceb6858c3a04e50defdd-- From owner-freebsd-fs@FreeBSD.ORG Thu Aug 29 07:10:42 2013 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id 38D9E2A0 for ; Thu, 29 Aug 2013 07:10:42 +0000 (UTC) (envelope-from supportsobaka@mail.ru) Received: from fallback4.mail.ru (fallback4.mail.ru [94.100.176.42]) by mx1.freebsd.org (Postfix) with ESMTP id ACEEF2A22 for ; Thu, 29 Aug 2013 07:10:41 +0000 (UTC) Received: from f352.i.mail.ru (f352.i.mail.ru [217.69.140.248]) by fallback4.mail.ru (mPOP.Fallback_MX) with ESMTP id 12D031BC67B for ; Thu, 29 Aug 2013 11:08:21 +0400 (MSK) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=mail.ru; s=mail2; h=Content-Type:Message-ID:Reply-To:Date:Mime-Version:Subject:To:From; bh=iiUZrgzLHhuPQ+W/7I6o+7AUYHTbZoGM+0BhYByvibA=; b=Jk04z2KdNHvw5mP41FluywuK7wHPPhKSz2cWUpAnV9X0ZTbxVTdInol4pBgLuek+lUmBT0nY6U4EF6FjU0JUscK0aId7tddvmFSu1TZdD+QUfqMARVqYE7dkZHJ4GPCO3AdUdHVb41wLtFGISXCouX2RpRGGHgzyHsE/YRSgZJA=; Received: from mail by f352.i.mail.ru with local (envelope-from ) id 1VEwKz-00055L-3f for freebsd-fs@FreeBSD.org; Thu, 29 Aug 2013 11:08:13 +0400 Received: from [194.215.121.251] by e.mail.ru with HTTP; Thu, 29 Aug 2013 11:08:13 +0400 From: supportsobaka@mail.ru To: freebsd-fs@FreeBSD.org Subject: Install OS on HAST Mime-Version: 1.0 X-Mailer: Mail.Ru Mailer 1.0 X-Originating-IP: [194.215.121.251] Date: Thu, 29 Aug 2013 11:08:13 +0400 X-Priority: 3 (Normal) Message-ID: <1377760093.241716063@f352.i.mail.ru> X-Mras: Ok Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-Content-Filtered-By: Mailman/MimeDel 2.1.14 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list Reply-To: supportsobaka@mail.ru List-Id: Filesystems List-Unsubscribe: , List-Archive:
List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 29 Aug 2013 07:10:42 -0000 Hi, The idea of HAST is great, but I'm not exactly understood how it could serve as the real cluster of two machines. Cluster of storage is different story - it require to have at least two disks on each machine. Also OS on each machine must be synchronized to be able to work similar in the event of failure of master node. That's not so elegant solution and not so demanded as cluster of two servers. Is there any possibility to install OS on the hast device right from the beginning of installation to the machine with one HDD: e.g. I have boot-CD or I have booted from network and start installation with bsdinstall, then I go to shell for disk configuration. At that moment can I create /dev/hast and start install OS on this? What should be taken into consideration then? Another problem. HAST is not suitable for databases, e.g. MySQL. If I have standalone all-in-one server (mail+mysql on the same machine) and I want to start this on HAST (see above), how can I exclude some partitions/folders from sync? I guess there is no such an option now. Do you plan to add or it's impossible due to HAST nature? Regards Oleg From owner-freebsd-fs@FreeBSD.ORG Thu Aug 29 08:27:32 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id 00254875 for ; Thu, 29 Aug 2013 08:27:31 +0000 (UTC)
(envelope-from maurizio.vairani@cloverinformatica.it) Received: from smtpdg3.aruba.it (smtpdg221.aruba.it [62.149.158.221]) by mx1.freebsd.org (Postfix) with ESMTP id 5CA7A2EC0 for ; Thu, 29 Aug 2013 08:27:31 +0000 (UTC) Received: from cloverinformatica.it ([188.10.129.202]) by smtpcmd01.ad.aruba.it with bizsmtp id JYTT1m01T4N8xN401YTUij; Thu, 29 Aug 2013 10:27:28 +0200 Received: from [192.168.0.81] (ASUS-TERMINATOR [192.168.0.81]) by cloverinformatica.it (Postfix) with ESMTP id 37E729AB4; Thu, 29 Aug 2013 10:27:28 +0200 (CEST) Message-ID: <521F05F0.4090607@cloverinformatica.it> Date: Thu, 29 Aug 2013 10:27:28 +0200 From: Maurizio Vairani User-Agent: Mozilla/5.0 (Windows NT 5.1; rv:7.0.1) Gecko/20110929 Thunderbird/7.0.1 MIME-Version: 1.0 To: freebsd-fs@freebsd.org, freebsd-stable@FreeBSD.org Subject: Boot problem if a ZFS log device is missing Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 29 Aug 2013 08:27:32 -0000 Hi all, I am using a USB memory stick as the cache and log devices for an HDD ZFS pool named tank0:

$ zpool status -v tank0
  pool: tank0
 state: ONLINE
  scan: scrub repaired 0 in 7h19m with 0 errors on Tue Jul 30 06:11:23 2013
config:

        NAME         STATE     READ WRITE CKSUM
        tank0        ONLINE       0     0     0
          ada0s1d    ONLINE       0     0     0
        logs
          gpt/SLOG   ONLINE       0     0     0
        cache
          gpt/L2ARC  ONLINE       0     0     0

errors: No known data errors

If I remove the stick before booting the laptop (a Compaq Presario), it does not boot: the error message "Mounting from zfs:tank0 failed with error 6" appears and the mountroot> prompt is displayed. I need to reinsert the stick before restarting the laptop for the boot process to complete. It is possible to boot the laptop without the stick if I remove the log device with the command: # zpool remove tank0 gpt/SLOG before switching it off.
The pool will works in degraded mode because the cache device is missing, but it works. I am able to boot the PC without a cache device but not without a log device. Why ? The laptop is updated to 9.2-PRERELEASE r254783. Regards Maurizio From owner-freebsd-fs@FreeBSD.ORG Thu Aug 29 09:02:33 2013 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id 401D4114; Thu, 29 Aug 2013 09:02:33 +0000 (UTC) (envelope-from avg@FreeBSD.org) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id 57E7720F0; Thu, 29 Aug 2013 09:02:31 +0000 (UTC) Received: from porto.starpoint.kiev.ua (porto-e.starpoint.kiev.ua [212.40.38.100]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id MAA23559; Thu, 29 Aug 2013 12:02:28 +0300 (EEST) (envelope-from avg@FreeBSD.org) Received: from localhost ([127.0.0.1]) by porto.starpoint.kiev.ua with esmtp (Exim 4.34 (FreeBSD)) id 1VEy7Y-0002kk-BH; Thu, 29 Aug 2013 12:02:28 +0300 Message-ID: <521F0DEB.20408@FreeBSD.org> Date: Thu, 29 Aug 2013 12:01:31 +0300 From: Andriy Gapon User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:17.0) Gecko/20130810 Thunderbird/17.0.8 MIME-Version: 1.0 To: Maurizio Vairani Subject: Re: Boot problem if a ZFS log device is missing References: <521F05F0.4090607@cloverinformatica.it> In-Reply-To: <521F05F0.4090607@cloverinformatica.it> X-Enigmail-Version: 1.5.1 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: freebsd-fs@FreeBSD.org, freebsd-stable@FreeBSD.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 29 Aug 2013 09:02:33 -0000 on 29/08/2013 11:27 Maurizio Vairani said 
the following:
>
> I am able to boot the PC without a cache device but not without a log device. Why ?

The log could potentially contain uncommitted entries. Without the log device there is no way of knowing whether it did or not. And if it did, then the pool is in an inconsistent state without the log device and so it cannot be imported. The cache is not persistent, so nothing is needed from it at boot.

-- 
Andriy Gapon

From owner-freebsd-fs@FreeBSD.ORG Thu Aug 29 10:00:42 2013 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id 1C1EC511 for ; Thu, 29 Aug 2013 10:00:42 +0000 (UTC) (envelope-from avg@FreeBSD.org) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id 679AE255F for ; Thu, 29 Aug 2013 10:00:41 +0000 (UTC) Received: from porto.starpoint.kiev.ua (porto-e.starpoint.kiev.ua [212.40.38.100]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id NAA24663; Thu, 29 Aug 2013 13:00:38 +0300 (EEST) (envelope-from avg@FreeBSD.org) Received: from localhost ([127.0.0.1]) by porto.starpoint.kiev.ua with esmtp (Exim 4.34 (FreeBSD)) id 1VEz1q-0002nE-Jw; Thu, 29 Aug 2013 13:00:38 +0300 Message-ID: <521F1B74.7020402@FreeBSD.org> Date: Thu, 29 Aug 2013 12:59:16 +0300 From: Andriy Gapon User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:17.0) Gecko/20130810 Thunderbird/17.0.8 MIME-Version: 1.0 To: Kuang-che Wu Subject: Re: zfs dead lock References: In-Reply-To: X-Enigmail-Version: 1.5.1 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: freebsd-fs@FreeBSD.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 29 Aug 2013
10:00:42 -0000 on 15/08/2013 21:36 Kuang-che Wu said the following: > I suspect I encountered zfs deadlock yesterday.(According to > https://wiki.freebsd.org/AvgZfsDeadlockDebug) > Following is the report. > > I have two zpool -- one is named zroot and the other is zdata. They are > both raidz2 on same set of 6 disks. > While zroot works well, any processes hang if they touch zdata. > And those processes hang forever (at least more than half hours). > > Before zdata hang, there are about 10 processes busy read/write to zdata. > Their read/write patterns are linear read/write lots of large files. > (download hundred MB files and calculate checksum) > Average zpool write speed about 10MB/s. > > In output of procstat, I found 8 processes have zio_wait + zio_done in > stacks and this is very > likely a ZFS deadlock according to the wiki page. > > > System: FreeBSD kcwu.csie.org 10.0-CURRENT FreeBSD 10.0-CURRENT #5 b37d7ce: > Tue Jul 30 21:25:46 CST 2013 root@kcwu.csie.org:/usr/obj/usr/src/sys/DESKTOP > amd64 > > > > procstat log and boot dmesg are attached. > > Please let me know if you need more information. Thank you very much for the report! And sorry for the delay. Do you still have this locked up system? Or are you able to reproduce the lock up? I would like to examine some things with kgdb. 
-- Andriy Gapon From owner-freebsd-fs@FreeBSD.ORG Thu Aug 29 10:44:59 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id 8B09D739 for ; Thu, 29 Aug 2013 10:44:59 +0000 (UTC) (envelope-from kraduk@gmail.com) Received: from mail-qc0-x22f.google.com (mail-qc0-x22f.google.com [IPv6:2607:f8b0:400d:c01::22f]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 4BBCA29CC for ; Thu, 29 Aug 2013 10:44:59 +0000 (UTC) Received: by mail-qc0-f175.google.com with SMTP id m4so113336qcy.6 for ; Thu, 29 Aug 2013 03:44:58 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=yzmrxu734rEq/ZVezNwsE2HAT5dPYuDAC1SErtK0USY=; b=yyb3dK0/Fq1uA1LFYF6VutyIfhl1Bk4SU372zu4LTrp9johHoP86Gv4iMLLNsY41QL YOibEIOrCrCFxdjHpKqwbfQzHZY0NlfP0gajxDG54APu9Io+IlaUVh+vX0yUE8h2wOe2 1peJOj+AVZeMfO6042Qd0HRHtGbi3O0TY831OqExtp0LMfrNOCFjDkK77zWKopH1uHkB Ctg2sgIoRy+PQI2lYxhntByhkBeoVnX5B/RawaUYIg7tujxa0M7QXhcupSdFlHbcQE6W 29UaXDUWo9kLzPNHvbiNBfN6cE5OW4PydQtmHe0PHbZjt1BWiMNt8LVtnQigaK+xygNl Nvdg== MIME-Version: 1.0 X-Received: by 10.49.35.177 with SMTP id i17mr2853148qej.77.1377773098322; Thu, 29 Aug 2013 03:44:58 -0700 (PDT) Received: by 10.224.189.195 with HTTP; Thu, 29 Aug 2013 03:44:58 -0700 (PDT) In-Reply-To: <20130828142546.GA16523@neutralgood.org> References: <20130827181127.24761@relay.ibs.dn.ua> <20130827172727.GA73465@neutralgood.org> <20130828112444.5123@relay.ibs.dn.ua> <20130828142546.GA16523@neutralgood.org> Date: Thu, 29 Aug 2013 11:44:58 +0100 Message-ID: Subject: Re: exictent zroot re-alignment to 4K From: krad To: kpneal@pobox.com Content-Type: text/plain; 
charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.14 Cc: FreeBSD FS X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 29 Aug 2013 10:44:59 -0000 I havent tested that myself does that mean you can have ashift=9 for your 1st raidz(23)* vdev and ashift=12 for the 2nd in a striped raidz(23)* setup? If so does the same apply in a mirror? On 28 August 2013 15:25, wrote: > On Wed, Aug 28, 2013 at 11:47:51AM +0100, krad wrote: > > you cant mirrir if your ashift=9 as its a pool level thing > > Well, strictly speaking the ashift is per-vdev. > > Of course, if you only have a single disk and therefore only have > a single vdev then the distinction doesn't matter. It does matter > when adding new vdevs to an existing pool. > > > On 28 August 2013 09:24, Zeus Panchenko <[1]zeus@ibs.dn.ua> wrote: > > > > krad <[2]kraduk@gmail.com> wrote: > > > is it just an alignment problem or are you ashift=9 (zdb zroot | > grep > > > ashift) as well? If you are already ashift=12, > > > > yes, I am ashift=9 and it is the very cause I decided to re-allocate > > the > > partition ... > -- > Kevin P. Neal http://www.pobox.com/~kpn/ > > "I like being on The Daily Show." 
- Kermit the Frog, Feb 13 2001 > From owner-freebsd-fs@FreeBSD.ORG Thu Aug 29 11:58:26 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id 370F6899 for ; Thu, 29 Aug 2013 11:58:26 +0000 (UTC) (envelope-from zeus@ibs.dn.ua) Received: from relay.ibs.dn.ua (relay.ibs.dn.ua [91.216.196.25]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 949B8208A for ; Thu, 29 Aug 2013 11:58:25 +0000 (UTC) Received: from ibs.dn.ua (relay.ibs.dn.ua [91.216.196.25]) by relay.ibs.dn.ua with ESMTP id r7TBwHZV043835; Thu, 29 Aug 2013 14:58:17 +0300 (EEST) Message-ID: <20130829145817.43833@relay.ibs.dn.ua> Date: Thu, 29 Aug 2013 14:58:17 +0300 From: "Zeus Panchenko" To: Subject: Re: exictent zroot re-alignment to 4K (end of story - final recipe) In-reply-to: Your message of Tue, 27 Aug 2013 18:11:27 +0300 <20130827181127.24761@relay.ibs.dn.ua> References: <20130827181127.24761@relay.ibs.dn.ua> Organization: I.B.S. 
LLC X-Mailer: MH-E 8.3.1; GNU Mailutils 2.99.98; GNU Emacs 24.0.93 X-Face: &sReWXo3Iwtqql1[My(t1Gkx; y?KF@KF`4X+'9Cs@PtK^y%}^.>Mtbpyz6U=,Op:KPOT.uG )Nvx`=er!l?WASh7KeaGhga"1[&yz$_7ir'cVp7o%CGbJ/V)j/=]vzvvcqcZkf; JDurQG6wTg+?/xA go`}1.Ze//K; Fk&/&OoHd'[b7iGt2UO>o(YskCT[_D)kh4!yY'<&:yt+zM=A`@`~9U+P[qS:f; #9z~ Or/Bo#N-'S'!'[3Wog'ADkyMqmGDvga?WW)qd=?)`Y&k=o}>!ST\ MIME-Version: 1.0 Content-Type: text/plain Content-Transfer-Encoding: quoted-printable X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list Reply-To: Zeus Panchenko List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 29 Aug 2013 11:58:26 -0000

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

hi all,

here is what I did and now I am happy with my ashifted to 12 zroot on my netbook

1. Backup existent data

   zfs send -R zroot > latest.snap
   zpool export zroot

2. Destroy/Create partitions

   gpart destroy ada1
   gpart create -s gpt ada1
   gpart add -b 34 -s 94 -t freebsd-boot -l z-hp_boot ada1

   here it was surprise for me since gpart complained and I had to issue another options

   #> gpart add -b 34 -s 94 -t freebsd-boot -l z-hp_boot ada1
   ada1p1 added, but partition is not aligned on 4096 bytes
   #> gpart add -s 64K -t freebsd-boot -l z-hp_boot ada1
   ada1p1 added

   gpart add -s 4G -t freebsd-swap -l z-hp_swap ada1
   gpart add -a 1m -t freebsd-zfs -l z-hp_disk ada1
   gpart bootcode -b /path/to/backuped/pmbr -p /path/to/backuped/gptzfsboot -i 1 ada1

3. Create zpool and restore backup

   gnop create -S 4096 gpt/z-hp_disk
   zpool create z-hp_zroot gpt/z-hp_disk.nop
   cat latest.snap | zfs receive -Fv z-hp_zroot

4. Heal new zpool

   zpool export z-hp_zroot
   zpool import -f -o cachefile=/tmp/zpool.cache -o altroot=/mnt z-hp_zroot
   zfs set mountpoint=/ z-hp_zroot
   cp /tmp/zpool.cache /mnt/boot/zfs/zpool.cache

   vi /mnt/boot/loader.conf
   change vfs.root.mountfrom="zfs:zroot" => vfs.root.mountfrom="zfs:z-hp_zroot"

   vi /mnt/etc/fstab
   change /dev/gpt/swap0.eli => /dev/gpt/z-hp_swap.eli

   zfs umount /mnt
   zfs set mountpoint=legacy z-hp_zroot
   zpool export z-hp_zroot

5. Removing gnop

   gnop destroy gpt/z-hp_disk.nop

and now I have my system undamaged and zdb shows:

z-hp_zroot:
    version: 5000
    name: 'z-hp_zroot'
    ...
    vdev_tree:
        type: 'root'
        ...
        children[0]:
            type: 'disk'
            ...
            path: '/dev/ada0p3'
            phys_path: '/dev/ada0p3'
            whole_disk: 1
            ashift: 12
            ...

thanks much to all

- -- 
Zeus V. Panchenko jid:zeus@im.ibs.dn.ua
IT Dpt., I.B.S. LLC GMT+2 (EET)
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.19 (FreeBSD)

iEYEARECAAYFAlIfN1gACgkQr3jpPg/3oyr5awCeOzHIFdxR7rZ60CJw9qQ71ePd
uQ4An07Xs2extRRDbG/vijgZ7WjNIaTZ
=y5yh
-----END PGP SIGNATURE-----

From owner-freebsd-fs@FreeBSD.ORG Thu Aug 29 12:33:07 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id 6E74240C for ; Thu, 29 Aug 2013 12:33:07 +0000 (UTC) (envelope-from feld@FreeBSD.org) Received: from out4-smtp.messagingengine.com (out4-smtp.messagingengine.com [66.111.4.28]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 444442343 for ; Thu, 29 Aug 2013 12:33:06 +0000 (UTC) Received: from compute2.internal (compute2.nyi.mail.srv.osa [10.202.2.42]) by gateway1.nyi.mail.srv.osa (Postfix) with ESMTP id D8CA5210FB for ; Thu, 29 Aug 2013 08:33:05 -0400 (EDT) Received: from web3 ([10.202.2.213]) by compute2.internal (MEProxy); Thu, 29 Aug 2013 08:33:05 -0400
DKIM-Signature: v=1; a=rsa-sha1; c=relaxed/relaxed; d= messagingengine.com; h=message-id:from:to:mime-version :content-transfer-encoding:content-type:subject:date:in-reply-to :references; s=smtpout; bh=XVtZDz9biEQbw+6MjoPeJtMAmPM=; b=KjtkH U9gR7zSZoUWZEnywLF69KQgCWp25OO1aIXD3VvAnRuB5FsVIUfR0IaUxbUEtJP3v MYGByiJfhFmfZYeC+iZwEqPS+XHKDlM7aK7RuKzmAh2VxTzHVovFSVTJCzRoeooj AN3YbsYitx4WZHYEZZTDBaRBoLUlLQdUSfDC9E= Received: by web3.nyi.mail.srv.osa (Postfix, from userid 99) id BB682B0002F; Thu, 29 Aug 2013 08:33:05 -0400 (EDT) Message-Id: <1377779585.6006.15542805.33D96D05@webmail.messagingengine.com> X-Sasl-Enc: ti97NTSbzoFqwC4uCTW9ZZ4suiSGHEVhZcl7UKxvVT/0 1377779585 From: Mark Felder To: freebsd-fs@freebsd.org MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Type: text/plain X-Mailer: MessagingEngine.com Webmail Interface - ajax-be0d4992 Subject: Re: fixing "umount -f" for the NFS client Date: Thu, 29 Aug 2013 07:33:05 -0500 In-Reply-To: <2057976640.15042430.1377735327317.JavaMail.root@uoguelph.ca> References: <2057976640.15042430.1377735327317.JavaMail.root@uoguelph.ca> X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 29 Aug 2013 12:33:07 -0000 On Wed, Aug 28, 2013, at 19:15, Rick Macklem wrote: > I've been doing a little more testing of "umount -f" for NFS > mounts and they seem to be working unless some other process/thread > has busied the file system via vfs_busy(). > > Unfortunately, it is pretty easy to vfs_busy() the file system > by using a command like "df" that is stuck on the unresponsive > NFS server. 
> always mount your nfs with options "soft,intr" or you'll run into this :) From owner-freebsd-fs@FreeBSD.ORG Thu Aug 29 12:35:44 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id 48097573 for ; Thu, 29 Aug 2013 12:35:44 +0000 (UTC) (envelope-from feld@FreeBSD.org) Received: from out4-smtp.messagingengine.com (out4-smtp.messagingengine.com [66.111.4.28]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 1B8A72376 for ; Thu, 29 Aug 2013 12:35:44 +0000 (UTC) Received: from compute6.internal (compute6.nyi.mail.srv.osa [10.202.2.46]) by gateway1.nyi.mail.srv.osa (Postfix) with ESMTP id 103B021D32 for ; Thu, 29 Aug 2013 08:35:39 -0400 (EDT) Received: from web3 ([10.202.2.213]) by compute6.internal (MEProxy); Thu, 29 Aug 2013 08:35:39 -0400 DKIM-Signature: v=1; a=rsa-sha1; c=relaxed/relaxed; d= messagingengine.com; h=message-id:from:to:mime-version :content-transfer-encoding:content-type:in-reply-to:references :subject:date; s=smtpout; bh=Ct3qRjD5LfRexs2nJnaS9lD2+Mg=; b=QS7 XfQDNgVZdQwJrKts4D+RxgbUQHySJNtB3oI3SW9Lj8vj/e9nb1X5fJCyM/5QjVaN Cnj81BrhZu7SG3Ai8EDqwK7x5f9En51tAbPXNnimZXbOb+cnw/bmJJBvOmo3jLpb 8enBJz5V8SNcT8ucg/8vFMYn5l4uAks2hcNy0+Y4= Received: by web3.nyi.mail.srv.osa (Postfix, from userid 99) id E7620B0002F; Thu, 29 Aug 2013 08:35:38 -0400 (EDT) Message-Id: <1377779738.6375.15543517.25588670@webmail.messagingengine.com> X-Sasl-Enc: GbFykXjHJac0A6CHvxiAJ71YALStirMc8HBrrRJgQc3+ 1377779738 From: Mark Felder To: freebsd-fs@freebsd.org MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Type: text/plain X-Mailer: MessagingEngine.com Webmail Interface - ajax-be0d4992 In-Reply-To: <1377779585.6006.15542805.33D96D05@webmail.messagingengine.com> References: 
<2057976640.15042430.1377735327317.JavaMail.root@uoguelph.ca> <1377779585.6006.15542805.33D96D05@webmail.messagingengine.com> Subject: Re: fixing "umount -f" for the NFS client Date: Thu, 29 Aug 2013 07:35:38 -0500 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 29 Aug 2013 12:35:44 -0000 wow, I had two completely different thoughts and that's not exactly what I meant to say... need coffee! *I* always run into this and have to mount with "soft,intr". Any work that can make NFS more robust would be greatly appreciated. :) From owner-freebsd-fs@FreeBSD.ORG Thu Aug 29 12:50:56 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id 2F6FEB40; Thu, 29 Aug 2013 12:50:56 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from esa-jnhn.mail.uoguelph.ca (esa-jnhn.mail.uoguelph.ca [131.104.91.44]) by mx1.freebsd.org (Postfix) with ESMTP id BA520248A; Thu, 29 Aug 2013 12:50:55 +0000 (UTC) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AqUEAFVDH1KDaFve/2dsb2JhbABagzxRgye7dIENgUF0giQBAQEEAQEBICsgCwUWDgoCAg0ZAikBCSYGCAcEARwEh2AMpl6SL4EpjRKBBTQHgmiBNAOVKoN4kDeDPCAygQM5 X-IronPort-AV: E=Sophos;i="4.89,983,1367985600"; d="scan'208";a="48042414" Received: from muskoka.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca) ([131.104.91.222]) by esa-jnhn.mail.uoguelph.ca with ESMTP; 29 Aug 2013 08:50:40 -0400 Received: from zcs3.mail.uoguelph.ca (localhost.localdomain [127.0.0.1]) by zcs3.mail.uoguelph.ca (Postfix) with ESMTP id 25CD7B4042; Thu, 29 Aug 2013 08:50:40 -0400 (EDT) Date: Thu, 29 Aug 2013 08:50:40 -0400 (EDT) From: Rick Macklem To: Mark Felder Message-ID: 
<1440314933.15156431.1377780640143.JavaMail.root@uoguelph.ca> In-Reply-To: <1377779585.6006.15542805.33D96D05@webmail.messagingengine.com> Subject: Re: fixing "umount -f" for the NFS client MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-Originating-IP: [172.17.91.201] X-Mailer: Zimbra 7.2.1_GA_2790 (ZimbraWebClient - FF3.0 (Win)/7.2.1_GA_2790) Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 29 Aug 2013 12:50:56 -0000

Mark Felder wrote:
> On Wed, Aug 28, 2013, at 19:15, Rick Macklem wrote:
> > I've been doing a little more testing of "umount -f" for NFS
> > mounts and they seem to be working unless some other process/thread
> > has busied the file system via vfs_busy().
> >
> > Unfortunately, it is pretty easy to vfs_busy() the file system
> > by using a command like "df" that is stuck on the unresponsive
> > NFS server.
> >
>
> always mount your nfs with options "soft,intr" or you'll run into
> this
> :)

I'm not sure if your smiley is because you already know what I think of "soft, intr", but...

Application software generally doesn't expect I/O syscalls (read, write...) to fail with ETIMEDOUT or EINTR. As such, I would never use "soft", except maybe for some read-only volume where intermittent failures aren't an issue.

As for "intr", I believe that processes can get stuck in uninterruptible situations when they are waiting for a buffer/page that is in use by another thread (that no "intr" signal has been issued for). However, this just means the process won't "interrupt", so as long as there isn't an unexpected interruption caused by a signal, it could be ok.
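To make the point about ETIMEDOUT and EINTR concrete, here is a minimal Python sketch of the wrapper an application would need around each read in order to tolerate a "soft" mount. It is an illustration only, not taken from any FreeBSD code: the name read_with_retry and its parameters are hypothetical, and the read itself is passed in by the caller.

```python
import errno
import time


def read_with_retry(read_once, attempts=3, delay=0.0):
    """Call read_once() and retry it when it fails with ETIMEDOUT or EINTR.

    read_once is any zero-argument callable that performs one read and
    returns its result. Errors other than ETIMEDOUT/EINTR, and the last
    failed attempt, are re-raised unchanged for the caller to handle.
    """
    transient = (errno.ETIMEDOUT, errno.EINTR)
    for attempt in range(1, attempts + 1):
        try:
            return read_once()
        except OSError as exc:
            if exc.errno not in transient or attempt == attempts:
                raise
            time.sleep(delay)  # optional pause before the next attempt
```

Retrying like this is only reasonable for idempotent reads; a write that timed out may or may not have reached the server. Recent Python versions already retry most system calls interrupted by a signal on their own, so the EINTR branch mainly stands in for what a C program would have to do by hand.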
Finally, neither of these options are safe to use for NFSv4, because operations that manipulate state (like locks) cannot be safely interrupted, since they will leave the lock in an undefined state, at least for the FreeBSD server. So, at least for NFSv4, a forced dismount is a necessary alternative. Personally, when I was a sysadmin, I tried to make sure my NFS server and network ran reliably and always used hard mounts (no "intr, soft") to avoid the issue. rick > _______________________________________________ > freebsd-fs@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-fs > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" > From owner-freebsd-fs@FreeBSD.ORG Thu Aug 29 12:52:59 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id C0004BE4 for ; Thu, 29 Aug 2013 12:52:59 +0000 (UTC) (envelope-from feld@FreeBSD.org) Received: from out4-smtp.messagingengine.com (out4-smtp.messagingengine.com [66.111.4.28]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 9372E24AC for ; Thu, 29 Aug 2013 12:52:59 +0000 (UTC) Received: from compute1.internal (compute1.nyi.mail.srv.osa [10.202.2.41]) by gateway1.nyi.mail.srv.osa (Postfix) with ESMTP id E117621FD5; Thu, 29 Aug 2013 08:52:50 -0400 (EDT) Received: from web3 ([10.202.2.213]) by compute1.internal (MEProxy); Thu, 29 Aug 2013 08:52:50 -0400 DKIM-Signature: v=1; a=rsa-sha1; c=relaxed/relaxed; d= messagingengine.com; h=message-id:from:to:cc:mime-version :content-transfer-encoding:content-type:in-reply-to:references :subject:date; s=smtpout; bh=f4DYJXPQgT72tcd+2LdpoA/r0oY=; b=SjX HqO3953rvOcti2jaaNXaMIAw+kQpWM3pzy4/MqyqfCwIKLy02xTA8cCAx5dccFqx 
t5uM7BPBp8G5+ivT92JtXZMdHDXD747k6OJBzgU//3ik+rn2Q8IxUANfb1twKn+e F7aPXJm49E7cBLPekq3YFB11IKXh7Lj8K3gyGxAM= Received: by web3.nyi.mail.srv.osa (Postfix, from userid 99) id 3E96FB00032; Thu, 29 Aug 2013 08:52:50 -0400 (EDT) Message-Id: <1377780770.11951.15550057.2F8C551C@webmail.messagingengine.com> X-Sasl-Enc: +WuiF7d2KvS/YO1oib6pK9ZOqVqux9nljmwFjjfGnzIb 1377780770 From: Mark Felder To: Rick Macklem MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Type: text/plain X-Mailer: MessagingEngine.com Webmail Interface - ajax-be0d4992 In-Reply-To: <1440314933.15156431.1377780640143.JavaMail.root@uoguelph.ca> References: <1440314933.15156431.1377780640143.JavaMail.root@uoguelph.ca> Subject: Re: fixing "umount -f" for the NFS client Date: Thu, 29 Aug 2013 07:52:50 -0500 Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 29 Aug 2013 12:52:59 -0000 On Thu, Aug 29, 2013, at 7:50, Rick Macklem wrote: > > Finally, neither of these options are safe to use for NFSv4, because > operations that manipulate state (like locks) cannot be safely > interrupted, > since they will leave the lock in an undefined state, at least for the > FreeBSD server. > So, at least for NFSv4, a forced dismount is a necessary alternative. 
> I was not aware of this specific issue, so thank you for the warning From owner-freebsd-fs@FreeBSD.ORG Thu Aug 29 13:41:30 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id 2AC08284; Thu, 29 Aug 2013 13:41:30 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from esa-jnhn.mail.uoguelph.ca (esa-jnhn.mail.uoguelph.ca [131.104.91.44]) by mx1.freebsd.org (Postfix) with ESMTP id D1C362854; Thu, 29 Aug 2013 13:41:29 +0000 (UTC) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AqIEAP9OH1KDaFve/2dsb2JhbABahA2DJ7t0gQ2BQXSCJAEBBSNWGw4KAgINGQJZBogUpnqSK4Epjhc0B4JogTQDqVmDPCCBbg X-IronPort-AV: E=Sophos;i="4.89,983,1367985600"; d="scan'208";a="48057180" Received: from muskoka.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca) ([131.104.91.222]) by esa-jnhn.mail.uoguelph.ca with ESMTP; 29 Aug 2013 09:41:29 -0400 Received: from zcs3.mail.uoguelph.ca (localhost.localdomain [127.0.0.1]) by zcs3.mail.uoguelph.ca (Postfix) with ESMTP id DCF7AB4038; Thu, 29 Aug 2013 09:41:28 -0400 (EDT) Date: Thu, 29 Aug 2013 09:41:28 -0400 (EDT) From: Rick Macklem To: Mark Felder Message-ID: <1184714754.15192677.1377783688897.JavaMail.root@uoguelph.ca> In-Reply-To: <1377780770.11951.15550057.2F8C551C@webmail.messagingengine.com> Subject: Re: fixing "umount -f" for the NFS client MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-Originating-IP: [172.17.91.203] X-Mailer: Zimbra 7.2.1_GA_2790 (ZimbraWebClient - FF3.0 (Win)/7.2.1_GA_2790) Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 29 Aug 2013 13:41:30 -0000 Mark Felder wrote: > On Thu, Aug 29, 2013, at 7:50, Rick 
Macklem wrote: > > > > Finally, neither of these options are safe to use for NFSv4, > > because > > operations that manipulate state (like locks) cannot be safely > > interrupted, > > since they will leave the lock in an undefined state, at least for > > the > > FreeBSD server. > > So, at least for NFSv4, a forced dismount is a necessary > > alternative. > > > > I was not aware of this specific issue, so thank you for the warning > Yeah. It's documented in the BUGS section of "man mount_nfs" and few (me included;-) can be bothered reading an entire man page. rick From owner-freebsd-fs@FreeBSD.ORG Thu Aug 29 13:49:39 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id 37CBD3DB for ; Thu, 29 Aug 2013 13:49:39 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from esa-jnhn.mail.uoguelph.ca (esa-jnhn.mail.uoguelph.ca [131.104.91.44]) by mx1.freebsd.org (Postfix) with ESMTP id EF61028B1 for ; Thu, 29 Aug 2013 13:49:38 +0000 (UTC) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AqIEACpQH1KDaFve/2dsb2JhbABaFoMmUYMnvQGBQHSCJAEBBAEjVhsYAgINGQJZBhOHewYMpnOSK4EpjR94NAeCaIE0A5kikDeDPCCBLkA X-IronPort-AV: E=Sophos;i="4.89,983,1367985600"; d="scan'208";a="48059355" Received: from muskoka.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca) ([131.104.91.222]) by esa-jnhn.mail.uoguelph.ca with ESMTP; 29 Aug 2013 09:49:37 -0400 Received: from zcs3.mail.uoguelph.ca (localhost.localdomain [127.0.0.1]) by zcs3.mail.uoguelph.ca (Postfix) with ESMTP id C96FEB3F1B; Thu, 29 Aug 2013 09:49:37 -0400 (EDT) Date: Thu, 29 Aug 2013 09:49:37 -0400 (EDT) From: Rick Macklem To: "Sam Fourman Jr." 
Message-ID: <1079189088.15197172.1377784177805.JavaMail.root@uoguelph.ca> In-Reply-To: Subject: Re: NFS on ZFS pure SSD pool MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-Originating-IP: [172.17.91.202] X-Mailer: Zimbra 7.2.1_GA_2790 (ZimbraWebClient - FF3.0 (Win)/7.2.1_GA_2790) Cc: FreeBSD FS X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 29 Aug 2013 13:49:39 -0000

Sam Fourman Jr wrote:
> On Wed, Aug 28, 2013 at 2:27 PM, Eric Browning <
> ericbrowning@skaggscatholiccenter.org > wrote:
>
> Rick,
>
> Sam and I applied the patch (kernel now at r254983M) and set
> vfs.nfsd.tcphighwater=5000
> in sysctl.conf and my CPU is still slammed. Should I up it to 10000?
>
> Hello, list
> I am helping Eric debug and test this situation as much as I can.
>
> So to clarify and recap, here is the situation:
>
> This is a production setting, in a school, that has 200+ students
> using a mix of systems, with the primary client being OSX 10.8.
> and the primary function is using NFS.
>

I haven't touched a Mac in several years, but I think Finder probes at regular intervals to see if directories (oops, I meant folders;-) have changed. I think there is a way to increase the interval time between probes.

Also, I think there are tunables for the metadata cache in ZFS, which might be useful for increasing the metadata cache sizes, since the probes will be checking metadata (attributes).

If this is contributing to the heavy load, I'd expect "nfsstat -e -s" to show a large count for Getattr. (I vaguely remember that NFSv4 mounts were in use. The counts of everything are larger for NFSv4, since they are counts of operations and not RPCs. Each NFSv4 RPC is a compound made up of N operations.)
Just something that might be worth looking at, rick

> from what I can see there should be plenty of disk I/O
> these are Intel SSD disks..
>
> The server is running FreeBSD 9-STABLE r254983 (we patched it last
> night)
> with this patch
> http://people.freebsd.org/~rmacklem/drc4-stable9.patch
>
> Here is a full dmesg for reference (it states FreeBSD 9.1, but we have
> since
> upgraded and applied the above patch)
>
> https://gist.github.com/sfourman/6373059
>
> The main problem is we need better performance from NFS, but
> it would appear the server is starved for CPU cycles....
>
> With only a few clients the server is lightning fast but
> with 25 users logging in this morning (students in class) the server
> went right to 1200% CPU load
> and about 300% more going to "intr" and it pretty much stayed there
> all day until they logged out between classes.
>
> So that works out to be somewhere between 2 to 4 users per core
>

I'm not the guy to be able to help with how to do it, but profiling the running kernel to try and see where the CPU is being used could help. (At this point, I suspect it isn't in the nfs code, since the DRC seems to be the only CPU hog and I think the patch you are already using fixes that.)

Good luck with it, rick

> during today's classes, different settings for vfs.nfsd.tcphighwater
> were tested
> various ranges from 5,000 up to 50,000 were used while a load was
> present, but
> the processor load didn't change.
>
> Garrett stated that he tried values in upwards of 100,000... this can
> be tested tomorrow
>
> It would be helpful if we could get some direction, on other things
> we might try tomorrow.
>
> one idea is, the server has several igb Ethernet interfaces with 8
> queues per interface
> is it worth forcing the interfaces down to one queue?
>
> Is NFS even setup to understand multi queue network devices? or
> doesn't it matter?
> > > Any thoughts are appreciated -- > > Sam Fourman Jr. > From owner-freebsd-fs@FreeBSD.ORG Thu Aug 29 13:59:34 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id 2B700911 for ; Thu, 29 Aug 2013 13:59:34 +0000 (UTC) (envelope-from feld@FreeBSD.org) Received: from out4-smtp.messagingengine.com (out4-smtp.messagingengine.com [66.111.4.28]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id F34602943 for ; Thu, 29 Aug 2013 13:59:33 +0000 (UTC) Received: from compute2.internal (compute2.nyi.mail.srv.osa [10.202.2.42]) by gateway1.nyi.mail.srv.osa (Postfix) with ESMTP id 2BEFD21D41 for ; Thu, 29 Aug 2013 09:59:27 -0400 (EDT) Received: from web3 ([10.202.2.213]) by compute2.internal (MEProxy); Thu, 29 Aug 2013 09:59:28 -0400 DKIM-Signature: v=1; a=rsa-sha1; c=relaxed/relaxed; d= messagingengine.com; h=message-id:from:to:mime-version :content-transfer-encoding:content-type:in-reply-to:references :subject:date; s=smtpout; bh=K0kSmHcsW6LsDjTxQlIN21X9FqI=; b=Zmk ytcfMz5zqv/WLSHXJtRfZq69YibQ33Hyo7vO8WkD2yAjdrpxwFu3lvhgLHdOtlCJ cjcjgDJAvB+O2TvWU+JLo6gEvIkL2HYzXXiJF5ot6Ym2ZmasIl0g5fIcuomHOjWf EY25set6Q02v1RaR2qCy6GQbrlOL0+8Ven/wjU64= Received: by web3.nyi.mail.srv.osa (Postfix, from userid 99) id A77BFB0003E; Thu, 29 Aug 2013 09:59:27 -0400 (EDT) Message-Id: <1377784767.31759.15575801.12666FDA@webmail.messagingengine.com> X-Sasl-Enc: 47KHKLo1kWQQ8ZXFyiXJwW7dSqgbP2nFGLqKUnASoVhg 1377784767 From: Mark Felder To: freebsd-fs@freebsd.org MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Type: text/plain X-Mailer: MessagingEngine.com Webmail Interface - ajax-be0d4992 In-Reply-To: <1079189088.15197172.1377784177805.JavaMail.root@uoguelph.ca> References: 
<1079189088.15197172.1377784177805.JavaMail.root@uoguelph.ca> Subject: Re: NFS on ZFS pure SSD pool Date: Thu, 29 Aug 2013 08:59:27 -0500 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 29 Aug 2013 13:59:34 -0000 On Thu, Aug 29, 2013, at 8:49, Rick Macklem wrote: > I haven't touched a Mac in several years, but I think Finder probes at > regular intervals to see if directories (oops, I meant folders;-) have > changed. I think there is a way to increase the interval time between > probes. > I can't be 100% sure, but I think Finder does an initial scan sometime after the OS install and after that it just watches filesystem changes through the FSEvents API -- like inotify/dnotify on linux or what you can sort of do with kqueue on FreeBSD. From owner-freebsd-fs@FreeBSD.ORG Thu Aug 29 14:05:48 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id 58D7CCEA for ; Thu, 29 Aug 2013 14:05:48 +0000 (UTC) (envelope-from ronald-freebsd8@klop.yi.org) Received: from cpsmtpb-ews08.kpnxchange.com (cpsmtpb-ews08.kpnxchange.com [213.75.39.13]) by mx1.freebsd.org (Postfix) with ESMTP id E83A72A09 for ; Thu, 29 Aug 2013 14:05:47 +0000 (UTC) Received: from cpsps-ews08.kpnxchange.com ([10.94.84.175]) by cpsmtpb-ews08.kpnxchange.com with Microsoft SMTPSVC(7.5.7601.17514); Thu, 29 Aug 2013 16:04:38 +0200 Received: from CPSMTPM-TLF103.kpnxchange.com ([195.121.3.6]) by cpsps-ews08.kpnxchange.com with Microsoft SMTPSVC(7.5.7601.17514); Thu, 29 Aug 2013 16:04:38 +0200 Received: from sjakie.klop.ws ([212.182.167.131]) by CPSMTPM-TLF103.kpnxchange.com with Microsoft SMTPSVC(7.5.7601.17514); Thu, 29 Aug 2013 16:04:37 +0200 Received: from 
212-182-167-131.ip.telfort.nl (localhost [127.0.0.1]) by sjakie.klop.ws (Postfix) with ESMTP id 538FCC3DF for ; Thu, 29 Aug 2013 16:04:36 +0200 (CEST) Content-Type: text/plain; charset=us-ascii; format=flowed; delsp=yes To: freebsd-fs@freebsd.org Subject: Re: NFS on ZFS pure SSD pool References: <1079189088.15197172.1377784177805.JavaMail.root@uoguelph.ca> <1377784767.31759.15575801.12666FDA@webmail.messagingengine.com> Date: Thu, 29 Aug 2013 16:04:36 +0200 MIME-Version: 1.0 Content-Transfer-Encoding: 7bit From: "Ronald Klop" Message-ID: In-Reply-To: <1377784767.31759.15575801.12666FDA@webmail.messagingengine.com> User-Agent: Opera Mail/12.16 (FreeBSD) X-OriginalArrivalTime: 29 Aug 2013 14:04:37.0970 (UTC) FILETIME=[B2BC8B20:01CEA4C0] X-RcptDomain: freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 29 Aug 2013 14:05:48 -0000 On Thu, 29 Aug 2013 15:59:27 +0200, Mark Felder wrote: > On Thu, Aug 29, 2013, at 8:49, Rick Macklem wrote: >> I haven't touched a Mac in several years, but I think Finder probes at >> regular intervals to see if directories (oops, I meant folders;-) have >> changed. I think there is a way to increase the interval time between >> probes. >> > > I can't be 100% sure, but I think Finder does an initial scan sometime > after the OS install and after that it just watches filesystem changes > through the FSEvents API -- like inotify/dnotify on linux or what you > can sort of do with kqueue on FreeBSD. AFAIK these FSEvents or inotify/dnotify or kqueue do not work on NFS. (But maybe the world has changed in the meantime.) Ronald. 
From owner-freebsd-fs@FreeBSD.ORG Thu Aug 29 14:16:34 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id E32F43BD for ; Thu, 29 Aug 2013 14:16:34 +0000 (UTC) (envelope-from sfourman@gmail.com) Received: from mail-vc0-x229.google.com (mail-vc0-x229.google.com [IPv6:2607:f8b0:400c:c03::229]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id A227D2AFE for ; Thu, 29 Aug 2013 14:16:34 +0000 (UTC) Received: by mail-vc0-f169.google.com with SMTP id ib11so346503vcb.14 for ; Thu, 29 Aug 2013 07:16:33 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=qOLArIvyBtTf+fdrfqV7qDw1f1/rVSSThF1/D8eXAGI=; b=EJRpfWfuRGYxY3EuXu6iq/PwaiQhzRuInXZ13fVjS0l3Vdf5tPlzALNssCURFIuvEs LkA3089dB73r6Z4Q6/T82Knj3GxfyhJ90aIa9AClyIj5wDd0o/QtTsVjWVblRl2BYlY0 vaVfc8I0xX1/cpNdtzcYjIbM/K2SuqwYx2jJpZsK3e5ozhhjuFbeSaJ58xwEhw2bMCZ4 kyV/eJLAFwZDawplYVh6xkY5VX8/dmWpyFMd0hUOi7rsLGuf9JwYADq8500pzF/RZOEc gjdI6E7mfKcDc73t+GcrYxtqdzwhVtpefsS7yURCAoK7EP6XDIAXCT7elXGUIoHfm4sN kT9Q== MIME-Version: 1.0 X-Received: by 10.220.181.136 with SMTP id by8mr3341906vcb.11.1377785793796; Thu, 29 Aug 2013 07:16:33 -0700 (PDT) Received: by 10.220.96.78 with HTTP; Thu, 29 Aug 2013 07:16:33 -0700 (PDT) In-Reply-To: <1079189088.15197172.1377784177805.JavaMail.root@uoguelph.ca> References: <1079189088.15197172.1377784177805.JavaMail.root@uoguelph.ca> Date: Thu, 29 Aug 2013 10:16:33 -0400 Message-ID: Subject: Re: NFS on ZFS pure SSD pool From: "Sam Fourman Jr." 
To: Rick Macklem Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.14 Cc: FreeBSD FS X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 29 Aug 2013 14:16:35 -0000 On Thu, Aug 29, 2013 at 9:49 AM, Rick Macklem wrote: > Sam Fourman Jr wrote: > > > > > > > > > > > > > > On Wed, Aug 28, 2013 at 2:27 PM, Eric Browning < > > ericbrowning@skaggscatholiccenter.org > wrote: > > > > > > Rick, > > > > Sam and I applied the patch (kernel now at r254983M) and set > > vfs.nfsd.tcphighwater=5000 > > in sysctl.conf and my CPU is still slammed. Should I up it to 10000? > > > > > > > > > > > > > > Hello, list > > I am helping Eric debug and test this situation as much as I can. > > > > > > So to clarify and recap, here is the situation: > > > > > > > > This is a production setting, in a school, that has 200+ students > > using a mix of systems, with the primary client being OSX 10.8. > > and the primary function is using NFS. > > > I haven't touched a Mac in several years, but I think Finder probes at > regular intervals to see if directories (oops, I meant folders;-) have > changed. I think there is a way to increase the interval time between > probes. > Also, I think there are tunables for the metadata cache in ZFS, which > might be useful for increasing the metadata cache sizes, since the probes > will be checking metadata (attributes). > > Does anyone here on the list have clues on where to start with ZFS metadata tunables, and a command or two to profile the kernel for CPU usage? I'm willing to read up on kernel profiling as well, I'm just not sure where to start > -- Sam Fourman Jr.
From owner-freebsd-fs@FreeBSD.ORG Thu Aug 29 15:10:05 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id 1B47E1B1 for ; Thu, 29 Aug 2013 15:10:05 +0000 (UTC) (envelope-from longwitz@incore.de) Received: from dss.incore.de (dss.incore.de [195.145.1.138]) by mx1.freebsd.org (Postfix) with ESMTP id CFC5A2EE0 for ; Thu, 29 Aug 2013 15:10:04 +0000 (UTC) Received: from inetmail.dmz (inetmail.dmz [10.3.0.3]) by dss.incore.de (Postfix) with ESMTP id 6EC1E5D558 for ; Thu, 29 Aug 2013 17:03:45 +0200 (CEST) X-Virus-Scanned: amavisd-new at incore.de Received: from dss.incore.de ([10.3.0.3]) by inetmail.dmz (inetmail.dmz [10.3.0.3]) (amavisd-new, port 10024) with LMTP id hs7J9kXEe-Lp for ; Thu, 29 Aug 2013 17:03:44 +0200 (CEST) Received: from mail.incore (fwintern.dmz [10.0.0.253]) by dss.incore.de (Postfix) with ESMTP id 8843C5D561 for ; Thu, 29 Aug 2013 17:03:44 +0200 (CEST) Received: from bsdlo.incore (bsdlo.incore [192.168.0.84]) by mail.incore (Postfix) with ESMTP id 81E9B50962 for ; Thu, 29 Aug 2013 17:03:44 +0200 (CEST) Message-ID: <521F62D0.8000400@incore.de> Date: Thu, 29 Aug 2013 17:03:44 +0200 From: Andreas Longwitz User-Agent: Thunderbird 2.0.0.19 (X11/20090113) MIME-Version: 1.0 To: freebsd-fs@freebsd.org Subject: fsync: giving up on dirty on partitions with gjournal or soft updates Content-Type: text/plain; charset=ISO-8859-15 Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 29 Aug 2013 15:10:05 -0000 Hello, I run FreeBSD 8.4-STABLE r253040, supplemented with adapted versions of r244795, r244925 and r245286 from head.
On my servers with gjournaled partitions I sometimes see messages like this:

serv02 kernel: fsync: giving up on dirty
serv02 kernel: 0xffffff0002d313b0: tag devfs, type VCHR
serv02 kernel: usecount 1, writecount 0, refcount 35 i mountedhere 0xffffff0002bcd400
serv02 kernel: flags ()
serv02 kernel: v_object 0xffffff0002d22ca8 ref 0 pages 520
serv02 kernel: lock type devfs: EXCL by thread 0xffffff0002956470 (pid 9)
serv02 kernel: dev mirror/gmsv02p10.journal
serv02 kernel: GEOM_JOURNAL: Cannot suspend file system /home (error=35).

pid 9 is the process g_journal switcher, and mount gives for /home:

/dev/mirror/gmsv02p10.journal on /home (ufs, asynchronous, local, noatime, gjournal)

I am aware of an old statement by pjd: it's harmless, it just means the journal switch will be done a bit later.

Ok, but now I have the same messages on a server running soft updates instead of gjournal, during a backup with "dump -L" started by amanda:

amandalog:
sendbackup: Spawning "/sbin/dump dump 0ubLshf 64 1048576 0 - /dev/amrd0s1f" in pipeline
sendbackup: 116: strange(?): mksnap_ffs: Cannot create snapshot /home/.snap/dump_snapshot: Resource temporarily unavailable
sendbackup: critical (fatal): error [dump (82637) /sbin/dump returned 1]

messages:
dsspbx2 kernel: fsync: giving up on dirty
dsspbx2 kernel: 0xc5ae4b84: tag devfs, type VCHR
dsspbx2 kernel: usecount 1, writecount 0, refcount 123 mountedhere 0xc5a34d00
dsspbx2 kernel: flags ()
dsspbx2 kernel: v_object 0xc5aeca18 ref 0 pages 834
dsspbx2 kernel: lock type devfs: EXCL by thread 0xc5a4c8a0 (pid 82639)
dsspbx2 kernel: dev amrd0s1f
dsspbx2 sendbackup[82634]: error [dump (82637) /sbin/dump returned 1]

pid 82639 is the process mksnap_ffs, and mount gives for /home:

/dev/amrd0s1f on /home (ufs, local, noatime, soft-updates)

Last year in this list Kirk wrote (for V9) "Note that soft updates without journaling do not show this issue".
I would like to modify this statement a little bit (for V8): "soft updates do show this issue, but not as often as gjournal". -- Andreas Longwitz From owner-freebsd-fs@FreeBSD.ORG Thu Aug 29 15:59:29 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id DC45188E for ; Thu, 29 Aug 2013 15:59:28 +0000 (UTC) (envelope-from fjwcash@gmail.com) Received: from mail-qe0-x22b.google.com (mail-qe0-x22b.google.com [IPv6:2607:f8b0:400d:c02::22b]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 9E6922365 for ; Thu, 29 Aug 2013 15:59:28 +0000 (UTC) Received: by mail-qe0-f43.google.com with SMTP id t7so312514qeb.30 for ; Thu, 29 Aug 2013 08:59:27 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=cqNWVb0Uo3bU9p10twVQS3BgWcOWTPOitYue5Z8L3ho=; b=GgTgvR4VzahUPZvWlSlqW4ZFMUJbAIJCY9th+lugX9pv+GG7v80OaSo2zsZqL8hnbV JY1rIMJz9dI2bCpdzXvtCOa1DEj66SSQnM/sU5gh7TFx6xmHmnHjF9RxgSN2L42AUIcr GvBaswgoJF+kWdkoC2BWBa4EJPSUAGS4AsPlN6DsDA1oSV29NBcY8XnkU5vYus399/U+ d5uoyclw7aPC8CWpjDl/D16gmyE4cdnmOrptDe5fE36mfhjEZ7hu0a/Zch8tKAuBXSlt 7PcN3szTBoqSvyTH6DR/a1x3dX0sdWoUaKtj0QtvDFoOi3uRa5Fti9t2xd/WBhI42n3o aSqw== MIME-Version: 1.0 X-Received: by 10.49.58.132 with SMTP id r4mr4997488qeq.10.1377791967867; Thu, 29 Aug 2013 08:59:27 -0700 (PDT) Received: by 10.49.1.98 with HTTP; Thu, 29 Aug 2013 08:59:27 -0700 (PDT) In-Reply-To: References: <1079189088.15197172.1377784177805.JavaMail.root@uoguelph.ca> Date: Thu, 29 Aug 2013 08:59:27 -0700 Message-ID: Subject: Re: NFS on ZFS pure SSD pool From: Freddie Cash To: "Sam Fourman Jr."
Content-Type: text/plain; charset=UTF-8 X-Content-Filtered-By: Mailman/MimeDel 2.1.14 Cc: FreeBSD FS X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 29 Aug 2013 15:59:29 -0000

On Thu, Aug 29, 2013 at 7:16 AM, Sam Fourman Jr. wrote:

> On Thu, Aug 29, 2013 at 9:49 AM, Rick Macklem wrote:
> > Also, I think there are tunables for the metadata cache in ZFS, which
> > might be useful for increasing the metadata cache sizes, since the probes
> > will be checking metadata (attributes).
>
> Does anyone here on the list have clues on where to start with ZFS
> metadata tunables, and a command or two to profile the kernel for CPU
> usage?

This will show you how many bytes of the ARC are being used by metadata:

    sysctl vfs.zfs.arc_meta_used

You can tune what the max is via:

    sysctl vfs.zfs.arc_meta_limit

The default setting is 12.5% (one eighth) of the ARC. On our servers running dedupe, we set this to around 80% of the ARC, and set the secondarycache ZFS property to "metadata", in order to keep the DDT in ARC/L2ARC. You may want to play with the arc_meta_limit a bit.
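
As a rough illustration of the two sysctls above, the following sh sketch reads both values and reports metadata usage as a percentage of the limit. The helper name pct_of is made up for this example, and the script assumes both OIDs exist and return plain byte counts on the running kernel; on a system without them it simply says so.

```shell
#!/bin/sh
# Hypothetical helper (not part of FreeBSD): prints the integer
# percentage of LIMIT that USED represents, truncated.
pct_of() {
    used=$1
    limit=$2
    if [ "$limit" -le 0 ]; then
        echo "n/a"
        return 1
    fi
    echo $(( used * 100 / limit ))
}

# Only report if both OIDs are readable (assumed to be byte counts).
if meta_used=$(sysctl -n vfs.zfs.arc_meta_used 2>/dev/null) &&
   meta_limit=$(sysctl -n vfs.zfs.arc_meta_limit 2>/dev/null); then
    echo "ARC metadata: $meta_used of $meta_limit bytes ($(pct_of "$meta_used" "$meta_limit")%)"
else
    echo "vfs.zfs.arc_meta_* sysctls not available on this system"
fi
```

A figure that stays close to or above 100% under load would be one hint that the metadata limit is worth experimenting with, though that is an interpretation rather than something measured in this thread.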
-- Freddie Cash fjwcash@gmail.com From owner-freebsd-fs@FreeBSD.ORG Thu Aug 29 18:18:22 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id 3D09B50C for ; Thu, 29 Aug 2013 18:18:22 +0000 (UTC) (envelope-from Robert.Burmeister@utoledo.edu) Received: from smtpin1.utoledo.edu (smtpin1.utoledo.edu [131.183.2.213]) (using TLSv1 with cipher RC4-SHA (128/128 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id D8B992E3F for ; Thu, 29 Aug 2013 18:18:21 +0000 (UTC) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AtwBAO6PH1KDtwN/l2dsb2JhbABagzxRgmBHtFqIOgEBAgEBAh+BCxYOAQEBAQEIFgc8gk4wRAEWAgUhAhECQAwNCAEBh2sDDwyYJo5/iC8XTYhsgSmOaIJSgTQDmSKEfY52gg4 X-IronPort-AV: E=Sophos;i="4.89,984,1367985600"; d="scan'208,217";a="281906856" Received: from dlpint01.utoledo.edu ([131.183.3.127]) by smtpin1.utoledo.edu with ESMTP/TLS/DHE-RSA-AES256-SHA; 29 Aug 2013 14:18:14 -0400 Received: from MsgApp11.utad.utoledo.edu (msgapp11.utad.utoledo.edu [131.183.3.7]) by dlpint01.utoledo.edu (RSA Interceptor) for ; Thu, 29 Aug 2013 14:17:58 -0400 Received: from [192.168.1.65] (76.238.196.183) by Email.Utoledo.Edu (131.183.3.18) with Microsoft SMTP Server (TLS) id 14.2.328.9; Thu, 29 Aug 2013 14:17:58 -0400 Message-ID: <521F9035.20706@UToledo.edu> Date: Thu, 29 Aug 2013 14:17:25 -0400 From: Robert Burmeister User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.28) Gecko/20120306 Thunderbird/3.1.20 To: Subject: dirhash and dynamic memory allocation X-Originating-IP: [76.238.196.183] X-RSA-Inspected: yes X-RSA-Classifications: public X-RSA-Action: allow MIME-Version: 1.0 Content-Type: text/plain; charset="UTF-8" X-Content-Filtered-By: Mailman/MimeDel 2.1.14 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: 
Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 29 Aug 2013 18:18:22 -0000 Could you check the list[1] http://freebsd.1045724.n5.nabble.com/Suggest-changing-dirhash-defaults- for-FreeBSD-9-2-td5839351.html and contribute any benchmarks that you know of that are more recent than 2008? References 1. http://freebsd.1045724.n5.nabble.com/Suggest-changing-dirhash-defaults-for-FreeBSD-9-2-td5839351.html From owner-freebsd-fs@FreeBSD.ORG Thu Aug 29 18:18:22 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id BB77F50D for ; Thu, 29 Aug 2013 18:18:22 +0000 (UTC) (envelope-from Robert.Burmeister@utoledo.edu) Received: from smtpin1.utoledo.edu (smtpin1.utoledo.edu [131.183.2.213]) (using TLSv1 with cipher RC4-SHA (128/128 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 6228D2E40 for ; Thu, 29 Aug 2013 18:18:22 +0000 (UTC) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AtwBAO6PH1KDtwN/l2dsb2JhbABagzxRgmBHtFqIOgEBAgEBAh+BCxYOAQEBAQEIFgc8giUBBSMwNg4BFgIFIQICDwJABgYNCAEBh2sDDwynJYgvF02IbIEpjmiCUoE0A5kihH2OdoIO X-IronPort-AV: E=Sophos;i="4.89,984,1367985600"; d="scan'208,217";a="281906857" Received: from dlpint01.utoledo.edu ([131.183.3.127]) by smtpin1.utoledo.edu with ESMTP/TLS/DHE-RSA-AES256-SHA; 29 Aug 2013 14:18:14 -0400 Received: from MsgApp11.utad.utoledo.edu (msgapp11.utad.utoledo.edu [131.183.3.7]) by dlpint01.utoledo.edu (RSA Interceptor) for ; Thu, 29 Aug 2013 14:18:00 -0400 Received: from [192.168.1.65] (76.238.196.183) by Email.Utoledo.Edu (131.183.3.18) with Microsoft SMTP Server (TLS) id 14.2.328.9; Thu, 29 Aug 2013 14:18:00 -0400 Message-ID: <521F9037.40205@UToledo.edu> Date: Thu, 29 Aug 2013 14:17:27 -0400 From: Robert Burmeister User-Agent: 
Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.28) Gecko/20120306 Thunderbird/3.1.20 To: Subject: dirhash and dynamic memory allocation References: 4E982C0E.2060900@quip.cz X-Originating-IP: [76.238.196.183] X-RSA-Inspected: yes X-RSA-Classifications: public X-RSA-Action: allow MIME-Version: 1.0 Content-Type: text/plain; charset="UTF-8" X-Content-Filtered-By: Mailman/MimeDel 2.1.14 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 29 Aug 2013 18:18:22 -0000 Could you check the list[1] http://freebsd.1045724.n5.nabble.com/Suggest-changing-dirhash-defaults- for-FreeBSD-9-2-td5839351.html and contribute any benchmarks that you know of that are more recent than 2008? References 1. http://freebsd.1045724.n5.nabble.com/Suggest-changing-dirhash-defaults-for-FreeBSD-9-2-td5839351.html From owner-freebsd-fs@FreeBSD.ORG Thu Aug 29 18:36:15 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id 68EF3DD6 for ; Thu, 29 Aug 2013 18:36:15 +0000 (UTC) (envelope-from Robert.Burmeister@utoledo.edu) Received: from smtpin1.utoledo.edu (smtpin1.utoledo.edu [131.183.2.213]) (using TLSv1 with cipher RC4-SHA (128/128 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 0FDF92F66 for ; Thu, 29 Aug 2013 18:36:14 +0000 (UTC) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AqkAAGqTH1KDtwIWl2dsb2JhbABagzxRgmBHvTuBCxYOAQEBAQEIFgc8ghsKAQUjDwEgNg4BFgIFFgsCAgkDAgECAT8GBg0IAQGHawMPDKckiC4XTYhsgSmOaIJSgTQDmSKEfY52gg4 X-IronPort-AV: E=Sophos;i="4.89,984,1367985600"; d="scan'208,217";a="281910837" Received: from dlpint00.utoledo.edu ([131.183.2.22]) by smtpin1.utoledo.edu with 
ESMTP/TLS/DHE-RSA-AES256-SHA; 29 Aug 2013 14:36:13 -0400 Received: from MsgApp11.utad.utoledo.edu (msgapp11.utad.utoledo.edu [131.183.3.7]) by dlpint00.utoledo.edu (RSA Interceptor) for ; Thu, 29 Aug 2013 14:35:58 -0400 Received: from [192.168.1.65] (76.238.196.183) by Email.Utoledo.Edu (131.183.3.18) with Microsoft SMTP Server (TLS) id 14.2.328.9; Thu, 29 Aug 2013 14:35:57 -0400 Message-ID: <521F946D.9060805@UToledo.edu> Date: Thu, 29 Aug 2013 14:35:25 -0400 From: Robert Burmeister User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.28) Gecko/20120306 Thunderbird/3.1.20 To: Subject: dirhash default maxmem and reclaimage values References: 20111014135220.GA73637@icarus.home.lan X-Originating-IP: [76.238.196.183] X-RSA-Inspected: yes X-RSA-Classifications: public X-RSA-Action: allow MIME-Version: 1.0 Content-Type: text/plain; charset="UTF-8" X-Content-Filtered-By: Mailman/MimeDel 2.1.14 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 29 Aug 2013 18:36:15 -0000 Would people with experience tuning dirhash check the list[1] http://freebsd.1045724.n5.nabble.com/Suggest-changing-dirhash-defaults-for-FreeBSD-9-2-td5839351.html and contribute any benchmarks that they have that are more recent than 2008? In question are: Should vfs.ufs.dirhash_reclaimage be increased, and if so, to what? Should vfs.ufs.dirhash_maxmem scale with kernel memory rather than RAM? (Autotuning maxmem on low RAM machines makes maxmem too small, and on large RAM machines scavenging 10% of maxmem on vm_lowmem events becomes too aggressive.) References 1.
http://freebsd.1045724.n5.nabble.com/Suggest-changing-dirhash-defaults-for-FreeBSD-9-2-td5839351.html From owner-freebsd-fs@FreeBSD.ORG Thu Aug 29 20:23:15 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id D8271592 for ; Thu, 29 Aug 2013 20:23:15 +0000 (UTC) (envelope-from sfourman@gmail.com) Received: from mail-ve0-x234.google.com (mail-ve0-x234.google.com [IPv6:2607:f8b0:400c:c01::234]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 96E8026B1 for ; Thu, 29 Aug 2013 20:23:15 +0000 (UTC) Received: by mail-ve0-f180.google.com with SMTP id pb11so702070veb.25 for ; Thu, 29 Aug 2013 13:23:14 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=aYU7bBI2dtqC/i4iIBWqTDUPuKjI5oed7fIvCstl79k=; b=lh4B2wxCIya4IIpuqpwP1QYbKsU77yQq8Ylm9UW1Q3M/rej0XiJPFKsywwjaY5fPTy y5JZi/V9xuS1sdGGk3JLEFb75WwkP2zN1dElMii2SwXBozxP4smYuP8ckuP7ALr3Sjoi Rez4jjwYriiGAI9LqYUzPXX3bYlK3mtNJ207dWAwc6Xps3jd67AArfVAcTw6hhL869Tm Ul1Kvfg1Wwoe2sdVG5fKUr+U7qtn/pOTumyTd+ZSjkfBfEsd7vGIWssnOmS3FdHK3eKG U/X4nkTube4cmXBSBTwyLJZrIW9C6afvEmkclJYW54JYof5SfTfQSEyyOxGvc/QK7oSw htTg== MIME-Version: 1.0 X-Received: by 10.58.108.8 with SMTP id hg8mr4227046veb.6.1377807794747; Thu, 29 Aug 2013 13:23:14 -0700 (PDT) Received: by 10.220.96.78 with HTTP; Thu, 29 Aug 2013 13:23:14 -0700 (PDT) In-Reply-To: <1079189088.15197172.1377784177805.JavaMail.root@uoguelph.ca> References: <1079189088.15197172.1377784177805.JavaMail.root@uoguelph.ca> Date: Thu, 29 Aug 2013 16:23:14 -0400 Message-ID: Subject: Re: NFS on ZFS pure SSD pool From: "Sam Fourman Jr." 
To: Rick Macklem Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.14 Cc: FreeBSD FS X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 29 Aug 2013 20:23:15 -0000

> If this is contributing to the heavy load, I'd suspect "nfsstat -e -s"
> to show a large count for Getattr. (I vaguely remember that NFSv4 mounts
> were in use. The counts of everything are larger for NFSv4, since they
> are counts of operations and not RPCs. Each NFSv4 RPC is a compound made
> up of N operations.)
>
> Just something that might be worth looking at, rick

root@students:/users # nfsstat -e -s

Server Info:
  Getattr   Setattr    Lookup  Readlink      Read     Write    Create    Remove
106273793   1417764  19593633     12021   2497674   7927757   1047249    772450
   Rename      Link   Symlink     Mkdir     Rmdir   Readdir  RdirPlus    Access
   319284       924     13813     63500     20980    526257         0 677005862
    Mknod    Fsstat    Fsinfo  PathConf    Commit   LookupP   SetClId SetClIdCf
        0   5218418       445       445   2899915         0         0         0
     Open  OpenAttr OpenDwnGr  OpenCfrm DelePurge   DeleRet     GetFH      Lock
        0         0         0         0         0         0         0         0
    LockT     LockU     Close    Verify   NVerify     PutFH  PutPubFH PutRootFH
        0         0         0         0         0         0         0         0
    Renew RestoreFH    SaveFH   Secinfo RelLckOwn  V4Create
        0         0         0         0         0         0
Server:
Retfailed    Faults   Clients
        0         0         0
OpenOwner     Opens LockOwner     Locks    Delegs
        0         0         0         0         0
Server Cache Stats:
   Inprog      Idem  Non-idem    Misses CacheSize   TCPPeak
        0         0         0 826136189    952954    150002

-- Sam Fourman Jr.
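
To put the counters above in proportion, here is a small sh sketch that sums the non-zero operation counts from this nfsstat output and prints the share taken by Access and Getattr. The numbers are copied by hand from the output above, the share helper is a made-up name for this example, and integer division truncates the percentages.

```shell
#!/bin/sh
# Counters copied from the "Server Info" section of the nfsstat output above.
getattr=106273793
access=677005862
# Remaining non-zero counters, in order: Setattr Lookup Readlink Read Write
# Create Remove Rename Link Symlink Mkdir Rmdir Readdir Fsstat Fsinfo
# PathConf Commit
others="1417764 19593633 12021 2497674 7927757 1047249 772450
319284 924 13813 63500 20980 526257 5218418 445 445 2899915"

# Sum every counter to get the total number of operations served.
total=$(( getattr + access ))
for n in $others; do
    total=$(( total + n ))
done

# Hypothetical helper: integer percentage of WHOLE that PART represents.
share() {
    if [ "$2" -le 0 ]; then
        echo "n/a"
        return 1
    fi
    echo $(( $1 * 100 / $2 ))
}

echo "total operations: $total"
echo "Access:  $(share "$access" "$total")%"
echo "Getattr: $(share "$getattr" "$total")%"
```

On these figures Access and Getattr together account for well over 90% of the operations, which would fit the theory that the clients are mostly polling attributes. The NFSv4-only counters (Open, Close, PutFH and so on) are all zero, so the load here appears to come from NFSv3 mounts rather than NFSv4.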
From owner-freebsd-fs@FreeBSD.ORG Thu Aug 29 20:53:27 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id 26F61DDF for ; Thu, 29 Aug 2013 20:53:27 +0000 (UTC) (envelope-from berend@pobox.com) Received: from smtp.pobox.com (b-pb-sasl-quonix.pobox.com [208.72.237.35]) by mx1.freebsd.org (Postfix) with ESMTP id D60BD286C for ; Thu, 29 Aug 2013 20:53:26 +0000 (UTC) Received: from smtp.pobox.com (unknown [127.0.0.1]) by b-sasl-quonix.pobox.com (Postfix) with ESMTP id 2253F3DE73; Thu, 29 Aug 2013 20:53:18 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=pobox.com; h=date :message-id:from:to:cc:subject:in-reply-to:references :mime-version:content-type:content-transfer-encoding; s=sasl; bh=3e+Ymb67cB0vioZI2OAETUrLMAc=; b=D4q4UoTQxS+fDVMuw+Ie9JYHcuxP 3Dzr/4dTIsvHgoVNWGXoWq3pZRNlNEEmmvz4pjyCcE6kC8xYXM44xecytBa8Kroc LVL1Vph1fTc1EyjdkY28d+plo+2qa68VYTAkiA6bY/897/3cN1aYZf81/DAQz0dC DBBnkfAm4iqVWW4= DomainKey-Signature: a=rsa-sha1; c=nofws; d=pobox.com; h=date:message-id :from:to:cc:subject:in-reply-to:references:mime-version :content-type:content-transfer-encoding; q=dns; s=sasl; b=vdNRMn WTp2eHD1/i9xnJyPg3sk1mdNg5de5ursHCT6CA7tXc18B6ma3UYsrgDSQuZILe7y 7/naXM2AjDJcnZa69kUrCTGW/1I94qJ458kG92XGMzoH0bLfYOe8vyVNhgd3PhYw z54+kTwtAEHb/g5gxKzqcgipv4+tGQWTEKe5Q= Received: from b-pb-sasl-quonix.pobox.com (unknown [127.0.0.1]) by b-sasl-quonix.pobox.com (Postfix) with ESMTP id 155EF3DE72; Thu, 29 Aug 2013 20:53:18 +0000 (UTC) Received: from bmach.nederware.nl (unknown [27.252.231.117]) by b-sasl-quonix.pobox.com (Postfix) with ESMTPA id 6C5B43DE70; Thu, 29 Aug 2013 20:53:17 +0000 (UTC) Received: from quadrio.nederware.nl (quadrio.nederware.nl [192.168.33.13]) by bmach.nederware.nl (Postfix) with ESMTP id 81C565C57; Fri, 30 Aug 2013 08:53:15 +1200 (NZST) Received: 
from quadrio.nederware.nl (quadrio.nederware.nl [127.0.0.1]) by quadrio.nederware.nl (Postfix) with ESMTP id 6287E4042FCD; Fri, 30 Aug 2013 08:53:15 +1200 (NZST) Date: Fri, 30 Aug 2013 08:53:15 +1200 Message-ID: <877gf4cj7o.wl%berend@pobox.com> From: Berend de Boer To: Freddie Cash Subject: Re: NFS on ZFS pure SSD pool In-Reply-To: References: <1079189088.15197172.1377784177805.JavaMail.root@uoguelph.ca> User-Agent: Wanderlust/2.15.9 (Almost Unreal) SEMI-EPG/1.14.7 (Harue) FLIM/1.14.9 (=?UTF-8?B?R29qxY0=?=) APEL/10.8 EasyPG/1.0.0 Emacs/24.3 (i686-pc-linux-gnu) MULE/6.0 (HANACHIRUSATO) Organization: Xplain Technology Ltd MIME-Version: 1.0 (generated by SEMI-EPG 1.14.7 - "Harue") Content-Type: multipart/signed; boundary="pgp-sign-Multipart_Fri_Aug_30_08:53:14_2013-1"; micalg=pgp-sha256; protocol="application/pgp-signature" Content-Transfer-Encoding: 7bit X-Pobox-Relay-ID: 07BF92F0-10ED-11E3-B9B9-CA9B8506CD1E-48001098!b-pb-sasl-quonix.pobox.com Cc: FreeBSD FS X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 29 Aug 2013 20:53:27 -0000 --pgp-sign-Multipart_Fri_Aug_30_08:53:14_2013-1 Content-Type: text/plain; charset=US-ASCII >>>>> "Freddie" == Freddie Cash writes: Freddie> This will show you how many bytes of the ARC are being Freddie> used by metadata: sysctl vfs.zfs.arc_meta_used Freddie> You can tune what the max is via: sysctl Freddie> vfs.zfs.arc_meta_limit Strangely enough on my system arc_meta_used = 2GB, while arc_meta_limit = 1.5G. Wouldn't you expect that arc_meta_used would be lower than this limit? 
-- All the best, Berend de Boer ------------------------------------------------------ Awesome Drupal hosting: https://www.xplainhosting.com/ --pgp-sign-Multipart_Fri_Aug_30_08:53:14_2013-1 Content-Type: application/pgp-signature Content-Transfer-Encoding: 7bit Content-Description: OpenPGP Digital Signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.17 (GNU/Linux) iQIcBAABCAAGBQJSH7S6AAoJEKOfeD48G3g5oZ0QALi0G4zhgg2ZYsviFv4ulu5w g2csNeYUvRYI50stI50f9YpvzsSf8S2QGpIgsweAeFGrbNLmRQ+rEXGxnLw4soLp Pu4LXiekKbHflqvvBrMyXSEP/Yo6uKc57pzjstPuuSZJCeQydCbOfQ9FOAMphLVn J8nHArsdTiwYOaK/gYMKY7vLpDLnTG4GOB6/69vySIYuYEhCT/1LnjFmbehfXj11 KSsFsVLJUsaCBcYom2onwiZrHAXoDFO0wIVuMrHrt/vZ2SaHoXik8ZQxprRvBapx M1yp3ORFvW8StmMHlLfeb9i3PGjnNX//n0RhpkrF7Lq+AHRm+1ani+LftVPzwKSx b1Rsp3wdo4xLoBSuC6dOTwzuiPgWFYL3XenrBpghA2lESyaEbnkYpFgl7P6Xlmo/ SMiXpOf3GPzA2Oj9ZN83wntQfV3ABjlml6lcRqymmRF0KeybRUBOlSjMM7tycJaf m5sfFip9m9cnx5iTgOHdvJO6LzbVxI6PrXzc4Os9wqquIdkNjyhVwmloMliip3QQ xHOw6iQ5H6NLsetO3qsZrUbbkpvQBr6es25kIdUh9eUQG1pC8TLf53lfcGdq16Zj GIJFYe4qy5/D6mj/IvibmHihUHBdCxnzgkO5frkk83I0vEO/lqy7nGPS4Q5e6KAj fsbC3ULZ2ae6z84Oxe2k =gMzg -----END PGP SIGNATURE----- --pgp-sign-Multipart_Fri_Aug_30_08:53:14_2013-1-- From owner-freebsd-fs@FreeBSD.ORG Thu Aug 29 21:38:15 2013 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id 7E3BAD06; Thu, 29 Aug 2013 21:38:15 +0000 (UTC) (envelope-from spork@bway.net) Received: from smtp2.bway.net (smtp2.v6.bway.net [IPv6:2607:d300:1::28]) (using TLSv1 with cipher ADH-CAMELLIA256-SHA (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 4C1BD2B78; Thu, 29 Aug 2013 21:38:15 +0000 (UTC) Received: from toasty.sporklab.com (foon.sporktines.com [96.57.144.66]) (using TLSv1 with cipher AES128-SHA (128/128 bits)) (No client certificate requested) 
(Authenticated sender: spork@bway.net) by smtp2.bway.net (Postfix) with ESMTPSA id 6EA0495875; Thu, 29 Aug 2013 17:38:07 -0400 (EDT) References: <521F05F0.4090607@cloverinformatica.it> <521F0DEB.20408@FreeBSD.org> In-Reply-To: <521F0DEB.20408@FreeBSD.org> Mime-Version: 1.0 (Apple Message framework v1085) Content-Type: text/plain; charset=us-ascii Message-Id: Content-Transfer-Encoding: quoted-printable From: Charles Sprickman Subject: Re: Boot problem if a ZFS log device is missing Date: Thu, 29 Aug 2013 17:38:06 -0400 To: Andriy Gapon X-Mailer: Apple Mail (2.1085) Cc: freebsd-fs@FreeBSD.org, freebsd-stable@FreeBSD.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 29 Aug 2013 21:38:15 -0000 On Aug 29, 2013, at 5:01 AM, Andriy Gapon wrote: > on 29/08/2013 11:27 Maurizio Vairani said the following: >> >> I am able to boot the PC without a cache device but not without a log device. Why ? > > The log could potentially contain uncommitted entries. Without the log device > there is no knowing if it did or did not. And if it did then the pool is > inconsistent state without the log device and so it can not be imported. If one is willing to accept that data is lost (like the log device is totally smoked), is there a way to boot knowing that you may have some data loss, or is the only option to boot alternate media and force a pool import (assuming that works without the log device)? Charles > > The cache is not persistent and so there is nothing needed from it upon a boot.
> > -- > Andriy Gapon > _______________________________________________ > freebsd-fs@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-fs > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" From owner-freebsd-fs@FreeBSD.ORG Thu Aug 29 21:56:46 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id 00CD764E for ; Thu, 29 Aug 2013 21:56:45 +0000 (UTC) (envelope-from fjwcash@gmail.com) Received: from mail-qc0-x234.google.com (mail-qc0-x234.google.com [IPv6:2607:f8b0:400d:c01::234]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id B56DA2D3A for ; Thu, 29 Aug 2013 21:56:45 +0000 (UTC) Received: by mail-qc0-f180.google.com with SMTP id l13so481293qcy.25 for ; Thu, 29 Aug 2013 14:56:44 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=58BZGNunMJEKjfb5xK0lllAZdFvBhXjKVRPuGmSCOmY=; b=zNx2C2LhjJpbWWtGy8cogpDw/5Cv6fQPUo7QbFobpTzstjDJ7KAbNthPsvWKYp1CJi TRdCLDC3BDemEsvYEJoAW6wXkQF2HMmNGMEKjfTd+irfG7G4+2p/dneLrw9mnE/4NZvv qDW3s22GpAIeQGoLMl1poCr/mNOascy2zp3yJdienBZM+5vv4zkPmfi4E5Oy99C+q1AF hTl7GZ+CDsJg7fkyDyT1K+R3FJYxAyIJ1A7RjPsaDu3DUmO6JiNV2XXJ0luSTvL5FVsf 0GT5OW5/8lSWyijk4WXWf9Q4pduEhUcJ5xZ2ckIgWrzY9L8CTMSPpDIKtqMim/Wtx6vr Rs4w== MIME-Version: 1.0 X-Received: by 10.49.6.99 with SMTP id z3mr6658524qez.27.1377813404857; Thu, 29 Aug 2013 14:56:44 -0700 (PDT) Received: by 10.49.1.98 with HTTP; Thu, 29 Aug 2013 14:56:44 -0700 (PDT) In-Reply-To: <877gf4cj7o.wl%berend@pobox.com> References: <1079189088.15197172.1377784177805.JavaMail.root@uoguelph.ca> <877gf4cj7o.wl%berend@pobox.com> Date: Thu, 29 Aug 2013 14:56:44 -0700
Message-ID: Subject: Re: NFS on ZFS pure SSD pool From: Freddie Cash To: Berend de Boer Content-Type: text/plain; charset=UTF-8 X-Content-Filtered-By: Mailman/MimeDel 2.1.14 Cc: FreeBSD FS X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 29 Aug 2013 21:56:46 -0000 On Thu, Aug 29, 2013 at 1:53 PM, Berend de Boer wrote: > >>>>> "Freddie" == Freddie Cash writes: > > Freddie> This will show you how many bytes of the ARC are being > Freddie> used by metadata: sysctl vfs.zfs.arc_meta_used > > Freddie> You can tune what the max is via: sysctl > Freddie> vfs.zfs.arc_meta_limit > > Strangely enough on my system arc_meta_used = 2GB, while > arc_meta_limit = 1.5G. > > Wouldn't you expect that arc_meta_used would be lower than this limit? > You would think so. Maybe this is not a hard limit, but a high-water mark that leads to an eviction process running? Don't know, I'm not at all versed in the inner workings of ZFS. :) I do know other vfs.zfs.*limit tunables were high-water marks and work was underway to turn them into hard limits. Maybe this needs to be added to that list? Maybe it's not worth worrying about? 
-- Freddie Cash fjwcash@gmail.com From owner-freebsd-fs@FreeBSD.ORG Thu Aug 29 21:58:51 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id 43264709 for ; Thu, 29 Aug 2013 21:58:51 +0000 (UTC) (envelope-from fjwcash@gmail.com) Received: from mail-qe0-x22a.google.com (mail-qe0-x22a.google.com [IPv6:2607:f8b0:400d:c02::22a]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 01FE62D6A for ; Thu, 29 Aug 2013 21:58:50 +0000 (UTC) Received: by mail-qe0-f42.google.com with SMTP id w7so552470qeb.15 for ; Thu, 29 Aug 2013 14:58:50 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=YcA6LtD3vN+9+Vc/1o1TiQL3AyvEm43Rg1PRAWmLvr0=; b=dPIbMQI1sm9RzvzJJngPDHJyek1CfCOFY2tdtvqq2uIM8VSE5tOKKiyeKQuk7JCP5+ Kw/wIv+bsO2NSz2yaoG8tOd1QCf5SiqluqIKpHKYXJFtdfso8Kn7A0M35ATH2K5oiwh+ jLy+3dlvL/+Ju743Y8qven+QVvraxG3eqoH8k9qvCVGaJ7lv1/CjIeT8q8/U2uCwK8aO jV+NhFUjQa5qYmb5zUYzrd8PFBMvBEzjxJcaFvxsyFBYdklqzJN9ug1aHP9sLrybyL3Y Nu0lL7T4nXMuOy+/X3gA5A/EeTFrXnHW45rRHtUHIP7ZeWPrxzHLXN5f/F8Z3Q6cPFGU isvg== MIME-Version: 1.0 X-Received: by 10.49.6.99 with SMTP id z3mr6666472qez.27.1377813530238; Thu, 29 Aug 2013 14:58:50 -0700 (PDT) Received: by 10.49.1.98 with HTTP; Thu, 29 Aug 2013 14:58:50 -0700 (PDT) In-Reply-To: References: <1079189088.15197172.1377784177805.JavaMail.root@uoguelph.ca> <877gf4cj7o.wl%berend@pobox.com> Date: Thu, 29 Aug 2013 14:58:50 -0700 Message-ID: Subject: Re: NFS on ZFS pure SSD pool From: Freddie Cash To: Eric Browning Content-Type: text/plain; charset=UTF-8 X-Content-Filtered-By: Mailman/MimeDel 2.1.14 Cc: FreeBSD FS X-BeenThere: freebsd-fs@freebsd.org 
X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 29 Aug 2013 21:58:51 -0000 On Thu, Aug 29, 2013 at 2:55 PM, Eric Browning < ericbrowning@skaggscatholiccenter.org> wrote: > sysctl vfs.zfs.arc_meta_used yeilds: > > vfs.zfs.arc_meta_used: 6916698888 > > There is 56GB of RAM dedicated to ARC > So that's using the full 12.5% (one eighth) of the ARC. Depending on how efficient your ARC is (high data hit rates, low data hit rates, low actual memory usage, etc), you may want to test increasing this to see if it helps with the NFS cache hit rates stuff. From owner-freebsd-fs@FreeBSD.ORG Thu Aug 29 22:03:52 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id EBD248AE for ; Thu, 29 Aug 2013 22:03:52 +0000 (UTC) (envelope-from ericbrowning@skaggscatholiccenter.org) Received: from mail-pa0-f52.google.com (mail-pa0-f52.google.com [209.85.220.52]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id C12542DC3 for ; Thu, 29 Aug 2013 22:03:52 +0000 (UTC) Received: by mail-pa0-f52.google.com with SMTP id kq13so1473494pab.25 for ; Thu, 29 Aug 2013 15:03:46 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:in-reply-to:references:date :message-id:subject:from:to:cc:content-type; bh=/fYacKA+TVOK3XsaefkbQxy8BcLu5B4tvpeCeYNgkmY=; b=Djq2OvVUiSYSVd3tLkc3TE6MX9rIORj/2XHh7w9oRtsrL05Rf0SrFjmc9VfiX2udXt /FOgr77whzNqmZC36PC97Qso/de+85fwlwQXsa/nn0oHPQyxshs372o9A4/x1qS0JeSo wLwUk/H52/HtqLIyzVTDLUhtGfM/kcGWvT/8dvH19k9CeOO4qOLa02W0e078FSJQygQ/ 5t4ZvHO97vWmutn3eaCUCvDccCOECuHFkJb+ho+PewZpKmOCClFTBCJqb8z9xiIaqH4z 
uIzXugaxCIsg22B2IIQ52f35/FTvCIa256XSLRysKApe4eX4gwUfveXj6Z2tFuIgqFQX qSUA== X-Gm-Message-State: ALoCoQkBWWl90uazZnVjTLabdGP9nmhJHd5TacVQtbRTr+250ZL6BNlEkhlr5t2HibVwOQTpxNaR MIME-Version: 1.0 X-Received: by 10.68.236.168 with SMTP id uv8mr6076135pbc.124.1377813826143; Thu, 29 Aug 2013 15:03:46 -0700 (PDT) Received: by 10.70.26.4 with HTTP; Thu, 29 Aug 2013 15:03:46 -0700 (PDT) In-Reply-To: References: <1079189088.15197172.1377784177805.JavaMail.root@uoguelph.ca> <877gf4cj7o.wl%berend@pobox.com> Date: Thu, 29 Aug 2013 16:03:46 -0600 Message-ID: Subject: Re: NFS on ZFS pure SSD pool From: Eric Browning To: Freddie Cash Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.14 Cc: FreeBSD FS X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 29 Aug 2013 22:03:53 -0000 Oops, forgot to include the limit value. sysctl vfs.zfs.arc_meta_limit yields: vfs.zfs.arc_meta_limit: 14217216000 So it is using about 7 GB against a limit of about 14 GB. On Thu, Aug 29, 2013 at 3:58 PM, Freddie Cash wrote: > On Thu, Aug 29, 2013 at 2:55 PM, Eric Browning < > ericbrowning@skaggscatholiccenter.org> wrote: > >> sysctl vfs.zfs.arc_meta_used yields: >> >> vfs.zfs.arc_meta_used: 6916698888 >> >> There is 56GB of RAM dedicated to ARC >> > > So that's using the full 12.5% (one eighth) of the ARC. Depending on how > efficient your ARC is (high data hit rates, low data hit rates, low actual > memory usage, etc), you may want to test increasing this to see if it helps > with the NFS cache hit rates stuff.
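[Editorial sketch, not part of the original message.] The two sysctl values reported in this message can be compared with a few lines of integer arithmetic. The helper below is a minimal illustration; meta_used_pct_x100 is a made-up name, not a FreeBSD or ZFS interface, and the input values are simply the ones posted above.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Express `used` as hundredths of a percent of `limit`, using integer
 * arithmetic only. Hypothetical helper for illustration; it is not part
 * of any FreeBSD or ZFS API.
 */
static uint64_t
meta_used_pct_x100(uint64_t used, uint64_t limit)
{
	assert(limit != 0);	/* refuse to divide by zero */
	/* used * 10000 stays well within 64 bits for realistic ARC sizes */
	return (used * 10000u / limit);
}
```

With the posted numbers, meta_used_pct_x100(6916698888, 14217216000) returns 4865, i.e. about 48.65% of vfs.zfs.arc_meta_limit. If "56GB" is read as 56 * 10^9 bytes (an assumption; the message does not say), the same usage is about 12.35% of the ARC, which is close to the one eighth mentioned in the quoted reply.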
> > -- Eric Browning Systems Administrator 801-984-7623 Skaggs Catholic Center Juan Diego Catholic High School Saint John the Baptist Middle Saint John the Baptist Elementary From owner-freebsd-fs@FreeBSD.ORG Thu Aug 29 22:04:17 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id 8ED6D925 for ; Thu, 29 Aug 2013 22:04:17 +0000 (UTC) (envelope-from ericbrowning@skaggscatholiccenter.org) Received: from mail-pd0-f178.google.com (mail-pd0-f178.google.com [209.85.192.178]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 6624C2DCC for ; Thu, 29 Aug 2013 22:04:17 +0000 (UTC) Received: by mail-pd0-f178.google.com with SMTP id w10so1017749pde.9 for ; Thu, 29 Aug 2013 15:04:10 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:in-reply-to:references:date :message-id:subject:from:to:cc:content-type; bh=odu1/mWVIMAeTXzwADvnEanr/GCWXyWHyw/UCm7ymq0=; b=AUdTnKVPrEHhTa9qus757PPb2ETiaEIrAZSCAaevPQg/qSDIcbmvus2fBTfxqqPP8m iB+IRoYjXTpMdmby/tW2UdZUKF0DDYuQu662dZmprsQOENHWT2icd1T+ZZylCp6kJZ/2 rph0D1WsVRKAv09x1jEwELkJDKvGytgnZnMNpab8CGo88Mw7uSr9t8stPGvXnTWdTMmJ Dnp4eohin9rj4wTltbZFPUfi7jT4cRYl4kOKsjG84x+HxCbJ1MWV2U6DqHqf1rNoviE4 zwId/FrUqFCK8GjOmrgEo+8FGYU7gSnykx/+e8aS0kNwaVQzkP56bwpqT8v8OhygCFdT uG3A== X-Gm-Message-State: ALoCoQnpFNFHfIMMV4ktTj06ffxtqvsEB5ojXyUc1CTAl2LJRH2U/EUo9ddx0gpHL6eSNUP86mo6 MIME-Version: 1.0 X-Received: by 10.66.121.131 with SMTP id lk3mr6929120pab.43.1377813352377; Thu, 29 Aug 2013 14:55:52 -0700 (PDT) Received: by 10.70.26.4 with HTTP; Thu, 29 Aug 2013 14:55:52 -0700 (PDT) In-Reply-To: <877gf4cj7o.wl%berend@pobox.com> References: 
<1079189088.15197172.1377784177805.JavaMail.root@uoguelph.ca> <877gf4cj7o.wl%berend@pobox.com> Date: Thu, 29 Aug 2013 15:55:52 -0600 Message-ID: Subject: Re: NFS on ZFS pure SSD pool From: Eric Browning To: Berend de Boer Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.14 Cc: FreeBSD FS X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 29 Aug 2013 22:04:17 -0000 sysctl vfs.zfs.arc_meta_used yeilds: vfs.zfs.arc_meta_used: 6916698888 There is 56GB of RAM dedicated to ARC On Thu, Aug 29, 2013 at 2:53 PM, Berend de Boer wrote: > >>>>> "Freddie" == Freddie Cash writes: > > Freddie> This will show you how many bytes of the ARC are being > Freddie> used by metadata: sysctl vfs.zfs.arc_meta_used > > Freddie> You can tune what the max is via: sysctl > Freddie> vfs.zfs.arc_meta_limit > > Strangely enough on my system arc_meta_used = 2GB, while > arc_meta_limit = 1.5G. > > Wouldn't you expect that arc_meta_used would be lower than this limit? 
> > -- > All the best, > > Berend de Boer > > > ------------------------------------------------------ > Awesome Drupal hosting: https://www.xplainhosting.com/ > > -- Eric Browning Systems Administrator 801-984-7623 Skaggs Catholic Center Juan Diego Catholic High School Saint John the Baptist Middle Saint John the Baptist Elementary From owner-freebsd-fs@FreeBSD.ORG Thu Aug 29 22:21:44 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id F0627B41 for ; Thu, 29 Aug 2013 22:21:43 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from esa-annu.net.uoguelph.ca (esa-annu.mail.uoguelph.ca [131.104.91.36]) by mx1.freebsd.org (Postfix) with ESMTP id 8643B2ED5 for ; Thu, 29 Aug 2013 22:21:42 +0000 (UTC) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: ArkEAGzIH1KDaFve/2dsb2JhbABaFoMmUYMnvBaBD4E5dIIkAQEFIwRSGw4KERkCBFUGLodmDKcxkgSPQBkWBQeCaIE0A5AiiQCQN4M8IIFu X-IronPort-AV: E=Sophos;i="4.89,985,1367985600"; d="scan'208";a="47422995" Received: from muskoka.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca) ([131.104.91.222]) by esa-annu.net.uoguelph.ca with ESMTP; 29 Aug 2013 18:21:41 -0400 Received: from zcs3.mail.uoguelph.ca (localhost.localdomain [127.0.0.1]) by zcs3.mail.uoguelph.ca (Postfix) with ESMTP id A8824B3F41; Thu, 29 Aug 2013 18:21:41 -0400 (EDT) Date: Thu, 29 Aug 2013 18:21:41 -0400 (EDT) From: Rick Macklem To: Konstantin Belousov Message-ID: <2075428996.15437999.1377814901677.JavaMail.root@uoguelph.ca> In-Reply-To: <20130829005616.GH4972@kib.kiev.ua> Subject: Re: fixing "umount -f" for the NFS client MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="----=_Part_15437997_1305582011.1377814901674" X-Originating-IP: [172.17.91.203] X-Mailer: Zimbra 7.2.1_GA_2790 (ZimbraWebClient - FF3.0 (Win)/7.2.1_GA_2790) Cc: freebsd-fs 
X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 29 Aug 2013 22:21:44 -0000 ------=_Part_15437997_1305582011.1377814901674 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit Kostik wrote: > On Wed, Aug 28, 2013 at 08:15:27PM -0400, Rick Macklem wrote: > > I've been doing a little more testing of "umount -f" for NFS > > mounts and they seem to be working unless some other process/thread > > has busied the file system via vfs_busy(). > > > > Unfortunately, it is pretty easy to vfs_busy() the file system > > by using a command like "df" that is stuck on the unresponsive > > NFS server. > > > > The problem seems to be that dounmount() msleep()s while > > mnt_lockref != 0 before calling VFS_UNMOUNT(). > > > > If some call into the NFS client was done before this > > while (mp->mnt_lockref) loop with msleep() in it, it > > can easily kill off RPCs in progress. (It currently > > does this in nfs_unmount() using the newnfs_nmcancelreqs() > > call. > > > > In summary: > > - Would it be appropriate to add a new vfs_XXX method that > > dounmount() would call before the while() loop for the > > forced dismount case? > > (The default would be a no-op and I have no idea if any > > file system other than NFS would have a use for it?) > > Alternately, there could be a function pointer set non-NULL > > that would specifically be used by the NFS client for this. > > This would avoid adding a vfs_XXX() method, but would mean > > an NFS specific call ends up in the generic dounmount() code. > > > > Anyone have comments on this? > > > Yes, I do. I agree with adding the pre-unmount vfs method. > This seems to be the cleanest solution possible. > I've attached a patch. It is also at http://people.freebsd.org/~rmacklem/forced-dism.patch in case the attachment gets lost. 
I don't really like doing the MNT_IUNLOCK(), MNT_ILOCK() before/after the VFS_KILLIO() call, but I couldn't see any better way to do it and it looks safe to do so, at least for the forced case. I assume I would also need to bump __FreeBSD_version (and maybe VFS_VERSION?). Please review it. If anyone would like to test it, please do so. Thanks, rick ------=_Part_15437997_1305582011.1377814901674 Content-Type: text/x-patch; name=forced-dism.patch Content-Disposition: attachment; filename=forced-dism.patch Content-Transfer-Encoding: base64 LS0tIHN5cy9tb3VudC5oLnNhdgkyMDEzLTA4LTI5IDE1OjE5OjQwLjAwMDAwMDAwMCAtMDQwMAor Kysgc3lzL21vdW50LmgJMjAxMy0wOC0yOSAxNToyMzo1OS4wMDAwMDAwMDAgLTA0MDAKQEAgLTYw OSw2ICs2MDksNyBAQCB0eXBlZGVmIGludCB2ZnNfc3lzY3RsX3Qoc3RydWN0IG1vdW50ICptCiAJ CSAgICBzdHJ1Y3Qgc3lzY3RsX3JlcSAqcmVxKTsKIHR5cGVkZWYgdm9pZCB2ZnNfc3VzcF9jbGVh bl90KHN0cnVjdCBtb3VudCAqbXApOwogdHlwZWRlZiB2b2lkIHZmc19ub3RpZnlfbG93ZXJ2cF90 KHN0cnVjdCBtb3VudCAqbXAsIHN0cnVjdCB2bm9kZSAqbG93ZXJ2cCk7Cit0eXBlZGVmIHZvaWQg dmZzX2tpbGxpb190KHN0cnVjdCBtb3VudCAqbXApOwogCiBzdHJ1Y3QgdmZzb3BzIHsKIAl2ZnNf bW91bnRfdAkJKnZmc19tb3VudDsKQEAgLTYyOCw2ICs2MjksNyBAQCBzdHJ1Y3QgdmZzb3BzIHsK IAl2ZnNfc3VzcF9jbGVhbl90CSp2ZnNfc3VzcF9jbGVhbjsKIAl2ZnNfbm90aWZ5X2xvd2VydnBf dAkqdmZzX3JlY2xhaW1fbG93ZXJ2cDsKIAl2ZnNfbm90aWZ5X2xvd2VydnBfdAkqdmZzX3VubGlu a19sb3dlcnZwOworCXZmc19raWxsaW9fdAkJKnZmc19raWxsaW87CiB9OwogCiB2ZnNfc3RhdGZz X3QJX192ZnNfc3RhdGZzOwpAQCAtNzU2LDYgKzc1OCwxNCBAQCB2ZnNfc3RhdGZzX3QJX192ZnNf c3RhdGZzOwogCX0JCQkJCQkJCVwKIH0gd2hpbGUgKDApCiAKKyNkZWZpbmUJVkZTX0tJTExJTyhN UCkgZG8gewkJCQkJCVwKKwlpZiAoKihNUCktPm1udF9vcC0+dmZzX2tpbGxpbyAhPSBOVUxMKSB7 CQkJXAorCQlWRlNfUFJPTE9HVUUoTVApOwkJCQkJXAorCQkoKihNUCktPm1udF9vcC0+dmZzX2tp bGxpbykoTVApOwkJCVwKKwkJVkZTX0VQSUxPR1VFKE1QKTsJCQkJCVwKKwl9CQkJCQkJCQlcCit9 IHdoaWxlICgwKQorCiAjZGVmaW5lIFZGU19LTk9URV9MT0NLRUQodnAsIGhpbnQpIGRvCQkJCQlc CiB7CQkJCQkJCQkJXAogCWlmICgoKHZwKS0+dl92ZmxhZyAmIFZWX05PS05PVEUpID09IDApCQkJ 
CVwKLS0tIGtlcm4vdmZzX21vdW50LmMuc2F2CTIwMTMtMDgtMjkgMTU6MjQ6MzYuMDAwMDAwMDAw IC0wNDAwCisrKyBrZXJuL3Zmc19tb3VudC5jCTIwMTMtMDgtMjkgMTc6MjI6NTEuMDAwMDAwMDAw IC0wNDAwCkBAIC0xMjY5LDggKzEyNjksMTYgQEAgZG91bm1vdW50KG1wLCBmbGFncywgdGQpCiAJ fQogCW1wLT5tbnRfa2Vybl9mbGFnIHw9IE1OVEtfVU5NT1VOVCB8IE1OVEtfTk9JTlNNTlRROwog CS8qIEFsbG93IGZpbGVzeXN0ZW1zIHRvIGRldGVjdCB0aGF0IGEgZm9yY2VkIHVubW91bnQgaXMg aW4gcHJvZ3Jlc3MuICovCi0JaWYgKGZsYWdzICYgTU5UX0ZPUkNFKQorCWlmIChmbGFncyAmIE1O VF9GT1JDRSkgewogCQltcC0+bW50X2tlcm5fZmxhZyB8PSBNTlRLX1VOTU9VTlRGOworCQlNTlRf SVVOTE9DSyhtcCk7CisJCS8qCisJCSAqIE11c3QgYmUgZG9uZSBhZnRlciBzZXR0aW5nIE1OVEtf VU5NT1VOVEYgYW5kIGJlZm9yZQorCQkgKiB3YWl0aW5nIGZvciBtbnRfbG9ja3JlZiB0byBiZWNv bWUgMC4KKwkJICovCisJCVZGU19LSUxMSU8obXApOworCQlNTlRfSUxPQ0sobXApOworCX0KIAll cnJvciA9IDA7CiAJaWYgKG1wLT5tbnRfbG9ja3JlZikgewogCQltcC0+bW50X2tlcm5fZmxhZyB8 PSBNTlRLX0RSQUlOSU5HOwotLS0gZnMvbmZzY2xpZW50L25mc19jbHZmc29wcy5jLnNhdgkyMDEz LTA4LTI5IDE1OjMwOjUwLjAwMDAwMDAwMCAtMDQwMAorKysgZnMvbmZzY2xpZW50L25mc19jbHZm c29wcy5jCTIwMTMtMDgtMjkgMTY6NTI6NTQuMDAwMDAwMDAwIC0wNDAwCkBAIC0xMjAsNiArMTIw LDcgQEAgc3RhdGljIHZmc19yb290X3QgbmZzX3Jvb3Q7CiBzdGF0aWMgdmZzX3N0YXRmc190IG5m c19zdGF0ZnM7CiBzdGF0aWMgdmZzX3N5bmNfdCBuZnNfc3luYzsKIHN0YXRpYyB2ZnNfc3lzY3Rs X3QgbmZzX3N5c2N0bDsKK3N0YXRpYyB2ZnNfa2lsbGlvX3QgbmZzX2tpbGxpbzsKIAogLyoKICAq IG5mcyB2ZnMgb3BlcmF0aW9ucy4KQEAgLTEzNCw2ICsxMzUsNyBAQCBzdGF0aWMgc3RydWN0IHZm c29wcyBuZnNfdmZzb3BzID0gewogCS52ZnNfdW5pbml0ID0JCW5jbF91bmluaXQsCiAJLnZmc191 bm1vdW50ID0JCW5mc191bm1vdW50LAogCS52ZnNfc3lzY3RsID0JCW5mc19zeXNjdGwsCisJLnZm c19raWxsaW8gPQkJbmZzX2tpbGxpbywKIH07CiBWRlNfU0VUKG5mc192ZnNvcHMsIG5mcywgVkZD Rl9ORVRXT1JLIHwgVkZDRl9TQkRSWSk7CiAKQEAgLTE2NzYsNiArMTY3OCwxOSBAQCBuZnNfc3lz Y3RsKHN0cnVjdCBtb3VudCAqbXAsIGZzY3Rsb3BfdCBvCiB9CiAKIC8qCisgKiBLaWxsIG9mZiBh bnkgUlBDcyBpbiBwcm9ncmVzcywgc28gdGhhdCB0aGV5IHdpbGwgYWxsIHJldHVybiBlcnJvcnMu CisgKiBUaGlzIGFsbG93cyBkb3VubW91bnQoKSB0byBjb250aW51ZSBhcyBmYXIgYXMgVkZTX1VO 
TU9VTlQoKSBmb3IgYQorICogZm9yY2VkIGRpc21vdW50LgorICovCitzdGF0aWMgdm9pZAorbmZz X2tpbGxpbyhzdHJ1Y3QgbW91bnQgKm1wKQoreworCXN0cnVjdCBuZnNtb3VudCAqbm1wID0gVkZT VE9ORlMobXApOworCisJbmV3bmZzX25tY2FuY2VscmVxcyhubXApOworfQorCisvKgogICogRXh0 cmFjdCB0aGUgaW5mb3JtYXRpb24gbmVlZGVkIGJ5IHRoZSBubG0gZnJvbSB0aGUgbmZzIHZub2Rl LgogICovCiBzdGF0aWMgdm9pZAo= ------=_Part_15437997_1305582011.1377814901674-- From owner-freebsd-fs@FreeBSD.ORG Thu Aug 29 22:31:34 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id 3B41CCEB for ; Thu, 29 Aug 2013 22:31:34 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from kib.kiev.ua (kib.kiev.ua [IPv6:2001:470:d5e7:1::1]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id A66732F66 for ; Thu, 29 Aug 2013 22:31:33 +0000 (UTC) Received: from tom.home (kostik@localhost [127.0.0.1]) by kib.kiev.ua (8.14.7/8.14.7) with ESMTP id r7TMVSeP021814; Fri, 30 Aug 2013 01:31:28 +0300 (EEST) (envelope-from kostikbel@gmail.com) DKIM-Filter: OpenDKIM Filter v2.8.3 kib.kiev.ua r7TMVSeP021814 Received: (from kostik@localhost) by tom.home (8.14.7/8.14.7/Submit) id r7TMVSR2021813; Fri, 30 Aug 2013 01:31:28 +0300 (EEST) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: tom.home: kostik set sender to kostikbel@gmail.com using -f Date: Fri, 30 Aug 2013 01:31:28 +0300 From: Konstantin Belousov To: Rick Macklem Subject: Re: fixing "umount -f" for the NFS client Message-ID: <20130829223128.GP4972@kib.kiev.ua> References: <20130829005616.GH4972@kib.kiev.ua> <2075428996.15437999.1377814901677.JavaMail.root@uoguelph.ca> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="uFc9xn1y3ioKq896" Content-Disposition: inline 
In-Reply-To: <2075428996.15437999.1377814901677.JavaMail.root@uoguelph.ca> User-Agent: Mutt/1.5.21 (2010-09-15) X-Spam-Status: No, score=-2.0 required=5.0 tests=ALL_TRUSTED,BAYES_00, DKIM_ADSP_CUSTOM_MED,FREEMAIL_FROM,NML_ADSP_CUSTOM_MED autolearn=no version=3.3.2 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on tom.home Cc: freebsd-fs X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 29 Aug 2013 22:31:34 -0000 --uFc9xn1y3ioKq896 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Thu, Aug 29, 2013 at 06:21:41PM -0400, Rick Macklem wrote: > Kostik wrote: > > On Wed, Aug 28, 2013 at 08:15:27PM -0400, Rick Macklem wrote: > > > I've been doing a little more testing of "umount -f" for NFS > > > mounts and they seem to be working unless some other process/thread > > > has busied the file system via vfs_busy(). > > > > > > Unfortunately, it is pretty easy to vfs_busy() the file system > > > by using a command like "df" that is stuck on the unresponsive > > > NFS server. > > > > > > The problem seems to be that dounmount() msleep()s while > > > mnt_lockref != 0 before calling VFS_UNMOUNT(). > > > > > > If some call into the NFS client was done before this > > > while (mp->mnt_lockref) loop with msleep() in it, it > > > can easily kill off RPCs in progress. (It currently > > > does this in nfs_unmount() using the newnfs_nmcancelreqs() > > > call. > > > > > > In summary: > > > - Would it be appropriate to add a new vfs_XXX method that > > > dounmount() would call before the while() loop for the > > > forced dismount case? > > > (The default would be a no-op and I have no idea if any > > > file system other than NFS would have a use for it?)
> > > Alternately, there could be a function pointer set non-NULL > > > that would specifically be used by the NFS client for this. > > > This would avoid adding a vfs_XXX() method, but would mean > > > an NFS specific call ends up in the generic dounmount() code. > > > > > > Anyone have comments on this? > > > > > Yes, I do. I agree with adding the pre-unmount vfs method. > > This seems to be the cleanest solution possible. > > > I've attached a patch. It is also at > http://people.freebsd.org/~rmacklem/forced-dism.patch > in case the attachment gets lost. > I don't really like doing the MNT_IUNLOCK(), MNT_ILOCK() before/after > the VFS_KILLIO() call, but I couldn't see any better way to do it and > it looks safe to do so, at least for the forced case. Maybe call it VFS_PURGE()? I suggest moving the VFS_KILLIO() call to after MNTK_DRAINING is set, to avoid getting new references after the current I/O transactions are stopped. You would need to set MNTK_DRAINING unconditionally. Also, it probably makes sense to replace the if (mnt_lockref) with a while (). > > I assume I would also need to bump __FreeBSD_version (and maybe VFS_VERSION?). I think you could avoid it.
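[Editorial sketch, not part of the original message.] The ordering suggested in this reply can be modelled in userland C. This is only a toy model under stated assumptions: struct mount_model, the MODEL_* flag values, model_purge() and model_forced_prefix() are hypothetical stand-ins, not kernel code and not the posted patch; the loop body replaces msleep() with a counter, and the purge hook simply pretends that every busy holder lets go once its RPCs are cancelled.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_UNMOUNTF	0x1	/* stands in for MNTK_UNMOUNTF */
#define MODEL_DRAINING	0x2	/* stands in for MNTK_DRAINING */

/* Toy model of the few struct mount fields discussed in the thread. */
struct mount_model {
	int kern_flag;				/* models mnt_kern_flag */
	int lockref;				/* models mnt_lockref */
	void (*purge)(struct mount_model *);	/* proposed op; may be NULL */
	int flags_seen_by_purge;		/* what the hook observed */
};

/*
 * Hypothetical purge hook: records the flags it saw and pretends that
 * cancelling RPCs makes every busy holder drop its reference.
 */
static void
model_purge(struct mount_model *mp)
{
	mp->flags_seen_by_purge = mp->kern_flag;
	mp->lockref = 0;
}

/*
 * Forced-unmount prefix in the suggested order: both flags are set
 * first, then the purge hook runs, then the drain is a loop.
 * Returns how many times the model would have slept.
 */
static int
model_forced_prefix(struct mount_model *mp)
{
	int sleeps = 0;

	mp->kern_flag |= MODEL_UNMOUNTF;
	mp->kern_flag |= MODEL_DRAINING;	/* set unconditionally */
	if (mp->purge != NULL)
		mp->purge(mp);
	while (mp->lockref != 0) {		/* while, not if */
		sleeps++;			/* stands in for msleep() */
		mp->lockref--;			/* one holder unbusies */
	}
	assert(mp->lockref == 0);
	return (sleeps);
}
```

In this model a mount with a purge hook drains without sleeping, and the hook observes both flags already set; a mount without a hook sleeps once per outstanding reference. Whether that ordering matters in the real kernel is exactly what the thread goes on to discuss.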
--uFc9xn1y3ioKq896 Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.21 (FreeBSD) iQIcBAEBAgAGBQJSH8u/AAoJEJDCuSvBvK1B1dsQAKlRITSVql/vHTUinsxv6hq5 Qaq8EdsVyTTwoUi2fqp1Vf91B+K0prFiWEwJkJHfLIMzKCGT47wM30pWQvDRZ+Xl OcrOLilWj3WgdolfQRubNH35xlO5WhQOSxuPslhzyIvQOUninRUBXWQZy5ptq4tG nAHHLmlfviAN5gfu3gKoSizfDoB0aJwC0MoLZMD72ta4CVArmkK/qQMxNbZn+w7O U7/Owm1HCZKNyAukdLIpmqEViusCDpvwtfNR3wi5JaJLE+leT8K/vcmCivLH9QqP Lx/SppNsW9a/temHpSs4mBK7pR+awqF2k0cVYIoquSyDk1xaMsqYQOwWhYDbzm51 CSsm89c0TNZJyuQxfCouNVg7p3pzrRjyC5EE+Azu9T+gkTNq+PQaVr6G3DOQdEdm 1DbxAiBK5EC2yfdQREf9twRcXVB2Qor8IEEXeuyzX3P+s+9NI2RqrpXyKRC0c4We KkNdwHkakhA8PQZl64x8lP8ugvye7JewYa0gTORehBwR/fuAfMqhYfWVgoAYKH0b AfW1Zx3tnSN3gX0FVM761Q+hZRJ3Ov76obsSS1pr+WmNEOwtdzdCC4bS0xqx21+B WCR/Fqf+HMd8L6KJLXbG7zDeuZyYddR5Pg4o3f9lQ8SMRCxEZFRyrnGByP8RfHaK XwuZaen7vIO4wF30Wjtq =+HoH -----END PGP SIGNATURE----- --uFc9xn1y3ioKq896-- From owner-freebsd-fs@FreeBSD.ORG Thu Aug 29 23:43:41 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id E6E74DFF for ; Thu, 29 Aug 2013 23:43:41 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from esa-jnhn.mail.uoguelph.ca (esa-jnhn.mail.uoguelph.ca [131.104.91.44]) by mx1.freebsd.org (Postfix) with ESMTP id AD1AF2340 for ; Thu, 29 Aug 2013 23:43:41 +0000 (UTC) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: ArkEABjcH1KDaFve/2dsb2JhbABaFoMmUYMnvBaBD4E3dIIkAQEEASMEUgUWDgoCAg0ZAlkGLodgBgynNJILgSmOFzQHgmiBNAOZIpA3gzwggW4 X-IronPort-AV: E=Sophos;i="4.89,986,1367985600"; d="scan'208";a="48179327" Received: from muskoka.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca) ([131.104.91.222]) by esa-jnhn.mail.uoguelph.ca with ESMTP; 29 Aug 2013 19:43:34 -0400 Received: from zcs3.mail.uoguelph.ca (localhost.localdomain [127.0.0.1]) by zcs3.mail.uoguelph.ca 
(Postfix) with ESMTP id CC988B3F13; Thu, 29 Aug 2013 19:43:34 -0400 (EDT) Date: Thu, 29 Aug 2013 19:43:34 -0400 (EDT) From: Rick Macklem To: Konstantin Belousov Message-ID: <537646864.15457428.1377819814825.JavaMail.root@uoguelph.ca> In-Reply-To: <20130829223128.GP4972@kib.kiev.ua> Subject: Re: fixing "umount -f" for the NFS client MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-Originating-IP: [172.17.91.203] X-Mailer: Zimbra 7.2.1_GA_2790 (ZimbraWebClient - FF3.0 (Win)/7.2.1_GA_2790) Cc: freebsd-fs X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 29 Aug 2013 23:43:42 -0000 Kostik wrote: > On Thu, Aug 29, 2013 at 06:21:41PM -0400, Rick Macklem wrote: > > Kostik wrote: > > > On Wed, Aug 28, 2013 at 08:15:27PM -0400, Rick Macklem wrote: > > > > I've been doing a little more testing of "umount -f" for NFS > > > > mounts and they seem to be working unless some other > > > > process/thread > > > > has busied the file system via vfs_busy(). > > > > > > > > Unfortunately, it is pretty easy to vfs_busy() the file system > > > > by using a command like "df" that is stuck on the unresponsive > > > > NFS server. > > > > > > > > The problem seems to be that dounmount() msleep()s while > > > > mnt_lockref != 0 before calling VFS_UNMOUNT(). > > > > > > > > If some call into the NFS client was done before this > > > > while (mp->mnt_lockref) loop with msleep() in it, it > > > > can easily kill off RPCs in progress. (It currently > > > > does this in nfs_unmount() using the newnfs_nmcancelreqs() > > > > call. > > > > > > > > In summary: > > > > - Would it be appropriate to add a new vfs_XXX method that > > > > dounmount() would call before the while() loop for the > > > > forced dismount case? 
> > > > (The default would be a no-op and I have no idea if any > > > > file system other than NFS would have a use for it?) > > > > Alternately, there could be a function pointer set non-NULL > > > > that would specifically be used by the NFS client for this. > > > > This would avoid adding a vfs_XXX() method, but would mean > > > > an NFS specific call ends up in the generic dounmount() code. > > > > > > > > Anyone have comments on this? > > > > > > > Yes, I do. I agree with adding the pre-unmount vfs method. > > > This seems to be the cleanest solution possible. > > > > > I've attached a patch. It is also at > > http://people.freebsd.org/~rmacklem/forced-dism.patch > > in case the attachment gets lost. > > I don't really like doing the MNT_IUNLOCK(), MNT_ILOCK() > > before/after > > the VFS_KILLIO() call, but I couldn't see any better way to do it > > and > > it looks safe to do so, at least for the forced case. > Might be, call it VFS_PURGE() ? > Sure, any name is fine with me. > I suggest to move the call to the VFS_KILLIO after the MNTK_DRAINING > is > set, to avoid getting new references after the current i/o > transactions > are stopped. You would need to set MNTK_DRAINING unconditionally. > Also, > it probably makes sense to replace the if (mnt_lockref) with while > (). > Hmm. When I look at the code, the only use of MNTK_DRAINING seems to be to tell vfs_unbusy() to do a wakeup() if mnt_lockref == 0, so I don't see why setting it before VFS_PURGE() would matter. Let me explain what the NFS client does: If it sees MNTK_UNMOUNTF set, it fails VOP/VFS calls without attempting any RPCs. That's why I needed MNTK_UNMOUNTF set before the VFS_PURGE() call. The VFS_PURGE() call causes any RPC that is already in progress to fail (by closing the connection to the server).
If there is a case where an RPC attempt can get stuck after this point, it's a bug in the NFS client I will need to find;-) --> Once MNTK_UNMOUNTF is set and VFS_PURGE() is called, all VOP/VFS ops should return failure without attempting to do RPCs against the server. If some thread does vfs_busy() while VFS_PURGE() is in progress or before (any time MNT_ILOCK() isn't held) it should end up doing a vfs_unbusy() at some point without getting stuck trying to do an RPC against the server. If this happens before dounmount() re-acquires the MNT_ILOCK(), it should be ok, since mnt_lockref has been decremented. If it does this after the dounmount() thread re-acquires MNT_ILOCK(), dounmount() should be in the msleep() with MNTK_DRAINING set, so it will get the wakeup once mnt_lockref has decremented to 0. Setting MNTK_DRAINING sooner would just result in the odd unnecessary wakeup(), from what I can see? > > > > I assume I would also need to bump __FreeBSD_version (and maybe > > VFS_VERSION?). > I think you could avoid it. > Do you mean I don't need to bump __FreeBSD_version or VFS_VERSION or both? 
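[Editorial sketch, not part of the original message.] The behaviour described in this message can be modelled in a few lines of userland C: operations fail without sending an RPC once the forced-unmount flag is set, and the last unbusy produces a wakeup only when the draining flag is set. Everything here is a hypothetical stand-in (struct sk_mount, the SK_* flags, sk_nfs_op(), sk_busy(), sk_unbusy()); EIO is a placeholder error, since the message does not say which errno the NFS client returns, and the real vfs_unbusy() does more than this model.

```c
#include <assert.h>
#include <errno.h>

#define SK_UNMOUNTF	0x1	/* stands in for MNTK_UNMOUNTF */
#define SK_DRAINING	0x2	/* stands in for MNTK_DRAINING */

/* Toy model of a mount point plus counters for inspection. */
struct sk_mount {
	int kern_flag;	/* models mnt_kern_flag */
	int lockref;	/* models mnt_lockref */
	int rpcs_sent;	/* RPCs the model "sent" to the server */
	int wakeups;	/* wakeups delivered to the unmounting thread */
};

/* Models a VOP/VFS call into the NFS client. */
static int
sk_nfs_op(struct sk_mount *mp)
{
	if ((mp->kern_flag & SK_UNMOUNTF) != 0)
		return (EIO);	/* placeholder errno; fail without an RPC */
	mp->rpcs_sent++;
	return (0);
}

/* Models vfs_busy(): take a reference on the mount. */
static void
sk_busy(struct sk_mount *mp)
{
	mp->lockref++;
}

/* Models the wakeup rule described for vfs_unbusy(). */
static void
sk_unbusy(struct sk_mount *mp)
{
	assert(mp->lockref > 0);
	mp->lockref--;
	if (mp->lockref == 0 && (mp->kern_flag & SK_DRAINING) != 0)
		mp->wakeups++;	/* stands in for wakeup() */
}
```

A thread that busied the mount before the forced unmount began still reaches sk_unbusy() because its next operation fails immediately instead of waiting on the server; if the draining flag is set by then, the unmounting thread gets exactly one wakeup, and if it is not, no wakeup is needed because nobody is sleeping yet.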
Thanks, rick From owner-freebsd-fs@FreeBSD.ORG Fri Aug 30 08:17:15 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id DF8A4819 for ; Fri, 30 Aug 2013 08:17:15 +0000 (UTC) (envelope-from maurizio.vairani@cloverinformatica.it) Received: from smtpdg8.aruba.it (smtpdg226.aruba.it [62.149.158.226]) by mx1.freebsd.org (Postfix) with ESMTP id 44821207F for ; Fri, 30 Aug 2013 08:17:14 +0000 (UTC) Received: from cloverinformatica.it ([188.10.129.202]) by smtpcmd03.ad.aruba.it with bizsmtp id JwHB1m01N4N8xN401wHCpk; Fri, 30 Aug 2013 10:17:12 +0200 Received: from [192.168.0.81] (ASUS-TERMINATOR [192.168.0.81]) by cloverinformatica.it (Postfix) with ESMTP id 449A113914; Fri, 30 Aug 2013 10:17:12 +0200 (CEST) Message-ID: <52205507.4030802@cloverinformatica.it> Date: Fri, 30 Aug 2013 10:17:11 +0200 From: Maurizio Vairani User-Agent: Mozilla/5.0 (Windows NT 5.1; rv:7.0.1) Gecko/20110929 Thunderbird/7.0.1 MIME-Version: 1.0 To: Andriy Gapon Subject: Re: Boot problem if a ZFS log device is missing References: <521F05F0.4090607@cloverinformatica.it> <521F0DEB.20408@FreeBSD.org> In-Reply-To: <521F0DEB.20408@FreeBSD.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-fs@FreeBSD.org, freebsd-stable@FreeBSD.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 30 Aug 2013 08:17:15 -0000 On 29/08/2013 11.01, Andriy Gapon wrote: > on 29/08/2013 11:27 Maurizio Vairani said the following: >> I am able to boot the PC without a cache device but not without a log device. Why ? > The log could potentially contain uncommitted entries. 
Without the log device > there is no knowing if it did or did not. And if it did then the pool is in an > inconsistent state without the log device and so it cannot be imported. > > The cache is not persistent and so there is nothing needed from it upon a boot. > Thank you for the clear and concise reply. Yesterday I ran some tests. If I remove the stick from the USB port before shutting down the PC, it doesn't crash but continues to work. I am then able to reboot the laptop without inserting the stick, and the pool works in degraded mode. From the end user's point of view, a PC should always boot, even with a missing ZFS log device. Regards Maurizio From owner-freebsd-fs@FreeBSD.ORG Fri Aug 30 08:53:42 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id AEB0B592 for ; Fri, 30 Aug 2013 08:53:42 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from kib.kiev.ua (kib.kiev.ua [IPv6:2001:470:d5e7:1::1]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 4CE562322 for ; Fri, 30 Aug 2013 08:53:42 +0000 (UTC) Received: from tom.home (kostik@localhost [127.0.0.1]) by kib.kiev.ua (8.14.7/8.14.7) with ESMTP id r7U8ralr052323; Fri, 30 Aug 2013 11:53:36 +0300 (EEST) (envelope-from kostikbel@gmail.com) DKIM-Filter: OpenDKIM Filter v2.8.3 kib.kiev.ua r7U8ralr052323 Received: (from kostik@localhost) by tom.home (8.14.7/8.14.7/Submit) id r7U8rapG052322; Fri, 30 Aug 2013 11:53:36 +0300 (EEST) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: tom.home: kostik set sender to kostikbel@gmail.com using -f Date: Fri, 30 Aug 2013 11:53:36 +0300 From: Konstantin Belousov To: Rick Macklem Subject: Re: fixing "umount -f" for the NFS client Message-ID:
<20130830085336.GU4972@kib.kiev.ua> References: <20130829223128.GP4972@kib.kiev.ua> <537646864.15457428.1377819814825.JavaMail.root@uoguelph.ca> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="J4V09s3zFtpAP6QA" Content-Disposition: inline In-Reply-To: <537646864.15457428.1377819814825.JavaMail.root@uoguelph.ca> User-Agent: Mutt/1.5.21 (2010-09-15) X-Spam-Status: No, score=-2.0 required=5.0 tests=ALL_TRUSTED,BAYES_00, DKIM_ADSP_CUSTOM_MED,FREEMAIL_FROM,NML_ADSP_CUSTOM_MED autolearn=no version=3.3.2 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on tom.home Cc: freebsd-fs X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 30 Aug 2013 08:53:42 -0000 --J4V09s3zFtpAP6QA Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Thu, Aug 29, 2013 at 07:43:34PM -0400, Rick Macklem wrote: > Kostik wrote: > > On Thu, Aug 29, 2013 at 06:21:41PM -0400, Rick Macklem wrote: > > > Kostik wrote: > > > > On Wed, Aug 28, 2013 at 08:15:27PM -0400, Rick Macklem wrote: > > > > > I've been doing a little more testing of "umount -f" for NFS > > > > > mounts and they seem to be working unless some other > > > > > process/thread > > > > > has busied the file system via vfs_busy(). > > > > > > > > > > Unfortunately, it is pretty easy to vfs_busy() the file system > > > > > by using a command like "df" that is stuck on the unresponsive > > > > > NFS server. > > > > > > > > > > The problem seems to be that dounmount() msleep()s while > > > > > mnt_lockref != 0 before calling VFS_UNMOUNT(). > > > > > > > > > > If some call into the NFS client was done before this > > > > > while (mp->mnt_lockref) loop with msleep() in it, it > > > > > can easily kill off RPCs in progress.
(It currently > > > > > does this in nfs_unmount() using the newnfs_nmcancelreqs() > > > > > call. > > > > > > > > > > In summary: > > > > > - Would it be appropriate to add a new vfs_XXX method that > > > > > dounmount() would call before the while() loop for the > > > > > forced dismount case? > > > > > (The default would be a no-op and I have no idea if any > > > > > file system other than NFS would have a use for it?) > > > > > Alternately, there could be a function pointer set non-NULL > > > > > that would specifically be used by the NFS client for this. > > > > > This would avoid adding a vfs_XXX() method, but would mean > > > > > an NFS specific call ends up in the generic dounmount() code. > > > > > > > > > > Anyone have comments on this? > > > > > > > > > Yes, I do. I agree with adding the pre-unmount vfs method. > > > > This seems to be the cleanest solution possible. > > > > > > > I've attached a patch. It is also at > > > http://people.freebsd.org/~rmacklem/forced-dism.patch > > > in case the attachment gets lost. > > > I don't really like doing the MNT_IUNLOCK(), MNT_ILOCK() > > > before/after > > > the VFS_KILLIO() call, but I couldn't see any better way to do it > > > and > > > it looks safe to do so, at least for the forced case. > > Might be, call it VFS_PURGE() ? > > > Sure, any name is fine with me. > > > I suggest to move the call to the VFS_KILLIO after the MNTK_DRAINING > > is > > set, to avoid getting new references after the current i/o > > transactions > > are stopped. You would need to set MNTK_DRAINING unconditionally. > > Also, > > it probably makes sense to replace the if (mnt_lockref) with while > > (). > > > Hmm. When I look at the code, the only use of MNTK_DRAINING seems to > be to tell vfs_unbusy() to do a wakeup() if mnt_lockref == 0. I don't > see why setting it before VFS_PURGE() would matter?
> > Let me explain what the NFS client does: > If it sees MNTK_UNMOUNTF set, it fails VOP/VFS calls without attempting > any RPCs. That's why I needed MNTK_UNMOUNTF set before the VFS_PURGE() > call. The VFS_PURGE() call causes any RPC that is already in progress to > fail (by closing the connection to the server). If there is a case where > an RPC attempt can get stuck after this point, it's a bug in the NFS client > I will need to find;-) > --> Once MNTK_UNMOUNTF is set and VFS_PURGE() is called, all VOP/VFS ops > should return failure without attempting to do RPCs against the server. > If some thread does vfs_busy() while VFS_PURGE() is in progress or before > (any time MNT_ILOCK() isn't held) it should end up doing a vfs_unbusy() at > some point without getting stuck trying to do an RPC against the server. > If this happens before dounmount() re-acquires the MNT_ILOCK(), it should be ok, > since mnt_lockref has been decremented. > If it does this after the dounmount() thread re-acquires MNT_ILOCK(), dounmount() > should be in the msleep() with MNTK_DRAINING set, so it will get the wakeup > once mnt_lockref has decremented to 0. > > Setting MNTK_DRAINING sooner would just result in the odd unnecessary wakeup(), > from what I can see? > Hm, I mis-remembered the vfs_busy() code. Yes, I agree with you. > > > > > > I assume I would also need to bump __FreeBSD_version (and maybe > > > VFS_VERSION?). > > I think you could avoid it. > > > Do you mean I don't need to bump __FreeBSD_version or VFS_VERSION or both? I do not see much sense in bumping either of them. You might want to bump __FreeBSD_version when merging to stable.
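The ordering discussed in this exchange can be sketched as a small userland model. This is only an illustration in Python, not the kernel code or the patch: the Mount class, its fields, and forced_unmount() are hypothetical stand-ins for struct mount, MNT_ILOCK(), msleep()/wakeup(), and dounmount(), and the real vfs_busy() has additional checks that are left out here.

```python
import threading


class Mount:
    """Userland stand-in for struct mount (hypothetical, for illustration)."""

    def __init__(self):
        # One Condition plays the role of MNT_ILOCK() plus msleep()/wakeup().
        self.ilock = threading.Condition()
        self.lockref = 0        # models mnt_lockref
        self.unmountf = False   # models MNTK_UNMOUNTF
        self.draining = False   # models MNTK_DRAINING


def vfs_busy(mp):
    # Simplified assumption: the real vfs_busy() also checks unmount flags.
    with mp.ilock:
        mp.lockref += 1


def vfs_unbusy(mp):
    with mp.ilock:
        mp.lockref -= 1
        # The draining flag only matters here: wake the unmounting thread
        # once the last reference has been dropped.
        if mp.draining and mp.lockref == 0:
            mp.draining = False
            mp.ilock.notify_all()


def forced_unmount(mp, purge):
    mp.ilock.acquire()          # MNT_ILOCK()
    mp.unmountf = True          # from here on, new calls fail without RPCs
    mp.ilock.release()          # MNT_IUNLOCK() around the purge call
    purge(mp)                   # VFS_PURGE(): fail RPCs already in progress
    mp.ilock.acquire()          # MNT_ILOCK() again
    while mp.lockref != 0:      # "while" rather than "if"
        mp.draining = True
        mp.ilock.wait()         # msleep() until vfs_unbusy() wakes us
    mp.ilock.release()


def demo():
    mp = Mount()
    rpc_failed = threading.Event()
    vfs_busy(mp)                # e.g. "df" busied the mount earlier

    def stuck_caller():
        rpc_failed.wait()       # blocked on the unresponsive server
        vfs_unbusy(mp)          # released once the RPC has been failed

    worker = threading.Thread(target=stuck_caller)
    worker.start()
    forced_unmount(mp, purge=lambda m: rpc_failed.set())
    worker.join()
    return mp.lockref, mp.unmountf
```

In this model the purge step releases the stuck caller, so the drain loop ends whether the final vfs_unbusy() runs before or after the interlock is re-acquired.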
--J4V09s3zFtpAP6QA Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.21 (FreeBSD) iQIcBAEBAgAGBQJSIF2PAAoJEJDCuSvBvK1BBFoQAI6iR1c8lZ6AQxdCfmukzlUm g1R4ZqmyOO2RSYgWpYr0vDCyBjDaZ0yHZ9B4LEHNXQ9mg7P5Y/fouYuCluuo/jWt +7He7xf3NZy7ziYKzj7HdV76Rktwd00y5QZ2b1tJSw0C3fmHySs78DH7N0I5eFCj vlKnUxHm+sw5ZnXq3bXj6DHuMFgESlxME3kbauHKglF40LNTBRjYArahgkdIKHgF +bIX5MCva1F5AccbYhF9iTld8W3Zvftcif2XuuUlEKG/R10wdaJfwAhDZGNO8R5c BXa0t6rN3xaxJpCs66anSDdFHOc2rfSuEH0/kH4i1MLfqYyHDIweaSnhMIOp82U2 ausgzDseHA+yAkNNkzRG3XIyEvUfo7O1veWNC+I0tzuQDHTjDdbRwS+A20pUPonS ehSayrmbl4VTV6levQuaALMM5OydrvT0PAsB4mP+qjFnbrtutxTCjUZ7IgaRI7Qs 4G+OMelfhFaV/nOABxYLpXe7ft2m+BODJgu2CNA4ECF3tU1qSICNEW0yqGWUpVAM T2nA3/ISCCcUJf4p/jA6F/iSNGCh61eMYgm5YHVsyeqkUg7Q/I0t0v1iraOsz3jR 1cD+Lojm/L4WqEU9txwYPQMmcskAiiAviy5AY1IjfrzQIX6/EZJshN3+797vMIOB B/VoWnK+ebInmSQ85rKO =zco0 -----END PGP SIGNATURE----- --J4V09s3zFtpAP6QA-- From owner-freebsd-fs@FreeBSD.ORG Fri Aug 30 10:38:40 2013 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id D87811FD; Fri, 30 Aug 2013 10:38:40 +0000 (UTC) (envelope-from avg@FreeBSD.org) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id EFE2F2970; Fri, 30 Aug 2013 10:38:39 +0000 (UTC) Received: from porto.starpoint.kiev.ua (porto-e.starpoint.kiev.ua [212.40.38.100]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id NAA11720; Fri, 30 Aug 2013 13:38:33 +0300 (EEST) (envelope-from avg@FreeBSD.org) Received: from localhost ([127.0.0.1]) by porto.starpoint.kiev.ua with esmtp (Exim 4.34 (FreeBSD)) id 1VFM64-0007bu-PU; Fri, 30 Aug 2013 13:38:32 +0300 Message-ID: <522075D6.70600@FreeBSD.org> Date: Fri, 30 Aug 2013 13:37:10 +0300 From: Andriy Gapon User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; 
rv:17.0) Gecko/20130810 Thunderbird/17.0.8 MIME-Version: 1.0 To: Charles Sprickman Subject: Re: Boot problem if a ZFS log device is missing References: <521F05F0.4090607@cloverinformatica.it> <521F0DEB.20408@FreeBSD.org> In-Reply-To: X-Enigmail-Version: 1.5.1 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: freebsd-fs@FreeBSD.org, freebsd-stable@FreeBSD.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 30 Aug 2013 10:38:40 -0000 on 30/08/2013 00:38 Charles Sprickman said the following: > If one is willing to accept that data is lost (like the log device is totally smoked), is there a way to boot knowing that you may have some data loss, or is the only option to boot alternate media and force a pool import (assuming that works without the log device)? I think it's the latter. I am not aware of any way to select a behavior similar to import -m or import -F during boot. Perhaps... ZFS_IMPORT_MISSING_LOG should be a default behavior for a root pool or maybe the behavior could be controllable by a tunable. 
-- Andriy Gapon From owner-freebsd-fs@FreeBSD.ORG Fri Aug 30 10:47:26 2013 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id 3863768D; Fri, 30 Aug 2013 10:47:26 +0000 (UTC) (envelope-from avg@FreeBSD.org) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id 5134C2A08; Fri, 30 Aug 2013 10:47:24 +0000 (UTC) Received: from porto.starpoint.kiev.ua (porto-e.starpoint.kiev.ua [212.40.38.100]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id NAA12017; Fri, 30 Aug 2013 13:47:21 +0300 (EEST) (envelope-from avg@FreeBSD.org) Received: from localhost ([127.0.0.1]) by porto.starpoint.kiev.ua with esmtp (Exim 4.34 (FreeBSD)) id 1VFMEa-0007cl-No; Fri, 30 Aug 2013 13:47:20 +0300 Message-ID: <52207800.2060901@FreeBSD.org> Date: Fri, 30 Aug 2013 13:46:24 +0300 From: Andriy Gapon User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:17.0) Gecko/20130810 Thunderbird/17.0.8 MIME-Version: 1.0 To: Maurizio Vairani Subject: Re: Boot problem if a ZFS log device is missing References: <521F05F0.4090607@cloverinformatica.it> <521F0DEB.20408@FreeBSD.org> <522075D6.70600@FreeBSD.org> In-Reply-To: <522075D6.70600@FreeBSD.org> X-Enigmail-Version: 1.5.1 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: freebsd-fs@FreeBSD.org, freebsd-stable@FreeBSD.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 30 Aug 2013 10:47:26 -0000 on 30/08/2013 13:37 Andriy Gapon said the following: > on 30/08/2013 00:38 Charles Sprickman said the following: >> If one is willing to accept that data is lost (like the log device is totally smoked), is there a way to boot knowing 
that you may have some data loss, or is the only option to boot alternate media and force a pool import (assuming that works without the log device)? > > I think it's the latter. I am not aware of any way to select a behavior similar > to import -m or import -F during boot. > Perhaps... ZFS_IMPORT_MISSING_LOG should be a default behavior for a root pool > or maybe the behavior could be controllable by a tunable. > Maurizio, you might want to try the following patch as an interim solution for your environment:
--- a/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c
+++ b/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c
@@ -4112,6 +4112,7 @@ spa_import_rootpool(const char *name)
 	}
 	spa->spa_is_root = B_TRUE;
 	spa->spa_import_flags = ZFS_IMPORT_VERBATIM;
+	spa->spa_import_flags |= ZFS_IMPORT_MISSING_LOG; /* XXX make tunable */
 
 	/*
 	 * Build up a vdev tree based on the boot device's label config.
-- Andriy Gapon From owner-freebsd-fs@FreeBSD.ORG Fri Aug 30 12:24:57 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id EAC06819 for ; Fri, 30 Aug 2013 12:24:57 +0000 (UTC) (envelope-from matt.home@userve.net) Received: from smtp-outbound.userve.net (smtp-outbound.userve.net [217.196.1.22]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 897B02FD7 for ; Fri, 30 Aug 2013 12:24:57 +0000 (UTC) Received: from webmail.userve.net (db3.userve.net [217.196.1.19]) (authenticated bits=0) by smtp-outbound.userve.net (8.14.5/8.14.5) with ESMTP id r7UCC0Tv023529 for ; Fri, 30 Aug 2013 13:12:00 +0100 (BST) (envelope-from matt.home@userve.net) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit Date: Fri, 30 Aug 2013 13:12:04 +0100
From: Matt Churchyard To: Subject: Boot problem if a ZFS log device is missing Message-ID: <78b974623984482459c6279b43144276@users.userve.net> X-Sender: matt.home@userve.net User-Agent: Roundcube Webmail/0.8.0 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 30 Aug 2013 12:24:58 -0000 > Yesterday I ran some tests. If I remove the stick from the USB > port before shutting down the PC, it doesn't crash but continues to > work. I am then able to reboot the laptop without inserting the > stick, and the pool works in degraded mode. > From the end user's point of view, a PC should always boot, even with a > missing ZFS log device. The problem is that if a machine comes up without a LOG device that was there previously, it can't guarantee that there weren't pending writes. To automatically import the pool could be dangerous - leaving critical data corrupt. That's not really acceptable in a production environment. ZFS does the *right* thing by requiring an admin to get involved. It may be that the admin forces a rollback and checks any applications are ok manually, or it could be that they just plug in a device that was removed by accident. I haven't followed official ZFS since Oracle came along but Sun's kit used to only allow a simple disk or mirror for the root pool. There are some good reasons for this, and the failure to import the pool if the ZIL is lost was probably one of them. I wouldn't recommend running any serious system with a large or complex pool that's also being used for root. For a home PC maybe it is useful to have a tunable that says "just force an import and ignore any possible writes if the ZIL's gone, I'll deal with any problems that appear". The only issue with that is that most people won't know to switch it on until it's too late.
From owner-freebsd-fs@FreeBSD.ORG Fri Aug 30 13:00:52 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id B97914CB for ; Fri, 30 Aug 2013 13:00:52 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from esa-jnhn.mail.uoguelph.ca (esa-jnhn.mail.uoguelph.ca [131.104.91.44]) by mx1.freebsd.org (Postfix) with ESMTP id 7CEDA22A3 for ; Fri, 30 Aug 2013 13:00:51 +0000 (UTC) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: ArkEAB6XIFKDaFve/2dsb2JhbABaFoMmUYMnvBmBD4E3dIIkAQEFIwRSGw4KAgINGQJZBi6HZgynDZIWgSmOFzQHgmiBNAOZIpA3gzwggW4 X-IronPort-AV: E=Sophos;i="4.89,990,1367985600"; d="scan'208";a="48260454" Received: from muskoka.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca) ([131.104.91.222]) by esa-jnhn.mail.uoguelph.ca with ESMTP; 30 Aug 2013 09:00:50 -0400 Received: from zcs3.mail.uoguelph.ca (localhost.localdomain [127.0.0.1]) by zcs3.mail.uoguelph.ca (Postfix) with ESMTP id 46E6DB3F12; Fri, 30 Aug 2013 09:00:50 -0400 (EDT) Date: Fri, 30 Aug 2013 09:00:50 -0400 (EDT) From: Rick Macklem To: Konstantin Belousov Message-ID: <1251021093.15594833.1377867650267.JavaMail.root@uoguelph.ca> In-Reply-To: <20130830085336.GU4972@kib.kiev.ua> Subject: Re: fixing "umount -f" for the NFS client MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-Originating-IP: [172.17.91.201] X-Mailer: Zimbra 7.2.1_GA_2790 (ZimbraWebClient - FF3.0 (Win)/7.2.1_GA_2790) Cc: freebsd-fs X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 30 Aug 2013 13:00:52 -0000 Kostik wrote: > On Thu, Aug 29, 2013 at 07:43:34PM -0400, Rick Macklem wrote: > > Kostik wrote: > > > On Thu, Aug 29, 2013 
at 06:21:41PM -0400, Rick Macklem wrote: > > > > Kostik wrote: > > > > > On Wed, Aug 28, 2013 at 08:15:27PM -0400, Rick Macklem wrote: > > > > > > I've been doing a little more testing of "umount -f" for > > > > > > NFS > > > > > > mounts and they seem to be working unless some other > > > > > > process/thread > > > > > > has busied the file system via vfs_busy(). > > > > > > > > > > > > Unfortunately, it is pretty easy to vfs_busy() the file > > > > > > system > > > > > > by using a command like "df" that is stuck on the > > > > > > unresponsive > > > > > > NFS server. > > > > > > > > > > > > The problem seems to be that dounmount() msleep()s while > > > > > > mnt_lockref != 0 before calling VFS_UNMOUNT(). > > > > > > > > > > > > If some call into the NFS client was done before this > > > > > > while (mp->mnt_lockref) loop with msleep() in it, it > > > > > > can easily kill off RPCs in progress. (It currently > > > > > > does this in nfs_unmount() using the newnfs_nmcancelreqs() > > > > > > call. > > > > > > > > > > > > In summary: > > > > > > - Would it be appropriate to add a new vfs_XXX method that > > > > > > dounmount() would call before the while() loop for the > > > > > > forced dismount case? > > > > > > (The default would be a no-op and I have no idea if any > > > > > > file system other than NFS would have a use for it?) > > > > > > Alternately, there could be a function pointer set > > > > > > non-NULL > > > > > > that would specifically be used by the NFS client for > > > > > > this. > > > > > > This would avoid adding a vfs_XXX() method, but would > > > > > > mean > > > > > > an NFS specific call ends up in the generic dounmount() > > > > > > code. > > > > > > > > > > > > Anyone have comments on this? > > > > > > > > > > > Yes, I do. I agree with adding the pre-unmount vfs method. > > > > > This seems to be the cleanest solution possible. > > > > > > > > > I've attached a patch. 
It is also at > > > > http://people.freebsd.org/~rmacklem/forced-dism.patch > > > > in case the attachment gets lost. > > > > I don't really like doing the MNT_IUNLOCK(), MNT_ILOCK() > > > > before/after > > > > the VFS_KILLIO() call, but I couldn't see any better way to do > > > > it > > > > and > > > > it looks safe to do so, at least for the forced case. > > > Might be, call it VFS_PURGE() ? > > > > > Sure, any name is fine with me. > > > > > I suggest to move the call to the VFS_KILLIO after the > > > MNTK_DRAINING > > > is > > > set, to avoid getting new references after the current i/o > > > transactions > > > are stopped. You would need to set MNTK_DRAINING unconditionally. > > > Also, > > > it probably makes sense to replace the if (mnt_lockref) with > > > while > > > (). > > > > > Hmm. When I look at the code, the only use of MNTK_DRAINING seems > > to > > be to tell vfs_unbusy() to do a wakeup() if mnt_lockref == 0. I > > don't > > see why setting it before VFS_PURGE() would matter? > > > > Let me explain what the NFS client does: > > If is sees MNTK_UNMOUNTF set, it fails VOP/VFS calls without > > attempting > > any RPCs. That's why I needed MNTK_UNMOUNTF set before the > > VFS_PURGE() > > call. The VFS_PURGE() call causes any RPC that is already in > > progress to > > fail (by closing the connection to the server). If there is a case > > where > > an RPC attempt can get stuck after this point, it's a bug in the > > NFS client > > I will need to find;-) > > --> Once MNTK_UNMOUNTF is set and VFS_PURGE() is called, all > > VOP/VFS ops > > should return failure without attempting to do RPCs against the > > server. > > If some thread does vfs_busy() while VFS_PURGE() is in progress or > > before > > (any time MNT_ILOCK() isn't held) it should end up doing a > > vfs_unbusy() at > > some point without getting stuck trying to do an RPC against the > > server. 
> > If this happens before dounmount() re-acquires the MNT_ILOCK(), it > > should be ok, > > since mnt_lockref has been decremented. > > If it does this after the dounmount() thread re-acquires > > MNT_ILOCK(), dounmount() > > should be in the msleep() with MNTK_DRAINING set, so it will get > > the wakeup > > once mnt_lockref has decremented to 0. > > > > Setting MNTK_DRAINING sooner would just result in the odd > > unnecessary wakeup(), > > from what I can see? > > > Hm, I mis-remembered the vfs_busy() code. Yes, I agree with you. > > > > > > > > > I assume I would also need to bump __FreeBSD_version (and maybe > > > > VFS_VERSION?). > > > I think you could avoid it. > > > > > Do you mean I don't need to bump __FreeBSD_version or VFS_VERSION > > or both? > I do not see much sense in bumping either of them. > You might want to bump __FreeBSD_version when merging to stable. > Ok, thanks. I'll consider the code reviewed by you unless I hear otherwise from you. rick From owner-freebsd-fs@FreeBSD.ORG Fri Aug 30 19:28:29 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id BA7D2F0A for ; Fri, 30 Aug 2013 19:28:29 +0000 (UTC) (envelope-from sfourman@gmail.com) Received: from mail-vc0-x22e.google.com (mail-vc0-x22e.google.com [IPv6:2607:f8b0:400c:c03::22e]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 7A95F2A0C for ; Fri, 30 Aug 2013 19:28:29 +0000 (UTC) Received: by mail-vc0-f174.google.com with SMTP id gd11so1583168vcb.5 for ; Fri, 30 Aug 2013 12:28:28 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type;
bh=BWj7Mtfowuuxz0A238enm7cGWZQYnv8wGiwr7GWNbsM=; b=QEQ5RInUP97FhmwXA2klVMXX4oHQPWs9i9wwyt6vb4xp3YQL+UXFDEwJ6Dhn0SEf3O R/viYfwyqbYVXVniwiVetvaCE+navn6BAt4dIyCUFsR+OE62cz9FEyVJSPdNhRvdMTcm ztpOnfeOqDXPeVCGrSw/dl9vLvP35D9LnAyMmDRacZ29H8nU18t2SQE0rHxPC7I5P6ED 6rurcqEbJTSfITnz2jbplEuh2S8ll2u7NHRz4in36EN7Pctw2n5Plwwl/341HtY/ortp XvOes74cdsF0eCun/YelryVDQPqqWx/nvW9mfdW5woljRxMWf+fs48Btr4Zxn+mM3XKH x2BA== MIME-Version: 1.0 X-Received: by 10.52.76.38 with SMTP id h6mr7257333vdw.10.1377890908544; Fri, 30 Aug 2013 12:28:28 -0700 (PDT) Received: by 10.220.96.78 with HTTP; Fri, 30 Aug 2013 12:28:28 -0700 (PDT) In-Reply-To: <2008996797.14358576.1377631792358.JavaMail.root@uoguelph.ca> References: <2008996797.14358576.1377631792358.JavaMail.root@uoguelph.ca> Date: Fri, 30 Aug 2013 15:28:28 -0400 Message-ID: Subject: Re: NFS on ZFS pure SSD pool From: "Sam Fourman Jr." To: Rick Macklem Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.14 Cc: FreeBSD FS X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 30 Aug 2013 19:28:29 -0000 > > You could try this patch: > http://people.freebsd.org/~rmacklem/drc4-stable9.patch > - After applying the patch and booting a kernel built from the patched > sources, you need to increase the value of vfs.nfsd.tcphighwater. > (Try something like 5000 for it as a starting point.) > > Although this patch is somewhat different code, it should be semantically > the same as r254337 in head, that is scheduled to be MFC'd to stable/9 in > a couple of weeks. > > Does anyone know why the server would get these errors? 
$ cat /var/log/messages | grep failed Aug 30 10:22:20 students nfsd[1978]: accept failed: Software caused connection abort Aug 30 10:27:16 students nfsd[1978]: accept failed: Software caused connection abort Aug 30 11:46:30 students nfsd[1978]: accept failed: Software caused connection abort Aug 30 11:47:10 students nfsd[1978]: accept failed: Software caused connection abort -- Sam Fourman Jr. From owner-freebsd-fs@FreeBSD.ORG Fri Aug 30 23:28:52 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id 62C677C2 for ; Fri, 30 Aug 2013 23:28:52 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from esa-annu.net.uoguelph.ca (esa-annu.mail.uoguelph.ca [131.104.91.36]) by mx1.freebsd.org (Postfix) with ESMTP id 2918026C6 for ; Fri, 30 Aug 2013 23:28:51 +0000 (UTC) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AqIEABcqIVKDaFve/2dsb2JhbABaFoMmUYMnvT2BNHSCJAEBBSNWGxgCAg0ZAlkGE4gBDKddki2BKY4RNAeCaIE0A5kkkDeDPCCBbg X-IronPort-AV: E=Sophos;i="4.89,994,1367985600"; d="scan'208";a="47622804" Received: from muskoka.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca) ([131.104.91.222]) by esa-annu.net.uoguelph.ca with ESMTP; 30 Aug 2013 19:28:44 -0400 Received: from zcs3.mail.uoguelph.ca (localhost.localdomain [127.0.0.1]) by zcs3.mail.uoguelph.ca (Postfix) with ESMTP id F20CCB403A; Fri, 30 Aug 2013 19:28:44 -0400 (EDT) Date: Fri, 30 Aug 2013 19:28:44 -0400 (EDT) From: Rick Macklem To: "Sam Fourman Jr." 
Message-ID: <258054624.15907722.1377905324980.JavaMail.root@uoguelph.ca> In-Reply-To: Subject: Re: NFS on ZFS pure SSD pool MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-Originating-IP: [172.17.91.203] X-Mailer: Zimbra 7.2.1_GA_2790 (ZimbraWebClient - FF3.0 (Win)/7.2.1_GA_2790) Cc: FreeBSD FS X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 30 Aug 2013 23:28:52 -0000 Sam Fourman Jr. wrote: > > > > > > > > You could try this patch: > http://people.freebsd.org/~rmacklem/drc4-stable9.patch > - After applying the patch and booting a kernel built from the > patched > sources, you need to increase the value of vfs.nfsd.tcphighwater. > (Try something like 5000 for it as a starting point.) > > Although this patch is somewhat different code, it should be > semantically > the same as r254337 in head, that is scheduled to be MFC'd to > stable/9 in > a couple of weeks. > > > > > Does anyone know why the server would get these errors? > > > $ cat /var/log/messages | grep failed > Aug 30 10:22:20 students nfsd[1978]: accept failed: Software caused > connection abort > Aug 30 10:27:16 students nfsd[1978]: accept failed: Software caused > connection abort > Aug 30 11:46:30 students nfsd[1978]: accept failed: Software caused > connection abort > Aug 30 11:47:10 students nfsd[1978]: accept failed: Software caused > connection abort > Since the master socket that is accepting connections isn't being closed, I believe this error (ECONNABORTED returned by accept()) occurs when the client closes the new TCP connection before it has been accepted. Why would an NFS client do this? I have no idea. You might want to post on freebsd-net@freebsd.org with a subject line like "When does accept(2) fail with ECONNABORTED?" to try and confirm the above. rick > > > -- > > Sam Fourman Jr. 
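If the reading above is right (the client dropped the connection before it was accepted), a server would normally treat this error as transient and keep accepting. A minimal Python sketch of such a loop follows; it is not how nfsd is implemented, and accept_skipping_aborts and its log parameter are names made up for the illustration.

```python
import errno


def accept_skipping_aborts(listener, log=print):
    """Return the next accepted (conn, addr) pair from a listening socket.

    ECONNABORTED is logged and skipped; any other error is re-raised.
    """
    while True:
        try:
            return listener.accept()
        except OSError as exc:
            if exc.errno != errno.ECONNABORTED:
                raise
            # The peer went away between connecting and accept(); the
            # listening socket itself is still usable, so just try again.
            log("accept failed: %s" % exc.strerror)
```

The listener argument is expected to behave like a socket.socket that is already bound and listening; only its accept() method is used.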
> From owner-freebsd-fs@FreeBSD.ORG Sat Aug 31 15:50:01 2013 Return-Path: Delivered-To: freebsd-fs@smarthost.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id D32FE5BD for ; Sat, 31 Aug 2013 15:50:01 +0000 (UTC) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:1900:2254:206c::16:87]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id A56F42835 for ; Sat, 31 Aug 2013 15:50:01 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.7/8.14.7) with ESMTP id r7VFo1SC063254 for ; Sat, 31 Aug 2013 15:50:01 GMT (envelope-from gnats@freefall.freebsd.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.7/8.14.7/Submit) id r7VFo1Sa063253; Sat, 31 Aug 2013 15:50:01 GMT (envelope-from gnats) Date: Sat, 31 Aug 2013 15:50:01 GMT Message-Id: <201308311550.r7VFo1Sa063253@freefall.freebsd.org> To: freebsd-fs@FreeBSD.org Cc: From: Moritz Wilhelmy Subject: Re: kern/162591: [nullfs] cross-filesystem nullfs does not work as expected X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list Reply-To: Moritz Wilhelmy List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 31 Aug 2013 15:50:01 -0000 The following reply was made to PR kern/162591; it has been noted by GNATS. From: Moritz Wilhelmy To: bug-followup@FreeBSD.org, Gleb Kurtsou Cc: Subject: Re: kern/162591: [nullfs] cross-filesystem nullfs does not work as expected Date: Sat, 31 Aug 2013 17:46:06 +0200 On Mon, Nov 21, 2011 at 16:23:47 +0200, Gleb Kurtsou wrote: > That is expected behaviour, according to mount_nullfs(8): -snip- Alright. 
Care to close the PR then? > I think writing a small script to do nested mounts/unmounts that suits > your needs would be the best option here. That's what I've done. Thanks!
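The script itself is not shown in this follow-up. Purely as an illustration of the idea, the Python sketch below works out which mount_nullfs(8) commands would be needed to mirror a directory together with the file systems mounted beneath it. plan_nullfs_mounts is a hypothetical helper, the list of mount points is assumed to be supplied by the caller, and the function only builds the commands without running them.

```python
import posixpath


def plan_nullfs_mounts(src_root, dst_root, mountpoints):
    """Return mount_nullfs argv lists, parents before children.

    src_root:    directory to mirror
    dst_root:    where the mirror should appear
    mountpoints: all mount points currently known on the system
    """
    src_root = posixpath.normpath(src_root)
    dst_root = posixpath.normpath(dst_root)
    prefix = src_root.rstrip("/") + "/"

    # Always mirror src_root itself, plus every mount point below it.
    selected = {src_root}
    for mp in mountpoints:
        mp = posixpath.normpath(mp)
        if mp.startswith(prefix):
            selected.add(mp)

    commands = []
    # Shallower paths first so each target directory exists when needed.
    for mp in sorted(selected, key=lambda p: (p.count("/"), p)):
        rel = posixpath.relpath(mp, src_root)
        target = dst_root if rel == "." else posixpath.join(dst_root, rel)
        commands.append(["mount_nullfs", mp, target])
    return commands
```

Unmounting would walk the same list in reverse order, so the deepest nullfs mounts are removed first.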