From: Borja Marcos
Subject: Re: Intel NVMe troubles?
Date: Mon, 1 Aug 2016 16:38:18 +0200
To: Jim Harris
Cc: FreeBSD-STABLE Mailing List
Message-Id: <4996AF96-76BA-47F1-B328-D4FE7AC777EE@sarenet.es>

> On 29 Jul 2016, at 17:44, Jim Harris wrote:
>
> On Fri, Jul 29, 2016 at 1:10 AM, Borja Marcos wrote:
>
> > On 28 Jul 2016, at 19:25, Jim Harris wrote:
> >
> > Yes, you should worry.
> >
> > Normally we could use the dump_debug sysctls to help debug this - these
> > sysctls will dump the NVMe I/O submission and completion queues.
> > But in this case the LBA data is in the payload, not the NVMe submission
> > entries, so dump_debug will not help as much as dumping the NVMe DSM
> > payload directly.
> >
> > Could you try the attached patch and send output after recreating your pool?
>
> Just in case the evil anti-spam ate my answer, sent the results to your
> Gmail account.
>
> Thanks Borja.
>
> It looks like all of the TRIM commands are formatted properly. The
> failures do not happen until about 10 seconds after the last TRIM to
> each drive was submitted, and immediately before TRIMs start to the next
> drive, so I'm assuming the failures are for the last few TRIM
> commands but cannot say for sure. Could you apply patch v2 (attached)
> which will dump the TRIM payload contents inline with the failure
> messages?

Sure, this is the complete /var/log/messages starting with the system boot.
Before booting I destroyed the pool so that you could capture what happens
when booting, zpool create, etc.

Remember that the drives are in LBA format #3 (4 KB blocks). As far as I
know, that’s preferred to the old 512-byte blocks.

Thank you very much and sorry about the belated response.

Borja.
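
A rough sketch of why the DSM payload matters here: each TRIM range travels in the data buffer as a 16-byte entry (context attributes, length in logical blocks, starting LBA), so the submission queue entry alone does not show which LBAs were trimmed. The snippet below decodes such a buffer, assuming 4 KB logical blocks as with LBA format #3 on these drives. The function name, the sample values, and the choice to skip zero-length entries are illustrative assumptions; none of it is taken from the attached patch.

```python
import struct

# One NVMe Dataset Management range entry is 16 bytes, little-endian:
# context attributes (4 bytes), length in logical blocks (4 bytes),
# starting LBA (8 bytes).
DSM_RANGE = struct.Struct("<IIQ")

# Assumption: 4 KB logical blocks (LBA format #3 on the drives in this thread).
LBA_SIZE = 4096


def decode_dsm_payload(payload, lba_size=LBA_SIZE):
    """Hypothetical helper: list the ranges found in a DSM payload buffer."""
    ranges = []
    for off in range(0, len(payload) - DSM_RANGE.size + 1, DSM_RANGE.size):
        attrs, nblocks, start_lba = DSM_RANGE.unpack_from(payload, off)
        if nblocks == 0:
            continue  # illustrative choice: treat zero-length entries as unused
        ranges.append({
            "attrs": attrs,
            "start_lba": start_lba,
            "blocks": nblocks,
            "byte_offset": start_lba * lba_size,
            "byte_length": nblocks * lba_size,
        })
    return ranges


# Made-up sample payload with two ranges, for illustration only.
example = DSM_RANGE.pack(0, 256, 2048) + DSM_RANGE.pack(0, 8, 1000000)
for r in decode_dsm_payload(example):
    print(r)
```

With 4 KB blocks the same block count covers eight times as many bytes as with 512-byte blocks, which is worth keeping in mind when comparing dumped ranges against the partition layout.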