Date: Mon, 27 Apr 2015 18:46:26 +0200
From: Tobias Oberstein
To: Jim Harris
CC: Adrian Chadd, Konstantin Belousov, "freebsd-hackers@freebsd.org", Michael Fuckner, Alan Somers
Subject: Re: NVMe performance 4x slower than expected

Hi Jim,

I have now done extensive tests under Linux (SLES12) at the block device level.

8kB Random IO results:
http://tavendo.com.s3.amazonaws.com/scratch/fio_p3700_8kB_random.pdf

All results:
http://tavendo.com.s3.amazonaws.com/scratch/fio_p3700.pdf

What becomes apparent is:

1) IOPS scale nicely, close to linearly, for (software) RAID-0.

It scales up to roughly 2.2 million IOPS for 8kB random reads and 750k IOPS for 8kB random writes.

Extrapolating Intel's datasheet would give: 2.36 million / 720k

Awesome!

2) It does not scale for RAID-1.

In fact, the write performance fully collapses for more than 4 devices.

Note: I don't know which NVMe is wired to which CPU socket, or which block device it maps to. IOW: I did not "handplace" the devices into RAID sets or anything.

==

I am currently running the same set of tests against 10 DC S3700 via SAS.
This should reveal if it's a general mdadm thing, or NVMe related.
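As a rough cross-check of the RAID-0 numbers above, the following is a minimal sketch of the arithmetic, not the method actually used for the extrapolation. It assumes the extrapolation is a plain linear multiple of a per-device figure, and it assumes 8 devices (the count mentioned in the quoted part of this thread). The names `per_device` and `scaling_efficiency` are made up for illustration.

```python
# Figures quoted in the message for 8kB random IO on software RAID-0.
# ASSUMPTION: 8 devices, and a purely linear extrapolation from per-device IOPS.
DEVICES = 8

measured = {"read": 2_200_000, "write": 750_000}      # "roughly", as measured
extrapolated = {"read": 2_360_000, "write": 720_000}  # from the datasheet extrapolation


def per_device(total_iops: int, devices: int = DEVICES) -> float:
    """Per-device IOPS implied by a total, assuming perfectly linear scaling."""
    return total_iops / devices


def scaling_efficiency(measured_iops: int, expected_iops: int) -> float:
    """Ratio of measured to expected IOPS (1.0 means perfectly linear)."""
    return measured_iops / expected_iops


if __name__ == "__main__":
    for op in ("read", "write"):
        implied = per_device(extrapolated[op])
        ratio = scaling_efficiency(measured[op], extrapolated[op])
        print(f"{op}: implied per-device IOPS = {implied:.0f}, "
              f"measured/extrapolated = {ratio:.1%}")
```

Under these assumptions the measured reads land a bit below the extrapolated total and the measured writes slightly above it, which is consistent with calling the RAID-0 scaling close to linear.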
==

For now, we will likely use the NVMes in a RAID-0 setup to get the maximum performance.

Cheers,
/Tobias

On 10.04.2015 at 18:58, Jim Harris wrote:
> On Fri, Apr 10, 2015 at 9:07 AM, Tobias Oberstein <
> tobias.oberstein@gmail.com> wrote:
>
>> Hi Adrian,
>>
>>> Dell has graciously loaned me a bunch of hardware to continue doing
>>> NUMA development on, but I have no NVMe hardware. I'm hoping people at
>>> Intel can continue kicking along any desires for NUMA that they
>>> require. (Which they have, fwiw.)
>>
>> FWIW, Dell has a roughly comparable system: Dell R920. But they don't have
>> Intel NVMe's on their menu, only Samsung (and FusionIO, but that's not
>> NVMe).
>>
>> The 8 NVMe PCIe SSDs in the box we're deploying are a key feature of this
>> system (will be a data-warehouse). A single NVMe probably won't have
>> triggered (all) issues we experienced.
>>
>> We are using the largest model (2TB), and this amounts to 50k bucks for
>> all eight. The smallest model (400GB) is 1.5k, so 12k in total.
>>
>> It's already awesome that Intel has senior engineers working on FreeBSD
>> driver code! And it would underline Intel's Open-source commitment and tech
>> leadership if they donated a couple of these beefy NVMes.
>
> Intel has agreed to send DC P3700 samples to the FreeBSD Foundation to put
> in the cluster for this kind of work - we are working on getting these
> through the internal sample distribution process at the moment.
>
> -Jim