From: Warner Losh <wlosh@bsdimp.com>
To: Shawn Webb
Cc: Dmitry Morozovsky, "svn-src-head@freebsd.org", "svn-src-all@freebsd.org", src-committers, Warner Losh
Date: Thu, 14 Apr 2016 16:24:45 -0600
Subject: Re: svn commit: r298002 - in head/sys: cam cam/ata cam/scsi conf dev/ahci

On Thu, Apr 14, 2016 at 4:15 PM, Shawn Webb wrote:
> On Thu, Apr 14, 2016 at 04:04:27PM -0600, Warner Losh wrote:
> > On Thu, Apr 14, 2016 at 3:54 PM, Dmitry Morozovsky wrote:
> > >
> > > Warner,
> > >
> > > On Thu, 14 Apr 2016, Warner Losh wrote:
> > >
> > > > Author: imp
> > > > Date: Thu Apr 14 21:47:58 2016
> > > > New Revision: 298002
> > > > URL: https://svnweb.freebsd.org/changeset/base/298002
> > > >
> > > > Log:
> > > >   New CAM I/O scheduler for FreeBSD. The default I/O scheduler is the same
> > >
> > > [snip]
> > >
> > > First, thanks so much for this quite non-trivial work!
> > > What are the ways to enable this instead of the default, and what are
> > > the benefits and drawbacks?
> >
> > You add CAM_NETFLIX_IOSCHED to your kernel config (an "options
> > CAM_NETFLIX_IOSCHED" line) to enable it. Hmmm, looking at the diff,
> > perhaps I should add that to LINT.
> >
> > In production, we use it for three things. First, our scheduler keeps
> > a lot more statistics than the default one. These statistics are
> > useful for knowing when a system is saturated and needs to shed load.
> > Second, we favor reads over writes because our workload, as you might
> > imagine, is a read-mostly workload. Finally, in some systems, we
> > throttle the write throughput to the SSDs. The SSDs we buy can do
> > 300MB/s write while serving 400MB/s read, but only for short periods
> > of time (long enough to do 10-20GB of traffic). After that, write
> > performance drops, and read performance goes out the window.
> > Experiments have shown that if we limit the write speed to no more
> > than 30MB/s or so, then the garbage collection the drive is doing
> > won't adversely affect the read latency / performance.
>
> Going on a tangent here, but related:
>
> As someone who is just barely stepping into the world of benchmarks and
> performance metrics, can you shed some light as to how you gained those
> metrics? I'd be extremely interested to learn.

These numbers were derived through an iterative process. All our systems
report a large number of statistics while they are running. The disk
performance numbers come from gstat(8), which ultimately derives them from
devstat(9). When we enabled serving customer traffic while refreshing
content, we noticed a large number of reports from our playback clients
indicating problems with the server during this time period. I looked at
the graphs to see what was going on. Once I found the problem, I was able
to see that as the write load varied, the latency numbers for the reads
varied substantially as well.

I added code to the I/O scheduler so I could rate-limit the write speed to
the SSDs. After running through a number of different machines over a
number of nights of filling and serving, I was able to find the right
number. If I set it to 30MB/s, the 20 machines I tested didn't have any
reports above the background level of problems. When I set it to 35MB/s,
a couple of those machines had problems. When I set it to 40MB/s, a couple
more did. When I set it to 80MB/s, almost all of them had problems. Being
conservative, I set it to the highest number that showed no ill effect on
the clients. I was able to see large jumps in read latency at write rates
as low as 25MB/s, though.

Sadly, this is with Netflix-internal tools, but one could do the same
research with gstat and scripting. One could also use dtrace to study the
latency patterns to a much finer degree of fidelity than gstat offers.

Warner
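
P.S. For anyone who wants to try the gstat-and-scripting approach with
stock tools, a rough sketch would be to sample gstat in batch mode while
varying the write load and pull out the read-latency column. This is only
an illustration, not our internal tooling, and the ada0 device name is
just an example:

    # Sample ada0 once a second in batch mode and print the average
    # read latency (the ms/r column) for each interval.
    gstat -b -I 1s -f '^ada0$' | awk '$NF == "ada0" { print $5 }'

Column positions can differ between gstat versions, so check the header
line before trusting the awk field number.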
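The dtrace version gives you the full latency distribution rather than an
average. A generic io-provider one-liner along these lines (again, a
sketch rather than what we run in production) quantizes the time from I/O
dispatch to completion:

    dtrace -n '
        /* Record the dispatch timestamp, keyed by the I/O request. */
        io:::start { ts[arg0] = timestamp; }
        /* On completion, add the elapsed time to a histogram. */
        io:::done /ts[arg0]/ {
            @["I/O latency (ns)"] = quantize(timestamp - ts[arg0]);
            ts[arg0] = 0;
        }'

Let it run during a content fill, hit Ctrl-C, and compare the histogram
against one taken while the drive is seeing no writes.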