Date: Wed, 7 Aug 2013 16:36:26 -0400
Subject: Re: Terrible disk performance with LSI / FreeBSD 9.2-RC1
From: J David
To: James Gosnell
Cc: freebsd-questions@freebsd.org

On Wed, Aug 7, 2013 at 3:15 PM, James Gosnell wrote:
> Maybe one of your drives is bad, so it's constantly doing error correction?

Not according to SMART; all the drives report no problems.

Also, all the drives seem to perform in lock-step for both reading and writing. For comparison, when one drive in an array is failing, all the drives may be pulling the same number of reads, but the failing drive will often report 100% busy and/or multi-second svc_t's while the others sit at 4% with 20 msec svc_t's or similar.

In this case, it's acting like the disks are all hugely overloaded, except without even the high svc_t's I typically associate with overworking an array.

The speeds do fluctuate. Last night it was down to 64k/sec reads per drive (about 15 reads/sec) and still reporting 90% busy on all drives.

It feels like some sort of issue with the bus/controller/kernel/driver/ZFS that is affecting all the drives equally.

Also, even ls takes forever (10-30 seconds for "ls -lh /"), but when it eventually does finish, "time ls -lh /" reports:

    0.02 real 0.00 user 0.00 sys

Really not sure what to make of that.

An attempt to run "ps axlww | fgrep ls" while the ls was running failed, because the ps hangs just as long as the ls does.

So it's like the system is repeatedly putting anything that touches the disks on hold, even if all the data being requested is clearly in cache. (Apparently that includes loading the binary for /bin/ls, or running "ls -lh /" twice in a row.)

Thanks!
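
The comparison described above (one failing drive standing out from its peers, versus every drive degraded equally) can be sketched roughly as follows. This is only an illustration, not a diagnostic tool: the function name, the drive names (da0 to da3), the thresholds, and the 30 msec svc_t in the second sample are all assumptions made up for the example; only the 100%/4%/90% busy figures and the 20 msec and multi-second svc_t's echo the message.

```python
from statistics import median

def classify_array(samples, busy_gap_pct=50.0, svc_factor=10.0):
    """Classify per-drive stats as a single-drive problem or a uniform one.

    samples maps a drive name to (busy_pct, svc_t_ms).
    busy_gap_pct and svc_factor are arbitrary, hypothetical thresholds.
    """
    busy_med = median(busy for busy, _ in samples.values())
    svc_med = median(svc for _, svc in samples.values())

    # A drive is an outlier if it is far busier than the median drive,
    # or if its service time is many times the median service time.
    outliers = sorted(
        name
        for name, (busy, svc) in samples.items()
        if busy - busy_med >= busy_gap_pct
        or (svc_med > 0 and svc / svc_med >= svc_factor)
    )
    if outliers:
        return "single-drive", outliers
    return "uniform", []

# Pattern of a failing drive: one disk pegged, the rest nearly idle.
failing = {
    "da0": (4.0, 20.0),
    "da1": (4.0, 20.0),
    "da2": (4.0, 20.0),
    "da3": (100.0, 2500.0),
}

# Pattern reported in this thread: every disk about 90% busy, no high svc_t.
lockstep = {name: (90.0, 30.0) for name in ("da0", "da1", "da2", "da3")}

print(classify_array(failing))
print(classify_array(lockstep))
```

With these made-up samples the first call flags da3 alone, while the second finds no outlier, which is the "affecting all the drives equally" situation. As a side note on the reported numbers, 64k/sec at about 15 reads/sec works out to roughly 4.3 KB per read.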