In-Reply-To: <5291B2CC.2040907@gibfest.dk>
References: <5290E0CF.20704@gibfest.dk> <5291B2CC.2040907@gibfest.dk>
From: Eitan Adler
Date: Sun, 24 Nov 2013 11:29:06 -0500
Subject: Re: ZFS (or something) is absurdly slow
To: Thomas Steen Rasmussen
Cc: "freebsd-fs@freebsd.org"

On Sun, Nov 24, 2013 at 3:03 AM, Thomas Steen Rasmussen wrote:
> On 24-11-2013 04:15, Eitan Adler wrote:
>>
>>> vfsstat.d https://forums.freebsd.org/showpost.php?p=182070&postcount=6
>>
>> I can run this script, what output should I be looking for?
>
> Check the sample output on the page: It shows two lists, "Number
> of operations" and "Bytes read or write". The lists are ordered
> with the busiest at the bottom and separated by filesystem
> location. While things are slow, try running it to see if some
> location on the filesystem is being hammered.....

I will do so.

> Another thing you should probably do is run a SMART check on the
> disk to see if something is wrong with it.

See the complete output below. The only thing which stands out to me is:
200 Multi_Zone_Error_Rate 0x002a 100 100 000 Old_age Always - 19275

> I had another case with
> a zfs mirror that performed appallingly, turned out it was because
> one of the disks was dodgy, not in a way that made zfs show
> checksum errors, but enough to make it really really slow. Since a
> ZFS vdev only performs as good as the slowest disk in a vdev,
> which in turn will slow the whole pool down, replacing the disk
> made everything much better.

===============
smartctl 6.2 2013-07-26 r3841 [FreeBSD 11.0-CURRENT amd64] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Seagate Momentus SpinPoint M8 (AF)
Device Model:     ST1000LM024 HN-M101MBB
Serial Number:    S2U5J9FCB79134
LU WWN Device Id: 5 0004cf 20904e7cf
Firmware Version: 2AR10001
User Capacity:    1,000,204,886,016 bytes [1.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5400 rpm
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ATA8-ACS T13/1699-D revision 6
SATA Version is:  SATA 3.0, 3.0 Gb/s (current: 3.0 Gb/s)
Local Time is:    Sun Nov 24 11:23:21 2013 EST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                (12780) seconds.
Offline data collection
capabilities:                    (0x5b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        No Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        ( 213) minutes.
SCT capabilities:              (0x003f) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   100   100   051    Pre-fail  Always       -       25
  2 Throughput_Performance  0x0026   252   252   000    Old_age   Always       -       0
  3 Spin_Up_Time            0x0023   089   089   025    Pre-fail  Always       -       3453
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       158
  5 Reallocated_Sector_Ct   0x0033   252   252   010    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   252   252   051    Old_age   Always       -       0
  8 Seek_Time_Performance   0x0024   252   252   015    Old_age   Offline      -       0
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       7285
 10 Spin_Retry_Count        0x0032   252   252   051    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       280
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       180
191 G-Sense_Error_Rate      0x0022   100   100   000    Old_age   Always       -       155
192 Power-Off_Retract_Count 0x0022   252   252   000    Old_age   Always       -       0
194 Temperature_Celsius     0x0002   055   047   000    Old_age   Always       -       45 (Min/Max 18/63)
195 Hardware_ECC_Recovered  0x003a   100   100   000    Old_age   Always       -       0
196 Reallocated_Event_Count 0x0032   252   252   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   252   252   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   252   252   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0036   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x002a   100   100   000    Old_age   Always       -       19275
223 Load_Retry_Count        0x0032   100   100   000    Old_age   Always       -       280
225 Load_Cycle_Count        0x0032   091   091   000    Old_age   Always       -       95664

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged.
[To run self-tests, use: smartctl -t]

SMART Selective self-test log data structure revision number 0
Note: revision number not 1 implies that no selective self-test has ever been run
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Completed [00% left] (0-65535)
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
=====================

> What does diskinfo -ct /dev/whatever say about the seek times on the
> bad disk ?

[10005 root@gravity (100%) /home/eitan !2!]#diskinfo -ct /dev/ada1
/dev/ada1
        512             # sectorsize
        1000204886016   # mediasize in bytes (932G)
        1953525168      # mediasize in sectors
        4096            # stripesize
        0               # stripeoffset
        1938021         # Cylinders according to firmware.
        16              # Heads according to firmware.
        63              # Sectors according to firmware.
        S2U5J9FCB79134  # Disk ident.

I/O command overhead:
        time to read 10MB block      0.099713 sec = 0.005 msec/sector
        time to read 20480 sectors   1.447996 sec = 0.071 msec/sector
        calculated command overhead               = 0.066 msec/sector

Seek times:
        Full stroke:      250 iter in  8.036950 sec = 32.148 msec
        Half stroke:      250 iter in  5.463750 sec = 21.855 msec
        Quarter stroke:   500 iter in 10.542506 sec = 21.085 msec
        Short forward:    400 iter in  5.707363 sec = 14.268 msec
        Short backward:   400 iter in  4.645333 sec = 11.613 msec
        Seq outer:       2048 iter in  0.096977 sec =  0.047 msec
        Seq inner:       2048 iter in  1.853596 sec =  0.905 msec

Transfer rates:
        outside:       102400 kbytes in 0.949048 sec = 107898 kbytes/sec
        middle:        102400 kbytes in 1.659245 sec =  61715 kbytes/sec
        inside:        102400 kbytes in 2.020322 sec =  50685 kbytes/sec

> Are the results the same if you boot off of a USB stick and
> test the disk when it is completely idle and independent of the running OS ?

Good question. I cannot check this at the moment.

-- 
Eitan Adler