From owner-freebsd-fs@FreeBSD.ORG Mon Jan 14 19:51:50 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id B7CAE16B; Mon, 14 Jan 2013 19:51:50 +0000 (UTC) (envelope-from nicolas@i.0x5.de) Received: from n.0x5.de (n.0x5.de [217.197.85.144]) by mx1.freebsd.org (Postfix) with ESMTP id E5A186FB; Mon, 14 Jan 2013 19:51:49 +0000 (UTC) Received: by pc5.i.0x5.de (Postfix, from userid 1003) id 3YlQLD53Bzz7ySc; Mon, 14 Jan 2013 20:51:48 +0100 (CET) Date: Mon, 14 Jan 2013 20:51:48 +0100 From: Nicolas Rachinsky To: Artem Belevich Subject: Re: slowdown of zfs (tx->tx) Message-ID: <20130114195148.GA20540@mid.pc5.i.0x5.de> References: <20130108174225.GA17260@mid.pc5.i.0x5.de> <20130109162613.GA34276@mid.pc5.i.0x5.de> <20130110193949.GA10023@mid.pc5.i.0x5.de> <20130111073417.GA95100@mid.pc5.i.0x5.de> <20130114094010.GA75529@mid.pc5.i.0x5.de> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: X-Powered-by: FreeBSD X-Homepage: http://www.rachinsky.de X-PGP-Keyid: 887BAE72 X-PGP-Fingerprint: 039E 9433 115F BC5F F88D 4524 5092 45C4 887B AE72 X-PGP-Keys: http://www.rachinsky.de/nicolas/gpg/nicolas_rachinsky.asc User-Agent: Mutt/1.5.21 (2010-09-15) Cc: freebsd-fs X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 14 Jan 2013 19:51:50 -0000 * Artem Belevich [2013-01-14 11:13 -0800]: > On Mon, Jan 14, 2013 at 1:40 AM, Nicolas Rachinsky > wrote: > > 5 Reallocated_Sector_Ct 0x0033 094 094 010 Pre-fail Always - 166 > > 195 Hardware_ECC_Recovered 0x001a 100 100 000 Old_age Always - 1259614646 > > 196 Reallocated_Event_Count 0x0032 096 096 000 Old_age Always - 166 > > > Reallocated_Sector_Ct did not increase during the last days. > > It does not matter IMHO. That hard drive already got quite a few bad > sectors that ECC could not deal with. There are apparently more > marginally bad sectors, but ECC deals with it for now. Once enough > bits rot, you'll get more bad sectors. I personally would replace the > drive. Yes, I'll do that. > >> Cound you do gstat with 1-second interval. Some of the 5-second > >> samples show that ada8 is the bottleneck -- it has its request queue > >> full (L(q)=10) when all other drives were done with their jobs. And > >> that's a 5-sec average. Its write service time also seems to be a lot > >> higher than for other drives. > > > > Attached. I have replace ada8 by ada9, which is a Western Digital > > Caviar Black. > > > > Now ada0 and ada4 seem to be the bottleneck. > > > > But I don't understand the intervalls without any disk activity. > > It is puzzling. Is rsync still sleeping in tx->tx state? Try running > "procstat -kk " periodically. It will print in-kernel stack > trace and may help giving a clue where/why rsync is stuck. # sh -c 'for i in `jot 100`; do procstat -kk 36639 ; sleep 1; done' | sort | uniq -c 100 PID TID COMM TDNAME KSTACK 1 36639 100574 rsync - 99 36639 100574 rsync - mi_switch+0x176 sleepq_wait+0x42 _cv_wait+0x129 txg_wait_open+0x85 zfs_freebsd_write+0x3a6 VOP_WRITE_APV+0xb2 vn_write+0x373 dofilewrite+0x8b kern_writev+0x60 write+0x55 amd64_syscall+0x1f4 Xfast_syscall+0xfc # sh -c 'for i in `jot 100`; do procstat -kk 36639 ; sleep 0.36; done' | sort | uniq -c 100 PID TID COMM TDNAME KSTACK 1 36639 100574 rsync - mi_switch+0x176 sleepq_timedwait+0x42 _cv_timedwait+0x134 txg_delay+0x137 dsl_pool_tempreserve_space+0xd5 dsl_dir_tempreserve_space+0x154 dmu_tx_assign+0x370 zfs_freebsd_write+0x38a VOP_WRITE_APV+0xb2 vn_write+0x373 dofilewrite+0x8b kern_writev+0x60 write+0x55 amd64_syscall+0x1f4 Xfast_syscall+0xfc 99 36639 100574 rsync - mi_switch+0x176 sleepq_wait+0x42 _cv_wait+0x129 txg_wait_open+0x85 zfs_freebsd_write+0x3a6 VOP_WRITE_APV+0xb2 vn_write+0x373 dofilewrite+0x8b kern_writev+0x60 write+0x55 amd64_syscall+0x1f4 Xfast_syscall+0xfc # sh -c 'for i in `jot 100`; do procstat -kk 36639 ; sleep 0.1; done' | sort | uniq -c 100 PID TID COMM TDNAME KSTACK 100 36639 100574 rsync - mi_switch+0x176 sleepq_wait+0x42 _cv_wait+0x129 txg_wait_open+0x85 zfs_freebsd_write+0x3a6 VOP_WRITE_APV+0xb2 vn_write+0x373 dofilewrite+0x8b kern_writev+0x60 write+0x55 amd64_syscall+0x1f4 Xfast_syscall+0xfc Thanks in advance Nicolas -- http://www.rachinsky.de/nicolas