FreeBSD Mail Archives

Date:      Mon, 14 Jan 2013 12:41:01 -0800
From:      Artem Belevich <art@freebsd.org>
To:        Nicolas Rachinsky <fbsd-mas-0@ml.turing-complete.org>
Cc:        freebsd-fs <freebsd-fs@freebsd.org>
Subject:   Re: slowdown of zfs (tx->tx)
Message-ID:  <CAFqOu6jwJ4qhbOovN_NhzusdQJvrbvUC3g93sziR=Uw99SGenw@mail.gmail.com>
In-Reply-To: <20130114195148.GA20540@mid.pc5.i.0x5.de>
References:  <20130108174225.GA17260@mid.pc5.i.0x5.de> <CAFqOu6jgA8RWV5d%2BrOBk8D=3Vu3yWSnDkAi1cFJ0esj4OpBy2Q@mail.gmail.com> <20130109162613.GA34276@mid.pc5.i.0x5.de> <CAFqOu6jrng=v8eVyhqV-PBqJM_dYy%2BU7X4%2B=ahBeoxvK4mxcSA@mail.gmail.com> <20130110193949.GA10023@mid.pc5.i.0x5.de> <20130111073417.GA95100@mid.pc5.i.0x5.de> <CAFqOu6gWpMsWN0pTBiv10WfwyGWMfO9GzMLWTtcVxHixr-_i3Q@mail.gmail.com> <20130114094010.GA75529@mid.pc5.i.0x5.de> <CAFqOu6hxfGt_M6Jo9qWeifDz9YnNc_Bd9H-GEe4RYtutaPvH5w@mail.gmail.com> <20130114195148.GA20540@mid.pc5.i.0x5.de>

txg_wait_open means that ZFS is waiting for ongoing transaction group
sync. There should've been some write activity in this case.

Check what zfs kernel threads are doing with procstat -kk on zfskern proces=
s.

--Artem

On Mon, Jan 14, 2013 at 11:51 AM, Nicolas Rachinsky
<fbsd-mas-0@ml.turing-complete.org> wrote:
> * Artem Belevich <art@freebsd.org> [2013-01-14 11:13 -0800]:
>> On Mon, Jan 14, 2013 at 1:40 AM, Nicolas Rachinsky
>> <fbsd-mas-0@ml.turing-complete.org> wrote:
>> >   5 Reallocated_Sector_Ct   0x0033   094   094   010    Pre-fail  Alwa=
ys       -       166
>> > 195 Hardware_ECC_Recovered  0x001a   100   100   000    Old_age   Alwa=
ys       -       1259614646
>> > 196 Reallocated_Event_Count 0x0032   096   096   000    Old_age   Alwa=
ys       -       166
>>
>> > Reallocated_Sector_Ct did not increase during the last days.
>>
>> It does not matter IMHO. That hard drive already got quite a few bad
>> sectors that ECC could not deal with. There are apparently more
>> marginally bad sectors, but ECC deals with it for now. Once enough
>> bits rot, you'll get more bad sectors. I personally would replace the
>> drive.
>
> Yes, I'll do that.
>
>> >> Cound you do gstat with 1-second interval. Some of the 5-second
>> >> samples show that ada8 is the bottleneck -- it has its request queue
>> >> full (L(q)=3D10) when all other drives were done with their jobs. And
>> >> that's a 5-sec average. Its write service time also seems to be a lot
>> >> higher than for other drives.
>> >
>> > Attached.  I have replace ada8 by ada9, which is a Western Digital
>> > Caviar Black.
>> >
>> > Now ada0 and ada4 seem to be the bottleneck.
>> >
>> > But I don't understand the intervalls without any disk activity.
>>
>> It is puzzling. Is rsync still sleeping in tx->tx state? Try running
>> "procstat -kk <rsync-PID>" periodically. It will print in-kernel stack
>> trace and may help giving a clue where/why rsync is stuck.
>
> # sh -c 'for i in `jot 100`; do procstat -kk 36639 ; sleep 1; done' | sor=
t | uniq -c
>  100   PID    TID COMM             TDNAME           KSTACK
>    1 36639 100574 rsync            -                <running>
>   99 36639 100574 rsync            -                mi_switch+0x176 sleep=
q_wait+0x42 _cv_wait+0x129 txg_wait_open+0x85 zfs_freebsd_write+0x3a6 VOP_W=
RITE_APV+0xb2 vn_write+0x373 dofilewrite+0x8b kern_writev+0x60 write+0x55 a=
md64_syscall+0x1f4 Xfast_syscall+0xfc
>
> # sh -c 'for i in `jot 100`; do procstat -kk 36639 ; sleep 0.36; done' | =
sort | uniq -c
>  100   PID    TID COMM             TDNAME           KSTACK
>    1 36639 100574 rsync            -                mi_switch+0x176 sleep=
q_timedwait+0x42 _cv_timedwait+0x134 txg_delay+0x137 dsl_pool_tempreserve_s=
pace+0xd5 dsl_dir_tempreserve_space+0x154 dmu_tx_assign+0x370 zfs_freebsd_w=
rite+0x38a VOP_WRITE_APV+0xb2 vn_write+0x373 dofilewrite+0x8b kern_writev+0=
x60 write+0x55 amd64_syscall+0x1f4 Xfast_syscall+0xfc
>   99 36639 100574 rsync            -                mi_switch+0x176 sleep=
q_wait+0x42 _cv_wait+0x129 txg_wait_open+0x85 zfs_freebsd_write+0x3a6 VOP_W=
RITE_APV+0xb2 vn_write+0x373 dofilewrite+0x8b kern_writev+0x60 write+0x55 a=
md64_syscall+0x1f4 Xfast_syscall+0xfc
>
> # sh -c 'for i in `jot 100`; do procstat -kk 36639 ; sleep 0.1; done' | s=
ort | uniq -c
>  100   PID    TID COMM             TDNAME           KSTACK
>  100 36639 100574 rsync            -                mi_switch+0x176 sleep=
q_wait+0x42 _cv_wait+0x129 txg_wait_open+0x85 zfs_freebsd_write+0x3a6 VOP_W=
RITE_APV+0xb2 vn_write+0x373 dofilewrite+0x8b kern_writev+0x60 write+0x55 a=
md64_syscall+0x1f4 Xfast_syscall+0xfc
>
>
> Thanks in advance
>
> Nicolas
>
> --
> http://www.rachinsky.de/nicolas

Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?CAFqOu6jwJ4qhbOovN_NhzusdQJvrbvUC3g93sziR=Uw99SGenw>

Header And Logo

Peripheral Links

Site Navigation

Header And Logo

Peripheral Links

Search

Site Navigation