Date:      Wed, 19 Jun 2013 15:28:39 +0200
From:      "Ronald Klop" <ronald-freebsd8@klop.yi.org>
To:        freebsd-stable@freebsd.org, Dennis Kögel <dk@neveragain.de>
Subject:   Re: Weird I/O hangs (9.1R, arcsas, interrupt spikes on uhci0)
Message-ID:  <op.wyxg11zc8527sy@ronaldradial.versatec.local>
In-Reply-To: <C2AA9591-CBF4-4956-BABE-08BD8994FF8C@neveragain.de>
References:  <C2AA9591-CBF4-4956-BABE-08BD8994FF8C@neveragain.de>

On Wed, 19 Jun 2013 15:01:14 +0200, Dennis Kögel <dk@neveragain.de> wrote:

> Hi,
>
> at regular intervals, we see I/O hangs lasting about 10 seconds, roughly  
> once per minute.
>
> Each time this happens, the I/O rate simply drops to zero, and all disk  
> access hangs; this is also very noticeable on the shell, for NFS clients  
> etc. Everything else (networking, kernel, …) seems to continue normally.
>
> Environment: FreeBSD 9.1R GENERIC on amd64, using ZFS, on an ARC1320 PCIe  
> controller with 24x Seagate ST33000650SS (3rd party arcsas.ko driver).
>
> It's easy to observe these hangs under write load, e.g. with 'zpool  
> iostat 1':
>
> void        22.4T  42.6T     34  2.73K  1.07M   293M
> void        22.4T  42.6T     20  2.74K   623K   289M
> void        22.4T  42.6T    144  2.62K  4.83M   279M
> void        22.4T  42.6T     13  2.60K   437K   283M
> void        22.4T  42.6T      0      0      0      0 <-- hang starts
> void        22.4T  42.6T      0      0      0      0
> void        22.4T  42.6T      0      0      0      0
> void        22.4T  42.6T      0      0      0      0
> void        22.4T  42.6T      0      0      0      0
> void        22.4T  42.6T      0      0      0      0
> void        22.4T  42.6T      0      0      0      0
> void        22.4T  42.6T      0      0      0      0
> void        22.4T  42.6T      0    296  4.00K  34.2M <-- hang ends
> void        22.4T  42.6T      2  2.64K  73.8K   288M
> void        22.4T  42.6T      8  3.12K   278K   329M
>
> Each time this happens, there is a completely unexplained spike of  
> interrupts on uhci0: 'systat -vm' then displays numbers around 270k.
>
> # vmstat -i | grep -E '(arcsas|uhci0|Total)'
> irq16: uhci0                  1227020890      67708
> irq24: arcsas0                  12045211        664
> Total                         1266417827      69882
>
> Things to note:
>
> - Booting a USB-less kernel or disabling all USB in the BIOS doesn't  
> change a thing (no interrupt spikes to be seen, but the hangs remain)
> - The hangs / interrupt spikes happen just as often when the system is  
> idle
> - Board is a Supermicro x8dth
> - There are two igb cards
> - Root is ZFS as well (separate pool though)
> - BIOS, Areca FW and driver already are latest versions
> - Moving the controller to a different slot doesn't change the behaviour
> - We have two identical systems and both show the exact same symptoms,  
> so flaky hardware is probably not the issue
>
> Any ideas would be appreciated.
>
> Thanks,
> D.

First, please send more information about the system:
- The content of /var/run/dmesg.boot.
- Install /usr/ports/sysutils/zfs-stats and send the output of zfs-stats  
-a.
- Send the output of zpool status + zpool list.
- Did you configure compression or dedup on the pool?
- Do you keep a lot of snapshots?
- Do you run a cron job every minute that does something with the pool,  
such as gathering statistics?
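
The items above could be collected in one pass with something like the following. This is only a rough sketch: the report path and the run_cmd helper are made-up names, zfs-stats has to be installed from ports first, and the zfs and crontab lines are just one possible way to answer the compression/dedup, snapshot and cron questions.

```shell
#!/bin/sh
# Collect the requested diagnostics into a single report file.
# Assumption: the default output path below is a hypothetical choice.
out="${1:-/tmp/zfs-hang-report.txt}"

# Hypothetical helper: print a header, then the command's output.
# If the tool is not installed, note that instead of failing.
run_cmd() {
    echo "=== $* ==="
    if command -v "$1" >/dev/null 2>&1; then
        "$@" 2>&1
    else
        echo "(not available: $1)"
    fi
    echo
}

{
    run_cmd cat /var/run/dmesg.boot
    run_cmd zfs-stats -a                   # from sysutils/zfs-stats
    run_cmd zpool status
    run_cmd zpool list
    run_cmd zfs get compression,dedup      # all datasets on the system
    run_cmd zfs list -H -t snapshot -o name
    run_cmd crontab -l                     # crontab of the current user
    run_cmd cat /etc/crontab               # system-wide crontab
} > "$out"

echo "wrote $out"
```

Run it as root so that zpool, zfs and the root crontab are readable, then attach the resulting file to the reply.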

Ronald.


