Date:      Tue, 29 Oct 2013 11:17:34 +0400
From:      Alexey Tarasov <me@lexasoft.ru>
To:        Konstantin Belousov <kostikbel@gmail.com>
Cc:        freebsd-stable@freebsd.org, Tarasov Alexey Viktorovich <me@lexasoft.ru>
Subject:   Re: FreeBSD 9.2 UFS + GELI softdep_deallocate_dependencies: unrecovered I/O error
Message-ID:  <709C9B77-AD40-4662-96C9-A0F56369DBDE@lexasoft.ru>
In-Reply-To: <415FD2A4-E2D2-4784-A9AB-A7CCFBBAC27F@lexasoft.ru>
References:  <2AA765E7-1F17-4C6F-98BD-004AEFF88D32@lexasoft.ru> <20131027184625.GI59496@kib.kiev.ua> <415FD2A4-E2D2-4784-A9AB-A7CCFBBAC27F@lexasoft.ru>

Hello.

It seems that setting kern.bio_transient_maxcnt to 8k has resolved the problem.
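
For anyone applying the same fix, here is a minimal Python sketch of the arithmetic behind the advice quoted below (raise kern.bio_transient_maxcnt by 4-8 times). The helper name scaled_tunable_lines and the starting value 1024 are illustrative assumptions, not values from this thread; check the real current value with sysctl first. The output uses /boot/loader.conf syntax, assuming the tunable is set there at boot.

```python
def scaled_tunable_lines(name, current, factors=(4, 8)):
    """Return candidate loader.conf lines for `name` at current * factor.

    Hypothetical helper for illustration only.
    """
    return [f'{name}="{current * factor}"' for factor in factors]


if __name__ == "__main__":
    # 1024 is a placeholder, not a value reported in this thread.
    current_value = 1024
    for line in scaled_tunable_lines("kern.bio_transient_maxcnt", current_value):
        print(line)
```

Only one of the printed lines would go into the configuration at a time; starting with the smaller factor and raising it if the errors persist keeps the change conservative.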

On 27 Oct 2013, at 23:00, Alexey Tarasov <me@lexasoft.ru> wrote:

> Hello!
>
> Ok, I'll try this.
> So is this a software defect in FreeBSD 9.2?
>
> On 27 Oct 2013, at 22:46, Konstantin Belousov <kostikbel@gmail.com> wrote:
>
>> On Sat, Oct 26, 2013 at 01:47:18PM +0400, Alexey Tarasov wrote:
>>> Hello.
>>>
>>> I've upgraded the server to 9.2 and now it hangs every 2-3 hours of
>>> intensive I/O to a UFS SUJ + GELI disk. On 9.1 everything was fine for
>>> half a year.
>>>
>>> g_vfs_done():da1.eli[WRITE(offset=614630752256, length=32768)]error = 11
>>> g_vfs_done():da1.eli[WRITE(offset=614631211008, length=32768)]error = 11
>>> g_vfs_done():da1.eli[WRITE(offset=614634815488, length=32768)]error = 11
>>> g_vfs_done():da1.eli[WRITE(offset=614642319360, length=32768)]error = 11
>>> g_vfs_done():da1.eli[WRITE(offset=614642909184, length=32768)]error = 11
>>> g_vfs_done():da1.eli[WRITE(offset=614643007488, length=32768)]error = 11
>>> g_vfs_done():da1.eli[WRITE(offset=614644875264, length=32768)]error = 11
>>> g_vfs_done():da1.eli[WRITE(offset=550691995648, length=98304)]error = 11
>>> g_vfs_done():da1.eli[WRITE(offset=550692519936, length=32768)]error = 11
>>> g_vfs_done():da1.eli[WRITE(offset=550704152576, length=32768)]error = 11
>>> /data/pgsql/data/base: got error 11 while accessing filesystem
>>> panic: softdep_deallocate_dependencies: unrecovered I/O error
>>> cpuid = 10
>>> KDB: stack backtrace:
>>> #0 0xffffffff80947986 at kdb_backtrace+0x66
>>> #1 0xffffffff8090d9ae at panic+0x1ce
>>> #2 0xffffffff80b3ff90 at clear_remove+0
>>> #3 0xffffffff8098fb65 at brelse+0x75
>>> #4 0xffffffff80990978 at bufdone+0x68
>>> #5 0xffffffff8098c83e at biodone+0xae
>>> #6 0xffffffff80872f4c at g_io_schedule_up+0xac
>>> #7 0xffffffff808736ac at g_up_procbody+0x5c
>>> #8 0xffffffff808db67f at fork_exit+0x11f
>>> #9 0xffffffff80cdc23e at fork_trampoline+0xe
>>> Uptime: 6d15h5m7s
>>> Dumping 7664 out of 196573 MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91%
>>>
>>> Full core.txt is here: http://lexasoft.ru/core.txt.1
>>>
>>> Server is HP Proliant DL180 G6 with P410 RAID controller.
>>
>> Look for your current value of the kern.bio_transient_maxcnt and increase
>> it by 4-8 times, using the same tunable.  If this helps, fine.  If not,
>> disable unmapped i/o with the vfs.unmapped_buf_allowed tunable.
>>
>> Real solution is to convert geom classes like geli to use limited
>> transient mapping windows to access the data, thus adding support for
>> unmapped i/o to them.
>
> --
> Alexey Tarasov
>
> (\__/)
> (='.'=)
> E[: | | | | :]З
> (")_(")
>

--
Alexey Tarasov

(\__/)
(='.'=)
E[: | | | | :]З
(")_(")



