Date: Sun, 27 Oct 2013 20:46:25 +0200
From: Konstantin Belousov <kostikbel@gmail.com>
To: Alexey Tarasov <me@lexasoft.ru>
Cc: freebsd-stable@freebsd.org
Subject: Re: FreeBSD 9.2 UFS + GELI softdep_deallocate_dependencies: unrecovered I/O error
Message-ID: <20131027184625.GI59496@kib.kiev.ua>
In-Reply-To: <2AA765E7-1F17-4C6F-98BD-004AEFF88D32@lexasoft.ru>
References: <2AA765E7-1F17-4C6F-98BD-004AEFF88D32@lexasoft.ru>

On Sat, Oct 26, 2013 at 01:47:18PM +0400, Alexey Tarasov wrote:
> Hello.
>
> I've upgraded server to 9.2 and now it hangs every 2-3 hours of intensive I/O to UFS SUJ + GELI disk. On 9.1 everything was good for a half of a year.
>
> g_vfs_done():da1.eli[WRITE(offset=614630752256, length=32768)]error = 11
> g_vfs_done():da1.eli[WRITE(offset=614631211008, length=32768)]error = 11
> g_vfs_done():da1.eli[WRITE(offset=614634815488, length=32768)]error = 11
> g_vfs_done():da1.eli[WRITE(offset=614642319360, length=32768)]error = 11
> g_vfs_done():da1.eli[WRITE(offset=614642909184, length=32768)]error = 11
> g_vfs_done():da1.eli[WRITE(offset=614643007488, length=32768)]error = 11
> g_vfs_done():da1.eli[WRITE(offset=614644875264, length=32768)]error = 11
> g_vfs_done():da1.eli[WRITE(offset=550691995648, length=98304)]error = 11
> g_vfs_done():da1.eli[WRITE(offset=550692519936, length=32768)]error = 11
> g_vfs_done():da1.eli[WRITE(offset=550704152576, length=32768)]error = 11
> /data/pgsql/data/base: got error 11 while accessing filesystem
> panic: softdep_deallocate_dependencies: unrecovered I/O error
> cpuid = 10
> KDB: stack backtrace:
> #0 0xffffffff80947986 at kdb_backtrace+0x66
> #1 0xffffffff8090d9ae at panic+0x1ce
> #2 0xffffffff80b3ff90 at clear_remove+0
> #3 0xffffffff8098fb65 at brelse+0x75
> #4 0xffffffff80990978 at bufdone+0x68
> #5 0xffffffff8098c83e at biodone+0xae
> #6 0xffffffff80872f4c at g_io_schedule_up+0xac
> #7 0xffffffff808736ac at g_up_procbody+0x5c
> #8 0xffffffff808db67f at fork_exit+0x11f
> #9 0xffffffff80cdc23e at fork_trampoline+0xe
> Uptime: 6d15h5m7s
> Dumping 7664 out of 196573 MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91%
>
> Full core.txt is here: http://lexasoft.ru/core.txt.1
>
> Server is HP Proliant DL180 G6 with P410 RAID controller.

Look for your current value of the kern.bio_transient_maxcnt and increase it by 4-8 times, using the same tunable. If this helps, fine. If not, disable unmapped i/o with the vfs.unmapped_buf_allowed tunable.

Real solution is to convert geom classes like geli to use limited transient mapping windows to access the data, thus adding support for unmapped i/o to them.
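
The first suggestion in the reply (raise kern.bio_transient_maxcnt by 4-8 times through the tunable) could be sketched roughly as below. This is only an illustration and not part of the original message: the helper name suggest_maxcnt, the starting value 1024, and the factor 4 are hypothetical, and it assumes the tunables are set in /boot/loader.conf and take effect after a reboot.

```shell
#!/bin/sh
# Hypothetical helper: print a /boot/loader.conf line that raises
# kern.bio_transient_maxcnt by a given factor (the reply suggests 4-8).
# Usage: suggest_maxcnt CURRENT_VALUE [FACTOR]
suggest_maxcnt() {
    current=$1
    factor=${2:-4}
    echo "kern.bio_transient_maxcnt=\"$((current * factor))\""
}

# On the affected FreeBSD host, the current value would be read with:
#   current=$(sysctl -n kern.bio_transient_maxcnt)
# Here 1024 is an assumed placeholder, not a value from the report.
suggest_maxcnt 1024 4

# Fallback from the reply if the larger count does not help
# (a loader.conf line that disables unmapped I/O):
#   vfs.unmapped_buf_allowed="0"
```

The printed line would be added to /boot/loader.conf by hand. The fallback line is left as a comment because the reply recommends it only if the first change does not help.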