From owner-freebsd-stable@FreeBSD.ORG Sun Oct 27 18:46:33 2013
Return-Path:
Delivered-To: freebsd-stable@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1])
        (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
        (No client certificate requested)
        by hub.freebsd.org (Postfix) with ESMTP id 5482BC6F
        for ; Sun, 27 Oct 2013 18:46:33 +0000 (UTC)
        (envelope-from kostikbel@gmail.com)
Received: from kib.kiev.ua (kib.kiev.ua [IPv6:2001:470:d5e7:1::1])
        (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits))
        (No client certificate requested)
        by mx1.freebsd.org (Postfix) with ESMTPS id A27C62621
        for ; Sun, 27 Oct 2013 18:46:32 +0000 (UTC)
Received: from tom.home (kostik@localhost [127.0.0.1])
        by kib.kiev.ua (8.14.7/8.14.7) with ESMTP id r9RIkPlY014905;
        Sun, 27 Oct 2013 20:46:25 +0200 (EET)
        (envelope-from kostikbel@gmail.com)
DKIM-Filter: OpenDKIM Filter v2.8.3 kib.kiev.ua r9RIkPlY014905
Received: (from kostik@localhost)
        by tom.home (8.14.7/8.14.7/Submit) id r9RIkPVT014904;
        Sun, 27 Oct 2013 20:46:25 +0200 (EET)
        (envelope-from kostikbel@gmail.com)
X-Authentication-Warning: tom.home: kostik set sender to kostikbel@gmail.com using -f
Date: Sun, 27 Oct 2013 20:46:25 +0200
From: Konstantin Belousov
To: Alexey Tarasov
Subject: Re: FreeBSD 9.2 UFS + GELI softdep_deallocate_dependencies: unrecovered I/O error
Message-ID: <20131027184625.GI59496@kib.kiev.ua>
References: <2AA765E7-1F17-4C6F-98BD-004AEFF88D32@lexasoft.ru>
MIME-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha1;
        protocol="application/pgp-signature"; boundary="6lCXDTVICvIQMz0h"
Content-Disposition: inline
In-Reply-To: <2AA765E7-1F17-4C6F-98BD-004AEFF88D32@lexasoft.ru>
User-Agent: Mutt/1.5.22 (2013-10-16)
X-Spam-Status: No, score=-0.4 required=5.0 tests=ALL_TRUSTED,BAYES_00,
        DKIM_ADSP_CUSTOM_MED, FREEMAIL_FROM, NML_ADSP_CUSTOM_MED, URIBL_SBL
        autolearn=no version=3.3.2
X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on tom.home
Cc: freebsd-stable@freebsd.org
X-BeenThere: freebsd-stable@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Production branch of FreeBSD source code
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Sun, 27 Oct 2013 18:46:33 -0000

--6lCXDTVICvIQMz0h
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

On Sat, Oct 26, 2013 at 01:47:18PM +0400, Alexey Tarasov wrote:
> Hello.
>
> I've upgraded server to 9.2 and now it hangs every 2-3 hours of intensive
> I/O to UFS SUJ + GELI disk. On 9.1 everything was good for a half of a
> year.
>
> g_vfs_done():da1.eli[WRITE(offset=614630752256, length=32768)]error = 11
> g_vfs_done():da1.eli[WRITE(offset=614631211008, length=32768)]error = 11
> g_vfs_done():da1.eli[WRITE(offset=614634815488, length=32768)]error = 11
> g_vfs_done():da1.eli[WRITE(offset=614642319360, length=32768)]error = 11
> g_vfs_done():da1.eli[WRITE(offset=614642909184, length=32768)]error = 11
> g_vfs_done():da1.eli[WRITE(offset=614643007488, length=32768)]error = 11
> g_vfs_done():da1.eli[WRITE(offset=614644875264, length=32768)]error = 11
> g_vfs_done():da1.eli[WRITE(offset=550691995648, length=98304)]error = 11
> g_vfs_done():da1.eli[WRITE(offset=550692519936, length=32768)]error = 11
> g_vfs_done():da1.eli[WRITE(offset=550704152576, length=32768)]error = 11
> /data/pgsql/data/base: got error 11 while accessing filesystem
> panic: softdep_deallocate_dependencies: unrecovered I/O error
> cpuid = 10
> KDB: stack backtrace:
> #0 0xffffffff80947986 at kdb_backtrace+0x66
> #1 0xffffffff8090d9ae at panic+0x1ce
> #2 0xffffffff80b3ff90 at clear_remove+0
> #3 0xffffffff8098fb65 at brelse+0x75
> #4 0xffffffff80990978 at bufdone+0x68
> #5 0xffffffff8098c83e at biodone+0xae
> #6 0xffffffff80872f4c at g_io_schedule_up+0xac
> #7 0xffffffff808736ac at g_up_procbody+0x5c
> #8 0xffffffff808db67f at fork_exit+0x11f
> #9 0xffffffff80cdc23e at fork_trampoline+0xe
> Uptime: 6d15h5m7s
> Dumping 7664 out of 196573 MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91%
>
> Full core.txt is here: http://lexasoft.ru/core.txt.1
>
> Server is HP Proliant DL180 G6 with P410 RAID controller.

Check the current value of kern.bio_transient_maxcnt and increase it by
a factor of 4 to 8, using the tunable of the same name. If this helps, fine.
If not, disable unmapped I/O with the vfs.unmapped_buf_allowed tunable.

The real solution is to convert GEOM classes like geli to use limited
transient mapping windows to access the data, thus adding support for
unmapped I/O to them.

--6lCXDTVICvIQMz0h
Content-Type: application/pgp-signature

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (FreeBSD)

iQIcBAEBAgAGBQJSbV+AAAoJEJDCuSvBvK1BkSgQAI6SVZoMXhBHR9k6LvKdNFY7
EvMfT7+XNLrZIDt+kS8/+kRmJeaq9iVrR12KlOKMzVZA9WFbkPKMpjzw5HVEB2PU
n4yLG1VcWoC+N7PhZGpq0hctItd3igYjO/nlL4TgadoH5+wTawhNiaO6F8nLoVeG
jWdtDKi4R7Pc+eiBmJb4GM/iapmnM7fmCkgF1tgRScaCYkYwmK6lAZ5CN/KmRirA
ayWaRVbtCJJbZsaAXnYznRYot+it1Ev0AdvBZPzatNTmO82jV5LB5OtSt7lQJFg5
wMqsUU9K/Y0sDDNdr3As1nScL8rE+3i/n5VSleuAqV29QQcVEysgAezBA4kkl/n8
Qs+1XwKAkzDf/3Jq0A/AZOZ9sFnsGAKzXdRwxsla13ox2mIJny7jwsyNinaFWp31
em1CBjvRR+H2ozhWKhKNQrBFLyZQJWswfOl4YuGAKQYBm2QQCtuXlyDXpfttbRpJ
L3EKJ5ptq64tURfusC8vdj30fqBmq57U6XBGXnO2Wmre2yT6zDLnXW6ATrGaBxic
FdwQgo8hwife/39Ew9lC0eH5Soz7L0Cyu+11xmezLBgaXUiTRKwNkC//GDdsJuwj
89Ej33jKTX43MfIV8KD7G3i4WCcKzRyPNLvGCMxAL7LdD0lE2KFKo9Gbe74EB0Eh
krAOyLLkrsYzYezqCL7A
=NbaG
-----END PGP SIGNATURE-----

--6lCXDTVICvIQMz0h--
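
The tuning advice in the reply above (raise kern.bio_transient_maxcnt by a
factor of 4 to 8, and fall back to disabling unmapped I/O) could be sketched
roughly as follows. This is only an illustration, not something taken from
the thread: the helper name suggest_maxcnt and the sample value 1024 are
hypothetical, and the sketch assumes both tunables are set from
/boot/loader.conf and take effect after a reboot.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): print a candidate
# /boot/loader.conf line that multiplies the current value of
# kern.bio_transient_maxcnt by a chosen factor.
suggest_maxcnt() {
    current=$1        # value reported by: sysctl -n kern.bio_transient_maxcnt
    factor=${2:-4}    # the reply suggests a factor between 4 and 8
    printf 'kern.bio_transient_maxcnt="%s"\n' "$((current * factor))"
}

# On the affected host one might run, for example:
#   suggest_maxcnt "$(sysctl -n kern.bio_transient_maxcnt)" 8
# and copy the printed line into /boot/loader.conf by hand.
#
# Only if the larger value does not help, the fallback from the reply
# would be this loader.conf line (assumption: 0 turns unmapped I/O off):
#   vfs.unmapped_buf_allowed="0"
```

With a made-up current value of 1024 and the default factor, the helper
prints kern.bio_transient_maxcnt="4096". Printing the line instead of
editing loader.conf directly leaves the final choice of factor to the
administrator.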