From: Benjamin Close
Date: Wed, 05 Dec 2007 01:03:56 +1030
To: Alexandre Biancalana
Cc: freebsd-current@freebsd.org
Subject: Re: 7-BETA3 everyday reboot
Message-ID: <47556554.3010505@clearchain.com>
In-Reply-To: <8e10486b0712040521u5c7015b8h86d0da5554162898@mail.gmail.com>

Alexandre Biancalana wrote:
> On Nov 30, 2007 9:48 AM, Gary Jennejohn wrote:
>
>> On Fri, 30 Nov 2007 07:47:51 -0300
>> "Alexandre Biancalana" wrote:
>>
>>> After the reboot, the kernel running have the patch suggested by
>>> ivoras@ (http://people.freebsd.org/~pjd/patches/vm_kern.c.2.patch),
>>> but how the patch is related to "kmem_map too small", I don't think
>>> that will be usefull with this panic.
>>>
>>> I *really* need some solution :(
>>>
>>> Any ideas ?
>>>
>> The patch is potentially useful because the new code tries harder to
>> reclaim pages (8 times instead of once with a sleep in between the
>> attempts).
>>
>> The idea is that pages may become available at some time during one
>> of the sleeps.
>>
>> Just try it and see whether it helps. Can't do any harm.
>>
>
> After apply the patch the machine survives to 3 days of work, but
> paniced again after the system start using swap because some
> applications were using more memory, here is the panic message:
>
> Dec 4 03:12:33 Manny syslogd: kernel boot file is /boot/kernel/kernel
> Dec 4 03:12:33 Manny kernel: panic: vm_fault: fault on nofault entry, addr: fffffffff7a3e000
> Dec 4 03:12:33 Manny kernel: cpuid = 0
> Dec 4 03:12:33 Manny kernel: Uptime: 3d5h5m25s
> Dec 4 03:12:33 Manny kernel: Physical memory: 3061 MB
> Dec 4 03:12:33 Manny kernel: Dumping 1788 MB: 1773 1757 1741 1725
> 1709 1693 1677 1661 1645 1629 1613 1597 1581 1565 1549 1533 1517 1501
> 1485 1469 1453 1437 1421 1405 1389 1373 1357 1341 1325 1309 1293 1277
> 1261 1245 1229 1213 1197 1181 1165 1149 1133 1117 1101 1085 1069 1053
> 1037 1021 1005 989 973 957 941 925 909 893 877 861 845 829 813 797 781
> 765 749 733 717 701 685 669 653 637 621 605 589 573 557 541 525 509
> 493 477 461 445 429 413 397 381 365 349 333 317 301 285 269 253 237
> 221 205 189 173 157 141 125 109 93 77 61 45 29 13
> Dec 4 03:12:33 Manny kernel: Dump complete
> Dec 4 03:12:33 Manny kernel: Automatic reboot in 15 seconds - press a key on the console to abort
> Dec 4 03:12:33 Manny kernel: Rebooting...
> Dec 4 03:12:33 Manny kernel: cpu_reset: Stopping other CPUs
>
> Any other ideas ?!
>

I have found that turning off the ZIL and prefetch seems to keep things
happier on one of the heavily loaded servers that I look after. It also
appears to prevent a deadlock under very heavy load, something I've not
yet had time to debug.

Try adding:

vfs.zfs.zil_disable=1
vfs.zfs.prefetch_disable="1"

to /boot/loader.conf and let us know if it makes a difference.

Cheers,
	Benjamin
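
For anyone applying the two tunables above, here is a minimal sketch of one way to add them to /boot/loader.conf without creating duplicate entries. The helper name set_tunable and the LOADER_CONF variable are made up for illustration and are not part of FreeBSD. The sysctl check in the trailing comment assumes both tunables are also readable as sysctls on the running kernel.

```shell
#!/bin/sh
# Sketch only: append a loader tunable to a loader.conf-style file
# unless a line for that tunable is already there.
# LOADER_CONF and set_tunable are illustrative names (assumptions).
LOADER_CONF="${LOADER_CONF:-/boot/loader.conf}"

set_tunable() {
    # $1 = tunable name, $2 = value
    # awk compares the text before "=" exactly, so the dots in the
    # tunable name are not treated as regex wildcards.
    if awk -F= -v n="$1" '$1 == n { found = 1 } END { exit !found }' \
        "$LOADER_CONF" 2>/dev/null; then
        echo "$1 is already set in $LOADER_CONF, leaving it alone"
    else
        printf '%s="%s"\n' "$1" "$2" >> "$LOADER_CONF"
    fi
}

# Example, run as root on the affected machine:
#   set_tunable vfs.zfs.zil_disable 1
#   set_tunable vfs.zfs.prefetch_disable 1
#
# After the next reboot, the values can be checked (assuming they are
# exported as sysctls) with:
#   sysctl vfs.zfs.zil_disable vfs.zfs.prefetch_disable
```

Loader tunables are only read at boot, so the change has no effect until the machine is rebooted. Setting LOADER_CONF to a scratch file first is a cheap way to try the helper without touching the real configuration.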