From: "Marc G. Fournier"
Date: Fri, 27 Apr 2007 16:03:19 -0300
To: Kris Kennaway, LI Xin
Cc: freebsd-stable@freebsd.org, Oleg Derevenetz
Subject: Re: How to report bugs (Re: 6.2-STABLE deadlock?)
In-Reply-To: <20070425035316.GB44054@xor.obsecurity.org>
References: <20070313140848.GA89182@steerpike.hanley.stade.co.uk>
 <20070423025631.GA33256@steerpike.hanley.stade.co.uk>
 <20070423113912.GE2052@deviant.kiev.zoral.com.ua>
 <462DDB4D.8080507@delphij.net>
 <1177442585.462e5919c71f0@webmail.vsi.ru>
 <462EC294.3040001@delphij.net>
 <20070425035316.GB44054@xor.obsecurity.org>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

- --On Tuesday, April 24, 2007 23:53:16 -0400 Kris Kennaway wrote:

> On Wed, Apr 25, 2007 at 10:53:08AM +0800, LI Xin wrote:
>> Hi, Oleg,
>>
>> Oleg Derevenetz wrote:
>> > ??????? LI Xin :
>> [...]
>> >> I'm not very sure if this is specific to one disk controller.  Actually
>> >> I got some occasional reports about similar hangs on amd64 6.2-RELEASE
>> >> (slightly patched version) that most of processes stuck in the 'ufs'
>> >> state, under very light load, the box was equipped with amr(4) RAID.
>> >>
>> >> I was not able to reproduce the problem at my lab, though, it's still
>> >> unknown that how to trigger the livelock :-(  Still need some
>> >> investigate on their production system.
>> >
>> > I reported similar issue for FreeBSD 6.2 in audit-trail for kern/104406:
>> >
>> > http://www.freebsd.org/cgi/query-pr.cgi?pr=104406&cat=
>> >
>> > and there should be a thread related to this. Briefly, I suspect that
>> > this is related to nullfs filesystems on my server and when I cvsuped to
>> > FreeBSD 6.2-STABLE with Daichi's unionfs-related patches and replaced
>> > nullfs-mounted fs with unionfs-mounted (that was done 10.03.07) problem
>> > is gone (seems to be so, at least).
>>
>> Hmm...  Seems to be different issues.  The problem I have received was a
>> pgsql server (no nullfs/unionfs involved), and the hang always happens
>> when it is not being heavily loaded (usually in the morning, for
>> instance, and there is no special configuration, like scheduled tasks
>> which can generate disk load, etc., only the entropy harvesting), so
>> this is quite confusing.
>
> Yes, a large part of the confusion is the unfortunate tendency of
> people to do the following:
>
> my system hangs/panics/etc
> my system hangs/panics/etc too; it must be the same problem!
>
> What we really need is for every FreeBSD user who encounters a
> hang/panic/etc to avoid jumping to conclusions -- no matter how many
> superficial similarities there may seem to you -- and instead go
> through the relevant steps described here:
>
> http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug.html
>
> Until you (or a developer) have analyzed the resulting information,
> you cannot definitively determine whether or not your problem is the
> same as a given random other problem, and you may just confuse the
> issue by making claims of similarity when you are really reporting a
> completely separate problem.

What about those that don't have the benefit of being able to access the
console? :(  I've recently started buying servers that have built-in, full
remote console (i.e. the HP servers), but, for instance, I have one box
that I have to consistently reboot every 3 days due to a 'No Buffer Space
Available' ...

A thought: how hard would it be to add some method of forcing a system
crash, that would dump core, from the command line?  Something that, by
default, would be disabled, but for remote debugging purposes, one could
enable in the kernel and do a 'sysctl kernel.force_core_crash=1' to have
it do it?

I imagine that having a core to analyze would allow providing more
information than nothing at all, no?

- ----
Marc G. Fournier           Hub.Org Networking Services (http://www.hub.org)
Email . scrappy@hub.org                              MSN . scrappy@hub.org
Yahoo . yscrappy               Skype: hub.org        ICQ . 7615664
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (FreeBSD)

iD8DBQFGMkj34QvfyHIvDvMRAnIsAJ42loBGh0TkX4mfWSrZrMq2FheBuQCgiu4l
B0PCLtLhd9ZiJ4oNLWZ6LT0=
=KK9Y
-----END PGP SIGNATURE-----