From: Matthew Jacob
Organization: Feral Software
Date: Fri, 14 May 2010 10:04:58 -0700
To: Matthew Fleming
Cc: freebsd-current@freebsd.org, freebsd-stable@freebsd.org, Terry Kennedy
Subject: Re: Crash dump problem - sleeping thread owns a non-sleepable lock during crash dump write

Matthew Fleming wrote:
>> As an aside, this is a quad-core in one package CPU (an X3363).
>> On both this box and a similar one with an X5470, console messages
>> continue to print out after "the system has been halted - press any key
>> to reboot" - in particular, the shutdown makes a bunch of the "behind
>> the scenes" management stuff like the virtual keyboard and monitor
>> appear. Plugging or unplugging USB devices will go through the whole
>> deal of detecting and making their service available.
>
> Oops, you're right that other CPUs are running. The stop_cpus() call is
> only made if kdb is entered. doadump() is called out of boot(), which
> comes later. At Isilon we've been running with a patch that does
> stop_cpus() pretty close to the front of panic(9).
>
> As a design decision it seems reasonable to call stop_cpus() early in
> panic(9), simply because most causes of a panic mean something
> unexpected has happened, and the sooner the other CPUs aren't running,
> the more likely it is that they don't do more damage, leaving the
> system in a more useful state for dump or {g,d}db analysis. This should
> be done before dump or entering kdb.
>
> I'm cc'ing -current@ since I would like a small discussion of moving the
> stop_cpus() to earlier in panic. If this change is agreeable I can roll
> up a patch and test it on CURRENT. I'm not sure yet how much of the
> other panic-related changes we have made at Isilon would be required.

Work along these lines has been done at Panasas. We were planning on
putting it back to the community. Changing this turns out to expose lots
of edge cases that we're still sorting through.
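
For anyone following the thread, here is a rough userspace model of the ordering being discussed: stop the other CPUs first, then go on to the debugger or the dump. It is only a sketch. Every name in it (sim_panic, sim_stop_other_cpus, sim_dump, SIM_NCPU) is made up for illustration; none of it is the FreeBSD kernel code, the Isilon patch, or the Panasas work, and it ignores the edge cases mentioned above.

```c
/* Userspace model only: none of these names are FreeBSD kernel APIs. */
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define SIM_NCPU 4                      /* hypothetical quad-core box */

static bool sim_cpu_running[SIM_NCPU];  /* true while a CPU is executing */

/* Mark every CPU as running, as on a healthy system. */
static void sim_boot_all(void)
{
    for (int i = 0; i < SIM_NCPU; i++)
        sim_cpu_running[i] = true;
}

/* Count CPUs that could still touch shared state. */
static int sim_running_count(void)
{
    int n = 0;
    for (int i = 0; i < SIM_NCPU; i++)
        if (sim_cpu_running[i])
            n++;
    return n;
}

/* Stand-in for stopping every CPU except the one that panicked. */
static int sim_stop_other_cpus(int self)
{
    int stopped = 0;
    for (int i = 0; i < SIM_NCPU; i++) {
        if (i != self && sim_cpu_running[i]) {
            sim_cpu_running[i] = false;
            stopped++;
        }
    }
    return stopped;
}

/* Stand-in for the dump step: refuses to run while another CPU is live. */
static bool sim_dump(int self)
{
    for (int i = 0; i < SIM_NCPU; i++)
        if (i != self && sim_cpu_running[i])
            return false;
    return true;
}

/* Modelled ordering: stop the other CPUs first, then debugger or dump. */
static bool sim_panic(int self, const char *msg)
{
    assert(self >= 0 && self < SIM_NCPU);
    int stopped = sim_stop_other_cpus(self);    /* done before anything else */
    fprintf(stderr, "panic: %s (cpu %d, stopped %d others)\n",
        msg, self, stopped);
    /* A debugger entry would go here, also after the stop. */
    return sim_dump(self);
}
```

In this model, calling sim_dump(0) right after sim_boot_all() fails because three other CPUs are still marked running, while sim_panic(0, "...") stops them first and the dump step then goes through.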