From owner-freebsd-arch@FreeBSD.ORG Tue Aug 23 16:42:21 2011 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id A9406106564A; Tue, 23 Aug 2011 16:42:21 +0000 (UTC) (envelope-from mdf356@gmail.com) Received: from mail-qw0-f54.google.com (mail-qw0-f54.google.com [209.85.216.54]) by mx1.freebsd.org (Postfix) with ESMTP id 4A3C38FC1A; Tue, 23 Aug 2011 16:42:21 +0000 (UTC) Received: by qwc9 with SMTP id 9so271109qwc.13 for ; Tue, 23 Aug 2011 09:42:20 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=mime-version:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type :content-transfer-encoding; bh=9hHStlk0HBkpAEc/ukPQyWsOim7WnDAn4VaDH+wFkZ4=; b=FZossZt5mZvlVf5DypIKmtXyK5BNMKXYd6uXxHux/f1m/oiRmS0D9Gjx/5KP82swQf ZnCXaaA8KfKKPe7+6wg6IwN6JiOqALfoEnCiivYSALeCAEzZyvRFkljush2N6KSR4Vlq Nt3kQ9O6j9V664J7ZgLDcJEgW/eJOMgELO7wU= MIME-Version: 1.0 Received: by 10.229.136.81 with SMTP id q17mr2462715qct.170.1314117740542; Tue, 23 Aug 2011 09:42:20 -0700 (PDT) Sender: mdf356@gmail.com Received: by 10.229.98.8 with HTTP; Tue, 23 Aug 2011 09:42:20 -0700 (PDT) In-Reply-To: <4E53AC2F.1040006@FreeBSD.org> References: <4E53986B.5000804@FreeBSD.org> <4E53AC2F.1040006@FreeBSD.org> Date: Tue, 23 Aug 2011 09:42:20 -0700 X-Google-Sender-Auth: pNZt1sSqwDQWUgSTyGMzRq9c79I Message-ID: From: mdf@FreeBSD.org To: Andriy Gapon Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Cc: freebsd-arch@freebsd.org Subject: Re: skipping locks, mutex_owned, usb X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 23 Aug 2011 16:42:21 -0000 On Tue, Aug 23, 2011 at 6:33 AM, Andriy Gapon wrote: > on 23/08/2011 15:58 mdf@FreeBSD.org said the following: >> On Tue, Aug 23, 2011 at 5:09 AM, Andriy Gapon wrote: >>> III. =A0With my stop_scheduler_on_panic patch ukbd_poll() produces infi= nite chains >>> of 'infinite' recursion because mtx_owned() always returns false. =A0Th= is is because >>> I skip all lock/unlock/etc operations in the post-panic context. =A0I t= hink that >>> it's a good philosophical question: what operations like mtx_owned(), >>> mtx_recursed(), mtx_trylock() 'should' return when we actually act as i= f no locks >>> exist at all? >>> >>> My first knee-jerk reaction was to change mtx_owned() to return true in= this >>> "lock-less" context, but _hypothetically_ there could exist some symmet= ric code >>> that does something like: >>> func() >>> { >>> =A0 =A0 =A0 =A0if (mtx_owned(&Giant)) { >>> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0mtx_unlock(&Giant); >>> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0func(); >>> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0mtx_lock(&Giant); >>> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0return; >>> =A0 =A0 =A0 =A0} >>> >>> =A0 =A0 =A0 =A0// etc ... >>> >>> What do you think about this problem? >> >> Given that checking for a lock being held is a lot more common than >> checking if it's not held (in the context of mtx_assert(9) and >> friends), it seems reasonable to report that a lock is held in the >> special context of after panic. > > But, OTOH, there are snippets like this (found with grep, haven't looked = at the code): > /usr/src/sys/kern/kern_sx.c: =A0 =A0 =A0 =A0 =A0 =A0while (mtx_owned(&Gia= nt)) { > >>> That question III actually brings another thought: perhaps the whole of= idea of >>> skipping locks in a certain context is not a correct direction? >>> Perhaps, instead we should identify all the code that could be legitima= tely >>> executed in the after-panic and/or kdb contexts and make that could exp= licitly >>> aware of its execution context. =A0That is, instead of trying to make _= lock, >>> _unlock, _owned, _trylock, etc do the right thing auto-magically, we sh= ould try to >>> make the calling code check panicstr and kdb_active and do the right th= ing on that >>> level (which would include not invoking those lock-related operations o= r other >>> inappropriate operations). >> >> I don't think it's possible to identify all the code, since what runs >> after panic isn't tested very much. :-) =A0I don't disagree that one >> should think about the issue, but providing reasonable defaults like >> skipping locks reduces the burden on the programmer. > > Yes, I agree with your and John's practical approach to this. > Maybe print a warning about each skipped locking operation? =A0But not su= re if that > would be useful, if there would be no intention of changing the code. Skipped locking seems like it should be left silent. I think this is a reasonable behaviour used on many operating systems -- at least I know it is used on AIX. I like the idea of a printf() in mtx_owned() since there is no completely correct answer there. Then at least on e.g. an infinite loop like you point out, it would be clear what's happening (and presumably this could use the __FILE__ and __LINE__ of the caller in the print). Cheers, matthew