From owner-freebsd-arch@FreeBSD.ORG Mon May 20 20:42:10 2013 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id EEE0F261 for ; Mon, 20 May 2013 20:42:10 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from bigwig.baldwin.cx (bigwig.baldwin.cx [IPv6:2001:470:1f11:75::1]) by mx1.freebsd.org (Postfix) with ESMTP id B7BC0184D for ; Mon, 20 May 2013 20:42:10 +0000 (UTC) Received: from jhbbsd.localnet (unknown [209.249.190.124]) by bigwig.baldwin.cx (Postfix) with ESMTPSA id 71858B948; Mon, 20 May 2013 16:42:09 -0400 (EDT) From: John Baldwin To: freebsd-arch@freebsd.org Subject: Re: FreeBSD spinlock - compatibility layer Date: Mon, 20 May 2013 09:50:26 -0400 User-Agent: KMail/1.13.5 (FreeBSD/8.2-CBSD-20110714-p25; KDE/4.5.5; amd64; ; ) References: <981733489AB3BD4DB24B48340F53E0A55B0CFD79@MTLDAG01.mtl.com> In-Reply-To: <981733489AB3BD4DB24B48340F53E0A55B0CFD79@MTLDAG01.mtl.com> MIME-Version: 1.0 Content-Type: Text/Plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Message-Id: <201305200950.26834.jhb@freebsd.org> X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.7 (bigwig.baldwin.cx); Mon, 20 May 2013 16:42:09 -0400 (EDT) Cc: Orit Moskovich X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 20 May 2013 20:42:11 -0000 On Tuesday, May 14, 2013 6:04:21 am Orit Moskovich wrote: > Hi, > > I read about the FreeBSD mutex implementation for spinlock in the compatibility layer. 
> I might be wrong, but I noticed a code section that might be problematic: > > Taken from http://svn.freebsd.org/base/release/9.1.0/sys/ofed/include/linux/spinlock.h: > > static inline void > spin_lock_init(spinlock_t *lock) > { > > memset(&lock->m, 0, sizeof(lock->m)); > mtx_init(&lock->m, "lnxspin", NULL, MTX_DEF | MTX_NOWITNESS); > } > > But MTX_DEF initializes mutex as a sleep mutex: > > By default, MTX_DEF mutexes will context switch when they are already > > held. > > > There is a flag MTX_SPIN Which I think is the right one in this case . > > > > I'd appreciate your take on this issue. Since FreeBSD uses a different approach to interrupt handlers (they run in threads, not in the bottom half), a regular mutex may in fact give the closest match to the same semantics. Regular mutexes are also cheaper and in general preferable to spin mutexes whenever possible. -- John Baldwin From owner-freebsd-arch@FreeBSD.ORG Mon May 20 22:28:40 2013 Return-Path: Delivered-To: arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id DA06383E; Mon, 20 May 2013 22:28:40 +0000 (UTC) (envelope-from jilles@stack.nl) Received: from mx1.stack.nl (unknown [IPv6:2001:610:1108:5012::107]) by mx1.freebsd.org (Postfix) with ESMTP id A4C9CA0A; Mon, 20 May 2013 22:28:40 +0000 (UTC) Received: from snail.stack.nl (snail.stack.nl [IPv6:2001:610:1108:5010::131]) by mx1.stack.nl (Postfix) with ESMTP id 5F2AA1203B1; Tue, 21 May 2013 00:28:26 +0200 (CEST) Received: by snail.stack.nl (Postfix, from userid 1677) id 3620A28493; Tue, 21 May 2013 00:28:26 +0200 (CEST) Date: Tue, 21 May 2013 00:28:26 +0200 From: Jilles Tjoelker To: John Baldwin Subject: Re: Extending MADV_PROTECT Message-ID: <20130520222825.GB43407@stack.nl> References: <201305071433.27993.jhb@freebsd.org> <201305090814.52166.jhb@freebsd.org> <20130509123147.GT3047@kib.kiev.ua> <201305101535.50633.jhb@freebsd.org> <20130514192115.GA34869@stack.nl> 
<5192AE7C.10105@FreeBSD.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <5192AE7C.10105@FreeBSD.org> User-Agent: Mutt/1.5.21 (2010-09-15) Cc: Konstantin Belousov , arch@freebsd.org X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 20 May 2013 22:28:40 -0000 On Tue, May 14, 2013 at 05:37:00PM -0400, John Baldwin wrote: > On 5/14/13 3:21 PM, Jilles Tjoelker wrote: > > All this is not very important for process protection because it > > requires root privileges anyway but future procctl commands may well be > > accessible to normal users (I'm thinking of avoiding proliferation of > > pd* calls in particular). > I originally used that approach in pprotect() since that is what ktrace > uses. I did it this way in procctl() to err on the side of reporting > errors vs not, but I can easily change it. This is something I wasn't > sure of and very much appreciate feedback on. > Do you have any thoughts on having this be more ioctl-like ("automatic" > copyin/out and size encoded in cmd) vs ptrace-like (explicit sizes and > in/out keyed off of command)? If it is ioctl-like, it is possible to redirect ioctl() on a process descriptor to procctl and use cap_ioctls_limit() infrastructure. I'm not sure Capsicum people actually like that, though. In either case, it is possible to have a P_PROCDESC to affect a process by process descriptor. Capsicum may then need more CAP_*. 
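To make the "ioctl-like" option concrete: the command word itself carries the transfer direction and the argument size, so a generic dispatcher can copy the argument in and out without per-command knowledge. What follows is only a minimal userland sketch of such an encoding, not procctl() and not the real ioctl macros. The bit layout and every name in it (PCMD_*, pcmd_dispatch, struct pcmd_status) are hypothetical, and memcpy() stands in for copyin()/copyout().

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical command layout (an assumption for illustration only):
 * bit 31 = copy argument in, bit 30 = copy argument out,
 * bits 16..28 = argument size in bytes, bits 0..15 = command number.
 */
#define PCMD_IN         0x80000000u
#define PCMD_OUT        0x40000000u
#define PCMD_SIZE_SHIFT 16
#define PCMD_SIZE_MASK  0x1fffu
#define PCMD_NUM_MASK   0xffffu

#define PCMD(dir, num, type)                                            \
    ((uint32_t)(dir) |                                                  \
     (((uint32_t)sizeof(type) & PCMD_SIZE_MASK) << PCMD_SIZE_SHIFT) |   \
     ((uint32_t)(num) & PCMD_NUM_MASK))

#define PCMD_SIZE(cmd)  (((cmd) >> PCMD_SIZE_SHIFT) & PCMD_SIZE_MASK)

/* Two made-up commands with differently sized arguments. */
struct pcmd_status {
    int32_t flags;
    int32_t count;
};
#define PCMD_SET_PROTECT PCMD(PCMD_IN, 1, int32_t)
#define PCMD_GET_STATUS  PCMD(PCMD_OUT, 2, struct pcmd_status)

static int32_t protect_flag;

/* Per-command handler: it only ever sees the kernel-side buffer. */
static int
pcmd_handle(uint32_t cmd, void *kbuf)
{
    switch (cmd) {
    case PCMD_SET_PROTECT:
        memcpy(&protect_flag, kbuf, sizeof(protect_flag));
        return (0);
    case PCMD_GET_STATUS: {
        struct pcmd_status st = { protect_flag, 1 };

        memcpy(kbuf, &st, sizeof(st));
        return (0);
    }
    default:
        /* Unknown number, or known number with the wrong size. */
        return (-1);
    }
}

/* Generic dispatcher: direction and size come from cmd alone. */
static int
pcmd_dispatch(uint32_t cmd, void *uarg)
{
    unsigned char kbuf[64];
    size_t len = PCMD_SIZE(cmd);
    int error;

    assert(uarg != NULL || len == 0);
    if (len > sizeof(kbuf))
        return (-1);
    memset(kbuf, 0, sizeof(kbuf));
    if (cmd & PCMD_IN)
        memcpy(kbuf, uarg, len);        /* stands in for copyin() */
    error = pcmd_handle(cmd, kbuf);
    if (error == 0 && (cmd & PCMD_OUT))
        memcpy(uarg, kbuf, len);        /* stands in for copyout() */
    return (error);
}
```

The ptrace-like alternative would instead pass the size as an explicit argument and leave it to each command to decide what to copy in or out.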
-- Jilles Tjoelker From owner-freebsd-arch@FreeBSD.ORG Tue May 21 04:37:08 2013 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 4A820685; Tue, 21 May 2013 04:37:08 +0000 (UTC) (envelope-from oritm@mellanox.com) Received: from eu1sys200aog117.obsmtp.com (eu1sys200aog117.obsmtp.com [207.126.144.143]) by mx1.freebsd.org (Postfix) with ESMTP id 4607CE55; Tue, 21 May 2013 04:37:06 +0000 (UTC) Received: from MTLCAS01.mtl.com ([193.47.165.155]) (using TLSv1) by eu1sys200aob117.postini.com ([207.126.147.11]) with SMTP ID DSNKUZr52KACssjLF3AlUVKoawGEPu+PIZN7@postini.com; Tue, 21 May 2013 04:37:07 UTC Received: from MTLDAG01.mtl.com ([10.0.8.75]) by MTLCAS01.mtl.com ([10.0.8.71]) with mapi id 14.03.0123.003; Tue, 21 May 2013 07:36:39 +0300 From: Orit Moskovich To: John Baldwin , "freebsd-arch@freebsd.org" Subject: RE: FreeBSD spinlock - compatibility layer Thread-Topic: FreeBSD spinlock - compatibility layer Thread-Index: Ac5QiaSgCms1CiujRJ+uiUawknitKQEvjFEAACUK0bA= Date: Tue, 21 May 2013 04:36:38 +0000 Message-ID: <981733489AB3BD4DB24B48340F53E0A55B0D091F@MTLDAG01.mtl.com> References: <981733489AB3BD4DB24B48340F53E0A55B0CFD79@MTLDAG01.mtl.com> <201305200950.26834.jhb@freebsd.org> In-Reply-To: <201305200950.26834.jhb@freebsd.org> Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-originating-ip: [10.0.13.1] Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 21 May 2013 04:37:08 -0000 That's not the case when using taskqueues for deferring execution of an interrupt handler. 
Tasks can be delayed using the global taskqueue taskqueue_swi, which executes its tasks in the context of an interrupt. In this case sleep is forbidden, and using spin mutex is not (although might be not recommended). -----Original Message----- From: John Baldwin [mailto:jhb@freebsd.org] Sent: Monday, May 20, 2013 11:42 PM To: freebsd-arch@freebsd.org Cc: Orit Moskovich Subject: Re: FreeBSD spinlock - compatibility layer On Tuesday, May 14, 2013 6:04:21 am Orit Moskovich wrote: > Hi, > > I read about the FreeBSD mutex implementation for spinlock in the compatibility layer. > I might be wrong, but I noticed a code section that might be problematic: > > Taken from http://svn.freebsd.org/base/release/9.1.0/sys/ofed/include/linux/spinlock.h: > > static inline void > spin_lock_init(spinlock_t *lock) > { > > memset(&lock->m, 0, sizeof(lock->m)); > mtx_init(&lock->m, "lnxspin", NULL, MTX_DEF | MTX_NOWITNESS); > } > > But MTX_DEF initializes mutex as a sleep mutex: > > By default, MTX_DEF mutexes will context switch when they are already > > held. > > > There is a flag MTX_SPIN Which I think is the right one in this case . > > > > I'd appreciate your take on this issue. Since FreeBSD uses a different approach to interrupt handlers (they run in threads, not in the bottom half), a regular mutex may in fact give the closest match to the same semantics. Regular mutexes are also cheaper and in general preferable to spin mutexes whenever possible. 
-- John Baldwin From owner-freebsd-arch@FreeBSD.ORG Tue May 21 04:56:31 2013 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 55C96AC8 for ; Tue, 21 May 2013 04:56:31 +0000 (UTC) (envelope-from oritm@mellanox.com) Received: from eu1sys200aog121.obsmtp.com (eu1sys200aog121.obsmtp.com [207.126.144.151]) by mx1.freebsd.org (Postfix) with ESMTP id 1EC48F33 for ; Tue, 21 May 2013 04:56:29 +0000 (UTC) Received: from MTLCAS01.mtl.com ([193.47.165.155]) (using TLSv1) by eu1sys200aob121.postini.com ([207.126.147.11]) with SMTP ID DSNKUZr+dmzSTBOsY1GP9BxY9dWv66xsJket@postini.com; Tue, 21 May 2013 04:56:30 UTC Received: from MTLDAG01.mtl.com ([10.0.8.75]) by MTLCAS01.mtl.com ([10.0.8.71]) with mapi id 14.03.0123.003; Tue, 21 May 2013 07:56:21 +0300 From: Orit Moskovich To: "freebsd-arch@freebsd.org" Subject: compatibility layer - workqueues Thread-Topic: compatibility layer - workqueues Thread-Index: Ac5VXKj03BQxK72QTQOa5FKJdihNUQ== Date: Tue, 21 May 2013 04:56:20 +0000 Message-ID: <981733489AB3BD4DB24B48340F53E0A55B0D0938@MTLDAG01.mtl.com> Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-originating-ip: [10.0.13.1] MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable X-Content-Filtered-By: Mailman/MimeDel 2.1.14 X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 21 May 2013 04:56:31 -0000 Hi, I'm working on understanding the difference between Linux and FreeBSD interrupt handling. 
I looked at the compatibility layer and noticed this: * Linux workqueues are implemented using FreeBSD taskqueues (under sys/ofed/include/linux/workqueue.h) * In linux, the function schedule_work() puts a job in the kernel global workqueue 'events'. This workqueue consists of worker threads - one per processor * The compatibility layer wraps this function to a macro, that implements the functionality using taskqueue_enqueue() and set it to work on taskqueue_thread, that executes its tasks in the context of a kernel thread * BUT, taskqueue_thread is initialized in: o sys/kern/subr_taskqueue.c line 536: TASKQUEUE_DEFINE_THREAD(thread); o which is defined in sys/taskqueue.h line 133 and run taskqueue_start_threads() with only 1 thread, and not MAXCPU I'll appreciate your help understanding this issue, Thanks, Orit Moskovich From owner-freebsd-arch@FreeBSD.ORG Tue May 21 14:47:29 2013 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id C9DEF6D9 for ; Tue, 21 May 2013 14:47:29 +0000 (UTC) (envelope-from adrian.chadd@gmail.com) Received: from mail-wg0-x231.google.com (mail-wg0-x231.google.com [IPv6:2a00:1450:400c:c00::231]) by mx1.freebsd.org (Postfix) with ESMTP id 66E2E7DC for ; Tue, 21 May 2013 14:47:29 +0000 (UTC) Received: by mail-wg0-f49.google.com with SMTP id y10so399501wgg.4 for ; Tue, 21 May 2013 07:47:28 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type; bh=JFJb2fHR8GvRgxBnRvDphySTIgqcnjOyZrTXCs7R6Lk=; b=IYFeymnchw55OcFAuK5em8m5yOfTl7Yei7TqdRJLMI3qTSYjk9ZokXA+Ynzy739pVB VKOUP/J0uFWE5j3H8x+BfD603rqoHPpgwzF4pTRtl3z+e2U17swmC8LUi9z6M4uUwKQi zPYqZ7hjctUdVFkdXn8r8LTdEZpUdg/ggI13JrT64ryWHr4eHciLxmXP2BaPmrrk3HJv nXs7rSmnsgUpl+JDxJ5jLSCj8Hx/uFTgMAFecSudoHfMg1CuBzola3ZFprP5etdikcu5 
gimHflgXmBSJM1CkbfIrCKANqv6UWpLNW+qjpe1gZGNcId86iXahxqkfnug1lbOoGB6a hJgA== MIME-Version: 1.0 X-Received: by 10.180.72.195 with SMTP id f3mr24224877wiv.32.1369147648517; Tue, 21 May 2013 07:47:28 -0700 (PDT) Sender: adrian.chadd@gmail.com Received: by 10.217.107.9 with HTTP; Tue, 21 May 2013 07:47:28 -0700 (PDT) In-Reply-To: <981733489AB3BD4DB24B48340F53E0A55B0D0938@MTLDAG01.mtl.com> References: <981733489AB3BD4DB24B48340F53E0A55B0D0938@MTLDAG01.mtl.com> Date: Tue, 21 May 2013 07:47:28 -0700 X-Google-Sender-Auth: sCerrGCm5ScVO-hDsqWUvn2oRds Message-ID: Subject: Re: compatibility layer - workqueues From: Adrian Chadd To: Orit Moskovich Content-Type: text/plain; charset=ISO-8859-1 Cc: "freebsd-arch@freebsd.org" X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 21 May 2013 14:47:29 -0000 Is there some magic to do with spinlock versus spinlock_bh that we don't necessarily have with your lock/mutex operations? If any spinlocks grab bh locks in Linux then a bunch of work on other CPUs gets disabled. adrian On 20 May 2013 21:56, Orit Moskovich wrote: > Hi, > > I'm working on understanding the difference between Linux and FreeBSD interrupt handling. > I looked at the compatibility layer and noticed this: > > > * Linux workqueues are implemented using FreeBSD taskqueues (under sys/ofed/include/linux/workqueue.h) > > * In linux, the function schedule_work() puts a job in the kernel global workqueue 'events'. 
This workqueue consists of worker threads - one per processor > > * The compatibility layer wraps this function to a macro, that implements the functionality using taskqueue_enqueue() and set it to work on taskqueue_thread, that executes its tasks in the context of a kernel thread > > * BUT, taskqueue_thread is initialized in: > > o sys/kern/subr_taskqueue.c line 536: > TASKQUEUE_DEFINE_THREAD(thread); > > o which is defined in sys/taskqueue.h line 133 > and run taskqueue_start_threads() with only 1 thread, and not MAXCPU > > > > I'll appreciate your help understanding this issue, > > > > Thanks, > > Orit Moskovich > > > _______________________________________________ > freebsd-arch@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-arch > To unsubscribe, send any mail to "freebsd-arch-unsubscribe@freebsd.org" From owner-freebsd-arch@FreeBSD.ORG Tue May 21 14:48:03 2013 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 022B7796 for ; Tue, 21 May 2013 14:48:03 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from bigwig.baldwin.cx (bigwig.baldwin.cx [IPv6:2001:470:1f11:75::1]) by mx1.freebsd.org (Postfix) with ESMTP id CE0F57EC for ; Tue, 21 May 2013 14:48:02 +0000 (UTC) Received: from jhbbsd.localnet (unknown [209.249.190.124]) by bigwig.baldwin.cx (Postfix) with ESMTPSA id 3CD8FB968; Tue, 21 May 2013 10:48:01 -0400 (EDT) From: John Baldwin To: freebsd-arch@freebsd.org Subject: Re: compatibility layer - workqueues Date: Tue, 21 May 2013 10:38:56 -0400 User-Agent: KMail/1.13.5 (FreeBSD/8.2-CBSD-20110714-p25; KDE/4.5.5; amd64; ; ) References: <981733489AB3BD4DB24B48340F53E0A55B0D0938@MTLDAG01.mtl.com> In-Reply-To: <981733489AB3BD4DB24B48340F53E0A55B0D0938@MTLDAG01.mtl.com> MIME-Version: 1.0 Content-Type: Text/Plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Message-Id: 
<201305211038.56191.jhb@freebsd.org> X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.7 (bigwig.baldwin.cx); Tue, 21 May 2013 10:48:01 -0400 (EDT) Cc: Orit Moskovich X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 21 May 2013 14:48:03 -0000 On Tuesday, May 21, 2013 12:56:20 am Orit Moskovich wrote: > Hi, > > I'm working on understanding the difference between Linux and FreeBSD interrupt handling. > I looked at the compatibility layer and noticed this: > > > * Linux workqueues are implemented using FreeBSD taskqueues (under sys/ofed/include/linux/workqueue.h) > > * In linux, the function schedule_work() puts a job in the kernel global workqueue 'events'. This workqueue consists of worker threads - one per processor > > * The compatibility layer wraps this function to a macro, that implements the functionality using taskqueue_enqueue() and set it to work on taskqueue_thread, that executes its tasks in the context of a kernel thread > > * BUT, taskqueue_thread is initialized in: > > o sys/kern/subr_taskqueue.c line 536: > TASKQUEUE_DEFINE_THREAD(thread); > > o which is defined in sys/taskqueue.h line 133 > and run taskqueue_start_threads() with only 1 thread, and not MAXCPU > > > > I'll appreciate your help understanding this issue, Do you need your events queued to multiple threads? If so, do you require that the event execute on the same CPU that it was scheduled on? (Kind of like a DPC in WDM) If you don't require these things, I think the global taskqueue will still satisfy your need even if it only has one thread. It would not be difficult to create a new taskqueue that used threads pinned to cores for the Linux workqueue compat layer, but I'm not sure you actually need that. 
-- John Baldwin From owner-freebsd-arch@FreeBSD.ORG Tue May 21 19:02:11 2013 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 59C36462 for ; Tue, 21 May 2013 19:02:11 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from bigwig.baldwin.cx (bigwig.baldwin.cx [IPv6:2001:470:1f11:75::1]) by mx1.freebsd.org (Postfix) with ESMTP id 2DE896DF for ; Tue, 21 May 2013 19:02:11 +0000 (UTC) Received: from jhbbsd.localnet (unknown [209.249.190.124]) by bigwig.baldwin.cx (Postfix) with ESMTPSA id 36AEDB948; Tue, 21 May 2013 15:02:09 -0400 (EDT) From: John Baldwin To: Orit Moskovich Subject: Re: FreeBSD spinlock - compatibility layer Date: Tue, 21 May 2013 12:20:16 -0400 User-Agent: KMail/1.13.5 (FreeBSD/8.2-CBSD-20110714-p25; KDE/4.5.5; amd64; ; ) References: <981733489AB3BD4DB24B48340F53E0A55B0CFD79@MTLDAG01.mtl.com> <201305200950.26834.jhb@freebsd.org> <981733489AB3BD4DB24B48340F53E0A55B0D091F@MTLDAG01.mtl.com> In-Reply-To: <981733489AB3BD4DB24B48340F53E0A55B0D091F@MTLDAG01.mtl.com> MIME-Version: 1.0 Content-Type: Text/Plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Message-Id: <201305211220.16776.jhb@freebsd.org> X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.7 (bigwig.baldwin.cx); Tue, 21 May 2013 15:02:09 -0400 (EDT) Cc: "freebsd-arch@freebsd.org" X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 21 May 2013 19:02:11 -0000 On Tuesday, May 21, 2013 12:36:38 am Orit Moskovich wrote: > That's not the case when using taskqueues for deferring execution of an interrupt handler. > Tasks can be delayed using the global taskqueue taskqueue_swi, which executes its tasks in the context of an interrupt. 
> In this case sleep is forbidden, and using spin mutex is not (although might be not recommended). No, swi's run in an interrupt thread, and interrupt threads can use regular mutexes. (That is why they run in a thread context.) The only way you can run in a context requiring a spin lock in a driver is to use an interrupt filter. -- John Baldwin From owner-freebsd-arch@FreeBSD.ORG Tue May 21 19:02:11 2013 Return-Path: Delivered-To: arch@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id B607A463; Tue, 21 May 2013 19:02:11 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from bigwig.baldwin.cx (bigwig.baldwin.cx [IPv6:2001:470:1f11:75::1]) by mx1.freebsd.org (Postfix) with ESMTP id 8AAD46E0; Tue, 21 May 2013 19:02:11 +0000 (UTC) Received: from jhbbsd.localnet (unknown [209.249.190.124]) by bigwig.baldwin.cx (Postfix) with ESMTPSA id AB295B965; Tue, 21 May 2013 15:02:10 -0400 (EDT) From: John Baldwin To: Jilles Tjoelker Subject: Re: Extending MADV_PROTECT Date: Tue, 21 May 2013 12:22:11 -0400 User-Agent: KMail/1.13.5 (FreeBSD/8.2-CBSD-20110714-p25; KDE/4.5.5; amd64; ; ) References: <201305071433.27993.jhb@freebsd.org> <5192AE7C.10105@FreeBSD.org> <20130520222825.GB43407@stack.nl> In-Reply-To: <20130520222825.GB43407@stack.nl> MIME-Version: 1.0 Content-Type: Text/Plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Message-Id: <201305211222.11236.jhb@freebsd.org> X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.7 (bigwig.baldwin.cx); Tue, 21 May 2013 15:02:10 -0400 (EDT) Cc: Konstantin Belousov , arch@freebsd.org, Robert Watson X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 21 May 2013 19:02:11 -0000 On Monday, May 20, 2013 6:28:26 pm Jilles Tjoelker wrote: > On Tue, May 14, 2013 
at 05:37:00PM -0400, John Baldwin wrote: > > On 5/14/13 3:21 PM, Jilles Tjoelker wrote: > > > All this is not very important for process protection because it > > > requires root privileges anyway but future procctl commands may well be > > > accessible to normal users (I'm thinking of avoiding proliferation of > > > pd* calls in particular). > > > I originally used that approach in pprotect() since that is what ktrace > > uses. I did it this way in procctl() to err on the side of reporting > > errors vs not, but I can easily change it. This is something I wasn't > > sure of and very much appreciate feedback on. > > > Do you have any thoughts on having this be more ioctl-like ("automatic" > > copyin/out and size encoded in cmd) vs ptrace-like (explicit sizes and > > in/out keyed off of command)? > > If it is ioctl-like, it is possible to redirect ioctl() on a process > descriptor to procctl and use cap_ioctls_limit() infrastructure. I'm not > sure Capsicum people actually like that, though. > > In either case, it is possible to have a P_PROCDESC to affect a process > by process descriptor. Capsicum may then need more CAP_*. I talked to Robert about this in person at BSDCan and he indeed does not prefer general purpose multiplexers for system calls. In particular it does make auditing and access control for such things a lot harder to do. My impression from my discussion with him is that he would actually prefer much more narrowly focused system calls (so pprotect() in this case rather than a generic procctl()). 
-- John Baldwin From owner-freebsd-arch@FreeBSD.ORG Tue May 21 19:24:51 2013 Return-Path: Delivered-To: arch@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id E0735899; Tue, 21 May 2013 19:24:51 +0000 (UTC) (envelope-from rwatson@freebsd.org) Received: from cyrus.watson.org (cyrus.watson.org [198.74.231.69]) by mx1.freebsd.org (Postfix) with ESMTP id BF74F875; Tue, 21 May 2013 19:24:51 +0000 (UTC) Received: from [192.168.5.22] (ip-64-134-102-77.public.wayport.net [64.134.102.77]) by cyrus.watson.org (Postfix) with ESMTPSA id E93C046B58; Tue, 21 May 2013 15:24:50 -0400 (EDT) Subject: Re: Extending MADV_PROTECT Mime-Version: 1.0 (Apple Message framework v1283) Content-Type: text/plain; charset=iso-8859-1 From: "Robert N. M. Watson" In-Reply-To: <201305211222.11236.jhb@freebsd.org> Date: Tue, 21 May 2013 15:24:50 -0400 Content-Transfer-Encoding: quoted-printable Message-Id: References: <201305071433.27993.jhb@freebsd.org> <5192AE7C.10105@FreeBSD.org> <20130520222825.GB43407@stack.nl> <201305211222.11236.jhb@freebsd.org> To: John Baldwin X-Mailer: Apple Mail (2.1283) Cc: Konstantin Belousov , arch@freebsd.org, Jilles Tjoelker X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 21 May 2013 19:24:51 -0000 On 21 May 2013, at 12:22, John Baldwin wrote: >> If it is ioctl-like, it is possible to redirect ioctl() on a process >> descriptor to procctl and use cap_ioctls_limit() infrastructure. I'm not >> sure Capsicum people actually like that, though. >> >> In either case, it is possible to have a P_PROCDESC to affect a process >> by process descriptor. Capsicum may then need more CAP_*. 
> > I talked to Robert about this in person at BSDCan and he indeed does not > prefer general purpose multiplexers for system calls. In particular it does > make auditing and access control for such things a lot harder to do. My > impression from my discussion with him is that he would actually prefer much > more narrowly focused system calls (so pprotect() in this case rather than a > generic procctl()). Yes -- based on experience with Capsicum, audit, but also things like ktrace, LD_PRELOAD, etc, I have a strong preference for not using ioctl for first-class (global) functions. We shouldn't go crazy on new system calls, but key new abstraction functions in the kernel do reasonably deserve new APIs. Matching pprotect() and pdprotect() APIs sound plausible to me (without skimming back through the thread too much). Robert From owner-freebsd-arch@FreeBSD.ORG Tue May 21 20:40:29 2013 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 87277A1A; Tue, 21 May 2013 20:40:29 +0000 (UTC) (envelope-from bright@mu.org) Received: from elvis.mu.org (elvis.mu.org [192.203.228.196]) by mx1.freebsd.org (Postfix) with ESMTP id 793A1CC1; Tue, 21 May 2013 20:40:29 +0000 (UTC) Received: from Alfreds-MacBook-Pro-9.local (OTWAON23-1176242366.sdsl.bell.ca [70.28.8.190]) by elvis.mu.org (Postfix) with ESMTPSA id 415AF1A3C6E; Tue, 21 May 2013 13:40:19 -0700 (PDT) Message-ID: <519BDBB0.2070302@mu.org> Date: Tue, 21 May 2013 16:40:16 -0400 From: Alfred Perlstein User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.7; rv:17.0) Gecko/20130509 Thunderbird/17.0.6 MIME-Version: 1.0 To: John Baldwin Subject: Re: FreeBSD spinlock - compatibility layer References: <981733489AB3BD4DB24B48340F53E0A55B0CFD79@MTLDAG01.mtl.com> <201305200950.26834.jhb@freebsd.org> In-Reply-To: <201305200950.26834.jhb@freebsd.org> Content-Type: text/plain; charset=ISO-8859-1; 
format=flowed Content-Transfer-Encoding: 7bit Cc: Orit Moskovich , freebsd-arch@freebsd.org X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 21 May 2013 20:40:29 -0000 On 5/20/13 9:50 AM, John Baldwin wrote: > On Tuesday, May 14, 2013 6:04:21 am Orit Moskovich wrote: >> Hi, >> >> I read about the FreeBSD mutex implementation for spinlock in the > compatibility layer. >> I might be wrong, but I noticed a code section that might be problematic: >> >> Taken from > http://svn.freebsd.org/base/release/9.1.0/sys/ofed/include/linux/spinlock.h: >> static inline void >> spin_lock_init(spinlock_t *lock) >> { >> >> memset(&lock->m, 0, sizeof(lock->m)); >> mtx_init(&lock->m, "lnxspin", NULL, MTX_DEF | MTX_NOWITNESS); >> } >> >> But MTX_DEF initializes mutex as a sleep mutex: >> >> By default, MTX_DEF mutexes will context switch when they are already >> >> held. >> >> >> There is a flag MTX_SPIN Which I think is the right one in this case . >> >> >> >> I'd appreciate your take on this issue. > Since FreeBSD uses a different approach to interrupt handlers (they run in > threads, not in the bottom half), a regular mutex may in fact give the closest > match to the same semantics. Regular mutexes are also cheaper and in general > preferable to spin mutexes whenever possible. > Sure, but is it possible that someone might want some of the other guarantees of MTX_SPIN spinlocks such as: critical section/non-pre-emptable/non-migrating on cpu/latency versus throughput ? 
-Alfred From owner-freebsd-arch@FreeBSD.ORG Wed May 22 06:15:22 2013 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 0C3E8683; Wed, 22 May 2013 06:15:22 +0000 (UTC) (envelope-from oritm@mellanox.com) Received: from eu1sys200aog121.obsmtp.com (eu1sys200aog121.obsmtp.com [207.126.144.151]) by mx1.freebsd.org (Postfix) with ESMTP id EBE3AB29; Wed, 22 May 2013 06:15:20 +0000 (UTC) Received: from MTLCAS01.mtl.com ([193.47.165.155]) (using TLSv1) by eu1sys200aob121.postini.com ([207.126.147.11]) with SMTP ID DSNKUZxiXWsBN4kQBPT+F8RhMSLHf6oqsFxh@postini.com; Wed, 22 May 2013 06:15:21 UTC Received: from MTLDAG01.mtl.com ([10.0.8.75]) by MTLCAS01.mtl.com ([10.0.8.71]) with mapi id 14.03.0123.003; Wed, 22 May 2013 09:14:52 +0300 From: Orit Moskovich To: John Baldwin Subject: RE: FreeBSD spinlock - compatibility layer Thread-Topic: FreeBSD spinlock - compatibility layer Thread-Index: Ac5QiaSgCms1CiujRJ+uiUawknitKQEvjFEAACUK0bAAEntnAAAi8Btg Date: Wed, 22 May 2013 06:14:51 +0000 Message-ID: <981733489AB3BD4DB24B48340F53E0A55B0D39EF@MTLDAG01.mtl.com> References: <981733489AB3BD4DB24B48340F53E0A55B0CFD79@MTLDAG01.mtl.com> <201305200950.26834.jhb@freebsd.org> <981733489AB3BD4DB24B48340F53E0A55B0D091F@MTLDAG01.mtl.com> <201305211220.16776.jhb@freebsd.org> In-Reply-To: <201305211220.16776.jhb@freebsd.org> Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-originating-ip: [10.0.13.1] Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Cc: "freebsd-arch@freebsd.org" X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 22 May 2013 06:15:22 -0000 From what I've read in "FreeBSD - device drivers" book by Joseph 
Kong on interrupt handling, you cannot voluntarily context switch (that is, sleep) in interrupt threads . In any case, I think that the functionality of spin mutex should remain as is, and not modified to sleep mutex, as it can be used in places that sleep mustn't be used, or that require the properties of the spin due to performance considerations. -----Original Message----- From: John Baldwin [mailto:jhb@freebsd.org] Sent: Tuesday, May 21, 2013 10:02 PM To: Orit Moskovich Cc: freebsd-arch@freebsd.org Subject: Re: FreeBSD spinlock - compatibility layer On Tuesday, May 21, 2013 12:36:38 am Orit Moskovich wrote: > That's not the case when using taskqueues for deferring execution of > an interrupt handler. > Tasks can be delayed using the global taskqueue taskqueue_swi, which executes its tasks in the context of an interrupt. > In this case sleep is forbidden, and using spin mutex is not (although > might be not recommended). No, swi's run in an interrupt thread, and interrupt threads can use regular mutexes. (That is why they run in a thread context.) The only way you can run in a context requiring a spin lock in a driver is to use an interrupt filter. 
-- John Baldwin From owner-freebsd-arch@FreeBSD.ORG Wed May 22 06:32:33 2013 Return-Path: Delivered-To: freebsd-arch@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 643A29D9 for ; Wed, 22 May 2013 06:32:33 +0000 (UTC) (envelope-from avg@FreeBSD.org) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id A6896C2B for ; Wed, 22 May 2013 06:32:32 +0000 (UTC) Received: from porto.starpoint.kiev.ua (porto-e.starpoint.kiev.ua [212.40.38.100]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id JAA29447; Wed, 22 May 2013 09:32:20 +0300 (EEST) (envelope-from avg@FreeBSD.org) Received: from localhost ([127.0.0.1]) by porto.starpoint.kiev.ua with esmtp (Exim 4.34 (FreeBSD)) id 1Uf2ax-0007u7-U3; Wed, 22 May 2013 09:32:20 +0300 Message-ID: <519C663E.1090307@FreeBSD.org> Date: Wed, 22 May 2013 09:31:26 +0300 From: Andriy Gapon User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:17.0) Gecko/20130405 Thunderbird/17.0.5 MIME-Version: 1.0 To: Orit Moskovich Subject: Re: FreeBSD spinlock - compatibility layer References: <981733489AB3BD4DB24B48340F53E0A55B0CFD79@MTLDAG01.mtl.com> <201305200950.26834.jhb@freebsd.org> <981733489AB3BD4DB24B48340F53E0A55B0D091F@MTLDAG01.mtl.com> <201305211220.16776.jhb@freebsd.org> <981733489AB3BD4DB24B48340F53E0A55B0D39EF@MTLDAG01.mtl.com> In-Reply-To: <981733489AB3BD4DB24B48340F53E0A55B0D39EF@MTLDAG01.mtl.com> X-Enigmail-Version: 1.5.1 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: "freebsd-arch@freebsd.org" X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 22 May 2013 06:32:33 -0000 on 22/05/2013 09:14 Orit Moskovich said the following: > From what I've read in "FreeBSD - device 
drivers" book by Joseph Kong on > interrupt handling, you cannot voluntarily context switch (that is, sleep) in > interrupt threads . See the table at the end of locking(9) manual page. -- Andriy Gapon From owner-freebsd-arch@FreeBSD.ORG Wed May 22 08:41:53 2013 Return-Path: Delivered-To: arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 90E86A1F; Wed, 22 May 2013 08:41:53 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from kib.kiev.ua (kib.kiev.ua [IPv6:2001:470:d5e7:1::1]) by mx1.freebsd.org (Postfix) with ESMTP id 06B102D2; Wed, 22 May 2013 08:41:52 +0000 (UTC) Received: from tom.home (kostik@localhost [127.0.0.1]) by kib.kiev.ua (8.14.7/8.14.7) with ESMTP id r4M8fjTT074518; Wed, 22 May 2013 11:41:45 +0300 (EEST) (envelope-from kostikbel@gmail.com) DKIM-Filter: OpenDKIM Filter v2.8.3 kib.kiev.ua r4M8fjTT074518 Received: (from kostik@localhost) by tom.home (8.14.7/8.14.7/Submit) id r4M8fjdT074517; Wed, 22 May 2013 11:41:45 +0300 (EEST) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: tom.home: kostik set sender to kostikbel@gmail.com using -f Date: Wed, 22 May 2013 11:41:45 +0300 From: Konstantin Belousov To: "Robert N. M. 
Watson" Subject: Re: Extending MADV_PROTECT Message-ID: <20130522084145.GJ3047@kib.kiev.ua> References: <201305071433.27993.jhb@freebsd.org> <5192AE7C.10105@FreeBSD.org> <20130520222825.GB43407@stack.nl> <201305211222.11236.jhb@freebsd.org> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="xApCgTs/B7u0zyQe" Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.21 (2010-09-15) X-Spam-Status: No, score=-2.0 required=5.0 tests=ALL_TRUSTED,BAYES_00, DKIM_ADSP_CUSTOM_MED,FREEMAIL_FROM,NML_ADSP_CUSTOM_MED autolearn=no version=3.3.2 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on tom.home Cc: arch@freebsd.org, Jilles Tjoelker X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 22 May 2013 08:41:53 -0000
--xApCgTs/B7u0zyQe Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable

On Tue, May 21, 2013 at 03:24:50PM -0400, Robert N. M. Watson wrote:
>
> On 21 May 2013, at 12:22, John Baldwin wrote:
>
> >> If it is ioctl-like, it is possible to redirect ioctl() on a process
> >> descriptor to procctl and use cap_ioctls_limit() infrastructure. I'm not
> >> sure Capsicum people actually like that, though.
> >>
> >> In either case, it is possible to have a P_PROCDESC to affect a process
> >> by process descriptor. Capsicum may then need more CAP_*.
> >
> > I talked to Robert about this in person at BSDCan and he indeed does not
> > prefer general purpose multiplexers for system calls. In particular it does
> > make auditing and access control for such things a lot harder to do. My
> > impression from my discussion with him is that he would actually prefer much
> > more narrowly focused system calls (so pprotect() in this case rather than a
> > generic procctl()).
>
> Yes -- based on experience with Capsicum, audit, but also things like ktrace, LD_PRELOAD, etc, I have a strong preference for not using ioctl for first-class (global) functions. We shouldn't go crazy on new system calls, but key new abstraction functions in the kernel do reasonably deserve new APIs. Matching pprotect() and pdprotect() APIs sound plausible to me (without skimming back through the thread too much).

I agree with statement that an ioctl()-like interface for the syscall is too wide, and I stated this already. On the other hand, I believe that e.g. ptrace(2) is fine as is, and splitting it into 20-30 syscalls each performing single ptrace operation would be a mistake. The same, IMO, holds for the procctl() syscall, which is better not split into pprotect(), then some improved version of pprotect() etc. I would prefer to not have proliferation of the FreeBSD-specific process-controlling syscalls, which could be cumulated in the single entry and single man page.
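
The two API shapes being weighed here can be pictured with a toy userland sketch. Everything in it is hypothetical: the `toy_*` names, the command numbers, and the `struct toy_proc` are invented for illustration and do not reflect any actual or proposed FreeBSD syscall signature. It only shows the trade-off: a narrow entry point is identifiable by name, while a multiplexed entry point needs its command argument decoded before anyone (auditing, access control) knows what is being asked.

```c
#include <errno.h>

/* Hypothetical command numbers for a toy multiplexed entry point. */
enum toy_cmd { TOY_PROTECT = 1, TOY_UNPROTECT = 2 };

/* Hypothetical stand-in for per-process state. */
struct toy_proc {
    int pid;
    int protected_flag;
};

/* Narrow style: one function per operation; the name says what it does. */
static int toy_pprotect(struct toy_proc *p, int on)
{
    p->protected_flag = on ? 1 : 0;
    return 0;
}

/* Multiplexed style: a single entry point; cmd selects the operation. */
static int toy_procctl(struct toy_proc *p, int cmd)
{
    switch (cmd) {
    case TOY_PROTECT:
        return toy_pprotect(p, 1);
    case TOY_UNPROTECT:
        return toy_pprotect(p, 0);
    default:
        /* Unknown commands are rejected rather than ignored. */
        return EINVAL;
    }
}
```

In this sketch new operations cost one more `case` in the multiplexed form, versus one more exported function in the narrow form, which is roughly the tension between the two positions quoted above.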
--xApCgTs/B7u0zyQe Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.20 (FreeBSD) iQIcBAEBAgAGBQJRnITIAAoJEJDCuSvBvK1BQVkP/12RqB+uQEuGdKY7Jz1etv2w oAYDiNKVwFFM8J/tBYnOQvMOeP09SY9XW2k8SABY18KNIS3xYvfcCmuA7gKJfnth nnslKZFsTvbOxhbaOZT50Y/0y+MzFspcsat9xmpngELhoPzEsHL46aBLrjpGyVm1 jvLDxXE1XLEpdL+HZS3xvdCfpXtl45WyyM6B4Zc3otApX+XqNUiDJ92l3jzmuyXn JKRZAKe+SIUQpoLCxErz2hjnz3JKz3npOxuM+lnVT8jf+PoMl4j3aBXFUeQinVFf aQFFVl86kllPQtXM78hT/itKNazRM96txjdhl2A21H+V44NSEwunJ0ZT2umBgbVN pvt2kZeEDWeSPK4W6tuiyHFqkFqMqWvByCZn1XQ09chXxO9mlSXEoYgB9HMmvj/z 8M5mpi7ZnsGCGE4e3e1PUyfV/Im4MPIQ7y4zOyxkQ3Z+DbfQv7hLUiIyS/HBTICC Tz8yV9ZHoPMOWMNOcs1V0I2j6oQplk64T6pvq5Cw5W7+PGopiffXmIcip4izcpjR sL8LRC/iK6DkXCw/d1DS8mFuWAhCPD38sf2i4WzuUlSd61vqWxDtXoKhBB4sjgkM ki8C1WGyKgtSB1xoHiGcUOKQFjwJKA8n9wZCbhn2gybrefT2D4PF5X0UJtIiy2AV +1xNFGuosOc92rcOicxp =/zkA -----END PGP SIGNATURE----- --xApCgTs/B7u0zyQe-- From owner-freebsd-arch@FreeBSD.ORG Wed May 22 13:06:06 2013 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 7779AF4C for ; Wed, 22 May 2013 13:06:06 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from bigwig.baldwin.cx (bigwig.baldwin.cx [IPv6:2001:470:1f11:75::1]) by mx1.freebsd.org (Postfix) with ESMTP id 54F49EA2 for ; Wed, 22 May 2013 13:06:06 +0000 (UTC) Received: from jhbbsd.localnet (unknown [209.249.190.124]) by bigwig.baldwin.cx (Postfix) with ESMTPSA id AA9BAB941; Wed, 22 May 2013 09:06:04 -0400 (EDT) From: John Baldwin To: Orit Moskovich Subject: Re: FreeBSD spinlock - compatibility layer Date: Wed, 22 May 2013 08:59:32 -0400 User-Agent: KMail/1.13.5 (FreeBSD/8.2-CBSD-20110714-p25; KDE/4.5.5; amd64; ; ) References: <981733489AB3BD4DB24B48340F53E0A55B0CFD79@MTLDAG01.mtl.com> <201305211220.16776.jhb@freebsd.org> <981733489AB3BD4DB24B48340F53E0A55B0D39EF@MTLDAG01.mtl.com> In-Reply-To: <981733489AB3BD4DB24B48340F53E0A55B0D39EF@MTLDAG01.mtl.com> 
MIME-Version: 1.0 Content-Type: Text/Plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Message-Id: <201305220859.32948.jhb@freebsd.org> X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.7 (bigwig.baldwin.cx); Wed, 22 May 2013 09:06:04 -0400 (EDT) Cc: "freebsd-arch@freebsd.org" X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 22 May 2013 13:06:06 -0000 On Wednesday, May 22, 2013 2:14:51 am Orit Moskovich wrote: > From what I've read in "FreeBSD - device drivers" book by Joseph Kong on interrupt handling, you cannot voluntarily context switch (that is, sleep) in interrupt threads . That is not the same thing. By sleep it means call a *sleep() function or wait on a cond var. Not block on a mutex or rwlock. > In any case, I think that the functionality of spin mutex should remain as is, and not modified to sleep mutex, as it can be used in places that sleep mustn't be used, or that require the properties of the spin due to performance considerations. No, spin locks are _slower_ and reduce performance. FreeBSD is much more like Solaris in this regard. Spin mutexes on FreeBSD are similar to dispatcher locks in Solaris which 99% of the kernel should never use. 
-- John Baldwin From owner-freebsd-arch@FreeBSD.ORG Wed May 22 13:06:08 2013 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 78285F54 for ; Wed, 22 May 2013 13:06:08 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from bigwig.baldwin.cx (bigwig.baldwin.cx [IPv6:2001:470:1f11:75::1]) by mx1.freebsd.org (Postfix) with ESMTP id 567C7EA5 for ; Wed, 22 May 2013 13:06:08 +0000 (UTC) Received: from jhbbsd.localnet (unknown [209.249.190.124]) by bigwig.baldwin.cx (Postfix) with ESMTPSA id AA2B2B990; Wed, 22 May 2013 09:06:07 -0400 (EDT) From: John Baldwin To: Alfred Perlstein Subject: Re: FreeBSD spinlock - compatibility layer Date: Wed, 22 May 2013 09:05:57 -0400 User-Agent: KMail/1.13.5 (FreeBSD/8.2-CBSD-20110714-p25; KDE/4.5.5; amd64; ; ) References: <981733489AB3BD4DB24B48340F53E0A55B0CFD79@MTLDAG01.mtl.com> <201305200950.26834.jhb@freebsd.org> <519BDBB0.2070302@mu.org> In-Reply-To: <519BDBB0.2070302@mu.org> MIME-Version: 1.0 Content-Type: Text/Plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Message-Id: <201305220905.57939.jhb@freebsd.org> X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.7 (bigwig.baldwin.cx); Wed, 22 May 2013 09:06:07 -0400 (EDT) Cc: Orit Moskovich , freebsd-arch@freebsd.org X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 22 May 2013 13:06:08 -0000 On Tuesday, May 21, 2013 4:40:16 pm Alfred Perlstein wrote: > On 5/20/13 9:50 AM, John Baldwin wrote: > > On Tuesday, May 14, 2013 6:04:21 am Orit Moskovich wrote: > >> Hi, > >> > >> I read about the FreeBSD mutex implementation for spinlock in the > > compatibility layer. 
> >> I might be wrong, but I noticed a code section that might be problematic: > >> > >> Taken from > > http://svn.freebsd.org/base/release/9.1.0/sys/ofed/include/linux/spinlock.h: > >> static inline void > >> spin_lock_init(spinlock_t *lock) > >> { > >> > >> memset(&lock->m, 0, sizeof(lock->m)); > >> mtx_init(&lock->m, "lnxspin", NULL, MTX_DEF | MTX_NOWITNESS); > >> } > >> > >> But MTX_DEF initializes mutex as a sleep mutex: > >> > >> By default, MTX_DEF mutexes will context switch when they are already > >> > >> held. > >> > >> > >> There is a flag MTX_SPIN Which I think is the right one in this case . > >> > >> > >> > >> I'd appreciate your take on this issue. > > Since FreeBSD uses a different approach to interrupt handlers (they run in > > threads, not in the bottom half), a regular mutex may in fact give the closest > > match to the same semantics. Regular mutexes are also cheaper and in general > > preferable to spin mutexes whenever possible. > > > > Sure, but is it possible that someone might want some of the other > guarantees of MTX_SPIN spinlocks such as: > > critical section/non-pre-emptable/non-migrating on cpu/latency versus > throughput ? Probably not. For example, on FreeBSD you want your driver lock to be preempted by an interrupt to avoid higher interrupt latency for filter handlers. Most drivers should not need temporary pinning. If they want to pin work to threads they should bind threads or IRQs to specific CPUs, not rely on temporary pinning. 
-- John Baldwin From owner-freebsd-arch@FreeBSD.ORG Wed May 22 13:27:24 2013 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 902DAA7B; Wed, 22 May 2013 13:27:24 +0000 (UTC) (envelope-from bright@mu.org) Received: from elvis.mu.org (elvis.mu.org [192.203.228.196]) by mx1.freebsd.org (Postfix) with ESMTP id 808A96F; Wed, 22 May 2013 13:27:23 +0000 (UTC) Received: from Alfreds-MacBook-Pro-9.local (OTWAON23-1176242366.sdsl.bell.ca [70.28.8.190]) by elvis.mu.org (Postfix) with ESMTPSA id 166461A3C23; Wed, 22 May 2013 06:27:20 -0700 (PDT) Message-ID: <519CC7B4.2030208@mu.org> Date: Wed, 22 May 2013 09:27:16 -0400 From: Alfred Perlstein User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.7; rv:17.0) Gecko/20130509 Thunderbird/17.0.6 MIME-Version: 1.0 To: John Baldwin Subject: Re: FreeBSD spinlock - compatibility layer References: <981733489AB3BD4DB24B48340F53E0A55B0CFD79@MTLDAG01.mtl.com> <201305200950.26834.jhb@freebsd.org> <519BDBB0.2070302@mu.org> <201305220905.57939.jhb@freebsd.org> In-Reply-To: <201305220905.57939.jhb@freebsd.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: Orit Moskovich , freebsd-arch@freebsd.org X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 22 May 2013 13:27:24 -0000 On 5/22/13 9:05 AM, John Baldwin wrote: > On Tuesday, May 21, 2013 4:40:16 pm Alfred Perlstein wrote: >> On 5/20/13 9:50 AM, John Baldwin wrote: >>> On Tuesday, May 14, 2013 6:04:21 am Orit Moskovich wrote: >>>> Hi, >>>> >>>> I read about the FreeBSD mutex implementation for spinlock in the >>> compatibility layer. 
>>>> I might be wrong, but I noticed a code section that might be problematic:
>>>>
>>>> Taken from http://svn.freebsd.org/base/release/9.1.0/sys/ofed/include/linux/spinlock.h:
>>>>
>>>> static inline void
>>>> spin_lock_init(spinlock_t *lock)
>>>> {
>>>>
>>>>         memset(&lock->m, 0, sizeof(lock->m));
>>>>         mtx_init(&lock->m, "lnxspin", NULL, MTX_DEF | MTX_NOWITNESS);
>>>> }
>>>>
>>>> But MTX_DEF initializes mutex as a sleep mutex:
>>>>
>>>> By default, MTX_DEF mutexes will context switch when they are already held.
>>>>
>>>> There is a flag MTX_SPIN which I think is the right one in this case.
>>>>
>>>> I'd appreciate your take on this issue.
>>> Since FreeBSD uses a different approach to interrupt handlers (they run in
>>> threads, not in the bottom half), a regular mutex may in fact give the closest
>>> match to the same semantics. Regular mutexes are also cheaper and in general
>>> preferable to spin mutexes whenever possible.
>>>
>> Sure, but is it possible that someone might want some of the other
>> guarantees of MTX_SPIN spinlocks such as:
>>
>> critical section/non-pre-emptable/non-migrating on cpu/latency versus
>> throughput ?
> Probably not. For example, on FreeBSD you want your driver lock to be
> preempted by an interrupt to avoid higher interrupt latency for filter
> handlers. Most drivers should not need temporary pinning. If they want to
> pin work to threads they should bind threads or IRQs to specific CPUs, not
> rely on temporary pinning.
>
I know how it works in FreeBSD.

I think that a compatibility layer should first and foremost aim for compatibility, not speed at expense of expected semantics.

That said, I'm hopeful that the ofed stack doesn't use any of the other guarantees you'd expect from real spinlocks other than mutual exclusion so it's not that big of a deal. Unless it does... which will be interesting to track down.
-Alfred From owner-freebsd-arch@FreeBSD.ORG Wed May 22 13:54:55 2013 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 08D7D337; Wed, 22 May 2013 13:54:55 +0000 (UTC) (envelope-from oritm@mellanox.com) Received: from eu1sys200aog113.obsmtp.com (eu1sys200aog113.obsmtp.com [207.126.144.135]) by mx1.freebsd.org (Postfix) with ESMTP id D9574298; Wed, 22 May 2013 13:53:48 +0000 (UTC) Received: from MTLCAS02.mtl.com ([193.47.165.155]) (using TLSv1) by eu1sys200aob113.postini.com ([207.126.147.11]) with SMTP ID DSNKUZzN0Y+f8QlG2kB7TbPT6rZI1/Q2ARSN@postini.com; Wed, 22 May 2013 13:53:49 UTC Received: from MTLDAG01.mtl.com ([10.0.8.75]) by MTLCAS02.mtl.com ([10.0.8.72]) with mapi id 14.03.0123.003; Wed, 22 May 2013 16:48:00 +0300 From: Orit Moskovich To: John Baldwin Subject: RE: FreeBSD spinlock - compatibility layer Thread-Topic: FreeBSD spinlock - compatibility layer Thread-Index: Ac5QiaSgCms1CiujRJ+uiUawknitKQEvjFEAACUK0bAAEntnAAAi8BtgAAhXzQAAB+r/oA== Date: Wed, 22 May 2013 13:48:00 +0000 Message-ID: <981733489AB3BD4DB24B48340F53E0A55B0D3CD8@MTLDAG01.mtl.com> References: <981733489AB3BD4DB24B48340F53E0A55B0CFD79@MTLDAG01.mtl.com> <201305211220.16776.jhb@freebsd.org> <981733489AB3BD4DB24B48340F53E0A55B0D39EF@MTLDAG01.mtl.com> <201305220859.32948.jhb@freebsd.org> In-Reply-To: <201305220859.32948.jhb@freebsd.org> Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-originating-ip: [10.0.13.1] Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Cc: "freebsd-arch@freebsd.org" X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 22 May 2013 13:54:55 -0000 >From the mutex man page " By 
default, MTX_DEF mutexes will context switch when they are already held."
How is sleeping forbidden, but blocking on a mutex that might context switch is ok?

-----Original Message-----
From: John Baldwin [mailto:jhb@freebsd.org]
Sent: Wednesday, May 22, 2013 04:06 PM
To: Orit Moskovich
Cc: freebsd-arch@freebsd.org
Subject: Re: FreeBSD spinlock - compatibility layer

On Wednesday, May 22, 2013 2:14:51 am Orit Moskovich wrote:
> From what I've read in "FreeBSD - device drivers" book by Joseph Kong on interrupt handling, you cannot voluntarily context switch (that is, sleep) in interrupt threads.

That is not the same thing. By sleep it means call a *sleep() function or wait on a cond var. Not block on a mutex or rwlock.

> In any case, I think that the functionality of spin mutex should remain as is, and not modified to sleep mutex, as it can be used in places that sleep mustn't be used, or that require the properties of the spin due to performance considerations.

No, spin locks are _slower_ and reduce performance. FreeBSD is much more like Solaris in this regard. Spin mutexes on FreeBSD are similar to dispatcher locks in Solaris which 99% of the kernel should never use.
-- John Baldwin From owner-freebsd-arch@FreeBSD.ORG Wed May 22 14:08:18 2013 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 7D9D9803; Wed, 22 May 2013 14:08:18 +0000 (UTC) (envelope-from eric@vangyzen.net) Received: from aussmtpmrkpc120.us.dell.com (aussmtpmrkpc120.us.dell.com [143.166.82.159]) by mx1.freebsd.org (Postfix) with ESMTP id 45ED538A; Wed, 22 May 2013 14:08:18 +0000 (UTC) X-Loopcount0: from 64.238.244.148 X-IronPort-AV: E=Sophos;i="4.87,721,1363150800"; d="scan'208";a="31446778" Message-ID: <519CD0CA.5060209@vangyzen.net> Date: Wed, 22 May 2013 09:06:02 -0500 From: Eric van Gyzen User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:17.0) Gecko/20130413 Thunderbird/17.0.5 MIME-Version: 1.0 To: Orit Moskovich Subject: Re: FreeBSD spinlock - compatibility layer References: <981733489AB3BD4DB24B48340F53E0A55B0CFD79@MTLDAG01.mtl.com> <201305211220.16776.jhb@freebsd.org> <981733489AB3BD4DB24B48340F53E0A55B0D39EF@MTLDAG01.mtl.com> <201305220859.32948.jhb@freebsd.org> <981733489AB3BD4DB24B48340F53E0A55B0D3CD8@MTLDAG01.mtl.com> In-Reply-To: <981733489AB3BD4DB24B48340F53E0A55B0D3CD8@MTLDAG01.mtl.com> Content-Type: text/plain; charset="ISO-8859-1" Content-Transfer-Encoding: 7bit Cc: "freebsd-arch@freebsd.org" X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 22 May 2013 14:08:18 -0000 On 05/22/2013 08:48, Orit Moskovich wrote: > From the mutex man page " By default, MTX_DEF mutexes will context switch when they are already held." > How is sleeping forbidden, but blocking on a mutex that might context switch is ok? The duration is the distinction. See the "Bounded vs. unbounded sleep" section in the locking(9) man page. 
In fact, it would serve you well to read that entire man page. > -----Original Message----- > From: John Baldwin [mailto:jhb@freebsd.org] > Sent: Wednesday, May 22, 2013 04:06 PM > To: Orit Moskovich > Cc: freebsd-arch@freebsd.org > Subject: Re: FreeBSD spinlock - compatibility layer > > On Wednesday, May 22, 2013 2:14:51 am Orit Moskovich wrote: >> From what I've read in "FreeBSD - device drivers" book by Joseph Kong >> on > interrupt handling, you cannot voluntarily context switch (that is, sleep) in interrupt threads . > > That is not the same thing. By sleep it means call a *sleep() function or wait on a cond var. Not block on a mutex or rwlock. > >> In any case, I think that the functionality of spin mutex should >> remain as > is, and not modified to sleep mutex, as it can be used in places that sleep mustn't be used, or that require the properties of the spin due to performance considerations. > > No, spin locks are _slower_ and reduce performance. FreeBSD is much more like Solaris in this regard. Spin mutexes on FreeBSD are similar to dispatcher locks in Solaris which 99% of the kernel should never use. 
From owner-freebsd-arch@FreeBSD.ORG Wed May 22 14:21:34 2013 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 0D164BC3 for ; Wed, 22 May 2013 14:21:34 +0000 (UTC) (envelope-from imp@bsdimp.com) Received: from mail-ie0-x22d.google.com (mail-ie0-x22d.google.com [IPv6:2607:f8b0:4001:c03::22d]) by mx1.freebsd.org (Postfix) with ESMTP id D28946B9 for ; Wed, 22 May 2013 14:21:33 +0000 (UTC) Received: by mail-ie0-f173.google.com with SMTP id k5so5285668iea.32 for ; Wed, 22 May 2013 07:21:33 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20120113; h=sender:subject:mime-version:content-type:from:in-reply-to:date:cc :content-transfer-encoding:message-id:references:to:x-mailer :x-gm-message-state; bh=cEbJza/JjZLJ9SEwa1GTBW4WOF/dGoZ6X8RrQNpNX2I=; b=e+4P9L/leN1TmDJHxwha3Svla2whnfhDd4hM2PRAT2xGDDZqYF1sAmp7s1wy8IrJfp x5MBGnr5hFPH1g26hGnqr9cfyFhmj/oxxS8jvsMNXdH2RdYQyCg1ePuGFzELHcKtRpWX ifN7cCJdgA/zXxRz8Zac/JiK2cFxihQcMufO422njzzFfVN7+n9vfGlcsSx21NOJA4HP y+qrghv+xvMwbex5O9M/UIf7AdYQ/2+eBtN1j2FmI/PKkkosCrPAF1Xep7paCkF5gZoM b8SSF4G3ugiCVqgEueTih7SqrsWPc7yOQrrNC5cLkQ5ipINkpF97+LJZZs/T8xxqKJ8P Itqw== X-Received: by 10.42.50.202 with SMTP id b10mr6126713icg.7.1369232493543; Wed, 22 May 2013 07:21:33 -0700 (PDT) Received: from 53.imp.bsdimp.com (50-78-194-198-static.hfc.comcastbusiness.net. 
[50.78.194.198]) by mx.google.com with ESMTPSA id l14sm7283741igf.9.2013.05.22.07.21.31 for (version=TLSv1 cipher=ECDHE-RSA-RC4-SHA bits=128/128); Wed, 22 May 2013 07:21:32 -0700 (PDT) Sender: Warner Losh Subject: Re: FreeBSD spinlock - compatibility layer Mime-Version: 1.0 (Apple Message framework v1085) Content-Type: text/plain; charset=us-ascii From: Warner Losh In-Reply-To: <981733489AB3BD4DB24B48340F53E0A55B0D39EF@MTLDAG01.mtl.com> Date: Wed, 22 May 2013 08:21:30 -0600 Content-Transfer-Encoding: quoted-printable Message-Id: References: <981733489AB3BD4DB24B48340F53E0A55B0CFD79@MTLDAG01.mtl.com> <201305200950.26834.jhb@freebsd.org> <981733489AB3BD4DB24B48340F53E0A55B0D091F@MTLDAG01.mtl.com> <201305211220.16776.jhb@freebsd.org> <981733489AB3BD4DB24B48340F53E0A55B0D39EF@MTLDAG01.mtl.com> To: Orit Moskovich X-Mailer: Apple Mail (2.1085) X-Gm-Message-State: ALoCoQnVQt98hmf7IsFVSbgtXfzbNzPc0J98ARsL4FE+s9WzvKrkhkpvSSousJsyS9xoMja9qbRx Cc: "freebsd-arch@freebsd.org" X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 22 May 2013 14:21:34 -0000

On May 22, 2013, at 12:14 AM, Orit Moskovich wrote:

> From what I've read in "FreeBSD - device drivers" book by Joseph Kong on interrupt handling, you cannot voluntarily context switch (that is, sleep) in interrupt threads.

In ithreads, the interrupt thread context, you can voluntarily context switch. In an interrupt filter you can't, but that's because you are running in an interrupt context. Please note that an interrupt context and an interrupt thread context are different things.

> In any case, I think that the functionality of spin mutex should remain as is, and not modified to sleep mutex, as it can be used in places that sleep mustn't be used, or that require the properties of the spin due to performance considerations.

The Linux compat layer is just for the OFED stuff. It isn't intended to be a complete 'drop in' replacement environment for Linux drivers. Anything at odds with this basic purpose is unlikely to happen.

Warner

>
>
> -----Original Message-----
> From: John Baldwin [mailto:jhb@freebsd.org]
> Sent: Tuesday, May 21, 2013 10:02 PM
> To: Orit Moskovich
> Cc: freebsd-arch@freebsd.org
> Subject: Re: FreeBSD spinlock - compatibility layer
>
> On Tuesday, May 21, 2013 12:36:38 am Orit Moskovich wrote:
>> That's not the case when using taskqueues for deferring execution of an interrupt handler.
>> Tasks can be delayed using the global taskqueue taskqueue_swi, which executes its tasks in the context of an interrupt.
>> In this case sleep is forbidden, and using spin mutex is not (although might be not recommended).
>
> No, swi's run in an interrupt thread, and interrupt threads can use regular mutexes. (That is why they run in a thread context.) The only way you can run in a context requiring a spin lock in a driver is to use an interrupt filter.
>
> --
> John Baldwin
> _______________________________________________
> freebsd-arch@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-arch
> To unsubscribe, send any mail to "freebsd-arch-unsubscribe@freebsd.org"

From owner-freebsd-arch@FreeBSD.ORG Wed May 22 15:17:16 2013 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 23840896 for ; Wed, 22 May 2013 15:17:16 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from bigwig.baldwin.cx (bigwig.baldwin.cx [IPv6:2001:470:1f11:75::1]) by mx1.freebsd.org (Postfix) with ESMTP id EA26CCDE for ; Wed, 22 May 2013 15:17:15 +0000 (UTC) Received: from jhbbsd.localnet (unknown [209.249.190.124]) by bigwig.baldwin.cx (Postfix) with ESMTPSA id 209FBB96F; Wed, 22 May 2013 11:17:14 -0400 (EDT) From: John Baldwin To: Orit Moskovich Subject: Re: FreeBSD spinlock - compatibility layer Date: Wed, 22 May 2013 11:05:54 -0400 User-Agent: KMail/1.13.5 (FreeBSD/8.2-CBSD-20110714-p25; KDE/4.5.5; amd64; ; ) References: <981733489AB3BD4DB24B48340F53E0A55B0CFD79@MTLDAG01.mtl.com> <201305220859.32948.jhb@freebsd.org> <981733489AB3BD4DB24B48340F53E0A55B0D3CD8@MTLDAG01.mtl.com> In-Reply-To: <981733489AB3BD4DB24B48340F53E0A55B0D3CD8@MTLDAG01.mtl.com> MIME-Version: 1.0 Content-Type: Text/Plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Message-Id: <201305221105.55093.jhb@freebsd.org> X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.7 (bigwig.baldwin.cx); Wed, 22 May 2013 11:17:14 -0400 (EDT) Cc: "freebsd-arch@freebsd.org" X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 22 May 2013 15:17:16 -0000

On Wednesday, May 22, 2013 9:48:00 am Orit Moskovich wrote:
> From the 
mutex man page "By default, MTX_DEF mutexes will context switch when they are already held."
> How is sleeping forbidden, but blocking on a mutex that might context switch is ok?

Because they are different. When you block on a lock you propagate your priority to the lock holder and will resume execution if you are more important as soon as the holder drops the lock. In other words, you are going to make forward progress. With "event" sleeps like *sleep() and condition variables, there is no owner to propagate priority to, and the sleep may very well be waiting for some arbitrary event (such as the arrival of a network packet or completion of an I/O request), so there is not the same guarantee of making forward progress.

The other half of this that keeps this true is that you are not permitted to perform "event" sleeps while holding a mutex. You have to drop the lock while you wait which frees any threads waiting for the lock to run. When you block on a mutex the only thing you are ever waiting on is CPU time for either yourself or the lock holder to run.
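
The priority propagation described above can be pictured with a small single-threaded toy model in C. This is only a sketch under stated assumptions: it is not FreeBSD's actual implementation, the `toy_*` names are invented, and the convention that a lower number means a more urgent thread is assumed for the example. The point it illustrates is that a lock has an owner who can be boosted by a blocked waiter, which is exactly what an event sleep lacks.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only. Assumed convention: lower value = more urgent. */
struct toy_thread {
    int base_prio;  /* priority the thread was created with */
    int lent_prio;  /* most urgent priority lent by a waiter, -1 if none */
};

struct toy_lock {
    struct toy_thread *owner;  /* NULL when the lock is free */
};

/* Priority the scheduler would use: own or lent, whichever is more urgent. */
static int toy_effective_prio(const struct toy_thread *t)
{
    if (t->lent_prio >= 0 && t->lent_prio < t->base_prio)
        return t->lent_prio;
    return t->base_prio;
}

/*
 * Returns 1 if the lock was acquired. Returns 0 if the caller would
 * block; in that case the caller's priority is lent to the owner so the
 * owner can run soon and release the lock.
 */
static int toy_lock_acquire(struct toy_lock *l, struct toy_thread *t)
{
    if (l->owner == NULL) {
        l->owner = t;
        return 1;
    }
    int p = toy_effective_prio(t);
    if (l->owner->lent_prio < 0 || p < l->owner->lent_prio)
        l->owner->lent_prio = p;
    return 0;
}

/* Dropping the lock returns the owner to its own priority. */
static void toy_lock_release(struct toy_lock *l)
{
    assert(l->owner != NULL);
    l->owner->lent_prio = -1;
    l->owner = NULL;
}
```

A waiter on a condition variable or a *sleep() channel has no `owner` field to boost in this model, which is one way to see why the two kinds of waiting get different rules.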
-- John Baldwin From owner-freebsd-arch@FreeBSD.ORG Wed May 22 15:17:17 2013 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 42C3C899 for ; Wed, 22 May 2013 15:17:17 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from bigwig.baldwin.cx (bigwig.baldwin.cx [IPv6:2001:470:1f11:75::1]) by mx1.freebsd.org (Postfix) with ESMTP id 17A38CE0 for ; Wed, 22 May 2013 15:17:17 +0000 (UTC) Received: from jhbbsd.localnet (unknown [209.249.190.124]) by bigwig.baldwin.cx (Postfix) with ESMTPSA id 3BB50B990; Wed, 22 May 2013 11:17:16 -0400 (EDT) From: John Baldwin To: Alfred Perlstein Subject: Re: FreeBSD spinlock - compatibility layer Date: Wed, 22 May 2013 11:15:18 -0400 User-Agent: KMail/1.13.5 (FreeBSD/8.2-CBSD-20110714-p25; KDE/4.5.5; amd64; ; ) References: <981733489AB3BD4DB24B48340F53E0A55B0CFD79@MTLDAG01.mtl.com> <201305220905.57939.jhb@freebsd.org> <519CC7B4.2030208@mu.org> In-Reply-To: <519CC7B4.2030208@mu.org> MIME-Version: 1.0 Content-Type: Text/Plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Message-Id: <201305221115.19093.jhb@freebsd.org> X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.7 (bigwig.baldwin.cx); Wed, 22 May 2013 11:17:16 -0400 (EDT) Cc: Orit Moskovich , freebsd-arch@freebsd.org X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 22 May 2013 15:17:17 -0000 On Wednesday, May 22, 2013 9:27:16 am Alfred Perlstein wrote: > On 5/22/13 9:05 AM, John Baldwin wrote: > > Probably not. For example, on FreeBSD you want your driver lock to be > > preempted by an interrupt to avoid higher interrupt latency for filter > > handlers. Most drivers should not need temporary pinning. 
If they want to > > pin work to threads they should bind threads or IRQs to specific CPUs, not > > rely on temporary pinning. > > > I know how it works in FreeBSD. > > I think that a compatibility layer should first and foremost aim for > compatibility, not speed at expense of expected semantics. The problem with this is that whatever code runs under this layer also has to cooperate with the rest of the system. Blindly using spin locks does not do that. Also, I think my entire point is about "expected semantics". People should think about the actual semantics they need in a driver, not just assume that whatever side effects they get from the primitives and APIs provided on one platform defines the semantics they need. I still assert that in terms of what a device driver actually expects, a regular mutex will provide the correct semantics. -- John Baldwin From owner-freebsd-arch@FreeBSD.ORG Wed May 22 16:32:48 2013 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 6C1EE570; Wed, 22 May 2013 16:32:48 +0000 (UTC) (envelope-from bright@mu.org) Received: from elvis.mu.org (elvis.mu.org [192.203.228.196]) by mx1.freebsd.org (Postfix) with ESMTP id 5CD8C292; Wed, 22 May 2013 16:32:48 +0000 (UTC) Received: from Alfreds-MacBook-Pro-9.local (OTWAON23-1176242366.sdsl.bell.ca [70.28.8.190]) by elvis.mu.org (Postfix) with ESMTPSA id 6C9CC1A3C1B; Wed, 22 May 2013 09:32:47 -0700 (PDT) Message-ID: <519CF32D.2040609@mu.org> Date: Wed, 22 May 2013 12:32:45 -0400 From: Alfred Perlstein User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.7; rv:17.0) Gecko/20130509 Thunderbird/17.0.6 MIME-Version: 1.0 To: John Baldwin Subject: Re: FreeBSD spinlock - compatibility layer References: <981733489AB3BD4DB24B48340F53E0A55B0CFD79@MTLDAG01.mtl.com> <201305220905.57939.jhb@freebsd.org> <519CC7B4.2030208@mu.org> <201305221115.19093.jhb@freebsd.org> In-Reply-To: 
<201305221115.19093.jhb@freebsd.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: Orit Moskovich , freebsd-arch@freebsd.org X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 22 May 2013 16:32:48 -0000 On 5/22/13 11:15 AM, John Baldwin wrote: > On Wednesday, May 22, 2013 9:27:16 am Alfred Perlstein wrote: >> On 5/22/13 9:05 AM, John Baldwin wrote: >>> Probably not. For example, on FreeBSD you want your driver lock to be >>> preempted by an interrupt to avoid higher interrupt latency for filter >>> handlers. Most drivers should not need temporary pinning. If they want to >>> pin work to threads they should bind threads or IRQs to specific CPUs, not >>> rely on temporary pinning. >>> >> I know how it works in FreeBSD. >> >> I think that a compatibility layer should first and foremost aim for >> compatibility, not speed at expense of expected semantics. > The problem with this is that whatever code runs under this layer also has to > cooperate with the rest of the system. Blindly using spin locks does not do > that. Also, I think my entire point is about "expected semantics". People > should think about the actual semantics they need in a driver, not just assume > that whatever side effects they get from the primitives and APIs provided on > one platform defines the semantics they need. I still assert that in terms of > what a device driver actually expects, a regular mutex will provide the correct > semantics. > I agree with your assertion that what we have MTX_DEF should work for drivers for the cases we have. I do believe though that any kernel dev outside FreeBSD will expect certain semantics from a spin mutex though. It's an interesting problem. 
-Alfred From owner-freebsd-arch@FreeBSD.ORG Wed May 22 17:13:03 2013 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 4FA11FE; Wed, 22 May 2013 17:13:03 +0000 (UTC) (envelope-from oritm@mellanox.com) Received: from eu1sys200aog112.obsmtp.com (eu1sys200aog112.obsmtp.com [207.126.144.133]) by mx1.freebsd.org (Postfix) with ESMTP id 44AA077C; Wed, 22 May 2013 17:13:01 +0000 (UTC) Received: from MTLCAS01.mtl.com ([193.47.165.155]) (using TLSv1) by eu1sys200aob112.postini.com ([207.126.147.11]) with SMTP ID DSNKUZz8jm96DL3o6UN4bfegzqYWBSEUg2Rn@postini.com; Wed, 22 May 2013 17:13:02 UTC Received: from MTLDAG01.mtl.com ([10.0.8.75]) by MTLCAS01.mtl.com ([10.0.8.71]) with mapi id 14.03.0123.003; Wed, 22 May 2013 20:12:44 +0300 From: Orit Moskovich To: John Baldwin Subject: RE: FreeBSD spinlock - compatibility layer Thread-Topic: FreeBSD spinlock - compatibility layer Thread-Index: Ac5QiaSgCms1CiujRJ+uiUawknitKQEvjFEAACUK0bAAEntnAAAi8BtgAAhXzQAAB+r/oP//4/YAgABUGN4= Date: Wed, 22 May 2013 17:12:43 +0000 Message-ID: <981733489AB3BD4DB24B48340F53E0A55B0D4D32@MTLDAG01.mtl.com> References: <981733489AB3BD4DB24B48340F53E0A55B0CFD79@MTLDAG01.mtl.com> <201305220859.32948.jhb@freebsd.org> <981733489AB3BD4DB24B48340F53E0A55B0D3CD8@MTLDAG01.mtl.com>, <201305221105.55093.jhb@freebsd.org> In-Reply-To: <201305221105.55093.jhb@freebsd.org> Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-originating-ip: [10.2.8.71] Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Cc: "freebsd-arch@freebsd.org" X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 22 May 2013 17:13:03 -0000 Thanks, I 
appreciate all your help!

The section about the unbounded sleep wasn't included in the man page I've read so I wasn't familiar with that concept...(http://www.unix.com/man-page/FreeBSD/9/locking/)

________________________________________
From: John Baldwin [jhb@freebsd.org]
Sent: Wednesday, May 22, 2013 6:17 PM
To: Orit Moskovich
Cc: freebsd-arch@freebsd.org
Subject: Re: FreeBSD spinlock - compatibility layer

On Wednesday, May 22, 2013 9:48:00 am Orit Moskovich wrote:
> From the mutex man page "By default, MTX_DEF mutexes will context switch when they are already held."
> How is sleeping forbidden, but blocking on a mutex that might context switch is ok?

Because they are different. When you block on a lock you propagate your
priority to the lock holder and will resume execution if you are more
important as soon as the holder drops the lock. In other words, you are going
to make forward progress.

With "event" sleeps like *sleep() and condition variables, there is no owner
to propagate priority to, and the sleep may very well be waiting for some
arbitrary event (such as the arrival of a network packet or completion of an
I/O request), so there is not the same guarantee of making forward progress.

The other half of this that keeps this true is that you are not permitted to
perform "event" sleeps while holding a mutex. You have to drop the lock while
you wait which frees any threads waiting for the lock to run.
When you block on a mutex the only thing you are ever waiting on is CPU time for either yourself or the lock holder to run.

--
John Baldwin
From owner-freebsd-arch@FreeBSD.ORG Wed May 22 18:03:40 2013 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id F34F66AC for ; Wed, 22 May 2013 18:03:39 +0000 (UTC) (envelope-from carpeddiem@gmail.com) Received: from mail-ob0-x22d.google.com (mail-ob0-x22d.google.com [IPv6:2607:f8b0:4003:c01::22d]) by mx1.freebsd.org (Postfix) with ESMTP id C2156AB0 for ; Wed, 22 May 2013 18:03:39 +0000 (UTC) Received: by mail-ob0-f173.google.com with SMTP id eh20so2732114obb.4 for ; Wed, 22 May 2013 11:03:39 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type; bh=53M7B7d3BMlq99gm5CGXVxqt2gc95klu77EhDnhrVVk=; b=MQ9URV3bEjiT8VkyflQqoIDAUhAgCJC4P4ctUvWk3Nx/6ICnOceCVSYN+9mOymVPY1 z/shhwtQdhirApJ/Qe1w13IVdNtGPmmi8FUOfEM6w1gTWJ00PlFSdGFn9u/H3Qo6DfhR CDOt3l03UHqFFS4oe3gf7Tf/aJRGcSNbLFhu3heYJr5pRzHy11Y80PD/T31olXYrPf83 RKjUy8qX2UrqRibn1zlSCfC58fNcDK1s0ESBsp6O6mSsUURN6BtE+pv+5rnVvpkpvMi0 k+8BuyTYTswVryRAfNqJlQ3k/fbTKb+1RckuKEvLJ2JgRHd4PxZ1kWzkXmi+J9qUS2gh FNvQ== MIME-Version: 1.0 X-Received: by 10.182.231.197 with SMTP id ti5mr5895034obc.64.1369245819429; Wed, 22 May 2013 11:03:39 -0700 (PDT) Sender: carpeddiem@gmail.com Received: by 10.60.149.194 with HTTP; Wed, 22 May 2013 11:03:39 -0700 (PDT) In-Reply-To: <981733489AB3BD4DB24B48340F53E0A55B0D4D32@MTLDAG01.mtl.com> References: <981733489AB3BD4DB24B48340F53E0A55B0CFD79@MTLDAG01.mtl.com> <201305220859.32948.jhb@freebsd.org> <981733489AB3BD4DB24B48340F53E0A55B0D3CD8@MTLDAG01.mtl.com> <201305221105.55093.jhb@freebsd.org> <981733489AB3BD4DB24B48340F53E0A55B0D4D32@MTLDAG01.mtl.com> Date: Wed, 22 May 2013 14:03:39 -0400 
X-Google-Sender-Auth: hRXDHVR_NkZl5up3xAhCcZRqB6M Message-ID: Subject: Re: FreeBSD spinlock - compatibility layer From: Ed Maste To: Orit Moskovich Content-Type: text/plain; charset=ISO-8859-1 Cc: "freebsd-arch@freebsd.org" X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 22 May 2013 18:03:40 -0000 On 22 May 2013 13:12, Orit Moskovich wrote: > Thanks, I appreciate all your help! > > The section about the unbounded sleep wasn't included in the man page I've read so I wasn't familiar with that concept...(http://www.unix.com/man-page/FreeBSD/9/locking/) Ahh, they seem to be hosting a rather old version. Google doesn't rate FreeBSD's particularly highly, perhaps due to robots.txt settings for the cgi content on http://www.freebsd.org/. You'll find a mostly up-to-date one at http://www.freebsd.org/cgi/man.cgi?query=locking&sektion=9 . 
From owner-freebsd-arch@FreeBSD.ORG Wed May 22 21:50:32 2013 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 302487F4 for ; Wed, 22 May 2013 21:50:32 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from bigwig.baldwin.cx (bigwig.baldwin.cx [IPv6:2001:470:1f11:75::1]) by mx1.freebsd.org (Postfix) with ESMTP id 0EC09A3A for ; Wed, 22 May 2013 21:50:32 +0000 (UTC) Received: from jhbbsd.localnet (unknown [209.249.190.124]) by bigwig.baldwin.cx (Postfix) with ESMTPSA id 48C86B99A; Wed, 22 May 2013 17:50:31 -0400 (EDT) From: John Baldwin To: Alfred Perlstein Subject: Re: FreeBSD spinlock - compatibility layer Date: Wed, 22 May 2013 13:14:25 -0400 User-Agent: KMail/1.13.5 (FreeBSD/8.2-CBSD-20110714-p25; KDE/4.5.5; amd64; ; ) References: <981733489AB3BD4DB24B48340F53E0A55B0CFD79@MTLDAG01.mtl.com> <201305221115.19093.jhb@freebsd.org> <519CF32D.2040609@mu.org> In-Reply-To: <519CF32D.2040609@mu.org> MIME-Version: 1.0 Content-Type: Text/Plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Message-Id: <201305221314.25663.jhb@freebsd.org> X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.7 (bigwig.baldwin.cx); Wed, 22 May 2013 17:50:31 -0400 (EDT) Cc: Orit Moskovich , freebsd-arch@freebsd.org X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 22 May 2013 21:50:32 -0000 On Wednesday, May 22, 2013 12:32:45 pm Alfred Perlstein wrote: > On 5/22/13 11:15 AM, John Baldwin wrote: > > On Wednesday, May 22, 2013 9:27:16 am Alfred Perlstein wrote: > >> On 5/22/13 9:05 AM, John Baldwin wrote: > >>> Probably not. For example, on FreeBSD you want your driver lock to be > >>> preempted by an interrupt to avoid higher interrupt latency for filter > >>> handlers. 
Most drivers should not need temporary pinning. If they want to > >>> pin work to threads they should bind threads or IRQs to specific CPUs, not > >>> rely on temporary pinning. > >>> > >> I know how it works in FreeBSD. > >> > >> I think that a compatibility layer should first and foremost aim for > >> compatibility, not speed at expense of expected semantics. > > The problem with this is that whatever code runs under this layer also has to > > cooperate with the rest of the system. Blindly using spin locks does not do > > that. Also, I think my entire point is about "expected semantics". People > > should think about the actual semantics they need in a driver, not just assume > > that whatever side effects they get from the primitives and APIs provided on > > one platform defines the semantics they need. I still assert that in terms of > > what a device driver actually expects, a regular mutex will provide the correct > > semantics. > > > I agree with your assertion that what we have MTX_DEF should work for > drivers for the cases we have. > > I do believe though that any kernel dev outside FreeBSD will expect > certain semantics from a spin mutex though. No, not on Solaris. Probably not on some dinosaur UNIXes (Irix had adaptive mutexes for example). 
-- John Baldwin From owner-freebsd-arch@FreeBSD.ORG Wed May 22 22:17:38 2013 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 9EF063BE for ; Wed, 22 May 2013 22:17:38 +0000 (UTC) (envelope-from sson@FreeBSD.org) Received: from ns1.son.org (son.org [65.48.68.179]) by mx1.freebsd.org (Postfix) with ESMTP id 6CEF0C34 for ; Wed, 22 May 2013 22:17:37 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by ns1.son.org (Postfix) with ESMTP id 8F96E1F2AA08 for ; Wed, 22 May 2013 17:17:47 -0500 (CDT) Received: from ns1.son.org ([127.0.0.1]) by localhost (ns1.dev-random.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 4BQNnhX6NS1e for ; Wed, 22 May 2013 17:17:45 -0500 (CDT) Received: from [192.168.0.22] (cpe-76-187-137-252.tx.res.rr.com [76.187.137.252]) by ns1.son.org (Postfix) with ESMTP id E520E1F2AA01 for ; Wed, 22 May 2013 17:17:44 -0500 (CDT) From: Stacey Son Content-Type: text/plain; charset=iso-8859-1 Content-Transfer-Encoding: quoted-printable Subject: binmiscctl(8) (and imgact_binmisc kernel module) Date: Wed, 22 May 2013 17:17:29 -0500 Message-Id: To: freebsd-arch@freebsd.org Mime-Version: 1.0 (Apple Message framework v1283) X-Mailer: Apple Mail (2.1283) X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 22 May 2013 22:17:38 -0000 Hi all: I added a command-line utility called 'binmiscctl' for the = imgact_binmisc kernel module that I previously proposed on this list. = As you may recall, imgact_binmisc is an image activator for = miscellaneous binary file types that are executed with the help of a = user-level interpreter or emulator. It has been proposed that = imgact_binmisc be added to the kernel as a module. 
The main reason I created this is to support cross building packages using qemu user mode (see my dev summit slides at http://people.freebsd.org/~sson/imgact_binmisc/20130515-bsdcan-xbuild-ports.pdf) but there are a lot of other applications for this module as well. For example, Nathan Whitehorn previously proposed on this list a similar code change (but much less general) to support transparently executing LLVM bitcode using the 'lli' JIT compiler. This kernel module is flexible enough that it supports that as well.

Baptiste has also added support in poudrière for cross-building mips64 packages in a "cross jail" using qemu user mode. See his slides from BSDCan 2013 (pg. 7): http://people.freebsd.org/~bapt/modern-package-management.pdf

Bapt mentioned that he built over 10,000 mips64 packages in about 30 hours. Of course, this is before adding imgact_binmisc which greatly improves the cross build speed by allowing both native (amd64) cross build tools to be used along with emulated mips64 binaries in a hybrid fashion. With my limited testing of cross building a handful of ports the overhead compared to building the port natively on a commodity amd64 host is 2x to 4x using this kernel module. Without the module the overhead is 10x or much more.

The recently added 'binmiscctl' command-line utility allows for easy configuration and management of the image activators in this imgact_binmisc kernel module.
For example, to add an image activator for qemu-mips64 (the qemu user mode emulator for mips64):

    # binmiscctl add mips64elf --interpreter "/usr/local/bin/qemu-mips64" --magic \
        "\x7f\x45\x4c\x46\x02\x02\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x02\x00\x08" \
        --mask "\xff\xff\xff\xff\xff\xff\xff\x00\xff\xff\xff\xff\xff\xff\xff\xff\xff\xfe\xff\xff" --size 20 --set-enabled

To disable the above without removing it from the module's activator list:

    # binmiscctl disable mips64elf

To enable:

    # binmiscctl enable mips64elf

To remove from the module's activator list:

    # binmiscctl remove mips64elf

To look up and print out the activator entry:

    # binmiscctl lookup mips64elf
    name: mips64elf
    interpreter: /usr/local/bin/qemu-mips64
    flags: ENABLED USE_MASK
    magic size: 20
    magic offset: 0
    magic: 0x7f 0x45 0x4c 0x46 0x02 0x02 0x01 0x00 0x00 0x00 0x00 0x00
           0x00 0x00 0x00 0x00 0x00 0x02 0x00 0x08
    mask:  0xff 0xff 0xff 0xff 0xff 0xff 0xff 0x00 0xff 0xff 0xff 0xff
           0xff 0xff 0xff 0xff 0xff 0xfe 0xff 0xff

To take a snapshot and list all the activators:

    # binmiscctl list
    name: mips64elf
    [...]

To add an image activator for LLVM bitcode JIT lli(1) compiler:

    # binmiscctl add llvmbc --interpreter "/usr/bin/lli --fake-argv0=#a" \
        --magic "BC\xc0\xde" --size 4 --set-enabled

Note the "#a", in the above example, is replaced with the old argv0 value so lli(1) can use it to fake the argv0 as described in the lli(1) man page.

The source code, man page, and diff to add it to the source tree can be found at: http://people.freebsd.org/~sson/imgact_binmisc/

Of course, comments, suggestions, concerns, detailed code reviews, etc. are welcome.
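To make the --magic/--mask pair in the mips64elf example concrete, here is a minimal userland sketch. It assumes the conventional reading of such a pair (a header matches when it equals the magic in every bit position the mask selects); the message does not show how imgact_binmisc actually performs the comparison, and the helper name magic_matches is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Magic and mask bytes copied from the mips64elf example. */
const uint8_t mips64_magic[20] = {
	0x7f, 0x45, 0x4c, 0x46, 0x02, 0x02, 0x01, 0x00, 0x00, 0x00,
	0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x02, 0x00, 0x08
};
const uint8_t mips64_mask[20] = {
	0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0x00, 0xff, 0xff,
	0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xfe, 0xff, 0xff
};

/*
 * Hypothetical helper: returns 1 when the first `size` bytes of `hdr`
 * equal `magic` in every bit position selected by `mask`, else 0.
 */
int
magic_matches(const uint8_t *hdr, const uint8_t *magic, const uint8_t *mask,
    size_t size)
{
	assert(hdr != NULL && magic != NULL && mask != NULL);
	for (size_t i = 0; i < size; i++) {
		if ((hdr[i] & mask[i]) != (magic[i] & mask[i]))
			return (0);
	}
	return (1);
}
```

Under that assumption, the 0x00 mask byte at offset 7 means the ELF OS/ABI byte is ignored, and the 0xfe mask byte at offset 17 lets both 0x02 and 0x03 in the low byte of e_type (ET_EXEC and ET_DYN) match, while the last two bytes pin the machine type to 0x0008 (EM_MIPS).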
Best Regards, -stacey. From owner-freebsd-arch@FreeBSD.ORG Fri May 24 01:42:23 2013 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 63181196; Fri, 24 May 2013 01:42:23 +0000 (UTC) (envelope-from julian@freebsd.org) Received: from vps1.elischer.org (vps1.elischer.org [204.109.63.16]) by mx1.freebsd.org (Postfix) with ESMTP id 38F42E06; Fri, 24 May 2013 01:42:22 +0000 (UTC) Received: from jre-mbp.elischer.org (ppp121-45-237-17.lns20.per1.internode.on.net [121.45.237.17]) (authenticated bits=0) by vps1.elischer.org (8.14.5/8.14.5) with ESMTP id r4O1gIRS077119 (version=TLSv1/SSLv3 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO); Thu, 23 May 2013 18:42:21 -0700 (PDT) (envelope-from julian@freebsd.org) Message-ID: <519EC574.1070102@freebsd.org> Date: Fri, 24 May 2013 09:42:12 +0800 From: Julian Elischer User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.8; rv:17.0) Gecko/20130509 Thunderbird/17.0.6 MIME-Version: 1.0 To: Andriy Gapon Subject: Re: FreeBSD spinlock - compatibility layer References: <981733489AB3BD4DB24B48340F53E0A55B0CFD79@MTLDAG01.mtl.com> <201305200950.26834.jhb@freebsd.org> <981733489AB3BD4DB24B48340F53E0A55B0D091F@MTLDAG01.mtl.com> <201305211220.16776.jhb@freebsd.org> <981733489AB3BD4DB24B48340F53E0A55B0D39EF@MTLDAG01.mtl.com> <519C663E.1090307@FreeBSD.org> In-Reply-To: <519C663E.1090307@FreeBSD.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: Orit Moskovich , "freebsd-arch@freebsd.org" X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 24 May 2013 01:42:23 -0000 On 5/22/13 2:31 PM, Andriy Gapon wrote: > on 22/05/2013 09:14 Orit Moskovich said the following: >> From what I've read in 
"FreeBSD - device drivers" book by Joseph Kong on >> interrupt handling, you cannot voluntarily context switch (that is, sleep) in >> interrupt threads . > See the table at the end of locking(9) manual page. > John is updating that table and has promised (nudge nudge) to get it up to date, but I think that it still is correct for what it says. (there is new stuff to add). From owner-freebsd-arch@FreeBSD.ORG Fri May 24 02:27:05 2013 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 64648C6A; Fri, 24 May 2013 02:27:05 +0000 (UTC) (envelope-from julian@freebsd.org) Received: from vps1.elischer.org (vps1.elischer.org [204.109.63.16]) by mx1.freebsd.org (Postfix) with ESMTP id 390F9F84; Fri, 24 May 2013 02:27:04 +0000 (UTC) Received: from jre-mbp.elischer.org (ppp121-45-237-17.lns20.per1.internode.on.net [121.45.237.17]) (authenticated bits=0) by vps1.elischer.org (8.14.5/8.14.5) with ESMTP id r4O2QwPt077268 (version=TLSv1/SSLv3 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO); Thu, 23 May 2013 19:27:01 -0700 (PDT) (envelope-from julian@freebsd.org) Message-ID: <519ECFED.9050602@freebsd.org> Date: Fri, 24 May 2013 10:26:53 +0800 From: Julian Elischer User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.8; rv:17.0) Gecko/20130509 Thunderbird/17.0.6 MIME-Version: 1.0 To: John Baldwin Subject: Re: FreeBSD spinlock - compatibility layer References: <981733489AB3BD4DB24B48340F53E0A55B0CFD79@MTLDAG01.mtl.com> <201305220905.57939.jhb@freebsd.org> <519CC7B4.2030208@mu.org> <201305221115.19093.jhb@freebsd.org> In-Reply-To: <201305221115.19093.jhb@freebsd.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: Alfred Perlstein , Orit Moskovich , freebsd-arch@freebsd.org X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , 
List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 24 May 2013 02:27:05 -0000 On 5/22/13 11:15 PM, John Baldwin wrote: > On Wednesday, May 22, 2013 9:27:16 am Alfred Perlstein wrote: >> On 5/22/13 9:05 AM, John Baldwin wrote: >>> Probably not. For example, on FreeBSD you want your driver lock to be >>> preempted by an interrupt to avoid higher interrupt latency for filter >>> handlers. Most drivers should not need temporary pinning. If they want to >>> pin work to threads they should bind threads or IRQs to specific CPUs, not >>> rely on temporary pinning. >>> >> I know how it works in FreeBSD. >> >> I think that a compatibility layer should first and foremost aim for >> compatibility, not speed at expense of expected semantics. > The problem with this is that whatever code runs under this layer also has to > cooperate with the rest of the system. Blindly using spin locks does not do > that. Also, I think my entire point is about "expected semantics". People > should think about the actual semantics they need in a driver, not just assume > that whatever side effects they get from the primitives and APIs provided on > one platform defines the semantics they need. I still assert that in terms of > what a device driver actually expects, a regular mutex will provide the correct > semantics. > in the *mumble* (company I worked for recently) compat layer, we used mutexes to implement the "spinlock" that the linux driver used when it was run under FreeBSD.. It really didn't need a spinlock, it's just that by default linux drivers do/did so. 
From owner-freebsd-arch@FreeBSD.ORG Fri May 24 05:08:12 2013 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 7690D72D; Fri, 24 May 2013 05:08:12 +0000 (UTC) (envelope-from chmeeedalf@gmail.com) Received: from mail-bk0-x235.google.com (mail-bk0-x235.google.com [IPv6:2a00:1450:4008:c01::235]) by mx1.freebsd.org (Postfix) with ESMTP id D61A69FA; Fri, 24 May 2013 05:08:11 +0000 (UTC) Received: by mail-bk0-f53.google.com with SMTP id mx1so2242152bkb.26 for ; Thu, 23 May 2013 22:08:11 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type; bh=ARkcNriqJkKeuyyHZNQN0EM29rD/DrpdzfgQBx7zcDE=; b=k+lDHAiVs+Ngi4Y4JI2uaSb4mzTLJSi7jOxjx0bcWYfylgHcsQVdG/cGe/4+V/dp5k 7kijA45sLv0AqKXILlDVljTyQggW3tPdLgNgX8IPRllMLv2PujoHJbgGVvTv+9NmvRcG 3GBjGA/Mz6L/PpFh/W9UY2JVQIL7YSU8fN0uZ8bnzDIOdLoLJ+FEuhplLdYSEd3wsQuO +Q/lswUD2pFxLCbXGXmWzajGQzSEIN9EjdVIJbnGMEuOHO5MooV+Houbj3CxoRq44u+V wFN5spogukv1lFNagrjpz2vqDJutv6vYLoD6QLfTXwrjhyvZCqeznEVzz3RVqXL59LhN oEpQ== MIME-Version: 1.0 X-Received: by 10.205.34.132 with SMTP id ss4mr8324506bkb.125.1369372090844; Thu, 23 May 2013 22:08:10 -0700 (PDT) Sender: chmeeedalf@gmail.com Received: by 10.204.38.144 with HTTP; Thu, 23 May 2013 22:08:10 -0700 (PDT) In-Reply-To: <519369C4.6060402@FreeBSD.org> References: <51913B7D.1040801@freebsd.org> <288C9B9E-E943-4C5B-BCFB-15B721CBE94C@bsdimp.com> <519369C4.6060402@FreeBSD.org> Date: Thu, 23 May 2013 22:08:10 -0700 X-Google-Sender-Auth: kJLT9EJtEKjs2wuH_Yjs4O2K-So Message-ID: Subject: Re: late suspend/early resume From: Justin Hibbits To: John Baldwin Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.14 Cc: freebsd-arch@freebsd.org X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: 
Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 24 May 2013 05:08:12 -0000 John, On Wed, May 15, 2013 at 3:56 AM, John Baldwin wrote: > On 5/14/13 1:14 PM, Justin Hibbits wrote: > > You are right that the late suspend could lead to silly proliferation, > and > > an ordered list is much better, but another API would need to be added to > > do that as well. > > > > My north bridge is first in the top list of the tree, right under the > > nexus, so to suspend it last I wrote the nexus suspend to traverse its > > children in reverse. The problem comes that the clock controller is > under a > > later PCI bus, not even the immediate following one, and the north bridge > > children are i2c devices, so suspending them after their clock head away > is > > problematic. We can discuss this more at bsdcan, where it may be easier > to > > describe. But essentially I need the north bridge and that pesky clock > > controller to be the last to suspend and the first to resume. I guess we > > can take this as the starting discussion for modeling this relationship > on > > all platforms, since you mention it is common in embedded platforms. > > I think you can do this by having a notion of passes with drivers having > a suspend pass level and doing passes over the tree suspending devices > at each pass level and then walking the passes back up in reverse during > resume. You could borrow from the multipass stuff used on probe/attach > for this. > > -- > John Baldwin > I have an update in the projects/pmac_pmu branch. It works for my PowerBook, but I'm not certain how well it will fare in the end, because of the way the PCI driver resumes its children. It doesn't call bus_generic_resume(), and instead suspends each child individually, which can lead to devices being resumed multiple times, but I'm uncertain how to fix that. Any ideas? 
One I had was to have some kind of 'bus_generic_resume_filtered' or similar, that lets bus_generic_resume run its logic, but filter through a function to determine if the child is resumable. But that doesn't quite feel right to me. - Justin From owner-freebsd-arch@FreeBSD.ORG Fri May 24 05:11:54 2013 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 0559191F for ; Fri, 24 May 2013 05:11:54 +0000 (UTC) (envelope-from imp@bsdimp.com) Received: from mail-ie0-x22c.google.com (mail-ie0-x22c.google.com [IPv6:2607:f8b0:4001:c03::22c]) by mx1.freebsd.org (Postfix) with ESMTP id CAE72A2C for ; Fri, 24 May 2013 05:11:53 +0000 (UTC) Received: by mail-ie0-f172.google.com with SMTP id 16so11153527iea.31 for ; Thu, 23 May 2013 22:11:53 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20120113; h=sender:subject:mime-version:content-type:from:in-reply-to:date:cc :content-transfer-encoding:message-id:references:to:x-mailer :x-gm-message-state; bh=f0ow6EKApjDC/dUeWRW+Rn1mzeDgCLYZ0rpG/0zmpi4=; b=BK5Nw4+gYMY8nFYDuhi+ujTKQBkZfeajHAuH8gIHPRXVTZErv+M7hy+oWEEYoExejh dhNseEl+GK8Lp4qgPgHkJgt235FumTvkz64R+Nu9qhGdX6ho9rKxS6RpR2C0xNAL94rF rvig07sjXrI08N15K9zb6HcOdYA6zoTwcnFln0YKVlG9n2+vT40Z6E5R4sCtMOSUmXlX R6fUC/CtzhVGFc2887yzq65Mdm/91XUjmXC3pGFXWdc3wW+Jt1Qn9AGjP7YzjIjldtqJ ftmuWpps5EI2STxdO6cbsCykXLYPKdj68igyQB30e0BO9//qYAZMu1SboXF9Dra1v0lB E98g== X-Received: by 10.50.11.229 with SMTP id t5mr2879020igb.65.1369372313552; Thu, 23 May 2013 22:11:53 -0700 (PDT) Received: from 53.imp.bsdimp.com (50-78-194-198-static.hfc.comcastbusiness.net. 
[50.78.194.198]) by mx.google.com with ESMTPSA id d9sm29295546igr.4.2013.05.23.22.11.51 for (version=TLSv1 cipher=ECDHE-RSA-RC4-SHA bits=128/128); Thu, 23 May 2013 22:11:52 -0700 (PDT) Sender: Warner Losh Subject: Re: late suspend/early resume Mime-Version: 1.0 (Apple Message framework v1085) Content-Type: text/plain; charset=us-ascii From: Warner Losh In-Reply-To: Date: Thu, 23 May 2013 23:11:49 -0600 Content-Transfer-Encoding: quoted-printable Message-Id: <780B27A1-AA4F-4382-BE33-7587BD9EB615@bsdimp.com> References: <51913B7D.1040801@freebsd.org> <288C9B9E-E943-4C5B-BCFB-15B721CBE94C@bsdimp.com> <519369C4.6060402@FreeBSD.org> To: Justin Hibbits X-Mailer: Apple Mail (2.1085) X-Gm-Message-State: ALoCoQkAoeSLCtzXuaI3Za+jybu+zO+L8eURvEHs3B4Dzd/+LTgdZroON/UKwh/jbrwdv+KrMbC6 Cc: freebsd-arch@freebsd.org X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 24 May 2013 05:11:54 -0000 On May 23, 2013, at 11:08 PM, Justin Hibbits wrote: > John, >=20 >=20 > On Wed, May 15, 2013 at 3:56 AM, John Baldwin wrote: >=20 >> On 5/14/13 1:14 PM, Justin Hibbits wrote: >>> You are right that the late suspend could lead to silly = proliferation, >> and >>> an ordered list is much better, but another API would need to be = added to >>> do that as well. >>>=20 >>> My north bridge is first in the top list of the tree, right under = the >>> nexus, so to suspend it last I wrote the nexus suspend to traverse = its >>> children in reverse. The problem comes that the clock controller is >> under a >>> later PCI bus, not even the immediate following one, and the north = bridge >>> children are i2c devices, so suspending them after their clock head = away >> is >>> problematic. We can discuss this more at bsdcan, where it may be = easier >> to >>> describe. 
But essentially I need the north bridge and that pesky clock >>> controller to be the last to suspend and the first to resume. I guess we >>> can take this as the starting discussion for modeling this relationship >> on >>> all platforms, since you mention it is common in embedded platforms. >> >> I think you can do this by having a notion of passes with drivers having >> a suspend pass level and doing passes over the tree suspending devices >> at each pass level and then walking the passes back up in reverse during >> resume. You could borrow from the multipass stuff used on probe/attach >> for this. >> >> -- >> John Baldwin >> > > I have an update in the projects/pmac_pmu branch. It works for my > PowerBook, but I'm not certain how well it will fare in the end, because of > the way the PCI driver resumes its children. It doesn't call > bus_generic_resume(), and instead suspends each child individually, which > can lead to devices being resumed multiple times, but I'm uncertain how to > fix that. Any ideas? One I had was to have some kind of > 'bus_generic_resume_filtered' or similar, that lets bus_generic_resume run > its logic, but filter through a function to determine if the child is > resumable. But that doesn't quite feel right to me. How does it lead there? Did you change the suspend/resume function signatures? Or did you create new ones like I suggested that had a default method that called the old suspend/resume function iff pass was the last one....
Warner From owner-freebsd-arch@FreeBSD.ORG Fri May 24 15:08:23 2013 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 30528D15; Fri, 24 May 2013 15:08:23 +0000 (UTC) (envelope-from chmeeedalf@gmail.com) Received: from mail-bk0-x22f.google.com (mail-bk0-x22f.google.com [IPv6:2a00:1450:4008:c01::22f]) by mx1.freebsd.org (Postfix) with ESMTP id 8B2A7827; Fri, 24 May 2013 15:08:22 +0000 (UTC) Received: by mail-bk0-f47.google.com with SMTP id jg1so2566200bkc.34 for ; Fri, 24 May 2013 08:08:21 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type; bh=4TgIgd8nTBy827rsFNp88qxMZzVtw1zvVkzcPSTWaLw=; b=ZZlMacLNoWCqK1yHf3DQO2EnmPfKBmnhYID76K7VrEP6WUyjDFmrbTYbLJzSTJvWcr com78M1FcMPuGk63QiTi+PoKJPvv2WBjWIJzICDND+HizVrKfgkDwcN2GnDD+Y001gTQ 7Z5WoD9DTDhZdADsrS9I5VDVx391UsKynyRTdFy6nnrtjn++cat9XjACUatlUWs5jAxz DW+/of/o4H07h6QVpWjok+jHEJHv0U7PkS3c14K6Ezc67GL8HZWs5rOiWPxY0JO2uUY9 btB2M5kgdesvQl1NhWohc7RIvJZVFWzEnv6IA2c7Fk2QDVF6Xgl7meJ6Gh83nLd9fmPT hxvA== MIME-Version: 1.0 X-Received: by 10.204.65.69 with SMTP id h5mr9399810bki.59.1369408101489; Fri, 24 May 2013 08:08:21 -0700 (PDT) Sender: chmeeedalf@gmail.com Received: by 10.204.38.144 with HTTP; Fri, 24 May 2013 08:08:21 -0700 (PDT) In-Reply-To: <780B27A1-AA4F-4382-BE33-7587BD9EB615@bsdimp.com> References: <51913B7D.1040801@freebsd.org> <288C9B9E-E943-4C5B-BCFB-15B721CBE94C@bsdimp.com> <519369C4.6060402@FreeBSD.org> <780B27A1-AA4F-4382-BE33-7587BD9EB615@bsdimp.com> Date: Fri, 24 May 2013 08:08:21 -0700 X-Google-Sender-Auth: p3uyIJDf3JYtS_b0LSK7Z1MHxJw Message-ID: Subject: Re: late suspend/early resume From: Justin Hibbits To: Warner Losh Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.14 Cc: freebsd-arch@freebsd.org 
X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 24 May 2013 15:08:23 -0000 On Thu, May 23, 2013 at 10:11 PM, Warner Losh wrote: > > On May 23, 2013, at 11:08 PM, Justin Hibbits wrote: > > > John, > > > > > > On Wed, May 15, 2013 at 3:56 AM, John Baldwin wrote: > > > >> On 5/14/13 1:14 PM, Justin Hibbits wrote: > >>> You are right that the late suspend could lead to silly proliferation, > >> and > >>> an ordered list is much better, but another API would need to be added > to > >>> do that as well. > >>> > >>> My north bridge is first in the top list of the tree, right under the > >>> nexus, so to suspend it last I wrote the nexus suspend to traverse its > >>> children in reverse. The problem comes that the clock controller is > >> under a > >>> later PCI bus, not even the immediate following one, and the north > bridge > >>> children are i2c devices, so suspending them after their clock head > away > >> is > >>> problematic. We can discuss this more at bsdcan, where it may be > easier > >> to > >>> describe. But essentially I need the north bridge and that pesky clock > >>> controller to be the last to suspend and the first to resume. I guess > we > >>> can take this as the starting discussion for modeling this relationship > >> on > >>> all platforms, since you mention it is common in embedded platforms. > >> > >> I think you can do this by having a notion of passes with drivers having > >> a suspend pass level and doing passes over the tree suspending devices > >> at each pass level and then walking the passes back up in reverse during > >> resume. You could borrow from the multipass stuff used on probe/attach > >> for this. > >> > >> -- > >> John Baldwin > >> > > > > I have an update in the projects/pmac_pmu branch. 
It works for my > > PowerBook, but I'm not certain how well it will fare in the end, because > of > > the way the PCI driver resumes its children. It doesn't call > > bus_generic_resume(), and instead suspends each child individually, which > > can lead to devices being resumed multiple times, but I'm uncertain how > to > > fix that. Any ideas? One I had was to have some kind of > > 'bus_generic_resume_filtered' or similar, that lets bus_generic_resume > run > > its logic, but filter through a function to determine if the child is > > resumable. But that doesn't quite feel right to me. > > How does it lead there? Did you change the suspend/resume function > signatures? Or did you create new ones like I suggested that had a default > method that called the old suspend/resume function iff pass was the last > one.... > > > Warner > > I must've misunderstood you during our discussion. I simply reused the existing bus_generic_suspend/resume, and used EAGAIN to pass a token of 'rescan this again'. - Justin
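For readers following the thread, here is a rough userland sketch of the idea being discussed: each device carries a pass level, suspend walks the passes from the last one down to the first, resume walks them back up, and a device that is asked too early answers EAGAIN as a "rescan me later" token. This is only a model under stated assumptions. The names struct dev, PASS_CORE/PASS_BUS/PASS_DEFAULT, dev_suspend(), dev_resume(), tree_suspend() and tree_resume() are all made up for illustration and are not the newbus API or the code in projects/pmac_pmu. The children are kept in a flat array rather than a real device tree, and the per-device "suspended" flag is one possible way to keep a device from being resumed more than once.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical pass levels for this sketch; not FreeBSD's BUS_PASS_* values. */
enum { PASS_CORE = 0, PASS_BUS = 1, PASS_DEFAULT = 2, PASS_COUNT = 3 };

struct dev {
	const char *name;
	int pass;       /* pass at which this device suspends and resumes */
	int suspended;  /* keeps a device from being acted on twice */
};

/* Records the order in which devices actually changed state. */
#define MAX_LOG 32
static const char *event_log[MAX_LOG];
static size_t event_count;

static void
log_event(const char *name)
{
	assert(event_count < MAX_LOG);
	event_log[event_count++] = name;
}

/* Returns 0 when done (or already done), EAGAIN when asked too early. */
static int
dev_suspend(struct dev *d, int pass)
{
	if (d->suspended)
		return (0);
	if (d->pass < pass)
		return (EAGAIN);	/* a later (lower) pass will handle it */
	d->suspended = 1;
	log_event(d->name);
	return (0);
}

static int
dev_resume(struct dev *d, int pass)
{
	if (!d->suspended)
		return (0);
	if (d->pass > pass)
		return (EAGAIN);	/* a later (higher) pass will handle it */
	d->suspended = 0;
	log_event(d->name);
	return (0);
}

/* Suspend: highest pass first, children in reverse order within a pass. */
static int
tree_suspend(struct dev *devs, size_t n)
{
	int pending = 0;

	for (int pass = PASS_COUNT - 1; pass >= 0; pass--) {
		pending = 0;
		for (size_t i = n; i-- > 0;)
			if (dev_suspend(&devs[i], pass) == EAGAIN)
				pending++;
	}
	return (pending);	/* devices still waiting after the final pass */
}

/* Resume: lowest pass first, children in forward order within a pass. */
static int
tree_resume(struct dev *devs, size_t n)
{
	int pending = 0;

	for (int pass = 0; pass < PASS_COUNT; pass++) {
		pending = 0;
		for (size_t i = 0; i < n; i++)
			if (dev_resume(&devs[i], pass) == EAGAIN)
				pending++;
	}
	return (pending);
}
```

With an array laid out as northbridge (PASS_CORE), i2c0 (PASS_DEFAULT), pci1 (PASS_BUS), clock (PASS_CORE), ethernet (PASS_DEFAULT), tree_suspend() would take down the north bridge and the clock controller last, and tree_resume() would bring them up first, regardless of where they sit in the child list. Calling tree_resume() a second time changes nothing, since already-resumed devices return 0 without acting.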