From owner-freebsd-arch@FreeBSD.ORG Wed Dec 8 21:03:39 2010
Date: Wed, 8 Dec 2010 23:43:46 +0300
From: Gennady Proskurin
To: freebsd-arch@freebsd.org
Message-ID: <20101208204346.GA1762@gpr.nnz-home.ru>
Subject: bsdtar and locale

bsdtar (which is the default "tar" in FreeBSD) treats file/directory names in a
locale-dependent manner. For example, if you archive a file with a UTF-8 name
in the "C" locale (env LC_ALL=C tar -c ...), and then extract it in some UTF-8
locale, its name will be corrupted. Such behaviour is somewhat documented in the
archive_entry(3) and bsdtar(1) manpages, so this is not a bug but a feature.

I agree, such conversions can be useful in some cases, but they should be disabled
by default (we are Unix, filenames are just binary data).
It is very annoying; it makes you always think about locales while creating
and extracting archives.

For now, I use gtar for backups to avoid such problems.

From owner-freebsd-arch@FreeBSD.ORG Thu Dec 9 07:17:42 2010
From: Tim Kientzle
In-Reply-To: <20101208204346.GA1762@gpr.nnz-home.ru>
Date: Wed, 8 Dec 2010 22:50:14 -0800
References: <20101208204346.GA1762@gpr.nnz-home.ru>
To: Gennady Proskurin
Cc: freebsd-arch@freebsd.org
Subject: Re: bsdtar and locale

On Dec 8, 2010, at 12:43 PM, Gennady Proskurin wrote:

> bsdtar ... if you archive a file with a UTF-8 name
> in the "C" locale (env LC_ALL=C tar -c ...), and then extract it in some UTF-8
> locale, its name will be corrupted. Such behaviour is somewhat documented in the
> archive_entry(3) and bsdtar(1) manpages, so this is not a bug but a feature.
>
> I agree, such conversions can be useful in some cases, but they should be disabled
> by default (we are Unix, filenames are just binary data).
> It is very annoying; it makes you always think about locales while creating
> and extracting archives.

The extended tar format used by bsdtar comes from the POSIX standard:

http://www.opengroup.org/onlinepubs/9699919799/utilities/pax.html

The issue you mention is discussed in the standard:

> Translating filenames and other attributes from a locale's encoding to
> UTF-8 and then back again can lose information, as the resulting
> filename might not be byte-for-byte equivalent to the original. To avoid
> this problem, users can specify the -o hdrcharset=binary option, which
> will cause the resulting archive to use binary format for all names and
> attributes. Such archives are not portable among hosts that use
> different native encodings (e.g., EBCDIC versus ASCII-based encodings),
> but they will allow interchange among the vast majority of POSIX file
> systems in practical use. Also, the -o hdrcharset=binary option will
> cause pax in copy mode to behave more like other standard utilities such
> as cp.

bsdtar does not yet implement an option equivalent to the -o
hdrcharset=binary option, but most of the logic is already implemented
in libarchive. Libarchive's write support for pax format does
automatically switch to hdrcharset=binary for entries if the names
cannot be translated to UTF-8. It should be easy to add a way to
explicitly request this handling for all entries.

Cheers,

Tim

From owner-freebsd-arch@FreeBSD.ORG Fri Dec 10 15:52:10 2010
From: John Baldwin
To: arch@freebsd.org
Date: Fri, 10 Dec 2010 10:50:45 -0500
Message-Id: <201012101050.45214.jhb@freebsd.org>
Subject: Realtime thread priorities

So I finally had a case today where I wanted to use rtprio but it doesn't seem
very useful in its current state. Specifically, I want to be able to tag
certain user processes as being more important than any other user processes
even to the point that if one of my important processes blocks on a mutex, the
owner of that mutex should be more important than sshd being woken up from
sbwait by new data (for example). This doesn't work currently with rtprio due
to the way the priorities are laid out (and I believe I probably argued for
the current layout back when it was proposed).

The current layout breaks up the global thread priority space (0 - 255) into a
couple of bands:

  0 -  63 : interrupt threads
 64 - 127 : kernel sleep priorities (PSOCK, etc.)
128 - 159 : real-time user threads (rtprio)
160 - 223 : time-sharing user threads
224 - 255 : idle threads (idprio and kernel idle procs)

The problem I am running into is that when a time-sharing thread goes to sleep
in the kernel (waiting on select, socket data, tty, etc.) it actually ends up
in the kernel priorities range (64 - 127). This means when it wakes up it
will trump (and preempt) a real-time user thread even though these processes
nominally have a priority down in the 160 - 223 range. We do drop the kernel
sleep priority during userret(), but we don't recheck the scheduler queues to
see if we should preempt the thread during userret(), so it effectively runs
with the kernel sleep priority for the rest of the quantum while it is in
userland.

My first question is if this behavior is the desired behavior? Originally I
think I preferred the current layout because I thought a thread in the kernel
should always have priority so it can release locks, etc. However, priority
propagation should actually handle the case of some very important thread
needing a lock. In my use case today where I actually want to use rtprio I
think I want different behavior where the rtprio thread is more important than
the thread waking up with PSOCK, etc.

If we decide to change the behavior I see two possible fixes:

1) (easy) just move the real-time priority range above the kernel sleep
priority range

2) (harder) make sched_userret() check the run queue to see if it should
preempt when dropping the kernel sleep priority. I think bde@ has suggested
that we should do this for correctness previously (and I've had some old,
unfinished patches to do this in a branch in p4 for several years).

-- 
John Baldwin

From owner-freebsd-arch@FreeBSD.ORG Fri Dec 10 16:26:36 2010
Date: Fri, 10 Dec 2010 18:26:31 +0200
From: Kostik Belousov
To: John Baldwin
Message-ID:
 <20101210162631.GC33073@deviant.kiev.zoral.com.ua>
References: <201012101050.45214.jhb@freebsd.org>
In-Reply-To: <201012101050.45214.jhb@freebsd.org>
Cc: arch@freebsd.org
Subject: Re: Realtime thread priorities

On Fri, Dec 10, 2010 at 10:50:45AM -0500, John Baldwin wrote:
> So I finally had a case today where I wanted to use rtprio but it doesn't seem
> very useful in its current state. Specifically, I want to be able to tag
> certain user processes as being more important than any other user processes
> even to the point that if one of my important processes blocks on a mutex, the
> owner of that mutex should be more important than sshd being woken up from
> sbwait by new data (for example). This doesn't work currently with rtprio due
> to the way the priorities are laid out (and I believe I probably argued for
> the current layout back when it was proposed).
>
> The current layout breaks up the global thread priority space (0 - 255) into a
> couple of bands:
>
>   0 -  63 : interrupt threads
>  64 - 127 : kernel sleep priorities (PSOCK, etc.)
> 128 - 159 : real-time user threads (rtprio)
> 160 - 223 : time-sharing user threads
> 224 - 255 : idle threads (idprio and kernel idle procs)
>
> The problem I am running into is that when a time-sharing thread goes to sleep
> in the kernel (waiting on select, socket data, tty, etc.) it actually ends up
> in the kernel priorities range (64 - 127). This means when it wakes up it
> will trump (and preempt) a real-time user thread even though these processes
> nominally have a priority down in the 160 - 223 range. We do drop the kernel
> sleep priority during userret(), but we don't recheck the scheduler queues to
> see if we should preempt the thread during userret(), so it effectively runs
> with the kernel sleep priority for the rest of the quantum while it is in
> userland.
>
> My first question is if this behavior is the desired behavior? Originally I
> think I preferred the current layout because I thought a thread in the kernel
> should always have priority so it can release locks, etc. However, priority
> propagation should actually handle the case of some very important thread
> needing a lock. In my use case today where I actually want to use rtprio I
> think I want different behavior where the rtprio thread is more important than
> the thread waking up with PSOCK, etc.
>
> If we decide to change the behavior I see two possible fixes:
>
> 1) (easy) just move the real-time priority range above the kernel sleep
> priority range
>
> 2) (harder) make sched_userret() check the run queue to see if it should
> preempt when dropping the kernel sleep priority. I think bde@ has suggested
> that we should do this for correctness previously (and I've had some old,
> unfinished patches to do this in a branch in p4 for several years).

Would not doing #2 allow e.g. two threads that perform ping-pong with
a single byte read/write into a socket to usurp the CPU? The threads
could try to also do some CPU-intensive calculations for some time
during the quantum too.

Such threads are arguably "interactive", but I think that the gain in
priority is too unfair.

From owner-freebsd-arch@FreeBSD.ORG Fri Dec 10 16:34:05 2010
From: John Baldwin
To: Kostik Belousov
Date: Fri, 10 Dec 2010 11:33:55 -0500
References: <201012101050.45214.jhb@freebsd.org> <20101210162631.GC33073@deviant.kiev.zoral.com.ua>
In-Reply-To: <20101210162631.GC33073@deviant.kiev.zoral.com.ua>
Message-Id: <201012101133.55389.jhb@freebsd.org>
Cc: arch@freebsd.org
Subject: Re: Realtime thread priorities

On Friday, December 10, 2010 11:26:31 am Kostik Belousov wrote:
> On Fri, Dec 10, 2010 at 10:50:45AM -0500, John Baldwin wrote:
> > So I finally had a case today where I wanted to use rtprio but it doesn't seem
> > very useful in its current state. Specifically, I want to be able to tag
> > certain user processes as being more important than any other user processes
> > even to the point that if one of my important processes blocks on a mutex, the
> > owner of that mutex should be more important than sshd being woken up from
> > sbwait by new data (for example). This doesn't work currently with rtprio due
> > to the way the priorities are laid out (and I believe I probably argued for
> > the current layout back when it was proposed).
> >
> > The current layout breaks up the global thread priority space (0 - 255) into a
> > couple of bands:
> >
> >   0 -  63 : interrupt threads
> >  64 - 127 : kernel sleep priorities (PSOCK, etc.)
> > 128 - 159 : real-time user threads (rtprio)
> > 160 - 223 : time-sharing user threads
> > 224 - 255 : idle threads (idprio and kernel idle procs)
> >
> > The problem I am running into is that when a time-sharing thread goes to sleep
> > in the kernel (waiting on select, socket data, tty, etc.) it actually ends up
> > in the kernel priorities range (64 - 127). This means when it wakes up it
> > will trump (and preempt) a real-time user thread even though these processes
> > nominally have a priority down in the 160 - 223 range. We do drop the kernel
> > sleep priority during userret(), but we don't recheck the scheduler queues to
> > see if we should preempt the thread during userret(), so it effectively runs
> > with the kernel sleep priority for the rest of the quantum while it is in
> > userland.
> >
> > My first question is if this behavior is the desired behavior? Originally I
> > think I preferred the current layout because I thought a thread in the kernel
> > should always have priority so it can release locks, etc. However, priority
> > propagation should actually handle the case of some very important thread
> > needing a lock. In my use case today where I actually want to use rtprio I
> > think I want different behavior where the rtprio thread is more important than
> > the thread waking up with PSOCK, etc.
> >
> > If we decide to change the behavior I see two possible fixes:
> >
> > 1) (easy) just move the real-time priority range above the kernel sleep
> > priority range
> >
> > 2) (harder) make sched_userret() check the run queue to see if it should
> > preempt when dropping the kernel sleep priority. I think bde@ has suggested
> > that we should do this for correctness previously (and I've had some old,
> > unfinished patches to do this in a branch in p4 for several years).
>
> Would not doing #2 allow e.g. two threads that perform ping-pong with
> a single byte read/write into a socket to usurp the CPU? The threads
> could try to also do some CPU-intensive calculations for some time
> during the quantum too.
>
> Such threads are arguably "interactive", but I think that the gain in
> priority is too unfair.

Err, I think that what you describe is the current case and is what #2 would
seek to change.

-- 
John Baldwin

From owner-freebsd-arch@FreeBSD.ORG Fri Dec 10 17:27:22 2010
In-Reply-To: <201012101050.45214.jhb@freebsd.org>
References: <201012101050.45214.jhb@freebsd.org>
Date: Fri, 10 Dec 2010 09:02:53 -0800
From: Matthew Fleming
To: John Baldwin
Cc: arch@freebsd.org
Subject: Re: Realtime thread priorities

On Fri, Dec 10, 2010 at 7:50 AM, John Baldwin wrote:
> So I finally had a case today where I wanted to use rtprio but it doesn't seem
> very useful in its current state. Specifically, I want to be able to tag
> certain user processes as being more important than any other user processes
> even to the point that if one of my important processes blocks on a mutex, the
> owner of that mutex should be more important than sshd being woken up from
> sbwait by new data (for example). This doesn't work currently with rtprio due
> to the way the priorities are laid out (and I believe I probably argued for
> the current layout back when it was proposed).
>
> The current layout breaks up the global thread priority space (0 - 255) into a
> couple of bands:
>
>   0 -  63 : interrupt threads
>  64 - 127 : kernel sleep priorities (PSOCK, etc.)
> 128 - 159 : real-time user threads (rtprio)
> 160 - 223 : time-sharing user threads
> 224 - 255 : idle threads (idprio and kernel idle procs)
>
> The problem I am running into is that when a time-sharing thread goes to sleep
> in the kernel (waiting on select, socket data, tty, etc.) it actually ends up
> in the kernel priorities range (64 - 127). This means when it wakes up it
> will trump (and preempt) a real-time user thread even though these processes
> nominally have a priority down in the 160 - 223 range. We do drop the kernel
> sleep priority during userret(), but we don't recheck the scheduler queues to
> see if we should preempt the thread during userret(), so it effectively runs
> with the kernel sleep priority for the rest of the quantum while it is in
> userland.
>
> My first question is if this behavior is the desired behavior? Originally I
> think I preferred the current layout because I thought a thread in the kernel
> should always have priority so it can release locks, etc. However, priority
> propagation should actually handle the case of some very important thread
> needing a lock. In my use case today where I actually want to use rtprio I
> think I want different behavior where the rtprio thread is more important than
> the thread waking up with PSOCK, etc.
>
> If we decide to change the behavior I see two possible fixes:
>
> 1) (easy) just move the real-time priority range above the kernel sleep
> priority range
>
> 2) (harder) make sched_userret() check the run queue to see if it should
> preempt when dropping the kernel sleep priority. I think bde@ has suggested
> that we should do this for correctness previously (and I've had some old,
> unfinished patches to do this in a branch in p4 for several years).

As a note on what other operating systems do, AIX does not have any
notion of user-level priorities versus kernel level. Real-time
threads have the highest prio (and they are scheduled in a round-robin
fashion), and everyone else is managed based on scheduling the thread
that's runnable with the highest priority. The prios are adjusted
regularly based on interactivity so that a CPU hog, unless explicitly
nice'd or marked SCHED_RR, will get its prio marked down, and down,
and down again as it continues to hog the CPU, while other threads
that haven't been running will get their prio bumped up.

Unfortunately, I only know the architectural detail I just mentioned.
I do not have much knowledge of the 8000 lines of code that
implemented this. :-)

This is all just to say that it's perfectly possible to get a working
OS without fixed priority bands.
Cheers, matthew From owner-freebsd-arch@FreeBSD.ORG Fri Dec 10 21:41:54 2010 Return-Path: Delivered-To: arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 43D2B1065674 for ; Fri, 10 Dec 2010 21:41:54 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from cyrus.watson.org (cyrus.watson.org [65.122.17.42]) by mx1.freebsd.org (Postfix) with ESMTP id 0524D8FC08 for ; Fri, 10 Dec 2010 21:41:54 +0000 (UTC) Received: from bigwig.baldwin.cx (66.111.2.69.static.nyinternet.net [66.111.2.69]) by cyrus.watson.org (Postfix) with ESMTPSA id 7E49E46B29; Fri, 10 Dec 2010 16:41:53 -0500 (EST) Received: from jhbbsd.localnet (smtp.hudson-trading.com [209.249.190.9]) by bigwig.baldwin.cx (Postfix) with ESMTPSA id 3340D8A01D; Fri, 10 Dec 2010 16:41:52 -0500 (EST) From: John Baldwin To: Kostik Belousov Date: Fri, 10 Dec 2010 16:41:51 -0500 User-Agent: KMail/1.13.5 (FreeBSD/7.3-CBSD-20101102; KDE/4.4.5; amd64; ; ) References: <201012101050.45214.jhb@freebsd.org> <201012101133.55389.jhb@freebsd.org> <20101210195716.GE33073@deviant.kiev.zoral.com.ua> In-Reply-To: <20101210195716.GE33073@deviant.kiev.zoral.com.ua> MIME-Version: 1.0 Content-Type: Text/Plain; charset="iso-8859-15" Content-Transfer-Encoding: 7bit Message-Id: <201012101641.51652.jhb@freebsd.org> X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.6 (bigwig.baldwin.cx); Fri, 10 Dec 2010 16:41:52 -0500 (EST) X-Virus-Scanned: clamav-milter 0.96.3 at bigwig.baldwin.cx X-Virus-Status: Clean X-Spam-Status: No, score=-1.9 required=4.2 tests=BAYES_00 autolearn=ham version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on bigwig.baldwin.cx Cc: arch@freebsd.org Subject: Re: Realtime thread priorities X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , 
X-List-Received-Date: Fri, 10 Dec 2010 21:41:54 -0000 On Friday, December 10, 2010 2:57:16 pm Kostik Belousov wrote: > On Fri, Dec 10, 2010 at 11:33:55AM -0500, John Baldwin wrote: > > On Friday, December 10, 2010 11:26:31 am Kostik Belousov wrote: > > > On Fri, Dec 10, 2010 at 10:50:45AM -0500, John Baldwin wrote: > > > > So I finally had a case today where I wanted to use rtprio but it doesn't seem > > > > very useful in its current state. Specifically, I want to be able to tag > > > > certain user processes as being more important than any other user processes > > > > even to the point that if one of my important processes blocks on a mutex, the > > > > owner of that mutex should be more important than sshd being woken up from > > > > sbwait by new data (for example). This doesn't work currently with rtprio due > > > > to the way the priorities are laid out (and I believe I probably argued for > > > > the current layout back when it was proposed). > > > > > > > > The current layout breaks up the global thread priority space (0 - 255) into a > > > > couple of bands: > > > > > > > > 0 - 63 : interrupt threads > > > > 64 - 127 : kernel sleep priorities (PSOCK, etc.) > > > > 128 - 159 : real-time user threads (rtprio) > > > > 160 - 223 : time-sharing user threads > > > > 224 - 255 : idle threads (idprio and kernel idle procs) > > > > > > > > The problem I am running into is that when a time-sharing thread goes to sleep > > > > in the kernel (waiting on select, socket data, tty, etc.) it actually ends up > > > > in the kernel priorities range (64 - 127). This means when it wakes up it > > > > will trump (and preempt) a real-time user thread even though these processes > > > > nominally have a priority down in the 160 - 223 range. 
We do drop the kernel > > > > sleep priority during userret(), but we don't recheck the scheduler queues to > > > > see if we should preempt the thread during userret(), so it effectively runs > > > > with the kernel sleep priority for the rest of the quantum while it is in > > > > userland. > > > > > > > > My first question is if this behavior is the desired behavior? Originally I > > > > think I preferred the current layout because I thought a thread in the kernel > > > > should always have priority so it can release locks, etc. However, priority > > > > propagation should actually handle the case of some very important thread > > > > needing a lock. In my use case today where I actually want to use rtprio I > > > > think I want different behavior where the rtprio thread is more important than > > > > the thread waking up with PSOCK, etc. > > > > > > > > If we decide to change the behavior I see two possible fixes: > > > > > > > > 1) (easy) just move the real-time priority range above the kernel sleep > > > > priority range > > > > > > > > 2) (harder) make sched_userret() check the run queue to see if it should > > > > preempt when dropping the kernel sleep priority. I think bde@ has suggested > > > > that we should do this for correctness previously (and I've had some old, > > > > unfinished patches to do this in a branch in p4 for several years). > > > > > > Would not doing #2 allow e.g. two threads that perform ping-pong with > > > a single byte read/write into a socket to usurp the CPU ? The threads > > > could try to also do some CPU-intensive calculations for some time > > > during the quantum too. > > > > > > Such threads are arguably "interactive", but I think that the gain is > > > priority is too unfair. > > > > Err, I think that what you describe is the current case and is what #2 would > > seek to change. > > Sorry, might be my language was not clear, but I said "Would not doing > #2 allow ...", i.e. 
I specifically mean that we shall do #2 to avoid the > situation I described. Ah, yes, it does allow that. As bde@ said though, the overhead of extra context switches in the common case might not be worth it. I have a possible patch for 1), but it involves fixing a few places and has only been compile-tested so far (I will run-test it soon). I also think that in my case I almost always want 1) anyway (my realtime processes are always more important than sshd, even while sshd is in the kernel): Index: kern/kern_synch.c =================================================================== --- kern/kern_synch.c (revision 215592) +++ kern/kern_synch.c (working copy) @@ -214,7 +214,8 @@ * Adjust this thread's priority, if necessary. */ pri = priority & PRIMASK; - if (pri != 0 && pri != td->td_priority) { + if (pri != 0 && pri != td->td_priority && + td->td_pri_class == PRI_TIMESHARE) { thread_lock(td); sched_prio(td, pri); thread_unlock(td); @@ -552,7 +553,8 @@ { thread_lock(td); - sched_prio(td, PRI_MAX_TIMESHARE); + if (td->td_pri_class == PRI_TIMESHARE) + sched_prio(td, PRI_MAX_TIMESHARE); mi_switch(SW_VOL, NULL); thread_unlock(td); td->td_retval[0] = 0; Index: kern/subr_sleepqueue.c =================================================================== --- kern/subr_sleepqueue.c (revision 215592) +++ kern/subr_sleepqueue.c (working copy) @@ -693,7 +720,8 @@ /* Adjust priority if requested. */ MPASS(pri == -1 || (pri >= PRI_MIN && pri <= PRI_MAX)); - if (pri != -1 && td->td_priority > pri) + if (pri != -1 && td->td_priority > pri && + td->td_pri_class == PRI_TIMESHARE) sched_prio(td, pri); return (setrunnable(td)); } Index: sys/priority.h =================================================================== --- sys/priority.h (revision 215592) +++ sys/priority.h (working copy) @@ -68,8 +68,8 @@ * are insignificant.
Ranges are as follows: * * Interrupt threads: 0 - 63 - * Top half kernel threads: 64 - 127 - * Realtime user threads: 128 - 159 + * Realtime user threads: 64 - 95 + * Top half kernel threads: 96 - 159 * Time sharing user threads: 160 - 223 * Idle user threads: 224 - 255 * @@ -81,7 +81,7 @@ #define PRI_MAX (255) /* Lowest priority. */ #define PRI_MIN_ITHD (PRI_MIN) -#define PRI_MAX_ITHD (PRI_MIN_KERN - 1) +#define PRI_MAX_ITHD (PRI_MIN_REALTIME - 1) #define PI_REALTIME (PRI_MIN_ITHD + 0) #define PI_AV (PRI_MIN_ITHD + 4) @@ -94,9 +94,12 @@ #define PI_DULL (PRI_MIN_ITHD + 32) #define PI_SOFT (PRI_MIN_ITHD + 36) -#define PRI_MIN_KERN (64) -#define PRI_MAX_KERN (PRI_MIN_REALTIME - 1) +#define PRI_MIN_REALTIME (64) +#define PRI_MAX_REALTIME (PRI_MIN_KERN - 1) +#define PRI_MIN_KERN (96) +#define PRI_MAX_KERN (PRI_MIN_TIMESHARE - 1) + #define PSWP (PRI_MIN_KERN + 0) #define PVM (PRI_MIN_KERN + 4) #define PINOD (PRI_MIN_KERN + 8) @@ -109,9 +112,6 @@ #define PLOCK (PRI_MIN_KERN + 36) #define PPAUSE (PRI_MIN_KERN + 40) -#define PRI_MIN_REALTIME (128) -#define PRI_MAX_REALTIME (PRI_MIN_TIMESHARE - 1) - #define PRI_MIN_TIMESHARE (160) #define PRI_MAX_TIMESHARE (PRI_MIN_IDLE - 1) -- John Baldwin From owner-freebsd-arch@FreeBSD.ORG Sat Dec 11 01:51:35 2010 Return-Path: Delivered-To: arch@freebsd.org Received: from alona.my.domain (localhost [127.0.0.1]) by hub.freebsd.org (Postfix) with ESMTP id CC39B106564A; Sat, 11 Dec 2010 01:51:34 +0000 (UTC) (envelope-from davidxu@freebsd.org) Message-ID: <4D02D90C.20503@freebsd.org> Date: Sat, 11 Dec 2010 09:51:08 +0800 From: David Xu User-Agent: Thunderbird 2.0.0.21 (X11/20090522) MIME-Version: 1.0 To: John Baldwin References: <201012101050.45214.jhb@freebsd.org> In-Reply-To: <201012101050.45214.jhb@freebsd.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: arch@freebsd.org Subject: Re: Realtime thread priorities X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: 
list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 11 Dec 2010 01:51:35 -0000 John Baldwin wrote: > So I finally had a case today where I wanted to use rtprio but it doesn't seem > very useful in its current state. Specifically, I want to be able to tag > certain user processes as being more important than any other user processes > even to the point that if one of my important processes blocks on a mutex, the > owner of that mutex should be more important than sshd being woken up from > sbwait by new data (for example). This doesn't work currently with rtprio due > to the way the priorities are laid out (and I believe I probably argued for > the current layout back when it was proposed). > > The current layout breaks up the global thread priority space (0 - 255) into a > couple of bands: > > 0 - 63 : interrupt threads > 64 - 127 : kernel sleep priorities (PSOCK, etc.) > 128 - 159 : real-time user threads (rtprio) > 160 - 223 : time-sharing user threads > 224 - 255 : idle threads (idprio and kernel idle procs) > > The problem I am running into is that when a time-sharing thread goes to sleep > in the kernel (waiting on select, socket data, tty, etc.) it actually ends up > in the kernel priorities range (64 - 127). This means when it wakes up it > will trump (and preempt) a real-time user thread even though these processes > nominally have a priority down in the 160 - 223 range. We do drop the kernel > sleep priority during userret(), but we don't recheck the scheduler queues to > see if we should preempt the thread during userret(), so it effectively runs > with the kernel sleep priority for the rest of the quantum while it is in > userland. > > My first question is if this behavior is the desired behavior? Originally I > think I preferred the current layout because I thought a thread in the kernel > should always have priority so it can release locks, etc. 
However, priority > propagation should actually handle the case of some very important thread > needing a lock. In my use case today where I actually want to use rtprio I > think I want different behavior where the rtprio thread is more important than > the thread waking up with PSOCK, etc. > > If we decide to change the behavior I see two possible fixes: > > 1) (easy) just move the real-time priority range above the kernel sleep > priority range > > This is not always correct. A userland realtime process is not always more urgent than normal time-sharing code that is backing up a file system or doing other important work, for example receiving money-account data from a socket. A process sleeping in the kernel is usually doing something really important, for example pulling data from a device interrupt or writing to a device, while a realtime thread consuming 100% of the CPU might just be stuck in an infinite loop. > 2) (harder) make sched_userret() check the run queue to see if it should > preempt when dropping the kernel sleep priority. I think bde@ has suggested > that we should do this for correctness previously (and I've had some old, > unfinished patches to do this in a branch in p4 for several years). > > This adds too much overhead; try it and benchmark it with a real-world application.
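The band arithmetic behind fix 1) can be checked with a small stand-alone C sketch. This is only an illustration of the numbers quoted in the thread, not kernel code: the enum and function names are hypothetical, and the PSOCK offset of PRI_MIN_KERN + 24 is an assumption about sys/priority.h that the quoted diff does not show.

```c
/*
 * Two layouts of the global priority space discussed in the thread.
 * Lower numbers mean higher priority.  All names here are hypothetical.
 */
enum layout { LAYOUT_CURRENT, LAYOUT_PROPOSED };

/* First priority of the kernel sleep band (PRI_MIN_KERN in the diff). */
int min_kern(enum layout l) { return l == LAYOUT_CURRENT ? 64 : 96; }

/* First priority of the realtime band (PRI_MIN_REALTIME in the diff). */
int min_realtime(enum layout l) { return l == LAYOUT_CURRENT ? 128 : 64; }

/* Assumption: PSOCK is PRI_MIN_KERN + 24; the diff does not show this line. */
int psock_pri(enum layout l) { return min_kern(l) + 24; }

/* Nonzero if a thread at priority a is picked ahead of one at priority b. */
int runs_first(int a, int b) { return a < b; }
```

Under that assumption, a thread woken at PSOCK has priority 88 in the current layout, ahead of every rtprio level (128 - 159). In the proposed layout it has priority 120, behind every rtprio level (64 - 95).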
From owner-freebsd-arch@FreeBSD.ORG Sat Dec 11 06:15:03 2010 Return-Path: Delivered-To: arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 4740A1065670 for ; Sat, 11 Dec 2010 06:15:03 +0000 (UTC) (envelope-from julian@freebsd.org) Received: from out-0.mx.aerioconnect.net (out-0-31.mx.aerioconnect.net [216.240.47.91]) by mx1.freebsd.org (Postfix) with ESMTP id 1FA768FC1B for ; Sat, 11 Dec 2010 06:15:02 +0000 (UTC) Received: from idiom.com (postfix@mx0.idiom.com [216.240.32.160]) by out-0.mx.aerioconnect.net (8.13.8/8.13.8) with ESMTP id oBB5v1h7015195; Fri, 10 Dec 2010 21:57:02 -0800 X-Client-Authorized: MaGic Cook1e X-Client-Authorized: MaGic Cook1e X-Client-Authorized: MaGic Cook1e Received: from julian-mac.elischer.org (h-67-100-89-137.snfccasy.static.covad.net [67.100.89.137]) by idiom.com (Postfix) with ESMTP id D77652D6019; Fri, 10 Dec 2010 21:57:00 -0800 (PST) Message-ID: <4D0312AA.7010009@freebsd.org> Date: Fri, 10 Dec 2010 21:56:58 -0800 From: Julian Elischer User-Agent: Mozilla/5.0 (Macintosh; U; PPC Mac OS X 10.4; en-US; rv:1.9.2.12) Gecko/20101027 Thunderbird/3.1.6 MIME-Version: 1.0 To: John Baldwin References: <201012101050.45214.jhb@freebsd.org> <201012101133.55389.jhb@freebsd.org> <20101210195716.GE33073@deviant.kiev.zoral.com.ua> <201012101641.51652.jhb@freebsd.org> In-Reply-To: <201012101641.51652.jhb@freebsd.org> Content-Type: text/plain; charset=ISO-8859-15; format=flowed Content-Transfer-Encoding: 7bit X-Scanned-By: MIMEDefang 2.67 on 216.240.47.51 Cc: Kostik Belousov , arch@freebsd.org Subject: Re: Realtime thread priorities X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 11 Dec 2010 06:15:03 -0000 On 12/10/10 1:41 PM, John Baldwin wrote: > On Friday, December 10, 2010 2:57:16 pm 
Kostik Belousov wrote: >> On Fri, Dec 10, 2010 at 11:33:55AM -0500, John Baldwin wrote: >>> On Friday, December 10, 2010 11:26:31 am Kostik Belousov wrote: >>>> On Fri, Dec 10, 2010 at 10:50:45AM -0500, John Baldwin wrote: >>>>> So I finally had a case today where I wanted to use rtprio but it doesn't seem >>>>> very useful in its current state. Specifically, I want to be able to tag >>>>> certain user processes as being more important than any other user processes >>>>> even to the point that if one of my important processes blocks on a mutex, the >>>>> owner of that mutex should be more important than sshd being woken up from >>>>> sbwait by new data (for example). This doesn't work currently with rtprio due >>>>> to the way the priorities are laid out (and I believe I probably argued for >>>>> the current layout back when it was proposed). >>>>> >>>>> The current layout breaks up the global thread priority space (0 - 255) into a >>>>> couple of bands: >>>>> >>>>> 0 - 63 : interrupt threads >>>>> 64 - 127 : kernel sleep priorities (PSOCK, etc.) >>>>> 128 - 159 : real-time user threads (rtprio) >>>>> 160 - 223 : time-sharing user threads >>>>> 224 - 255 : idle threads (idprio and kernel idle procs) >>>>> >>>>> The problem I am running into is that when a time-sharing thread goes to sleep >>>>> in the kernel (waiting on select, socket data, tty, etc.) it actually ends up >>>>> in the kernel priorities range (64 - 127). This means when it wakes up it >>>>> will trump (and preempt) a real-time user thread even though these processes >>>>> nominally have a priority down in the 160 - 223 range. We do drop the kernel >>>>> sleep priority during userret(), but we don't recheck the scheduler queues to >>>>> see if we should preempt the thread during userret(), so it effectively runs >>>>> with the kernel sleep priority for the rest of the quantum while it is in >>>>> userland. >>>>> >>>>> My first question is if this behavior is the desired behavior? 
Originally I >>>>> think I preferred the current layout because I thought a thread in the kernel >>>>> should always have priority so it can release locks, etc. However, priority >>>>> propagation should actually handle the case of some very important thread >>>>> needing a lock. In my use case today where I actually want to use rtprio I >>>>> think I want different behavior where the rtprio thread is more important than >>>>> the thread waking up with PSOCK, etc. >>>>> >>>>> If we decide to change the behavior I see two possible fixes: >>>>> >>>>> 1) (easy) just move the real-time priority range above the kernel sleep >>>>> priority range >>>>> >>>>> 2) (harder) make sched_userret() check the run queue to see if it should >>>>> preempt when dropping the kernel sleep priority. I think bde@ has suggested >>>>> that we should do this for correctness previously (and I've had some old, >>>>> unfinished patches to do this in a branch in p4 for several years). If you think about how RT scheduling works when an RT shim is placed under an OS, it becomes obvious that all RT threads trump all TS threads, kernel or not. Basically, there is a separate RT scheduler that schedules all RT threads, and the non-RT (TS) scheduler only runs when there is spare time. TS threads are only ever scheduled by the RT scheduler when they own some resource needed by an RT thread. >>>> Would not doing #2 allow e.g. two threads that perform ping-pong with >>>> a single byte read/write into a socket to usurp the CPU ? The threads >>>> could try to also do some CPU-intensive calculations for some time >>>> during the quantum too. >>>> >>>> Such threads are arguably "interactive", but I think that the gain is >>>> priority is too unfair. The aim of RT is to be unfair (to TS threads). >>> Err, I think that what you describe is the current case and is what #2 would >>> seek to change.
>> Sorry, might be my language was not clear, but I said "Would not doing >> #2 allow ...", i.e. I specifically mean that we shall do #2 to avoid the >> situation I described. > Ah, yes, it does allow that. As bde@ said though, the overhead of extra > context switches in the common case might not be worth it. > > I have a possible patch for 1), but it involves fixing a few places and is > only compile tested yet (will run test it soon). I also think that in my > case I almost always want 1) anyway (my realtime processes are always more > important than sshd, even while sshd is in the kernel): > > Index: kern/kern_synch.c > =================================================================== > --- kern/kern_synch.c (revision 215592) > +++ kern/kern_synch.c (working copy) > @@ -214,7 +214,8 @@ > * Adjust this thread's priority, if necessary. > */ > pri = priority& PRIMASK; > - if (pri != 0&& pri != td->td_priority) { > + if (pri != 0&& pri != td->td_priority&& > + td->td_pri_class == PRI_TIMESHARE) { > thread_lock(td); > sched_prio(td, pri); > thread_unlock(td); > @@ -552,7 +553,8 @@ > { > > thread_lock(td); > - sched_prio(td, PRI_MAX_TIMESHARE); > + if (td->td_pri_class == PRI_TIMESHARE) > + sched_prio(td, PRI_MAX_TIMESHARE); > mi_switch(SW_VOL, NULL); > thread_unlock(td); > td->td_retval[0] = 0; > Index: kern/subr_sleepqueue.c > =================================================================== > --- kern/subr_sleepqueue.c (revision 215592) > +++ kern/subr_sleepqueue.c (working copy) > @@ -693,7 +720,8 @@ > > /* Adjust priority if requested. 
*/ > MPASS(pri == -1 || (pri >= PRI_MIN && pri <= PRI_MAX)); > - if (pri != -1 && td->td_priority > pri) > + if (pri != -1 && td->td_priority > pri && > + td->td_pri_class == PRI_TIMESHARE) > sched_prio(td, pri); > return (setrunnable(td)); > } > Index: sys/priority.h > =================================================================== > --- sys/priority.h (revision 215592) > +++ sys/priority.h (working copy) > @@ -68,8 +68,8 @@ > * are insignificant. Ranges are as follows: > * > * Interrupt threads: 0 - 63 > - * Top half kernel threads: 64 - 127 > - * Realtime user threads: 128 - 159 > + * Realtime user threads: 64 - 95 > + * Top half kernel threads: 96 - 159 > * Time sharing user threads: 160 - 223 > * Idle user threads: 224 - 255 > * > @@ -81,7 +81,7 @@ > #define PRI_MAX (255) /* Lowest priority. */ > > #define PRI_MIN_ITHD (PRI_MIN) > -#define PRI_MAX_ITHD (PRI_MIN_KERN - 1) > +#define PRI_MAX_ITHD (PRI_MIN_REALTIME - 1) > > #define PI_REALTIME (PRI_MIN_ITHD + 0) > #define PI_AV (PRI_MIN_ITHD + 4) > @@ -94,9 +94,12 @@ > #define PI_DULL (PRI_MIN_ITHD + 32) > #define PI_SOFT (PRI_MIN_ITHD + 36) > > -#define PRI_MIN_KERN (64) > -#define PRI_MAX_KERN (PRI_MIN_REALTIME - 1) > +#define PRI_MIN_REALTIME (64) > +#define PRI_MAX_REALTIME (PRI_MIN_KERN - 1) > > +#define PRI_MIN_KERN (96) > +#define PRI_MAX_KERN (PRI_MIN_TIMESHARE - 1) > + > #define PSWP (PRI_MIN_KERN + 0) > #define PVM (PRI_MIN_KERN + 4) > #define PINOD (PRI_MIN_KERN + 8) > @@ -109,9 +112,6 @@ > #define PLOCK (PRI_MIN_KERN + 36) > #define PPAUSE (PRI_MIN_KERN + 40) > > -#define PRI_MIN_REALTIME (128) > -#define PRI_MAX_REALTIME (PRI_MIN_TIMESHARE - 1) > - > #define PRI_MIN_TIMESHARE (160) > #define PRI_MAX_TIMESHARE (PRI_MIN_IDLE - 1) > > From owner-freebsd-arch@FreeBSD.ORG Sat Dec 11 21:39:23 2010 Return-Path: Delivered-To: arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id A4A171065674 for ; Sat, 11 Dec 2010 21:39:23 +0000
(UTC) (envelope-from peterjeremy@acm.org) Received: from mail12.syd.optusnet.com.au (mail12.syd.optusnet.com.au [211.29.132.193]) by mx1.freebsd.org (Postfix) with ESMTP id 2BD148FC08 for ; Sat, 11 Dec 2010 21:39:22 +0000 (UTC) Received: from server.vk2pj.dyndns.org (c220-239-116-103.belrs4.nsw.optusnet.com.au [220.239.116.103]) by mail12.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id oBBLdKC6021815 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Sun, 12 Dec 2010 08:39:21 +1100 X-Bogosity: Ham, spamicity=0.000000 Received: from server.vk2pj.dyndns.org (localhost.vk2pj.dyndns.org [127.0.0.1]) by server.vk2pj.dyndns.org (8.14.4/8.14.4) with ESMTP id oBBLdIuA086414; Sun, 12 Dec 2010 08:39:18 +1100 (EST) (envelope-from peter@server.vk2pj.dyndns.org) Received: (from peter@localhost) by server.vk2pj.dyndns.org (8.14.4/8.14.4/Submit) id oBBLdI4D086413; Sun, 12 Dec 2010 08:39:18 +1100 (EST) (envelope-from peter) Date: Sun, 12 Dec 2010 08:39:18 +1100 From: Peter Jeremy To: John Baldwin Message-ID: <20101211213918.GB21959@server.vk2pj.dyndns.org> References: <201012101050.45214.jhb@freebsd.org> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="ZPt4rx8FFjLCG7dd" Content-Disposition: inline In-Reply-To: <201012101050.45214.jhb@freebsd.org> X-PGP-Key: http://members.optusnet.com.au/peterjeremy/pubkey.asc User-Agent: Mutt/1.5.20 (2009-06-14) Cc: arch@freebsd.org Subject: Re: Realtime thread priorities X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 11 Dec 2010 21:39:23 -0000 --ZPt4rx8FFjLCG7dd Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On 2010-Dec-10 10:50:45 -0500, John Baldwin wrote: >The problem I am running into is that when a 
time-sharing thread goes to sleep >in the kernel (waiting on select, socket data, tty, etc.) it actually ends up >in the kernel priorities range (64 - 127). This means when it wakes up it >will trump (and preempt) a real-time user thread even though these processes >nominally have a priority down in the 160 - 223 range. We do drop the kernel >sleep priority during userret(), but we don't recheck the scheduler queues to >see if we should preempt the thread during userret(), so it effectively runs >with the kernel sleep priority for the rest of the quantum while it is in >userland. This may also explain the situation I'm seeing where idprio processes are receiving more than "idle" time (see "idprio processes slowing down system" in -hackers). >My first question is if this behavior is the desired behavior? Originally I >think I preferred the current layout because I thought a thread in the kernel >should always have priority so it can release locks, etc. I suspect it was intended as a solution to priority inversion issues. >1) (easy) just move the real-time priority range above the kernel sleep >priority range This won't affect the associated issue of idprio processes "preempting" timesharing processes. >2) (harder) make sched_userret() check the run queue to see if it should >preempt when dropping the kernel sleep priority. IMHO, this is the "correct" solution but that needs to be tempered by the additional overhead this might incur. -- Peter Jeremy --ZPt4rx8FFjLCG7dd Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.15 (FreeBSD) iEYEARECAAYFAk0D74YACgkQ/opHv/APuIcROACeNr2ajW+rBdeMeZ+kVJAJoftB xtMAnAv9ZBzEYKvf/r67onBgf/dNZhGl =6FHd -----END PGP SIGNATURE----- --ZPt4rx8FFjLCG7dd--
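The decision that fix 2) asks sched_userret() to make can be modelled in a few lines of C. This is a rough sketch under stated assumptions, not the p4 patches mentioned in the thread (which are not shown): the struct, its fields, the function name, and the convention of passing -1 for an empty run queue are all hypothetical.

```c
/* Hypothetical model of a thread on its way back to user mode. */
struct thread_model {
    int cur_pri;   /* priority it currently runs at (lower = more important) */
    int user_pri;  /* its nominal user priority, e.g. 160 - 223 for time-sharing */
};

/*
 * Drop the kernel sleep priority and report whether the thread should
 * yield.  best_runq_pri is the numerically lowest priority among runnable
 * threads, or -1 (a convention of this sketch) when the run queue is empty.
 * Returns 1 if a runnable thread now outranks the returning thread.
 */
int userret_should_preempt(struct thread_model *td, int best_runq_pri)
{
    td->cur_pri = td->user_pri;          /* back to the nominal user priority */
    if (best_runq_pri < 0)
        return 0;                        /* nothing else wants the CPU */
    return best_runq_pri < td->cur_pri;  /* someone more important is waiting */
}
```

The extra comparison runs on every return to user mode, and each positive result costs a context switch. That is the overhead the replies in the thread suggest benchmarking before adopting this fix.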