From owner-freebsd-i386@FreeBSD.ORG Sun Oct 31 12:15:57 2004
Date: Sun, 31 Oct 2004 23:15:43 +1100 (EST)
From: Bruce Evans
To: "J. Porter Clark"
cc: freebsd-gnats-submit@FreeBSD.org
cc: freebsd-i386@FreeBSD.org
Subject: Re: i386/73328: top shows NICE as -111 on processes started by idprio
In-Reply-To: <200410302307.i9UN7Cg0045288@www.freebsd.org>
Message-ID: <20041031223051.R15841@delplex.bde.org>
References: <200410302307.i9UN7Cg0045288@www.freebsd.org>

On Sat, 30 Oct 2004, J. Porter Clark wrote:

> >Description:
> The "top" program shows a NICE value of -111 for programs started by
> idprio 31.  On 4.X, the "correct" value of 52 is shown.
> >How-To-Repeat:
> As root, run "idprio 31 sleep 500 &".
> Then run "top -Uroot".  The sleep process is listed as having a NICE value
> of -111.  Other values besides 31 produce the same result.
> >Fix:
> My guess is that it's probably in /usr/src/usr.bin/top/machine.c about line 747 or so, but I haven't had time to dig into it.

I use this fix.
It may be out of date, and the comments about the "base" priority
are too verbose and not quite right.

%%%
Index: machine.c
===================================================================
RCS file: /home/ncvs/src/usr.bin/top/machine.c,v
retrieving revision 1.51
diff -u -2 -r1.51 machine.c
--- machine.c   6 Jun 2004 19:59:06 -0000       1.51
+++ machine.c   7 Jun 2004 04:37:20 -0000
@@ -573,18 +573,67 @@
             smpmode ? smp_Proc_format : up_Proc_format,
             pp->ki_pid,
-            namelength, namelength,
-            (*get_userid)(pp->ki_ruid),
-            pp->ki_pri.pri_level - PZERO,
-
-            /*
-             * normal time      -> nice value -20 - +20
-             * real time 0 - 31 -> nice value -52 - -21
-             * idle time 0 - 31 -> nice value +21 - +52
+            namelength, namelength, (*get_userid)(pp->ki_ruid),
+            pp->ki_pri.pri_level - PZERO,
+            /*-
+             * Mapping from various disorganized priority schemes to ordered
+             * pseudo-nice values:
+             *
+             * interrupt thread         base pri  0 - 63   -> nice -180 - -117
+             * top half kernel thread   base pri 64 - 127  -> nice -116 - -53
+             * realtime user threads    rtprio    0 - 31   -> nice  -52 - -21
+             * normal user threads      nice    -20 - +20  -> nice  -20 - +20
+             * idle user threads        idprio    0 - 31   -> nice  +21 - +52
+             *
+             * The number of interest is really the "base" priority of the
+             * process, not the niceness of the process directly.  The base
+             * priority should be what is is td->td_base_pri in the kernel,
+             * which is ki_pri.pri_native here.  In practice, that can't
+             * be used directly and the workarounds are complicated because
+             * of the following bugs:
+             * o td->td_base_pri is changed by priority propagation and
+             *   not even restored.  Thus it cannot be used to determine
+             *   the priority class.  The other priorities in k_pri can
+             *   be used for this, but they are set inconsistently too so
+             *   there is no one place that determines the correct base
+             *   priority.
+             * o td->td_base_pri is not set to a useful value for normal
+             *   user threads.  It is initialized to 0 and only changed
+             *   by priority propagation.  Workaround: use the actual
+             *   nice value for the "base priority" of normal user
+             *   threads.
+             * o kg->kg_user_pri (pri_user here) is not set to a useful
+             *   value for kernel threads.  It is initialized to PUSER
+             *   and never changed.  Something like it should be used
+             *   for all classes of threads to hold the previous priority
+             *   during priority propagation.  Then there might not need
+             *   to be a special variable for the user -> kernel
+             *   transitions (which are a type of priority propagation).
+             *   I think a stack of such variables is needed in general
+             *   though -- kg->kg_user_pri is special because it is at
+             *   the top.
+             *
+             * We scale the base priority so that it agrees with the
+             * historical nice value for normal user threads, although this
+             * gives negative numbers for higher priority threads.
+             *
+             * PRI_BASE() strips the fifo scheduling bit from the priority
+             * class.  This is not relevant for the conversion to niceness,
+             * but it should be shown somewhere other as a raw number in
+             * an abnormal ps format.  We don't use PRI_IS_REALTIME()
+             * because there is no corresponding classification macro for
+             * non-realtime priority classes and the details are too
+             * messy to be hidden in macros.
+             *
+             * KNF indent -ci4 is intentionally violated here.
              */
-            (pp->ki_pri.pri_class == PRI_TIMESHARE ?
-                pp->ki_nice - NZERO :
-                (PRI_IS_REALTIME(pp->ki_pri.pri_class) ?
-                    (PRIO_MIN - 1 - (PRI_MAX_REALTIME - pp->ki_pri.pri_level)) :
-                    (PRIO_MAX + 1 + pp->ki_pri.pri_level - PRI_MIN_IDLE))),
+            PRI_BASE(pp->ki_pri.pri_class) == PRI_ITHD ?
+            PRIO_MIN + (pp->ki_pri.pri_native - PRI_MIN_TIMESHARE) :
+            PRI_BASE(pp->ki_pri.pri_class) == PRI_REALTIME ?
+            PRIO_MIN + (pp->ki_pri.pri_user - PRI_MIN_TIMESHARE) :
+            PRI_BASE(pp->ki_pri.pri_class) == PRI_TIMESHARE ?
+            pp->ki_nice - NZERO :
+            PRI_BASE(pp->ki_pri.pri_class) == PRI_IDLE ?
+            PRIO_MAX + 1 + (pp->ki_pri.pri_user - PRI_MIN_IDLE) :
+            666,
             format_k2(PROCSIZE(pp)),
             format_k2(pagetok(pp->ki_rssize)),
%%%

This area is broken in ps too.  The most obvious ones are:
- My ntpd process (which has realtime priority 0 and is correctly
  displayed by top as having "nice" -52) is displayed by `ps -o rtprio'
  as having priority "real:12".  The bogus 12 is just ntpd's current
  priority less PZERO.  This bug is the same as one of the ones fixed
  above.  It is that pri_level gives the current priority so it gives
  a wrong value to subtract from when the process is running at an
  elevated priority in kernel mode.  top and ps seem to get this wrong
  in RELENG_4 too.
- ps.1 says that `-o rtprio' causes a display of "101" for non-rtprio
  processes, but the actual display is "normal" for normal ones and
  a "%u.%u" format for unknown ones.  The man page became inconsistent
  with the code about "101" back in 1998 in rev.1.26 of ps/print.c.
  I think unknown cases occur for at least POSIX scheduling classes.

Bruce
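
For readers who want to check the arithmetic, here is a minimal standalone sketch of the pseudo-nice mapping described in the patch comment. It is not the code in top. Every SK_* name and the struct are hypothetical stand-ins, and the numeric values are assumptions chosen to reproduce the table in the comment (timeshare range starting at 160, idle range starting at 192, fifo bit 8); check them against <sys/priority.h> in your own tree before relying on them.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical stand-ins for the kernel constants.  The values are
 * assumptions picked to match the table in the patch comment.
 */
enum {
	SK_PRI_MIN_TIMESHARE = 160,	/* assumed start of timeshare range */
	SK_PRI_MIN_IDLE = 192,		/* assumed start of idle range */
	SK_PRIO_MIN = -20,
	SK_PRIO_MAX = 20,
	SK_NZERO = 0,
	SK_FIFO_BIT = 8,		/* assumed fifo scheduling bit */
	SK_UNKNOWN = 666		/* sentinel, as in the patch */
};

/* Assumed priority class numbers. */
enum sk_class { SK_ITHD = 1, SK_REALTIME = 2, SK_TIMESHARE = 3, SK_IDLE = 4 };

/* Simplified, hypothetical view of the fields that top reads. */
struct sk_pri {
	int	pri_class;	/* class, possibly with the fifo bit set */
	int	pri_native;	/* base priority */
	int	pri_user;	/* user priority */
	int	nice;		/* nice value for timeshare threads */
};

/* Map a priority record to an ordered pseudo-nice value. */
static int
sk_pseudo_nice(const struct sk_pri *p)
{
	assert(p != NULL);

	/* Strip the fifo bit before classifying, as PRI_BASE() does. */
	switch (p->pri_class & ~SK_FIFO_BIT) {
	case SK_ITHD:
		return (SK_PRIO_MIN + (p->pri_native - SK_PRI_MIN_TIMESHARE));
	case SK_REALTIME:
		return (SK_PRIO_MIN + (p->pri_user - SK_PRI_MIN_TIMESHARE));
	case SK_TIMESHARE:
		return (p->nice - SK_NZERO);
	case SK_IDLE:
		return (SK_PRIO_MAX + 1 + (p->pri_user - SK_PRI_MIN_IDLE));
	default:
		return (SK_UNKNOWN);
	}
}
```

Under these assumed values, an idprio 31 process (pri_user 223) maps to 52, the value the PR reports as correct on 4.X, and an rtprio 0 process (pri_user 128) maps to -52. The sketch uses pri_native and pri_user rather than the current priority, so a temporarily elevated kernel-mode priority does not change the result.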