Date: Fri, 28 Nov 2008 10:48:00 +0800
From: David Xu <davidxu@freebsd.org>
To: Jeff Roberson <jroberson@jroberson.net>
Cc: freebsd-bugs@freebsd.org, Unga <unga888@yahoo.com>, jhb@freebsd.org, jeff@freebsd.org, brde@optusnet.com.au, yanefbsd@gmail.com
Subject: Re: kern/129164: Wrong priority value for normal processes
Message-ID: <492F5BE0.2040200@freebsd.org>
In-Reply-To: <20081127000235.D971@desktop>
References: <484904.78100.qm@web57001.mail.re3.yahoo.com> <20081127000235.D971@desktop>

Jeff Roberson wrote:
> On Thu, 27 Nov 2008, Unga wrote:
>
>> --- On Tue, 11/25/08, Unga <unga888@yahoo.com> wrote:
>>
>>> The priority value for root and other normal processes is
>>> 65504 (rtp.prio) where zero (0) is expected.
>>>
>>> I checked the program flow from /usr/src/usr.bin/su/su.c to
>>> /usr/src/lib/libutil/login_class.c and it looks
>>> setusercontext() is setting the priority zero (0) right but
>>> the moment it come out from the setusercontext() call in
>>> su.c, the priority has already turn to 65504.
>>>
>>> Maximum priority value for normal priority processes can
>>> take is 20, not 65504. Normal priority processes are
>>> expected to run at priority zero (0) as it is specified in
>>> /etc/login.conf under login class "default".
>>>
>>
>> I have further checked the rtprio(2) system call for how it set and
>> read priorities.
>>
>> Setting Priority:
>> rtprio(RTP_SET, 0, &rtp)
>>
>> rtprio() => rtprio_thread() => rtp_to_pri()
>>
>> rtp_to_pri() calculates newpri as:
>> newpri = PRI_MIN_TIMESHARE + rtp->prio;
>>
>> PRI_MIN_TIMESHARE is for normal priority, its PRI_MIN_REALTIME for
>> realtime priority etc.
>>
>> Now rtp_to_pri() calls sched_class() to set the priority class. It
>> sets td->td_pri_class to the priority class given.
>>
>> Then rtp_to_pri() calls sched_user_prio() to set the priority. It sets
>> following fields to the priority calculated (newpri):
>> td->td_base_user_pri
>> td->td_user_pri
>>
>> Then rtp_to_pri() calls sched_prio(). It sets following field to the
>> priority calculated (newpri):
>> td->td_base_pri
>>
>> The sched_prio() calls sched_thread_priority() which sets
>> td->td_priority to the priority calculated (newpri).
>>
>> Of course not all td->td_* fields are set in one go. Some are set
>> conditionally. But the td->td_base_user_pri is always set.
>>
>>
>> Reading Priority:
>> rtprio(RTP_LOOKUP, 0, &rtp)
>>
>> rtprio() => rtprio_thread() => pri_to_rtp()
>>
>> At pri_to_rtp(), rtp->type and rtp->prio are set as follows:
>> rtp->type = td->td_pri_class;
>> rtp->prio = td->td_base_user_pri - PRI_MIN_TIMESHARE;
>>
>> That is, rtprio(2) system call sets the td->td_base_user_pri when
>> request to set priority, and when request to read the priority, it
>> reads the td->td_base_user_pri.
>>
>> In another word, for rtprio(2) to function properly the
>> td->td_base_user_pri should not be changed.
>>
>> As the rtp->prio is unsigned short, for rtp->prio to become a huge
>> number (65504), td->td_base_user_pri should be less than
>> PRI_MIN_TIMESHARE.
>>
>>
>> This shows the actual problem is in the scheduler. In this case,
>> sched_ule.
>>
>> I presume the sched_ule should not touch the td->td_base_user_pri.
>> Instead, probably, it should use td->td_priority for its internal
>> purposes.
>>
>> Appreciate if Jeffrey Roberson <jeff@freebsd.org> could shed more
>> light on this issue.
>
> The base_pri vs td_priority is really jhb's domain. I added him to the cc.
>
> Thanks,
> Jeff
>

This might be caused by the following code in sched_ule.c:

static void
sched_priority(struct thread *td)
{
	int score;
	int pri;

	if (td->td_pri_class != PRI_TIMESHARE)
		return;
	/*
	 * If the score is interactive we place the thread in the realtime
	 * queue with a priority that is less than kernel and interrupt
	 * priorities.  These threads are not subject to nice restrictions.
	 *
	 * Scores greater than this are placed on the normal timeshare queue
	 * where the priority is partially decided by the most recent cpu
	 * utilization and the rest is decided by nice value.
	 *
	 * The nice value of the process has a linear effect on the calculated
	 * score.  Negative nice values make it easier for a thread to be
	 * considered interactive.
	 */
	score = imax(0, sched_interact_score(td) - td->td_proc->p_nice);
	if (score < sched_interact) {
		pri = PRI_MIN_REALTIME;
		pri += ((PRI_MAX_REALTIME - PRI_MIN_REALTIME) / sched_interact)
		    * score;
		KASSERT(pri >= PRI_MIN_REALTIME && pri <= PRI_MAX_REALTIME,
		    ("sched_priority: invalid interactive priority %d score %d",
		    pri, score));
	} else {

For an interactive thread it computes pri starting from PRI_MIN_REALTIME,
then calls sched_user_prio(td, pri), which sets td_base_user_pri and
td_user_pri. That leaves td_user_pri and td_base_user_pri out of the
timeshare range, even though the thread's class is still PRI_TIMESHARE.

Should PRI_MIN_REALTIME and PRI_MAX_REALTIME be PRI_MIN_TIMESHARE and
PRI_MAX_TIMESHARE here?

>>
>> Since I'm not conversant with the sched_ule, I may not be able to
>> develop a fix for sched_ule. Appreciate either Jeffrey or somebody
>> else could look into a fix for sched_ule. I can certainly help in
>> apply a patch and test.
>>
>> Best regards
>> Unga
>>
>
> _______________________________________________
> freebsd-bugs@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-bugs
> To unsubscribe, send any mail to "freebsd-bugs-unsubscribe@freebsd.org"
>
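
The arithmetic behind the 65504 reading can be reproduced outside the kernel. The sketch below is a simplified userland model, not the kernel code: lookup_prio() and interactive_pri() are hypothetical stand-ins for the subtraction in pri_to_rtp() and for the interactive branch of sched_priority() quoted above. The numeric PRI_* values and the threshold of 30 are assumptions recalled from FreeBSD 7.x-era sources and should be checked against the tree in question.

```c
#include <assert.h>
#include <stdio.h>

/*
 * ASSUMED values (believed to match sys/priority.h and sched_ule.c of
 * the FreeBSD 7.x era; verify against your own source tree).
 */
#define PRI_MIN_REALTIME   128
#define PRI_MAX_REALTIME   159
#define PRI_MIN_TIMESHARE  160
#define SCHED_INTERACT     30   /* assumed default of sched_interact */

/*
 * Hypothetical stand-in for the subtraction in pri_to_rtp():
 *     rtp->prio = td->td_base_user_pri - PRI_MIN_TIMESHARE;
 * rtp->prio is an unsigned short, so a negative difference wraps
 * around modulo 65536.
 */
unsigned short
lookup_prio(int base_user_pri)
{
    return (unsigned short)(base_user_pri - PRI_MIN_TIMESHARE);
}

/*
 * Hypothetical stand-in for the interactive branch of sched_priority().
 * Only meaningful for 0 <= score < SCHED_INTERACT.
 */
int
interactive_pri(int score)
{
    int pri;

    pri = PRI_MIN_REALTIME;
    pri += ((PRI_MAX_REALTIME - PRI_MIN_REALTIME) / SCHED_INTERACT) * score;
    return pri;
}

/* Print what an RTP_LOOKUP would report for a given interactivity score. */
void
show_lookup(int score)
{
    int pri = interactive_pri(score);

    printf("score=%2d pri=%3d rtp.prio=%u\n",
        score, pri, (unsigned)lookup_prio(pri));
}
```

Under these assumed values, interactive_pri(0) is PRI_MIN_REALTIME, which is 32 below PRI_MIN_TIMESHARE, and lookup_prio() turns that -32 into 65536 - 32 = 65504, the value in the original report. A base user priority inside the timeshare range, such as PRI_MIN_TIMESHARE itself, reads back as 0 as expected.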