Date:      Mon, 27 Aug 2007 17:23:54 -0400
From:      Jung-uk Kim <jkim@FreeBSD.org>
To:        perforce@FreeBSD.org
Cc:        Kostik Belousov <kostikbel@gmail.com>, Roman Divacky <rdivacky@FreeBSD.org>, Ken Smith <kensmith@cse.Buffalo.EDU>, re@FreeBSD.org
Subject:   Re: PERFORCE change 124529 for review
Message-ID:  <200708271723.56871.jkim@FreeBSD.org>
In-Reply-To: <1188241044.56896.46.camel@opus.cse.buffalo.edu>
References:  <200708021130.l72BUHrY077198@repoman.freebsd.org> <200708271416.08455.jkim@FreeBSD.org> <1188241044.56896.46.camel@opus.cse.buffalo.edu>

On Monday 27 August 2007 02:57 pm, Ken Smith wrote:
> On Mon, 2007-08-27 at 14:16 -0400, Jung-uk Kim wrote:
> > sched_{get,set}affinity() are very misleading syscalls.  Userland
> > (glibc) and kernel have different definitions and Roman's patch
> > implemented linux 2.6 kernel behaviour, AFAIK.  Glibc wraps all
> > differences between kernel versions.  See the following link:
> >
> > http://jeff.squyres.com/journal/archives/2005/10/linux_processor.html
>
> Wow, "misleading" may be the understatement of the year...  :-(
>
> My only interest is not committing something that a user-level
> Linux binary running on FreeBSD will be confused by if it were to
> go through our compat layers.  To that end I'm going to be a bit of
> a jerk on this and I apologize for that.  As far as I can tell you
> are exactly right that this syscall in Linux is extremely
> confusing.  What I ask is someone to point me at something that
> suggests what the submitted patch implements was at least in use in
> *some* Linux kernel, preferably one that saw semi-widespread use.
> :-)
>
> I did follow up on this a bit myself by downloading what appears to
> be the bleeding edge of the Linux kernel which is probably not the
> right thing to do.  Its implementation in that kernel
> (linux-2.6.22) is this:
>
> long sched_getaffinity(pid_t pid, cpumask_t *mask)
> {
>         struct task_struct *p;
>         int retval;
>
>         mutex_lock(&sched_hotcpu_mutex);
>         read_lock(&tasklist_lock);
>
>         retval = -ESRCH;
>         p = find_process_by_pid(pid);
>         if (!p)
>                 goto out_unlock;
>
>         retval = security_task_getscheduler(p);
>         if (retval)
>                 goto out_unlock;
>
>         cpus_and(*mask, p->cpus_allowed, cpu_online_map);
>
> out_unlock:
>         read_unlock(&tasklist_lock);
>         mutex_unlock(&sched_hotcpu_mutex);
>         if (retval)
>                 return retval;
>
>         return 0;
> }
>
> The security_task_getscheduler() call seems to be a no-op at the
> moment (a work in progress - as far as I can tell there is only one
> task_getscheduler() function implemented at the moment and it
> always returns 0).
>
> As the reference you provided said there does seem to be the
> possibility of "interference" from the glibc code but as far as I
> can tell from what I have access to none of the various options
> would wind up returning anything other than zero in the case of
> success and that is what has me worried.  If anyone can point me at
> something that shows a case where the size of the mask really winds
> up being the return value upon success I'm totally willing to
> approve this.
>
> Sorry for the hassle.  Thanks.

You missed the actual syscall entry point:

http://lxr.linux.no/source/kernel/sched.c#L4542

It actually returns sizeof(cpumask_t). ;-)
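
For anyone who wants to see the difference from userland, here is a minimal sketch, assuming a Linux host with glibc. The helper names (raw_getaffinity, wrapped_getaffinity, compare_return_conventions) are made up for illustration and are not part of the patch under review. It calls the raw syscall, which should return a positive byte count on success, and then the glibc wrapper, which should return 0. The exact byte count depends on the kernel, so the sketch only checks that it is positive.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <stdio.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: invoke the raw Linux syscall, bypassing glibc.
 * On success the kernel returns the number of bytes it copied out. */
long raw_getaffinity(cpu_set_t *mask)
{
    return syscall(SYS_sched_getaffinity, 0, sizeof(*mask), mask);
}

/* Hypothetical helper: go through the glibc wrapper, which hides the
 * byte count and returns 0 on success. */
int wrapped_getaffinity(cpu_set_t *mask)
{
    return sched_getaffinity(0, sizeof(*mask), mask);
}

/* Call both and print what each one returned for the current process. */
void compare_return_conventions(void)
{
    cpu_set_t raw_mask, libc_mask;
    long raw;
    int wrapped;

    CPU_ZERO(&raw_mask);
    CPU_ZERO(&libc_mask);

    raw = raw_getaffinity(&raw_mask);
    wrapped = wrapped_getaffinity(&libc_mask);

    assert(raw > 0);       /* raw syscall: byte count */
    assert(wrapped == 0);  /* glibc wrapper: plain success code */

    printf("raw syscall returned %ld, glibc wrapper returned %d\n",
           raw, wrapped);
}
```

Calling compare_return_conventions() from main() is enough to try it. A Linux binary that issues the syscall directly, or glibc itself, sees the first convention, so that is presumably the one a compat layer has to reproduce.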

Jung-uk Kim


