Skip site navigation (1)Skip section navigation (2)
Date:      Tue, 28 Aug 2012 23:39:11 -0500
From:      Alan Cox <alan.l.cox@gmail.com>
To:        Andrey Zonov <zont@freebsd.org>
Cc:        Bryan Drewery <bryan@shatow.net>, Andriy Gapon <avg@freebsd.org>, freebsd-arch@freebsd.org
Subject:   Re: [patch] unprivileged mlock(2)
Message-ID:  <CAJUyCcMjUSUSyF9Ry3d16izrYCjebQFtpYy0OiFof%2BgOKAs5Qg@mail.gmail.com>
In-Reply-To: <503D34DB.3090000@FreeBSD.org>
References:  <503CF3B1.3050604@FreeBSD.org> <503D08D6.1040004@shatow.net> <503D281A.3080107@FreeBSD.org> <503D34DB.3090000@FreeBSD.org>

next in thread | previous in thread | raw e-mail | index | archive | help
On Tue, Aug 28, 2012 at 4:15 PM, Andrey Zonov <zont@freebsd.org> wrote:

> On 8/29/12 12:20 AM, Andriy Gapon wrote:
> > on 28/08/2012 21:07 Bryan Drewery said the following:
> >> On 8/28/2012 11:37 AM, Andrey Zonov wrote:
> >>> Hi,
> >>>
> >>> We've got RLIMIT_MEMLOCK for years, but this limit is useless, because
> >>> only root may call mlock(2), and root may raise any limits.
> >>>
> >>> I suggest patch that allows to call mlock(2) for unprivileged users.
> >>> Are there any objections to got it in tree?
> >>>
> >>
> >> FYI, see previous recent thread on this here:
> >>
> >> http://lists.freebsd.org/pipermail/freebsd-arch/2012-May/012552.html
> >> and
> >> http://lists.freebsd.org/pipermail/freebsd-arch/2012-June/012606.html
> >
> > Yes, Andrey, I highly suggest that you read those threads completely.
> >
> > Here are some observations.
> >
> > It doesn't look like mlockall and mlockall(MCL_FUTURE) in particular
> properly
> > honor RLIMIT_MEMLOCK.  If this is not fixed, then it would be premature
> to
> > enable the privilege for non-privileged users.
> >
>
> This should be surely fixed, but I don't know how.  Any suggestions are
> welcome.
>
> > I am against adding the sysctl knob.  If RLIMIT_MEMLOCK limit is properly
> > implemented then it is sufficient to effectively deny the privilege (and
> with
> > much finer granularity).
> >
>
> Until all bugs around this problem will be fixed, to have such sysctl
> would be nice, and even keep it turned off to not change default
> behavior (not like in patch).
>
> > When the privilege is allowed to ordinary users, then memorylocked in the
> > default login.conf would need to be set to something much lower than the
> current
> > 'unlimited' :-)
> >
>
> It's not a problem to set it to a new reasonable value in the tree, but
> it will be a problem to fix this everywhere.
>
> > Also, note that currently RLIMIT_MEMLOCK is abused at least in vslock()
> (used at
> > least by sysctl's kernel side).  The temporary wirings performed as an
> > implementation detail or side-effect should not be accounted against the
> limit.
> >  The limit is for wirings that a user makes and controls explicitly.  It
> should
> > not be applied to something that kernel does behind the scenes without
> user's
> > knowledge.
> >
>
> I was surprised when I stepped on this few years ago on machine with
> thousands processes.  top(8) ate 100% CPU in a forever loop, ps(1)
> didn't work, and that is because memorylocked limit was set to low.
> Later I submitted two patches which fixed kvm (r230873) and sockstat
> (r230874), but I totally agree with you here, we shouldn't check for
> limits in vslock().
>
>
I agree with Andriy's argument for making the following change.  Please go
ahead and commit it.

Alan

Index: sys/vm/vm_glue.c
> ===================================================================
> --- sys/vm/vm_glue.c    (revision 239772)
> +++ sys/vm/vm_glue.c    (working copy)
> @@ -183,7 +183,6 @@ int
>  vslock(void *addr, size_t len)
>  {
>         vm_offset_t end, last, start;
> -       unsigned long nsize;
>         vm_size_t npages;
>         int error;
>
> @@ -195,18 +194,6 @@ vslock(void *addr, size_t len)
>         npages = atop(end - start);
>         if (npages > vm_page_max_wired)
>                 return (ENOMEM);
> -       PROC_LOCK(curproc);
> -       nsize = ptoa(npages +
> -           pmap_wired_count(vm_map_pmap(&curproc->p_vmspace->vm_map)));
> -       if (nsize > lim_cur(curproc, RLIMIT_MEMLOCK)) {
> -               PROC_UNLOCK(curproc);
> -               return (ENOMEM);
> -       }
> -       if (racct_set(curproc, RACCT_MEMLOCK, nsize)) {
> -               PROC_UNLOCK(curproc);
> -               return (ENOMEM);
> -       }
> -       PROC_UNLOCK(curproc);
>  #if 0
>         /*
>          * XXX - not yet
> @@ -222,14 +209,6 @@ vslock(void *addr, size_t len)
>  #endif
>         error = vm_map_wire(&curproc->p_vmspace->vm_map, start, end,
>             VM_MAP_WIRE_SYSTEM | VM_MAP_WIRE_NOHOLES);
> -#ifdef RACCT
> -       if (error != KERN_SUCCESS) {
> -               PROC_LOCK(curproc);
> -               racct_set(curproc, RACCT_MEMLOCK,
> -
> ptoa(pmap_wired_count(vm_map_pmap(&curproc->p_vmspace->vm_map))));
> -               PROC_UNLOCK(curproc);
> -       }
> -#endif
>         /*
>          * Return EFAULT on error to match copy{in,out}() behaviour
>          * rather than returning ENOMEM like mlock() would.
>
> --
> Andrey Zonov
>
>



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?CAJUyCcMjUSUSyF9Ry3d16izrYCjebQFtpYy0OiFof%2BgOKAs5Qg>