Date: Tue, 28 Aug 2012 23:39:11 -0500 From: Alan Cox <alan.l.cox@gmail.com> To: Andrey Zonov <zont@freebsd.org> Cc: Bryan Drewery <bryan@shatow.net>, Andriy Gapon <avg@freebsd.org>, freebsd-arch@freebsd.org Subject: Re: [patch] unprivileged mlock(2) Message-ID: <CAJUyCcMjUSUSyF9Ry3d16izrYCjebQFtpYy0OiFof%2BgOKAs5Qg@mail.gmail.com> In-Reply-To: <503D34DB.3090000@FreeBSD.org> References: <503CF3B1.3050604@FreeBSD.org> <503D08D6.1040004@shatow.net> <503D281A.3080107@FreeBSD.org> <503D34DB.3090000@FreeBSD.org>
next in thread | previous in thread | raw e-mail | index | archive | help
On Tue, Aug 28, 2012 at 4:15 PM, Andrey Zonov <zont@freebsd.org> wrote: > On 8/29/12 12:20 AM, Andriy Gapon wrote: > > on 28/08/2012 21:07 Bryan Drewery said the following: > >> On 8/28/2012 11:37 AM, Andrey Zonov wrote: > >>> Hi, > >>> > >>> We've got RLIMIT_MEMLOCK for years, but this limit is useless, because > >>> only root may call mlock(2), and root may raise any limits. > >>> > >>> I suggest patch that allows to call mlock(2) for unprivileged users. > >>> Are there any objections to got it in tree? > >>> > >> > >> FYI, see previous recent thread on this here: > >> > >> http://lists.freebsd.org/pipermail/freebsd-arch/2012-May/012552.html > >> and > >> http://lists.freebsd.org/pipermail/freebsd-arch/2012-June/012606.html > > > > Yes, Andrey, I highly suggest that you read those threads completely. > > > > Here are some observations. > > > > It doesn't look like mlockall and mlockall(MCL_FUTURE) in particular > properly > > honor RLIMIT_MEMLOCK. If this is not fixed, then it would be premature > to > > enable the privilege for non-privileged users. > > > > This should be surely fixed, but I don't know how. Any suggestions are > welcome. > > > I am against adding the sysctl knob. If RLIMIT_MEMLOCK limit is properly > > implemented then it is sufficient to effectively deny the privilege (and > with > > much finer granularity). > > > > Until all bugs around this problem will be fixed, to have such sysctl > would be nice, and even keep it turned off to not change default > behavior (not like in patch). > > > When the privilege is allowed to ordinary users, then memorylocked in the > > default login.conf would need to be set to something much lower than the > current > > 'unlimited' :-) > > > > It's not a problem to set it to a new reasonable value in the tree, but > it will be a problem to fix this everywhere. > > > Also, note that currently RLIMIT_MEMLOCK is abused at least in vslock() > (used at > > least by sysctl's kernel side). The temporary wirings performed as an > > implementation detail or side-effect should not be accounted against the > limit. > > The limit is for wirings that a user makes and controls explicitly. It > should > > not be applied to something that kernel does behind the scenes without > user's > > knowledge. > > > > I was surprised when I stepped on this few years ago on machine with > thousands processes. top(8) ate 100% CPU in a forever loop, ps(1) > didn't work, and that is because memorylocked limit was set to low. > Later I submitted two patches which fixed kvm (r230873) and sockstat > (r230874), but I totally agree with you here, we shouldn't check for > limits in vslock(). > > I agree with Andriy's argument for making the following change. Please go ahead and commit it. Alan Index: sys/vm/vm_glue.c > =================================================================== > --- sys/vm/vm_glue.c (revision 239772) > +++ sys/vm/vm_glue.c (working copy) > @@ -183,7 +183,6 @@ int > vslock(void *addr, size_t len) > { > vm_offset_t end, last, start; > - unsigned long nsize; > vm_size_t npages; > int error; > > @@ -195,18 +194,6 @@ vslock(void *addr, size_t len) > npages = atop(end - start); > if (npages > vm_page_max_wired) > return (ENOMEM); > - PROC_LOCK(curproc); > - nsize = ptoa(npages + > - pmap_wired_count(vm_map_pmap(&curproc->p_vmspace->vm_map))); > - if (nsize > lim_cur(curproc, RLIMIT_MEMLOCK)) { > - PROC_UNLOCK(curproc); > - return (ENOMEM); > - } > - if (racct_set(curproc, RACCT_MEMLOCK, nsize)) { > - PROC_UNLOCK(curproc); > - return (ENOMEM); > - } > - PROC_UNLOCK(curproc); > #if 0 > /* > * XXX - not yet > @@ -222,14 +209,6 @@ vslock(void *addr, size_t len) > #endif > error = vm_map_wire(&curproc->p_vmspace->vm_map, start, end, > VM_MAP_WIRE_SYSTEM | VM_MAP_WIRE_NOHOLES); > -#ifdef RACCT > - if (error != KERN_SUCCESS) { > - PROC_LOCK(curproc); > - racct_set(curproc, RACCT_MEMLOCK, > - > ptoa(pmap_wired_count(vm_map_pmap(&curproc->p_vmspace->vm_map)))); > - PROC_UNLOCK(curproc); > - } > -#endif > /* > * Return EFAULT on error to match copy{in,out}() behaviour > * rather than returning ENOMEM like mlock() would. > > -- > Andrey Zonov > >
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?CAJUyCcMjUSUSyF9Ry3d16izrYCjebQFtpYy0OiFof%2BgOKAs5Qg>