Skip site navigation (1)Skip section navigation (2)
Date:      Tue, 30 Jul 2013 10:40:45 -0700
From:      Alan Cox <alc@rice.edu>
To:        Jeremie Le Hen <jlh@FreeBSD.org>
Cc:        Alan Cox <alc@FreeBSD.org>, Konstantin Belousov <kib@freebsd.org>, freebsd-current@FreeBSD.org, trasz@FreeBSD.org
Subject:   Re: Fix for sys_munlock(2) with racct
Message-ID:  <EDF1EFE1-E9A9-44CF-8007-A642971572BE@rice.edu>
In-Reply-To: <20130721215038.GE13628@caravan.chchile.org>
References:  <20130720112218.GD13628@caravan.chchile.org> <115EDC04-4B95-4FB5-9092-515188D8ADA7@rice.edu> <20130721215038.GE13628@caravan.chchile.org>

next in thread | previous in thread | raw e-mail | index | archive | help

On Jul 21, 2013, at 2:50 PM, Jeremie Le Hen wrote:

> On Sat, Jul 20, 2013 at 08:33:35PM -0700, Alan Cox wrote:
>>=20
>> On Jul 20, 2013, at 4:22 AM, Jeremie Le Hen wrote:
>>=20
>>> Hi Edward, Alan,
>>>=20
>>> I plan to commit the following patch:
>>> http://people.freebsd.org/~jlh/racct_munlock.diff
>>>=20
>>> This solves the following panic:
>>>=20
>>> panic: racct_sub: freeing 301989888 of resource 5, which is more =
than allocated 73728 for pwsafe (pid 4177)
>>>=20
>>> The problem is that the racct code in sys_munlock() trusts too much =
the
>>> user's input.  vm_map_unwire_count() now returns how much memory has
>>> really been unwired.
>>>=20
>>> Any objection?
>>=20
>>=20
>> Can you elaborate on what you mean by "sys_munlock() trusts too much
>> the user's input."   munlock(2) is supposed to return ENOMEM if any
>> addresses within the specified range are not backed by valid =
mappings.
>> (Valid mappings with PROT_NONE access are something of a gray area
>> here.)  However, it is not error for a to call munlock() on a range
>> that isn't locked, or to call it a second, third, etc. time on the
>> same range.  Is that what is provoking your panic?
>=20
> I'm using sysutils/pwsafe and it seems to mlock(2) 18 pages (end - =
start
> =3D 73728 bytes) but then munlock(2) 73728 pages :-). vm_map_unwire()
> seems to do the right thing with those buggy boundaries, but then the
> racct code assumes that those boundaries were correct as long as
> vm_map_unwire() returned successfully.  This is what causes the
> assertion failure.
>=20

It sounds like the application is calling munlock(2) on valid but =
unwired memory.  That's not considered an error under the various Unix =
standards that specify the behavior of munlock(2).  It's sloppy =
programming but not buggy programming.=20

>=20
>> By the way, sys_mlock() uses a simpler approach to deal with a =
similar
>> situation:
>>=20
>>        error =3D vm_map_wire(map, start, end,=20
>>            VM_MAP_WIRE_USER | VM_MAP_WIRE_NOHOLES);
>> #ifdef RACCT
>>        if (error !=3D KERN_SUCCESS) {
>>                PROC_LOCK(proc);
>>                racct_set(proc, RACCT_MEMLOCK,
>>                    ptoa(pmap_wired_count(map->pmap)));
>>                PROC_UNLOCK(proc);
>>        }
>> #endif
>>=20
>> However, the code in sys_mlock() for maintaining RACCT_MEMLOCK,
>> including the above snippet, is race-y.  Two simultaneous callers to
>> sys_mlock() will produce incorrect results.  (I haven't looked at
>> sys_mlockall() or vm_map_growstack().)
>=20
> Ok, I will commit that as the first bandaid then.
>=20
>=20
>> Also, a wired mapping can be destroyed by calling munmap(2) without
>> first calling munlock(2), in which case, RACCT_MEMLOCK will be
>> incorrect.
>=20
> So I think the right way to tackle this is to handle racct in the vm
> layer rather than at the syscall layer.
>=20

The VM system already maintains counters equivalent to RACCT_VMEM and =
RACCT_MEMLOCK.  They are "map->size" and "pmap_wired_count(map->pmap)".  =
Instead of maintaining duplicate counters, could the resource accounting =
framework be extended to support callbacks to obtain a value when it's =
actually needed?

> vm_fault_{wire,unwire}() look like natural places for this, but racct
> functions require the struct proc to be locked.  Any idea how to =
handle
> this?
>=20
> --=20
> Jeremie Le Hen
>=20
> Scientists say the world is made up of Protons, Neutrons and =
Electrons.
> They forgot to mention Morons.
>=20




Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?EDF1EFE1-E9A9-44CF-8007-A642971572BE>