Date:      Sat, 3 Mar 2018 08:29:21 -0700
From:      Alan Somers <asomers@freebsd.org>
To:        Alexander Richardson <Alexander.Richardson@cl.cam.ac.uk>
Cc:        "<cl-capsicum-discuss@lists.cam.ac.uk>" <cl-capsicum-discuss@lists.cam.ac.uk>,  "freebsd-hackers@freebsd.org" <freebsd-hackers@freebsd.org>
Subject:   Re: [capsicum] unlinkfd
Message-ID:  <CAOtMX2h83wddDwcvgae-a02AuyYPyTYfmzqeJemMtKt7%2BL74YQ@mail.gmail.com>
In-Reply-To: <CAEeofcgLD%2BTjKswPexNDUfeeAxHgUOjsZUdD3g3Jc%2BQuyRu4OQ@mail.gmail.com>
References:  <20180302183514.GA99279@x-wing> <CAK4o1Wyk54chHobhUkb2PBUtaWOF2rDv6tkX_bFGY6D331xUqw@mail.gmail.com> <17DE0BFF-42A2-4CD7-B09C-ABA2606C4041@cl.cam.ac.uk> <CAEeofcgLD%2BTjKswPexNDUfeeAxHgUOjsZUdD3g3Jc%2BQuyRu4OQ@mail.gmail.com>

In fact, FreeBSD has that same unlinkat(2) system call.  But it doesn't
solve Mariusz's problem.  He's concerned about race conditions.  With
either unlink(2) or unlinkat(2), there's no way to ensure that the
directory entry you remove is for the file you think it is, because after
reading/writing a file and before unlinking it, some other process
could've unlinked it and created a new one with the same name.  It's this
race condition that Mariusz seeks to solve with unlinkfd.
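
To make the race concrete, here is a minimal sketch in Python (chosen only
for brevity).  The helper name unlink_if_same is hypothetical, not an
existing API.  It approximates in userspace the comparison the proposal
describes (same file behind the fd and the name), but it still leaves a
window between the check and the unlink, which is exactly the gap an
in-kernel unlinkfd would be meant to close.

```python
import os
import tempfile


def demo_unlink_race():
    """Show that a name can be rebound to a different file while we hold an fd."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "victim")
        # The file we read/write and later intend to remove.
        fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o600)
        ours = os.fstat(fd)
        # Stand-in for another process: unlink the name and recreate it.
        os.unlink(path)
        other_fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o600)
        theirs = os.fstat(other_fd)
        # The name now refers to the other file, not to the one we opened.
        now = os.stat(path)
        name_is_ours = os.path.samestat(ours, now)
        name_is_theirs = os.path.samestat(theirs, now)
        # unlink(2) acts on the name, so this removes the other file's entry.
        os.unlink(path)
        os.close(fd)
        os.close(other_fd)
        return name_is_ours, name_is_theirs


def unlink_if_same(fd, name, dir_fd):
    """Hypothetical helper: unlink `name` only if it refers to fd's file.

    Userspace approximation only; the check and the unlink are two separate
    system calls, so the race is narrowed but not eliminated.
    """
    entry = os.stat(name, dir_fd=dir_fd, follow_symlinks=False)
    if not os.path.samestat(os.fstat(fd), entry):
        return False
    os.unlink(name, dir_fd=dir_fd)
    return True
```

Calling demo_unlink_race() reports that the name no longer matches the
descriptor we opened, yet the plain unlink still succeeds.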
-Alan

On Sat, Mar 3, 2018 at 5:46 AM, Alexander Richardson <
Alexander.Richardson@cl.cam.ac.uk> wrote:

> Linux has an unlinkat() system call (https://linux.die.net/man/2/unlinkat)
> but it doesn't seem to have a flag that lets you unlink the fd itself.
> Possibly pathname == NULL and AT_EMPTY_PATH could mean unlink the fd, but I
> haven't tried whether that works.
> It also has an AT_REMOVEDIR flag to make it function as rmdirat().
>
> On 3 March 2018 at 10:41, Robert N. M. Watson <robert.watson@cl.cam.ac.uk>
> wrote:
>
> > FWIW, this is part of why we introduced anonymous POSIX shared memory
> > objects with Capsicum in FreeBSD -- we allow shm_open(2) to be passed a
> > SHM_ANON special name, which causes the creation of a swap-backed, mappable
> > file-like object that can have I/O, memory mapping, etc, performed on it,
> > but never has any persistent state across reboots, even in the event of a
> > crash.
> >
> > With Capsicum you can then refine a file descriptor to the otherwise
> > writable object to be read-only for the purposes of delegation. There is
> > not, however, a mechanism to "freeze" the state of the object, causing other
> > outstanding writable descriptors to become read-only -- certainly something
> > could be added, but some care regarding VM semantics would be required --
> > in particular, so that faults could not be experienced as a result of a
> > memory store performed before the "freeze" but issued to VFS only later.
> >
> > I certainly have no objection to an unlinkat(2) system call -- it's
> > unfortunate that a full suite of the at(2) APIs wasn't introduced in the
> > first place. It would be worth checking whether anyone else (e.g., Solaris,
> > Mac OS X, Linux) has already added an unlinkat(2) that we can match API
> > semantics for. I think I take the view that for truly anonymous objects,
> > shm_open(2) without a name (or the Linux equiv) is the right thing -- and
> > hence unlinkat(2) is for more conventional use cases where the final
> > pathname element is known.
> >
> > On directories: There, I find myself falling back on a Casper-like
> > service, since GC'ing a single anonymous memory object is straightforward,
> > but GC'ing a directory hierarchy is a messier business.
> >
> > Robert
> >
> > > On 3 Mar 2018, at 09:53, Justin Cormack <justin@specialbusservice.com>
> > > wrote:
> > >
> > > I think it would make sense to have an unlinkfd() that unlinks the file
> > > from everywhere, so it does not need a name to be specified. This might be
> > > hard to implement.
> > >
> > > For temporary files, I really like Linux memfd_create(2), which opens an
> > > anonymous file without a name. These semantics are really useful. (Linux
> > > memfd also has additional options for sealing the file to make it
> > > immutable, which are very useful for safely passing files between
> > > processes.) Having a way to make unnamed temporary files solves a lot of
> > > deletion issues, as the file never needs to be unlinked.
> > >
> > >
> > > On 2 March 2018 at 18:35, Mariusz Zaborski <oshogbo@freebsd.org> wrote:
> > >> Hello,
> > >>
> > >> Today I would like to propose a new syscall called unlinkfd(2), which came
> > >> up during a discussion with Ed Maste.
> > >>
> > >> Currently in UNIX we can’t remove files safely. If we try to do so, we
> > >> always end up in a race condition. For example, when we open a file and
> > >> check it with fstat, etc., and then want to unlink(2) it… the file we are
> > >> trying to unlink could be a different one than the one we were fstating
> > >> just a moment ago.
> > >>
> > >> Another reason for implementing unlinkfd(2) came to us when we were trying
> > >> to sandbox some applications like uudecode/b64decode or bspatch. It
> > >> occurred to us that we don’t have a good way of removing single files. Of
> > >> course we can try to determine which directory we are in, and then open
> > >> this directory and remove a single file.
> > >>
> > >> It looks even more bizarre if we think about a program which operates on
> > >> multiple files. If we analyze a situation with two totally different
> > >> directories like `/tmp` and `/home/oshogbo`, we would end up pre-opening
> > >> a root directory or keeping open as many directories as we are working in.
> > >> All of that effort only to remove two files. This makes it totally
> > >> impractical!
> > >>
> > >> I think that opening directories also presents a wider attack vector,
> > >> because we are keeping a descriptor to a directory only to remove one
> > >> file. Unfortunately this means that an attacker can remove all files in
> > >> that directory.
> > >>
> > >> I proposed this as well on the last Capsicum call. There was a suggestion
> > >> that instead of adding a single syscall maybe we should have a Casper
> > >> service that will allow us to remove files. Another idea was that we should
> > >> perhaps redesign programs to create some subdirs, work on the subdirs, and
> > >> then remove all files in this subdir. I don’t feel that creating a Casper
> > >> service is a good idea, because we still have exactly the same race
> > >> condition. In my opinion creating subdirs is also a problem for us.
> > >>
> > >> First, we would need to redesign some of our tools, and I think we should
> > >> simplify the capsicumization of the process instead of making it harder.
> > >>
> > >> Secondly, we can create a temporary subdirectory, but what will remove it?
> > >> We are going back to having a fd to the directory in which we just created
> > >> a subdir. Another way would be to have a Casper service which would remove
> > >> a directory, but with the risk of a race condition.
> > >>
> > >> In conclusion, I think we need a syscall like unlinkfd(2), which turns out
> > >> to be easy to implement. The only downside of this implementation is that
> > >> we need to provide not only a fd but also a file path. This is because
> > >> neither inodes nor vnodes contain filenames. We compare the vnodes of the
> > >> fd and the given path; if they are exactly the same, we remove the file.
> > >> The syscall uses a fd, so there is no ambient authority, because we are
> > >> proving that we already have access to that file. Thanks to that, the
> > >> syscall can be safely used with Capsicum. I have already discussed this
> > >> with some people and they said `Hey I already had that idea a while ago…`
> > >> so let’s do something with that idea!
> > >> If you are interested in the patch you can find it here:
> > >> https://reviews.freebsd.org/D14567
> > >>
> > >> Thanks,
> > >> --
> > >> Mariusz Zaborski
> > >> oshogbo//vx             | http://oshogbo.vexillium.org
> > >> FreeBSD commiter        | https://freebsd.org
> > >> Software developer      | http://wheelsystems.com
> > >> If it's not broken, let's fix it till it is!!1
> > >
> >
> >
> >
> _______________________________________________
> freebsd-hackers@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-hackers
> To unsubscribe, send any mail to "freebsd-hackers-unsubscribe@freebsd.org"
>


