Date:      Sat, 3 Mar 2018 19:43:54 +0100
From:      Mariusz Zaborski <oshogbo@FreeBSD.org>
To:        "Robert N. M. Watson" <robert.watson@cl.cam.ac.uk>
Cc:        Alan Somers <asomers@freebsd.org>, "<cl-capsicum-discuss@lists.cam.ac.uk>" <cl-capsicum-discuss@lists.cam.ac.uk>,  Alexander Richardson <Alexander.Richardson@cl.cam.ac.uk>, "freebsd-hackers@freebsd.org" <freebsd-hackers@freebsd.org>
Subject:   Re: [capsicum] unlinkfd
Message-ID:  <20180303184354.GA48406@x-wing>
In-Reply-To: <6E83EC9C-56C5-40E7-AED0-2692A15F7FD3@cl.cam.ac.uk>
References:  <20180302183514.GA99279@x-wing> <CAK4o1Wyk54chHobhUkb2PBUtaWOF2rDv6tkX_bFGY6D331xUqw@mail.gmail.com> <17DE0BFF-42A2-4CD7-B09C-ABA2606C4041@cl.cam.ac.uk> <CAEeofcgLD%2BTjKswPexNDUfeeAxHgUOjsZUdD3g3Jc%2BQuyRu4OQ@mail.gmail.com> <CAOtMX2h83wddDwcvgae-a02AuyYPyTYfmzqeJemMtKt7%2BL74YQ@mail.gmail.com> <6E83EC9C-56C5-40E7-AED0-2692A15F7FD3@cl.cam.ac.uk>

I feel that there are two different things we can think about:
- What we would implement in the capability system if we were building it from
  scratch. Here shm_open(2) and SHM_ANON can be the solution to our problems.
- On the other hand, we have a working operating system, and we can't expect
  that all our already-implemented programs will fit those assumptions, nor
  can we ask developers to rewrite many existing programs.
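
As a rough illustration of the first point, here is a minimal sketch of
creating an object that never has a name. It assumes FreeBSD's SHM_ANON when
that macro is defined; elsewhere it falls back to mkstemp(3) followed by an
immediate unlink(2), which is only an approximation (the name exists briefly,
and the file lives on a real filesystem). The function name anon_object() and
the /tmp template are made up for this example.

```c
#include <sys/mman.h>
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Return a descriptor to a nameless, read/write object of `size` bytes,
 * or -1 on error. Hypothetical helper, not an existing API.
 */
int
anon_object(off_t size)
{
	int fd;

	assert(size >= 0);
#ifdef SHM_ANON
	/* FreeBSD: swap-backed object that never appears in any namespace. */
	fd = shm_open(SHM_ANON, O_RDWR, 0600);
#else
	/* Approximation: create a temporary file and drop its name at once. */
	char path[] = "/tmp/anon.XXXXXX";

	fd = mkstemp(path);
	if (fd != -1 && unlink(path) == -1) {
		close(fd);
		return (-1);
	}
#endif
	if (fd == -1)
		return (-1);
	if (ftruncate(fd, size) == -1) {
		close(fd);
		return (-1);
	}
	return (fd);
}
```

Since the object has no name, there is nothing to unlink later; closing the
last descriptor is enough.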

On Sat, Mar 03, 2018 at 05:16:38PM +0000, Robert N. M. Watson wrote:
> New _check() variants of the unlinkat(2) and rmdirat(2) system calls might
> do the trick -- e.g.,
>
> 	int	unlinkat_check(dirfd, name, checkfd);
> 	int	rmdirat_check(dirfd, name, checkfd);
>
A similar API was proposed in the review. This solves the issue with the race
condition (RC). Unfortunately, it doesn't solve the problem of guessing which
directories we will work in.
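
For comparison, the check can be approximated in userland today, roughly as
below. This is only a sketch: unlinkat_if_same() is a hypothetical name,
ESTALE is an arbitrary errno choice, and, unlike a real unlinkat_check() in
the kernel, there is still a window between the comparison and the
unlinkat(2) call in which the entry can be replaced.

```c
#define _POSIX_C_SOURCE 200809L	/* expose POSIX.1-2008 declarations */
#include <sys/stat.h>
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Remove `name` (relative to dirfd) only if it currently refers to the
 * same object as checkfd. Hypothetical userland helper; NOT atomic.
 */
int
unlinkat_if_same(int dirfd, const char *name, int checkfd)
{
	struct stat by_fd, by_name;

	assert(name != NULL);

	if (fstat(checkfd, &by_fd) == -1)
		return (-1);
	/* Do not follow symlinks: unlinkat() removes the entry itself. */
	if (fstatat(dirfd, name, &by_name, AT_SYMLINK_NOFOLLOW) == -1)
		return (-1);
	if (by_fd.st_dev != by_name.st_dev || by_fd.st_ino != by_name.st_ino) {
		errno = ESTALE;	/* arbitrary choice for this sketch */
		return (-1);
	}
	/* Race window: the entry can still be swapped right here. */
	return (unlinkat(dirfd, name, 0));
}
```

The caller still has to hold a descriptor for the directory, which is the
part the _check() variants do not address.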

When I think about sandboxing, for example, rm(1), we would need to preopen
the root directory, or preopen all the directories we will work in. Neither
solution feels right.
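
To show what preopening means in practice, here is a small sketch of the
pattern. The struct target and the target_open()/target_remove() helpers are
hypothetical names for this example. In a Capsicum program target_open()
would have to run for every file before cap_enter(2); that call is left out
so the sketch builds on other systems too.

```c
#define _POSIX_C_SOURCE 200809L	/* expose POSIX.1-2008 declarations */
#include <assert.h>
#include <fcntl.h>
#include <libgen.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

struct target {
	int	 dirfd;		/* pre-opened parent directory */
	char	*leaf;		/* final path component, malloc'ed */
};

/* Split `path` into a parent directory descriptor and a leaf name. */
int
target_open(const char *path, struct target *t)
{
	char *dcopy, *bcopy;

	assert(path != NULL && t != NULL);
	t->dirfd = -1;
	t->leaf = NULL;

	/* dirname(3) and basename(3) may modify their argument: use copies. */
	dcopy = strdup(path);
	bcopy = strdup(path);
	if (dcopy != NULL && bcopy != NULL) {
		t->dirfd = open(dirname(dcopy), O_RDONLY | O_DIRECTORY);
		t->leaf = strdup(basename(bcopy));
	}
	free(dcopy);
	free(bcopy);

	if (t->dirfd == -1 || t->leaf == NULL) {
		if (t->dirfd != -1)
			close(t->dirfd);
		free(t->leaf);
		return (-1);
	}
	return (0);
}

/* Remove the leaf using only the descriptor obtained earlier. */
int
target_remove(const struct target *t)
{
	return (unlinkat(t->dirfd, t->leaf, 0));
}
```

Each dirfd kept this way carries authority over the whole directory, not
just over the one leaf we want to remove.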

I'm not saying that unlinkfd is the right and only solution - I'm just trying
to solve a problem we identified while sandboxing apps. I'm glad we started
this discussion, and I hope we will work out some compromise between all the
presented challenges.

Thanks,
-- 
Mariusz Zaborski
oshogbo//vx		| http://oshogbo.vexillium.org
FreeBSD committer	| https://freebsd.org
Software developer	| http://wheelsystems.com
If it's not broken, let's fix it till it is!!1

> The calls would succeed only if 'name' refers to the filesystem object passed
> via checkfd. This would retain UNIX-style directory behaviour but allows an
> atomic check that the object is as expected.
>
> Of course, what you do about it if it turns out the check fails is another
> question... Better not to have a name at all, hence shm_open(SHM_ANON, ...)
> -- although just for file objects, and not directory hierarchies.
>
> Robert
>
> > On 3 Mar 2018, at 15:29, Alan Somers <asomers@freebsd.org> wrote:
> >
> > In fact, FreeBSD has that same unlinkat(2) system call.  But it doesn't
> > solve Mariusz's problem.  He's concerned about race conditions.  With
> > either unlink(2) or unlinkat(2), there's no way to ensure that the
> > directory entry you remove is for the file you think it is, because after
> > reading/writing a file and before unlinking it, some other process could
> > have unlinked it and created a new one with the same name.  It's this race
> > condition that Mariusz seeks to solve with unlinkfd.
> > -Alan
> >
> > On Sat, Mar 3, 2018 at 5:46 AM, Alexander Richardson
> > <Alexander.Richardson@cl.cam.ac.uk> wrote:
> > Linux has an unlinkat() system call (https://linux.die.net/man/2/unlinkat)
> > but it doesn't seem to have a flag that lets you unlink the fd itself.
> > Possibly pathname == NULL and AT_EMPTY_PATH could mean unlink the fd but I
> > haven't tried whether that works.
> > It also has an AT_REMOVEDIR flag to make it function as rmdirat().
> >
> > On 3 March 2018 at 10:41, Robert N. M. Watson <robert.watson@cl.cam.ac.uk>
> > wrote:
> >
> > > FWIW, this is part of why we introduced anonymous POSIX shared memory
> > > objects with Capsicum in FreeBSD -- we allow shm_open(2) to be passed a
> > > SHM_ANON special name, which causes the creation of a swap-backed, mappable
> > > file-like object that can have I/O, memory mapping, etc, performed on it ..
> > > but never has any persistent state across reboots even in the event of a
> > > crash.
> > >
> > > With Capsicum you can then refine a file descriptor to the otherwise
> > > writable object to be read-only for the purposes of delegation. There is
> > > not, however, a mechanism to "freeze" the state of the object causing other
> > > outstanding writable descriptors to become read-only -- certainly something
> > > could be added, but some care regarding VM semantics would be required --
> > > in particular, so that faults could not be experienced as a result of a
> > > memory store performed before the "freeze" but issued to VFS only later.
> > > I certainly have no objection to an unlinkat(2) system call -- it's
> > > unfortunate that a full suite of the at(2) APIs wasn't introduced in the
> > > first place. It would be worth checking whether anyone else (e.g., Solaris,
> > > Mac OS X, Linux) has already added an unlinkat(2) whose API semantics we
> > > can match. I think I take the view that for truly anonymous objects,
> > > shm_open(2) without a name (or the Linux equiv) is the right thing -- and
> > > hence unlinkat(2) is for more conventional use cases where the final
> > > pathname element is known.
> > >
> > > On directories: There, I find myself falling back on a Casper-like
> > > service, since GC'ing a single anonymous memory object is straightforward,
> > > but GC'ing a directory hierarchy is a more messy business.
> > >
> > > Robert
> > >
> > > > On 3 Mar 2018, at 09:53, Justin Cormack <justin@specialbusservice.com>
> > > > wrote:
> > > >
> > > > I think it would make sense to have an unlinkfd() that unlinks the file
> > > > from everywhere, so it does not need a name to be specified. This might
> > > > be hard to implement.
> > > >
> > > > For temporary files, I really like Linux memfd_create(2) that opens an
> > > > anonymous file without a name. These semantics are really useful. (Linux
> > > > memfd also has additional options for sealing the file to make it
> > > > immutable, which are very useful for safely passing files between
> > > > processes.) Having a way to make unnamed temporary files solves a lot of
> > > > deletion issues as the file never needs to be unlinked.
> > > >
> > > >
> > > > On 2 March 2018 at 18:35, Mariusz Zaborski <oshogbo@freebsd.org> wrote:
> > > >> Hello,
> > > >>
> > > >> Today I would like to propose a new syscall called unlinkfd(2) which
> > > >> came up during a discussion with Ed Maste.
> > > >>
> > > >> Currently in UNIX we can’t remove files safely. If we try to do so, we
> > > >> always end up in a race condition. For example, when we open a file and
> > > >> check it with fstat, etc., and then want to unlink(2) it… the file we are
> > > >> trying to unlink could be a different one than the one we were fstating
> > > >> just a moment ago.
> > > >>
> > > >> Another reason for implementing unlinkfd(2) came to us when we were
> > > >> trying to sandbox some applications like uudecode/b64decode or bspatch.
> > > >> It occurred to us that we don’t have a good way of removing single
> > > >> files. Of course we can try to determine which directory we are in, and
> > > >> then open this directory and remove a single file.
> > > >>
> > > >> It looks even more bizarre if we think about a program which operates
> > > >> on multiple files. If we analyze a situation with two totally different
> > > >> directories like `/tmp` and `/home/oshogbo`, we would end up pre-opening
> > > >> a root directory or keeping open as many directories as we are working
> > > >> in. All of that effort only to remove two files. This makes it totally
> > > >> impractical!
> > > >>
> > > >> I think that opening directories also presents a wider attack vector,
> > > >> because we are keeping a descriptor to a directory only to remove one
> > > >> file. Unfortunately, this means that an attacker can remove all files in
> > > >> that directory.
> > > >>
> > > >> I proposed this as well on the last Capsicum call. There was a
> > > >> suggestion that instead of adding a single syscall maybe we should have a
> > > >> Casper service that will allow us to remove files. Another idea was that
> > > >> we should perhaps redesign programs to create some subdirs, work in the
> > > >> subdirs, and then remove all files in the subdir. I don’t feel that
> > > >> creating a Casper service is a good idea because we still have exactly
> > > >> the same race condition issue. In my opinion creating subdirs is also a
> > > >> problem for us.
> > > >>
> > > >> First, we would need to redesign some of our tools, and I think we
> > > >> should simplify the capsicumization of a process instead of making it
> > > >> harder.
> > > >>
> > > >> Secondly, we can create a temporary subdirectory, but what will remove
> > > >> it? We are going back to having a fd to the directory in which we just
> > > >> created a subdir. Another way would be to have a Casper service which
> > > >> would remove a directory, but with the risk of an RC.
> > > >>
> > > >> In conclusion, I think we need a syscall like unlinkfd(2), which turns
> > > >> out to be easy to implement. The only downside of this implementation is
> > > >> that we need to provide not only a fd but also a file path. This is
> > > >> because neither inodes nor vnodes contain filenames. We compare the
> > > >> vnodes of the fd and the given path; if they are exactly the same, we
> > > >> remove the file. The syscall uses a fd, so there is no Ambient Authority,
> > > >> because we are proving that we already have access to that file. Thanks
> > > >> to that, the syscall can be safely used with Capsicum. I have already
> > > >> discussed this with some people and they said `Hey I already had that
> > > >> idea a while ago…` so let’s do something with that idea!
> > > >> If you are interested in the patch you can find it here:
> > > >> https://reviews.freebsd.org/D14567
> > > >>
> > > >> Thanks,
> > > >> --
> > > >> Mariusz Zaborski
> > > >> oshogbo//vx             | http://oshogbo.vexillium.org
> > > >> FreeBSD committer       | https://freebsd.org
> > > >> Software developer      | http://wheelsystems.com
> > > >> If it's not broken, let's fix it till it is!!1
> > > >
> > >
> > >
> > >
