Skip site navigation (1)Skip section navigation (2)
Date:      Thu, 15 Jan 2015 11:38:41 +0200
From:      Konstantin Belousov <kostikbel@gmail.com>
To:        Hans Petter Selasky <hps@selasky.org>
Cc:        svn-src-head@freebsd.org, svn-src-all@freebsd.org, src-committers@freebsd.org
Subject:   Re: svn commit: r277199 - in head/sys: fs/devfs kern
Message-ID:  <20150115093841.GX42409@kib.kiev.ua>
In-Reply-To: <54B76F2B.8040106@selasky.org>
References:  <201501142207.t0EM7Dfn041543@svn.freebsd.org> <20150115033109.GM42409@kib.kiev.ua> <54B76F2B.8040106@selasky.org>

next in thread | previous in thread | raw e-mail | index | archive | help
On Thu, Jan 15, 2015 at 08:41:31AM +0100, Hans Petter Selasky wrote:
> On 01/15/15 04:31, Konstantin Belousov wrote:
> > On Wed, Jan 14, 2015 at 10:07:13PM +0000, Hans Petter Selasky wrote:
> >> Author: hselasky
> >> Date: Wed Jan 14 22:07:13 2015
> >> New Revision: 277199
> >> URL: https://svnweb.freebsd.org/changeset/base/277199
> >>
> >> Log:
> >>    Avoid race with "dev_rel()" when using the recently added
> >>    "delist_dev()" function. Make sure the character device structure
> >>    doesn't go away until the end of the "destroy_dev()" function due to
> >>    concurrently running cleanup code inside "devfs_populate()".
> >>
> >>    MFC after:	1 week
> >>    Reported by:	dchagin@
> >>
> >> Modified:
> >>    head/sys/fs/devfs/devfs_devs.c
> >>    head/sys/kern/kern_conf.c
> >>
> >> Modified: head/sys/fs/devfs/devfs_devs.c
> >> ==============================================================================
> >> --- head/sys/fs/devfs/devfs_devs.c	Wed Jan 14 22:05:57 2015	(r277198)
> >> +++ head/sys/fs/devfs/devfs_devs.c	Wed Jan 14 22:07:13 2015	(r277199)
> >> @@ -137,6 +137,12 @@ devfs_alloc(int flags)
> >>   	vfs_timestamp(&ts);
> >>   	cdev->si_atime = cdev->si_mtime = cdev->si_ctime = ts;
> >>   	cdev->si_cred = NULL;
> >> +	/*
> >> +	 * Avoid race with dev_rel() by setting the initial
> >> +	 * reference count to 1. This last reference is taken
> >> +	 * by the destroy_dev() function.
> >> +	 */
> >> +	cdev->si_refcount = 1;
> > This is wrong.  Not all devices are destroyed with destroy_dev().
> > dev_rel() must be allowed to clean up allocated device.
> >
> > That said, I do not understand what race you are trying to solve.
> > Freeing of the accessible cdev memory cannot happen in parallel while
> > dev_mtx is owned.
> >
> > Please do not commit (to devfs) without seeking for the review first.
> 
> Hi Konstantin,
> 
>  From my analysis there are basically three ways for a cdev to die:
> 
> 1) Through dev_free_devlocked()
> 2) Through destroy_devl() which then later calls dev_free_devlocked()
> 3) Through destroy_dev_sched() which really is a wrapper around 
> destroy_devl().
You only look from the consumers PoV.  Devfs cdev can be dereferenced
because e.g. clone handler decides that cdev is not valid/needed,
and now the memory is never freed due to extra reference.

Do not assume that all cdevs go through destroy_dev().

> 
> In the case of direct free through #1, the reference count is ignored 
> and it doesn't matter if it is one or zero. Only in the case of 
> destruction through destroy_dev() it matters.
> 
> Like the comment says in destroy_devl():
> 
> /* Avoid race with dev_rel() */
> 
> The problem is that the "cdev->si_refcount" is zero when the initial
> devfs_create() is called. Then one ref is made. When we clear the 
> CDP_ACTIVE flag in devfs_destroy() it instructs a !parallel! running 
> process to destroy all the FS related structures and the reference count 
> goes back to zero when the "cdp" is removed from the "cdevp_list". Then 
> the cdev is freed too early. This happens because destroy_devl() is 
> dropping the dev_lock() to sleep waiting for pending references.
Basically, this is very good explanation why your delist hack is wrong,
for one of the reason.  Another reason is explained below.
You are trying to cover it with additional reference, but this is wrong
as well.

> 
> Do you see something else?

I think that what you are trying to do with the CDP_ACTIVE hack is doomed
anyway, because you are allowing for devfs directory to have two entries
with the same name, until the populate loop cleans up the inactive one.
In the meantime, any access to the directory operates on random entry.

The checks for existent names in make_dev() are performed for the reason,
and you makes the rounds to effectively ignore it.



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20150115093841.GX42409>