Date: Wed, 14 Apr 2010 18:05:54 +0200 From: Fabio Checconi <fabio@freebsd.org> To: Luigi Rizzo <rizzo@iet.unipi.it> Cc: svn-src-head@FreeBSD.org, Luigi Rizzo <luigi@FreeBSD.org>, src-committers@FreeBSD.org, Pawel Jakub Dawidek <pjd@FreeBSD.org>, svn-src-all@FreeBSD.org Subject: Re: svn commit: r206497 - in head: sbin/geom/class sbin/geom/class/sched sys/geom/sched sys/modules/geom sys/modules/geom/geom_sched sys/modules/geom/geom_sched/gs_sched sys/modules/geom/geom_sched/gsc... Message-ID: <20100414160554.GB3188@gandalf.sssup.it> In-Reply-To: <20100414081630.GA74130@onelab2.iet.unipi.it> References: <201004121637.o3CGbjSK080066@svn.freebsd.org> <20100412204926.GB1743@garage.freebsd.pl> <20100412210512.GB94885@onelab2.iet.unipi.it> <20100414074616.GA1657@garage.freebsd.pl> <20100414081630.GA74130@onelab2.iet.unipi.it>
next in thread | previous in thread | raw e-mail | index | archive | help
> From: Luigi Rizzo <rizzo@iet.unipi.it> > Date: Wed, Apr 14, 2010 10:16:30AM +0200 > > [Cc-ing Fabio as he may have more details] > ... > > BTW. So you decided to implement insert/remove functionality after all. > > I have some questions: > > > > - It is implemented as internal gsched hack, which is a pity, because > > this might be very useful functionality for other classes in the future. > > Is there a plan to make it more general and move it to the GEOM itself? > > yes there is such a plan -- of course if nobody has objections. > In principle it is only a library extensions with no modifications > to geom internals. > > However, when we developed that last year, we hit some corner case > where removal of an active node causes either a race or (if you try > to protect from the race) a livelock. Fixing this may require some > small cleanup to geom internals (we discusses the issue with phk > at last EuroBSDCon. Fabio may give you more details, as far as i > remember the problem was that some geom code takes shortcuts instead > of following a chain of pointers, and this can end up in the wrong > place in case of a removal.) > > For this reason, at this time i am not recommending to remove a > node from a chain with outstanding transactions until the issue is solved. > I'm not sure I remember all the details, the major issues were: - g_disk_done() dereferences bio->parent->bio_to->geom->softc, thus changing bio_to->geom on the fly was not possible with pending request (they would find a wrong softc when bubbling up). For this reason we added a loop in g_insert_proxy() to wait for all the pending requests to be completed prior to inserting the proxy; but: - g_slice_finish_hot() completes requests in the event handling path, thus said loop (executed from an event handler) could result in a deadlock. To avoid this (it should be far from being frequent, considering the usages of hotspots in the slice code) g_insert_proxy() fails if it takes too long to complete the old requests. - Some classes (from a quick look I've seen only g_mirror, but I'm pretty sure there were some other ones) cache pointers to their providers. With these classes this implementation does not work. For the scheduler this is not a big issue, because its natural position is as close as possible to the disk device, but makes the mechanism quite hard to use in a more generic context. > > - Why g_sched_flush_pending() operates on global structure? I think it > > will break if you try to insert and remove at the same time. > If I'm not wrong, this should be safe because the global list is used only under critical sections protected by the topology lock.
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20100414160554.GB3188>