Skip site navigation (1)Skip section navigation (2)
Date:      Wed, 20 Dec 2017 15:32:48 -0500
From:      Mark Johnston <markj@freebsd.org>
To:        Warner Losh <imp@bsdimp.com>
Cc:        src-committers <src-committers@freebsd.org>, "svn-src-all@freebsd.org" <svn-src-all@freebsd.org>, "svn-src-head@freebsd.org" <svn-src-head@freebsd.org>
Subject:   Re: svn commit: r326731 - head/sys/ufs/ffs
Message-ID:  <20171220203248.GB11032@raichu>
In-Reply-To: <20171210031450.GB15275@raichu>
References:  <201712091544.vB9FiVUI096790@repo.freebsd.org> <a1d303b4-14b8-6a3d-3648-96a69bed7144@FreeBSD.org> <CANCZdfoM0xhtFsajOQA2RB1FkX0y83z=vUUk72jcQNKqrdGkYQ@mail.gmail.com> <20171210031450.GB15275@raichu>

next in thread | previous in thread | raw e-mail | index | archive | help
On Sat, Dec 09, 2017 at 10:14:50PM -0500, Mark Johnston wrote:
> On Sat, Dec 09, 2017 at 07:36:59PM -0700, Warner Losh wrote:
> > On Sat, Dec 9, 2017 at 11:03 AM, Andriy Gapon <avg@freebsd.org> wrote:
> > 
> > > On 09/12/2017 17:44, Mark Johnston wrote:
> > > > Some GEOMs do not appear to handle BIO_ORDERED correctly, meaning that
> > > the
> > > > barrier write may not work as intended.
> > 
> > 
> > There's a few places we send down a BIO_ORDERED BIO_FLUSH command
> > (see softdep_synchronize for one). Will those matter?
> 
> Some classes have separate handling for BIO_FLUSH, so it depends. I
> think gmirror's handling is buggy independent of BIO_ORDERED:
> g_mirror_start() sends BIO_FLUSH commands directly to the mirrors,
> while reads and writes are queued for handling by the gmirror worker
> thread. So as far as I can tell, a BIO_WRITE which arrives at the
> gmirror provider before a BIO_FLUSH might be sent to the mirrors
> after that BIO_FLUSH. I would expect BIO_FLUSH to implicitly have
> something like release semantics, i.e., a BIO_FLUSH shouldn't be
> reordered with a BIO_WRITE that preceded it. But I might be
> misunderstanding.
> 
> > As I've noted elsewhere: I'd really like to kill BIO_ORDERED since it has
> > too many icky effects (and BIO_FLUSH + BIO_ORDERED isn't guaranteed to do,
> > well, anything since it can turn into a NOP in a number of places. Plus
> > many of the implementations of BIO_ORDERED assume the drive is like SCSI
> > and you just set the right tag to make it 'ordered'. For ATA we issue a non
> > NCQ command, which is a full drain of outstanding commands, send this
> > command, then start them again which really shuts down the parallelism we
> > implemented NCQ for :(. We do similar for NVME which is even worse. There
> > we have multiple submission queues in the hardware. To simulated it, we do
> > a similar drain, but that's going to get in the way as we move to NUMA and
> > systems where we try to do the I/O entirely on one CPU (both submission and
> > completion) and ordered I/O is guaranteed lock contention.
> 
> Independent of this, it doesn't really look like we have any way of
> handling write errors when dependencies are enforced using BIO_ORDERED.
> In the case of the babarrierwrite() consumer in FFS, what happens if the
> inode block write fails due to a transient error, but the following CG
> update succeeds?

Looking at this some more, I think iosched is ok since bioq_disksort()
does in fact handle BIO_ORDERED. I'm concerned though that there are
places in the tree where we should be using BIO_ORDERED but aren't.
gmirror metadata writes for instance are performed synchronously but
should not be reordered with preceding BIO_WRITEs.

Anyway, I've posted a review for some gmirror I/O ordering fixes here if
anyone is interested: https://reviews.freebsd.org/D13559



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20171220203248.GB11032>