Date:      Sun, 10 Mar 2013 12:59:46 -0700 (PDT)
From:      Don Lewis <truckman@FreeBSD.org>
To:        lev@FreeBSD.org
Cc:        freebsd-fs@FreeBSD.org, ivoras@FreeBSD.org, freebsd-geom@FreeBSD.org
Subject:   Re: Unexpected SU+J inconsistency AGAIN -- please, don't shift topic to ZFS!
Message-ID:  <201303101959.r2AJxkIg047829@gw.catspoiler.org>
In-Reply-To: <1809201254.20130309160817@serebryakov.spb.ru>

On  9 Mar, Lev Serebryakov wrote:
> Hello, Don.
> You wrote on 9 March 2013, 7:03:52:
> 
> 
>>>     But anyway, a torrent client is a bad benchmark if we start to
>>>   speak about real experiments to decide what could be improved in
>>>   the FFS/GEOM stack, as it is not very repeatable.
> DL> I seem to recall that you mentioning that the raid5 geom layer is doing
> DL> a lot of caching, presumably to coalesce writes.  If this causes the
> DL> responses to writes to be delayed too much, then the geom layer could
> DL> end up starved for writes because the vfs.hirunningspace limit will be
> DL> reached.  If this happens, you'll see threads waiting on wdrain.  You
> DL> could also monitor vfs.runningbufspace to see how close it is getting to
> DL> the limit.  If this is the problem, you might want to try cranking up
>   Strangely  enough,  vfs.runningbufspace  is  always zero, even under
> load.

That's very odd ...
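
If you want to keep an eye on it over time rather than sampling
sysctl(8) by hand, a trivial poller along these lines should do
(an untested sketch; both OIDs should be longs, but check the
reported sizes on your system):

#include <sys/types.h>
#include <sys/sysctl.h>
#include <stdio.h>
#include <unistd.h>

int
main(void)
{
	long running, hi;
	size_t len;

	for (;;) {
		/* Current amount of outstanding write I/O. */
		len = sizeof(running);
		if (sysctlbyname("vfs.runningbufspace", &running, &len,
		    NULL, 0) == -1)
			return (1);
		/* Limit at which writers start waiting on wdrain. */
		len = sizeof(hi);
		if (sysctlbyname("vfs.hirunningspace", &hi, &len,
		    NULL, 0) == -1)
			return (1);
		printf("runningbufspace %ld / hirunningspace %ld\n",
		    running, hi);
		sleep(1);
	}
}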

> My geom_raid5 is configured to delay writes for up to 15 seconds...
> 
> DL> Something else to look at is what problems might the delayed write
> DL> completion notifications from the drives cause in the raid5 layer
> DL> itself.  Could that be preventing the raid5 layer from sending other I/O
> DL> commands to the drives?   Between the time a write command has been sent
>    Nope. It should not. I'm not 100% sure, as I picked up these
> sources from the original author and they are rather cryptic, but I
> could not see any throttling in them.
> DL> to a drive and the drive reports the completion of the write, what
> DL> happens if something wants to touch that buffer?
> 
> DL> What size writes does the application typically do?  What is the UFS
>   64K writes, 32K blocksize, 128K stripe size... Now I'm analyzing
> traces from this device to understand exact write patterns.

It would be interesting to see what percentage of the writes are
full-stripe versus partial-stripe, to gauge how effective the caching
is.  The partial-stripe writes probably have to read the parity in
order to update it.
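
For example, with a 128K stripe size a full-stripe write on an N-disk
raid5 covers 128K * (N - 1) bytes of data and has to start on that
boundary, so your raw 64K writes can only become full-stripe writes
after the cache coalesces them.  A throwaway classifier for a trace
post-processor could look something like this (the 5-disk count is
just an assumption for illustration):

#include <sys/types.h>
#include <stdio.h>

/*
 * Classify one write from a trace as full-stripe or partial-stripe.
 * stripesize is the per-disk stripe size; ndisks is the total number
 * of disks in the raid5 array, so one disk's worth (the parity) is
 * subtracted to get the amount of data in a full stripe.
 */
static int
is_full_stripe(off_t offset, off_t length, off_t stripesize, int ndisks)
{
	off_t fullwidth = stripesize * (ndisks - 1);

	return (length != 0 && offset % fullwidth == 0 &&
	    length % fullwidth == 0);
}

int
main(void)
{
	/* 128K stripes on a hypothetical 5-disk array. */
	off_t stripesize = 128 * 1024;
	int ndisks = 5;

	printf("64K write at 0: %s\n",
	    is_full_stripe(0, 64 * 1024, stripesize, ndisks) ?
	    "full" : "partial");
	printf("512K write at 0: %s\n",
	    is_full_stripe(0, 512 * 1024, stripesize, ndisks) ?
	    "full" : "partial");
	return (0);
}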

It would also be interesting to monitor the number of commands each
drive is handling.  In ahci.c, it looks like the relevant counter is
ch->numrslots, assuming that you aren't using a port multiplier.
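
numrslots itself isn't visible from userland short of DTrace, but the
per-device count of operations issued and not yet completed that
devstat keeps (the same counters iostat and gstat read) should be a
reasonable proxy.  Something like this, untested, using devstat(3) and
linked with -ldevstat:

#include <sys/types.h>
#include <devstat.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int
main(void)
{
	struct statinfo stats;
	int i;

	memset(&stats, 0, sizeof(stats));
	stats.dinfo = calloc(1, sizeof(struct devinfo));
	if (stats.dinfo == NULL)
		return (1);

	for (;;) {
		/* Grab a fresh snapshot of all devstat devices. */
		if (devstat_getdevs(NULL, &stats) == -1) {
			fprintf(stderr, "%s\n", devstat_errbuf);
			return (1);
		}
		for (i = 0; i < stats.dinfo->numdevs; i++) {
			struct devstat *ds = &stats.dinfo->devices[i];

			/* Operations issued but not yet completed. */
			printf("%s%d: %ju outstanding\n",
			    ds->device_name, ds->unit_number,
			    (uintmax_t)(ds->start_count - ds->end_count));
		}
		printf("\n");
		sleep(1);
	}
}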

> DL> blocksize?  What is the raid5 stripe size?  With this access pattern,
> DL> you may get poor results if the stripe size is much greater than the
> DL> block and write sizes.
> 



