Date:      Sat, 20 Dec 2014 12:56:49 -0800
From:      Adrian Chadd <adrian@freebsd.org>
To:        "Pokala, Ravi" <rpokala@panasas.com>
Cc:        John-Mark Gurney <jmg@funkthat.com>, "freebsd-geom@freebsd.org" <freebsd-geom@freebsd.org>
Subject:   Re: Converting LBAs to byte offsets through the GEOM stack
Message-ID:  <CAJ-Vmomm2yst=NN6hYopY7DR_Nw=HDa2v-Y9xtqji8xZn5b92A@mail.gmail.com>
In-Reply-To: <D0BB136C.1280A4%rpokala@panasas.com>
References:  <D0B89F30.127DAE%rpokala@panasas.com> <20141219015210.GY25139@funkthat.com> <D0B8C76C.127E55%rpokala@panasas.com> <CAJ-VmokV3-ZRQmVZWcHUSxccwaRxySDExoSiF8%2BsgHtkHN5_yg@mail.gmail.com> <D0BB136C.1280A4%rpokala@panasas.com>

On 20 December 2014 at 11:54, Pokala, Ravi <rpokala@panasas.com> wrote:
> Hi Adrian,
>
>>So when doing stuff like this, I ended up piggybacking commands through
>>the translation layers, so stuff was done (a) in line with the rest of IO
>>processing, and (b) wouldn't suffer from stale data.
>
> Could you expand on that a little?

So say you had a geom layer that was doing bad block remapping.

It's a black box with a queue (these days it'd be a black box with locks
protecting the state, since GEOM now does direct dispatch, but ...)
where you push in IO requests for some particular offsets, and the
black box figures out which real disk / real offsets those requests
are for.

So to start with, you issue a request for block 0 from your geom black
box, and it maps it to block 0 on disk 0.

At some point it decides that it should map it to block 100 on disk 0
(or block 0 on disk 1, etc.)

The only thing that knows about the current state of the mapping is
that black box. And it's up to that black box to make sure that the IO
requests that are coming in get mapped to the right places. If you
have multiple dispatch threads sending requests to the black box,
it's up to the black box to keep the mapping ordered and consistent
across them.
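
A minimal user-space sketch of that black box, assuming a fixed-size
table and a pthread mutex standing in for whatever locking a real GEOM
class would use. The names (remap_layer, remap_translate, remap_move)
and the table size are hypothetical, for illustration only:

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

#define NBLOCKS 8                 /* hypothetical size of the logical device */

struct backing {
    int      disk;                /* which real disk */
    uint64_t block;               /* which block on that disk */
};

struct remap_layer {
    pthread_mutex_t lock;         /* protects map[] against concurrent dispatch */
    struct backing  map[NBLOCKS]; /* logical block -> current backing location */
};

/* Start with the identity mapping: logical block N is block N on disk 0. */
static void remap_init(struct remap_layer *rl)
{
    pthread_mutex_init(&rl->lock, NULL);
    for (uint64_t i = 0; i < NBLOCKS; i++) {
        rl->map[i].disk = 0;
        rl->map[i].block = i;
    }
}

/* Translate a logical block. The answer is only valid at the moment of the call. */
static struct backing remap_translate(struct remap_layer *rl, uint64_t lblock)
{
    struct backing b;

    assert(lblock < NBLOCKS);
    pthread_mutex_lock(&rl->lock);
    b = rl->map[lblock];
    pthread_mutex_unlock(&rl->lock);
    return b;
}

/* The layer decides to move a logical block somewhere else. */
static void remap_move(struct remap_layer *rl, uint64_t lblock,
                       int disk, uint64_t block)
{
    assert(lblock < NBLOCKS);
    pthread_mutex_lock(&rl->lock);
    rl->map[lblock].disk = disk;
    rl->map[lblock].block = block;
    pthread_mutex_unlock(&rl->lock);
}
```

Only code that goes through remap_translate() sees a consistent view;
anything that copies the result out and holds onto it is on its own.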

So, imagine then you want to do a reverse lookup. You ask through the
layer which disk/block backs "block 0." It tells you, "block 0,
disk 0." Now, that's valid as long as the remapping layer doesn't
change it underneath you. If it does, you won't know - so when
you send your direct-to-disk request as you said, the answer was right
at the time you did the reverse lookup, but it's not guaranteed to be
right "now."
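
To make the staleness concrete, here is a small sketch. The generation
counter is an assumption added purely for illustration (it is not
something described above), and all the names are hypothetical:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct backing {
    int      disk;
    uint64_t block;
};

/* One logical block's mapping, as held inside the remapping layer. */
struct remap_entry {
    struct backing where;
    uint64_t       gen;     /* hypothetical: bumped on every remap */
};

/* What a caller walks away with after a reverse lookup. */
struct lookup {
    struct backing where;
    uint64_t       gen;
};

/* Snapshot the current mapping; the snapshot starts ageing immediately. */
static struct lookup reverse_lookup(const struct remap_entry *e)
{
    struct lookup l = { e->where, e->gen };
    return l;
}

/* The layer moves the block; the caller is not told. */
static void remap(struct remap_entry *e, int disk, uint64_t block)
{
    e->where.disk = disk;
    e->where.block = block;
    e->gen++;
}

/* True only if no remap has happened since the snapshot was taken. */
static bool lookup_still_valid(const struct remap_entry *e,
                               const struct lookup *l)
{
    return l->gen == e->gen;
}
```

Even with a check like this, the answer can go stale between the check
and the direct-to-disk request unless the check and the IO happen
inside the layer itself.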

When I was doing this stuff, it was a kind of bad block remapping and
disk mirroring thing for caching disk blocks. So when you issued a
request for "block 0 from this provider", (a) it would map to some
arbitrary disk and arbitrary offset, (b) that mapping could change at
any point and your information would be stale, and (c) it may have
mapped to multiple backend disks, so what you really needed to do was
send that command to "all" the disks that backed that particular block.

So I had a thing that I attached commands to that would funnel down to
the geom layer that did this mirroring/caching/remapping thing, and it
would handle scheduling the commands to whatever block(s) on whatever
disk(s) actually represented that particular logical offset. I
actually had something that'd let me issue commands that would map to
a single command to a single disk, or could be replicated to multiple
commands to multiple disks. Then I'd just get the completions from
them all in the reply message, as the bio didn't have enough space to
write multiple block reads into, and mostly I was issuing status check
commands like you are. :)
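
Roughly, the fan-out looks like the sketch below. This is a synchronous
user-space toy, not the actual code: the struct names, the callback
type and the fake status check are all hypothetical, and a real layer
would resolve the mapping under its own lock and complete the commands
asynchronously.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_COPIES 4              /* hypothetical upper bound on mirrors */

struct backing {
    int      disk;
    uint64_t block;
};

/* One logical block may be backed by several (disk, block) copies. */
struct mapping {
    size_t         ncopies;
    struct backing copy[MAX_COPIES];
};

/* The reply carries one completion status per disk that was touched. */
struct cmd_reply {
    size_t nstatus;
    int    disk[MAX_COPIES];
    int    status[MAX_COPIES];
};

/* A command to run against one real disk/block; returns a status code. */
typedef int (*disk_cmd_fn)(int disk, uint64_t block);

/*
 * Run a piggybacked command inside the layer. With 'all' set, it is
 * replicated to every backing copy; otherwise only the first copy.
 */
static void layer_run_cmd(const struct mapping *m, disk_cmd_fn fn,
                          int all, struct cmd_reply *r)
{
    size_t n;

    assert(m->ncopies <= MAX_COPIES);
    n = all ? m->ncopies : (m->ncopies > 0 ? 1 : 0);
    r->nstatus = 0;
    for (size_t i = 0; i < n; i++) {
        r->disk[i] = m->copy[i].disk;
        r->status[i] = fn(m->copy[i].disk, m->copy[i].block);
        r->nstatus++;
    }
}

/* Hypothetical status check: pretend disk 1 reports an error. */
static int fake_status_check(int disk, uint64_t block)
{
    (void)block;
    return disk == 1 ? -1 : 0;
}
```

The caller never learns the disk/block pairs up front. It hands the
command to the layer and reads the per-disk results out of the reply.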

Is that making more sense? I can whiteboard it up next time we're in
the same place.



-adrian


