3.  Implementing a struct bio

      The first decision to be made was who got to use the name "struct buf". Since it is the I/O aspect which gets separated out, and it only covers about ¼ of the bytes in struct buf, the new structure for the I/O aspect obviously gets the new name. Examining the naming in the kernel, the "bio" prefix seemed a given; for instance, the function to signal completion of an I/O request is already named "biodone()".

      Making the transition smooth is obviously also a priority and after some prototyping [note 2] it was found that a totally transparent transition could be made by embedding a copy of the new "struct bio" as the first element of "struct buf" and by using cpp(1) macros to alias the fields to the legacy struct buf names.

3.1.  The b_flags problem.

      Struct bio was defined by examining all code existing in the driver tree and finding all the struct buf fields which were legitimately used (as opposed to "hijacked" fields). One field was found to have "dual-use": the b_flags field. This required special attention. Examination showed that b_flags was used for three things:

+ Communication of the I/O command (READ, WRITE, FORMAT, DELETE)

+ Communication of ordering and error status

+ General status for non I/O aspect consumers of struct buf.

      For historic reasons B_WRITE was defined to be zero, which led to confusion and bugs. This pushed the decision to have a separate "b_iocmd" field in struct buf and struct bio for communicating only the action to be performed.
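
      To illustrate why a zero-valued B_WRITE invites bugs, here is a minimal sketch. The names B_WRITE_OLD, B_READ_OLD, CMD_READ and CMD_WRITE and their values are hypothetical stand-ins chosen for this example, not the kernel's definitions.

```c
#include <assert.h>

/* Hypothetical legacy-style flags: the write "flag" is zero. */
#define B_WRITE_OLD 0x00000000L
#define B_READ_OLD  0x00100000L

/* Hypothetical command codes in the style of a separate b_iocmd field. */
#define CMD_READ  1u
#define CMD_WRITE 2u

/* The obvious test: masking with zero can never be true. */
static int is_write_buggy(long flags)
{
    return (flags & B_WRITE_OLD) != 0;
}

/* The test that actually works: "write" means "B_READ is not set". */
static int is_write_legacy(long flags)
{
    return (flags & B_READ_OLD) == 0;
}

/* With a dedicated command field the test is a plain comparison. */
static int is_write_cmd(unsigned cmd)
{
    return cmd == CMD_WRITE;
}

/* Usage sketch: the pitfall stated as assertions. */
static void flag_demo(void)
{
    long flags = B_WRITE_OLD;          /* a write request, legacy style */

    assert(!is_write_buggy(flags));    /* the obvious test misses it */
    assert(is_write_legacy(flags));    /* must test for absence of B_READ */
    assert(is_write_cmd(CMD_WRITE));   /* command field: direct compare */
}
```

      With a separate command field, every operation has its own non-zero value and no operation has to be inferred from the absence of another.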

      The ordering and error status bits were put in a new flag field "b_ioflag". This has left sufficiently many now unused bits in b_flags that the b_xflags element can now be merged back into b_flags.

3.2.  Definition of struct bio

      With the cleanup of b_flags in place, the definition of struct bio looks like this:

struct bio {
    u_int bio_cmd; /* I/O operation. */
    dev_t bio_dev; /* Device to do I/O on. */
    daddr_t bio_blkno; /* Underlying physical block number. */
    off_t bio_offset; /* Offset into file. */
    long bio_bcount; /* Valid bytes in buffer. */
    caddr_t bio_data; /* Memory, superblocks, indirect etc. */
    u_int bio_flags; /* BIO_ flags. */
    struct buf *_bio_buf; /* Parent buffer. */
    int bio_error; /* Errno for BIO_ERROR. */
    long bio_resid; /* Remaining I/O in bytes. */
    void (*bio_done) __P((struct buf *));
    void *bio_driver1; /* Private use by the callee. */
    void *bio_driver2; /* Private use by the callee. */
    void *bio_caller1; /* Private use by the caller. */
    void *bio_caller2; /* Private use by the caller. */
    TAILQ_ENTRY(bio) bio_queue; /* Disksort queue. */
    daddr_t bio_pblkno; /* physical block number */
    struct iodone_chain *bio_done_chain;
};

3.3.  Definition of struct buf

      After adding a struct bio to struct buf and aliasing the fields into it, struct buf looks like this:

struct buf {
    /* XXX: b_io must be the first element of struct buf for now /phk */
    struct bio b_io; /* "Builtin" I/O request. */
#define b_bcount b_io.bio_bcount
#define b_blkno b_io.bio_blkno
#define b_caller1 b_io.bio_caller1
#define b_caller2 b_io.bio_caller2
#define b_data b_io.bio_data
#define b_dev b_io.bio_dev
#define b_driver1 b_io.bio_driver1
#define b_driver2 b_io.bio_driver2
#define b_error b_io.bio_error
#define b_iocmd b_io.bio_cmd
#define b_iodone b_io.bio_done
#define b_iodone_chain b_io.bio_done_chain
#define b_ioflags b_io.bio_flags
#define b_offset b_io.bio_offset
#define b_pblkno b_io.bio_pblkno
#define b_resid b_io.bio_resid
    LIST_ENTRY(buf) b_hash; /* Hash chain. */
    TAILQ_ENTRY(buf) b_vnbufs; /* Buffer's associated vnode. */
    TAILQ_ENTRY(buf) b_freelist; /* Free list position if not active. */
    TAILQ_ENTRY(buf) b_act; /* Device driver queue when active. *new* */
    long b_flags; /* B_* flags. */
    unsigned short b_qindex; /* buffer queue index */
    unsigned char b_xflags; /* extra flags */

      Putting the struct bio as the first element in struct buf during a transition period allows a pointer to either to be cast to a pointer to the other. This means that certain pieces of code can be left un-converted with the use of a couple of casts while the remaining pieces of code are tested. The ccd and vinum modules have been left un-converted like this for now.
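
      The embedding, aliasing and casting described above can be sketched with cut-down structures. The names mini_bio, mini_buf, mini_strategy and mini_demo are hypothetical and exist only for this illustration; the error value 22 is an arbitrary stand-in.

```c
#include <assert.h>
#include <stddef.h>

/* Cut-down, hypothetical stand-in for the I/O part. */
struct mini_bio {
    long bio_bcount;
    int  bio_error;
};

/* Cut-down, hypothetical stand-in for the buffer; b_io must be first. */
struct mini_buf {
    struct mini_bio b_io;
    long            b_flags;
};

/* Alias the legacy field names into the embedded structure. */
#define b_bcount b_io.bio_bcount
#define b_error  b_io.bio_error

/* A "driver" that only knows about the I/O part. */
static void mini_strategy(struct mini_bio *bip)
{
    bip->bio_error = (bip->bio_bcount > 0) ? 0 : 22;
}

/* Legacy-style caller: uses the old names and a cast. */
static int mini_demo(void)
{
    struct mini_buf b;

    b.b_flags = 0;
    b.b_bcount = 512;   /* expands to b.b_io.bio_bcount */
    b.b_error = -1;     /* expands to b.b_io.bio_error */

    /* The cast is only valid because b_io sits at offset 0. */
    assert(offsetof(struct mini_buf, b_io) == 0);
    mini_strategy((struct mini_bio *)&b);

    return b.b_error;
}
```

      The offsetof() check documents the one invariant the casts rely on; if the embedded structure ever moves away from the front, the casts silently break while the macros keep working.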

      This is basically where FreeBSD-current stands today.

      The next step is to substitute struct bio for struct buf in all the code which only cares about the I/O aspect: device drivers, diskslice/label. The patch to do this is up for review [note 3] and consists mainly of systematic substitutions like these:

s/struct buf/struct bio/
&c &c

3.4.  Future work

      It can be successfully argued that the cpp(1) macros used for aliasing above are ugly and should be expanded in place. It would certainly be trivial to do so, but not by definition worthwhile.

      Retaining the aliasing for the b_* and bio_* name-spaces this way leaves us with considerable flexibility in modifying the future interaction between the two. The DEV_STRATEGY() macro is the single point where a struct buf is turned into a struct bio and launched into the drivers to fulfill the I/O request, and this provides us with a single isolated location for performing non-trivial translations.

      As an example of this flexibility: It has been proposed to essentially drop the b_blkno field and use the b_offset field to communicate the on-disk location of the data. b_blkno is a 32-bit offset in units of DEV_BSIZE (512-byte) sectors, which allows us to address two terabytes worth of data. Using b_offset as a 64-bit byte-address would not only allow us to address disks 8 million times larger, it would also make it possible to accommodate disks which use a non-power-of-two sector size, Audio CD-ROMs for instance.
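
      The figures quoted above can be checked with a little arithmetic. The sketch below treats both fields as unsigned for simplicity, and uses 2352 bytes as the raw audio CD sector size; BLK_UNIT and the function names are hypothetical and chosen for this example.

```c
#include <assert.h>
#include <stdint.h>

#define BLK_UNIT 512u   /* the 512-byte block unit quoted in the text */

/* Bytes addressable with a 32-bit block number: 2^32 * 512 = 2^41. */
static uint64_t blkno_limit_bytes(void)
{
    return ((uint64_t)1 << 32) * BLK_UNIT;
}

/*
 * Gain from a 64-bit byte offset: 2^64 / 2^41 = 2^23.
 * 2^64 does not fit in a uint64_t, so work with the exponents.
 */
static uint64_t offset_gain_factor(void)
{
    return (uint64_t)1 << (64 - 41);
}

/* Byte offset of a sector on a device with an arbitrary sector size. */
static uint64_t sector_to_offset(uint64_t sector, uint32_t sector_size)
{
    assert(sector_size > 0);
    return sector * sector_size;
}

/* Can this byte offset be expressed as a whole number of 512-byte blocks? */
static int fits_blk_unit(uint64_t byte_offset)
{
    return byte_offset % BLK_UNIT == 0;
}
```

      Here blkno_limit_bytes() yields 2^41 bytes (two terabytes) and offset_gain_factor() yields 2^23, roughly 8 million. The second 2352-byte sector starts at byte 2352, which is not a multiple of 512, so a block number in 512-byte units cannot name it while a byte offset can.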

      The above mentioned flexibility makes an implementation almost trivial:

+ Add code to DEV_STRATEGY() to populate b_offset from b_blkno in the cases where it is not valid. Today it is only valid for a struct buf marked B_PHYS.

+ Change diskslice/label, ccd, vinum and device drivers to use b_offset instead of b_blkno.

+ Remove the bio_blkno field from struct bio, add it to struct buf as b_blkno and remove the cpp(1) macro which aliased it into struct bio.
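
      The first step in the list above might look roughly like the following. This is a sketch under assumptions: struct sketch_req, SKETCH_BSIZE and the SKETCH_NOOFFSET "offset not valid" marker are hypothetical, and the real DEV_STRATEGY() macro is not reproduced here.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_BSIZE    512      /* stand-in for the 512-byte block unit */
#define SKETCH_NOOFFSET (-1LL)   /* hypothetical "offset not valid" marker */

/* Hypothetical request carrying both addressing fields. */
struct sketch_req {
    int32_t blkno;    /* legacy 32-bit block number */
    int64_t offset;   /* 64-bit byte offset, or SKETCH_NOOFFSET */
};

/*
 * Populate the byte offset from the block number when the caller has
 * not supplied a valid one; leave a valid offset untouched.
 */
static void sketch_fill_offset(struct sketch_req *r)
{
    assert(r->blkno >= 0);
    if (r->offset == SKETCH_NOOFFSET) {
        /* Widen before multiplying so large block numbers do not overflow. */
        r->offset = (int64_t)r->blkno * SKETCH_BSIZE;
    }
}
```

      Doing this once at the hand-off point means the drivers can rely on the offset always being valid, regardless of which field the caller filled in.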

      Another possible transition could be to not have a "built-in" struct bio in struct buf. If for some reason struct bio grows fields of no relevance to struct buf it might be cheaper to remove struct bio from struct buf, un-alias the fields and have DEV_STRATEGY() allocate a struct bio and populate the relevant fields from struct buf. This would also be entirely transparent to both users of struct buf and struct bio as long as we retain the aliasing mechanism and DEV_STRATEGY().
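
      A separately allocated I/O request, as contemplated above, could be populated along these lines. All names here (x_buf, x_bio, x_bio_from_buf) are hypothetical, the field set is cut down, and malloc() stands in for whatever allocator the kernel would actually use.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical buffer with no embedded I/O request. */
struct x_buf {
    unsigned  iocmd;
    long long offset;
    long      bcount;
    char     *data;
    long      flags;    /* non-I/O state; not copied into the request */
};

/* Hypothetical stand-alone I/O request. */
struct x_bio {
    unsigned      cmd;
    long long     offset;
    long          bcount;
    char         *data;
    struct x_buf *parent;   /* back-pointer for completion handling */
};

/*
 * Allocate a request and copy the I/O-relevant fields from the buffer.
 * Returns NULL if the allocation fails; the caller frees the result.
 */
static struct x_bio *x_bio_from_buf(struct x_buf *bp)
{
    struct x_bio *bip;

    assert(bp != NULL);
    bip = malloc(sizeof(*bip));
    if (bip == NULL)
        return NULL;

    bip->cmd = bp->iocmd;
    bip->offset = bp->offset;
    bip->bcount = bp->bcount;
    bip->data = bp->data;
    bip->parent = bp;
    return bip;
}
```

      The cost of this variant is an allocation and a copy per request, which is the trade-off the paragraph above weighs against carrying unused fields in every struct buf.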