Date:      Wed, 31 Jan 1996 17:02:55 +73600 (PST)
From:      Julian Elischer <julian@ref.tfs.com>
To:        terry@lambert.org (Terry Lambert)
Cc:        bde@zeta.org.au, freebsd-current@freebsd.org, j@uriah.heep.sax.de
Subject:   Re: invalid primary partition table: no magic
Message-ID:  <199602010102.RAA11946@ref.tfs.com>
In-Reply-To: <199601312152.OAA10765@phaeton.artisoft.com> from "Terry Lambert" at Jan 31, 96 02:52:43 pm

 Terry, will you stop stealing my ideas? :)
I have code for some of this written.
Peter and I went over it in Perth a few weeks ago..

I think it basically implements exactly what you are talking about here..
there are a few problems however....
consider a BSD partition put at the beginning of a disk..

at the beginning of the disk we therefore have an fdisk slice AND a disklabel
slice... which gets to claim the disk? :)
Possibly the fdisk-slice probe() method knows NOT to grab control if a
BSD subslice starts at 0 :)

> 
> Each logical device has a physical device, a start sector, and a length,
> at a bare minimum.  Some logical devices substitute a pseudo-device for
> the physical device, and the pseudo-device performs operations on the
> real physical device on the requester's behalf.  Like bad sector mapping,
> or 64 sector offsetting for OnTrack DiskManager 6.x or 7.x (neither shown
> in the above example).
> 
> As long as we realize that directories are accessed via getdents and
> files via read/write, we can still treat the devices as raw devices
> even though they show up as directory vnodes.  That means we don't
> need /dev/dsk/wc0, /dev/dsk/wc0/t0, and /dev/wc0/t0/p0 to be separate
> devices to let us access the controller, the first raw disk (including
> partition table), or the VFAT partition, respectively.  We can use
> the directories as devices.
> 
> Because we get range restriction guarantees, FS events in the kernel
> on the /dev/wc0/t0/p0 device can't screw up the contents of any other
> slice, period.  No matter what bad calculations the MSDOSFS makes.

I THINK we might have that now anyhow.. My guess is that our problem isn't just
disk overshoots.

> 
> 
> So now we have the device/slice mess straightened out.
> 
> (*) One thing we may want to consider is that the "no claimant" cases
>     above are nearly the perfect mechanism to cause a callback to the
>     VFS code to ask each VFS if it wants to "claim" a device -- causing
>     it to be mounted.

I don't know about that.. we don't know where to mount it..
such leaf nodes would probably be given some sort of descriptive name,
however, describing what they are...

 The way I've been looking at it 
is that there are many stackable "disk-like object" drivers that have a bunch
of methods. The default method is to simply supply an offset and the handle
to the next layer down, however there are at LEAST the following methods
available:

probe
attach
doIO   <------ These two are really mutually exclusive
offset <------ This is a special case of doIO for common simple cases
parent <------ not used for such things as CCD drivers

so that if type.doIO is NULL then you simply add type.offset and switch to
type.parent, which might in turn have a doIO or offset, etc.
Eventually you hit the methods that were exported by the physical
device driver. Devices always have a doIO method.
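
A rough sketch of that walk, to make the idea concrete. This is not the
code I have written; the struct layout, the names dlo, dlo_io and blkno_t,
and the fake_disk_io stand-in for a physical driver are all invented here
for illustration.

```c
#include <assert.h>
#include <stddef.h>

typedef long blkno_t;            /* hypothetical block-number type */

struct dlo;                      /* hypothetical "disk-like object" */

/* doIO method: transfer nblks blocks starting at blk of this object. */
typedef int (*doio_fn)(struct dlo *self, blkno_t blk, void *buf, size_t nblks);

struct dlo {
    doio_fn     doIO;            /* NULL for simple offset-only layers */
    blkno_t     offset;          /* added to blk when doIO is NULL */
    struct dlo *parent;          /* next layer down; NULL at the device */
};

/* Walk down the stack, folding in offsets, until a layer supplies doIO. */
static int
dlo_io(struct dlo *d, blkno_t blk, void *buf, size_t nblks)
{
    while (d != NULL && d->doIO == NULL) {
        blk += d->offset;
        d = d->parent;
    }
    /* Devices always have a doIO method, so the walk must end on one. */
    assert(d != NULL);
    return d->doIO(d, blk, buf, nblks);
}

/* Stand-in for a physical driver: records the absolute block requested. */
static blkno_t fake_last_blk = -1;

static int
fake_disk_io(struct dlo *self, blkno_t blk, void *buf, size_t nblks)
{
    (void)self; (void)buf; (void)nblks;
    fake_last_blk = blk;
    return 0;
}
```

With a partition at offset 1000 inside a slice at offset 63 on such a
device, a request for block 5 of the partition reaches the driver as
block 1068.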

basically
1) when you register a new 'disk-like' object, the 
	'disk-object' handler creates a DEVFS entry for it and 
	calls the 'probe' method of all known
	types until one says "I can handle this".
2) the new method is 'stacked' and its 'attach' method is called.
3) The attach method will 'register' any sub-partitions it finds 
	(goto 1 for each such sub-partition).
4) Any sub-partition that doesn't have a 'claimant'
	still has its devfs entry, which becomes the only source of
	actions.
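
Steps 1 to 4 could be sketched roughly as below. Everything named here
(dlo, dlo_type, dlo_register, dlo_add_type and the toy "fdisk" type that
claims whole disks and registers two fixed sub-parts) is made up for the
example, and the devfs entry creation is only marked by a comment.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_TYPES 8

struct dlo;

struct dlo_type {
    const char *name;
    int  (*probe)(struct dlo *d);    /* nonzero means "I can handle this" */
    void (*attach)(struct dlo *d);   /* may register sub-partitions */
};

struct dlo {
    const char            *name;     /* would become the devfs entry name */
    struct dlo            *parent;   /* enclosing object, NULL for a disk */
    const struct dlo_type *type;     /* NULL while there is no claimant */
};

static const struct dlo_type *known_types[MAX_TYPES];
static int ntypes;

/* Add a type to the list consulted at probe time; -1 if the list is full. */
static int
dlo_add_type(const struct dlo_type *t)
{
    if (ntypes >= MAX_TYPES)
        return -1;
    known_types[ntypes++] = t;
    return 0;
}

static void
dlo_register(struct dlo *d)
{
    int i;

    assert(d != NULL);
    /* Step 1: the devfs entry for d would be created here. */
    d->type = NULL;
    for (i = 0; i < ntypes; i++) {
        if (known_types[i]->probe(d)) {
            d->type = known_types[i];   /* step 2: stack the type */
            d->type->attach(d);         /* step 3: may call back into us */
            return;
        }
    }
    /* Step 4: no claimant, so the devfs entry is all there is. */
}

/* Toy type: claims only top-level objects and registers two sub-parts. */
static struct dlo toy_slices[2] = {
    { "disk0s1", NULL, NULL },
    { "disk0s2", NULL, NULL },
};

static int
toy_probe(struct dlo *d)
{
    return d->parent == NULL;
}

static void
toy_attach(struct dlo *d)
{
    int i;

    for (i = 0; i < 2; i++) {
        toy_slices[i].parent = d;
        dlo_register(&toy_slices[i]);   /* goto 1 for each sub-partition */
    }
}

static const struct dlo_type toy_fdisk = { "toy-fdisk", toy_probe, toy_attach };
```

In this toy setup the two sub-parts end up with no claimant, which is the
case 4 situation.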

Notes:
A 'type' might be a CCD driver, which recognises a label saying
"part 4 of a 5 part volume"

Every time you register a new 'disk-like' device, a structure is allocated
and the 'next' ID is incremented. An entry is put into a hash table so
that the structure can be easily located, given that ID number.
The ID number is the minor number..

This means that there is no encoding of bits in minor numbers.
It also means that minors might be different each time you boot
(that's why devfs).. 
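
As an illustration of the ID scheme only, here is a minimal sketch using a
chained hash table keyed on the sequential ID. The table size, the names
dlo_alloc and dlo_lookup, and the use of calloc (a kernel version would use
the kernel allocator) are assumptions for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define DLO_HASH_SIZE 64             /* arbitrary bucket count */

struct dlo {
    unsigned    id;                  /* doubles as the minor number */
    struct dlo *hash_next;           /* chain within one bucket */
};

static struct dlo *dlo_hash[DLO_HASH_SIZE];
static unsigned    dlo_next_id;      /* the 'next' ID to hand out */

/* Find the structure for a given ID (minor number), or NULL. */
static struct dlo *
dlo_lookup(unsigned id)
{
    struct dlo *d;

    for (d = dlo_hash[id % DLO_HASH_SIZE]; d != NULL; d = d->hash_next)
        if (d->id == id)
            return d;
    return NULL;
}

/* Allocate a structure, give it the next ID, and enter it in the table. */
static struct dlo *
dlo_alloc(void)
{
    struct dlo *d = calloc(1, sizeof(*d));
    unsigned bucket;

    if (d == NULL)
        return NULL;
    d->id = dlo_next_id++;
    assert(dlo_lookup(d->id) == NULL);   /* sequential IDs are not reused */
    bucket = d->id % DLO_HASH_SIZE;
    d->hash_next = dlo_hash[bucket];
    dlo_hash[bucket] = d;
    return d;
}
```

Nothing in the ID says what the object is or where it sits, so a lookup by
minor is the only way from the number back to the structure.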

The whole thing hangs off a NEW major number
and might be done in parallel with the existing system for a while..
All disk-like parts from all devices would have the same major number
but would have different names under devfs and would have different minors
(allocated sequentially).

There would be methods to allow a 'disk-like' part to subdivide itself
assuming all existing parts were closed. I haven't done or worked out this
bit yet, though it might be enough to allow each method to 'spy'
on writes so that they can figure out when a new partition table (or whatever)
is being written.
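
Since this bit is not worked out, the following is only one guess at what
'spying' on writes might look like: each layer remembers where its table
lives and flags itself when a write overlaps that range. The struct and
function names are hypothetical.

```c
#include <assert.h>

/* Hypothetical per-layer record of where its partition table lives. */
struct table_spy {
    long table_blk;      /* first block of the table, relative to the object */
    long table_nblks;    /* number of blocks the table occupies */
    int  stale;          /* set once a write has touched the table */
};

/*
 * Called for every write passing through the layer.
 * Returns 1 if the write overlaps the table, 0 otherwise.
 */
static int
spy_on_write(struct table_spy *s, long blk, long nblks)
{
    assert(nblks > 0 && s->table_nblks > 0);
    if (blk < s->table_blk + s->table_nblks && s->table_blk < blk + nblks) {
        s->stale = 1;    /* table may have changed; re-probe once closed */
        return 1;
    }
    return 0;
}
```

What to do once the flag is set (re-probe, wait for all parts to be
closed, and so on) is exactly the part still to be decided.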

Interestingly enough, when Peter and I did a white-board session
over this we decided that there were really
two classes of methods. 
Those called by a layer to act on its own subdivisions
and those which acted on the larger (parent?) "device".

Because probing can be tricky I plan on passing 'context' hints
at probe time so that various probe routines are not working totally in
the dark as to what happened before
(e.g. finding an fdisk slice within an fdisk slice is legal but
should be treated differently; I think block numbers are not absolute in
extended partitions .. needs confirmation).
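
A sketch of how such a hint might be used, assuming (as guessed above, and
still unconfirmed) that start blocks in a nested fdisk table are relative
to the enclosing object. The probe_hint struct, the dlo_kind enum and
fdisk_abs_start are invented for the example.

```c
#include <assert.h>
#include <stddef.h>

enum dlo_kind { DLO_KIND_NONE, DLO_KIND_FDISK, DLO_KIND_DISKLABEL };

/* Hypothetical context handed to a probe routine. */
struct probe_hint {
    enum dlo_kind parent_kind;   /* what claimed the enclosing object */
    long          parent_start;  /* absolute start block of that object */
};

/*
 * Turn a start block read from an fdisk table into an absolute block.
 * ASSUMPTION: inside another fdisk slice the value is relative to the
 * enclosing object; at top level it is taken as absolute.
 */
static long
fdisk_abs_start(const struct probe_hint *h, long table_start)
{
    assert(table_start >= 0);
    if (h != NULL && h->parent_kind == DLO_KIND_FDISK)
        return h->parent_start + table_start;
    return table_start;
}
```

Without the hint the probe routine would have no way to tell the two cases
apart.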

Writing a new disk driver gets to be really simple..
write basic IO routines,
register a disk-like device.. stand back and await work..


+----------------------------------+       ______ _  __
|   __--_|\  Julian Elischer       |       \     U \/ / On assignment
|  /       \ julian@ref.tfs.com    +------>x   USA    \ in a very strange
| (   OZ    ) 300 lakeside Dr. oakland CA. \___   ___ | country !
+- X_.---._/  USA+(510) 645-3137(wk)           \_/   \\            
          v


