Date:      Thu, 28 Mar 1996 20:48:57 -0700 (MST)
From:      Terry Lambert <terry@lambert.org>
To:        bde@zeta.org.au (Bruce Evans)
Cc:        freebsd-hackers@FreeBSD.org
Subject:   Re: fdisk and partition info
Message-ID:  <199603290348.UAA04470@phaeton.artisoft.com>
In-Reply-To: <199603280717.SAA18075@godzilla.zeta.org.au> from "Bruce Evans" at Mar 28, 96 06:17:12 pm

> >This means that in order to provide a FreeBSD fdisk program, you
> >must be able to determine the BIOS geometry of the underlying
> >drive.  This is without regard for the underlying implementation
> >details of the fdisk program itself.
> 
> >Generally, there are three approaches to solving this problem:
> 
> 0)	Ask the user.

Hee hee.  I will cop out and say "I meant 'solving this problem
correctly'"; an indirect reference through a flawed user gets
you a flawed answer.  8-).


> >1)	Assume the geometry is translated, and that it is
> >	1024/64/256.  This is what the current FreeBSD slice
> >	code does, much to the consternation of those of us
> >	who own WD1007 or other non-translating controllers.
> 
> No, it doesn't do anything like that.  Read the code.

If you will reciprocate and install a machine with a WD1007 with
factory jumpers, I will be pleased to do so.  8-).

Actually, looking at the code, I see an assumption for Adaptec;
in fact, a 1740 has two possible translation modes.  8-(.

You can correct this to "geometry assumptions are bad, even if
they are correct 90% of the time".


> >	geometries result in the same values, especially when you
> >	have a small number of partitions (for instance, one),
> >	this is not a 100% reliable approach -- in general, it is
> >	so *unreliable* that the slice code was invented to replace
> >	it.
> 
> A small part of the slice code was invented to _centralize_ this
> geometry guessing.  It's very confusing to have disk controller BIOSes,
> fdisks, installerpersons, install programs, slice code and drivers all
> guessing the geometry.

Yes.  I agree that if guessing is to be used, then it should be
common code.  I dislike the guessing in the first place, but I
can't complain too loudly because of the level of difficulty
required to implement real fixes.  It's a 90% solution.  If it
weren't, the disklabel code could be moved into the kernel where
it wants to be (as a logical device driver that reexports physical
devices as multiple devices with device/bias/range).


> >	It's possible to "help" this approach be more accurate
> >	using the 32 bit absolute sector address, which is also in
> >	the partition table.  The problem with that is that older
> >	FDISK programs did not generally fill out the 32 bit sector
> >	offset correctly, so determining this information is usually
> >	no more reliable if the offset happens to check a valid
> >	geometry [NB: the FreeBSD FDISK should always fill these
> >	fields out correctly in any case].
> 
> The slice code _relies_ on the absolute sector numbers.  It could be
> smarter about guessing the geometry based on the number of sectors,
> but it's not clear what it should do about inconsistencies.  Many
> inconsistencies are already detected, but no correction is attempted,
> and the warnings are disabled by default (set the kernel variable
> dsi_debug to enable the warnings).

Right.  And this is a problem.  Unfortunately, as I said before, the
amount of effort required to fix it to work in all cases is phenomenal.
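
To make the cross-check between the C/H/S fields and the 32 bit
absolute sector address concrete, here is a minimal sketch.  The
struct and function names are invented for illustration and are not
the slice code's; it assumes the C/H/S values have already been
decoded from the packed partition table entry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical: one candidate BIOS geometry. */
struct geom {
    uint32_t heads;     /* heads per cylinder */
    uint32_t sectors;   /* sectors per track */
};

/* Hypothetical: the start of one partition, as decoded from the table. */
struct pentry {
    uint32_t cyl, head, sect;   /* starting C/H/S; sect is 1-based */
    uint32_t lba_start;         /* 32 bit absolute starting sector */
};

/* Usual C/H/S to absolute sector conversion under a candidate geometry. */
static uint32_t chs_to_lba(const struct geom *g,
                           uint32_t c, uint32_t h, uint32_t s)
{
    assert(g->heads > 0 && g->sectors > 0 && s >= 1);
    return (c * g->heads + h) * g->sectors + (s - 1);
}

/* True if the entry's C/H/S and absolute sector agree under geometry g. */
static bool geom_consistent(const struct geom *g, const struct pentry *p)
{
    if (p->sect == 0 || p->sect > g->sectors || p->head >= g->heads)
        return false;
    return chs_to_lba(g, p->cyl, p->head, p->sect) == p->lba_start;
}
```

An entry starting at cylinder 1, head 0, sector 1 with absolute sector
2048 is consistent with 64 heads and 32 sectors per track, but not with
16 heads and 63 sectors.  An entry starting at cylinder 0, head 1,
sector 1 is consistent with any head count of 2 or more for a given
sectors-per-track value, which is why a disk with a single partition
often cannot disambiguate the geometry.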

[ ... ]
> FreeBSD knows all the BIOS geometries but not the mappings.

Yup.  8-(.  This is the single largest obstacle to a "just do it
and work in all cases" interface, which is what the people who
haven't "paid their dues" complain about.

> >Other types of partitioning, specifically, DOS Extended partitions,
> >BSD disklabels, Solaris disklabels, SVR4 slicing, OSF disklabels,
> >etc. etc., should all be manageable with a single tool, in my opinion.
> 
> I disagree.  You'll end up with support for 101 different OS's in one
> program.

No; that's the beauty of it.  You don't put the intelligence in the
program.  You put it in the logical-to-physical drivers as callable
ioctl()'s, and you either demand-load, or compile-time select the
types of partitioning you want to allow.

The program deals with a generic class of ioctl()'s and only a
rough understanding of device hierarchy.

Basically, if your kernel can understand that type of partition,
then it can manipulate it.  The code overhead is relatively small
compared to the existing requirements for read/status/device-nodes.
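
As a rough illustration of keeping the intelligence out of the
program, here is a user-space sketch of a driver table with a generic
request entry point.  Everything in it is hypothetical: the request
codes are not real FreeBSD ioctl numbers, and the "dos" driver is a
stand-in that only answers one request.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical generic request codes shared by all partitioning types. */
enum part_req { PART_GETCOUNT, PART_GETTYPE };

struct part_driver {
    const char *name;
    /* Returns 0 on success, -1 if the request is not supported. */
    int (*request)(enum part_req req, void *arg);
};

#define MAX_DRIVERS 8
static struct part_driver *drivers[MAX_DRIVERS];
static size_t ndrivers;

/* Compile-time selection and demand-loading would both end up here. */
static int part_driver_register(struct part_driver *d)
{
    assert(d != NULL && d->request != NULL);
    if (ndrivers == MAX_DRIVERS)
        return -1;
    drivers[ndrivers++] = d;
    return 0;
}

/* The tool names a partitioning type and forwards a generic request. */
static int part_request(const char *type, enum part_req req, void *arg)
{
    for (size_t i = 0; i < ndrivers; i++)
        if (strcmp(drivers[i]->name, type) == 0)
            return drivers[i]->request(req, arg);
    return -1;  /* this kernel does not understand that partition type */
}

/* Stand-in driver: reports the four primary entries of a DOS table. */
static int dos_request(enum part_req req, void *arg)
{
    if (req == PART_GETCOUNT) {
        *(int *)arg = 4;
        return 0;
    }
    return -1;
}

static struct part_driver dos_driver = { "dos", dos_request };
```

The tool only ever calls part_request(); whether a given type works
depends entirely on which drivers have been registered.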

> >To achieve this, I would suggest leveraging the devfs code.  Specifically,
> 
> devfs would only provide the scaffolding.

Right.  You would need a class of driver such that, when a disk
device was registered with the devfs, the devfs called back into a
logical-to-physical mapping list.

For instance, a disk with a DOS partition table with a DOS partition
on P1, a BSD partition on P2, and a DOS extended partition on P4,
with the DOS extended partition containing a DOS logical partition
(drive), and the BSD partition containing slice 'a' (ufs), slice
'b' (swap), and slice 'c' (ufs) would do the following:

CONTROLLER 0 probes for devices
  DEVICE 1 found
  *REGISTER DEVICE 1 WITH DEVFS, NAME "/dev/dsk/c0d0"
    -> DEVFS ASKS "BSD slice driver" DO YOU WANT THIS DEVICE?
       "BSD slice driver" TRYS TO RECOGNIZE DEVICE
       "BSD slice driver" FAILS TO RECOGNIZE DEVICE
    <- "BSD slice driver" SAYS NO
    -> DEVFS ASKS "DOS partition driver" DO YOU WANT THIS DEVICE?
       "DOS partition driver" TRYS TO RECOGNIZE DEVICE
       "DOS partition driver" RECOGNIZES DEVICE
       *REGISTER PARTITION 1 WITH DEVFS, NAME "/dev/dsk/c0d0/p1"
       -> DEVFS ASKS "BSD slice driver" DO YOU WANT THIS DEVICE?
          "BSD slice driver" TRYS TO RECOGNIZE DEVICE
          "BSD slice driver" FAILS TO RECOGNIZE DEVICE
       <- "BSD slice driver" SAYS NO
       -> DEVFS ASKS "DOS partition driver" DO YOU WANT THIS DEVICE?
          "DOS partition driver" TRYS TO RECOGNIZE DEVICE
          "DOS partition driver" FAILS TO RECOGNIZE DEVICE
       <- "DOS partition driver" SAYS NO
       -> DEVFS ASKS "DOS extended partition driver" DO YOU WANT THIS DEVICE?
          "DOS extended partition driver" TRYS TO RECOGNIZE DEVICE
          "DOS extended partition driver" FAILS TO RECOGNIZE DEVICE
       <- "DOS extended partition driver" SAYS NO
       #DEVFS CALLS MOUNT CODE "/dev/dsk/c0d0/p1 ARRIVED"
       *REGISTER PARTITION 2 WITH DEVFS, NAME "/dev/dsk/c0d0/p2"
       -> DEVFS ASKS "BSD slice driver" DO YOU WANT THIS DEVICE?
          "BSD slice driver" TRYS TO RECOGNIZE DEVICE
          "BSD slice driver" RECOGNIZES DEVICE
          *REGISTER PARTITION 1 WITH DEVFS, NAME "/dev/dsk/c0d0/p2/a"
          -> DEVFS ASKS "BSD slice driver" DO YOU WANT THIS DEVICE?
             "BSD slice driver" TRYS TO RECOGNIZE DEVICE
             "BSD slice driver" FAILS TO RECOGNIZE DEVICE
          <- "BSD slice driver" SAYS NO
          -> DEVFS ASKS "DOS partition driver" DO YOU WANT THIS DEVICE?
             "DOS partition driver" TRYS TO RECOGNIZE DEVICE
             "DOS partition driver" FAILS TO RECOGNIZE DEVICE
          <- "DOS partition driver" SAYS NO
          -> DEVFS ASKS "DOS extended partition driver" DO YOU WANT THIS DEVICE?
             "DOS extended partition driver" TRYS TO RECOGNIZE DEVICE
             "DOS extended partition driver" FAILS TO RECOGNIZE DEVICE
          <- "DOS extended partition driver" SAYS NO
          #DEVFS CALLS MOUNT CODE "/dev/dsk/c0d0/p2/a ARRIVED"
          *REGISTER PARTITION 1 WITH DEVFS, NAME "/dev/dsk/c0d0/p2/b"
          -> DEVFS ASKS "BSD slice driver" DO YOU WANT THIS DEVICE?
             "BSD slice driver" TRYS TO RECOGNIZE DEVICE
             "BSD slice driver" FAILS TO RECOGNIZE DEVICE
          <- "BSD slice driver" SAYS NO
          -> DEVFS ASKS "DOS partition driver" DO YOU WANT THIS DEVICE?
             "DOS partition driver" TRYS TO RECOGNIZE DEVICE
             "DOS partition driver" FAILS TO RECOGNIZE DEVICE
          <- "DOS partition driver" SAYS NO
          -> DEVFS ASKS "DOS extended partition driver" DO YOU WANT THIS DEVICE?
             "DOS extended partition driver" TRYS TO RECOGNIZE DEVICE
             "DOS extended partition driver" FAILS TO RECOGNIZE DEVICE
          <- "DOS extended partition driver" SAYS NO
          #DEVFS CALLS MOUNT CODE "/dev/dsk/c0d0/p2/b ARRIVED"
          *REGISTER PARTITION 1 WITH DEVFS, NAME "/dev/dsk/c0d0/p2/c"
          -> DEVFS ASKS "BSD slice driver" DO YOU WANT THIS DEVICE?
             "BSD slice driver" TRYS TO RECOGNIZE DEVICE
             "BSD slice driver" FAILS TO RECOGNIZE DEVICE
          <- "BSD slice driver" SAYS NO
          -> DEVFS ASKS "DOS partition driver" DO YOU WANT THIS DEVICE?
             "DOS partition driver" TRYS TO RECOGNIZE DEVICE
             "DOS partition driver" FAILS TO RECOGNIZE DEVICE
          <- "DOS partition driver" SAYS NO
          -> DEVFS ASKS "DOS extended partition driver" DO YOU WANT THIS DEVICE?
             "DOS extended partition driver" TRYS TO RECOGNIZE DEVICE
             "DOS extended partition driver" FAILS TO RECOGNIZE DEVICE
          <- "DOS extended partition driver" SAYS NO
          #DEVFS CALLS MOUNT CODE "/dev/dsk/c0d0/p2/c ARRIVED"
       <- "BSD slice driver" SAYS YES
       *REGISTER PARTITION 4 WITH DEVFS, NAME "/dev/dsk/c0d0/p4"
       -> DEVFS ASKS "BSD slice driver" DO YOU WANT THIS DEVICE?
          "BSD slice driver" TRYS TO RECOGNIZE DEVICE
          "BSD slice driver" FAILS TO RECOGNIZE DEVICE
       <- "BSD slice driver" SAYS NO
       -> DEVFS ASKS "DOS extended partition driver" DO YOU WANT THIS DEVICE?
          "DOS extended partition driver" TRYS TO RECOGNIZE DEVICE
          "DOS extended partition driver" RECOGNIZES DEVICE
          *REGISTER PARTITION 1 WITH DEVFS, NAME "/dev/dsk/c0d0/p4/1"
          -> DEVFS ASKS "BSD slice driver" DO YOU WANT THIS DEVICE?
             "BSD slice driver" TRYS TO RECOGNIZE DEVICE
             "BSD slice driver" FAILS TO RECOGNIZE DEVICE
          <- "BSD slice driver" SAYS NO
          -> DEVFS ASKS "DOS partition driver" DO YOU WANT THIS DEVICE?
             "DOS partition driver" TRYS TO RECOGNIZE DEVICE
             "DOS partition driver" FAILS TO RECOGNIZE DEVICE
          <- "DOS partition driver" SAYS NO
          -> DEVFS ASKS "DOS extended partition driver" DO YOU WANT THIS DEVICE?
             "DOS extended partition driver" TRYS TO RECOGNIZE DEVICE
             "DOS extended partition driver" FAILS TO RECOGNIZE DEVICE
          <- "DOS extended partition driver" SAYS NO
          #DEVFS CALLS MOUNT CODE "/dev/dsk/c0d0/p4/1 ARRIVED"
       <- "DOS extended partition driver" SAYS YES
    <- "DOS partition driver" SAYS YES

[ ... controller registers additional devices ... ]
[ ... additional controllers register additional devices ... ]
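
The callback sequence above is a recursion, and can be sketched in
user-space C as follows.  This is a simulation under stated
assumptions, not devfs code: the "content" field stands in for
actually reading the on-disk data, and all type and function names are
made up for the example.

```c
#include <assert.h>
#include <stddef.h>

/* What a device holds; stands in for reading and checking the media. */
enum content { C_NONE, C_DOS_TABLE, C_BSD_LABEL, C_DOS_EXTENDED };

struct dev {
    const char *name;
    enum content content;
    const struct dev *children;  /* what a claiming driver would export */
    size_t nchildren;
};

/* The example disk from the text. */
static const struct dev bsd_slices[] = {
    { "/dev/dsk/c0d0/p2/a", C_NONE, NULL, 0 },
    { "/dev/dsk/c0d0/p2/b", C_NONE, NULL, 0 },
    { "/dev/dsk/c0d0/p2/c", C_NONE, NULL, 0 },
};
static const struct dev ext_drives[] = {
    { "/dev/dsk/c0d0/p4/1", C_NONE, NULL, 0 },
};
static const struct dev dos_parts[] = {
    { "/dev/dsk/c0d0/p1", C_NONE, NULL, 0 },
    { "/dev/dsk/c0d0/p2", C_BSD_LABEL, bsd_slices, 3 },
    { "/dev/dsk/c0d0/p4", C_DOS_EXTENDED, ext_drives, 1 },
};
static const struct dev disk = {
    "/dev/dsk/c0d0", C_DOS_TABLE, dos_parts, 3
};

/* The logical-to-physical mapping list. */
static const struct { const char *name; enum content recognizes; }
l2p_drivers[] = {
    { "BSD slice driver", C_BSD_LABEL },
    { "DOS partition driver", C_DOS_TABLE },
    { "DOS extended partition driver", C_DOS_EXTENDED },
};

/* Devices handed to the mount code, in arrival order. */
static const char *arrived[16];
static size_t narrived;

static void devfs_register(const struct dev *d)
{
    assert(d != NULL);
    for (size_t i = 0; i < sizeof l2p_drivers / sizeof l2p_drivers[0]; i++) {
        if (l2p_drivers[i].recognizes != d->content)
            continue;                          /* driver SAYS NO */
        for (size_t c = 0; c < d->nchildren; c++)
            devfs_register(&d->children[c]);   /* exports re-enter devfs */
        return;                                /* driver SAYS YES */
    }
    if (narrived < sizeof arrived / sizeof arrived[0])
        arrived[narrived++] = d->name;         /* nobody claimed it */
}
```

Calling devfs_register(&disk) hands p1, p2/a, p2/b, p2/c and p4/1 to
the mount code, in that order, matching the "ARRIVED" lines in the
trace.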


> >1)	The hierarchy could get large fast.  For instance, a device
> 
> No kidding.  There are already about 512+64 possible devices for the
> slice and partition layers.

Right... that's why you would use directories for population.

The example "callback as a result of probe true" above is actually
too verbose.  In reality, each driver would have "allowable parent"
codes that would prevent, for instance, searching out DOS partitions
on BSD slices or DOS extended partitions, etc.
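
One way the "allowable parent" codes could be expressed is as a
bitmask per driver, checked before the driver is asked to probe at
all.  The class names and the particular masks here are assumptions
made for the sketch; in particular, letting a BSD label sit directly
on a whole disk is a choice of this example, not something stated
above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical device classes, one bit each. */
enum dev_class {
    DC_DISK      = 1 << 0,  /* whole physical disk */
    DC_DOS_PART  = 1 << 1,  /* primary DOS partition */
    DC_DOS_EXT   = 1 << 2,  /* logical drive in an extended partition */
    DC_BSD_SLICE = 1 << 3   /* entry in a BSD disklabel */
};

struct probe_rule {
    const char *driver;
    unsigned allowed_parents;   /* OR of dev_class bits */
};

static const struct probe_rule rules[] = {
    { "DOS partition driver",          DC_DISK },
    { "DOS extended partition driver", DC_DOS_PART },
    { "BSD slice driver",              DC_DISK | DC_DOS_PART },
};

/* Only drivers that pass this check get asked "do you want this device?" */
static bool should_probe(const struct probe_rule *r, enum dev_class parent)
{
    assert(r != NULL);
    return (r->allowed_parents & (unsigned)parent) != 0;
}
```

With these masks, none of the three drivers is consulted for a device
whose class is DC_BSD_SLICE or DC_DOS_EXT, which removes most of the
"FAILS TO RECOGNIZE" steps from the trace.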


> >2)	There is no natural recognition of hierarchical ordering in
> >	a flat name space, when in fact what is presented is a logical
> >	on physical driver hierarchy.  The n-m mapping of the graph is
> >	too complex to deal with in a flat name space and still present
> >	a uniform user interface.  Specifically, if I have an arbitrary
> >	device in a flat name space, am I allowed to add DOS partitioning
> >	to it or not?  My argument here is that without an easy way
> >	to traverse the hierarchy of devices to find parents, there
> >	is no easy answer to that question, short of iterating all
> >	devices...
> 
> I really want only one layer.  Multiple layers are difficult both to
> implement and to use, and don't provide any significant benefits.

The benefit here is that I can ioctl() the intermediate device
"/dev/dsk/c0d0" so that subsequent access through the FD treats it as
a device rather than as a directory.  This is for the single FD
instance for that open.  This would allow a disk management program
to do ioctl()'s to manipulate the DOS partitioning without having
to build devices with knowledge of possible export names that will be
used by the logical-to-physical drivers.

You could do the same thing with a flat namespace, if you were willing
to parse the devices into semantic units and build them up one
character concatenation at a time.

I think that for a minor change in the kernel (30 or fewer lines of
additional code), you could get what you wanted from a hierarchy
much more easily.  Note that cloning devices (I'll assume we will go to
cloning for pty's eventually...) will need this capability anyway:
I have to be able to ask the master (directory) device for the next
available clone device.

> >As for intermediate solutions, the first problem is, as you've
> >identified, finding out the BIOS geometry for a given drive to
> >allow application of DOS disk partitioning and extended partitioning.
> 
> Imagine having this problem at `L' layers for `N' operating systems :-(.
> The geometry information might be buried at layer L(O) for operating
> system O.  For a practical example, consider accessing FreeBSD partitions
> under Linux vs accessing Linux partitions under FreeBSD.  Linux would
> have to do a lot more work to access FreeBSD partitions because they
> are one layer deeper.

No, they would have to provide their own device abstraction for BSD
devices -- they already must do this anyway.

The layering is not really an issue.  I mean, I assume you will want
to have logical layering that reexports the device to the same node
without changing its name for media perfection (ie: a bad144 that
can be applied to a disklabel and which doesn't care about the
1024 cylinder limit because it reserves up front), or you will want
a device that gets exported as a top level device.  In the second case,
we have volumes consisting of multiple physical devices and doing
striping, soft RAID, mirroring, and simple concatenation -- the
difference being that the export waits for the arrival of all
members of the volume set.

It's a powerful abstraction, even if you complicate the tools by
flattening the hierarchy into the name space, and then limiting
the allowable name space (maybe to a single character plus a digit)
for each layer.  Hell, you could even build JFS volume spanning
in as a logical-to-physical mapping layer... and mount AIX disks.
Or NT disks, with spanned volume sets.

The mount issue is resolved by calling the FS to establish the mount
as a "per device root" mount instance -- and then mapping the instance
into the fstab.

The biggest change here is in identifying root -- which you can do
using the "last mounted on" field, if it's initialized correctly on
install.
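
A minimal sketch of that root search, assuming each recognized
filesystem has already had its "last mounted on" string read out of
its superblock (for UFS that would be the fs_fsmnt field).  The struct
and function names are made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical: one recognized filesystem and where it was last mounted. */
struct fs_candidate {
    const char *devname;          /* e.g. "/dev/dsk/c0d0/p2/a" */
    const char *last_mounted_on;  /* from the superblock; may be empty */
};

/* Returns the first candidate last mounted on "/", or NULL if none. */
static const struct fs_candidate *
find_root(const struct fs_candidate *c, size_t n)
{
    assert(c != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (c[i].last_mounted_on != NULL &&
            strcmp(c[i].last_mounted_on, "/") == 0)
            return &c[i];
    return NULL;
}
```

Taking the first match is a simplification: a box with two installs
would have two filesystems claiming "/", and something else would have
to break the tie.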


The eventual win on this whole thing is that once root is mounted, you
can establish mappings for the devices the system recognizes but for
which a traversal mount point does not yet exist.

This would allow you to drop a FreeBSD kernel on a Linux box (assuming
that the kernel did not require the bootloader to communicate information
to it, like BIOS geometries) and have it "just work".

"Competitive Upgrade" anyone?  8-).


					Regards,
					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.


