Date:      Tue, 3 Aug 1999 16:33:16 +0100
From:      Nik Clayton <nik@freebsd.org>
To:        Jeroen Ruigrok/Asmodai <asmodai@wxs.nl>
Cc:        Tim Vanderhoek <vanderh@ecf.utoronto.ca>, Greg Lehey <grog@lemis.com>, Mike Pritchard <mpp@mpp.pro-ns.net>, Bruce Evans <bde@zeta.org.au>, rnordier@nordier.com, doc@freebsd.org, nik@freebsd.org
Subject:   Re: cvs commit: src/sbin/disklabel disklabel.8
Message-ID:  <19990803163315.D39416@kilt.nothing-going-on.org>
In-Reply-To: <19990803083741.B58351@daemon.ninth-circle.org>; from Jeroen Ruigrok/Asmodai on Tue, Aug 03, 1999 at 08:37:41AM +0200
References:  <199908010038.KAA16506@godzilla.zeta.org.au> <199908011141.GAA02125@mpp.pro-ns.net> <19990803113759.J62948@freebie.lemis.com> <19990802225533.A19050@mad> <19990803083741.B58351@daemon.ninth-circle.org>

On Tue, Aug 03, 1999 at 08:37:41AM +0200, Jeroen Ruigrok/Asmodai wrote:
> > Because using strictly mdoc will make it easier to change all the
> > manpages to DocBook?
> 
> Now this is interesting.

I was hoping this particular can of worms wouldn't come up -- at least not
for a while anyway.  No such luck :-)

[...]

> I would like to hear some comments on this because it might truly be the
> way to proceed on, but I still have some doubts and blanks about how to
> realise a few goals.

Here's my current take on manpages-in-DocBook -- this is not too well
organised, as it's something I tend to think about for three or four
minutes at a time, and then I start thinking about something (anything)
more interesting instead.

First of all, there is a precedent for having system manual pages marked
up in DocBook (or DocBook-lite, or whatever).  Sun's Solaris uses a 
variant of DocBook (called SolBook) which at least some of their manual
pages are written in.  I don't know what proportion of the pages this is
though, or why Sun chose to do this.

That said, I remain to be convinced that it would be a good idea for
FreeBSD -- certainly for the near to middle future (over the next year 
or so, at least).

One of the 'killer-apps' that's missing from DocBook is a good, standard,
mechanism to go from DocBook to *roff based markup.  We have DocBook
to HTML, plain text, PostScript, PDF, and RTF, but not *roff.

There are a number of approaches that could be used to tackle this problem.
In decreasing order of complexity (and increasing order of desirability, at
least to my mind) these are:

1.  Write a program that does this, and nothing else.  This might be in C,
    Python, Perl, or similar.  The formatting rules would be embedded in the
    program, making it pretty useless as a general purpose formatter.

    This is the simplest approach, and also the least expandable.  There
    already exist Perl implementations of this approach -- a DocBook
    RefEntry -> man page converter can be found on the web, probably
    somewhere under <URL:http://www.oasis-open.org/docbook/>;

2.  Try and find a program that can apply its own proprietary stylesheet or
    other formatting language to DocBook documents, and then write a 
    stylesheet in this language, to go from DocBook to *roff.

    For example, instant (ports/textproc/instant) can do this.  I'm not at
    a machine I can check this on, but I'm fairly certain there are 
    instant(1) 'translation specifications' to go from DocBook to *roff.
    If not, it's pretty easy to write one.

3.  Try and find a 'standard' stylesheet language, along with a processor
    for these stylesheets, and then write your stylesheets in this standard
    language, and hope for the best.

    There are three contenders for this approach.

    The first is Jade, which is what we're using to do the conversion to the
    other formats at the moment.  Jade is moderately big, and (here's the
    kicker) can not currently produce *roff output, which pretty much removes
    it from consideration at the moment.  Jade's stylesheets are written in 
    a Lisp/Scheme-ish language called DSSSL, which some people find to be
    a turn off.  

    I don't think Jade (or its successor, OpenJade) is likely to be able 
    to do this.  People have been talking about writing a *roff backend for
    Jade (which is written in C++, big, and not very well internally 
    documented) for a year or more now, and no one's actually stepped up
    and done the work.

    The second approach uses XML and XSL.  For the purposes of this 
    discussion XML == SGML-lite, and XSL is a procedural stylesheet language
    (unlike DSSSL).  

    In theory, we could convert the DocBook documents to XML (that's a no-
    brainer, and easy to do).  We would then have to write some XSL 
    stylesheets, and then run a hypothetical processor over the XML and the
    XSL to produce *roff.  For this we need the XSL processor, which 
    doesn't really exist yet -- also, the XSL language is in a state of
    flux at the moment.

    The third approach uses XML and XSLT.

    XSLT is a companion to XSL -- XSL is a 'style and formatting' language;
    it takes things like

	<sect1>
	  <title>This is a title</title>

	  <para>This is a para...</para>

    and converts that into formatting instructions targeted at whatever
    output you're producing (*roff commands, PostScript code, and so on).

    XSLT, on the other hand, is used to transform (that's the 'T') documents
    from one DTD to another.  So, if you wanted to go from DocBook to
    PostScript you'd write a stylesheet in XSL, but if you wanted to go
    from DocBook to HTML you'd write a stylesheet in XSLT (because DocBook
    and HTML are both DTDs, so going from one to the other is a 
    transformation -- in the case of DocBook to HTML it's a lossy 
    transformation).

    What I think we should have is a RoffDTD.  This would be markup 
    designed to capture the ins and outs of *roff markup.  This DTD 
    should be designed so that converting from RoffDTD to actual *roff
    markup is as simple as possible.

    This might be a bit too ambitious, so maybe an MDOCDTD, or similar
    instead -- whatever.  The aim is to have a final DTD which can be used
    to markup documents which can then be easily converted to *roff.

    Then going from DocBook to *roff becomes a two-step process: first
    you convert the document from the DocBook DTD to the RoffDTD (or
    MDOCDTD, or whatever).  This transformation is carried out by a
    stylesheet written in XSLT (or, possibly, using Jade, which has an
    extension to DSSSL to support transforming from one DTD to another,
    which is how the DocBook to HTML conversion is carried out).

    Then you convert from the RoffDTD down to *roff markup, and process
    from there.

This last approach is the most flexible -- it allows you to transform from
arbitrary DTDs to RoffDTD, using any software that implements the standard
XSLT language, and then to go from RoffDTD to *roff with a final step
that's hopefully quite simple.
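
As a rough illustration of the two-step idea, here is a minimal sketch in
Python.  It is not XSLT and there is no real RoffDTD behind it: the
intermediate element names (roffdoc, sh, pp) and both rule tables are
invented for this example, and the output uses plain man(7) macros.  No
escaping of *roff special characters is attempted.

```python
import xml.etree.ElementTree as ET

# The fragment from the message, closed so that it parses.
DOCBOOK = """<sect1>
  <title>This is a title</title>
  <para>This is a para...</para>
</sect1>"""

# Step 1 rules: DocBook element -> hypothetical intermediate element.
DOCBOOK_TO_ROFFDOC = {"title": "sh", "para": "pp"}

# Step 2 rules: hypothetical intermediate element -> man(7) macro.
ROFFDOC_TO_MACRO = {"sh": ".SH", "pp": ".PP"}


def docbook_to_roffdoc(docbook_root):
    """Step 1: transform a DocBook tree into the intermediate tree."""
    out = ET.Element("roffdoc")
    for child in docbook_root:
        name = DOCBOOK_TO_ROFFDOC.get(child.tag)
        if name is None:
            continue  # elements without a rule are dropped (lossy)
        node = ET.SubElement(out, name)
        # Collapse runs of whitespace; *roff fills the text itself.
        node.text = " ".join((child.text or "").split())
    return out


def roffdoc_to_roff(roffdoc_root):
    """Step 2: serialise the intermediate tree as *roff source."""
    lines = []
    for node in roffdoc_root:
        macro = ROFFDOC_TO_MACRO[node.tag]
        if node.tag == "sh":
            # Section headings take their text as a macro argument.
            lines.append(f'{macro} "{node.text}"')
        else:
            # Paragraphs are a bare macro followed by the text.
            lines.append(macro)
            lines.append(node.text)
    return "\n".join(lines) + "\n"


def convert(docbook_source):
    """Run both steps over a DocBook source string."""
    return roffdoc_to_roff(docbook_to_roffdoc(ET.fromstring(docbook_source)))


if __name__ == "__main__":
    print(convert(DOCBOOK), end="")
```

The point of the split is that only step 1 knows anything about DocBook.
A different source DTD would need a new step 1 rule table, while step 2
stays as it is.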

However, there are one or two problems with this:

  1.  No one's written RoffDTD yet -- I don't know *roff at all well,
      and a lot of the above is handwaving on my part -- I assume that
      it's possible to write a DTD that (a) accurately captures *roff
      formatting, (b) is easy to mechanically convert to the *roff 
      formatting codes, but I have no actual proof of this.

  2.  I haven't found a lightweight XSLT processor yet.  All the ones I've
      looked at are written in Java, and I'm not for one moment suggesting
      that we bring the hulking behemoth that is Java into the base
      system.  As I type this, the 30MB or so of source code that's
      required as dependencies for the textproc/lotusxsl port is 
      downloading in another window, and I wouldn't want to force that on
      anyone.[1]

And, on top of that, I'm not really sure we need the system man pages in
anything other than mdoc at the moment.  About the only real benefit 
that I can think of is that it would make conversion to HTML a little
simpler.  But the toolchain really isn't there to support it yet.

FWIW, Chuck Robey is working on a lightweight DocBook -> *roff formatter
which probably falls into category (1) above.  I know he's very busy at
the moment -- he'll probably drop a note in on this discussion if he's
got the time, but if he doesn't then it's probably best not to bother
him at the moment.  This might well fill the gap sufficiently that 
starting to contemplate a migration from mdoc to DocBook would be 
worthwhile, but it's certainly a few months away from completion.
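
For a sense of what a category (1) converter involves, here is a minimal
sketch in Python.  It is not Chuck's formatter and is not based on it; the
function name, the sample RefEntry, and the choice to pass the date in as
a parameter are all assumptions made for the example.  It handles only the
RefMeta and RefNameDiv parts of a RefEntry, and the mdoc layout is
hard-wired into the program, which is exactly the limitation described
for approach (1).

```python
import xml.etree.ElementTree as ET

# Hypothetical sample input; "example" is a placeholder command.
REFENTRY = """<refentry>
  <refmeta>
    <refentrytitle>example</refentrytitle>
    <manvolnum>8</manvolnum>
  </refmeta>
  <refnamediv>
    <refname>example</refname>
    <refpurpose>illustrative placeholder command</refpurpose>
  </refnamediv>
</refentry>"""


def refentry_to_mdoc(source, date):
    """Emit the mdoc prologue and NAME section for one RefEntry.

    The date is supplied by the caller because this sketch does not
    look for one in the document.
    """
    root = ET.fromstring(source)
    title = root.findtext("refmeta/refentrytitle", default="UNKNOWN").strip()
    section = root.findtext("refmeta/manvolnum", default="1").strip()
    name = root.findtext("refnamediv/refname", default="").strip()
    purpose = root.findtext("refnamediv/refpurpose", default="").strip()
    lines = [
        f".Dd {date}",                       # document date
        f".Dt {title.upper()} {section}",    # title and manual section
        ".Os",                               # operating system
        ".Sh NAME",
        f".Nm {name}",
        f".Nd {purpose}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(refentry_to_mdoc(REFENTRY, "August 3, 1999"), end="")
```

Every further DocBook element needs more hand-written code of this kind,
which is why this approach is simple to start and hard to extend.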

So, to sum up -- man pages in DocBook is a nice idea, but I don't think
it's of overwhelming importance yet.  The toolchain isn't there, and
there are lots of other things to do on the documentation that are (IMHO)
more pressing.

N

[1]  Just in case anyone's wondering why;  I purchased a Palm Pilot 
     recently, and I'm spending a little bit of spare time investigating
     how hard it would be to get documents like the FAQ on to the Pilot.
     More news as and when.
-- 
 [intentional self-reference] can be easily accommodated using a blessed,
 non-self-referential dummy head-node whose own object destructor severs
 the links.
    -- Tom Christiansen in <375143b5@cs.colorado.edu>

