Date:      Tue, 9 May 2000 14:35:57 +0000
From:      Nik Clayton <nik@freebsd.org>
To:        doc@freebsd.org
Subject:   Including images in the documentation
Message-ID:  <20000509143555.A1692@kilt.nothing-going-on.org>

Hi gang,

Well, it's been a busy month or so, but I think I've just about
prototyped a reasonable scheme for including images in our documentation
set.  And the toolchain seems to be coming along well too.

The Basic Idea
--------------

We need to be able to include images in our documentation.  Diagrams,
screenshots, pictures of me doing a Peter Norton impression on the front
cover of the Handbook. . . but I digress.

There are a number of problems that have to be solved before this is 
feasible.

  *  Can we include images in our output formats?  Recall that our 'main'
     output formats are HTML, PS, PDF, and RTF.  All the other formats are
     generated from one of these.  So if we can't put images in all four
     formats we'll have problems.

  *  How do we reference the images from our documents?  We need a 
     DocBook-sanctioned mechanism for referring to individual images.

  *  What formats should the images be stored in, in the CVS repository?

  *  How do we convert images from one format to another?  We certainly
     don't want to have to carry around multiple copies of the same image
     in the repository if we can help it.

  *  How do we share images among languages?  I expect that at least some
     screenshots and diagrams will be language neutral.

Images in output formats
------------------------

Fortunately, this bit's relatively easy.  HTML, PS, PDF, and RTF all have
mechanisms for including images.  Sadly, they all use different formats,
based on the tools we're using.

  HTML             GIF, JPEG, PNG (becoming more widespread)

  PS               EPS

  PDF              PNG

  RTF              BMP

As you can see, that's a multitude of bitmap formats, and one vector
format (EPS).

Images in DocBook
-----------------

Including (or referring to) images using DocBook markup is a little more
complicated.

The basic idea (and I'm simplifying a bit here) is that you have a 
<mediaobject>, which contains one or more <imageobject>s.  Each <imageobject> 
is supposed to refer to the same image, but in different formats.

For example, suppose you have

    <mediaobject>
      <imageobject>
	<imagedata fileref="test.gif" format="gif">
      </imageobject>
    </mediaobject>

If you convert a document containing that fragment to HTML everything works
fine, as GIF is a format usable with HTML.  However, if you try and produce
a PS document you get a blank space.  The solution is to use

    <mediaobject>
      <!-- For the HTML output -->
      <imageobject>
	<imagedata fileref="test.gif" format="gif">
      </imageobject>

      <!-- For the PS output -->
      <imageobject>
        <imagedata fileref="test.eps" format="eps">
      </imageobject>

      <!-- For the PDF output -->
      <imageobject>
	<imagedata fileref="test.png" format="png">
      </imageobject>

      <!-- Skipped RTF for the time being -->
    </mediaobject>

The stylesheets are then supposed to take note of the output format
and only use the appropriate <imageobject>.

Sadly, it doesn't work like that with our current toolset.

If you look at how our stylesheets work, we have one set for HTML output,
and one set for print output.  And the PS and PDF versions are currently
all produced from the same .tex file.

This means that when the stylesheets are producing the .tex file, they
don't know which one of the <imageobject>s to include (i.e., they can't
decide whether to use the EPS one, for PS, or the PNG one, for PDF).

Being blunt, this is a pain in the arse.  We've got to resort to some 
SGML tricks to work around it.

So, this is what I think we're going to have to do.

 1.  Create some new parameter entities in freebsd.dtd, 

	 %output.ps;
	 %output.pdf;

     These are analogous to %output.html; and %output.print; which we
     already have.

 2.  Use these parameter entities when including images.  For example:

     <mediaobject>
       <imageobject>
	 <imagedata fileref="test.gif" format="gif">
       </imageobject>

       <![ %output.ps; [
       <imageobject>
	 <imagedata fileref="test.eps" format="eps">
       </imageobject>
       ]]>

       <![ %output.pdf; [
       <imageobject>
	 <imagedata fileref="test.png" format="png">
       </imageobject>
       ]]>
     </mediaobject>

     To anticipate some questions:

       a)  Why don't you wrap the GIF image in %output.html; in the same
	   way?

	   Because there's no need.  DocBook is supposed to allow multiple
	   <imageobject>s; we're working around a problem with the
	   stylesheets, not DocBook per se.

       b)  Why not do

	   <imageobject>
	     <![ %output.ps; [ <imagedata fileref="test.eps" format="eps">]]>
	     <![ %output.pdf; [ <imagedata fileref="test.png" format="png">]]>
	   </imageobject>

	   ?

	   Won't work.  When both of them are set to "IGNORE" we'll have an
	   invalid document.  I don't want to put %output.html; in there
	   for the same reason as in (a).  When the stylesheets support
	   things the way they're supposed to be, my earlier proposal can
	   be pulled out of the tree with the minimum of fuss (delete 4
	   lines in each <mediaobject>) whereas the approach above would
	   involve adding text as well.  I'd rather we did it as correctly
	   as possible the first time around.

 3.  Change the way we produce .tex files in doc.docbook.mk.  Currently
     .dvi and .pdf both depend on the .tex file.  We're going to need to
     change that, so that .dvi depends on .tex-ps, and .pdf depends on
     .tex-pdf (those .tex-foo names are examples, I'm up for alternatives).

     Then duplicate the existing .tex rule to give us a .tex-ps and a
     .tex-pdf rule.  The body of these rules will have to use "-ioutput.ps"
     or "-ioutput.pdf" as appropriate.

Make sense?
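
For what it's worth, step 1 might look something like the sketch below.
This assumes the new entities are declared the same way as the existing
%output.html; and %output.print; ones, defaulting to IGNORE.  That default
and the exact placement in freebsd.dtd are assumptions on my part, not
something settled above.

```sgml
<!-- Sketch only: hypothetical additions to freebsd.dtd.
     Assumed default is IGNORE, so any <imageobject> wrapped in one of
     these marked sections is skipped unless the build enables it. -->
<!ENTITY % output.ps  "IGNORE">
<!ENTITY % output.pdf "IGNORE">
```

Running Jade with "-ioutput.ps" acts as though
<!ENTITY % output.ps "INCLUDE"> appeared at the start of the document's
DTD subset, and since the first declaration of an entity wins, it
overrides the IGNORE default.  So the .tex-ps rule would see the GIF
<imageobject> plus the PS-specific one, and the .tex-pdf rule would see
the GIF one plus the PDF-specific one.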

Our make(1) framework's going to need some work as well.  We're going to
need a way to specify all the images used by a document, so that 
dependencies are calculated correctly.  We also need a way to install
images in the right place.

When the Handbook was broken up into <directory>/chapter.sgml files, the
idea was that we could put the images used in each chapter in the chapter's
directory.  This means we don't have to come up with unique names for each
image across the whole document.

However, it also brings up a conundrum.  At the moment we just install
the formatted output into one directory.  We could stick with that,
and install all the images used by the document into the same directory
as well.  But if we do that then we'll still need to ensure that the
image names are unique across the entire document (this was discussed
briefly on -doc a few months back).

So I think the install process will need to create subdirectories under
the main install directory to hold the images.  This has implications for
the various install targets, as well as the package target.

I haven't looked too deeply at these yet.  Anyone want to volunteer?  I
know pretty much how I'll do it, but I need to sit down and actually
write the code.

Image formats in the repository
-------------------------------

My preference is to store images in EPS and PNG formats in the repository,
but *not* both at the same time.  If it's a vector image (like a network
diagram, for example) then it gets stored in EPS.  If it's a bitmap then
we can use PNG.

PNG is (IMHO) better than the other bitmap formats.  It doesn't have the
licensing implications of GIF, it's not lossy, like JPEG, and it's not
bloated the way most BMP implementations are.

EPS is a de facto standard for vector graphics anyway, so I think that's
pretty much a no-brainer.

However, there is one wrinkle.  I don't know of any graphics packages
we can use that will let you edit EPS images. So I'm going to suggest
that we use "dia" (a free Visio clone) as the source format for our
vector images.  Instead of storing EPS files in the repository we would
store the dia files, and generate the EPS from them at build time.

I'm not completely wedded to the idea (we could use transfig, for example) 
but after using dia for the past couple of weeks on other projects it's 
certainly grown on me.

This has implications for the software needed to build the docs, of which
more in the next section.

Converting images from one format to another
--------------------------------------------

Because I don't think we want to store each image in multiple formats
in the repo, we need a way to convert between formats at build time.

I've been playing with ImageMagick over the past couple of weeks, and
after being a die-hard NetPBM user for the past few years I'm pretty
much converted.

However it, and dia (see previous section), are not small applications,
and have an extensive set of dependencies required in order to build them.
This would have a big impact on the resources required to build the docs,
and I'm uneasy about requiring people to have them.

OTOH, I definitely don't want to store more images in the repository than
we absolutely have to.

So, can anyone recommend any other command line converters we can use,
and/or easy-to-use diagram editors with an output format that can be
converted to EPS?

Sharing images among languages
------------------------------

We may want to share images among languages.  I haven't given this a great
deal of thought yet (it only occurred to me a few hours ago).  I suspect
the simplest way to do this is to have targets in the language Makefiles
that symlink images in from the English directory structure.

The downside is that it means the other languages will need the English
hierarchy checked out to build their docs (assuming we end up sharing
images in this way).

The upside is that it's simple.  The only other idea I had was to have
a doc/images directory (or similar), with appropriate subdirectories,
to contain the shared images.  I'm uneasy about this, as I foresee a
future filled with requests for repository copies as images that we 
thought were language specific become language neutral.

The other upside is that the other languages don't need to create images
at the same time as the English docs do.  They can link to the English
images immediately, and then add replacement images to their own 
language directories later, when they've been done.

Thoughts?

N
-- 
Internet connection, $19.95 a month.  Computer, $799.95.  Modem, $149.95.
Telephone line, $24.95 a month.  Software, free.  USENET transmission,
hundreds if not thousands of dollars.  Thinking before posting, priceless.
Somethings in life you can't buy.  For everything else, there's MasterCard.
  -- Graham Reed, in the Scary Devil Monastery

