Date:      Fri, 25 Jan 2008 10:09:18 -0800
From:      "Murray Stokely" <murray@stokely.org>
To:        "Gábor Kövesdán" <gabor@freebsd.org>
Cc:        doceng@freebsd.org, freebsd-doc@freebsd.org
Subject:   Re: [PATCH] docproj port needs to use tidy-devel
Message-ID:  <2a7894eb0801251009w27463cd4n3f0fbbc9e62938cc@mail.gmail.com>
In-Reply-To: <4799A266.2030900@FreeBSD.org>
References:  <2a7894eb0801162124x76d7132y8de9f4a1d314d8aa@mail.gmail.com> <4799A266.2030900@FreeBSD.org>

On 1/25/08, Gábor Kövesdán <gabor@freebsd.org> wrote:
>
> First, sorry for the late answer. Not just the XHTML, but the HTML
> output of tidy is incorrect as well; it does not validate. (I think
> www/63552 is related, because without tidy, such errors don't appear.)
> But the newer tidy versions completely mess up character sets. They
> certainly mess up the Hungarian character set, but I suspect there are
> others, too. The only reason that we don't disable it in the Hungarian
> project is that the builder has an ancient version, which works fine.
> Besides, different versions of tidy have different sets of command-line
> options, which makes our toolchain less portable.
> But anyway, why do we really need tidy? I ran some tests earlier without
> tidy, and the only thing that I had to do to generate valid pages was
> to reinplace-edit the DTD. As sgmlnorm outputs our custom DTD, the
> web pages were not valid, but after replacing it with the HTML 4.01
> Transitional DTD, everything validated. I'd prefer to see it go away.
> Yes, I know that one reason for tidy is the indenting and line breaking
> in HTML code; the output of sgmlnorm is not for human consumption. But
> can't we do that in a simpler way?


xsltproc can output nice .html with line breaks and indentation.  For
example, I use this for the RSS feeds to make the output nice and
human-readable without going through tidy:

<xsl:output method="xml" indent="yes"/>
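
As a fuller sketch, and only an illustration rather than the actual stylesheet used for the RSS feeds, that directive could sit in an identity-transform stylesheet like the one below. The file names indent.xsl, input.xml and output.xml are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Ask the serializer to add line breaks and indentation. -->
  <xsl:output method="xml" indent="yes" encoding="UTF-8"/>

  <!-- Drop whitespace-only text nodes so the old layout does not
       interfere with re-indentation. Assumption: the input has no
       whitespace-sensitive mixed content. -->
  <xsl:strip-space elements="*"/>

  <!-- Identity template: copy every node and attribute unchanged. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

Saved as indent.xsl, it would be run as "xsltproc indent.xsl input.xml > output.xml". For pages with inline markup inside running text, the strip-space line is probably best removed, since dropping whitespace there can change the rendered text.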

> One more idea, which came to my mind about this. Currently, our webpages
> are not uniform. We use HTML 4.01 for our pages generated from .sgml and
> XHTML 1.1 for .xsl output. What do you think about using XHTML 1.1
> uniformly? Obviously, sgmlnorm cannot do that, but there are advantages


Yeah, that's a low-priority, could/should-be-done sort of item.  I would focus
first on any pages that actually don't validate or where you want to add
some XML feature that can't currently be accomplished with the older
SGML-based pages.  Updating old content or adding new content to the
Handbook, I think, would be even more useful if you have the time.

> As a result, I think it would be a good idea. Maybe it would be a good
> SoC project for me to polish the pages in this way as I'm interested, I
> want to learn more XML stuff and I want to participate in the upcoming
> SoC again. Another item would be to bring the doc repo to DocBook5 / XML.


I don't think web projects like this are the main intent of the Summer of
Code program.  We had one project in this area in 2005, and Emily did an
excellent job with it, writing a LOT of XSLT code for us and completely
redesigning the web site, but converting the remaining SGML to XML isn't
really a good fit for the Summer of Code program.

But by all means, please do convert any individual SGML files to XML if that
is where your interests lie.

                    - Murray


