Date:      Mon, 24 Mar 2003 04:07:45 +0200
From:      Giorgos Keramidas <keramida@ceid.upatras.gr>
To:        Jeroen Ruigrok/asmodai <asmodai@wxs.nl>
Cc:        freebsd-doc@FreeBSD.ORG
Subject:   Re: docs/50211: [PATCH] Fix textfile creation
Message-ID:  <20030324020745.GA22656@gothmog.gr>
In-Reply-To: <200303231710.h2NHAGEb024196@freefall.freebsd.org>
References:  <200303231710.h2NHAGEb024196@freefall.freebsd.org>

On 2003-03-23 09:10, Jeroen Ruigrok/asmodai <asmodai@wxs.nl> wrote:
>-On [20030323 15:07], Ceri Davies (ceri@FreeBSD.org) wrote:
>> We discussed this on -doc a month or so ago, and were generally thinking of
>> going back to www/lynx, because this also gets localized text builds working.
>
> Problem I had with lynx was that I was unable to make it parse
> book.html-tex as text/html.
> w3m has a -T flag for this, elinks just looks at the file itself, or
> perhaps just assumes it is HTML.
>
> >Would you happen to know if elinks has this advantage too ?
>
> It does, but I don't know for certain for which languages it all works:
>
> elinks -dump -dump-charset iso-8859-15 http://www.paris.fr/
>
> gives me accent aigus, accent circumflexes, etc.
>
> I would be interested in hearing about non-Latin-based examples and how
> they work out.

Fetching...  I'll try it with a Greek document in a while.

giorgos@gothmog[03:57]/tmp$ elinks -dump-charset ISO-8859-7 -dump 1 lala.html
   AAeec,ieeue eaass`iaaii.
giorgos@gothmog[03:57]/tmp$ grep charset lala.html
    <meta name="http-equiv" content="text/html; charset="ISO-8859-7">

Hrmf... Doesn't quite work.  At least, it doesn't work without
tweaking the ~/.elinks files and stuff.  This is bad, because we can't
use elinks for batch-mode conversion of many different languages and
charsets without first configuring it through the curses interface.

There is an -eval command line option that should probably work fine
with non-ISO-8859-1 texts, when used as:

	elinks -eval 'set document.codepage.assume = "ISO-8859-7"' \
	    -eval 'set terminal.vt220.charset = "ISO-8859-7"' \
	    -dump 1 lala.html

but I can't seem to find any good way of making this output raw 8-bit
text for Greek :(

And I even have my locale set up for Greek:

	giorgos@gothmog[04:06]/tmp$ env | grep LC
	LC_COLLATE=el_GR.ISO8859-7
	LC_CTYPE=el_GR.ISO8859-7

- Giorgos