Date:      Thu, 27 May 2010 05:03:02 +0200
From:      Polytropon <freebsd@edvax.de>
To:        Gary Kline <kline@thought.org>
Cc:        FreeBSD Mailing List <freebsd-questions@freebsd.org>
Subject:   Re: any shortcuts to doc to ascii?
Message-ID:  <20100527050302.da39c258.freebsd@edvax.de>
In-Reply-To: <20100527013843.GA40751@thought.org>
References:  <20100527013843.GA40751@thought.org>

On Wed, 26 May 2010 18:38:47 -0700, Gary Kline <kline@thought.org> wrote:
> 
> 
> guys,
> 
> is there anything that can take these hex triplets such as
> 
> We Don\xe2\x80\x99t
> 
> and render them back to the ascii or keyboard equivalents?
> in this case, the \x99 would be an apostrophe.
> thus:
> 
> 
> We Don't
> 
> tia,
> 
> gsry
> 
> ps: even lynx -dump messes up, i believe.  i'm trying to go from
> DOC  back to typewriter.... 


Yes, even a typewriter is better than DOC. :-)

There are several ways to convert DOC files to ASCII, of varying
complexity:

Most complex: open the file in OpenOffice or AbiWord and save it
as plain text. Any "special characters" it contains should then
come out in their regular ASCII representation.

Better: use catdoc or antiword (both available from ports).

I'm not sure to what extent conflicting codepages are involved here.
"Windows" is known to have problems supporting standards, and this
applies to character sets and language variations, too.
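
For text that has already been dumped with literal \xNN escapes, as
in the quoted example, a small script may be enough. The three bytes
e2 80 99 are the UTF-8 encoding of U+2019 (right single quotation
mark), so the idea is to turn the escapes back into bytes, decode
them as UTF-8, and then replace typographic punctuation with plain
ASCII. This is only a sketch: the function names and the replacement
table are my own invention, the table is far from complete, and it
assumes the escapes really are UTF-8 and not some Windows codepage.

```python
import re

# Illustrative, incomplete table of typographic characters and the
# ASCII stand-ins chosen for them here (an assumption, extend as needed).
ASCII_EQUIVALENTS = {
    "\u2018": "'",    # left single quotation mark
    "\u2019": "'",    # right single quotation mark
    "\u201c": '"',    # left double quotation mark
    "\u201d": '"',    # right double quotation mark
    "\u2013": "-",    # en dash
    "\u2014": "--",   # em dash
    "\u2026": "...",  # horizontal ellipsis
}


def unescape_hex(text):
    """Turn literal \\xNN escapes into bytes and decode the result as UTF-8."""
    raw = re.sub(
        rb"\\x([0-9a-fA-F]{2})",
        lambda m: bytes([int(m.group(1), 16)]),
        text.encode("utf-8"),
    )
    # Bytes that are not valid UTF-8 become U+FFFD instead of raising.
    return raw.decode("utf-8", errors="replace")


def to_ascii(text):
    """Decode the escapes, then swap known typographic characters for ASCII."""
    decoded = unescape_hex(text)
    for fancy, plain in ASCII_EQUIVALENTS.items():
        decoded = decoded.replace(fancy, plain)
    return decoded


if __name__ == "__main__":
    print(to_ascii(r"We Don\xe2\x80\x99t"))
```

Characters that are not in the table pass through unchanged, so the
output is only pure ASCII if the table covers everything the
document actually uses.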



-- 
Polytropon
Magdeburg, Germany
Happy FreeBSD user since 4.0
Andra moi ennepe, Mousa, ...


