Date:      Sun, 4 Nov 2007 02:39:14 +0100
From:      cpghost <cpghost@cordula.ws>
To:        freebsd-questions@freebsd.org
Cc:        Gary Kline <kline@tao.thought.org>
Subject:   Re: pdf edit again.
Message-ID:  <20071104023914.3fabd2e7@epia-2.farid-hajji.net>
In-Reply-To: <20071104003851.GA98655@thought.org>
References:  <20071104003851.GA98655@thought.org>

On Sat, 3 Nov 2007 16:38:55 -0800
Gary Kline <kline@tao.thought.org> wrote:

> A couple weeks ago I skimmed thru the postings on editing PDF
> files.  Wasn't entirely clear what the answer was because I
> never thought I would need to edit a GUI file.  I just found a book
> from 1883 in pdf format.  I would like a text/ASCII/ISO_8859-1
> version.  Tried pdftotext, but it doesn't work.  Nutshell: is
> there something I can use to edit/look-at this book and get
> rid of whatever it is that's causing pdftotext to fail?  (sorry for
> the grammar.... )

Old books in PDF are normally scanned bitmaps. There are no characters
in there for pdftotext to extract; just pixels, stored as embedded
page images. If you want to convert that to ASCII, you'd need to
extract the page images (use something like pdfimages from the xpdf
port), convert them to whatever bitmap format your OCR software reads,
and run the OCR on that. It's a slow, unreliable, error-prone and
painful process though.
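
The steps above could be scripted roughly like this. It's only a
sketch under a few assumptions: pdfimages and an OCR program are on
your PATH, the OCR program is tesseract (my pick for illustration;
any OCR tool that reads PNM files would do), and pdfimages writes
its default PPM/PGM/PBM output. The function names are made up for
this example.

```python
import subprocess
from pathlib import Path


def pdfimages_command(pdf_path, image_root):
    """Build the pdfimages call; it writes files named image_root-NNN.ppm/.pbm."""
    return ["pdfimages", str(pdf_path), str(image_root)]


def tesseract_command(image_path):
    """Build the OCR call (assumes tesseract); it appends .txt to the output base."""
    image_path = Path(image_path)
    return ["tesseract", str(image_path), str(image_path.with_suffix(""))]


def ocr_scanned_pdf(pdf_path, work_dir):
    """Extract the page images from a scanned PDF, OCR each, return the joined text."""
    work_dir = Path(work_dir)
    work_dir.mkdir(parents=True, exist_ok=True)

    # Step 1: dump every embedded image as work_dir/page-NNN.p?m
    subprocess.run(pdfimages_command(pdf_path, work_dir / "page"), check=True)

    # Step 2: OCR the images in page order (the numbering is zero-padded,
    # so a plain sort keeps them in sequence)
    pages = []
    for image in sorted(work_dir.glob("page-*.p?m")):
        subprocess.run(tesseract_command(image), check=True)
        text_file = image.with_suffix(".txt")
        pages.append(text_file.read_text(encoding="utf-8", errors="replace"))
    return "\n".join(pages)
```

You would call it as ocr_scanned_pdf("book.pdf", "ocr-work") and
write the returned string to a file. Expect to proofread the result
by hand; OCR on 19th century typefaces makes plenty of mistakes.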

Good luck!

-cpghost.

-- 
Cordula's Web. http://www.cordula.ws/


