Date:      Sat, 13 Oct 2012 13:19:07 +0200
From:      Polytropon <>
To:        Gary Kline <>
Cc:        FreeBSD Mailing List <>
Subject:   Re: editing pdf files

On Fri, 12 Oct 2012 16:46:28 -0700, Gary Kline wrote:
> 	ive got a question that fits in here.  hopefully.
> 	last week  I found a book from 1901 that google had scanned and listed
> 	as a pdf file.  it was text plus photos of the rich/famous of the 
> 	1800s.  somehow, google found the exact string that matched my great
> 	grandfather [from the civil war].  I d'loaded the file (maybe 2mbytes)
> 	and searched using acroread.  nada.  I used the pdftotext utility.
> 	same: nothing but  some 600 page numbers.
> 	my guess is that google just took photos of the book and used other
> 	tools to create a pdf file.  I am not =that= serious  about genealogy,
> 	but I would like to know if there are any tools to edit this kind of
> 	pdf file.

In case the PDF is nothing more than a compilation of images,
there's a way to deal with it for editing:

step 1: disassemble
step 2: edit images
step 3: reassemble
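
Before going through these steps, it can be worth checking whether the
PDF really carries no usable text. The following is only a rough sketch:
it assumes pdffonts (shipped alongside pdfimages and pdftotext) is
installed and that its output starts with two header lines, and the
function name has_fonts is made up for this example. No fonts at all
strongly suggests a pure image PDF; fonts being present does not
guarantee that the text is searchable.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if pdffonts lists at least one font.
# Assumption: pdffonts prints a two-line header before any font entries.
has_fonts() {
	pdffonts "$1" | awk 'NR > 2 { found = 1 } END { exit !found }'
}
```

It could be used like this:

	% has_fonts source.pdf && echo "has fonts" || echo "images only?"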

The disassembling can be done with 

	% pdfimages source.pdf .

Then the files can be edited with whatever tool you like, e.g. GIMP.
They often come out in PBM format.

Finally the images can be converted back to PDF and combined into one
PDF file:

	for IMG in .*.pbm; do
		convert "${IMG}" "${IMG}.pdf"
	done
	pdftk .*.pdf cat output target.pdf

Note the ".*" prefix for the file specification: The images extracted
by pdfimages match that pattern (at least in the case I tested it for).
If they get names other than .0000001.pbm, change the pattern accordingly.
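
To depend less on the file names, the same three steps can be wrapped
in two small functions that use a visible name prefix inside a work
directory. This is a sketch, not a tested recipe: the function names
extract_pages and assemble_pages, the "page" prefix and the work
directory are my own choices, and it assumes pdfimages names its output
<root>-NNN.pbm or <root>-NNN.ppm as its man page describes.

```shell
#!/bin/sh
# Sketch only: function names and the "page" prefix are illustrative,
# they are not part of pdfimages, ImageMagick or pdftk.

# Step 1: extract every embedded image into a work directory.
# A visible root such as "page" avoids hidden dot-files.
extract_pages() {
	src=$1
	workdir=$2
	mkdir -p "$workdir" || return 1
	pdfimages "$src" "$workdir/page"
}

# Step 3: convert each (edited) image to a one-page PDF, then join them.
# The zero-padded numbers keep the shell glob in page order.
assemble_pages() {
	workdir=$1
	target=$2
	for img in "$workdir"/page-*.p[bp]m; do
		[ -e "$img" ] || continue	# pattern matched nothing
		convert "$img" "${img%.*}.pdf" || return 1
	done
	pdftk "$workdir"/page-*.pdf cat output "$target"
}
```

Step 2 (editing) happens between the two calls:

	% extract_pages source.pdf work
	  (edit work/page-*.pbm with Gimp)
	% assemble_pages work target.pdf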

Magdeburg, Germany
Happy FreeBSD user since 4.0
Andra moi ennepe, Mousa, ...
