Date:      Sat, 13 Oct 2012 17:26:42 -0700
From:      Gary Kline <kline@thought.org>
To:        Polytropon <freebsd@edvax.de>
Cc:        FreeBSD Mailing List <freebsd-questions@freebsd.org>
Subject:   Re: editing pdf files
Message-ID:  <20121014002642.GA26447@ethic.thought.org>
In-Reply-To: <20121013231536.c703bc21.freebsd@edvax.de>
References:  <5074A6B9.8040209@dreamchaser.org> <5078641D.4050905@passap.ru> <20121012234628.GA11112@ethic.thought.org> <20121013131907.c666bfc2.freebsd@edvax.de> <20121013204701.GE14155@ethic.thought.org> <20121013231536.c703bc21.freebsd@edvax.de>

On Sat, Oct 13, 2012 at 11:15:36PM +0200, Polytropon wrote:
> On Sat, 13 Oct 2012 13:47:01 -0700, Gary Kline wrote:
> > On Sat, Oct 13, 2012 at 01:19:07PM +0200, Polytropon wrote:
> > > On Fri, 12 Oct 2012 16:46:28 -0700, Gary Kline wrote:
> > > 
> > > The disassembling can be done with 
> > > 
> > > 	% pdfimages source.pdf .
> > > 
> > > Then the files can be edited with whatever tool you like, e.g. Gimp.
> > > They often come out in PBM format.
> > > 
> > 
> > 
> > 	A question I should have asked last time.  This book is a history or
> > 	bio of Richland County, Ohio; in type, it's like 650 or more
> > 	pages.  So: is pdfimages going to spit out 650 files?  As noted
> > 	in my last email, only a couple of these images are of any interest
> 
> Depends on what actually _is_ in the PDF file. If every page is
> represented as a picture, 650 pictures will be created. If it
> contains text _and_ images, the images will be output; it will
> _only_ output the images, with no real relation to where they
> have been placed in the text. As suggested by the name "pdfimages"
> it takes the images from the PDF file. :-)
> 
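If only a couple of pages matter, one way to avoid hundreds of files is to limit the extraction to a page range. A minimal sketch for a Bourne-style shell (sh), assuming the installed pdfimages accepts the usual -f (first page) and -l (last page) options; the function name, the page numbers, and the "page" prefix are made up for illustration:

```shell
# Sketch: extract images from a page range instead of the whole book.
# -f / -l select the first and last page to scan. The function name and
# its argument order are hypothetical helpers, not part of pdfimages.
extract_page_images() {
    pdf=$1
    first=$2
    last=$3
    prefix=${4:-img}    # output files are named <prefix>-NNN.<ext>
    pdfimages -f "$first" -l "$last" "$pdf" "$prefix"
}

# Example (page numbers are placeholders):
#   extract_page_images source.pdf 120 122 page
```

Note that these are page positions within the PDF file, which may not match the page numbers printed in the book.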
> The easiest way to check for possible text is to install xpdf
> which brings the binary "pdftotext" (if I remember correctly that
> this tool is in _that_ package). You can then use it like this:
> 
> 	% pdftotext source.pdf
> 
> It will create "source.txt" with all actual text (but of course
> without _any_ formatting except line breaks and ^L page breaks),
> including page numbers. But hey, it's pure ASCII text suitable
> for further processing. :-)
> 
> Run "pdftotext" without parameters for a short summary of its
> parameters; "man pdftotext" is also provided.
> 


	Well, then my original instincts were right.  I ran
	pdftotext <file.pdf> and nothing but the page numbers were
	there.  Rats.  Oh well, at least I can type in by hand what
	I want :)
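
For a 650-page file, checking the pdftotext output by eye is tedious. Since pages in that output are separated by ^L (form feed), a rough check for pages that hold more than a bare page number could look like the sketch below. It is written for a Bourne-style shell (sh) with awk; the function name is hypothetical, and "has at least one ASCII letter" is only an assumed heuristic for "real text":

```shell
# Sketch: list the pages of a pdftotext output file that contain real text.
# Each form-feed-separated record is one page; a page is reported if it
# contains at least one ASCII letter, so bare page numbers are ignored.
pages_with_text() {
    awk 'BEGIN { RS = "\f" } /[A-Za-z]/ { print NR }' "$1"
}

# Usage:
#   pdftotext source.pdf          # writes source.txt
#   pages_with_text source.txt    # prints one page position per line
```

No output at all would mean the pages are scanned images with no text layer, which matches the result described above.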


> 
> -- 
> Polytropon
> Magdeburg, Germany
> Happy FreeBSD user since 4.0
> Andra moi ennepe, Mousa, ...

-- 
 Gary Kline  kline@thought.org  http://www.thought.org  Public Service Unix
              Twenty-six years of service to the Unix community.



