Date:      Wed, 22 Apr 2020 23:35:53 +0100
From:      "Norman Gray" <norman.gray@glasgow.ac.uk>
To:        Jordan <freebsd@jdev.sent.com>
Cc:        <freebsd-questions@freebsd.org>
Subject:   Re: PDF Documents Manipulation Software options
Message-ID:  <366AA2B3-5107-4336-AFBC-7D1821618289@glasgow.ac.uk>
In-Reply-To: <09e273ff-4d9d-47eb-a6e1-d91f18c8a0ef@www.fastmail.com>
References:  <09e273ff-4d9d-47eb-a6e1-d91f18c8a0ef@www.fastmail.com>

Greetings.

On 22 Apr 2020, at 23:14, Jordan wrote:

> Any suggestions that you use or have heard that works with FreeBSD?

I have gathered a few links here, with (it sounds) similar goals in 
mind, but each time round this loop I've managed to solve my immediate 
problems without investigating what I've found too rigorously.  Also, 
when doing this I've been primarily working on macOS.  Bearing all that 
in mind, however, my notes are below.

_Just_ before sending this message, Polytropon's message appeared 
on-list.  They queried your statement that

> I work
> with hundreds of PDFs each day so I cannot work within a CLI to
> manipulate the pages.

I think that, in drafting my answer, I'd automatically misread what you 
said as 'so I cannot work _without_ a CLI to manipulate the images'.   
Echoing Polytropon, what tools are useful of course depends on just what 
you need to do, but whilst acknowledging that I may be answering a 
question you didn't ask, my notes below focus on programmatic 
manipulation of PDFs.

Good luck,

Norman





There is a Python library called 
[pikepdf](https://github.com/pikepdf/pikepdf).  It looks promising, but 
I had a little trouble building it -- I gave up before trying very hard, 
though.  This tool compares itself (favourably, of course) to PyPDF2, 
which seems to be the conventional suggestion.

However, it seems to install happily enough as a Python package (via 
venv/pip).  Then:

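     from pikepdf import Pdf
     import glob

     # Concatenate part1.pdf, part2.pdf, ... into a single document.
     pdf = Pdf.new()
     # glob.glob() returns names in arbitrary order, so sort them first.
     for file in sorted(glob.glob('part?.pdf')):
         src = Pdf.open(file)
         pdf.pages.extend(src.pages)

     pdf.save('allparts.pdf')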
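
With more than nine parts the merge order needs some care: glob.glob() makes no promise about ordering, and even a plain sorted() would put part10.pdf before part2.pdf.  A rough sketch of one way round that, assuming the parts are numbered and a wider pattern such as 'part*.pdf' is used; the names natural_key and merge_parts are my own illustrations, not part of pikepdf:

```python
import glob
import re


def natural_key(name):
    """Sort key that compares runs of digits as numbers, not as text."""
    # re.split with a capturing group alternates text, digits, text, ...
    # so the odd-indexed chunks are always the digit runs.
    chunks = re.split(r"(\d+)", name)
    return [int(c) if i % 2 else c.lower() for i, c in enumerate(chunks)]


def merge_parts(pattern, output):
    """Concatenate every PDF matching pattern, in natural order, into output."""
    # Imported here so that natural_key can be used without pikepdf installed.
    from pikepdf import Pdf

    merged = Pdf.new()
    for path in sorted(glob.glob(pattern), key=natural_key):
        src = Pdf.open(path)
        merged.pages.extend(src.pages)
    merged.save(output)
```

Something like merge_parts('part*.pdf', 'allparts.pdf') would then behave like the loop above, but with part2.pdf ahead of part10.pdf.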

[pdftk](https://www.pdflabs.com/tools/pdftk-server/) is a ‘toolkit’ 
(not sure just what this means in this context), but it includes a 
[command-line 
interface](https://www.pdflabs.com/docs/pdftk-cli-examples/) whose 
documentation includes some useful examples such as

     % pdftk a.pdf b.pdf cat output ab.pdf

This also looks a bit tricky to build from scratch.
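
If pdftk is already installed, the concatenation shown above could presumably be driven from a script rather than typed by hand.  A minimal sketch, assuming a pdftk binary on the PATH and using only the `cat output` form from the example; the function names are hypothetical:

```python
import subprocess


def pdftk_cat_command(inputs, output):
    """Build the argument list for: pdftk <inputs...> cat output <output>."""
    if not inputs:
        raise ValueError("need at least one input PDF")
    return ["pdftk", *inputs, "cat", "output", output]


def pdftk_cat(inputs, output):
    """Run pdftk to concatenate inputs; raises CalledProcessError on failure."""
    # Passing a list (not a shell string) avoids quoting problems with
    # filenames that contain spaces.
    subprocess.run(pdftk_cat_command(inputs, output), check=True)
```

Building the argument list separately from running it makes it easy to print the command and check it before letting it loose on hundreds of files.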

-- 
Norman Gray  :  http://www.astro.gla.ac.uk/users/norman/it/
Research IT Coordinator  :  School of Physics and Astronomy
// My current template week for IT tasks is: Monday, Tuesday, and Friday
