
PDF library?

I have a big PDF file that I'd like to crunch, i.e. I want to select a
certain rectangular area from each page and make a new PDF combining
the selected areas from adjacent pages. I guess that means I need a
Python wrapper for GhostScript, or something similar. Anyone know if
that exists? Thanks.
Jul 18 '05 #1
5 Replies


On Tue, 20 Apr 2004 12:14:03 -0700, Paul Rubin wrote:
I have a big PDF file that I'd like to crunch, i.e. I want to select a
certain rectangular area from each page and make a new PDF combining the
selected areas from adjacent pages. I guess that means I need a Python
wrapper for GhostScript, or something similar. Anyone know if that
exists? Thanks.

http://www.reportlab.org/

handles pdf files.

Simon.

Jul 18 '05 #2

Simon Burton <si****@NOTTHISBIT.webone.com.au> writes:
http://www.reportlab.org/

handles pdf files.


Reportlab generates reports in pdf format, but I want to do the
opposite, namely read in pdf files that have already been generated by
a different program, and crunch on them. Any more ideas? Thanks.
Jul 18 '05 #3

Aloha,

Paul Rubin wrote:
Simon Burton <si****@NOTTHISBIT.webone.com.au> writes:
http://www.reportlab.org/
handles pdf files.

Reportlab generates reports in pdf format, but I want to do the
opposite, namely read in pdf files that have already been generated by
a different program, and crunch on them. Any more ideas? Thanks.


The commercial version (reportlab.com) mentions a tool named
PageCatcher that seems to be able to extract pages and page descriptions
from .pdf documents. There is not much information on the web page.

If you read comp.text.tex you will find various solutions for composing
and a few for extracting data/content from .pdf documents. AFAIK there
is at the moment (read as: I'm working on it) no free, self-contained
Python solution. But as Python is very interface-friendly, you can
easily use general tools like gs.
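
As a rough illustration of driving gs from Python for the cropping part of the question, here is a minimal sketch. It assumes a Ghostscript executable named `gs` is on the PATH and uses the commonly cited recipe of a fixed media size plus a `PageOffset` shift. The function names and the rectangle convention (lower-left corner plus size, in points) are made up for this example, and the result may need adjusting depending on the Ghostscript version and the page boxes of the input file.

```python
import subprocess


def gs_crop_command(src, dst, x, y, width, height, gs="gs"):
    """Build a Ghostscript command that keeps one rectangle of every page.

    (x, y) is the lower-left corner of the rectangle and (width, height)
    its size, all in PostScript points (1/72 inch). These parameter names
    are illustrative, not part of any library.
    """
    return [
        gs, "-q", "-dNOPAUSE", "-dBATCH",
        "-sDEVICE=pdfwrite",
        f"-sOutputFile={dst}",
        # Force the output page to the size of the selected rectangle.
        f"-dDEVICEWIDTHPOINTS={width}",
        f"-dDEVICEHEIGHTPOINTS={height}",
        "-dFIXEDMEDIA",
        # Shift the page content so the rectangle lands at the origin.
        "-c", f"<</PageOffset [{-x} {-y}]>> setpagedevice",
        "-f", src,
    ]


def crop_pdf(src, dst, x, y, width, height):
    """Run Ghostscript; raises CalledProcessError if gs exits with an error."""
    subprocess.run(gs_crop_command(src, dst, x, y, width, height), check=True)
```

Calling `crop_pdf("big.pdf", "cropped.pdf", 36, 72, 300, 400)` would then write one cropped page per input page. Combining the areas from adjacent pages onto a single page is a separate step that this sketch does not attempt.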

For your problem I would suggest using gs as a .pdf to .ps filter
in the first place, working on the .ps, and distilling back to .pdf with gs.
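
The round trip suggested above could be sketched from Python roughly as follows. This is an assumption-laden outline, not a tested recipe: it presumes `gs` is on the PATH and that the build offers the `ps2write` device (older Ghostscript releases shipped `pswrite` instead). The helper names and the `edit_ps` callback are hypothetical placeholders for whatever work is done on the PostScript file.

```python
import subprocess


def pdf_to_ps_command(pdf_path, ps_path, gs="gs"):
    """Command list that converts a PDF file to PostScript."""
    return [
        gs, "-q", "-dNOPAUSE", "-dBATCH",
        "-sDEVICE=ps2write",  # assumption: use pswrite on old releases
        f"-sOutputFile={ps_path}",
        pdf_path,
    ]


def ps_to_pdf_command(ps_path, pdf_path, gs="gs"):
    """Command list that distills PostScript back to PDF."""
    return [
        gs, "-q", "-dNOPAUSE", "-dBATCH",
        "-sDEVICE=pdfwrite",
        f"-sOutputFile={pdf_path}",
        ps_path,
    ]


def round_trip(pdf_in, pdf_out, edit_ps, work_ps="work.ps"):
    """PDF -> PS, let edit_ps(path) modify the PS file, then PS -> PDF."""
    subprocess.run(pdf_to_ps_command(pdf_in, work_ps), check=True)
    edit_ps(work_ps)  # caller-supplied function that rewrites the file
    subprocess.run(ps_to_pdf_command(work_ps, pdf_out), check=True)
```

Keeping the command construction separate from the `subprocess.run` calls makes it easy to print and inspect the exact gs invocation before running it on a large file.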

Wishing a happy day
LOBI
Jul 18 '05 #4

Andreas Lobinger wrote:
If you read comp.text.pdf you will find various solutions for composing

Jul 18 '05 #5

Paul Rubin <http://ph****@NOSPAM.invalid> wrote in
news:7x************@ruckus.brouhaha.com:
Simon Burton <si****@NOTTHISBIT.webone.com.au> writes:
http://www.reportlab.org/

handles pdf files.


Reportlab generates reports in pdf format, but I want to do the
opposite, namely read in pdf files that have already been generated by
a different program, and crunch on them. Any more ideas? Thanks.


Reportlab does that as well, but you either have to pay them money or live
with a Reportlab watermark added to each page you process. So, if you are
doing this for fun it may not be a useful answer, but if it's commercial you
can investigate it for free and pay later to remove the watermark.

Jul 18 '05 #6
