Python, Perl & PDF files
Are there any plans in the near future to support PDF files in Python as
thoroughly and completely as Perl does? http://cpan.uwinnipeg.ca/search?query=pdf&mode=dist
I love Python's clean syntax and ease of use, etc. But on some things
(PDF for example) as barbaric as Perl's syntax is, it does outshine
Python... I hate having to use Perl just to deal with PDF files. What do
others do???

re: Python, Perl & PDF files
rbt <rbt@athop1.ath.vt.edu> writes:
> Are there any plans in the near future to support PDF files in Python
> as thoroughly and completely as Perl does?
>
> http://cpan.uwinnipeg.ca/search?query=pdf&mode=dist
Claiming that CPAN represents Perl "supporting" something isn't really
accurate. Those are just third party libraries, not support in the
language. There is an extensive set of third party libraries available
for Python as well, but there's no central repository to make finding
them easy.
That said, you can check out both pdflib and reportlab. pdflib is a
library that includes bindings for python, and reportlab is a
python-coded library for generating PDF. Since you don't say what you
want to do with PDF, I can't tell you which, if either, of these will
do what you want.
> I love Python's clean syntax and ease of use, etc. But on some things
> (PDF for example) as barbaric as Perl's syntax is, it does outshine
> Python... I hate having to use Perl just to deal with PDF files. What
> do others do???
CPAN is a nice thing, and I'm sure that someone, somewhere, is working
on producing one for Python. Until it shows up, you have to learn to
search multiple places for third party libraries. Google works in this
case - the very first link on a search for "python pdf" is to an
article that talks about using reportlab with python.
<mike
--
Mike Meyer <mwm@mired.org> http://www.mired.org/home/mwm/
Independent WWW/Perforce/FreeBSD/Unix consultant, email for more information.

re: Python, Perl & PDF files
On Mon, 25 Apr 2005 09:23:43 -0400, rumours say that rbt
<rbt@athop1.ath.vt.edu> might have written:
>Are there any plans in the near future to support PDF files in Python as
> thoroughly and completely as Perl does?
Before we let you know about our plans, what are *your* plans on this
subject? :) [0]
> http://cpan.uwinnipeg.ca/search?query=pdf&mode=dist
>
>I love Python's clean syntax and ease of use, etc. But on some things
>(PDF for example) as barbaric as Perl's syntax is, it does outshine
>Python... I hate having to use Perl just to deal with PDF files.
There are two issues here: a) a language, b) its library. You imply
that the Perl syntax outshines Python's because it has _more thorough
and complete support for PDF_, as you say. I don't see a connection,
but rather I see a lure to provoke answers, which doesn't always work
for you. For example, my reply would be more helpful and to the point
if that reasoning was missing from your post.
>What do others do???
Search google perhaps? Why do you feel that the first result of the
query "python pdf" does not help you?
[0] there's an ancient myth about a peasant's cart getting stuck in the
mud, so the peasant starts calling out for help from goddess Athena.
Another peasant passing by tells him: "Syn Athena kai kheira kinei",
which means, more or less, "keep on calling Athena, but start also using
your hands."
I don't know any related myth of anglo-saxon origin to quote.
--
TZOTZIOY, I speak England very best.
"Be strict when sending and tolerant when receiving." (from RFC1958)
I really should keep that in mind when talking with people, actually...

re: Python, Perl & PDF files
Christos TZOTZIOY Georgiou wrote:
> On Mon, 25 Apr 2005 09:23:43 -0400, rumours say that rbt
> <rbt@athop1.ath.vt.edu> might have written:
>
>>Are there any plans in the near future to support PDF files in Python as
>> thoroughly and completely as Perl does?
>
> Before we let you know about our plans, what are *your* plans on this
> subject? :) [0]
I just want to read PDF files in a portable way (windows, linux, mac)
from within Python.
>> http://cpan.uwinnipeg.ca/search?query=pdf&mode=dist
>
>>I love Python's clean syntax and ease of use, etc. But on some things
>>(PDF for example) as barbaric as Perl's syntax is, it does outshine
>>Python... I hate having to use Perl just to deal with PDF files.
>
> There are two issues here: a) a language, b) its library. You imply
> that the Perl syntax outshines Python's because it has _more thorough
> and complete support for PDF_, as you say. I don't see a connection,
> but rather I see a lure to provoke answers
I do not seek to provoke. Sorry if my question comes across that way to you.

re: Python, Perl & PDF files
On Mon, 25 Apr 2005 10:32:11 -0400, rumours say that rbt
<rbt@athop1.ath.vt.edu> might have written:
>I do not seek to provoke. Sorry if my question comes across that way to you.
Thanks for giving attention to my post, no need for apologies.
By the way, you didn't say in which way ReportLab and pdflib are not
helpful to you (packages suggested directly by Mike Meyer and indirectly
by me).
--
TZOTZIOY, I speak England very best.
"Be strict when sending and tolerant when receiving." (from RFC1958)
I really should keep that in mind when talking with people, actually...

re: Python, Perl & PDF files
rbt wrote:
.......
>
> I just want to read PDF files in a portable way (windows, linux, mac)
> from within Python.
>
.......
I suppose you mean extract PDF pages and do something with them. http://www.reportlab.com does have a tool that handles that in Python. It's not
free though.
There are indeed a number of perl modules which do that and other modules which
allow you to overprint etc etc.
You can always hand translate one of the extract perl modules. They don't seem
that hard. Alternatively put a good case to andy@reportlab.com.
--
Robin Becker

re: Python, Perl & PDF files
rbt wrote:
> I just want to read PDF files in a portable way (windows, linux, mac)
> from within Python.
reportlab is an excellent tool for generating PDF files, but as far as I
know, it doesn't "read" PDFs. http://www.reportlab.org/rl_toolkit.html
There's a project on SourceForge called PDF Playground: http://sourceforge.net/projects/pdfplayground
It's supposed to read/write PDF files.
I never tried it, but you might want to check it out.
I don't think there's anything like Perl's PDF support in Python,
but you can find tools and libs that might help.
regards,
Alex

re: Python, Perl & PDF files
On Mon, 25 Apr 2005 17:24:36 +0300, Christos "TZOTZIOY" Georgiou
<tzot@sil-tec.gr> declaimed the following in comp.lang.python:
> [0] there's an ancient myth about a peasant's cart getting stuck in the
> mud, so the peasant starts calling out for help from goddess Athena.
> Another peasant passing by tells him: "Syn Athena kai kheira kinei",
> which means, more or less, "keep on calling Athena, but start also using
> your hands."
> I don't know any related myth of anglo-saxon origin to quote.
The most commonly known phrasing would likely be "God only helps
those who help themselves".
--
> ============================================================== <
> wlfraed@ix.netcom.com | Wulfraed Dennis Lee Bieber KD6MOG <
> wulfraed@dm.net | Bestiaria Support Staff <
> ============================================================== <
> Home Page: <http://www.dm.net/~wulfraed/> <
> Overflow Page: <http://wlfraed.home.netcom.com/> <

re: Python, Perl & PDF files
Dennis Lee Bieber wrote:
........
>
> The most commonly known phrasing would likely be "God only helps
> those who help themselves".
>
Of course for politicians and others with troughed snouts it should read "God
punish those that help themselves", never seems to work out in practice though :(
-guzzling-ly yrs-
Robin Becker

re: Python, Perl & PDF files
Dennis Lee Bieber wrote:
> On Mon, 25 Apr 2005 17:24:36 +0300, Christos "TZOTZIOY" Georgiou:
>>I don't know any related myth of anglo-saxon origin to quote.
>
> The most commonly known phrasing would likely be "God only helps
> those who help themselves".
Google suggests that removing the word "only" produces a
phrase many times more commonly known...
-Peter

re: Python, Perl & PDF files
Peter Hansen wrote:
> Dennis Lee Bieber wrote:
>> On Mon, 25 Apr 2005 17:24:36 +0300, Christos "TZOTZIOY" Georgiou:
>>> I don't know any related myth of anglo-saxon origin to quote.
>
>> The most commonly known phrasing would likely be "God only helps
>> those who help themselves".
>
> Google suggests that removing the word "only" produces a
> phrase many times more commonly known...
And very interesting reading (to spawn another diversion
typical to c.l.p), such as the third link in Google
titled "Vessel of Honour: ..." (content available only
via the "Cached" link), which points out that this
biblical-sounding phrase was never in the bible,
but actually comes *from Greek mythology*, and specifically
(it claims) from the same story as Christos has
quoted, except that the "god" in question was Hercules
and other details differ somewhat...
Of course, the very next link then claims that it was
in fact the great god himself, Benjamin Franklin, who
gave us this phrase...
No doubt this is right up there with the origins of
"may you live in interesting times". :-)
-Peter

re: Python, Perl & PDF files
Peter Hansen wrote:
> Peter Hansen wrote:
>
>> Dennis Lee Bieber wrote:
>>
>>> On Mon, 25 Apr 2005 17:24:36 +0300, Christos "TZOTZIOY" Georgiou:
>>>
>>>> I don't know any related myth of anglo-saxon origin to quote.
>
>>> The most commonly known phrasing would likely be "God only helps
>>> those who help themselves".
>>
>> Google suggests that removing the word "only" produces a
>> phrase many times more commonly known...
>
> And very interesting reading (to spawn another diversion
> typical to c.l.p)
OK, I'm seeking to provoke now... why don't you go hijack some other
thread?
OK, I'm done seeking to provoke. So, it's official. Perl has *much*,
*much* better support for dealing with PDF files than does Python.
Hopefully that'll change one day soon. If I had the programming
knowledge, I'd get on it right away, but alas I do not so I cannot ;)
Thanks to all who responded on topic.

re: Python, Perl & PDF files
Christos TZOTZIOY Georgiou wrote:
> [0] there's an ancient myth about a peasant's cart getting stuck in the
> mud, so the peasant starts calling out for help from goddess Athena.
> Another peasant passing by tells him: "Syn Athena kai kheira kinei",
> which means, more or less, "keep on calling Athena, but start also using
> your hands."
> I don't know any related myth of anglo-saxon origin to quote.
A man prays very hard to God for a winning lottery ticket. He tells God
that he will use most of the money to do good works. Some he will use to
make life better for his family. He keeps praying and praying. He never
wins the lottery.
One day he is so angry, he goes to church and rants and raves to God
about not winning the lottery. Finally God comes and says to him "You
have to buy a ticket my son, for me to help you."
--
Michael Hoffman

re: Python, Perl & PDF files
This is highly frustrating !!
Did Athena come to help or not ?
Christos TZOTZIOY Georgiou wrote:
> On Mon, 25 Apr 2005 10:32:11 -0400, rumours say that rbt
> <rbt@athop1.ath.vt.edu> might have written:
>
>>I do not seek to provoke. Sorry if my question comes across that way to
>>you.
>
> Thanks for giving attention to my post, no need for apologies.
>
> By the way, you didn't say in which way ReportLab and pdflib are not
> helpful to you (packages suggested directly by Mike Meyer and indirectly
> by me).

re: Python, Perl & PDF files
Philippe C. Martin wrote:
> This is highly frustrating !!
>
> Did Athena come to help or not ?
>
> Christos TZOTZIOY Georgiou wrote:
>
>>On Mon, 25 Apr 2005 10:32:11 -0400, rumours say that rbt
>><rbt@athop1.ath.vt.edu> might have written:
>>
>>>I do not seek to provoke. Sorry if my question comes across that way to
>>>you.
>>
>>Thanks for giving attention to my post, no need for apologies.
>>
>>By the way, you didn't say in which way ReportLab and pdflib are not
>>helpful to you (packages suggested directly by Mike Meyer and indirectly
>>by me).
>
NO.

re: Python, Perl & PDF files
On Mon, 25 Apr 2005 18:54:51 +0100, Michael Hoffman
<cam.ac.uk@mh391.invalid> declaimed the following in comp.lang.python:
> One day he is so angry, he goes to church and rants and raves to God
> about not winning the lottery. Finally God comes and says to him "You
> have to buy a ticket my son, for me to help you."
Well, at least that doesn't have the... finality... of the one
about the woman and the flood... Cutting most of the set-up noise, this
one essentially runs:
Flood waters rising severely, woman climbs out onto roof of
home, and prays to be saved. Various rafts, boats, (modernized)
helicopters come by offering to take her off. She turns them all down
saying God will save her...
Pearly Gates scene: woman asks God why he didn't save her. His
response "I sent rafts, boats, and helicopters to you; you didn't take
them"
--
> ============================================================== <
> wlfraed@ix.netcom.com | Wulfraed Dennis Lee Bieber KD6MOG <
> wulfraed@dm.net | Bestiaria Support Staff <
> ============================================================== <
> Home Page: <http://www.dm.net/~wulfraed/> <
> Overflow Page: <http://wlfraed.home.netcom.com/> <

re: Python, Perl & PDF files
On Mon, 25 Apr 2005 13:00:51 -0400, rumours say that Peter Hansen
<peter@engcorp.com> might have written:
>Peter Hansen wrote:
>> Dennis Lee Bieber wrote:
>>> On Mon, 25 Apr 2005 17:24:36 +0300, Christos "TZOTZIOY" Georgiou:
>>>> I don't know any related myth of anglo-saxon origin to quote.
>
>>> The most commonly known phrasing would likely be "God only helps
>>> those who help themselves".
>
>> Google suggests that removing the word "only" produces a
>> phrase many times more commonly known...
>
>And very interesting reading (to spawn another diversion
>typical to c.l.p), such as the third link in Google
>titled "Vessel of Honour: ..." (content available only
>via the "Cached" link), which points out that this
>biblical-sounding phrase was never in the bible,
>but actually comes *from Greek mythology*, and specifically
>(it claims) from the same story as Christos has
>quoted, except that the "god" in question was Hercules
>and other details differ somewhat...
It seems that I mixed two myths (I should pay more attention probably to
mythology lessons at primary school :). The fable with Hercules is the
correct one, as far as a cart and mud is concerned. The one about
Athena and arm-motion is this Aesop's fable: http://www.mythfolklore.net/aesopica/oxford/480.htm
Now there's another phrase, "the smart bird gets caught by the beak",
and I don't know if I'm a smart bird, but my nose is big...
--
TZOTZIOY, I speak England very best.
"Be strict when sending and tolerant when receiving." (from RFC1958)
I really should keep that in mind when talking with people, actually...

re: Python, Perl & PDF files
On Tue, 26 Apr 2005 11:58:29 +0300, rumours say that Christos "TZOTZIOY"
Georgiou <tzot@sil-tec.gr> might have written:
> http://www.mythfolklore.net/aesopica/oxford/480.htm
BTW, does anyone see the connection between:
> we should make every possible effort on our own behalf
> and only then ask for divine assistance
and proper netiquette?-) http://www.catb.org/~esr/faqs/smart-questions.html
--
TZOTZIOY, I speak England very best.
"Be strict when sending and tolerant when receiving." (from RFC1958)
I really should keep that in mind when talking with people, actually...

re: Python, Perl & PDF files
Robin Becker wrote:
> rbt wrote:
> ......
> >
> > I just want to read PDF files in a portable way (windows, linux, mac)
> > from within Python.
> >
> ......
>
> I suppose you mean extract PDF pages and do something with them.
> http://www.reportlab.com does have a tool that handles that in
> Python.
> It's not free though.
I imagine that you pay for a reasonable level of support.
> There are indeed a number of perl modules which do that and other
> modules which allow you to overprint etc etc.
>
> You can always hand translate one of the extract perl modules. They
> don't seem that hard. Alternatively put a good case to
> andy@reportlab.com.
Before embarking on that route, it might be worth looking at this page: http://phaseit.net/claird/comp.text....onverters.html
There's a link to a (surprisingly recent) snapshot of my own package,
that can be used to read some PDF files, and another highly
recommended module. In the interests of balance, if not completeness,
I should also mention PDF Playground which has better support for
reading and writing PDF files: http://sourceforge.net/projects/pdfplayground/
Maybe this should also be listed on the above resources page. Cameron?
Are you reading this? ;-)
David

re: Python, Perl & PDF files
davidb@mcs.st-and.ac.uk wrote:
> Robin Becker wrote:
>
>>rbt wrote:
>>......
>>
>>>I just want to read PDF files in a portable way (windows, linux, mac)
>>>from within Python.
>>>
>>
>>......
>>
>>I suppose you mean extract PDF pages and do something with them.
>> http://www.reportlab.com does have a tool that handles that in
>>Python.
>>It's not free though.
>
> I imagine that you pay for a reasonable level of support.
>
>>There are indeed a number of perl modules which do that and other
>>modules which allow you to overprint etc etc.
>>
>>You can always hand translate one of the extract perl modules. They
>>don't seem that hard. Alternatively put a good case to
>>andy@reportlab.com.
>
> Before embarking on that route, it might be worth looking at this page:
>
> http://phaseit.net/claird/comp.text....onverters.html
>
> There's a link to a (surprisingly recent) snapshot of my own package,
> that can be used to read some PDF files, and another highly
> recommended module. In the interests of balance, if not completeness,
> I should also mention PDF Playground which has better support for
> reading and writing PDF files:
>
> http://sourceforge.net/projects/pdfplayground/
>
> Maybe this should also be listed on the above resources page. Cameron?
> Are you reading this? ;-)
>
> David
>
Thanks David. I'll see what these tools can do for me. Wouldn't you
agree that it would be in Python's interest to have a standard way in
which it handles (reads, writes, etc) PDF files on multiple platforms?
It would *really* help me... I hate using Barbaric Perl scripts just
to deal with PDF files, but if I must, I must.

re: Python, Perl & PDF files
In article <d4j91s$bfp$1@solaris.cc.vt.edu>,
rbt <rbt@athop1.ath.vt.edu> wrote:

re: Python, Perl & PDF files
Cameron Laird wrote:
> In article <d4j91s$bfp$1@solaris.cc.vt.edu>,
> rbt <rbt@athop1.ath.vt.edu> wrote:
> .
> .
> .
>
>>OK, I'm done seeking to provoke. So, it's official. Perl has *much*,
>>*much* better support for dealing with PDF files than does Python.
>
> .
> .
> .
> No, it's not; maybe Perl even has *worse* support. At the very least,
> there's more redundancy in what Perl offers.
>
> I understand that <URL: http://cpan.uwinnipeg.ca/search?query=pdf&mode=dist >
> looks rather convincing; in fact, a couple of years ago, we argued <URL:
> http://www.unixreview.com/documents/s=7822/ur0304g/ > that indeed Perl was
> ahead. I'm not sure that's true now.
>
> Rather than wander into all the details, let's start over: what kinds of
> things do you think you want to do with PDFs? You might be surprised to
> find that, despite all that CPAN *seems* to offer, your needs aren't met
> at all. Or maybe they are. It depends. Let's get specific.
Read and search them for strings. If I could do that on windows, linux
and mac with the *same* bit of Python code, I'd be very happy ;)

re: Python, Perl & PDF files
In article <d4m9hl$8br$1@solaris.cc.vt.edu>,
rbt <rbt@athop1.ath.vt.edu> wrote:

re: Python, Perl & PDF files
Cameron Laird wrote:
> In article <d4m9hl$8br$1@solaris.cc.vt.edu>,
> rbt <rbt@athop1.ath.vt.edu> wrote:
> .
> .
> .
>
>>Read and search them for strings. If I could do that on windows, linux
>>and mac with the *same* bit of Python code, I'd be very happy ;)
>
> Textual content, right? Without regard to font funniness, or
> whether the string is in or out of a table, and so on?
That's right. More specifically, I've written a script that uses a RE to search
through documents for social security numbers. You can see it here: http://filebox.vt.edu/users/rtilley/...find_ssns.html
This works on Word, Excel, html, rtf or any ANSI based text. I need the ability to
read and make sense of PDF files as well so I can apply the RE to their content. It's
been frustrating to say the least. Nothing at all against Python... mostly just sick
of hearing about the 'Portable' document format that isn't string or RE searchable...
at least not easily anyway.
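For what it's worth, once the text is out of the PDF, the search itself is the easy part. Below is a minimal sketch, not the script linked above. It assumes the extraction is handed to an external converter, here the `pdftotext` command-line tool, which is not part of Python and has to be installed separately on each platform. It also assumes SSNs are written as NNN-NN-NNNN. The names `SSN_RE`, `find_ssns` and `pdf_to_text` are made up for illustration.

```python
import re
import subprocess

# Assumed pattern: three digits, two digits, four digits, hyphen-separated.
# The word boundaries keep it from matching inside longer digit runs.
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def find_ssns(text):
    """Return every SSN-shaped string found in already-extracted text."""
    return SSN_RE.findall(text)


def pdf_to_text(path):
    """Extract text by running the external pdftotext tool (must be on PATH).

    Passing "-" as the output file makes pdftotext write to stdout.
    """
    result = subprocess.run(
        ["pdftotext", path, "-"],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the converter fails
    )
    return result.stdout
```

Something like `find_ssns(pdf_to_text("some.pdf"))` should then be the same bit of Python on windows, linux and mac, provided the converter is installed on each. Scanned PDFs that hold only page images will most likely come back with no text at all this way.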
> 'Might be a few days before I answer; I'm crashing into end-of-
> the-month deadlines.
No problem. Thanks for the help.

re: Python, Perl & PDF files
Hopefully, Adobe will choose to support SVG as a response to
Microsoft's "Metro", and take us all off the hook with respect to
cracking open their proprietary format.

re: Python, Perl & PDF files
In article <d4mme5$2ej$1@solaris.cc.vt.edu>,
rbt <rbt@athop1.ath.vt.edu> wrote:
>Cameron Laird wrote:
>> In article <d4m9hl$8br$1@solaris.cc.vt.edu>,
>> rbt <rbt@athop1.ath.vt.edu> wrote:
>> .
>> .
>> .
>>
>>>Read and search them for strings. If I could do that on windows, linux
>>>and mac with the *same* bit of Python code, I'd be very happy ;)
>>
>> Textual content, right? Without regard to font funniness, or
>> whether the string is in or out of a table, and so on?
>
>That's right. More specifically, I've written a script that uses a RE to search
>through documents for social security numbers. You can see it here:
>
> http://filebox.vt.edu/users/rtilley/...find_ssns.html
>
>This works on Word, Excel, html, rtf or any ANSI based text. I need the
>ability to read and make sense of PDF files as well so I can apply the RE
>to their content. It's been frustrating to say the least. Nothing at all
>against Python... mostly just sick of hearing about the 'Portable' document
>format that isn't string or RE searchable... at least not easily anyway.