convert pdf to png

I need to take the PDF output from reportlab and create a preview image
for a web page, so PNG or something. I am sure ghostscript will be involved.
I am guessing PIL or ImageMagick?

All suggestions welcome.

Carl K
Dec 24 '07
Carl K schrieb:
Grant Edwards wrote:
>On 2007-12-24, Carl K <ca**@personnelware.com> wrote:
>>>If it is a multi page pdf Imagemagick will do:

convert file.pdf page-%03d.png
I need python code to do this. It is going to be run on a
someone else's shared host web server, security and
performance is an issue. So I would rather not run stuff via
popen.

Use subprocess.

Trying to eliminate popen because of the overhead when running
ghostscript to render PDF (I assume convert uses gs?) is about
like trimming an elephant's toenails to save weight.

maybe, but I wouldn't be so sure.

currently the pdf is created in a python StringIO buffer and returned to
the browser; so it never becomes a file. using convert means I have to
first save it as a file, convert from file to file, read the file,
delete the 2 files. so 6 file operations where before there were none.
That may be more of a load than the ghostscript part.
So what? I'm not sure about current HD speeds, but a couple of years ago
these were about 30MByte/s - and should be faster today. Which equals
240MBit/s, much more than your user's internet connection. and this is
raw IO speed, not counting disk caches.

In other words: given the overall latency of a network connection, your
file operations shouldn't cost more than a split second. So if
you _can_ go the subprocess road, do it. It's the easiest way. And
without further knowledge of the GS library (that you lack, as do I) -
how do you know that it works "in memory", and doesn't actually expect a
file name or pointer?
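
The subprocess road could look roughly like the sketch below. It is only an illustration, written for current Python 3 rather than the 2.5 of this thread: it assumes ImageMagick's `convert` is on the PATH, and the names `build_convert_command`, `pdf_bytes_to_pngs` and the injectable `run` parameter are made up for this example. The PDF bytes are written into a temporary directory, converted there, read back, and the directory is removed again.

```python
import subprocess
import tempfile
from pathlib import Path


def build_convert_command(pdf_path, png_pattern):
    """Return the argv for ImageMagick's 'convert file.pdf page-%03d.png'."""
    return ["convert", str(pdf_path), str(png_pattern)]


def pdf_bytes_to_pngs(pdf_bytes, run=subprocess.run):
    """Write the PDF to a temp dir, run convert, return one PNG per page.

    'run' is a hypothetical injection point so the function can be tried
    without ImageMagick installed; by default it is subprocess.run.
    """
    with tempfile.TemporaryDirectory() as tmp:
        pdf_path = Path(tmp) / "in.pdf"
        pdf_path.write_bytes(pdf_bytes)
        # An argument list (no shell=True) means no shell ever parses the names.
        run(build_convert_command(pdf_path, Path(tmp) / "page-%03d.png"),
            check=True)
        # Collect the pages in order; the directory is deleted on exit.
        return [p.read_bytes() for p in sorted(Path(tmp).glob("page-*.png"))]
```

Called as `pdf_bytes_to_pngs(buffer.getvalue())`, this hands back a list of PNG byte strings, so the caller never has to manage the temporary files itself.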

Diez
Dec 25 '07 #11
Carl K <ca**@personnelware.com> writes:
I need to take the PDF output from reportlab and create a
preview image for a web page, so PNG or something. I am sure
ghostscript will be involved. I am guessing PIL or ImageMagick?

All suggestions welcome.
Did you try to use `renderPM` from rl_addons [1]_?
This is an extension of the reportlab package.

There is also PIL needed and on my linux box
I needed some additional fonts [2]_.

And then I could create PNG directly from reportlab, e.g:

<code>
from reportlab.graphics.shapes import Drawing, String
from reportlab.graphics import renderPM

# A 400x200 point drawing holding a single text label.
d = Drawing(400, 200)
d.add(String(150, 100, 'Hello World', fontSize=18))
# Rasterize the drawing straight to a PNG file.
renderPM.drawToFile(d, 'test.png', 'PNG')
</code>

.. [1] http://www.reportlab.co.uk/svn/publi...unk/rl_addons/
.. [2] http://www.reportlab.com/ftp/fonts/pfbfer.zip

HTH,
Rob
Dec 25 '07 #12
Diez B. Roggisch wrote:
Carl K schrieb:
>Grant Edwards wrote:
>>On 2007-12-24, Carl K <ca**@personnelware.com> wrote:

If it is a multi page pdf Imagemagick will do:
>
convert file.pdf page-%03d.png
I need python code to do this. It is going to be run on a
someone else's shared host web server, security and
performance is an issue. So I would rather not run stuff via
popen.

Use subprocess.

Trying to eliminate popen because of the overhead when running
ghostscript to render PDF (I assume convert uses gs?) is about
like trimming an elephant's toenails to save weight.

maybe, but I wouldn't be so sure.

currently the pdf is created in a python StringIO buffer and returned
to the browser; so it never becomes a file. using convert means I
have to first save it as a file, convert from file to file, read the
file, delete the 2 files. so 6 file operations where before there were
none. That may be more of a load than the ghostscript part.

So what? I'm not sure about current HD speeds, but a couple of years ago
these were about 30MByte/s - and should be faster today. Which equals
240MBit/s, much more than your user's internet connection. and this is
raw IO speed, not counting disk caches.
The server is doing a ton of SQL queries (yes, moving to a 2nd box would be nice;
it might happen mid 2008), so adding HD load is an issue. Not sure how much, but enough
to try to avoid it.
>
In other words: given the overall latency of a network connection, your
file operations shouldn't cost more than a split second.
Those split seconds can add up. The server is already overloaded, so adding more
is a big no-no.
So if you
_can_ go the subprocess road, do it. It's the easiest way. And without
further knowledge of the GS library (that you lack, as do I) - how do
you know that it works "in memory", and doesn't actually expect a
file name or pointer?
I am willing to take that chance. much better than the 6 hits I know would
happen using

I have a feeling if I have to create a file, we will go with plan B: send the
client a pdf and let the user deal with it. Not as nice and slick, but won't
bog the server.

Carl K
Dec 25 '07 #13
Rob Wolfe wrote:
Carl K <ca**@personnelware.com> writes:
>I need to take the PDF output from reportlab and create a
preview image for a web page, so PNG or something. I am sure
ghostscript will be involved. I am guessing PIL or ImageMagick?

All suggestions welcome.

Did you try to use `renderPM` from rl_addons [1]_?
This is an extension of the reportlab package.

There is also PIL needed and on my linux box
I needed some additional fonts [2]_.

And then I could create PNG directly from reportlab, e.g:

<code>
from reportlab.graphics.shapes import Drawing, String
from reportlab.graphics import renderPM

d = Drawing(400, 200)
d.add(String(150, 100, 'Hello World', fontSize=18))
renderPM.drawToFile(d, 'test.png', 'PNG')
</code>

.. [1] http://www.reportlab.co.uk/svn/publi...unk/rl_addons/
.. [2] http://www.reportlab.com/ftp/fonts/pfbfer.zip
This sounds like what I was looking for. Somehow this got missed when I poked
around reportlab land.

Thanks much.

Carl K
Dec 25 '07 #14
On 2007-12-25, Diez B. Roggisch <de***@nospam.web.de> wrote:
Carl K schrieb:
>Grant Edwards wrote:
>>On 2007-12-24, Carl K <ca**@personnelware.com> wrote:

If it is a multi page pdf Imagemagick will do:
>
convert file.pdf page-%03d.png
I need python code to do this. It is going to be run on a
someone else's shared host web server, security and
performance is an issue. So I would rather not run stuff via
popen.

Use subprocess.

Trying to eliminate popen because of the overhead when running
ghostscript to render PDF (I assume convert uses gs?) is about
like trimming an elephant's toenails to save weight.

maybe, but I wouldn't be so sure.

currently the pdf is created in a python StringIO buffer and returned to
the browser; so it never becomes a file. using convert means I have to
first save it as a file, convert from file to file, read the file,
delete the 2 files. so 6 file operations where before there were none.
That may be more of a load than the ghostscript part.

So what? I'm not sure about current HD speeds, but a couple of years ago
these were about 30MByte/s - and should be faster today. Which equals
240MBit/s, much more than your user's internet connection. and this is
raw IO speed, not counting disk caches.
Unless the file is really huge (or the server is overloaded),
the bytes will probably never even hit a platter. If you're
using any even remotely modern OS, short-lived tempfiles used
as you describe are basically just memory-buffers with a
filesystem API.

--
Grant

Dec 25 '07 #15
Carl K wrote:
Andrew MacIntyre wrote:
>Grant Edwards wrote:
>>On 2007-12-24, Carl K <ca**@personnelware.com> wrote:

If it is a multi page pdf Imagemagick will do:
>
convert file.pdf page-%03d.png
I need python code to do this. It is going to be run on a
someone else's shared host web server, security and
performance is an issue. So I would rather not run stuff via
popen.
Use subprocess.

Trying to eliminate popen because of the overhead when running
ghostscript to render PDF (I assume convert uses gs?) is about
like trimming an elephant's toenails to save weight.
Using ctypes to call Ghostscript's API also works well. I've only done
this on Windows, but it should also work on other systems with ctypes
support.

Sounds good, but I have 0.0 clue what that actually means.

Can you give me what you did with Windows in hopes that I can figure out how to
do it in Linux? I am guessing it shouldn't be too different. (well, hoping...)
ctypes is a foreign function interface (FFI) extension that became part
of the standard library with Python 2.5 (& is available for 2.3 & 2.4).
It is supported on Linux, *BSD & Solaris (I think) in addition to Windows.

Ghostscript has for quite some time supported being used as a
library (a DLL on Windows). Only a small number of API functions are
exported, and there is information on the net about calling these API
functions from Visual Basic. I wrote a wrapper module using ctypes for
the API based on the C header and the VB information.

To get the best rendering, some understanding of Ghostscript options is
required, particularly for image format outputs (e.g. for anti-aliasing text).
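
For readers wondering what such a ctypes wrapper might look like, here is a minimal sketch. It is not the wrapper module described above. It assumes a Ghostscript shared library that `ctypes.util.find_library("gs")` can locate (typical on Linux; on Windows the DLL would have to be loaded by its own name), and the helper names `gs_args` and `render_with_libgs` are invented for this example. The `gsapi_*` functions are the ones declared in Ghostscript's `iapi.h`; the alpha-bits switches are the usual anti-aliasing options.

```python
import ctypes
import ctypes.util

GS_ERROR_QUIT = -101  # Ghostscript's "quit" code, returned on a normal -dBATCH exit


def gs_args(pdf_path, png_path, dpi=72):
    """Build the argv for Ghostscript; argv[0] is only a dummy program name."""
    return [
        "gs", "-q", "-dNOPAUSE", "-dBATCH", "-dSAFER",
        "-sDEVICE=png16m",
        "-dTextAlphaBits=4", "-dGraphicsAlphaBits=4",  # anti-aliasing
        f"-r{dpi}",
        f"-sOutputFile={png_path}",
        str(pdf_path),
    ]


def render_with_libgs(pdf_path, png_path, dpi=72):
    """Render a PDF to PNG through the Ghostscript C API (hypothetical helper)."""
    libname = ctypes.util.find_library("gs")  # assumption: libgs is installed
    if libname is None:
        raise OSError("Ghostscript shared library not found")
    gs = ctypes.CDLL(libname)
    gs.gsapi_new_instance.argtypes = [ctypes.POINTER(ctypes.c_void_p),
                                      ctypes.c_void_p]
    gs.gsapi_init_with_args.argtypes = [ctypes.c_void_p, ctypes.c_int,
                                        ctypes.POINTER(ctypes.c_char_p)]
    gs.gsapi_exit.argtypes = [ctypes.c_void_p]
    gs.gsapi_delete_instance.argtypes = [ctypes.c_void_p]
    gs.gsapi_delete_instance.restype = None

    instance = ctypes.c_void_p()
    if gs.gsapi_new_instance(ctypes.byref(instance), None) < 0:
        raise RuntimeError("could not create a Ghostscript instance")
    try:
        encoded = [arg.encode() for arg in gs_args(pdf_path, png_path, dpi)]
        argv = (ctypes.c_char_p * len(encoded))(*encoded)
        code = gs.gsapi_init_with_args(instance, len(encoded), argv)
        gs.gsapi_exit(instance)
    finally:
        gs.gsapi_delete_instance(instance)
    if code not in (0, GS_ERROR_QUIT):
        raise RuntimeError(f"Ghostscript failed with code {code}")
    return code
```

Note that this sketch still passes file names to Ghostscript, so it does not by itself settle whether the library can work purely in memory.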

--
-------------------------------------------------------------------------
Andrew I MacIntyre "These thoughts are mine alone..."
E-mail: an*****@bullseye.apana.org.au (pref) | Snail: PO Box 370
an*****@pcug.org.au (alt) | Belconnen ACT 2616
Web: http://www.andymac.org/ | Australia
Dec 26 '07 #16
Carl K schrieb:
Diez B. Roggisch wrote:
>Carl K schrieb:
>>Grant Edwards wrote:
On 2007-12-24, Carl K <ca**@personnelware.com> wrote:

>If it is a multi page pdf Imagemagick will do:
>>
>convert file.pdf page-%03d.png
I need python code to do this. It is going to be run on a
someone else's shared host web server, security and
performance is an issue. So I would rather not run stuff via
popen.

Use subprocess.

Trying to eliminate popen because of the overhead when running
ghostscript to render PDF (I assume convert uses gs?) is about
like trimming an elephant's toenails to save weight.
maybe, but I wouldn't be so sure.

currently the pdf is created in a python StringIO buffer and returned
to the browser; so it never becomes a file. using convert means I
have to first save it as a file, convert from file to file, read the
file, delete the 2 files. so 6 file operations where before there
were none. That may be more of a load than the ghostscript part.

So what? I'm not sure about current HD speeds, but a couple of years
ago these were about 30MByte/s - and should be faster today. Which
equals 240MBit/s, much more than your user's internet connection. and
this is raw IO speed, not counting disk caches.

The server is doing a ton of SQL queries (yes, moving to a 2nd box would be
nice; it might happen mid 2008), so adding HD load is an issue. Not sure how
much, but enough to try to avoid it.
And keeping stuff in memory, provoking paging, isn't an issue?
>>
In other words: given the overall latency of a network connection,
your file operations shouldn't cost more than a split second.

Those split seconds can add up. The server is already overloaded, so
adding more is a big no-no.
So if you
_can_ go the subprocess road, do it. It's the easiest way. And without
further knowledge of the GS library (that you lack, as do I) - how do
you know that it works "in memory", and doesn't actually expect a
file name or pointer?

I am willing to take that chance. much better than the 6 hits I know
would happen using

I have a feeling if I have to create a file, we will go with plan B:
send the client a pdf and let the user deal with it. Not as nice and
slick, but won't bog the server.
I have the feeling you just go by your feelings. Which is always a bad
idea regarding performance bottlenecks.

http://en.wikipedia.org/wiki/Optimiz...mputer_science)

So instead of jumping through hoops getting something done the hard way
without knowing how the easy solution affects performance, implement the
feature the easiest way. And SEE if it causes trouble.
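
One way to actually SEE it is to time the file round trip on the server in question before deciding. The sketch below is only an illustration for current Python 3: the function name `time_file_round_trip` and the 200 kB payload size are assumptions, not figures from this thread.

```python
import os
import tempfile
import time


def time_file_round_trip(payload, repeats=100):
    """Return the mean seconds needed to write, read back and delete a temp file."""
    start = time.perf_counter()
    for _ in range(repeats):
        fd, path = tempfile.mkstemp(suffix=".pdf")
        try:
            with os.fdopen(fd, "wb") as f:
                f.write(payload)
            with open(path, "rb") as f:
                if len(f.read()) != len(payload):
                    raise IOError("short read from temp file")
        finally:
            os.remove(path)
    return (time.perf_counter() - start) / repeats


if __name__ == "__main__":
    pdf_like = b"x" * 200_000  # assumed size of a typical report
    ms = time_file_round_trip(pdf_like) * 1000
    print(f"{ms:.3f} ms per write/read/delete round trip")
```

Comparing that number with the time the Ghostscript rendering itself takes on the same box shows which of the two is worth worrying about.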

Diez
Dec 26 '07 #17
Carl K wrote:
Rob Wolfe wrote:
>Carl K <ca**@personnelware.com> writes:
>>I need to take the PDF output from reportlab and create a
preview image for a web page, so PNG or something. I am sure
ghostscript will be involved. I am guessing PIL or ImageMagick?

All suggestions welcome.

Did you try to use `renderPM` from rl_addons [1]_? This is an
extension of the reportlab package.

There is also PIL needed and on my linux box
I needed some additional fonts [2]_.

And then I could create PNG directly from reportlab, e.g:

<code>
from reportlab.graphics.shapes import Drawing, String
from reportlab.graphics import renderPM

d = Drawing(400, 200)
d.add(String(150, 100, 'Hello World', fontSize=18))
renderPM.drawToFile(d, 'test.png', 'PNG')
</code>

.. [1] http://www.reportlab.co.uk/svn/publi...unk/rl_addons/
.. [2] http://www.reportlab.com/ftp/fonts/pfbfer.zip

This sounds like what I was looking for. Somehow this got missed when
I poked around reportlab land.

Thanks much.

Carl K
Beware... AFAIK this is only a backend for reportlab graphics drawings, IOW it
will render drawings and charts from the reportlab.graphics package but will not
render a reportlab pdf canvas.

Dec 26 '07 #18
Grant Edwards wrote:
On 2007-12-25, Diez B. Roggisch <de***@nospam.web.de> wrote:
>Carl K schrieb:
>>Grant Edwards wrote:
On 2007-12-24, Carl K <ca**@personnelware.com> wrote:

>If it is a multi page pdf Imagemagick will do:
>>
>convert file.pdf page-%03d.png
I need python code to do this. It is going to be run on a
someone else's shared host web server, security and
performance is an issue. So I would rather not run stuff via
popen.
Use subprocess.

Trying to eliminate popen because of the overhead when running
ghostscript to render PDF (I assume convert uses gs?) is about
like trimming an elephant's toenails to save weight.
maybe, but I wouldn't be so sure.

currently the pdf is created in a python StringIO buffer and returned to
the browser; so it never becomes a file. using convert means I have to
first save it as a file, convert from file to file, read the file,
delete the 2 files. so 6 file operations where before there were none.
That may be more of a load than the ghostscript part.
So what? I'm not sure about current HD speeds, but a couple of years ago
these were about 30MByte/s - and should be faster today. Which equals
240MBit/s, much more than your user's internet connection. and this is
raw IO speed, not counting disk caches.

Unless the file is really huge (or the server is overloaded),
The server is already overloaded,
the bytes will probably never even hit a platter. If you're
using any even remotely modern OS, short-lived tempfiles used
as you describe are basically just memory-buffers with a
filesystem API.
Good point. Not that I am willing to risk it (just using the pdf is not such a
bad option), but I am wondering if it would make sense to create a ramdrive for
something like this. If memory is needed, swap would happen, which should be
better than creating files.

Carl K
Dec 26 '07 #19
>>>>> Carl K <ca**@personnelware.com> (CK) wrote:
>CK> Here is what the code looks like that generates the pdf:
>CK>     buffer = StringIO()
>CK>     rw = dReportWriter(OutputFile=buffer, ReportFormFile=xmlfile, Cursor=ds)
>CK>     rw.write()
>CK>     pdf = buffer.getvalue()
>CK>     return pdf
You can pipe the pdf through ghostscript and read the png back from
ghostscript's stdout. Like:

gs -q -sDEVICE=png16m -sOutputFile=- -

Use that command in subprocess with the stdin/stdout as pipes, send
your PDF data to the process and read the PNG output back.

However you must be aware that this can deadlock if the output is large
enough. So putting the input or the output in a real file is probably safer
anyway.
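
A sketch of the piped variant, for current Python 3: `subprocess.run` with `input=` uses `communicate()` internally, which feeds stdin and drains stdout concurrently, so the deadlock mentioned above should not occur with it. The helper name `pipe_through` and the 60 second timeout are assumptions for illustration; the command list is the `gs` invocation given above.

```python
import subprocess

# The Ghostscript command from above: read the PDF on stdin, write PNG to stdout.
GS_PIPE_CMD = ["gs", "-q", "-sDEVICE=png16m", "-sOutputFile=-", "-"]


def pipe_through(cmd, data, timeout=60):
    """Send data to cmd's stdin and return everything it writes to stdout.

    Hypothetical helper: raises CalledProcessError on a non-zero exit and
    TimeoutExpired if the process runs longer than 'timeout' seconds.
    """
    result = subprocess.run(
        cmd,
        input=data,              # bytes sent to the child's stdin
        stdout=subprocess.PIPE,  # PNG bytes are collected here
        stderr=subprocess.PIPE,  # keep error text out of the image data
        timeout=timeout,
        check=True,
    )
    return result.stdout
```

With the code quoted above, that would be `png = pipe_through(GS_PIPE_CMD, buffer.getvalue())`. Both the PDF and the PNG are held in memory at once, which is worth keeping in mind on a box that is already short of RAM.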

--
Piet van Oostrum <pi**@cs.uu.nl>
URL: http://www.cs.uu.nl/~piet [PGP 8DAE142BE17999C4]
Private email: pi**@vanoostrum.org
Dec 26 '07 #20