tempfile Question

Hey group,

I have a command line tool that I want to be able to call from a
Python script. The problem is that this tool only writes to a file.

So my solution is to give the tool a temporary file to write to and
then have Python read that file. I figure that's the safest way to
deal with this sort of thing. (But I'm open to better methods).

Here's my code so far. Could anyone tell me the proper way to use
tempfile? This code won't let the tool write to the file because
Python has it locked. But I'm worried that if I close the file then
Windows might take it away? I have no idea.

<code>
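import os
import tempfile

PDFTOTEXTPATH = r'C:\xpdf-3.01pl2-win32\xpdf-3.01pl2-win32\pdftotext.exe'

def read_pdf_text(filepath):
    # The temp file is still open here, which is why the tool cannot write to it on Windows.
    outfile = tempfile.NamedTemporaryFile()
    outfilename = outfile.name
    command = r'"%s" "%s" "%s"' % (PDFTOTEXTPATH, filepath, outfilename)
    result = os.popen(command).read()  # run the tool and read its stdout until it ends
    pdftext = outfile.read()
    outfile.close()  # closing also deletes the temporary file
    return pdftext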
</code>

Much thanks!

--
Gregory Piñero
Chief Innovation Officer
Blended Technologies
(www.blendedtechnologies.com)
Jun 6 '06 #1
On 7/06/2006 7:50 AM, Gregory Piñero wrote:
Hey group,

I have a command line tool that I want to be able to call from a
Python script. The problem is that this tool only writes to a file.

So my solution is to give the tool a temporary file to write to and
then have Python read that file. I figure that's the safest way to
deal with this sort of thing. (But I'm open to better methods).

Here's my code so far. Could anyone tell me the proper way to use
tempfile? This code won't let the tool write to the file because
Python has it locked. But I'm worried that if I close the file then
Windows might take it away? I have no idea.


Me neither, not having faced this situation before. To acquire an idea,
I'd Read The Fantastic Manual:
(my comments enclosed in [])
"""
TemporaryFile( [mode='w+b'[, bufsize=-1[, suffix[, prefix[, dir]]]]])

Return a file (or file-like) object that can be used as a temporary
storage area. The file is created using mkstemp. It will be destroyed as
soon as it is closed (including an implicit close when the object is
garbage collected). [That seems to answer one question] Under Unix, the
directory entry for the file is removed immediately after the file is
created. Other platforms do not support this; your code should not rely
on a temporary file created using this function having or not having a
visible name in the file system.
The mode parameter defaults to 'w+b' so that the file created can be
read and written without being closed. Binary mode is used so that it
behaves consistently on all platforms without regard for the data that
is stored. bufsize defaults to -1, meaning that the operating system
default is used.

The dir, prefix and suffix parameters are passed to mkstemp().

NamedTemporaryFile( [mode='w+b'[, bufsize=-1[, suffix[, prefix[, dir]]]]])

This function operates exactly as TemporaryFile() does, except that the
file is guaranteed to have a visible name in the file system (on Unix,
the directory entry is not unlinked). That name can be retrieved from
the name member of the file object. Whether the name can be used to open
the file a second time, while the named temporary file is still open,
varies across platforms (it can be so used on Unix; it cannot on Windows
NT or later [That seems to answer the other question]). New in version 2.3.
"""

So I'd be thinking about using the (deprecated) mktemp() instead,
perhaps trying to cut down the chance of conflicts by (a) using a prefix
e.g. "pdf2txttmp" and/or (b) using a dir of "." -- then asking the
cognoscenti what are the drawbacks of this approach.
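
The delete-on-close behaviour in the manual excerpt above is easy to check for yourself. This is only a rough sketch for a current Python; demo_delete_on_close is a name made up for the illustration, not anything from the tempfile module.

```python
import os
import tempfile

def demo_delete_on_close():
    """Return (existed_while_open, exists_after_close) for a NamedTemporaryFile."""
    outfile = tempfile.NamedTemporaryFile()
    name = outfile.name
    existed_while_open = os.path.exists(name)  # the file has a visible name
    outfile.close()                            # default behaviour: close removes it
    return existed_while_open, os.path.exists(name)

print(demo_delete_on_close())
```

If the file is there while open and gone after close(), that confirms the worry in the original post: closing the NamedTemporaryFile before handing its name to the tool leaves nothing for the tool to write to.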

HTH,
John
Jun 6 '06 #2
On 7/06/2006 7:50 AM, Gregory Piñero wrote:
Hey group,

I have a command line tool that I want to be able to call from a
Python script. The problem is that this tool only writes to a file.


Another Fantastic Manual gives another idea:
"""
Pdftotext reads the PDF file, PDF-file, and writes a text
file, text-file. If text-file is not specified, pdftotext
converts file.pdf to file.txt. If text-file is '-', the
text is sent to stdout.
"""

Why not try reading back the child process's stdout? I would have
thought there would be no shortage of examples and help for this idea ...
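
A minimal sketch of that idea, assuming the subprocess module (Python 2.4 or later) and a pdftotext that honours '-' as the manual excerpt says. The helper name run_and_capture and the default "pdftotext" on the PATH are assumptions for the example; the PDFTOTEXTPATH from the original post could be passed in instead.

```python
import subprocess

def run_and_capture(args):
    """Run a command given as an argument list and return its stdout as bytes."""
    proc = subprocess.Popen(args, stdout=subprocess.PIPE)
    output, _ = proc.communicate()  # reads stdout to EOF and waits for exit
    if proc.returncode != 0:
        raise RuntimeError("command failed with exit code %d" % proc.returncode)
    return output

def read_pdf_text(filepath, pdftotext="pdftotext"):
    # '-' as the text-file argument sends the text to stdout (per the manual above)
    return run_and_capture([pdftotext, filepath, "-"])
```

Passing an argument list rather than one quoted string also sidesteps the manual quoting of paths with spaces, and no temporary file is involved at all.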

Cheers,
John
Jun 7 '06 #3
On 7/06/2006 3:57 PM, Dennis Lee Bieber wrote:
On Wed, 07 Jun 2006 09:56:13 +1000, John Machin <sj******@lexicon.net>
declaimed the following in comp.lang.python:
The dir, prefix and suffix parameters are passed to mkstemp().

<snip>
So I'd be thinking about using the (deprecated) mktemp() instead,


I think you passed over the mkstemp() variation. Granted, it, too,
returns an opened file, along with the full pathname of the file, but it
requires the caller to handle eventual disposal of the file.

Merely close the opened file; pass the pathname to the subprocess,
await completion of subprocess, reopen the file for use in Python...
Then at the end, close the file and use the pathname to delete the file
from the system.
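
The sequence described above (close, run the tool, reopen, delete) might look roughly like this. It is a sketch, not a tested recipe for pdftotext: run_with_temp_output and its make_args callback are hypothetical names invented here so the same code works for any tool that writes to a named file.

```python
import os
import subprocess
import tempfile

def run_with_temp_output(make_args, suffix=".txt"):
    """Run a tool that only writes to a file; return what it wrote, as bytes.

    make_args is a hypothetical callback: given the temp file's path,
    it returns the argument list for the tool.
    """
    fd, path = tempfile.mkstemp(suffix=suffix)  # the file now exists
    os.close(fd)  # release our handle so the child process can open the file
    try:
        subprocess.check_call(make_args(path))  # wait for the tool to finish
        with open(path, "rb") as f:             # reopen for use in Python
            return f.read()
    finally:
        os.remove(path)  # mkstemp() leaves disposal to the caller
```

For the original problem the call would be something like run_with_temp_output(lambda out: [PDFTOTEXTPATH, filepath, out]).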


I passed over mkstemp() because (according to my reading of the manual),
mkstemp() requires an *extra* step (close the file), leaving the
situation then *exactly* the same as with mktemp() i.e. some pirate
process may molest the file before the caller's child process can open
the file.

Cheers,
John
Jun 7 '06 #4
John Machin wrote:
I passed over mkstemp() because (according to my reading of the manual),
mkstemp() requires an *extra* step (close the file), leaving the
situation then *exactly* the same as with mktemp() i.e. some pirate
process may molest the file before the caller's child process can open
the file.

Surely if you set permissions correctly on /tmp (sticky bit, to require
ownership for deletion) and you create your temporary file with sensible
ownership and permissions, then rogue processes without root privileges
can't do anything bad to your files. Or am I wrong?
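
One way to check rather than assume, sketched for a POSIX system: the tempfile documentation says mkstemp() creates the file readable and writable only by the creating user, and the sticky bit on the temp directory can be read with os.stat. The function name inspect_temp_security is made up for this illustration, and on Windows the mode bits do not carry the same meaning.

```python
import os
import stat
import tempfile

def inspect_temp_security():
    """Return (file_mode, tmpdir_has_sticky_bit) for a fresh mkstemp() file."""
    fd, path = tempfile.mkstemp()
    try:
        file_mode = stat.S_IMODE(os.stat(path).st_mode)  # permission bits only
    finally:
        os.close(fd)
        os.remove(path)
    tmpdir_mode = os.stat(tempfile.gettempdir()).st_mode
    return file_mode, bool(tmpdir_mode & stat.S_ISVTX)

mode, sticky = inspect_temp_security()
print(oct(mode), sticky)
```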

Of course if a rogue process has root privileges then all security bets
are off anyway.

regards
Steve
--
Steve Holden +44 150 684 7255 +1 800 494 3119
Holden Web LLC/Ltd http://www.holdenweb.com
Love me, love my blog http://holdenweb.blogspot.com
Recent Ramblings http://del.icio.us/steve.holden

Jun 7 '06 #5
On 8/06/2006 2:57 AM, Dennis Lee Bieber wrote:
On Wed, 07 Jun 2006 21:53:02 +1000, John Machin <sj******@lexicon.net>
declaimed the following in comp.lang.python:
I passed over mkstemp() because (according to my reading of the manual),
mkstemp() requires an *extra* step (close the file), leaving the
situation then *exactly* the same as with mktemp() i.e. some pirate
process may molest the file before the caller's child process can open
the file.

mktemp() creates ONLY the file name, but not the file itself. This
means another process calling mktemp() has the possibility of generating
the SAME file name before the first opens/creates the named file.


Thanks, Dennis. You are quite correct. I'm a dill: I read """Return an
absolute pathname of a file that did not exist at the time the call is
made""" as implying that the file existed after the call, brushed aside
my brief wonderment about what was deprecable about all that, and didn't
even try it at the interactive prompt.
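
For the record, the check is short. A sketch, assuming a Python where the deprecated tempfile.mktemp() is still available; compare_mktemp_and_mkstemp is just a name for the example, and the prefix is the one suggested earlier in the thread.

```python
import os
import tempfile

def compare_mktemp_and_mkstemp():
    """Return (exists_after_mktemp, exists_after_mkstemp)."""
    name = tempfile.mktemp(prefix="pdf2txttmp")  # deprecated: returns a name only
    exists_after_mktemp = os.path.exists(name)

    fd, path = tempfile.mkstemp(prefix="pdf2txttmp")  # creates and opens the file
    try:
        exists_after_mkstemp = os.path.exists(path)
    finally:
        os.close(fd)
        os.remove(path)
    return exists_after_mktemp, exists_after_mkstemp

print(compare_mktemp_and_mkstemp())
```

The gap between getting the name from mktemp() and something actually creating the file is the window in which another process can claim the same name.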

Cheers,
John
Jun 7 '06 #6
