
subprocess chokes on spaces in path

Using bash on Debian Etch.

If word_doc = sys.argv[1] and it's a file name like My\ Word.doc this
function reads My and Word as two separate files unless the second
'%s' is quoted. Took me a lot of trial and error to discover. Is this
the most elegant way to do it? I was using popen originally, then saw
some threads suggesting subprocess cured the spaces in path problem.

import subprocess

def get_MSWordDoc_text(word_doc):
    """Harvests text from an MSWord doc using antiword."""
    antiword = "/usr/bin/antiword"

    # Note the extra single quotes around the second '%s'.
    # Without these quotes, bash chokes on paths with spaces in them:
    # it says can't open My ; can't open Word.
    # With the new subprocess module, the extra quotes shouldn't be
    # necessary? But I could not get it to work without them.
    # See Beazley 2nd Ed. page 340.

    p = subprocess.Popen("%s '%s'" % (antiword, word_doc), shell=True,
                         stdout=subprocess.PIPE)
    doc_text = p.stdout.read()
    return doc_text
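
A side note on the hand-written quotes: wrapping the path in '%s' copes with spaces but not with a file name that itself contains a single quote. If the command really has to go through a shell, a rough sketch using the standard library's shlex.quote (Python 3.3+; pipes.quote in older versions) could look like this. The helper name build_command and the file name "Bob's Word.doc" are made up for illustration.

```python
import shlex


def build_command(program, path):
    """Build a shell command line with both parts quoted for a POSIX shell."""
    # shlex.quote leaves plain tokens alone and escapes anything a shell
    # would otherwise interpret (spaces, quotes, $, and so on).
    return "%s %s" % (shlex.quote(program), shlex.quote(path))


# Hand-written quotes: the apostrophe ends the quoted string too early.
naive = "%s '%s'" % ("/usr/bin/antiword", "Bob's Word.doc")

# Quoted by shlex: the path survives shell parsing as one argument.
safe = build_command("/usr/bin/antiword", "Bob's Word.doc")

# shlex.split applies POSIX shell word-splitting rules, so it can be used
# to check what a shell would see without running anything.
print(shlex.split(safe))
```

The string in safe can be passed to subprocess.Popen with shell=True; splitting naive the same way raises ValueError because the quote is left unbalanced.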

thx,

rd

Nov 6 '07 #1
* BartlebyScrivener (Tue, 06 Nov 2007 20:32:33 -0000)
> Using bash on Debian Etch.
>
> If word_doc = sys.argv[1] and it's a file name like My\ Word.doc this
> function reads My and Word as two separate files unless the second
> '%s' is quoted. Took me a lot of trial and error to discover. Is this
> the most elegant way to do it? I was using popen originally, then saw
> some threads suggesting subprocess cured the spaces in path problem.
>
> def get_MSWordDoc_text(word_doc):
>     """Harvests text from an MSWord doc using antiword."""
>     antiword = "/usr/bin/antiword"
>
>     # Note the extra single quotes around the second '%s'
>     # without these quotes, bash chokes on paths with spaces in them
>     # says can't open My ; can't open Word

If bash has problems with spaces then the obvious thing to do is not
to use bash, right?!

Thorsten
Nov 6 '07 #2
On Tue, 06 Nov 2007 17:32:33 -0300, BartlebyScrivener
<bs**********@gmail.com> wrote:
> Using bash on Debian Etch.
>
> If word_doc = sys.argv[1] and it's a file name like My\ Word.doc this
> function reads My and Word as two separate files unless the second
> '%s' is quoted. Took me a lot of trial and error to discover. Is this
> the most elegant way to do it? I was using popen originally, then saw
> some threads suggesting subprocess cured the spaces in path problem.

Use a list of arguments [antiword, word_doc] and let subprocess handle the
spaces the right way.
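
To see why the list form avoids the problem, here is a small sketch that uses shlex.split to mimic how a POSIX shell breaks a command string into words. The file name "My Word.doc" is only an example, and nothing is executed here.

```python
import shlex

# A single command string: the shell splits on the space in the file name.
unquoted = shlex.split("/usr/bin/antiword My Word.doc")
print(unquoted)  # ['/usr/bin/antiword', 'My', 'Word.doc']

# The same string with the path quoted: the name stays in one piece.
quoted = shlex.split("/usr/bin/antiword 'My Word.doc'")
print(quoted)    # ['/usr/bin/antiword', 'My Word.doc']

# With an argument list there is no splitting step at all: each list item
# is handed to the program as exactly one argument.
args = ["/usr/bin/antiword", "My Word.doc"]
print(args == quoted)  # True
```

Passing args to subprocess.Popen without shell=True skips the shell entirely, so no quoting is needed.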
--
Gabriel Genellina

Nov 6 '07 #3
On Nov 6, 2:48 pm, "Gabriel Genellina" <gagsl-...@yahoo.com.ar> wrote:
> Use a list of arguments [antiword, word_doc] and let subprocess handle the
> spaces the right way.

Got it working. Thank you both.

p = subprocess.Popen([antiword, word_doc], stdout=subprocess.PIPE)
doc_text = p.stdout.read()
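
For completeness, a possible version of the whole function built around the same call. This is only a sketch: the antiword keyword argument, the use of communicate() and the RuntimeError on a non-zero exit status are additions for illustration, not something required by antiword or subprocess.

```python
import subprocess


def get_msword_doc_text(word_doc, antiword="/usr/bin/antiword"):
    """Return whatever antiword writes to stdout for word_doc, as bytes."""
    # Argument list and no shell: word_doc reaches the program as a single
    # argument, so spaces in the path need no quoting or escaping.
    p = subprocess.Popen([antiword, word_doc],
                         stdout=subprocess.PIPE,
                         stderr=subprocess.PIPE)
    # communicate() reads both pipes to the end and waits for the process
    # to exit, so the return code is available afterwards.
    out, err = p.communicate()
    if p.returncode != 0:
        raise RuntimeError("%s failed (exit %d): %r"
                           % (antiword, p.returncode, err))
    return out
```

Called as get_msword_doc_text(sys.argv[1]), it should cope with a path such as "My Word.doc" without any extra quotes.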

rd


Nov 7 '07 #4
