Bytes IT Community

subprocess chokes on spaces in path

Using bash on Debian Etch.

If word_doc = sys.argv[1] and it's a file name like My\ Word.doc this
function reads My and Word as two separate files unless the second
'%s' is quoted. Took me a lot of trial and error to discover. Is this
the most elegant way to do it? I was using popen originally, then saw
some threads suggesting subprocess cured the spaces in path problem.

import subprocess

def get_MSWordDoc_text(word_doc):
    """Harvests text from an MSWord doc using antiword."""
    antiword = "/usr/bin/antiword"

    # Note the extra single quotes around the second '%s'.
    # Without these quotes, bash chokes on paths with spaces in them:
    # it says can't open My ; can't open Word.
    # Using the new subprocess module, the extra quotes around '%s'
    # shouldn't be necessary, but I could not get it to work.
    # See Beazley 2nd Ed. page 340.

    p = subprocess.Popen("%s '%s'" % (antiword, word_doc), shell=True,
                         stdout=subprocess.PIPE)
    doc_text = p.stdout.read()
    return doc_text

thx,

rd

Nov 6 '07 #1
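
The hand-written single quotes in the function above cope with spaces, but they would break on a file name that itself contains a single quote. If a shell really is needed, one option is to let the standard library do the quoting. The following is only a minimal sketch, assuming Python 3 (where the helper is `shlex.quote`; Python 2 offered `pipes.quote`) and a POSIX shell. `printf` stands in for antiword so the example runs anywhere, and the file name is made up for illustration.

```python
import shlex
import subprocess

# Hypothetical file name containing both a space and a single quote.
tricky_name = "My Boss's Word.doc"

# shlex.quote() returns the string escaped so a POSIX shell sees one word.
command = "printf %s " + shlex.quote(tricky_name)

# shell=True hands the whole string to /bin/sh, as in the original function.
result = subprocess.run(command, shell=True, capture_output=True, text=True)
echoed = result.stdout

print(repr(echoed))
```

If the quoting is right, `echoed` comes back identical to `tricky_name`, which a plain `'%s'` wrapper would not manage for this name.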
3 Replies


* BartlebyScrivener (Tue, 06 Nov 2007 20:32:33 -0000)
> Using bash on Debian Etch.
>
> If word_doc = sys.argv[1] and it's a file name like My\ Word.doc this
> function reads My and Word as two separate files unless the second
> '%s' is quoted. Took me a lot of trial and error to discover. Is this
> the most elegant way to do it? I was using popen originally, then saw
> some threads suggesting subprocess cured the spaces in path problem.
>
> def get_MSWordDoc_text(word_doc):
>     """Harvests text from an MSWord doc using antiword."""
>     antiword = "/usr/bin/antiword"
>
>     # Note the extra single quotes around the second '%s'
>     # without these quotes, bash chokes on paths with spaces in them
>     # says can't open My ; can't open Word

If bash has problems with spaces then the obvious thing to do is not
to use bash, right?!

Thorsten
Nov 6 '07 #2

On Tue, 06 Nov 2007 17:32:33 -0300, BartlebyScrivener
<bs**********@gmail.com> wrote:
> Using bash on Debian Etch.
>
> If word_doc = sys.argv[1] and it's a file name like My\ Word.doc this
> function reads My and Word as two separate files unless the second
> '%s' is quoted. Took me a lot of trial and error to discover. Is this
> the most elegant way to do it? I was using popen originally, then saw
> some threads suggesting subprocess cured the spaces in path problem.

Use a list of arguments [antiword, word_doc] and let subprocess handle the
spaces the right way.
--
Gabriel Genellina

Nov 6 '07 #3
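
To see why the list form suggested above sidesteps the problem: without `shell=True`, no shell ever splits the command line, and each list element reaches the child process as exactly one argument. A small sketch, assuming Python 3.7 or later; a child Python interpreter stands in for antiword and simply reports the arguments it received, and the file name is illustrative.

```python
import subprocess
import sys

# Illustrative file name with a space; no quoting or escaping is applied.
path_with_space = "My Word.doc"

# The child prints the arguments it was given, one list entry per argument.
child_code = "import sys; print(sys.argv[1:])"

result = subprocess.run(
    [sys.executable, "-c", child_code, path_with_space],
    capture_output=True,
    text=True,
    check=True,
)
received = result.stdout.strip()

print(received)
```

The child reports a single argument containing the space, rather than two separate ones.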

On Nov 6, 2:48 pm, "Gabriel Genellina" <gagsl-...@yahoo.com.ar> wrote:
>
> Use a list of arguments [antiword, word_doc] and let subprocess handle the
> spaces the right way.

Got it working. Thank you both.

p = subprocess.Popen([antiword, word_doc], stdout=subprocess.PIPE)
doc_text = p.stdout.read()

rd


Nov 7 '07 #4
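
For completeness, the working two-line snippet above could be folded back into a whole function along these lines. This is a sketch rather than the poster's final code: it assumes Python 3, swaps `p.stdout.read()` for `communicate()` so the child is waited on and its stderr is collected, and adds an exit-status check that the thread does not mention. The `antiword` parameter and the error message are hypothetical additions.

```python
import subprocess

def get_msword_doc_text(word_doc, antiword="/usr/bin/antiword"):
    """Return the raw bytes antiword writes to stdout for word_doc.

    The antiword parameter is a hypothetical addition so that the
    program path can be overridden; the default is the thread's path.
    """
    # List form: word_doc is passed as one argument, spaces and all.
    p = subprocess.Popen(
        [antiword, word_doc],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    # communicate() reads both pipes to the end and waits for exit.
    out, err = p.communicate()
    if p.returncode != 0:
        raise RuntimeError(
            "%s exited with status %d: %s"
            % (antiword, p.returncode, err.decode(errors="replace"))
        )
    return out
```

Calling `get_msword_doc_text("My Word.doc")` would return bytes in Python 3; decoding them to text depends on the encoding antiword was asked to produce.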
