Opening MS Word files via Python

Here comes another small question from me :-)

I am curious as to how I should approach this issue. I just want to
parse simple text, and perhaps tables in the future.
Would I have to save the word file and open it in a text editor? That
would kind of....suck... Has anyone else tackled this issue?

Thanks,
Jul 18 '05 #1
5 Replies


Fazer wrote:
I am curious as to how I should approach this issue. I would just
want to parse simple text and maybe perhaps tables in the future.
Would I have to save the word file and open it in a text editor? That
would kind of....suck... Has anyone else tackled this issue?


The win32 extensions for python allow you to get at the COM objects for
applications like Word, and that would let you get the text and tables.
google: win32 python.

import win32com.client

word = win32com.client.Dispatch('Word.Application')
doc = word.Documents.Open('C:\\myfile.doc')

But I don't know the best way to find out the methods and properties of
the "word" object.

Rob

Jul 18 '05 #2

fa****@jaredweb.com (Fazer) wrote in message
I am curious as to how I should approach this issue. I would just
want to parse simple text and maybe perhaps tables in the future.
Would I have to save the word file and open it in a text editor? That
would kind of....suck... Has anyone else tackled this issue?


See http://aspn.activestate.com/ASPN/Coo.../Recipe/279003

Cheers,
Simon B.
Jul 18 '05 #3

Rob Nikander <rn*************@adelphia.net> wrote in message news:<i7********************@adelphia.com>...
Fazer wrote:
I am curious as to how I should approach this issue. I would just
want to parse simple text and maybe perhaps tables in the future.
Would I have to save the word file and open it in a text editor? That
would kind of....suck... Has anyone else tackled this issue?


The win32 extensions for python allow you to get at the COM objects for
applications like Word, and that would let you get the text and tables.
google: win32 python.

word = win32com.client.Dispatch('Word.Application')
word.Documents.Open('C:\\myfile.doc')

But I don't know the best way to find out the methods and properties of
the "word" object.

Rob


You can use the VBA documentation for Word and, using dot notation and
the normal Pythonesque way of calling functions, play with its various
objects, methods and attributes...
Here's some pretty straightforward code along these lines:
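#************************
import win32com.client
from tkinter import filedialog  # named tkFileDialog on Python 2

# Launch Word without showing its window
MSWord = win32com.client.Dispatch("Word.Application")
MSWord.Visible = 0
# Ask the user for a file and open it
myWordDoc = filedialog.askopenfilename()
doc = MSWord.Documents.Open(myWordDoc)
# Get the textual content (Content is a Range; .Text is the string)
docText = doc.Content.Text
# Get the collection of tables
listTables = doc.Tables
# When you are done with the document: close it unsaved and quit Word
doc.Close(False)
MSWord.Quit()
#************************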
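
To get from the table collection to plain Python lists, something like the sketch below may help. It assumes each table is a simple grid with no merged cells, and that the objects behave like Word's Table object (Rows.Count, Columns.Count, Cell(row, col), 1-based as in VBA). The function names clean_cell_text and table_to_rows are made up for this example; they are not part of win32com.

```python
def clean_cell_text(raw):
    """Strip the end-of-cell marker Word appends to a cell's Range.Text.

    Word ends each cell's text with a carriage return plus a BEL
    character ("\r\x07"); trailing paragraph marks are stripped too.
    """
    return raw.rstrip("\r\x07")


def table_to_rows(table):
    """Return the table's cell texts as a list of rows (lists of strings).

    Assumption: `table` exposes Rows.Count, Columns.Count and
    Cell(row, col) with 1-based indexes, and has no merged cells.
    """
    rows = []
    for r in range(1, table.Rows.Count + 1):
        row = []
        for c in range(1, table.Columns.Count + 1):
            row.append(clean_cell_text(table.Cell(r, c).Range.Text))
        rows.append(row)
    return rows
```

With listTables from the code above, `[table_to_rows(t) for t in listTables]` should give one list of rows per table. Tables with merged cells would need a different walk, since Cell(row, col) fails for positions that do not exist.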

Happy parsing,

Jean-Marc
Jul 18 '05 #4

jm*********@cvm.qc.ca (jmdeschamps) wrote in message news:<3d**************************@posting.google.com>...
<snip>

That is Awesome! Thanks!

How would I save something in Word format? I am guessing
MSWord.Documents.Save(myWordDoc) or something along those lines? Where
can I find more documentation? Thanks.
Jul 18 '05 #5

Fazer wrote...
jm*********@cvm.qc.ca (jmdeschamps) wrote in message news:<3d**************************@posting.google.com>...
Rob Nikander <rn*************@adelphia.net> wrote in message news:<i7********************@adelphia.com>... <snip>

But I don't know the best way to find out the methods and properties of
the "word" object.
<snip>
How would I save something in Word format? I am guessing
MSWord.Documents.Save(myWordDoc) or something along those lines? Where
can I find more documentation? Thanks.


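Open MS Word and press Alt+F11 to open the Visual Basic editor, then F2
to open the Object Browser, which lists Word's objects with their
methods and properties.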
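
On the saving question: as far as I can tell from the Object Browser, the document object's Save() takes no file name and just saves under the current one, while SaveAs(FileName, FileFormat) writes to a new path. A minimal sketch, assuming you kept the object returned by Documents.Open; the helper name save_as and the constant names are made up here, the numeric values are Word's WdSaveFormat constants:

```python
# Values of Word's WdSaveFormat enumeration (constant names are ours)
WD_FORMAT_DOCUMENT = 0  # Word 97-2003 binary .doc
WD_FORMAT_TEXT = 2      # plain text


def save_as(doc, path, file_format=WD_FORMAT_DOCUMENT):
    """Save `doc` to `path` by calling Document.SaveAs(FileName, FileFormat).

    `doc` is assumed to be a Word Document COM object, e.g. the value
    returned by MSWord.Documents.Open(...).
    """
    doc.SaveAs(path, file_format)
    return path
```

For example, `save_as(doc, 'C:\\copy.doc')` would write a .doc copy, and passing WD_FORMAT_TEXT would write plain text instead.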

Jul 18 '05 #6
