
Opening MS Word files via Python

Fazer
#1: Jul 18 '05
Here comes another small question from me :-)

I am curious as to how I should approach this issue. I would just
want to parse simple text and maybe perhaps tables in the future.
Would I have to save the word file and open it in a text editor? That
would kind of....suck... Has anyone else tackled this issue?

Thanks,

Rob Nikander
#2: Jul 18 '05

Fazer wrote:
> I am curious as to how I should approach this issue. I would just
> want to parse simple text and maybe perhaps tables in the future.
> Would I have to save the word file and open it in a text editor? That
> would kind of....suck... Has anyone else tackled this issue?

The win32 extensions for python allow you to get at the COM objects for
applications like Word, and that would let you get the text and tables.
google: win32 python.

import win32com.client
word = win32com.client.Dispatch('Word.Application')
# Open() returns the Document object; keep it for later use
doc = word.Documents.Open('C:\\myfile.doc')

But I don't know the best way to find out the methods and properties of
the "word" object.

Rob
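
One way to explore the methods and properties from Python itself, offered as a rough sketch rather than the definitive approach: pywin32's `win32com.client.gencache.EnsureDispatch` builds early-bound wrappers from Word's type library, after which ordinary introspection shows much more than it does on a late-bound `Dispatch` object. The helper names `public_members` and `open_word_early_bound` are made up for this example, and the use of `_prop_map_get_` rests on the assumption that the generated wrappers store their readable properties there.

```python
def public_members(obj):
    """Return the sorted public attribute names visible on obj."""
    names = {name for name in dir(obj) if not name.startswith("_")}
    # Assumption: makepy-generated wrappers keep readable COM properties in
    # a `_prop_map_get_` dict; include those names when the attribute exists.
    names.update(getattr(obj, "_prop_map_get_", {}))
    return sorted(names)


def open_word_early_bound():
    """Start Word through early-bound wrappers (Windows with Word only)."""
    # Imported here so the module still loads on machines without pywin32.
    from win32com.client import gencache
    return gencache.EnsureDispatch("Word.Application")
```

On a Windows machine with Word installed, something like `word = open_word_early_bound()` followed by `print(public_members(word.Documents))` should list names that can then be looked up in the VBA documentation.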

Simon Brunning
#3: Jul 18 '05

faizan@jaredweb.com (Fazer) wrote in message
> I am curious as to how I should approach this issue. I would just
> want to parse simple text and maybe perhaps tables in the future.
> Would I have to save the word file and open it in a text editor? That
> would kind of....suck... Has anyone else tackled this issue?

See http://aspn.activestate.com/ASPN/Coo.../Recipe/279003

Cheers,
Simon B.

jmdeschamps
#4: Jul 18 '05

Rob Nikander <rnikaREMOVEnder@adelphia.net> wrote in message news:<i7-dnZNwpJ8TfhjdRVn-jg@adelphia.com>...
> Fazer wrote:
> > I am curious as to how I should approach this issue. I would just
> > want to parse simple text and maybe perhaps tables in the future.
> > Would I have to save the word file and open it in a text editor? That
> > would kind of....suck... Has anyone else tackled this issue?
>
> The win32 extensions for python allow you to get at the COM objects for
> applications like Word, and that would let you get the text and tables.
> google: win32 python.
>
> word = win32com.client.Dispatch('Word.Application')
> word.Documents.Open('C:\\myfile.doc')
>
> But I don't know the best way to find out the methods and properties of
> the "word" object.
>
> Rob

You can use the VBA documentation for Word and, using dot notation and
the normal Pythonic way of calling functions, explore its various
objects, methods and attributes.
Here's some pretty straightforward code along these lines:
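#************************
import os
import win32com.client
from tkinter import filedialog  # Python 3; in Python 2 this was tkFileDialog

# Launch Word without showing its window
MSWord = win32com.client.Dispatch("Word.Application")
MSWord.Visible = 0
# Ask for a file and normalise the path to Windows-style backslashes
myWordDoc = os.path.normpath(filedialog.askopenfilename())
# Open() returns the Document object; use it rather than indexing Documents
doc = MSWord.Documents.Open(myWordDoc)
# Get the textual content (Content is a Range object, .Text is the string)
docText = doc.Content.Text
# Get the collection of tables
listTables = doc.Tables
# When finished, call doc.Close(False) and MSWord.Quit() so that no
# hidden Word instance is left running
#************************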

Happy parsing,

Jean-Marc
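
For the tables part of the question, a minimal sketch of turning one Word table into plain Python lists might look like this. It assumes the `Rows.Count`, `Columns.Count` and `Cell(row, column).Range.Text` members from Word's object model, and a simple rectangular table; merged cells may raise errors that this sketch does not handle. The names `clean_cell_text` and `table_to_rows` are made up for the example.

```python
CELL_END = "\r\x07"  # Word ends each cell's text with CR + BEL


def clean_cell_text(raw):
    """Strip Word's end-of-cell marker and surrounding whitespace."""
    if raw.endswith(CELL_END):
        raw = raw[: -len(CELL_END)]
    return raw.strip()


def table_to_rows(table):
    """Convert a Word Table COM object into a list of lists of strings."""
    rows = []
    # Word collections are 1-based, hence range(1, n + 1).
    for r in range(1, table.Rows.Count + 1):
        row = []
        for c in range(1, table.Columns.Count + 1):
            row.append(clean_cell_text(table.Cell(r, c).Range.Text))
        rows.append(row)
    return rows
```

With a document object as returned by `MSWord.Documents.Open(...)`, a call such as `table_to_rows(document.Tables(1))` would then give the first table as nested lists; `document` here is just a placeholder name.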

Fazer
#5: Jul 18 '05

jmdeschamps@cvm.qc.ca (jmdeschamps) wrote in message news:<3d06fae9.0404210536.3f277a37@posting.google.com>...
<snip>


That is Awesome! Thanks!

How would I save something in word format? I am guessing
MSWord.Documents.Save(myWordDoc) or along those lines? Where can I
find more documentation? Thanks.

anon
#6: Jul 18 '05

Fazer wrote...

> jmdeschamps@cvm.qc.ca (jmdeschamps) wrote in message news:<3d06fae9.0404210536.3f277a37@posting.google.com>...
>
>> Rob Nikander <rnikaREMOVEnder@adelphia.net> wrote in message news:<i7-dnZNwpJ8TfhjdRVn-jg@adelphia.com>...
<snip>
>>>
>>> But I don't know the best way to find out the methods and properties of
>>> the "word" object.
>>>
<snip>
>
> How would I save something in word format? I am guessing
> MSWord.Documents.Save(myWordDoc) or along those lines? Where can I
> find more documentation? Thanks.



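Open MS Word and press Alt+F11 to open the Visual Basic editor, then F2
for the Object Browser, which lists Word's objects with their methods
and properties.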
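
On the saving question quoted above, the usual route appears to be the document's own `SaveAs` (new name or format) or `Save` (write back in place). Here is a small sketch, assuming `Documents.Add()`, `Content.Text` and `SaveAs(FileName, FileFormat)` from Word's object model and the value 0 for the `wdFormatDocument` constant; `save_text_as_doc` is a made-up helper name, and the Word application object is passed in so the function can be tried without Word.

```python
WD_FORMAT_DOCUMENT = 0  # value of Word's wdFormatDocument constant (.doc)


def save_text_as_doc(word, text, path, file_format=WD_FORMAT_DOCUMENT):
    """Create a new document in a running Word instance and save it."""
    doc = word.Documents.Add()      # new blank document
    doc.Content.Text = text         # replace the whole body with our text
    doc.SaveAs(path, file_format)   # FileName, FileFormat (positional)
    doc.Close()
    return path
```

Typical use would be `word = win32com.client.Dispatch("Word.Application")`, then `save_text_as_doc(word, "Hello", r"C:\temp\out.doc")`, then `word.Quit()`; the path is only an example.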




