Opening MS Word files via Python

Here comes another small question from me :-)

I am curious as to how I should approach this issue. For now I just
want to parse simple text, and maybe tables in the future.
Would I have to save the Word file and open it in a text editor? That
would kind of... suck. Has anyone else tackled this issue?

Thanks,
Jul 18 '05 #1
The win32 extensions for Python let you get at the COM objects for
applications like Word, and that would let you get the text and tables.
Google: win32 python.

import win32com.client
word = win32com.client.Dispatch('Word.Application')
word.Documents.Open('C:\\myfile.doc')

But I don't know the best way to find out the methods and properties of
the "word" object.
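
A minimal sketch of how the two lines above could be extended to pull the plain text out of a document and shut Word down afterwards. It assumes Windows, an installed copy of Word, and the pywin32 package. The helper names `read_document_text` and `extract_text` are made up for illustration; `Documents.Open`, `Content.Text`, `Close` and `Quit` come from the Word object model.

```python
def read_document_text(app, path):
    """Open `path` in a running Word instance and return its plain text.

    `app` is expected to behave like the Word.Application COM object.
    """
    # Positional arguments: FileName, ConfirmConversions, ReadOnly.
    doc = app.Documents.Open(path, False, True)
    try:
        # Content is a Range covering the whole document; Text is its string.
        return doc.Content.Text
    finally:
        # False = close without saving changes.
        doc.Close(False)


def extract_text(path):
    """Start Word through COM, read one file, and quit Word again."""
    # Imported here so read_document_text() can be used without pywin32.
    import win32com.client

    app = win32com.client.Dispatch("Word.Application")
    app.Visible = False
    try:
        return read_document_text(app, path)
    finally:
        app.Quit()
```

Something like `text = extract_text('C:\\myfile.doc')` would then return the whole document as one string. Word separates paragraphs with `\r`, so `text.split('\r')` is one way to get at individual paragraphs.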

Rob

Jul 18 '05 #2
See http://aspn.activestate.com/ASPN/Coo.../Recipe/279003

Cheers,
Simon B.
Jul 18 '05 #3
You can use the VBA documentation for Word and, using dot notation and
the normal Pythonic way of calling functions, play with its various
objects, methods and attributes.
Here's some pretty straightforward code along these lines:
#************************
import win32com.client
from tkinter import filedialog  # was tkFileDialog on Python 2

# Launch Word (hidden)
MSWord = win32com.client.Dispatch("Word.Application")
MSWord.Visible = 0
# Ask the user for a file and open it; Open() returns the Document
myWordDoc = filedialog.askopenfilename()
doc = MSWord.Documents.Open(myWordDoc)
# Get the textual content (Content is a Range; Text is its string)
docText = doc.Content.Text
# Get the collection of tables
listTables = doc.Tables
#************************
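
Once you have the Tables collection, each table still has to be walked cell by cell. Here is a rough sketch of one way to do that, assuming a regular grid with no merged cells (asking for a cell that does not exist would raise a COM error). The function names `clean_cell_text` and `table_to_rows` are invented for this example; `Rows.Count`, `Columns.Count`, `Cell(row, column)` and `Range.Text` are from the Word object model.

```python
def clean_cell_text(raw):
    """Strip the end-of-cell marker that Word appends to every cell's text."""
    # Cell text ends with a paragraph mark (\r) followed by chr(7).
    return raw.rstrip("\r\x07")


def table_to_rows(table):
    """Convert one Word table into a list of rows, each a list of strings.

    `table` is expected to behave like an item of a document's Tables
    collection. Word numbers rows and columns from 1, not 0.
    """
    rows = []
    for r in range(1, table.Rows.Count + 1):
        row = []
        for c in range(1, table.Columns.Count + 1):
            row.append(clean_cell_text(table.Cell(r, c).Range.Text))
        rows.append(row)
    return rows
```

With a document open, `[table_to_rows(t) for t in doc.Tables]` should give one nested list per table.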

Happy parsing,

Jean-Marc
Jul 18 '05 #4
That is awesome! Thanks!

How would I save something in Word format? I am guessing
MSWord.Documents.Save(myWordDoc) or something along those lines? Where can I
find more documentation? Thanks.
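
A hedged sketch of one way saving could work: as far as the Word object model goes, writing to a named file is done with `SaveAs` on a single Document object rather than on the Documents collection. The function name `save_text_as_doc` and the constant name `WD_FORMAT_DOCUMENT` are made up for illustration; the value 0 is Word's `wdFormatDocument` (the classic .doc format). `app` is assumed to be the object returned by `win32com.client.Dispatch("Word.Application")`.

```python
WD_FORMAT_DOCUMENT = 0  # value of Word's wdFormatDocument constant (.doc)


def save_text_as_doc(app, text, path):
    """Create a new document containing `text` and save it to `path` as .doc.

    `app` is expected to behave like the Word.Application COM object.
    """
    doc = app.Documents.Add()
    try:
        # Content is a Range over the whole (empty) document; set its text.
        doc.Content.Text = text
        # Positional arguments: FileName, FileFormat.
        doc.SaveAs(path, WD_FORMAT_DOCUMENT)
    finally:
        # False = do not prompt for or save any further changes.
        doc.Close(False)
```

It is probably safest to pass an absolute path, since Word resolves relative names against its own working folder rather than the Python script's.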
Jul 18 '05 #5
Fazer wrote...
jm*********@cvm.qc.ca (jmdeschamps) wrote in message news:<3d**************************@posting.google.com>...
Rob Nikander <rn*************@adelphia.net> wrote in message news:<i7********************@adelphia.com>... <snip>

But I don't know the best way to find out the methods and properties of
the "word" object.
<snip>
How would I save something in Word format? I am guessing
MSWord.Documents.Save(myWordDoc) or something along those lines? Where can I
find more documentation? Thanks.


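Open MS Word and press Alt+F11 to open the Visual Basic editor, then F2
to open the Object Browser, which lists the objects with their methods
and properties.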

Jul 18 '05 #6
