using python to edit a word file?

I figured my first step is to install the win32 extension, which I did,
but I can't seem to find any documentation for it. A couple of the links
on Mark Hammond's site don't seem to work.

Anyway, all I need to do is search in the Word document for certain
strings and either delete them or replace them. Easy enough, if only I
knew which function, etc. to use.

Hope someone can push me in the right direction.

Thanks.
Aug 10 '06 #1
8 Replies


John Salerno <jo******@NOSPAMgmail.com> writes:
> I figured my first step is to install the win32 extension, which I
> did, but I can't seem to find any documentation for it. A couple of
> the links on Mark Hammond's site don't seem to work.
>
> Anyway, all I need to do is search in the Word document for certain
> strings and either delete them or replace them. Easy enough, if only I
> knew which function, etc. to use.
>
> Hope someone can push me in the right direction.

Maybe this will be helpful:

http://aspn.activestate.com/ASPN/Coo.../Recipe/279003

--
Regards,
Rob
Aug 10 '06 #2

Rob Wolfe wrote:
> John Salerno <jo******@NOSPAMgmail.com> writes:
>> I figured my first step is to install the win32 extension, which I
>> did, but I can't seem to find any documentation for it. A couple of
>> the links on Mark Hammond's site don't seem to work.
>>
>> Anyway, all I need to do is search in the Word document for certain
>> strings and either delete them or replace them. Easy enough, if only I
>> knew which function, etc. to use.
>>
>> Hope someone can push me in the right direction.
>
> Maybe this will be helpful:
>
> http://aspn.activestate.com/ASPN/Coo.../Recipe/279003

But if I save the file to text, won't it lose its formatting?

More specifically, here's what I have: a four-page calendar, each page
with three months on it. The months are in tables, which is why I don't
think making a text file will help me here, because I'll lose all that.
What I need to do is renumber all the dates, basically replacing a
number with itself minus 1. So it's not a simple find/replace task, and
there doesn't seem to be a way to do this in Word's find/replace feature
(but if there is, please let me know!)
Aug 10 '06 #3

On 2006-08-10 15:15:34, John Salerno wrote:
> I figured my first step is to install the win32 extension, which I did,
> but I can't seem to find any documentation for it. A couple of the links
> on Mark Hammond's site don't seem to work.
>
> Anyway, all I need to do is search in the Word document for certain
> strings and either delete them or replace them. Easy enough, if only I
> knew which function, etc. to use.
>
> Hope someone can push me in the right direction.

When Word is installed, you have a few COM interfaces to Word. I'm not sure
how to access these with Python (but documentation about using COM with
Python should help you here), and I'm not sure whether what you want is
available (but the Word COM documentation should help you with that).

Gerhard

Aug 10 '06 #4

John Salerno wrote:
> But if I save the file to text, won't it lose its formatting?

It looks like I can save it as an XML file and it will retain all the
formatting. Now I just need to decipher where the dates are in all that
mess and replace them, just using a normal text file! :)
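
One way to locate the dates in the saved XML is to parse it rather than read it by eye. The following is only a sketch: it assumes the file was produced by Word 2003's "Save as XML" (WordprocessingML), whose namespace URI is used below, and the sample fragment and the function name `find_numeric_runs` are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Assumption: Word 2003 "Save as XML" (WordprocessingML) namespace.
W_NS = "http://schemas.microsoft.com/office/word/2003/wordml"

# Made-up sample standing in for a fragment of the saved calendar.
SAMPLE = (
    '<w:body xmlns:w="%s">'
    '<w:p><w:r><w:t>August</w:t></w:r></w:p>'
    '<w:p><w:r><w:t>15</w:t></w:r></w:p>'
    '<w:p><w:r><w:t>16</w:t></w:r></w:p>'
    '</w:body>' % W_NS
)


def find_numeric_runs(xml_text):
    """Return the text of every <w:t> element that holds only digits."""
    root = ET.fromstring(xml_text)
    tag = "{%s}t" % W_NS  # ElementTree spells namespaced tags as {uri}name
    return [el.text for el in root.iter(tag) if el.text and el.text.isdigit()]


print(find_numeric_runs(SAMPLE))  # prints ['15', '16']
```

Text runs such as month names are skipped, so only the day numbers are left as candidates for renumbering.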
Aug 10 '06 #5

John Salerno wrote:
> I figured my first step is to install the win32 extension, which I did,
> but I can't seem to find any documentation for it. A couple of the links
> on Mark Hammond's site don't seem to work.
>
> Anyway, all I need to do is search in the Word document for certain
> strings and either delete them or replace them. Easy enough, if only I
> knew which function, etc. to use.
>
> Hope someone can push me in the right direction.
>
> Thanks.

The easiest way for me to do things like this is to do it in Word and
record a VB Macro. For instance you will see something like this:

Selection.Find.ClearFormatting
Selection.Find.Replacement.ClearFormatting
With Selection.Find
    .Text = "save it"
    .Replacement.Text = "dont save it"
    .Forward = True
    .Wrap = wdFindContinue
    .Format = False
    .MatchCase = False
    .MatchWholeWord = False
    .MatchByte = False
    .CorrectHangulEndings = False
    .MatchAllWordForms = False
    .MatchSoundsLike = False
    .MatchWildcards = False
    .MatchFuzzy = False
End With
Selection.Find.Execute Replace:=wdReplaceAll

and then hand translate it to Win32 Python, like:

from win32com.client import Dispatch, constants

# The wd* constants are only available after makepy has been run (see below).
wordApp = Dispatch("Word.Application")
# Open the existing document (Documents.Add would create a new one instead).
# The file name is left as a placeholder.
wordDoc = wordApp.Documents.Open(...some word file name...)
# Put the selection at the start of the document.
wordDoc.Range(0, 0).Select()
sel = wordApp.Selection
sel.Find.ClearFormatting()
sel.Find.Replacement.ClearFormatting()
sel.Find.Text = "save it"
sel.Find.Replacement.Text = "dont save it"
sel.Find.Forward = True
sel.Find.Wrap = constants.wdFindContinue
sel.Find.Format = False
sel.Find.MatchCase = False
sel.Find.MatchWholeWord = False
sel.Find.MatchByte = False
sel.Find.CorrectHangulEndings = False
sel.Find.MatchAllWordForms = False
sel.Find.MatchSoundsLike = False
sel.Find.MatchWildcards = False
sel.Find.MatchFuzzy = False
# Replace every occurrence in one pass.
sel.Find.Execute(Replace=constants.wdReplaceAll)
wordDoc.SaveAs(...some word file name...)

I can't say that this works as typed, because I haven't tried it myself,
but it should give you a good start.

Make sure you run the makepy.py program in the
\python23\lib\site-packages\win32com\client directory and install the
"MS Word 11.0 Object Library (8.3)" (or something equivalent). On my
computers, this is not installed automatically and I have to remember
to do it myself or else things won't work.
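
As an alternative to running makepy.py by hand, pywin32's `gencache.EnsureDispatch` generates the same support on demand. The sketch below is untested against a live Word installation and rests on several assumptions: Windows, Microsoft Word, and pywin32 are installed, and the function name `replace_all_in_word` and its parameters are made up for illustration. It searches the document's `Content` range instead of the selection.

```python
def replace_all_in_word(src_path, dst_path, find_text, replace_text):
    """Replace every occurrence of find_text in a Word document via COM.

    Requires Windows, Microsoft Word, and pywin32. Pass absolute paths,
    since Word resolves relative paths against its own working directory.
    """
    # Imported here so the module can at least be loaded on other systems.
    from win32com.client import constants, gencache

    # EnsureDispatch builds the makepy support on demand, which is what
    # makes the wd* names available under `constants`.
    word = gencache.EnsureDispatch("Word.Application")
    word.Visible = False
    try:
        doc = word.Documents.Open(src_path)
        find = doc.Content.Find
        find.ClearFormatting()
        find.Replacement.ClearFormatting()
        find.Execute(
            FindText=find_text,
            ReplaceWith=replace_text,
            Forward=True,
            Wrap=constants.wdFindStop,
            Replace=constants.wdReplaceAll,
        )
        doc.SaveAs(dst_path)
        doc.Close()
    finally:
        # Always shut Word down so no hidden instance is left running.
        word.Quit()
```

A call might look like `replace_all_in_word(r"C:\docs\in.doc", r"C:\docs\out.doc", "save it", "dont save it")`, where both paths are hypothetical.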

Good Luck.

Aug 10 '06 #6

John,

I have a notion about translating stuff in a mess and could help you with
the translation. But it may be that the conversion from DOC to formatted
text is a bigger problem. Loading the files into Word and saving them in a
different format may not be a practical option if you have many files to
do. Googling for batch converters from DOC to RTF, I couldn't find anything.
If you can solve the conversion problem, pass me a sample file. I'll solve
the translation problem for you.

Frederic
Aug 11 '06 #7

Anthra Norell wrote:
> John,
>
> I have a notion about translating stuff in a mess and could help you with
> the translation. But it may be that the conversion from DOC to formatted
> text is a bigger problem. Loading the files into Word and saving them in a
> different format may not be a practical option if you have many files to
> do. Googling for batch converters from DOC to RTF, I couldn't find anything.
> If you can solve the conversion problem, pass me a sample file. I'll solve
> the translation problem for you.
>
> Frederic

What I ended up doing was just saving the Word file as an XML file, and
then writing a little script to process the text file. Then when it
opens back in Word, all the formatting remains. The script isn't ideal,
but it did the bulk of changing the numbers, and then I did a few things
by hand. I love having Python for these chores! :)

import re

xml_file = open('calendar.xml')
xml_data = xml_file.read()
xml_file.close()

pattern = re.compile(r'<w:t>(\d+)</w:t>')

def subtract(match_obj):
    date = int(match_obj.group(1)) - 1
    return '<w:t>%s</w:t>' % date

new_data = re.sub(pattern, subtract, xml_data)

new_file = open('calendar2007.xml', 'w')
new_file.write(new_data)
new_file.close()
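
To see what the substitution does without touching any files, here is the same pattern and callback applied to a short in-memory string. The sample fragment is made up and far simpler than real Word XML.

```python
import re

PATTERN = re.compile(r'<w:t>(\d+)</w:t>')


def subtract(match_obj):
    """Rebuild the matched <w:t> element with its number reduced by one."""
    date = int(match_obj.group(1)) - 1
    return '<w:t>%s</w:t>' % date


# Made-up fragment: two day numbers and one non-numeric run.
sample = (
    '<w:r><w:t>15</w:t></w:r>'
    '<w:r><w:t>Mon</w:t></w:r>'
    '<w:r><w:t>1</w:t></w:r>'
)

result = PATTERN.sub(subtract, sample)
print(result)
# prints <w:r><w:t>14</w:t></w:r><w:r><w:t>Mon</w:t></w:r><w:r><w:t>0</w:t></w:r>
```

Non-numeric runs are left alone, and a 1 becomes 0, which is the sort of case that would still need correcting by hand (the thread does not say which fixes were actually made manually).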
Aug 11 '06 #8

No one could do it any better. Good for you! - Frederic

Aug 11 '06 #9
