using python to edit a word file?

I figured my first step is to install the win32 extension, which I did,
but I can't seem to find any documentation for it. A couple of the links
on Mark Hammond's site don't seem to work.

Anyway, all I need to do is search in the Word document for certain
strings and either delete them or replace them. Easy enough, if only I
knew which function, etc. to use.

Hope someone can push me in the right direction.

Thanks.
Aug 10 '06 #1
John Salerno <jo******@NOSPAMgmail.com> writes:
> I figured my first step is to install the win32 extension, which I
> did, but I can't seem to find any documentation for it. A couple of
> the links on Mark Hammond's site don't seem to work.
>
> Anyway, all I need to do is search in the Word document for certain
> strings and either delete them or replace them. Easy enough, if only I
> knew which function, etc. to use.
>
> Hope someone can push me in the right direction.

Maybe this will be helpful:

http://aspn.activestate.com/ASPN/Coo.../Recipe/279003

--
Regards,
Rob
Aug 10 '06 #2
Rob Wolfe wrote:
> John Salerno <jo******@NOSPAMgmail.com> writes:
>> I figured my first step is to install the win32 extension, which I
>> did, but I can't seem to find any documentation for it. A couple of
>> the links on Mark Hammond's site don't seem to work.
>>
>> Anyway, all I need to do is search in the Word document for certain
>> strings and either delete them or replace them. Easy enough, if only I
>> knew which function, etc. to use.
>>
>> Hope someone can push me in the right direction.
>
> Maybe this will be helpful:
>
> http://aspn.activestate.com/ASPN/Coo.../Recipe/279003

But if I save the file to text, won't it lose its formatting?

More specifically, here's what I have: a four-page calendar, each page
with three months on it. The months are in tables, which is why I don't
think making a text file will help me here, because I'll lose all that.
What I need to do is renumber all the dates, basically replacing a
number with itself minus 1. So it's not a simple find/replace task, and
there doesn't seem to be a way to do this in Word's find/replace feature
(but if there is, please let me know!)
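
The renumbering described above (each number replaced by itself minus 1) is something a fixed find/replace string can't express, but Python's `re.sub` accepts a function as the replacement. A minimal sketch on a plain string; the sample text and the name `decrement_numbers` are made up for illustration and say nothing about how the numbers are stored inside a Word file:

```python
import re

def decrement_numbers(text):
    """Replace every run of digits in text with that number minus 1."""
    def minus_one(match):
        # match.group(0) is the matched digit string, e.g. "14"
        return str(int(match.group(0)) - 1)
    return re.sub(r'\d+', minus_one, text)

print(decrement_numbers("Mon 2  Tue 3  Wed 4"))
```

Note that this touches every number in the text, so anything that is not a date (a year, for instance) would need to be excluded by a narrower pattern.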
Aug 10 '06 #3
On 2006-08-10 15:15:34, John Salerno wrote:
> I figured my first step is to install the win32 extension, which I did,
> but I can't seem to find any documentation for it. A couple of the links
> on Mark Hammond's site don't seem to work.
>
> Anyway, all I need to do is search in the Word document for certain
> strings and either delete them or replace them. Easy enough, if only I
> knew which function, etc. to use.
>
> Hope someone can push me in the right direction.

When Word is installed, you have a few COM interfaces to Word. I'm not sure
how to access these with Python (but documentation about using COM with
Python should help you here), and I'm not sure whether what you want is
available (but the Word COM documentation should help you with that).

Gerhard

Aug 10 '06 #4
John Salerno wrote:
> But if I save the file to text, won't it lose its formatting?

It looks like I can save it as an XML file and it will retain all the
formatting. Now I just need to decipher where the dates are in all that
mess and replace them, just using a normal text file! :)
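
One way to find where the numbers sit in the XML, sketched here with the standard library's `xml.etree.ElementTree` rather than by reading the file by eye. The namespace URI is an assumption (it is the one commonly seen in Word 2003 XML exports; check the root element of the actual file), and `SAMPLE` and `numeric_runs` are hypothetical names used only for this illustration:

```python
import xml.etree.ElementTree as ET

# Assumed namespace for Word 2003 XML; other Word versions use a different URI.
W_NS = "http://schemas.microsoft.com/office/word/2003/wordml"

# Tiny made-up document: one text run with a month name, one with a number.
SAMPLE = (
    '<w:wordDocument xmlns:w="%s">'
    '<w:body><w:p><w:r><w:t>June</w:t></w:r></w:p>'
    '<w:p><w:r><w:t>14</w:t></w:r></w:p></w:body>'
    '</w:wordDocument>' % W_NS
)

def numeric_runs(xml_text):
    """Return the text of every <w:t> element that contains only digits."""
    root = ET.fromstring(xml_text)
    found = []
    for node in root.iter("{%s}t" % W_NS):
        if node.text and node.text.isdigit():
            found.append(node.text)
    return found

print(numeric_runs(SAMPLE))
```

Listing the numeric runs first makes it easier to see whether anything other than dates would be caught before any replacement is attempted.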
Aug 10 '06 #5
John Salerno wrote:
> I figured my first step is to install the win32 extension, which I did,
> but I can't seem to find any documentation for it. A couple of the links
> on Mark Hammond's site don't seem to work.
>
> Anyway, all I need to do is search in the Word document for certain
> strings and either delete them or replace them. Easy enough, if only I
> knew which function, etc. to use.
>
> Hope someone can push me in the right direction.
>
> Thanks.

The easiest way for me to do things like this is to do it in Word and
record a VB Macro. For instance you will see something like this:

Selection.Find.ClearFormatting
Selection.Find.Replacement.ClearFormatting
With Selection.Find
    .Text = "save it"
    .Replacement.Text = "dont save it"
    .Forward = True
    .Wrap = wdFindContinue
    .Format = False
    .MatchCase = False
    .MatchWholeWord = False
    .MatchByte = False
    .CorrectHangulEndings = False
    .MatchAllWordForms = False
    .MatchSoundsLike = False
    .MatchWildcards = False
    .MatchFuzzy = False
End With
Selection.Find.Execute Replace:=wdReplaceAll

and then hand translate it to Win32 Python, like:

from win32com.client import Dispatch, constants

# constants is only populated once makepy has been run (see below)
wordApp = Dispatch("Word.Application")
wordDoc = wordApp.Documents.Add(...some word file name...)
wordDoc.Range(0, 0).Select()  # move the selection to the start of the document
sel = wordApp.Selection
sel.Find.ClearFormatting()
sel.Find.Replacement.ClearFormatting()
sel.Find.Text = "save it"
sel.Find.Replacement.Text = "dont save it"
sel.Find.Forward = True
sel.Find.Wrap = constants.wdFindContinue
sel.Find.Format = False
sel.Find.MatchCase = False
sel.Find.MatchWholeWord = False
sel.Find.MatchByte = False
sel.Find.CorrectHangulEndings = False
sel.Find.MatchAllWordForms = False
sel.Find.MatchSoundsLike = False
sel.Find.MatchWildcards = False
sel.Find.MatchFuzzy = False
sel.Find.Execute(Replace=constants.wdReplaceAll)
wordDoc.SaveAs(...some word file name...)

I can't say that this works as typed, because I haven't tried it myself,
but it should give you a good start.

Make sure you run the makepy.py program in the
\python23\lib\site-packages\win32com\client directory and install the
"MS Word 11.0 Object Library (8.3)" (or something equivalent). On my
computers, this is not installed automatically and I have to remember
to do it myself or else things won't work.
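
The hand translation above repeats one assignment per recorded property. A possible shortcut, sketched under assumptions: keep the recorded settings in a dictionary and apply them in a loop with `setattr`. The names `FIND_SETTINGS` and `apply_settings` are hypothetical, and the sketch runs against a stand-in object rather than Word, so it has not been checked against a real `sel.Find`:

```python
from types import SimpleNamespace

# Settings copied from the recorded macro. Wrap and the Replace argument are
# left out because they need Word constants (wdFindContinue, wdReplaceAll).
FIND_SETTINGS = {
    "Text": "save it",
    "Forward": True,
    "Format": False,
    "MatchCase": False,
    "MatchWholeWord": False,
    "MatchWildcards": False,
}

def apply_settings(find, settings):
    """Set each named property on a Find-like object and return it."""
    for name, value in settings.items():
        setattr(find, name, value)
    return find

# Stand-in object so the sketch runs without Word installed; with win32com
# the idea would be to pass sel.Find instead.
fake_find = apply_settings(SimpleNamespace(), FIND_SETTINGS)
print(fake_find.Text, fake_find.MatchCase)
```

Nested properties such as `Replacement.Text` would still need their own assignment, since `setattr` only sets attributes directly on the object it is given.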

Good Luck.

Aug 10 '06 #6
John,

I have a notion about translating stuff in a mess and could help you with the translation. But it may be that the conversion
from DOC to formatted text is a bigger problem. Loading the files into Word and saving them in a different format may not be a
practical option if you have many files to do. I Googled for batch DOC-to-RTF converters but couldn't find anything.
If you can solve the conversion problem, pass me a sample file. I'll solve the translation problem for you.

Frederic
Aug 11 '06 #7
Anthra Norell wrote:
> John,
>
> I have a notion about translating stuff in a mess and could help you with the translation. But it may be that the conversion
> from DOC to formatted text is a bigger problem. Loading the files into Word and saving them in a different format may not be a
> practical option if you have many files to do. I Googled for batch DOC-to-RTF converters but couldn't find anything.
> If you can solve the conversion problem, pass me a sample file. I'll solve the translation problem for you.
>
> Frederic

What I ended up doing was just saving the Word file as an XML file, and
then writing a little script to process the text file. Then when it
opens back in Word, all the formatting remains. The script isn't ideal,
but it did the bulk of changing the numbers, and then I did a few things
by hand. I love having Python for these chores! :)

import re

# Read the XML that Word saved
with open('calendar.xml') as xml_file:
    xml_data = xml_file.read()

# A text run whose entire content is a number
pattern = re.compile(r'<w:t>(\d+)</w:t>')

def subtract(match_obj):
    # Replace the number with itself minus 1
    date = int(match_obj.group(1)) - 1
    return '<w:t>%s</w:t>' % date

new_data = re.sub(pattern, subtract, xml_data)

# Write the result to a new file so the original is untouched
with open('calendar2007.xml', 'w') as new_file:
    new_file.write(new_data)
Aug 11 '06 #8
No one could do it any better. Good for you! - Frederic

Aug 11 '06 #9
