
Analysing Word documents (slow) What's wrong with this code please!

Does anyone have a hint how else to get faster results?
(This is to find out what was bold in the document, in order to grab
documents produced in Word and generate HTML (web pages) and XML
(straight data) versions.)

# START ========================
import win32com.client
import tkFileDialog, time

# Launch Word
MSWord = win32com.client.Dispatch("Word.Application")

myWordDoc = tkFileDialog.askopenfilename()

MSWord.Documents.Open(myWordDoc)

boldRanges = []  # list of bold ranges
boldStart = -1
boldEnd = -1
t1 = time.clock()
for i in range(len(MSWord.Documents[0].Content.Text)):
    if MSWord.Documents[0].Range(i, i+1).Bold:  # testing for bold property
        if boldStart == -1:
            boldStart = i
        else:
            boldEnd = i
    else:
        if boldEnd != -1:
            boldRanges.append((boldStart, boldEnd))
        boldStart = -1
        boldEnd = -1
t2 = time.clock()
MSWord.Quit()

print boldRanges  # see what we got
print "Analysed in ", t2-t1
# END =====================================

Thanks in advance
Jul 18 '05 #1
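
Separately from the speed question, the run-tracking logic in the posted loop appears to skip two cases: a bold run of a single character is never recorded on its own, and a run that reaches the end of the document is never appended. The following is a minimal sketch of the same grouping done on a plain list of per-character flags, so it can be checked without Word. The name `bold_runs` and the half-open `(start, end)` convention are assumptions made for illustration; the original code stores the index of the last bold character instead.

```python
def bold_runs(flags):
    """Group per-character bold flags into half-open (start, end) ranges."""
    runs = []
    start = None  # index where the current bold run began, if any
    for i, is_bold in enumerate(flags):
        if is_bold and start is None:
            start = i                      # a new bold run begins here
        elif not is_bold and start is not None:
            runs.append((start, i))        # run ended just before index i
            start = None
    if start is not None:
        runs.append((start, len(flags)))   # run extends to the end of the text
    return runs


example = bold_runs([False, True, False, True, True])
# example == [(1, 2), (3, 5)]
```

Keeping the grouping in its own function also means that whichever way the flags are obtained (character by character, or from Word's own search), the bookkeeping stays the same.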
jmdeschamps wrote:
Does anyone have a hint how else to get faster results?
(This is to find out what was bold in the document, in order to grab
documents produced in Word and generate HTML (web pages) and XML
(straight data) versions.)
[...]
for i in range(len(MSWord.Documents[0].Content.Text)):
    if MSWord.Documents[0].Range(i,i+1).Bold : # testing for bold


Perhaps you can search for bold text. The Word search dialog allows this.
And when you use the keyboard macro recording feature of Word, you can
probably figure out how to use that search feature from Python.

Daniel

Jul 18 '05 #2
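
The search suggestion above might look roughly like the sketch below. It has not been run against Word here and should be read as an assumption-laden outline rather than the approach the poster ended up with. `Find`, `ClearFormatting`, `Font.Bold`, `Format`, `Forward`, `Wrap`, `Execute` and `Collapse` are members of the Word object model; the function name `find_bold_ranges` and the two constant names are made up for this example, with the numeric values taken from Word's `wdFindStop` and `wdCollapseEnd`.

```python
WD_FIND_STOP = 0      # value of Word's wdFindStop: do not wrap at document end
WD_COLLAPSE_END = 0   # value of Word's wdCollapseEnd


def find_bold_ranges(doc):
    """Collect (start, end) offsets of each bold match reported by Word's Find.

    `doc` is expected to be a Word Document COM object.
    """
    rng = doc.Content
    find = rng.Find
    find.ClearFormatting()
    find.Text = ""           # match on formatting only, not on text
    find.Font.Bold = True
    find.Format = True
    find.Forward = True
    find.Wrap = WD_FIND_STOP

    ranges = []
    last_end = -1
    while find.Execute():    # on success, rng is redefined to the match
        if rng.End <= last_end:
            break            # no forward progress: stop instead of looping forever
        ranges.append((rng.Start, rng.End))
        last_end = rng.End
        rng.Collapse(WD_COLLAPSE_END)  # continue searching after this match
    return ranges
```

With the script from the first post, this would presumably be called as `find_bold_ranges(MSWord.Documents[0])` after the document has been opened. The appeal is that Word does the scanning itself, so the number of COM calls grows with the number of bold runs rather than with the number of characters.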
jmdeschamps wrote:
Does anyone have a hint how else to get faster results?
(This is to find out what was bold in the document, in order to grab
documents produced in Word and generate HTML (web pages) and XML
(straight data) versions.)

# START ========================
import win32com.client
import tkFileDialog, time

# Launch Word
MSWord = win32com.client.Dispatch("Word.Application")

myWordDoc = tkFileDialog.askopenfilename()

MSWord.Documents.Open(myWordDoc)

boldRanges=[] #list of bold ranges
boldStart = -1
boldEnd = -1
t1= time.clock()
for i in range(len(MSWord.Documents[0].Content.Text)):
    if MSWord.Documents[0].Range(i,i+1).Bold : # testing for bold property

Vaguely knowing how pythoncom works, you'd really do better to avoid asking for
MSWord.Documents[0] at each loop step: pythoncom fetches the COM objects
corresponding to all the attributes and methods you ask for dynamically, and that may
cost a lot of time. So doing:

doc = MSWord.Documents[0]
for i in range(len(doc.Content.Text)):
    if doc.Range(i,i+1).Bold: ...

may greatly improve performance.

--
- Eric Brunel <eric dot brunel at pragmadev dot com> -
PragmaDev : Real Time Software Development Tools - http://www.pragmadev.com

Jul 18 '05 #3
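
As a rough illustration of the advice in the post above, the sketch below looks up the document once and reads the text length once, leaving only one `Range` call plus one `Bold` lookup per character. The function name `bold_flags` is hypothetical, and `word_app` stands for the `MSWord` object from the first post. This still makes COM calls for every character, so it is only a partial speed-up.

```python
def bold_flags(word_app):
    """Return one True/False flag per character of the first open document."""
    doc = word_app.Documents[0]        # looked up once, not on every iteration
    length = len(doc.Content.Text)     # text fetched once, before the loop
    flags = []
    for i in range(length):
        # Word reports bold as a non-zero value for a one-character range
        flags.append(bool(doc.Range(i, i + 1).Bold))
    return flags
```

The loop body is unchanged apart from using the cached `doc`, which is the whole point: every lookup that gives the same answer on each pass is moved above the loop.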
"Daniel Dittmar" <da************@sap.com> wrote in message news:<bu**********@news1.wdf.sap-ag.de>...
jmdeschamps wrote:
Does anyone have a hint how else to get faster results?
(This is to find out what was bold in the document, in order to grab
documents produced in Word and generate HTML (web pages) and XML
(straight data) versions.) [...]

Perhaps you can search for bold text. The Word search dialog allows this.
And when you use the keyboard macro recording feature of Word, you can
probably figure out how to use that search feature from Python.

Daniel


Thanks. Paul Prescod suggested this too, and it works great!

Jean-Marc
Jul 18 '05 #4
Eric Brunel <er*********@N0SP4M.com> wrote in message news:<bu*********@news-reader4.wanadoo.fr>...

Vaguely knowing how pythoncom works, you'd really do better to avoid asking for
MSWord.Documents[0] at each loop step: pythoncom fetches the COM objects
corresponding to all the attributes and methods you ask for dynamically, and that may
cost a lot of time. So doing:

doc = MSWord.Documents[0]
for i in range(len(doc.Content.Text)):
    if doc.Range(i,i+1).Bold: ...

may greatly improve performance.

Thanks, it does! And so does using the built-in Find object.

Jean-Marc
Jul 18 '05 #5
