How to search for a string pattern in an MS Word doc using Python

Hi All,

I have been trying to read a Word document and search for a particular string pattern using Python. I have used the pywin32 API.

I have got as far as opening the file, like this:

import win32com.client
word = win32com.client.Dispatch('Word.Application')
f = word.Documents.Open("<filename>.doc")

I am not sure how to proceed from here: how do I read each line and compare it against the pattern?

If anyone has any ideas, please respond.

Thanks in advance.
Anil.
Sep 18 '08 #1
bvdet
2,851 Recognized Expert Moderator Specialist
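You don't have to open the document in Word to search for a string; you can read the file's raw bytes and search those directly. Note that this only finds text that Word stored as single-byte characters; text stored as 16-bit characters will not match. For example, to find email addresses in a Word document: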
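import re

fn = 'sample.doc'
# Read the raw bytes of the file; Word itself is not needed for this.
with open(fn, 'rb') as f:
    data = f.read()
# Use a bytes pattern, because the file was read in binary mode.
patt = re.compile(rb'\b[a-zA-Z0-9.]+@[a-zA-Z0-9]+\.[a-z]{3}\b')
print(patt.findall(data))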
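
If you do want to go through Word and check the text paragraph by paragraph, as in the original question, something along these lines might work. This is only a sketch: it assumes Windows with Word and pywin32 installed, the helper names `find_matches` and `read_word_paragraphs` are made up for illustration, and the pattern is simply the email pattern from above.

```python
import re

# The email pattern from the reply above, as a text (str) pattern.
EMAIL_PATTERN = re.compile(r'\b[a-zA-Z0-9.]+@[a-zA-Z0-9]+\.[a-z]{3}\b')


def find_matches(paragraphs, pattern=EMAIL_PATTERN):
    """Return (paragraph_number, match) pairs, numbering paragraphs from 1."""
    hits = []
    for number, text in enumerate(paragraphs, start=1):
        for match in pattern.findall(text):
            hits.append((number, match))
    return hits


def read_word_paragraphs(path):
    """Return the text of each paragraph, read through Word's COM interface.

    Assumption: Windows with Word and pywin32 installed. Pass an absolute
    path, since Word may not resolve relative paths the way Python does.
    """
    import win32com.client  # imported lazily so find_matches works anywhere

    word = win32com.client.Dispatch('Word.Application')
    doc = word.Documents.Open(path)
    try:
        # Range.Text ends with the paragraph mark ('\r'); table cells also
        # end with '\x07'. Strip both before matching.
        return [p.Range.Text.rstrip('\r\x07') for p in doc.Paragraphs]
    finally:
        doc.Close(False)  # close without saving changes
        word.Quit()       # shut down the Word instance obtained above
```

You would then call something like `find_matches(read_word_paragraphs(r'C:\docs\sample.doc'))`, where the path is just a placeholder. Keeping the matching separate from the COM code means you can try the pattern on plain strings without Word installed.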
Sep 18 '08 #2
