I have successfully created a program that searches for a word in multiple files, but now I need to be able to search for more than one word. I have added code from a previous discussion to my original program, but I am unsure how the two pieces should fit together. Can someone clear this up for me?

#!C:\PYTHON25\PYTHON.EXE

import os
import re

dir_name = r'c:\Python25\books\books\books'
word = raw_input("Enter a word to search for: ")
word2 = raw_input("Enter a second word to search for: ")
keyList = ['word', 'word2']
entryList = [os.path.join(dir_name, fn) for fn in os.listdir(dir_name) if os.path.isfile(os.path.join(dir_name, fn))]

for file_name in entryList:
    for line in file(file_name).readlines():
        if word in line:
            print line

patt = re.compile('|'.join(keyList), re.IGNORECASE)
for fn in dir_name:
    f = open(fn)
    for line in f:
        if patt.search(line.lower()):
            print line
    f.close()
Here's one way:

import os
import re

dir_name = r'c:\Python25\books\books\books'

##word2 = raw_input("Enter a second word to search for: ")
###removed quotes#
##keyList = [word, word2]

def FindWord(word, fileList):
    for file_name in fileList:
        for line in file(file_name).readlines():
            if word in line:
                print line

def FindWords(wordList, fileList):
    patt = re.compile('|'.join(wordList), re.IGNORECASE)
    for fn in dir_name:
        f = open(fn)
        for line in f.readlines():  # added .readlines()
            if patt.search(line.lower()):  # probably don't need .lower()
                print line
        f.close()

entryList = [os.path.join(dir_name, fn) for fn in os.listdir(dir_name)
             if os.path.isfile(os.path.join(dir_name, fn))]

words = raw_input("Enter one or more words to search for: ")
keyList = words.split()
if len(keyList) > 1:
    FindWords(keyList, entryList)
else:
    FindWord(words, entryList)
Here is what I have now. It searches for one word just fine, but when I add a second word it gives me an error. I added a print statement to see if it was splitting the input, and it is. I have listed the error I keep getting at the bottom. I can't figure out what is wrong.
Thanks for all your help.

import os
import re

dir_name = r'c:\Python25\books\books\books'

def FindWord(word, fileList):
    for file_name in fileList:
        for line in file(file_name).readlines():
            if word in line:
                print line

def FindWords(wordList, fileList):
    patt = re.compile('|'.join(wordList), re.IGNORECASE)
    for fn in dir_name:
        f = open(fn)
        for line in f.readlines():
            if patt.search(line):
                print line
        f.close()

entryList = [os.path.join(dir_name, fn) for fn in os.listdir(dir_name)
             if os.path.isfile(os.path.join(dir_name, fn))]

words = raw_input("Enter one or more words to search for: ")
keyList = words.split()
print keyList
if len(keyList) > 1:
    FindWords(words, entryList)
else:
    FindWord(words, entryList)

Enter one or more words to search for: bird goat tree
['bird', 'goat', 'tree']
Traceback (most recent call last):
  File "C:/Python25/searchtest.py", line 29, in <module>
    FindWords(words, entryList)
  File "C:/Python25/searchtest.py", line 15, in FindWords
    f = open(fn)
IOError: [Errno 2] No such file or directory: 'c'
bvdet 2,851
Expert Mod 2GB
You have left out some code. Look at this and then look at your code:

>>> dir_name = r'c:\Python25\books\books\books'
>>> for fn in dir_name:
...     print fn
...
c
:
\
P
y
t
h
o
n
2
5
\
b
o
o
k
s
\
b
o
o
k
s
\
b
o
o
k
s
>>>
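To make the point above concrete, here is a minimal sketch (written in Python 3 syntax, which is an assumption; the thread itself uses Python 2.5) contrasting a loop over the directory string with a loop over a list of paths. The two file names in `file_list` are hypothetical and only stand in for whatever `os.listdir` would return.

```python
# Iterating over a string yields single characters, not file names.
dir_name = r'c:\Python25\books\books\books'

# Hypothetical list of full paths, standing in for entryList in the thread.
file_list = [dir_name + r'\a.txt', dir_name + r'\b.txt']

chars = [item for item in dir_name]    # what "for fn in dir_name" sees
names = [item for item in file_list]   # what "for fn in fileList" sees

print(chars[:3])   # ['c', ':', '\\']
print(len(names))  # 2
```

The first item of `chars` is `'c'`, which is exactly the name that `open()` complains about in the traceback.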
My bad. Sorry. It should be:

def FindWords(wordList, fileList):
    patt = re.compile('|'.join(wordList), re.IGNORECASE)
    for fn in fileList:
        f = open(fn)
        for line in f.readlines():
            if patt.search(line):
                print line
        f.close()

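For readers on a newer interpreter, the corrected function might look roughly like the sketch below. It is written for Python 3 (an assumption; the thread uses Python 2.5), and the names `find_words` and `list_files`, the use of `re.escape`, and the returned list of hits are illustrative choices rather than anything from the posts above. The demo writes to a throwaway temporary directory so it can run anywhere.

```python
import os
import re
import tempfile


def list_files(dir_name):
    """Return full paths of the regular files directly inside dir_name."""
    return [os.path.join(dir_name, fn) for fn in os.listdir(dir_name)
            if os.path.isfile(os.path.join(dir_name, fn))]


def find_words(word_list, file_list):
    """Return (file name, line) pairs for lines containing any of the words."""
    # re.escape keeps characters such as '.' or '+' from acting as regex syntax.
    patt = re.compile('|'.join(re.escape(w) for w in word_list), re.IGNORECASE)
    hits = []
    for fn in file_list:          # loop over the file list, not over dir_name
        with open(fn) as f:       # the file is closed even if an error occurs
            for line in f:
                if patt.search(line):
                    hits.append((fn, line.rstrip('\n')))
    return hits


# Demo on a temporary directory with one small sample file.
demo_dir = tempfile.mkdtemp()
with open(os.path.join(demo_dir, 'sample.txt'), 'w') as f:
    f.write('A bird sang\nNothing here\nThe GOAT ate\n')

for fn, line in find_words(['bird', 'goat'], list_files(demo_dir)):
    print(line)
```

Returning the hits instead of printing inside the function is only a convenience for testing; printing directly, as the thread does, works just as well.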
Thanks guys.
I got it where it is searching for all the words but I need to fine tune it some more.
First of all, when I put in a word like "eat", it is finding everything with those letters in it like "beat". Is there a way to make it only pull up the exact word?
Also, it is bringing up all of the lines that have any one of the words in them. Is there a way to change it so that it only prints the lines that have all of the words in them?
#!C:\PYTHON25\PYTHON.EXE
should always be
#!/usr/bin/env python
and never anything else.
Unlike Windows, which decides how to run a file from its file extension, *nix systems read the first line of a file before executing it.
If that line begins with #!, the path of the file is passed as an argument to the interpreter named on that line; in our case, python.
It would be very helpful if you would get in the habit of posting the working code (especially if you still have questions). It helps others figure out this type of problem when they get stuck and it helps us see what the heck you're talking about.
That said:
The "fine tuning" comes down to learning the regular expression language, and I'm not sure that I'm ready to start calling this the Python/Regex Forum just yet. Regular-Expression.info is a good place to start with that.
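As a small taste of what that fine tuning can look like, the sketch below uses the `\b` word-boundary anchor so that "eat" no longer matches inside "beat". It is written in Python 3 syntax (an assumption; the thread uses Python 2.5), and the helper name `whole_word_pattern` is made up for illustration. Note that `\b` treats digits and underscores as word characters too, so it is not identical to checking for neighbouring letters only.

```python
import re


def whole_word_pattern(word):
    """Compile a case-insensitive pattern that matches word only as a whole word."""
    # \b matches at the edge between a word character and a non-word character.
    return re.compile(r'\b%s\b' % re.escape(word), re.IGNORECASE)


patt = whole_word_pattern('eat')
print(bool(patt.search('We eat at noon')))     # True
print(bool(patt.search('The drums beat on')))  # False
```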
It's not much different than what is listed above:

import os
import re

dir_name = r'c:\Python25\books\books\books'

def FindWord(word, fileList):
    for file_name in fileList:
        for line in file(file_name).readlines():
            if word in line:
                print line

def FindWords(wordList, fileList):
    patt = re.compile('|'.join(wordList), re.IGNORECASE)
    for fn in fileList:
        f = open(fn)
        for line in f.readlines():
            if patt.search(line):
                print line
        f.close()

entryList = [os.path.join(dir_name, fn) for fn in os.listdir(dir_name)
             if os.path.isfile(os.path.join(dir_name, fn))]

words = raw_input("Enter one or more words to search for: ")
keyList = words.split()
if len(keyList) > 1:
    FindWords(keyList, entryList)
else:
    FindWord(words, entryList)

Given a file name and a key word list, this function will print any line that contains a word in the key word list:

def matchAnyWord(fn, keyList):
    patt = re.compile('(?<![a-z])%s(?![a-z])' % '(?![a-z])|(?<![a-z])'.join(keyList), re.IGNORECASE)
    f = open(fn)
    for line in f:
        if patt.search(line.lower()):
            print line
    f.close()

Given a file name and a key word list, this function will print any line that contains all of the words in the key word list:

def matchAllWords(fn, keyList):
    pattList = [re.compile('(?<![a-z])%s(?![a-z])' % key) for key in keyList]
    f = open(fn)
    for line in f:
        matchList = []
        for patt in pattList:
            matchList.append(patt.search(line.lower()))
        print matchList
        if None not in matchList:
            print line
    f.close()

Great, thanks, bvdet. I'll try to incorporate that.
Here is an interactive exercise:

>>> keyList = ['thread', 'needle']
>>> patt = re.compile('(?<![a-z])%s(?![a-z])' % '(?![a-z])|(?<![a-z])'.join(keyList), re.IGNORECASE)
>>> patt.search('The thread was threaded through several needles')
<_sre.SRE_Match object at 0x00DB6138>
>>> print patt.search('The threads were threaded through several needles')
None
>>> pattList = [re.compile('(?<![a-z])%s(?![a-z])' % key) for key in keyList]
>>> for patt in pattList:
...     print patt.search('The thread was threaded through several needles')
...
<_sre.SRE_Match object at 0x00DB62F8>
None
>>> for patt in pattList:
...     print patt.search('The thread was threaded through a needle')
...
<_sre.SRE_Match object at 0x00DB6288>
<_sre.SRE_Match object at 0x00DB6288>
>>>
The re expression was modified to exclude matches if a key word was preceded or followed by any letter in the set '[a-z]'.
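The same lookaround idea can be packaged so that "any word" and "all words" differ only in the final test. The sketch below is in Python 3 syntax (an assumption; the thread uses Python 2.5); the helper names `word_patterns`, `matches_any` and `matches_all` are hypothetical, and `re.escape` plus `re.IGNORECASE` on every pattern are additions not present in the functions posted above.

```python
import re


def word_patterns(key_list):
    """One pattern per key word, rejecting matches that touch another letter."""
    return [re.compile(r'(?<![a-z])%s(?![a-z])' % re.escape(key), re.IGNORECASE)
            for key in key_list]


def matches_any(line, key_list):
    """True if at least one key word appears in line as a whole word."""
    return any(p.search(line) for p in word_patterns(key_list))


def matches_all(line, key_list):
    """True only if every key word appears in line as a whole word."""
    return all(p.search(line) for p in word_patterns(key_list))


keys = ['thread', 'needle']
print(matches_any('The thread was threaded through several needles', keys))  # True
print(matches_all('The thread was threaded through several needles', keys))  # False
print(matches_all('The thread was threaded through a needle', keys))         # True
```

With `re.IGNORECASE` set on each pattern there is no need to lower-case the line first, which also keeps the printed line in its original case.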