Searching a string in a multiple-file directory

11
Hello again!

I've been discussing the advantages of Python over C-Shell. I was asked if I can replace the use of "grep" in searching for a specific string in a multi-file directory - I have to go through all the files, and find all the occurrences of a specific string.
Can anyone suggest an idea?

Thanks!
Sep 2 '07 #1
5 2570
bartonc
6,596 Expert 4TB
(IMO) Python makes a great adjunct to your OS's tools because it lets you write very powerful scripts. But if the OS already has a tool that does the job, the only reason to rewrite it would be academic. In that light, Python does have:
  • The glob module (essentially one function: glob())
      • Use glob() to get a list of files matching a pattern, following the rules used by the Unix shell
  • The re module
      • Do regular expression matching on the entire contents of the file
And there you'd have grep. Good practice, maybe, on *nix, but very handy on an OS that doesn't have a native grep.
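A minimal sketch of the glob-plus-re idea, not a full grep replacement. The function name grep_glob and its return format are made up for illustration, and it matches line by line (as grep does) rather than against the whole file contents at once:

```python
import glob
import os
import re


def grep_glob(pattern, file_glob):
    """Return (path, line_number, line) for every line that matches pattern.

    grep_glob is a hypothetical helper name, not part of the standard library.
    """
    regex = re.compile(pattern)
    hits = []
    for path in sorted(glob.glob(file_glob)):
        if not os.path.isfile(path):
            continue  # skip directories that happen to match the glob
        # errors="replace" keeps oddly encoded files from raising an exception
        with open(path, encoding="utf-8", errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                if regex.search(line):
                    hits.append((path, number, line.rstrip("\n")))
    return hits
```

Calling grep_glob(r"TODO", "*.txt") would return the matching lines from every .txt file in the current directory; printing them as path:number:line gives output close to grep -n.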
Sep 2 '07 #2
peruron
11
Okay, thanks, but I think we have a little misunderstanding: I'm interested in searching for a certain string inside the files, not in the path names. If I understood the glob module correctly, it searches a given path.
My purpose is a bit more than academic - I'm trying to convert people to Python (:
Sep 2 '07 #3
bartonc
6,596 Expert 4TB
Any misunderstanding is my fault. I have very little experience with Unix-style commands (maybe I'm thinking of something called egrep, which searches a set of files, glob-style, and then applies a regex to each file).

Anyway, this type of script takes very few lines of Python code, and I find the docs (on Windows, anyway) to be very friendly.
The trickiest part is working with the re module, which employs a Pythonic interface (far from Perl-style syntax) and has a few odd flags for working with multi-line text (re.DOTALL, for example). You'll tend to find the usage:
    import re
    myPattObj = re.compile('some[regex]pattern')  # compile once, reuse for every file
    resList = myPattObj.findall(someTextObject)   # someTextObject: the text to search
which returns a list of the matched strings (assuming any match); finditer() instead yields a bunch of match objects that have their own set of attributes. It's all so very OOP - perhaps the bigger conversion issue..?
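A small sketch of how the re calls behave, using made-up sample text: findall() hands back the matched strings, finditer() yields match objects that carry positions, and re.DOTALL lets "." cross line breaks.

```python
import re

text = "spam and eggs, then more spam"
pattern = re.compile(r"spam")

# findall() returns the matched substrings as a plain list of strings.
found = pattern.findall(text)

# finditer() yields match objects, which expose group(), start() and end().
spans = [(m.group(), m.start(), m.end()) for m in pattern.finditer(text)]

# By default "." does not match a newline; re.DOTALL changes that.
multi = "begin\nmiddle\nend"
without_flag = re.search(r"begin.*end", multi)
with_flag = re.search(r"begin.*end", multi, re.DOTALL)

print(found)
print(spans)
print(without_flag is None, with_flag is not None)
```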

[EDIT: maybe we could just write a people-to-python converter-LOL]
Sep 2 '07 #4
ghostdog74
511 Expert 256MB
interested in searching for a certain string inside the files, and not as a path name.
You can use os.walk() or os.listdir() to get a list of files, then iterate over them with a for loop. While iterating, open each file for reading and use another for loop to go through it line by line. You can check for a plain string with the "in" operator, e.g. if "string_to_search" in line: ... For more complex matching, you can use regular expressions (as barton mentioned); that's what grep does behind the scenes too. Do note, however, that not every problem needs to be solved with a regular expression.
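The os.walk approach could be sketched roughly like this. The function name find_in_tree and the tuple it returns are illustrative choices, not anything from the standard library, and the text decoding settings are an assumption about the files being searched:

```python
import os


def find_in_tree(needle, root="."):
    """Return (path, line_number, line) for each line containing needle.

    find_in_tree is a hypothetical helper name used only for this sketch.
    """
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            try:
                with open(path, encoding="utf-8", errors="replace") as handle:
                    for number, line in enumerate(handle, start=1):
                        if needle in line:  # plain substring test, no regex
                            hits.append((path, number, line.rstrip("\n")))
            except OSError:
                continue  # unreadable file: skip it and carry on
    return hits
```

Because os.walk() descends into subdirectories, this behaves more like a recursive grep; swapping it for os.listdir() would limit the search to a single directory.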
If I understood correctly the glob module, it searches a given path.
Yes, it does.
My purpose is a bit more than academic - I'm trying to convert people to Python (:
Wise move.
Sep 3 '07 #5
bartonc
6,596 Expert 4TB
Here's a great link for you: Python Advocacy HOWTO.
Sep 14 '07 #6