efficient text file search.

Is there a more efficient method to find a string in a text file than:

f = file('somefile')
for line in f:
    if 'string' in line:
        print 'FOUND'

?

BTW:
does "for line in f:" read a block of lines into memory, or does it
simply call f.readline() many times?

thanks
amit

Sep 11 '06 #1

"noro" <am******@gmail.com> wrote in message
news:11**********************@h48g2000cwc.googlegroups.com...
> Is there a more efficient method to find a string in a text file than:
>
> f = file('somefile')
> for line in f:
>     if 'string' in line:
>         print 'FOUND'

Yes, more efficient would be:
grep (http://www.gnu.org/software/grep/)

Sep 11 '06 #2
:)

via python...

Luuk wrote:
> "noro" <am******@gmail.com> wrote in message
> news:11**********************@h48g2000cwc.googlegroups.com...
> > Is there a more efficient method to find a string in a text file than:
> >
> > f = file('somefile')
> > for line in f:
> >     if 'string' in line:
> >         print 'FOUND'
>
> Yes, more efficient would be:
> grep (http://www.gnu.org/software/grep/)
Sep 11 '06 #3

"noro" <am******@gmail.com> wrote in message
news:11********************@h48g2000cwc.googlegroups.com...
> :)
>
> via python...
>
> Luuk wrote:
> > "noro" <am******@gmail.com> wrote in message
> > news:11**********************@h48g2000cwc.googlegroups.com...
> > > Is there a more efficient method to find a string in a text file than:
> > >
> > > f = file('somefile')
> > > for line in f:
> > >     if 'string' in line:
> > >         print 'FOUND'
> >
> > Yes, more efficient would be:
> > grep (http://www.gnu.org/software/grep/)

OK, a more serious answer:

Some googling turned up the following.
Second paragraph of chapter 14 of http://www.amk.ca/python/2.1/

The speed of line-oriented file I/O has been improved because people
often complain about its lack of speed, and because it's often been used as
a naïve benchmark. The readline() method of file objects has therefore been
rewritten to be much faster. The exact amount of the speedup will vary from
platform to platform depending on how slow the C library's getc() was, but
is around 66%, and potentially much faster on some particular operating
systems. Tim Peters did much of the benchmarking and coding for this change,
motivated by a discussion in comp.lang.python.

A new module and method for file objects was also added, contributed by Jeff
Epler. The new method, xreadlines(), is similar to the existing xrange()
built-in. xreadlines() returns an opaque sequence object that only supports
being iterated over, reading a line on every iteration but not reading the
entire file into memory as the existing readlines() method does. You'd use
it like this:

for line in sys.stdin.xreadlines():
    # ... do something for each line ...
    ...

For a fuller discussion of the line I/O changes, see the python-dev summary
for January 1-15, 2001 at http://www.amk.ca/python/dev/2001-01-1.html.

Sep 11 '06 #4
noro wrote:
> Is there a more efficient method to find a string in a text file than:
>
> f = file('somefile')
> for line in f:
>     if 'string' in line:
>         print 'FOUND'

Probably better to read the whole file at once if it isn't too big:

f = file('somefile')
data = f.read()
if 'string' in data:
    print 'FOUND'
Sep 11 '06 #5
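
For files that are too big to read in one go, a middle ground is to read fixed-size blocks and search each one. The following is only a sketch, written for Python 3 rather than the Python 2 used in this thread; the function name `file_contains` and the 1 MiB default block size are assumptions chosen for illustration, and the needle is taken as bytes because the file is opened in binary mode.

```python
def file_contains(path, needle, chunk_size=1 << 20):
    """Return True if the bytes `needle` occur anywhere in the file at `path`.

    Hypothetical helper: reads the file in blocks instead of all at once.
    """
    if not needle:
        return True  # the empty string is found in any file
    overlap = len(needle) - 1  # bytes to carry over between blocks
    tail = b""
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                return False  # end of file, no match seen
            window = tail + chunk
            if needle in window:
                return True
            # Keep the last len(needle) - 1 bytes so that a match which
            # straddles two blocks is still seen on the next pass.
            tail = window[-overlap:] if overlap else b""
```

Memory use stays bounded by roughly `chunk_size + len(needle)`, and the search stops at the first match.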
Ant

noro wrote:
> Is there a more efficient method to find a string in a text file than:
>
> f = file('somefile')
> for line in f:
>     if 'string' in line:
>         print 'FOUND'
          break
          ^^^^^
Add a 'break' after the print statement - that way you won't have to
read the entire file unless the string isn't there. That's probably not
the sort of advice you're after though :-)

I can't see why reading the entire file in, as the other poster suggested,
would help. Seeing as "for line in f:" is now regarded as the
pythonic way of working with lines of text in a file, I'd assume
that the implementation is at least as fast as
"for line in f.xreadlines():".

Sep 11 '06 #6
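
The early-exit idea can also be written without an explicit loop. This is a minimal sketch in Python 3 syntax (the thread itself uses Python 2); the function name `contains_line_match` is made up for the example. `any()` stops consuming the file as soon as one line matches, which has the same effect as the `break`. On the buffering question from the first post: in Python 2 the file iterator uses a hidden read-ahead buffer, so "for line in f" does not go back to the operating system once per line.

```python
def contains_line_match(path, needle):
    """Return True if any single line of the text file contains `needle`.

    Hypothetical helper: any() short-circuits, so the rest of the file
    is not read once a matching line has been found.
    """
    with open(path, "r") as f:
        return any(needle in line for line in f)
```

Because the test is done line by line, a needle that itself contains a newline will never match with this approach.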

Luuk wrote:
> [snip]
> Some googling turned up the following.
> Second paragraph of chapter 14 of http://www.amk.ca/python/2.1/
> [snip]
> For a fuller discussion of the line I/O changes, see the python-dev summary
> for January 1-15, 2001 at http://www.amk.ca/python/dev/2001-01-1.html.

That is *HISTORY*. That is Python 2.1. That is the year 2001.
xreadlines is as dead as a dodo.

Sep 11 '06 #7

"John Machin" <sj******@lexicon.net> wrote in message
news:11**********************@d34g2000cwd.googlegroups.com...
> Luuk wrote:
> > [snip]
> > Some googling turned up the following.
> > Second paragraph of chapter 14 of http://www.amk.ca/python/2.1/
> > [snip]
> > For a fuller discussion of the line I/O changes, see the python-dev
> > summary for January 1-15, 2001 at
> > http://www.amk.ca/python/dev/2001-01-1.html.
>
> That is *HISTORY*. That is Python 2.1. That is the year 2001.
> xreadlines is as dead as a dodo.

That's why I started my reply with:
"Some googling turned up the following."
I did not state that further googling was unneeded ;-)
Sep 11 '06 #8
noro wrote:
> Is there a more efficient method to find a string in a text file than:
>
> f = file('somefile')
> for line in f:
>     if 'string' in line:
>         print 'FOUND'
>
> ?

Is this something you want to do only once for a given file? The
replies so far seem to imply so, and in that case I doubt that you can
do anything more efficient. OTOH, if the same file is to be searched
repeatedly for different strings, an appropriate indexing scheme can
speed things up considerably on average.

George

Sep 11 '06 #9
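
The post does not say which indexing scheme is meant, so the following is just one possible sketch, in Python 3 syntax: an inverted index from whitespace-separated words to the line numbers on which they occur. The names `build_word_index` and `lookup` are invented for the example, and the scheme only answers whole-word queries, not arbitrary substrings.

```python
from collections import defaultdict


def build_word_index(path):
    """Scan the file once and map each word to the set of line numbers
    (1-based) on which it appears. Hypothetical helper."""
    index = defaultdict(set)
    with open(path, "r") as f:
        for lineno, line in enumerate(f, start=1):
            for word in line.split():  # assumption: whitespace tokenisation
                index[word].add(lineno)
    return index


def lookup(index, word):
    """Return the sorted line numbers containing `word`, or [] if absent."""
    return sorted(index.get(word, ()))
```

Building the index costs one full pass over the file; after that each lookup is a dictionary access, which is where the saving for repeated searches would come from.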
OK, I am not sure why, but

fList = file('somefile').read()
if fList.find('string') != -1:
    print 'FOUND'

works much, much faster.

It is strange, since I thought 'for line in file('somefile')' is
optimized and reads pages into memory.
I guess not...

George Sakkis wrote:
> noro wrote:
> > Is there a more efficient method to find a string in a text file than:
> >
> > f = file('somefile')
> > for line in f:
> >     if 'string' in line:
> >         print 'FOUND'
> >
> > ?
>
> Is this something you want to do only once for a given file? The
> replies so far seem to imply so, and in that case I doubt that you can
> do anything more efficient. OTOH, if the same file is to be searched
> repeatedly for different strings, an appropriate indexing scheme can
> speed things up considerably on average.
>
> George
Sep 11 '06 #10
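
A claim like "much, much faster" is easy to check on your own file. Below is a rough timing sketch in Python 3 syntax using the standard `timeit` module; the function names and the `repeats` parameter are assumptions made for the example. One difference worth keeping in mind is that the loop in the first post has no `break` and so always scans the whole file, whereas `find()` stops at the first match; the line-by-line version here returns early to make the comparison fairer.

```python
import timeit


def search_by_line(path, needle):
    """Line-by-line search that stops at the first matching line."""
    with open(path, "r") as f:
        for line in f:
            if needle in line:
                return True
    return False


def search_whole_file(path, needle):
    """Read the whole file into memory, then search it once."""
    with open(path, "r") as f:
        return f.read().find(needle) != -1


def compare(path, needle, repeats=5):
    """Return the best wall-clock time in seconds for each approach."""
    return {
        fn.__name__: min(
            timeit.repeat(lambda: fn(path, needle), number=1, repeat=repeats)
        )
        for fn in (search_by_line, search_whole_file)
    }
```

Calling `compare('somefile', 'string')` returns a dict with one timing per function. The numbers depend on file size, where the match sits, and the operating system's file cache, so the first run is usually not representative.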
