efficient text file search.

Is there a more efficient method to find a string in a text file than:

f=file('somefile')
for line in f:
    if 'string' in line:
        print 'FOUND'

?

BTW:
does "for line in f:" read a block of lines into memory, or does it
simply call f.readline() many times?

thanks
amit

Sep 11 '06 #1

"noro" <am******@gmail.com> wrote in message
news:11**********************@h48g2000cwc.googlegroups.com...
> Is there a more efficient method to find a string in a text file than:
>
> f=file('somefile')
> for line in f:
>     if 'string' in line:
>         print 'FOUND'

yes, more efficient would be:
grep (http://www.gnu.org/software/grep/)

Sep 11 '06 #2
:)

via python...

Luuk wrote:
> "noro" <am******@gmail.com> wrote in message
> news:11**********************@h48g2000cwc.googlegroups.com...
>> Is there a more efficient method to find a string in a text file than:
>>
>> f=file('somefile')
>> for line in f:
>>     if 'string' in line:
>>         print 'FOUND'
>
> yes, more efficient would be:
> grep (http://www.gnu.org/software/grep/)
Sep 11 '06 #3

"noro" <am******@gmail.com> wrote in message
news:11********************@h48g2000cwc.googlegroups.com...
> :)
>
> via python...
>
> Luuk wrote:
>> "noro" <am******@gmail.com> wrote in message
>> news:11**********************@h48g2000cwc.googlegroups.com...
>>> Is there a more efficient method to find a string in a text file than:
>>>
>>> f=file('somefile')
>>> for line in f:
>>>     if 'string' in line:
>>>         print 'FOUND'
>>
>> yes, more efficient would be:
>> grep (http://www.gnu.org/software/grep/)

ok, a more serious answer:

some googling turned up the following.
Second paragraph of chapter 14 of http://www.amk.ca/python/2.1/

The speed of line-oriented file I/O has been improved because people
often complain about its lack of speed, and because it's often been used as
a naïve benchmark. The readline() method of file objects has therefore been
rewritten to be much faster. The exact amount of the speedup will vary from
platform to platform depending on how slow the C library's getc() was, but
is around 66%, and potentially much faster on some particular operating
systems. Tim Peters did much of the benchmarking and coding for this change,
motivated by a discussion in comp.lang.python.

A new module and method for file objects was also added, contributed by Jeff
Epler. The new method, xreadlines(), is similar to the existing xrange()
built-in. xreadlines() returns an opaque sequence object that only supports
being iterated over, reading a line on every iteration but not reading the
entire file into memory as the existing readlines() method does. You'd use
it like this:
for line in sys.stdin.xreadlines():
    # ... do something for each line ...
    ...
For a fuller discussion of the line I/O changes, see the python-dev summary
for January 1-15, 2001 at http://www.amk.ca/python/dev/2001-01-1.html.

Sep 11 '06 #4
noro wrote:
> Is there a more efficient method to find a string in a text file than:
>
> f=file('somefile')
> for line in f:
>     if 'string' in line:
>         print 'FOUND'

Probably better to read the whole file at once if it isn't too big:

f = file('somefile')
data = f.read()
if 'string' in data:
    print 'FOUND'
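
When the file is too big to read in one go, a middle ground is to read it in large chunks and carry a small overlap between them, so a match that straddles a chunk boundary is not missed. The following is only a sketch under assumptions: it is written for Python 3 (the thread itself uses Python 2 syntax), it searches bytes rather than text, and the name `chunked_contains` and the 1 MiB default chunk size are illustrative choices, not something from the thread.

```python
def chunked_contains(path, needle, chunk_size=1 << 20):
    """Return True if the bytes `needle` occurs in the file at `path`.

    Hypothetical helper: reads fixed-size chunks instead of the whole file.
    """
    if not needle:
        return True  # the empty needle is contained in everything
    overlap = len(needle) - 1
    tail = b""
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                return False  # end of file, nothing found
            window = tail + chunk
            if needle in window:
                return True
            # Keep the last len(needle) - 1 bytes so that a match split
            # across two chunks shows up in the next window.
            tail = window[-overlap:] if overlap else b""
```

Memory use stays bounded by roughly `chunk_size + len(needle)` regardless of file size, while each substring test still runs over a large block.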
Sep 11 '06 #5
Ant

noro wrote:
> Is there a more efficient method to find a string in a text file than:
>
> f=file('somefile')
> for line in f:
>     if 'string' in line:
>         print 'FOUND'
          break
          ^^^
Add a 'break' after the print statement - that way you won't have to
read the entire file unless the string isn't there. That's probably not
the sort of advice you're after though :-)

Can't see why reading the entire file in as the other poster suggested
would help, and seeing as "for line in f:" is now regarded as the
pythonic way of working with lines of text in a file, then I'd assume
that the implementation would be at least as fast as "for line in
f.xreadlines():"
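
The early exit suggested above can also be wrapped in a small function. This is a sketch only, assuming Python 3 and a UTF-8 text file; `file_contains` is a made-up name for illustration.

```python
def file_contains(path, needle, encoding="utf-8"):
    """Return True as soon as any line of the file contains `needle`."""
    with open(path, encoding=encoding) as f:
        # any() stops pulling lines from the file iterator at the first
        # match, which gives the same early exit as an explicit 'break'.
        return any(needle in line for line in f)
```

Because the search is done line by line, a needle that spans a line break will not be found this way.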

Sep 11 '06 #6

Luuk wrote:
[snip]
> some googling turned up the following.
> Second paragraph of chapter 14 of http://www.amk.ca/python/2.1/
[snip]
> For a fuller discussion of the line I/O changes, see the python-dev summary
> for January 1-15, 2001 at http://www.amk.ca/python/dev/2001-01-1.html.

That is *HISTORY*. That is Python 2.1. That is the year 2001.
xreadlines is as dead as a dodo.

Sep 11 '06 #7

"John Machin" <sj******@lexicon.net> wrote in message
news:11**********************@d34g2000cwd.googlegroups.com...
>
> Luuk wrote:
> [snip]
>> some googling turned up the following.
>> Second paragraph of chapter 14 of http://www.amk.ca/python/2.1/
> [snip]
>> For a fuller discussion of the line I/O changes, see the python-dev summary
>> for January 1-15, 2001 at http://www.amk.ca/python/dev/2001-01-1.html.
>
> That is *HISTORY*. That is Python 2.1. That is the year 2001.
> xreadlines is as dead as a dodo.

That's why I started my reply with:
"some googling turned up the following."
I did not state that further googling was unneeded ;-)
Sep 11 '06 #8
noro wrote:
> Is there a more efficient method to find a string in a text file than:
>
> f=file('somefile')
> for line in f:
>     if 'string' in line:
>         print 'FOUND'
>
> ?

Is this something you want to do only once for a given file? The
replies so far seem to imply so and in this case I doubt that you can
do anything more efficient. OTOH, if the same file is to be searched
repeatedly for different strings, an appropriate indexing scheme can
speed things up considerably on average.
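
One very simple form such an indexing scheme could take is a word-to-line-numbers map built once and then queried many times. This is only a sketch under assumptions: it uses Python 3, it indexes whole words (so it cannot answer arbitrary substring queries), it lowercases everything, and the names `build_word_index` and `lookup` are invented for illustration.

```python
import re
from collections import defaultdict


def build_word_index(path, encoding="utf-8"):
    """Map each lowercased word to the sorted 1-based line numbers containing it."""
    index = defaultdict(set)
    with open(path, encoding=encoding) as f:
        for lineno, line in enumerate(f, start=1):
            for word in re.findall(r"\w+", line.lower()):
                index[word].add(lineno)
    return {word: sorted(lines) for word, lines in index.items()}


def lookup(index, word):
    """Return the line numbers for `word`, or an empty list if it never occurs."""
    return index.get(word.lower(), [])
```

Building the index costs one full pass over the file; after that each lookup is a dictionary access instead of another scan.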

George

Sep 11 '06 #9
OK, I am not sure why, but

fList=file('somefile').read()
if fList.find('string') != -1:
    print 'FOUND'

works much, much faster.

It is strange, since I thought 'for line in file('somefile')' is
optimized and reads pages into memory.
I guess not...
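
One plausible explanation, not confirmed anywhere in this thread, is that a single find() over one big string runs entirely in C, while the line loop creates a string object and executes interpreter bytecode for every line. If reading the whole file into memory is a concern, memory-mapping gives a similar single-search approach. The sketch below assumes Python 3 and a bytes needle; `mmap_contains` is a hypothetical name.

```python
import mmap
import os


def mmap_contains(path, needle):
    """Return True if the bytes `needle` occurs anywhere in the file at `path`."""
    if os.path.getsize(path) == 0:
        # An empty file cannot be memory-mapped; it contains only the empty needle.
        return needle == b""
    with open(path, "rb") as f:
        # Length 0 maps the whole file; ACCESS_READ keeps the mapping read-only.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            return mm.find(needle) != -1
```

The operating system pages the file in on demand, so the search does not need a Python-level copy of the entire contents.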

George Sakkis wrote:
> noro wrote:
>> Is there a more efficient method to find a string in a text file than:
>>
>> f=file('somefile')
>> for line in f:
>>     if 'string' in line:
>>         print 'FOUND'
>>
>> ?
>
> Is this something you want to do only once for a given file? The
> replies so far seem to imply so and in this case I doubt that you can
> do anything more efficient. OTOH, if the same file is to be searched
> repeatedly for different strings, an appropriate indexing scheme can
> speed things up considerably on average.
>
> George
Sep 11 '06 #10
