
Look for a string on a file and get its line number

P: n/a
Hi,

I have to search for a string in a big file. Once this string is
found, I need to get the number of the line on which the string
is located in the file. Do you know if this is possible to do in
Python?

Thanks
Jan 8 '08 #1
11 Replies


P: n/a
-On [20080108 09:21], Horacius ReX (ho**********@gmail.com) wrote:
>I have to search for a string on a big file. Once this string is
>found, I would need to get the number of the line in which the string
>is located on the file. Do you know how if this is possible to do in
>python ?

(Assuming ASCII, otherwise check out codecs.open().)

big_file = open('bigfile.txt', 'r')

line_nr = 0
for line in big_file:
    line_nr += 1
    has_match = line.find('my-string')
    if has_match > 0:
        print 'Found in line %d' % (line_nr)

Something to this effect.

--
Jeroen Ruigrok van der Werven <asmodai(-at-)in-nomine.org> / asmodai
イェルーン ラウフロック ヴァン デル ウェルヴェン
http://www.in-nomine.org/ | http://www.rangaku.org/
If you think that you know much, you know little...
Jan 8 '08 #2

P: n/a
On Jan 8, 7:33 pm, Jeroen Ruigrok van der Werven <asmo...@in-nomine.org> wrote:
>-On [20080108 09:21], Horacius ReX (horacius....@gmail.com) wrote:
>>I have to search for a string on a big file. Once this string is
>>found, I would need to get the number of the line in which the string
>>is located on the file. Do you know how if this is possible to do in
>>python ?
>
>(Assuming ASCII, otherwise check out codecs.open().)
>
>big_file = open('bigfile.txt', 'r')
>
>line_nr = 0
>for line in big_file:
>    line_nr += 1
>    has_match = line.find('my-string')
>    if has_match > 0:

Make that >=

| >>> 'fubar'.find('fu')
| 0
| >>>

>        print 'Found in line %d' % (line_nr)
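
A minimal sketch of the loop with that correction applied, written for Python 3 (the thread itself uses Python 2 print statements). The function name `find_line_numbers` and the sample lines are made up for illustration:

```python
def find_line_numbers(lines, needle):
    """Return the 1-based numbers of all lines that contain needle."""
    hits = []
    line_nr = 0
    for line in lines:
        line_nr += 1
        # str.find() returns the offset of the match, or -1 if there is none,
        # so offset 0 (a match at the very start of the line) is a valid hit.
        if line.find(needle) >= 0:
            hits.append(line_nr)
    return hits


# Hypothetical sample data; any iterable of lines (such as an open file) works.
sample = ["fubar\n", "nothing here\n", "snafu and fubar\n"]
print(find_line_numbers(sample, "fu"))  # prints [1, 3]
```

With `> 0` instead of `>= 0`, line 1 would be skipped, because the match there starts at offset 0.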
Jan 8 '08 #3

P: n/a
-On [20080108 09:51], John Machin (sj******@lexicon.net) wrote:
>Make that >=
Right you are. Sorry, was doing it quickly from work. ;)

And I guess find() will also be less precise if the word you are looking for
is part of a bigger word. E.g. searching for 'door' also matches a line that
only has 'doorway' in it.

So 't is merely for inspiration. ;)
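
If whole-word matches are wanted, one option is a regular expression with word boundaries. This is only a sketch for Python 3; `find_word_lines` and the sample lines are hypothetical, and `\b` treats letters, digits and underscores as word characters, which may or may not suit your data:

```python
import re


def find_word_lines(lines, word):
    """Return 1-based numbers of lines containing word as a whole word."""
    # re.escape() keeps characters such as '.' or '-' in word literal.
    pattern = re.compile(r"\b" + re.escape(word) + r"\b")
    return [n for n, line in enumerate(lines, start=1) if pattern.search(line)]


sample = ["the doorway is narrow\n", "close the door\n", "indoor plants\n"]

# Plain substring test: matches all three lines.
substring_hits = [n for n, line in enumerate(sample, start=1) if "door" in line]

# Word-boundary test: matches only the line with the standalone word.
word_hits = find_word_lines(sample, "door")
print(substring_hits, word_hits)  # prints [1, 2, 3] [2]
```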

Jan 8 '08 #4

P: n/a
-On [20080108 09:51], John Machin (sj******@lexicon.net) wrote:
>Make that >=
>
>| >>> 'fubar'.find('fu')

Or even just:

if 'my-string' in line:
    ...

Same caveat emptor applies though.

Jan 8 '08 #5

P: n/a
Jeroen Ruigrok van der Werven wrote:
>-On [20080108 09:21], Horacius ReX (ho**********@gmail.com) wrote:
>>I have to search for a string on a big file. Once this string is
>>found, I would need to get the number of the line in which the string
>>is located on the file. Do you know how if this is possible to do in
>>python ?
>
>(Assuming ASCII, otherwise check out codecs.open().)
>
>big_file = open('bigfile.txt', 'r')
>
>line_nr = 0
>for line in big_file:
>    line_nr += 1
>    has_match = line.find('my-string')
>    if has_match > 0:
>        print 'Found in line %d' % (line_nr)
>
>Something to this effect.

Apart from that, look at the linecache module. If it's a big file, it could
help you with subsequent access to the line in question.
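
A small sketch of how that could look in Python 3. The temporary file and its contents are made up so that the example runs on its own; in practice `path` would be the name of the big file:

```python
import linecache
import os
import tempfile

# Write a small throwaway file so the example is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("alpha\nbeta my-string\ngamma\n")
    path = tmp.name

# First pass: find the 1-based line number of the first match.
match_nr = None
with open(path) as f:
    for line_nr, line in enumerate(f, start=1):
        if "my-string" in line:
            match_nr = line_nr
            break

# Later lookups by line number. linecache keeps the file's lines in memory
# and returns '' (instead of raising) for a line that does not exist.
matched = linecache.getline(path, match_nr)
following = linecache.getline(path, match_nr + 1)
missing = linecache.getline(path, 99)

linecache.clearcache()
os.remove(path)
```

Note that linecache reads the whole file into memory on first access, so it trades memory for fast repeated lookups.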

hth
martin

--
http://noneisyours.marcher.name
http://feeds.feedburner.com/NoneIsYours

You are not free to read this message,
by doing so, you have violated my licence
and are required to urinate publicly. Thank you.

Jan 8 '08 #6

P: n/a
On Behalf Of Horacius ReX
>I have to search for a string on a big file. Once this string
>is found, I would need to get the number of the line in which
>the string is located on the file. Do you know how if this is
>possible to do in python ?

This should be reasonable:

>>> for num, line in enumerate(open("/python25/readme.txt")):
...     if "Guido" in line:
...         print "Found Guido on line", num
...         break
...
Found Guido on line 1296
>>>

Regards,
Ryan Ginstrom

Jan 8 '08 #7

P: n/a
Jeroen Ruigrok van der Werven wrote:
>line_nr = 0
>for line in big_file:
>    line_nr += 1
>    has_match = line.find('my-string')
>    if has_match > 0:
>        print 'Found in line %d' % (line_nr)

Style note:
May I suggest enumerate (I find the explicit counting somewhat clunky)
and maybe turning it into a generator (I like generators):

def lines(big_file, pattern="my string"):
    for n, line in enumerate(big_file):
        if pattern in line:
            print 'Found in line %d' % (n)
            yield n

or for direct use, how about a simple list comprehension:

lines = [n for (n, line) in enumerate(big_file) if "my string" in line]

(If you're just going to iterate over the result, that is, you do not
need indexing, replace the brackets with parentheses. That way you get a
generator and don't have to build a complete list. This is especially
useful if you expect many hits.)
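
To make the difference concrete, here is a short Python 3 sketch. The sample text is invented, and `io.StringIO` only stands in for an open file:

```python
import io

text = "spam\nmy string here\neggs\nanother my string\n"

# List comprehension: builds the complete list of (0-based) hit indices.
hits_list = [n for n, line in enumerate(io.StringIO(text)) if "my string" in line]

# Generator expression: same logic, but hits are produced one at a time.
hits_gen = (n for n, line in enumerate(io.StringIO(text)) if "my string" in line)
first_hit = next(hits_gen)  # reads only as far as the first match
rest = list(hits_gen)       # consumes whatever is left

print(hits_list, first_hit, rest)  # prints [1, 3] 1 [3]
```

A generator can only be iterated once, so keep the list form if the hits are needed more than once.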

Just a note.

regards
/W
Jan 8 '08 #8

P: n/a
-On [20080108 12:59], Wildemar Wildenburger (la*********@klapptsowieso.net) wrote:
>Style note:
>May I suggest enumerate (I find the explicit counting somewhat clunky)
>and maybe turning it into a generator (I like generators):

Sure, I still have a lot to discover myself with Python.

I'll study your examples, thanks. :)

Jan 8 '08 #9

P: n/a
>>I have to search for a string on a big file. Once this string
>>is found, I would need to get the number of the line in which
>>the string is located on the file. Do you know how if this is
>>possible to do in python ?
>
>This should be reasonable:
>
> >>> for num, line in enumerate(open("/python25/readme.txt")):
> ...     if "Guido" in line:
> ...         print "Found Guido on line", num
> ...         break
> ...
> Found Guido on line 1296

Just a small caveat here: enumerate() is zero-based, so you may
actually want to add one to the resulting number:

s = "Guido"
for num, line in enumerate(open("file.txt")):
    if s in line:
        print "Found %s on line %i" % (s, num + 1)
        break # optionally stop looking
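
As an alternative to adding one by hand, newer Python versions (2.6 and later, including Python 3) let enumerate() start counting at a given value. A minimal Python 3 sketch; `first_match` and the sample lines are hypothetical names for illustration:

```python
def first_match(lines, needle):
    """Return the 1-based number of the first line containing needle, or None."""
    # start=1 makes enumerate() count lines the way editors and grep -n do.
    for num, line in enumerate(lines, start=1):
        if needle in line:
            return num
    return None


sample = ["Python readme\n", "written by Guido\n", "Guido again\n"]
print(first_match(sample, "Guido"))  # prints 2
```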

Or one could use a tool made for the job:

grep -n Guido file.txt

or if you only want the first match:

sed -n '/Guido/{=;p;q}' file.txt

-tkc

Jan 8 '08 #10

P: n/a
Hi, thanks for the help. I then got the following code running:

#!/usr/bin/env python

import os, sys, re, string, array, linecache, math

nlach = 12532

lach_list = sys.argv[1]
lach_list_file = open(lach_list,"r")
lach_mol2 = sys.argv[2] # name of the lachand mol2 file
lach_mol2_file = open(lach_mol2,"r")
n_lach_read=int(sys.argv[3])

# Do the following for the total number of lachands

# 1. read the list with the ranked lachands
for i in range(1,n_lach_read+1):
    line = lach_list_file.readline()
    ll = string.split (line)
    #print i, ll[0]
    lach = int(ll[0])
    # 2. for each lachand, print mol2 file
    # 2a. find lachand header in lachand mol2 file (example; kanaka)
    # and return line number
    line_nr = 0
    for line in lach_mol2_file:
        line_nr += 1
        has_match = line.find('kanaka')
        if has_match >= 0:
            print 'Found in line %d' % (line_nr)
            # 2b. print on screen all the info for this lachand
            # (but first need to read natoms and nbonds info)
            # go to line line_nr + 1
            ltr=linecache.getline(lach_mol2, line_nr + 1)
            ll=ltr.split()
            #print ll[0],ll[1]
            nat=int(ll[0])
            nb=int(ll[1])
            # total lines to print:
            # header, 8
            # at, na
            # b header, 1
            # n
            # lastheaders, 2
            # so; nat + nb + 11
            ntotal_lines = nat + nb + 11
            # now we go to the beginning of the lachand
            # and print ntotal_lines
            for j in range(0,ntotal_lines):
                print linecache.getline(lach_mol2, line_nr - 1 + j )

which almost works. In the last "for j" loop, I expected to get
output like:

sdsdsdsdsdsd
sdsdsfdgdgdgdg
hdfgdgdgdg

but instead of this, i get:

sdsdsdsdsdsd

sdsdsfdgdgdgdg

hdfgdgdgdg

The program is also very slow. Do you know how I could solve
this?

thanks
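
On the blank lines: linecache.getline() returns each line with its trailing newline, and print adds a second one. On the speed: the inner loop scans the mol2 file from the top for each entry (and a file object that has been read to the end yields nothing on the next pass unless it is reopened or rewound). One possible approach, sketched here for Python 3 and assuming the file fits in memory, is to read the lines once and slice out the record. `extract_block`, its `extra` parameter and the sample record are hypothetical:

```python
def extract_block(lines, header, extra=11):
    """Return the lines of the first record whose header line contains header.

    Mirrors the layout assumed in the post: the line after the header holds
    the atom and bond counts, and the record starts one line before the header.
    """
    for idx, line in enumerate(lines):  # idx is 0-based
        if header in line:
            nat, nb = (int(x) for x in lines[idx + 1].split()[:2])
            start = max(idx - 1, 0)
            return lines[start:start + nat + nb + extra]
    return []


# Tiny made-up record; extra=3 stands in for the 11 fixed lines of a real file.
lines = ["record start\n", "kanaka\n", "2 1\n", "atom a\n",
         "atom b\n", "bond 1\n", "next record\n"]
block = extract_block(lines, "kanaka", extra=3)
for line in block:
    # rstrip("\n") drops the newline read from the file; print() adds one back.
    print(line.rstrip("\n"))
```

For a real file, `lines` would come from something like `open(lach_mol2).readlines()`, read once before the outer loop.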

Jan 8 '08 #11

P: n/a
On 8 jan, 03:19, Horacius ReX <horacius....@gmail.com> wrote:
>Hi,
>
>I have to search for a string on a big file. Once this string is
>found, I would need to get the number of the line in which the string
>is located on the file. Do you know how if this is possible to do in
>python ?
>
>Thanks

Hi, I'm no Python whizzkid, but you can do a lot with the .index()
method. If you do something like the code below, it returns the index of
the first character of your string as it is found in your file. Note that
because this converts the list returned by readlines() to a string, the
result also contains the escaped newlines and the brackets, quotes and
commas of the list representation, so if you want to know the exact line
you will have to work that out by experimenting with a small file. You can
always use a command like: print s[12]
if you want to see the character at index 12 of that string.
--------------------------------------------------------
infile = open("C:\\Users\\yourname\\Desktop\\", 'r')

f = infile.readlines()             # list of lines
s = str(f)                         # string representation of that list
yourstring = s.index('mystring')   # raises ValueError if not found
--------------------------------------------------------
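
If you go the offset route, a character offset can be turned into a line number by counting the newlines in front of it. This is a sketch for Python 3 that works on the raw file text (for example the result of `infile.read()`) rather than on `str()` of a list; `line_of_offset` and the sample text are made up for illustration:

```python
def line_of_offset(text, needle):
    """Return the 1-based line number where needle first occurs, or None."""
    offset = text.find(needle)  # like .index(), but -1 instead of ValueError
    if offset < 0:
        return None
    # Every newline before the match means one more completed line.
    return text.count("\n", 0, offset) + 1


text = "first line\nsecond line\nthird has mystring in it\n"
print(line_of_offset(text, "mystring"))  # prints 3
```

This reads the whole file into memory, so for very large files the line-by-line loops shown earlier in the thread are the safer choice.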

good luck,

)a((o
Jan 8 '08 #12
