Hi, thanks for the help. I now have the following code running:
#!/usr/bin/env python
import os, sys, re, string, array, linecache, math
nlach = 12532
lach_list = sys.argv[1]
lach_list_file = open(lach_list,"r")
lach_mol2 = sys.argv[2] # name of the lachand mol2 file
lach_mol2_file = open(lach_mol2,"r")
n_lach_read=int(sys.argv[3])
# Do the following for the total number of lachands
# 1. read the list with the ranked lachands
for i in range(1,n_lach_read+1):
    line = lach_list_file.readline()
    ll = string.split (line)
    #print i, ll[0]
    lach = int(ll[0])
    # 2. for each lachand, print mol2 file
    # 2a. find lachand header in lachand mol2 file (example; kanaka)
    # and return line number
    line_nr = 0
    for line in lach_mol2_file:
        line_nr += 1
        has_match = line.find('kanaka')
        if has_match >= 0:
            print 'Found in line %d' % (line_nr)
            # 2b. print on screen all the info for this lachand
            # (but first need to read natoms and nbonds info)
            # go to line line_nr + 1
            ltr=linecache.getline(lach_mol2, line_nr + 1)
            ll=ltr.split()
            #print ll[0],ll[1]
            nat=int(ll[0])
            nb=int(ll[1])
            # total lines to print:
            # header, 8
            # atoms, nat
            # bonds header, 1
            # bonds, nb
            # last headers, 2
            # so; nat + nb + 11
            ntotal_lines = nat + nb + 11
            # now we go to the beginning of the lachand
            # and print ntotal_lines
            for j in range(0,ntotal_lines):
                print linecache.getline(lach_mol2, line_nr - 1 + j )
which almost works. In the last "for j" loop, I expected to get
output like:
sdsdsdsdsdsd
sdsdsfdgdgdgdg
hdfgdgdgdg
but instead of this, I get:
sdsdsdsdsdsd

sdsdsfdgdgdgdg

hdfgdgdgdg
Also, the program is very slow. Do you know how I could solve
this?
Thanks
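If the unwanted difference is an extra blank line after every printed line, a likely cause is that each line read from the file still ends in "\n" and print then adds a second newline. The slowness may come from scanning the whole file without stopping at the match. Below is a minimal sketch of a single-pass alternative, not a tested replacement for the script above. The names extract_block and extra are hypothetical, and the layout assumptions (block starts one line before the match, counts on the line after it, nat + nb + 11 lines in total) are taken from the comments in the posted code.

```python
def extract_block(lines, needle, extra=11):
    """Return the lines of the first block whose name line contains needle.

    Assumptions (from the posted code, not from any mol2 specification):
    the block starts one line before the matching line, the line after
    the match begins with the atom and bond counts, and the block spans
    nat + nb + extra lines.
    """
    it = iter(lines)
    prev = ""  # line before the current one (empty if match is on line 1)
    for line in it:
        if needle in line:
            counts_line = next(it, "")
            fields = counts_line.split()
            nat, nb = int(fields[0]), int(fields[1])
            block = [prev, line, counts_line]
            # Read the rest of the block from the same iterator,
            # so the file is only traversed once.
            for _ in range(nat + nb + extra - len(block)):
                nxt = next(it, None)
                if nxt is None:  # file ended early
                    break
                block.append(nxt)
            return block  # stop at the first match
        prev = line
    return []
```

The returned lines keep their own newlines, so something like sys.stdout.write("".join(extract_block(open(lach_mol2), "kanaka"))) would print them without the doubled spacing.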
Tim Chase wrote:
I have to search for a string in a big file. Once this string
is found, I need to get the number of the line on which
the string is located in the file. Do you know if this is
possible to do in Python?
This should be reasonable:
>>> for num, line in enumerate(open("/python25/readme.txt")):
...     if "Guido" in line:
...         print "Found Guido on line", num
...         break
...
Found Guido on line 1296
Just a small caveat here: enumerate() is zero-based, so you may
actually want to add one to the resulting number:
s = "Guido"
for num, line in enumerate(open("file.txt")):
    if s in line:
        print "Found %s on line %i" % (s, num + 1)
        break # optionally stop looking
Or one could use a tool made for the job:
grep -n Guido file.txt
or if you only want the first match:
sed -n '/Guido/{=;p;q}' file.txt
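For comparison, a rough Python counterpart to grep -n could look like the sketch below. The function name grep_n is made up for illustration, and it only does plain substring matching, not regular expressions.

```python
def grep_n(path, needle):
    """Yield (line_number, line) for every line of path containing needle."""
    with open(path) as handle:
        # Passing 1 as the start value makes enumerate() count from one,
        # matching the line numbers that grep -n reports.
        for num, line in enumerate(handle, 1):
            if needle in line:
                yield num, line.rstrip("\n")
```

Looping over grep_n("file.txt", "Guido") gives every match, while next(grep_n("file.txt", "Guido"), None) stops at the first one, much like the sed variant.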
-tkc