Bytes IT Community

Iterating over a file in python

P: 19
Hello everyone, I wrote a post a while ago about automating a local client to access a BLAT webserver, but today I have a much easier one. I want to take a batch file and delete every odd line. See below:

Sample File:
>MusmusculusmiR-344
UGAUCUAGCCAAAGCCUGACUGU
>MusmusculusmiR-345
UGCUGACCCCUAGUCCAGUGC
>MusmusculusmiR-346
UGUCUGCCCGAGUGCCUGCCUCU
>MusmusculusmiR-350
UUCACAAAGCCCAUACACUUUCA

I need to delete all the lines that start with '>' and end with a '\n'. I have some code below, but it just isolates the part of the string I want to delete. I need to do the reverse... I know there is some easy way that I am totally missing here!

    #!/usr/bin/env python
    # written 7/28/2007
    # by Mark O'Connor

    def Resize( filename ):
        collect = []                      # header lines found so far
        fp = open( filename )
        data = fp.read()
        fp.close()
        #print data
        start = data.find('>')
        while start != -1:
            end = data.find('\n', start)  # end of the header line
            if end == -1:                 # last line has no trailing newline
                end = len(data)
            collect.append(data[start:end])
            start = data.find('>', end)   # look for the next header
        return collect

Thanks,

Mark
Jul 28 '07 #1
6 Replies


bartonc
Expert 5K+
P: 6,596
Hey Mark...
I'd use something like this:
    outList = []
    f = open(fileName)
    for line in f:
        if line.startswith('>'):
            continue  # skip header lines
        outList.append(line)
    f.close()
    f = open(newFileName, 'w') # or old one to replace it
    f.writelines(outList)
    f.close()
Untested, but generally sound.
Jul 29 '07 #2
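
For anyone who wants to try the loop above without preparing files by hand, here is a minimal sketch that wraps the same idea in a function. The function name `strip_header_lines` and the temporary file names are made up for illustration; they do not come from the thread.

```python
import os
import tempfile


def strip_header_lines(src_path, dst_path):
    """Copy src_path to dst_path, skipping lines that start with '>'."""
    kept = []
    with open(src_path) as src:
        for line in src:
            if line.startswith('>'):
                continue  # header line: drop it
            kept.append(line)
    # The source is closed by now, so dst_path may be the same file.
    with open(dst_path, 'w') as dst:
        dst.writelines(kept)
    return len(kept)  # number of lines written


# Demo with two records from the sample file in the question.
sample = (
    ">MusmusculusmiR-344\n"
    "UGAUCUAGCCAAAGCCUGACUGU\n"
    ">MusmusculusmiR-345\n"
    "UGCUGACCCCUAGUCCAGUGC\n"
)
workdir = tempfile.mkdtemp()
src_path = os.path.join(workdir, "input.txt")    # hypothetical file names
dst_path = os.path.join(workdir, "output.txt")
with open(src_path, 'w') as f:
    f.write(sample)

written = strip_header_lines(src_path, dst_path)
with open(dst_path) as f:
    result = f.read()
print(result)
```

Passing the same path for both arguments replaces the file in place. That works in this sketch because every kept line is already in memory before the file is reopened for writing.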

P: 19
Thanks! That did the trick

Mark
Jul 29 '07 #3

bartonc
Expert 5K+
P: 6,596
Files, like all iterators, have some pretty cool methods hung on them. That example just scratches the surface.

Any time,
Barton
Jul 29 '07 #4

P: 5
Or even:

    out = []
    with open(fileName) as f:
        for line in f:
            if line.startswith('>'):
                continue
            out.append(line)

    with open(newFileName, 'w') as f:
        f.writelines(out)

Aug 9 '07 #5
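
The practical difference with the `with` form is cleanup: the file is closed when the block exits, even if an exception is raised partway through. A rough sketch comparing it with an explicit try/finally follows; the demo file name and the simulated error are arbitrary choices for illustration.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.txt")  # arbitrary demo file
with open(path, 'w') as f:
    f.write(">header\nACGU\n")

# Explicit form: try/finally guarantees close() runs.
f1 = open(path)
try:
    lines_explicit = [line for line in f1 if not line.startswith('>')]
finally:
    f1.close()

# with form: same guarantee, even though the block raises.
try:
    with open(path) as f2:
        lines_with = [line for line in f2 if not line.startswith('>')]
        raise ValueError("simulated failure inside the block")
except ValueError:
    pass

print(f1.closed, f2.closed)
print(lines_explicit == lines_with)
```

On Python 2.5, `with` is only available after `from __future__ import with_statement`; from Python 2.6 onward it works without the import.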

P: 8
@bartonc

"Files, like all iterators, have some pretty cool methods hung on them. That example just scratches the surface."
Now I'm listening. What are those cool methods?
Dec 27 '10 #6
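
As a rough illustration of the kind of methods probably meant here (a guess, not bartonc's own answer), these are a few standard file-object methods in Python 3. The file name and variable names are arbitrary.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "mirna.txt")  # arbitrary demo file
with open(path, 'w') as f:
    f.write(">MusmusculusmiR-344\nUGAUCUAGCCAAAGCCUGACUGU\n"
            ">MusmusculusmiR-345\nUGCUGACCCCUAGUCCAGUGC\n")

with open(path) as f:
    first = f.readline()   # read one line, newline included
    second = next(f)       # files are iterators, so next() works too
    rest = f.readlines()   # all remaining lines as a list
    at_end = f.read()      # nothing left to read: empty string
    f.seek(0)              # rewind to the start of the file
    # Iterate again from the top, numbering lines from 1.
    numbered = [(n, line.rstrip('\n')) for n, line in enumerate(f, 1)]

print(first, second, rest, repr(at_end), numbered, sep="\n")
```

Plain iteration (`for line in f`) is usually the one to reach for, since it reads the file one line at a time instead of loading it all into memory.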

P: 2
Personally, I love Python for many reasons, especially when it comes to parsing and shifting things around. This next piece of code does the whole job in a single statement.

    # 'with' closes both files, so the output is flushed to disk
    with open("inputfile.txt") as infile, open("outputfile.txt", 'w') as outfile:
        # keep every line that is not a '>' header
        outfile.writelines(line for line in infile if not line.startswith('>'))

And that's it!
Dec 31 '10 #7
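
Since the sample file strictly alternates header and sequence lines, the original request to "delete every odd line" can also be taken literally. Below is a sketch using `itertools.islice`, assuming that strict alternation holds; if it does not, this keeps the wrong lines without any warning. The function name `drop_odd_lines` is hypothetical, and an in-memory `io.StringIO` stands in for the real file.

```python
import io
from itertools import islice


def drop_odd_lines(lines):
    """Return an iterator over the 2nd, 4th, 6th, ... items of lines."""
    return islice(lines, 1, None, 2)


# In-memory stand-in for the sample file from the question.
sample = io.StringIO(
    ">MusmusculusmiR-344\n"
    "UGAUCUAGCCAAAGCCUGACUGU\n"
    ">MusmusculusmiR-345\n"
    "UGCUGACCCCUAGUCCAGUGC\n"
)
kept = list(drop_odd_lines(sample))
print(kept)
```

The `startswith('>')` test used in the replies is the more robust choice, because it does not depend on where a line sits in the file.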
