
Complement to DNA in output

P: 7
Write a Python script that computes the complement
of a DNA sequence. In other words, your script should convert all
A's to T's, C's to G's, G's to C's, and T's to A's.

The input is one sequence in FASTA format in a file called "dna.txt".

For example if the file contains

>human
ACCGT

then the output of the program should be TGGCA. Note that your program should work for
any sequence in this format and not just the given example.
----------------------------------------------------------
I've been trying to figure out how to do this, and it's really starting to bother me. I can't figure out what code to write to input ACCGT and receive an output of TGGCA. This is not code to reverse the input; I need to tell the program to look for A's and replace them with T's, and replace C's with G's. This is what I've tried so far, but I still haven't gotten it:

with open("/Users/homemac/classes/bnfo135/dna.txt", "r") as myfile:
    seq = myfile.readlines()
    str(seq)
    str.replace("A", "T")
    str.replace("C", "G")
    print(seq)


"['>human\\n', 'ACCGT\\n']"
Traceback (most recent call last):
  File "<pyshell#8>", line 4, in <module>
    str.replace("A", "T")
TypeError: replace() takes at least 2 arguments (1 given)

>>> with open("/Users/homemac/classes/bnfo135/dna.txt", "r") as myfile:
    seq = myfile.readlines()
    str(seq)
    str.replace("A", "T" + "C","G")
    print(seq)


"['>human\\n', 'ACCGT\\n']"
'A'
['>human\n', 'ACCGT\n']

>>> with open("/Users/homemac/classes/bnfo135/dna.txt", "r") as myfile:
    seq = myfile.readlines()
    str(seq)
    str.replace("A", "T" + "C","G")
    print(str())


"['>human\\n', 'ACCGT\\n']"
'A'
Sep 14 '10 #1
7 Replies


P: 7
I tried this, and I'm getting closer and closer, but I'm still not getting a single correct output value.

with open("/Users/homemac/classes/bnfo135/dna.txt", "r") as myfile:
    seq = myfile.readlines()
    str(seq)
    str.replace(str(seq), "A","T")
    str.replace(str(seq), "C","G")
    print(str(seq))


"['>human\\n', 'ACCGT\\n']"
"['>human\\n', 'TCCGT\\n']"
"['>human\\n', 'AGGGT\\n']"
['>human\n', 'ACCGT\n']

>>> with open("/Users/homemac/classes/bnfo135/dna.txt", "r") as myfile:
    seq = myfile.readlines()
    str(seq)
    str.replace(str(seq), "A","T")
    str.replace(str(seq), "C","G")
    print(str.replace)


"['>human\\n', 'ACCGT\\n']"
"['>human\\n', 'TCCGT\\n']"
"['>human\\n', 'AGGGT\\n']"
<method 'replace' of 'str' objects>
>>>
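
A note on why the session above keeps printing the original list: `str(seq)` and `str.replace(...)` each return a new string and leave `seq` untouched, so a result only survives if it is assigned to a name. A minimal sketch, using the list shown in the output in place of the file (the variable names `dna`, `unchanged`, and `replaced` are illustrative, not from the thread):

```python
# The lines as readlines() returned them in the session above.
seq = ['>human\n', 'ACCGT\n']

# Take the sequence line and drop the trailing newline.
dna = seq[1].strip()

# Calling replace without assigning the result changes nothing:
dna.replace("A", "T")
unchanged = dna

# Assigning the return value keeps the new string.
replaced = dna.replace("A", "T")

print(unchanged)
print(replaced)
```

This only swaps A for T; it illustrates the assignment point and is not a full complement.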
Sep 14 '10 #2

Expert 100+
P: 621
Take a look at the "Getting a certain output from a file" thread three more down. This is a common homework question and there are several other threads as well. A search for something like "DNA sequence" should produce some hits.
Sep 14 '10 #3

P: 7
I looked at that one and tried all of those approaches. None of them worked. I think that guy is in my class, but I have no clue who he is. Do you know what I could do to solve my problem?
Sep 14 '10 #4

bvdet
Expert Mod 2.5K+
P: 2,851
Here are a few hints. Please use code tags next time!
>>> lineList = ['>human\n', 'ACCGT\n']
>>> seq = lineList[1].strip()
>>> seq
'ACCGT'
>>> seq.replace("A", 't').replace('T', 'a').upper()
'TCCGA'
>>>
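
The hint above swaps A and T by passing through lowercase so that a freshly written T is not converted straight back. One way to extend the same idea to all four bases is a translation table, which maps every character in a single pass. This is a sketch of an alternative, not the method used in the hint; `COMPLEMENT` and `complement` are names chosen here for illustration, and the input is assumed to be uppercase.

```python
# Map each base to its complement in one pass, so an A that becomes T
# is never turned back into A by a later replacement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement(seq):
    """Return the complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)


print(complement("ACCGT"))  # TGGCA
```

Characters outside A, C, G, and T pass through unchanged with this table.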
Sep 14 '10 #5

P: 7
Any idea how to do it starting with the first line I use?
Meaning, I can't write out the DNA sequence in the code; I have to use the file path.
Sep 14 '10 #6

bvdet
Expert Mod 2.5K+
P: 2,851
You need to take a good look at the hints I gave you. Your first and second lines of code open the file and read the file contents into a list of lines. I just changed the name of the variable to be more descriptive. You want the second item in the list which is accessed by list index.
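
The steps described here (open the file, read the lines into a list, take the second item by index) might be put together as in the sketch below. It assumes the sequence sits on a single line directly after the `>` header, as in the example; a FASTA file that wraps the sequence over several lines would need those lines joined first. The function name `complement_of_fasta` and the temporary demo file are illustrative assumptions, standing in for the real `dna.txt` path.

```python
import os
import tempfile


def complement_of_fasta(path):
    """Read a one-sequence FASTA file and return the complement."""
    with open(path, "r") as myfile:
        line_list = myfile.readlines()
    # Index 0 is the '>human' header; index 1 is the sequence line.
    seq = line_list[1].strip()
    # Translate all four bases in one pass.
    return seq.translate(str.maketrans("ACGT", "TGCA"))


# Demo on a temporary copy of the example file from the question.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "dna.txt")
    with open(demo_path, "w") as f:
        f.write(">human\nACCGT\n")
    result = complement_of_fasta(demo_path)

print(result)  # TGGCA
```

With the real file, the call would simply take the path to `dna.txt` in place of `demo_path`.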
Sep 14 '10 #7

Expert 100+
P: 621
meaning I can't write out the dna sequence
A link to a tutorial for reading and writing files.
Sep 17 '10 #8
