
How to write a program in python 2.6 that decodes DNA codons?

P: 1
My name is Eric and I am using python version 2.6
I am trying to write a program that decodes DNA codons. I have to prompt the user to enter a DNA sequence that is a multiple of 3 and have the program spit out the letters that each 3 letter string represents. Here's an example of what it should look like:
Input a DNA sequence: ACTCTAGCTTTG
---> TLAL
I need to use the DNA codon table on Wikipedia to find which amino acid each three-letter codon encodes. I am very lost on how to do this.
Please help!
Feb 2 '11 #1
4 Replies

Expert Mod 10K+
P: 12,382
You could loop through the string three characters at a time and translate each triplet with a lookup. Python has no switch statement, so a dictionary is the usual substitute.
Feb 2 '11 #2

Expert Mod 2.5K+
P: 2,851
Create a list of the triplets by iterating over the string "ACTCTAGCTTTG". Hint:
    for i in range(0, len(seq), 3):
        triplet = seq[i:i+3]  # slice out one codon
Define a mapping object (dictionary) to associate each triplet with the corresponding letter.
    {"ACT": "T", "CTA": "L", "GCT": "A", "TTG": "L"}
Iterate on the list of triplets, mapping each triplet to the corresponding letter. Join the letters into one string.
    "".join(mappedList)
Alternatively, you could concatenate the letters into one string.
    >>> result = ""
    >>> for letter in ["T", "L", "A", "L"]:
    ...     result += letter
    ...
    >>> result
    'TLAL'
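Putting the three steps together, a minimal sketch might look like the following. The names CODON_TO_LETTER, split_codons and translate are made up for illustration, and the table holds only the four codons from the example, so a real program would need all 64 entries from the codon table.

```python
# Partial codon table: only the four codons from the example above.
# A real program would list all 64 codons from the codon table.
CODON_TO_LETTER = {"ACT": "T", "CTA": "L", "GCT": "A", "TTG": "L"}


def split_codons(seq):
    """Return the consecutive three-letter slices of seq."""
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


def translate(seq):
    """Map each codon in seq to its one-letter code and join the results."""
    seq = seq.strip().upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join([CODON_TO_LETTER[codon] for codon in split_codons(seq)])


print(translate("ACTCTAGCTTTG"))  # prints TLAL
```

In Python 2.6 you would read the sequence with raw_input("Input a DNA sequence: ") and pass the result to translate. A codon that is missing from the partial table raises a KeyError here.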
Feb 2 '11 #3

P: 1

I am having difficulty writing the same program. Being new to Python and programming, I do not fully understand your post. It would be great if you could give me a more specific example or show me how to start the code. Right now I am assigning each amino acid to a DNA codon and then using an if statement to print the single-letter database code. For example:

    Isoleucine = ("ATT", "ATC", "ATA")
    Leucine = ("CTT", "CTC", "CTA", "CTG", "TTA", "TTG")
    Valine = ("GTT", "GTC", "GTA", "GTG")
    Phenylalanine = ("TTT", "TTC")

    while True:
        DNA = raw_input("Please enter a DNA sequence in multiples of three:")

        if Isoleucine:
            print "I"
        if Leucine:
            print "L"
        if Valine:
            print "V"
        if Phenylalanine:
            print "F"

Here is the result I get:

Please enter a DNA sequence in multiples of three:ATT
Please enter a DNA sequence in multiples of three:

If someone can guide me in the right direction or tell me how to start that would be great.
Feb 6 '11 #4

Expert Mod 2.5K+
P: 2,851
You would begin by defining the dictionary that maps a given triplet to an amino acid. Example:
    tripletDict = {"ATT": "Isoleucine", "CTT": "Leucine"}
The code I posted suggested splitting the input string into triplets using slicing. Each triplet in the resulting list can then be looked up in the dictionary in a for loop.
    >>> seq = "ATTCTTATT"
    >>> [seq[i:i+3] for i in range(0, len(seq), 3)]
    ['ATT', 'CTT', 'ATT']
    >>> for triplet in [seq[i:i+3] for i in range(0, len(seq), 3)]:
    ...     print tripletDict[triplet]
    ...
    Isoleucine
    Leucine
    Isoleucine
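As for the if statements in your code: a test such as "if Isoleucine:" only checks that the tuple is non-empty, which is always true, so it never compares anything with the input. If you prefer to keep the codons grouped by amino acid as you wrote them, one option is to invert that grouping into a codon-to-letter dictionary once and then look up each triplet. The sketch below assumes only the four amino acids from your post. The names CODONS_BY_LETTER, codon_to_letter and decode, and the "?" placeholder for unknown codons, are my own choices for illustration.

```python
# Amino acids grouped by codon, as in the question (one-letter code as key).
CODONS_BY_LETTER = {
    "I": ("ATT", "ATC", "ATA"),
    "L": ("CTT", "CTC", "CTA", "CTG", "TTA", "TTG"),
    "V": ("GTT", "GTC", "GTA", "GTG"),
    "F": ("TTT", "TTC"),
}

# Invert the grouping so each codon points at its letter.
codon_to_letter = {}
for letter, codons in CODONS_BY_LETTER.items():
    for codon in codons:
        codon_to_letter[codon] = letter


def decode(seq, unknown="?"):
    """Translate seq codon by codon; codons missing from the table become `unknown`."""
    triplets = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    return "".join([codon_to_letter.get(t, unknown) for t in triplets])


print(decode("ATTCTTATT"))     # prints ILI
print(decode("ATTGTGTTTACT"))  # prints IVF? (ACT is not in this partial table)
```

Using dict.get with a default keeps the program running on codons that are not in the table yet. Use codon_to_letter[t] instead if you would rather get a KeyError for them.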
Feb 7 '11 #5
