473,406 Members | 2,273 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,406 software developers and data experts.

recombination variations

The problem I'm solving is to take a sequence like 'ATSGS' and make all
the DNA sequences it represents. The A, T, and G are fine but the S
represents C or G. I want to take this input:

[ [ 'A' ] , [ 'T' ] , [ 'C' , 'G' ], [ 'G' ] , [ 'C' , 'G' ] ]

and make the list:

[ 'ATCGC' , 'ATCGG' , 'ATGGC' , 'ATGGG' ]

The code below is what I have so far: 'alphabet' is a dictionary that
designates the set oif base pairs that each letter represents (for
example for S above it gives C and G). I call these ambiguous base
pairs because they could be more then one. Thus the function name
'unambiguate'. It makes a list of sequences with only A T C and Gs and
none of the ambiguous base pair designations.

The function 'unambiguate_bp' takes a sequence and a base pair in it
and returns a set of sequences with that base pair replaced by each of
it's unambiguous possibilities.

The function unambiguate_seq takes a sequence and runs unambiguate_bp
on each base pair in the sequence. Each time it does a base pair it
replaces the set of things it's working on with the output from the
unambiguate_bp. It's a bit confusing. I'd like it to be clearer.

Is there a better way to do this?
--
David Siedband
generation-xml.com

def unambiguate_bp(seq, bp):
seq_set = []
for i in alphabet[seq[bp]]:
seq_set.append(seq[:bp]+i+seq[bp+1:])
return seq_set

def unambiguate_seq(seq):
result = [seq]
for i in range(len(seq)):
result_tmp=[]
for j in result:
result_tmp = result_tmp + unambiguate_bp(j,i)
result = result_tmp
return result

alphabet = {
'A' : ['A'],
'T' : ['T'],
'C' : ['C'],
'G' : ['G'],
'W' : ['A','T'],
'M' : ['A','C'],
'R' : ['A','G'],
'Y' : ['T','C'],
'K' : ['T','G'],
'S' : ['C','G'],
'H' : ['A','T','C'],
'D' : ['A','T','G'],
'V': ['A','G','C'],
'B' : ['C','T','G'],
'N' : ['A','T','C','G']
}

Jul 18 '05 #1
4 1809
David Siedband wrote:
[...]
Is there a better way to do this?
[...]


Take a look at Biopython: http://biopython.org/

Your problem may be solved there already.
Jul 18 '05 #2
David Siedband wrote:
The problem I'm solving is to take a sequence like 'ATSGS' and make all
the DNA sequences it represents. The A, T, and G are fine but the S
represents C or G. I want to take this input:

[ [ 'A' ] , [ 'T' ] , [ 'C' , 'G' ], [ 'G' ] , [ 'C' , 'G' ] ]

and make the list:

[ 'ATCGC' , 'ATCGG' , 'ATGGC' , 'ATGGG' ]


[...]

The code you provide only addresses the first part of your problem, and so
does mine:
def disambiguate(seq, alphabet): .... return[list(alphabet.get(c, c)) for c in seq]
.... alphabet = { .... "W": "AT",
.... "S": "CG"
.... } disambiguate("ATSGS", alphabet)

[['A'], ['T'], ['C', 'G'], ['G'], ['C', 'G']]

Note that "identity entries" (e. g. mapping "A" to "A") in the alphabet
dictionary are no longer necessary. The list() call in disambiguate() is
most likely superfluous, but I put it in to meet your spec accurately.

Now on to the next step :-)

Peter

Jul 18 '05 #3
alphabet = {
'A': 'A',
'T': 'T',
'C': 'C',
'G': 'G',
'W': 'AT',
'M': 'AC',
'R': 'AG',
'Y': 'TC',
'K': 'TG',
'S': 'CG',
'H': 'ATC',
'D': 'ATG',
'V': 'AGC',
'B': 'CTG',
'N': 'ATCG'
}

expand = lambda t: reduce(lambda r, s: [x+y for x in r for y in
alphabet[s]], t, [''])

print expand('ATSGS')

--------------

['ATCGC', 'ATCGG', 'ATGGC', 'ATGGG']
Jul 18 '05 #4
Hung Jung Lu wrote:
... expand = lambda t: reduce(lambda r, s: [x+y for x in r
for y in alphabet[s]], t, [''])
print expand('ATSGS')


Or, for a more verbose version:

multis = dict(W='AT', M='AC', R='AG', Y='TC', K='TG', S='CG',
H='ATC', D='ATG', V='AGC', B='CTG', N='ATCG')

def expanded(string, expansions=multis):
result = ''
for pos, char in enumerate(string):
if char in multis:
break
else:
yield string
raise StopIteration
parts = multis[char]
prelude, string = string[:pos], string[pos+1:]
for expansion in expanded(string, multis):
for middle in parts:
yield prelude + middle + expansion
--Scott David Daniels
Sc***********@Acm.Org
Jul 18 '05 #5

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

4
by: Alex Bell | last post by:
There have been several postings recently dealing with bugs in Internet Explorer. Can anyone point me to a review of these bugs and how to work around them? Regards, Alex
0
by: Oliver Douglas | last post by:
adapted from: http://www.gotmono.com/docs/gnome/bindings/gtk-sharp/buttonwidget.html // togglebut.cs using Gtk; using GtkSharp; using System;
2
by: John Baker | last post by:
Hi: I have two systems, one a W98 and the other XP Home. I have Access 2000 installed on both, and have run into a difference in the way the two behave. I have a table on which I wish to reset...
1
by: Otto Blomqvist | last post by:
Hello! I have a small database (10MB gz dump). When I do a pg_dump -l -s (to list the schema) of the original database it takes below 1 second. But when I do dump of a copy of the database...
3
by: emmba | last post by:
The premise for my assignment is to write variations of the read_line functions. The variations are: a) skips leading whitespace before storing input b) stops reading at first whitespace...
4
by: Jordan S. | last post by:
Using .NET 2.0 (C#) I'm writing a small app that will have a "Person" class that exposes FirstName and LastName properties. At runtime, a "People" collection will be populated with a few thousand...
0
by: Showjumper | last post by:
Is one better than the other with handling xml files? Ashok
0
by: Vcskiran | last post by:
Hi Friends, Can anyone help me y the textbox height is reduced or changed in the aspx page, if i response.write in the .vb page and how can i avoid this anyone plzz plzz help. thanks KČ
0
by: Charles Arthur | last post by:
How do i turn on java script on a villaon, callus and itel keypad mobile phone
0
by: emmanuelkatto | last post by:
Hi All, I am Emmanuel katto from Uganda. I want to ask what challenges you've faced while migrating a website to cloud. Please let me know. Thanks! Emmanuel
1
by: nemocccc | last post by:
hello, everyone, I want to develop a software for my android phone for daily needs, any suggestions?
0
by: Hystou | last post by:
There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...
0
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However,...
0
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers,...
0
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven...
0
tracyyun
by: tracyyun | last post by:
Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each...
0
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome a new...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.