473,408 Members | 2,734 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,408 software developers and data experts.

Creating combination of sequences

Hello, python lovers!!

I'm trying to create combinations of sequences.

For example, if the sequence is 'acgt' and the length is 8,
then I would like to have 4^8 results such as
'aaaaaaaa', 'aaaaaaac', 'aaaaaaag', 'aaaaaaat', ... 'tttttttt'

Is there an easy way doing this?

Any help will be greatly appreciated.
Minho
Jul 18 '05 #1
5 2421
Minho Chae wrote:
Hello, python lovers!!

I'm trying to create combinations of sequences.

A simple recursive generator should produce what you want while being
fairly easy to read...
def permute( choices, length=8 ):

.... if length > 1:
.... for suffix in permute(choices,length-1):
.... for choice in choices:
.... yield choice + suffix
.... else:
.... for choice in choices:
.... yield choice

That's not the same ordering you wanted, but that's merely a matter of
swapping the order of choice and suffix.

HTH,
Mike

________________________________________________
Mike C. Fletcher
Designer, VR Plumber, Coder
http://www.vrplumber.com
http://blog.vrplumber.com

Jul 18 '05 #2

"Mike C. Fletcher" <mc******@rogers.com> wrote:

Minho Chae wrote:
Hello, python lovers!!

I'm trying to create combinations of sequences.

A simple recursive generator should produce what you want while being
fairly easy to read...
>>> def permute( choices, length=8 ): ... if length > 1:
... for suffix in permute(choices,length-1):
... for choice in choices:
... yield choice + suffix
... else:
... for choice in choices:
... yield choice

That's not the same ordering you wanted, but that's merely a matter of
swapping the order of choice and suffix.


Or even the better non-recursive 'base-k' version...

Note that a permutation is just a reordering of the set of elements that
one has. There are only n! number of permutations of any arbitrary
sequence (assuming the n items are unique).

I can't remember the mathematical name of what the OP wanted (because my
brain is dumb today).
def blah(choices, length, force_string=0):
lch = len(choices)
length = long(length)
if lch <= 0:
raise ValueError, "choices must be a sequence of at least 1 item"
if length <= 0:
raise ValueError, "length must be an integer >= 1"
#use longs and a while loop because we may be dealing with sequences
#>= 2**31 in size, and we may be using Python < 2.4
cur = 0L
m = lch**length
while cur < m:
l = [choices[(cur//(lch**i))%lch] for i in xrange(length-1, -1, -1)]
if force_string:
yield ''.join(map(str, l))
else:
yield l
cur += 1
for i in blah('atcg', 2, 1): ... print i
...
aa
at
ac
ag
ta
tt
tc
tg
ca
ct
cc
cg
ga
gt
gc
gg

- Josiah

Jul 18 '05 #3
Minho Chae wrote:
I'm trying to create combinations of sequences.

For example, if the sequence is 'acgt' and the length is 8,
then I would like to have 4^8 results such as
'aaaaaaaa', 'aaaaaaac', 'aaaaaaag', 'aaaaaaat', ... 'tttttttt'


Given the DNA base characters you use for your test case, do you
know about biopython.org?

Here is a solution to your problem.

def cycle(chars, length):
"""Generate 'length' copies of chars[0], then length of chars[1]
then of chars[2], etc. When chars is exhausted, start again.
"""
counter = xrange(length)
while 1:
for c in chars:
for _ in counter:
yield c

def once(chars, length):
"""Generate 'length' copies of chars[0], then of chars[1], etc.
then of chars[-1]. When chars is exhausted, stop.
"""
counter = xrange(length)
for c in chars:
for _ in counter:
yield c

def all_combinations(chars, size):
"""Generate all words combinations using items from chars as
the characters for each position. If chars is in lexiographic
order then so is the list of generated words.
for word in all_combinations("01", 3):

... print word
...
000
001
010
011
100
101
110
111
"""
if size == 0:
return

N = len(chars)
gens = [once(chars, N**(size-1))]
for i in range(size-2, -1, -1):
gens.append(cycle(chars, N**i))

def next(gen):
return gen.next()

while 1:
yield "".join(map(next, gens))

def test():
text = "acgt"
n = 8
i = -1
for i, term in enumerate(all_combinations(text, n)):
print term
print (i+1), "combinations found"
if n:
assert (i+1) == len(text) ** n, (i, text, n)
else:
assert i == -1, i

if __name__ == "__main__":
test()
Andrew
da***@dalkescientific.com
Jul 18 '05 #4
Minho Chae wrote:
I'm trying to create combinations of sequences.

For example, if the sequence is 'acgt' and the length is 8,
then I would like to have 4^8 results such as
'aaaaaaaa', 'aaaaaaac', 'aaaaaaag', 'aaaaaaat', ... 'tttttttt'

Is there an easy way doing this?


You are basically counting in base 4 here.

import array

def generate(digits='acgt', length=8):
base = len(digits)
positions = range(length)
positions.reverse()
r = array.array('c', digits[0] * length)
for number in range(base**length):
for digit in positions:
number, v = divmod(number, base)
r[digit] = digits[v]
yield r.tostring()

for result in generate('ab', 2):
print result
-Scott David Daniels
Sc***********@Acm.Org
Jul 18 '05 #5
mi********@yahoo.com (Minho Chae) wrote:
I'm trying to create combinations of sequences.
For example, if the sequence is 'acgt' and the length is 8,
then I would like to have 4^8 results such as
'aaaaaaaa', 'aaaaaaac', 'aaaaaaag', 'aaaaaaat', ... 'tttttttt'


strings = lambda Xs, k: reduce(lambda r, i: [p + x for p in r for x
in Xs], range(k), [''])

print strings('acgt', 8)

This has been recently discussed. See

http://groups.google.com/groups?hl=e...ing.google.com
http://groups.google.com/groups?hl=e...it%40yahoo.com

and other previous threads and follow-ups. This question seems to pop
up more often than I thought.

Kamsahapnida,

Hung Jung
Jul 18 '05 #6

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

4
by: temp | last post by:
Hi All, I wonder could someone help me with this? What I want to do is search through a list of letters and look for adjacent groups of letters that form sequences, not in the usual way of...
1
by: Eric Sasser | last post by:
I'm searching for anyone that has tried working with creating containersin shared memory using the April 2003 article by Grum Ketema in C/C++ UserJournal. I have a project up and running but am...
10
by: Vilson farias | last post by:
Greetings, I'm getting a big performance problem and I would like to ask you what would be the reason, but first I need to explain how it happens. Let's suppose I can't use sequences (it seams...
7
by: Micheal Artindale | last post by:
I am looking at creating list of letter combinations. letters a-h 6 letters per combination letter can repeat its self in the combination, but not next to its self and, a series of letter can...
2
by: Ross | last post by:
Is there any combination parsing algorithm/source code available to handle the combination problem of inputting a sequence: YGEQLGMREMVRNCMQDL and generating sequences: YGEQLGMREMVRNCMQDL...
18
by: Bruno Baguette | last post by:
Hello, I have to design a table wich will store some action reports. Each report have an ID like this 1/2004, 2/2004, ... and each years, they restart to 1 (1/2004, 1/2005, 1/2006,...). So, I...
4
by: JJ | last post by:
Is there a way of checking that a line with escape sequences in it, has no strings in it (apart from the escape sequences)? i.e. a line with \n\t\t\t\t\t\t\t\r\n would have no string in it a...
1
by: mmm | last post by:
I wrote the code below to create simple arithmetic sequences that are iter-able I.e., this would basically combine the NUMPY arange(start,end,step) to range(start,end), with step not necessarily...
8
by: metaperl.com | last post by:
Pyparsing has a really nice feature that I want in PLY. I want to specify a list of strings and have them converted to a regular expression. A Perl module which does an aggressively optimizing...
0
by: emmanuelkatto | last post by:
Hi All, I am Emmanuel katto from Uganda. I want to ask what challenges you've faced while migrating a website to cloud. Please let me know. Thanks! Emmanuel
0
BarryA
by: BarryA | last post by:
What are the essential steps and strategies outlined in the Data Structures and Algorithms (DSA) roadmap for aspiring data scientists? How can individuals effectively utilize this roadmap to progress...
1
by: Sonnysonu | last post by:
This is the data of csv file 1 2 3 1 2 3 1 2 3 1 2 3 2 3 2 3 3 the lengths should be different i have to store the data by column-wise with in the specific length. suppose the i have to...
0
by: Hystou | last post by:
There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...
0
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However,...
0
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can...
0
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven...
0
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows...
0
agi2029
by: agi2029 | last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing,...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.