
slicing functionality for strings / Python suitability for bioinformatics

P: n/a
>>> rs='AUGCUAGACGUGGAGUAG'
>>> rs[12:15]='GAG'

Traceback (most recent call last):
  File "<pyshell#119>", line 1, in ?
    rs[12:15]='GAG'
TypeError: object doesn't support slice assignment

You can't assign to a section of a sliced string in
Python 2.3 and there doesn't seem to be mention of this
as a Python 2.4 feature (don't have time to actually try
2.4 yet).

Q1. Does extended slicing make use of the Sequence protocol?
Q2. Don't strings also support the Sequence protocol?
Q3. Why then can't you make extended slicing assignment work
when dealing with strings?

This sort of operation (slicing/splicing of sequences represented
as strings) would seem to be a very fundamental operation when doing
rna/dna/protein sequencing algorithms, and it would greatly enhance
Python's appeal to those doing bioinformatics work if the slicing
and extended slicing operators worked to their logical limit.

Doing a cursory search doesn't seem to reveal any current PEPs
dealing with extending the functionality of slicing/extended
slicing operators.

Syntax and feature-wise, is there a reason why Python can't kick
Perl's butt as the dominant language for bioinformatics and
eventually become the lingua franca of this fast-growing and
funding-rich field?

Sep 19 '05 #1
11 Replies


P: n/a
jb********@yahoo.com wrote:
> rs='AUGCUAGACGUGGAGUAG'
> rs[12:15]='GAG'
>
> Traceback (most recent call last):
>   File "<pyshell#119>", line 1, in ?
>     rs[12:15]='GAG'
> TypeError: object doesn't support slice assignment
>
> You can't assign to a section of a sliced string in
> Python 2.3 and there doesn't seem to be mention of this
> as a Python 2.4 feature (don't have time to actually try
> 2.4 yet).


Strings are immutable in Python, which is why assignment to
slices won't work.

But why not use lists?

rs = list('AUGC...')
rs[12:15] = list('GAG')

Reinhold
Sep 19 '05 #2
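
A minimal sketch of the point above, written for Python 3 rather than the Python 2.3 used in the thread (an assumption on my part). It shows that a string supports the read-only sequence protocol but not the mutable one, and two common ways around that. The replacement value 'CCC' and the variable names are made up for illustration.

```python
from collections.abc import MutableSequence, Sequence

rs = "AUGCUAGACGUGGAGUAG"

# str implements the read-only Sequence protocol but not the mutable one,
# which is why slice assignment on a string raises TypeError.
is_sequence = isinstance(rs, Sequence)
is_mutable = isinstance(rs, MutableSequence)

# Option 1: build a new string from the pieces around the slice.
spliced = rs[:12] + "CCC" + rs[15:]

# Option 2: edit a list of characters in place, then join back into a str.
bases = list(rs)
bases[12:15] = "CCC"  # any iterable of characters works on the right-hand side
via_list = "".join(bases)

print(is_sequence, is_mutable)
print(spliced)
print(via_list)
```

Option 1 is usually enough for a one-off splice; the list form pays off when many in-place edits are made before converting back to a string.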

P: n/a

"Reinhold Birkenfeld" <re************************@wolke7.net> wrote in
message news:3p************@individual.net...
> jb********@yahoo.com wrote:
>> rs='AUGCUAGACGUGGAGUAG'
>> rs[12:15]='GAG'
>>
>> Traceback (most recent call last):
>>   File "<pyshell#119>", line 1, in ?
>>     rs[12:15]='GAG'
>> TypeError: object doesn't support slice assignment
>>
>> You can't assign to a section of a sliced string in
>> Python 2.3 and there doesn't seem to be mention of this
>> as a Python 2.4 feature (don't have time to actually try
>> 2.4 yet).
>
> Strings are immutable in Python, which is why assignment to
> slices won't work.
>
> But why not use lists?
>
> rs = list('AUGC...')
> rs[12:15] = list('GAG')


Or arrays of characters: see the array module.

Terry J. Reedy

Sep 19 '05 #3

P: n/a
right, i forgot about that...

Sep 20 '05 #4

P: n/a
Having to do an array.array('c',...):

>>> import array
>>> x=array.array('c','ATCTGACGTC')
>>> x[1:9:2]=array.array('c','AAAA')
>>> x.tostring()
'AACAGACATC'

is a bit klunkier than one would want, but I guess
the efficient performance is the silver lining here.

Sep 20 '05 #5
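
For readers on Python 3, where the 'c' typecode used above no longer exists, a rough equivalent is the built-in bytearray. This is only a sketch, assuming the sequence is plain ASCII; it is not what the posters ran.

```python
# bytearray is a mutable sequence of bytes and supports extended slice assignment.
x = bytearray(b"ATCTGACGTC")

# Step-2 slice covering positions 1, 3, 5 and 7; the right-hand side
# must supply exactly as many bytes as the slice selects.
x[1:9:2] = b"AAAA"

result = x.decode("ascii")
print(result)
```

As with array.array, the edit happens in place, so no intermediate list of one-character strings is created.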

P: n/a
On 19 Sep 2005 12:25:16 -0700, jb********@yahoo.com
<jb********@yahoo.com> wrote:
> rs='AUGCUAGACGUGGAGUAG'
> rs[12:15]='GAG'
>
> Traceback (most recent call last):
>   File "<pyshell#119>", line 1, in ?
>     rs[12:15]='GAG'
> TypeError: object doesn't support slice assignment


You should try Biopython (www.biopython.org). It provides a sequence
type you could try.

--
The web without popups or spyware: use Firefox instead of Internet
Explorer (http://www.spreadfirefox.com/?q=affiliates&id=24672&t=1)
Sep 20 '05 #6

P: n/a
Great suggestion... I was naively trying to turn the string into a list
and slice that, which I reckon would be significantly slower.

Sep 20 '05 #7

P: n/a
On Mon, 19 Sep 2005 19:40:12 -0700, jbperez808 wrote:
> Having to do an array.array('c',...):
> >>> x=array.array('c','ATCTGACGTC')
> >>> x[1:9:2]=array.array('c','AAAA')
> >>> x.tostring()
> 'AACAGACATC'
>
> is a bit klunkier than one would want, but I guess
> the efficient performance is the silver lining here.


There are a number of ways to streamline that. The simplest is to merely
create an alias to array.array:

from array import array as str

Then you can say x = str('c', 'ATCTGACGTC').

A little more sophisticated would be to use currying:

def str(value):
    return array.array('c', value)

x = str('ATCTGACGTC')

although to be frank I'm not sure that something as simple as this
deserves to be dignified with the name currying.

Lastly, you could create a wrapper class that implements everything you
want. For a serious application, this is probably what you want to do
anyway:

class DNA_Sequence:
    alphabet = 'ACGT'

    def __init__(self, value):
        for c in value:
            if c not in self.__class__.alphabet:
                raise ValueError('Illegal character "%s".' % c)
        self.value = array.array('c', value)

    def __repr__(self):
        return self.value.tostring()

and so on. Obviously you will need more work than this, and it may be
possible to subclass array directly.
--
Steven.

Sep 21 '05 #8
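
One way the wrapper-class idea above might be fleshed out, sketched for Python 3 with a bytearray as the backing store instead of array.array('c', ...). The class name DNASequence, the _check helper and the choice of methods are all hypothetical; this is an illustration under those assumptions, not the poster's code.

```python
class DNASequence:
    """Hypothetical mutable DNA sequence wrapper (illustrative API only)."""

    alphabet = frozenset("ACGT")

    def __init__(self, value):
        self._check(value)
        self._data = bytearray(value, "ascii")

    @classmethod
    def _check(cls, value):
        # Reject anything outside the allowed alphabet before storing it.
        for c in value:
            if c not in cls.alphabet:
                raise ValueError("Illegal character %r." % c)

    def __len__(self):
        return len(self._data)

    def __getitem__(self, index):
        piece = self._data[index]
        if isinstance(index, slice):
            return piece.decode("ascii")  # slices come back as str
        return chr(piece)                 # single positions as one character

    def __setitem__(self, index, value):
        self._check(value)
        if isinstance(index, slice):
            self._data[index] = value.encode("ascii")
        else:
            if len(value) != 1:
                raise ValueError("Expected a single base.")
            self._data[index] = ord(value)

    def __str__(self):
        return self._data.decode("ascii")

    def __repr__(self):
        return "DNASequence(%r)" % str(self)


seq = DNASequence("ATCTGACGTC")
seq[1:9:2] = "AAAA"  # extended slice assignment, validated first
print(seq)
```

Validating inside __setitem__ as well as __init__ keeps the alphabet invariant intact after edits, which a bare list or array would not enforce.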

P: n/a
On Wed, 21 Sep 2005, Steven D'Aprano wrote:
> On Mon, 19 Sep 2005 19:40:12 -0700, jbperez808 wrote:
>> Having to do an array.array('c',...):
>> >>> x=array.array('c','ATCTGACGTC')
>> >>> x[1:9:2]=array.array('c','AAAA')
>> >>> x.tostring()
>> 'AACAGACATC'
>>
>> is a bit klunkier than one would want, but I guess the efficient
>> performance is the silver lining here.
>
> There are a number of ways to streamline that. The simplest is to merely
> create an alias to array.array:
>
> from array import array as str
>
> Then you can say x = str('c', 'ATCTGACGTC').
>
> A little more sophisticated would be to use currying:
>
> def str(value):
>     return array.array('c', value)
>
> x = str('ATCTGACGTC')

There's a special hell for people who override builtins.

> although to be frank I'm not sure that something as simple as this
> deserves to be dignified with the name currying.

It's definitely not currying - it doesn't create a new function. Currying
would be:

def arraytype(kind):
    def mkarray(value):
        return array.array(kind, value)
    return mkarray

chars = arraytype('c')
seq = chars("tacatcgtcgacgtcgatcagtaccc")

> Lastly, you could create a wrapper class that implements everything you
> want. For a serious application, this is probably what you want to do
> anyway:


Definitely - there are lots of things to know about DNA molecules or parts
of them that aren't captured by the sequence.

tom

--
If it ain't Alberta, it ain't beef.
Sep 21 '05 #9
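
The closure above can be tried out in Python 3 with a small change: the 'c' typecode is gone there, so this sketch assumes the unsigned-byte typecode 'B' and a bytes initializer instead. It also shows functools.partial from the standard library, which fixes the first argument in a similar way (strictly speaking partial application rather than currying). The names byte_array and byte_array_partial are made up for illustration.

```python
from array import array
from functools import partial


def arraytype(kind):
    # Returns a new function with the typecode already fixed.
    def mkarray(value):
        return array(kind, value)
    return mkarray


# Closure-based version, as in the post, with typecode 'B' (unsigned bytes).
byte_array = arraytype("B")
seq = byte_array(b"tacatcgtcgacgtcgatcagtaccc")

# Standard-library alternative: partial() pre-binds the first argument.
byte_array_partial = partial(array, "B")
seq2 = byte_array_partial(b"tacatcgtcgacgtcgatcagtaccc")

print(seq.tobytes())
print(seq == seq2)
```

Either form avoids rebinding the name str, which is what the "special hell" remark is about.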

P: n/a
Tom Anderson wrote:
> There's a special hell for people who override builtins.


which is, most likely, chock full of highly experienced python programmers.

</F>

Sep 21 '05 #10

P: n/a
On Wed, 21 Sep 2005 11:37:38 +0100, Tom Anderson wrote:
> There's a special hell for people who override builtins.


[slaps head]

Of course there is, and I will burn in it for ever...

--
Steven.

Sep 21 '05 #11

P: n/a
On Wed, 21 Sep 2005, Fredrik Lundh wrote:
> Tom Anderson wrote:
>> There's a special hell for people who override builtins.
>
> which is, most likely, chock full of highly experienced python
> programmers.


You reckon? I've never felt the need to do it myself, and instinctively,
it seems like a bad idea. Perhaps I've been missing something, though -
could you give me some examples of when overriding a builtin is a good
thing to do?

tom

--
Fitter, Happier, More Productive.
Sep 21 '05 #12
