ATTN : Georges ( gry@ll.mit.edu)

First of all, thanks for helping me out.

I have to admit I don't understand some of your suggestions, sorry.
I don't know what the "3D" thing is... Is there a simpler way to make it
work for a newbie like me? Thanks

What I want to do is:
First, check all the files in a folder and analyze only the ones with the .Seq extension.
Then, get the reverse complement of each DNA sequence. If there is a problem
with some characters in the DNA sequence, I want the function to tell me.

Here are the comp and iupac:

iupac ="GgAaTtCcRrYyMmKkSsWwHhBbVvDdNn"

comp={"A":"T", "T":"A", "G":"C", "C":"G", "R":"Y", "Y":"R", "M":"K",
"K":"M", "S":"W", "W":"S", "B":"V", "V":"B", "D":"H", "H":"D", "r":"y",
"y":"r", "m":"k", "k":"m", "s":"w", "w":"s", "b":"v", "v":"b", "d":"h",
"h":"d", "a":"t", "t":"a", "g":"c", "c":"g", "N":"N","n":"n"}

So if a $ or Z appears in the DNA sequence, I want to know it.
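
The requirement above can be sketched as follows. This is only an illustration in Python 3 syntax (the code in this thread is Python 2). The function names find_invalid and reverse_complement are hypothetical, and the comp table is copied from the post (including its S/W pairing) so that the block runs on its own.

```python
# Complement table copied from the post (S<->W kept exactly as posted).
comp = {"A": "T", "T": "A", "G": "C", "C": "G", "R": "Y", "Y": "R",
        "M": "K", "K": "M", "S": "W", "W": "S", "B": "V", "V": "B",
        "D": "H", "H": "D", "N": "N"}
# Add the lower-case pairs that the post also lists.
comp.update({k.lower(): v.lower() for k, v in list(comp.items())})


def find_invalid(seq):
    """Return (character, position) pairs for characters missing from comp."""
    return [(c, i) for i, c in enumerate(seq) if c not in comp]


def reverse_complement(seq):
    """Return the reverse complement, or raise ValueError listing bad characters."""
    bad = find_invalid(seq)
    if bad:
        raise ValueError("invalid characters: %r" % bad)
    return "".join(comp[c] for c in reversed(seq))


print(reverse_complement("GATTaca"))   # tgtAATC
print(find_invalid("GA$TZ"))           # [('$', 2), ('Z', 4)]
```

Raising an exception instead of printing is a design choice of this sketch, not something the post asks for; it makes it harder to miss a bad sequence when many files are processed.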

My code so far:

# -*- coding: iso-8859-1 -*-
import sys
import os
from progadn import *

ab1seq = raw_input("Entrez le répertoire où sont les fichiers à analyser: ") or None
if ab1seq == None :
    print "Erreur: Pas de répertoire! \n" \
          "\nAu revoir \n"
    sys.exit()

listrep = os.listdir(ab1seq)
#print listrep

extseq=[]

for f in listrep:
    if f[-4:]==".Seq":
        extseq.append(f)
        #print extseq

for x in extseq:
    f=open(x, "r")
    seq=f.read()
    f.close()
    #s=seq

def checkDNA(seq):
    """Retourne une liste des caractères non conformes à l'IUPAC."""

    junk=[]
    for c in range (len(seq)):
        if seq[c] not in iupac:
            junk.append([seq[c],c])
            #print junk
            print "ATTN: Il y a le caractère %s en position %s " % (seq[c],c)
    if junk == []:
        indinv=range(len(seq))
        indinv.reverse()
        resultat=""
        for i in indinv:
            resultat +=comp[seq[i]]
        return resultat

seq=checkDNA(seq)

-------------------------------------------------------------------------------------------------------------------------

From: gr*@ll.mit.edu
Newsgroups: comp.lang.python
Subject: Re: problem with the logic of read files
Date: 12 Apr 2005 10:47:17 -0700
<m_t...@yahoo.com> wrote:
I am new to Python and I am not in computer science. In fact I am a biologist and I am trying to learn Python. So if someone can help me, I
will appreciate it. Thanks
#!/cbi/prg/python/current/bin/python
# -*- coding: iso-8859-1 -*-
import sys
import os
from progadn import *

ab1seq = raw_input("Entrez le répertoire où sont les fichiers à analyser: ") or None
if ab1seq == None :
    print "Erreur: Pas de répertoire! \n"
    "\nAu revoir \n"
    sys.exit()

listrep = os.listdir(ab1seq)
#print listrep

extseq=[]

for f in listrep:
    ###### Minor -- this is better said as: if f.endswith(".Seq"):
    if f[-4:]==".Seq":
        extseq.append(f)
        # print extseq

for x in extseq:
    f = open(x, "r")
    ###### seq=... discards previous data and refers only to that just read.
    ###### It would be simplest to process each file as it is read:
    @@@@@@ seq=f.read()
    @@@@@@ checkDNA(seq)
    seq=f.read()
    f.close()
    s=seq

def checkDNA(seq):
    """Retourne une liste des caractères non conformes à l'IUPAC."""
    junk=[]
    for c in range (len(seq)):
        if seq[c] not in iupac:
            junk.append([seq[c],c])
            #print junk
            print "ATTN: Il y a le caractère %s en position %s " % (seq[c],c)
    if junk == []:
        indinv=range(len(seq))
        indinv.reverse()
        resultat=""
        for i in indinv:
            resultat +=comp[seq[i]]
        return resultat

seq=checkDNA(seq)
print seq
##### The program segment you posted did not define "comp" or "iupac",
##### so it's a little hard to guess how it's supposed to work. It would
##### be helpful if you gave a concise description of what you want the
##### program to do, as well as a brief sample of input data.
##### I hope this helps! -- George
# I got the following (as you can see, only one file is processed by the function even though there are more files in extseq):
['B1-11_win3F_B04_04.ab1.Seq']
['B1-11_win3F_B04_04.ab1.Seq', 'B1-11_win3R_C04_06.ab1.Seq']
['B1-11_win3F_B04_04.ab1.Seq', 'B1-11_win3R_C04_06.ab1.Seq', 'B1-18_win3F_D04_08.ab1.Seq']
['B1-11_win3F_B04_04.ab1.Seq', 'B1-11_win3R_C04_06.ab1.Seq', 'B1-18_win3F_D04_08.ab1.Seq', 'B1-18_win3R_E04_10.ab1.Seq']
['B1-11_win3F_B04_04.ab1.Seq', 'B1-11_win3R_C04_06.ab1.Seq', 'B1-18_win3F_D04_08.ab1.Seq', 'B1-18_win3R_E04_10.ab1.Seq',
'B1-19_win3F_F04_12.ab1.Seq']
..
['B1-11_win3F_B04_04.ab1.Seq', 'B1-11_win3R_C04_06.ab1.Seq', 'B1-18_win3F_D04_08.ab1.Seq', 'B1-18_win3R_E04_10.ab1.Seq',
'B1-19_win3F_F04_12.ab1.Seq', 'B1-19_win3R_G04_14.ab1.Seq',
'B90_win3F_H04_16.ab1.Seq', 'B90_win3R_A05_01.ab1.Seq',
'DL2-11_win3F_H03_15.ab1.Seq', 'DL2-11_win3R_A04_02.ab1.Seq',
'DL2-12_win3F_F03_11.ab1.Seq', 'DL2-12_win3R_G03_13.ab1.Seq',
'M7757_win3F_B05_03.ab1.Seq', 'M7757_win3R_C05_05.ab1.Seq',
'M7759_win3F_D05_07.ab1.Seq', 'M7759_win3R_E05_09.ab1.Seq',
'TCR700-114_win3F_H05_15.ab1.Seq', 'TCR700-114_win3R_A06_02.ab1.Seq',
'TRC666-100_win3F_F05_11.ab1.Seq', 'TRC666-100_win3R_G05_13.ab1.Seq']
After this listing, my program processes only the last element of the list (TRC666-100_win3R_G05_13.ab1.Seq):

NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNTCCCGAAGTGTCCCAGAGCA AATAAATGGACCAAAACGTTTTTAG=
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNTCCCGAAGTGTCCCAGAGCA AATAAATGGACCAAAACGTTTTTAG=
AATACTTGAACGTGTAATCTCATTTTAA
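
The symptom described above appears to follow from calling checkDNA once, after the reading loop has already finished, so that seq holds only the last file's contents. George's suggestion of processing each file as it is read could look like the sketch below, written in Python 3 syntax. The helper process_directory and its check argument are hypothetical names. os.path.join is used because os.listdir returns bare file names, which open() would otherwise resolve against the current directory rather than the chosen folder.

```python
import os
import tempfile


def process_directory(directory, check):
    """Call check() on the contents of every .Seq file, one file at a time."""
    results = {}
    for name in sorted(os.listdir(directory)):
        if not name.endswith(".Seq"):
            continue
        # os.listdir() returns bare names, so build the full path.
        path = os.path.join(directory, name)
        with open(path, "r") as handle:
            seq = handle.read().strip()   # drop surrounding whitespace/newline
        # The call sits inside the loop, so each file is checked before
        # seq is rebound to the next file's contents.
        results[name] = check(seq)
    return results


# Small demonstration with throwaway files; plain reversal stands in for
# the real check so the example stays self-contained.
with tempfile.TemporaryDirectory() as tmp:
    for name, text in (("a.Seq", "GATT\n"), ("b.Seq", "CCA\n"), ("c.txt", "x")):
        with open(os.path.join(tmp, name), "w") as handle:
            handle.write(text)
    print(process_directory(tmp, lambda s: s[::-1]))
    # {'a.Seq': 'TTAG', 'b.Seq': 'ACC'}
```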

**********End Of Post*************


Jul 18 '05 #1

My code so far:
# -*- coding: iso-8859-1 -*-
import sys
import os
from progadn import *

ab1seq = raw_input("Entrez le répertoire où sont les fichiers à analyser: ") or None

It would be better to use sys.argv to specify the directory on the program's command line:

import sys
help(sys.argv)

if ab1seq == None :
    print "Erreur: Pas de répertoire! \n" \
          "\nAu revoir \n"
    sys.exit()
I suggest:

import os, os.path, sys

def usage():
    print "documentation..."
    sys.exit(-1)

args = sys.argv[1:]

if not args:
    usage()

files = []
for path in args:
    if os.path.isfile( path ):
        files.append( path )
    elif os.path.isdir( path ):
        files.extend( [os.path.join( path, fname ) for fname in os.listdir( path )] )
    else:
        print "%s n'est ni un fichier ni un répertoire..." % path
        usage()

files = [ fname for fname in files if fname.endswith( ".Seq" ) ]

if not files:
    print "Aucun fichier à traiter."
    usage()

print "Fichier à traiter :"
print ", ".join( files )

for path in files:
    print path
    checkDNA( open( path ).read() )
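
The argument handling suggested above can also be packaged as a function, which makes it easy to try on its own. The following is a sketch in Python 3 syntax. The name collect_seq_files and its extension parameter are hypothetical, and raising ValueError replaces the usage() call of the suggestion.

```python
import os


def collect_seq_files(paths, extension=".Seq"):
    """Expand files and directories into a list of paths ending in extension."""
    found = []
    for path in paths:
        if os.path.isfile(path):
            found.append(path)
        elif os.path.isdir(path):
            # Join each name with its directory so the result can be opened
            # from any working directory.
            found.extend(os.path.join(path, name)
                         for name in sorted(os.listdir(path)))
        else:
            raise ValueError("%s is neither a file nor a directory" % path)
    return [p for p in found if p.endswith(extension)]
```

In a script this would typically be called as collect_seq_files(sys.argv[1:]), so that both individual files and whole folders can be given on the command line.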
def checkDNA(seq):
    """Retourne une liste des caractères non conformes à l'IUPAC."""

    junk=[]
    for c in range (len(seq)):
        if seq[c] not in iupac:
            junk.append([seq[c],c])
            #print junk
            print "ATTN: Il y a le caractère %s en position %s " % (seq[c],c)
    if junk == []:
        indinv=range(len(seq))
        indinv.reverse()
        resultat=""
        for i in indinv:
            resultat +=comp[seq[i]]
        return resultat

I have rewritten your function a little in a more "Pythonic" way; it must of course be placed in the program before it is called!

def checkDNA( seq ):
    seq = seq.strip()
    if not seq:
        print "Fichier vide."
        return
    resultat = []
    for i,c in enumerate(seq):
        try:
            resultat.append( comp[c] )
        except KeyError:
            print "Caractère <%s> en position <%d> invalide" % (c,i)
    resultat.reverse()
    return ''.join( resultat )

seq=checkDNA(seq)
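
For comparison, the reverse complement can also be computed with str.maketrans and str.translate, which is a different technique from the character-by-character loop in the rewritten checkDNA. This is a sketch in Python 3 syntax. The names SRC, DST, TABLE and reverse_complement_translate are hypothetical, and the pairs are taken from the comp table posted in this thread. Because translate leaves unknown characters unchanged, the sketch validates the sequence first.

```python
# Base -> complement pairs as posted in the thread, upper case.
SRC = "ATGCRYMKSWBVDHN"
DST = "TACGYRKMWSVBHDN"
# Translation table covering upper and lower case.
TABLE = str.maketrans(SRC + SRC.lower(), DST + DST.lower())
VALID = set(SRC + SRC.lower())


def reverse_complement_translate(seq):
    """Reverse-complement seq, raising ValueError on characters outside VALID."""
    seq = seq.strip()
    unknown = sorted(set(seq) - VALID)
    if unknown:
        raise ValueError("invalid characters: %s" % "".join(unknown))
    # Complement every base in one pass, then reverse the string.
    return seq.translate(TABLE)[::-1]


print(reverse_complement_translate("GATTaca\n"))   # tgtAATC
```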


**********End Of Post*************


Jul 18 '05 #2
