Matching parts of lines in a file (python)

I currently have a list of genes in a file. Each line has a chromosome with its information. Such an entry appears as:

NM_198212 chr7 + 115926679 115935830 115927071 11593344 2 115926679,'115933260', 115927221,'115935830',

The sequence for the chromosome starts at base 115926679 and continues up to (but not including) base 115935830.

If we want the spliced sequence, we use the exons. The first extends from 115926679 to 115927221, and the second goes from '115933260' to '115935830'.

However, I have run across a problem when on a complementary sequence such as:

NM_001005286 chr1 - 245941755 245942680 245941755 245942680 1 245941755, '245942680'

Since column 3 is a '-', these coordinates are in reference to the anti-sense strand (the complement to the strand). The first base (in bold) matches the last base on the sense strand (in italics). Since the file only has the sense strand, I need to try to translate coordinates on the anti-sense strand to the sense strand, pick out the right sequence and then reverse-complement it.

That said, I have only been programming for about half a year and am not sure how to start going about this.

I have written a regular expression:
'(NM_\d+)\s+(chr\d+)\s+([+-])\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+),\s*(\d+),\s+(\d+),\s*(\d+),'

I just made some bold, some italics, and some in quotes to show the different parts I was trying to use

but am now unsure as to how to start this function... If anyone can help me get started at all on this, perhaps making me see how to do this, I would very much appreciate it.
Apr 21 '12 #1
dwblas
You want to split on the empty space and go from there.
    rec = "NM_198212 chr7 + 115926679 115935830 115927071 11593344 2"
    # split() with no arguments splits on any run of whitespace
    print(rec.split())
Tutorial on using lists.
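Building on the split() suggestion, here is a minimal sketch of turning one line into named fields and exon (start, end) pairs. The function name parse_gene_line and the dictionary keys are made up for illustration. It assumes the columns are laid out as in the examples in the question: name, chromosome, strand, four coordinates, the exon count, then all exon starts followed by all exon ends. It also assumes the quote marks in the question are only there for emphasis, so they are stripped if present.

```python
def parse_gene_line(line):
    """Parse one gene line into a dict (hypothetical helper, layout assumed
    from the examples in the question)."""
    fields = line.split()
    name, chrom, strand = fields[0], fields[1], fields[2]
    tx_start, tx_end, cds_start, cds_end, exon_count = (int(x) for x in fields[3:8])

    # The exon columns may contain spaces after commas, so split() can break
    # them into several fields. Glue them back together, then split on commas.
    tokens = "".join(fields[8:]).split(",")
    coords = [int(tok.strip("'")) for tok in tokens if tok]

    # Assumed layout: exon_count starts first, then exon_count ends.
    starts = coords[:exon_count]
    ends = coords[exon_count:2 * exon_count]

    return {
        "name": name,
        "chrom": chrom,
        "strand": strand,
        "tx_start": tx_start,
        "tx_end": tx_end,
        "cds_start": cds_start,
        "cds_end": cds_end,
        "exons": list(zip(starts, ends)),
    }


line = ("NM_198212 chr7 + 115926679 115935830 115927071 11593344 2 "
        "115926679,115933260, 115927221,115935830,")
gene = parse_gene_line(line)
print(gene["strand"], gene["exons"])
```

Once the fields are in a dictionary, checking the strand is a plain comparison (gene["strand"] == "-"), which is easier to read and debug than a long regular expression with a dozen groups.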
Apr 22 '12 #2
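
For the splicing and reverse-complement step described in the question, a rough sketch follows. All function names are hypothetical. It assumes the chromosome's sense strand is already loaded as a Python string and that exon coordinates are 0-based and half-open (start included, end excluded), which matches the "up to but not including" wording above. Many gene tables store coordinates on the sense strand even for '-' genes, in which case no translation is needed and only the reverse-complement applies. If your file really does count from the other end, to_sense_coords shows the conversion for half-open intervals.

```python
# Translation table for complementing DNA bases (N stays N).
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Complement every base, then reverse the string."""
    return seq.translate(COMPLEMENT)[::-1]


def to_sense_coords(start, end, chrom_length):
    """Map a half-open interval counted from the far end of the chromosome
    onto the sense strand. Only needed if the file uses anti-sense coordinates
    (an assumption to verify against your data source)."""
    return chrom_length - end, chrom_length - start


def spliced_sequence(chrom_seq, exons, strand):
    """Join the exon slices (sense-strand coordinates) and reverse-complement
    the result for genes on the '-' strand."""
    seq = "".join(chrom_seq[start:end] for start, end in sorted(exons))
    return reverse_complement(seq) if strand == "-" else seq


# Toy 12-base "chromosome" with two exons, for illustration only.
toy = "AAACCCGGGTTT"
print(spliced_sequence(toy, [(0, 3), (6, 9)], "+"))
print(spliced_sequence(toy, [(0, 3), (6, 9)], "-"))
```

Trying the functions on a short made-up string like this before running them on a real chromosome makes it easy to confirm by eye that the slicing and strand handling do what you expect.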
