Looping two files and count string occurrences of 2nd file in lines of first file

I need to generate permutations of the nucleotides A, T, G and C for di-composition (e.g. AA, AT, AG, AC), tri-composition (AAA, AAT, AAC, AAG), tetra, penta, etc. (one size at a time), and then count the occurrences of each permutation in a second file that contains sequences with values. I have generated the permutation list. Now I need to loop through the sequences only (splitting each sequence from its value), count each of the permutations generated above, and write the output to a new file. But I'm getting counts for only the first sequence and not for the other sequences.

The logic of the program I tried to follow is:

Generate the permutations of ATCG in file1 (e.g. AT AG AC AA ...)
Read the generated file1 and the sequence#value file (DNA_seq_val.txt)
Read the sequences and separate the sequences from the values
Loop through the sequences for the permutations and print their occurrences with the values (each separated by a comma) in the results file.

The input test file is DNA_seq_val.txt:
AAAATTTT#99
CCCCGGGG#77
ATATATCGCGCG#88

Output I got:
2,0,0,1,0,0,0,0,0,0,0,0,0,0,0,2,99 AAAATTTT
77 CCCCGGGG
88 ATATATCGCGCG

Output needed:
2,0,0,1,0,0,0,0,0,0,0,0,0,0,0,2,99 AAAATTTT
x,x,x,x,x,x,x,x,x,x,x,x,x,x,x,x,77 CCCCGGGG
x,x,x,x,x,x,x,x,x,x,x,x,x,x,x,x,88 ATATATCGCGCG
(where x = the corresponding counts, as in the first line)

from itertools import product
import os

f2 = open('TRYYY', 'a')

#********Generate the permutations start********
per = product('ACGT', repeat=2)    # ACGT = nucleotides; 2 = dinucleotides (replace 2 with 3 for trinucleotides and so on)
f = open('myfile', 'w')
p = ""
for p in per:
    p = "".join(p)
    f.write(p + "\n")
f.close()

#********Generate the permutations ENDS********

with open('DNA_seq_val.txt', 'r+') as SEQ, open('myfile', 'r+') as TET: #open two files
    SEQ_lines = sum(1 for line in open('DNA_seq_val.txt'))        #count lines in sequences file
    #print (SEQ_lines)
    compo_lines = sum(1 for line in open('myfile'))        #count lines in composition
    #print (compo_lines)
    for lines in SEQ:
        line,val1 = lines.split("#")
        val2 = val1.rstrip('\n')
        val = str(val2)
        line = line.rstrip('\n')
        length =len(line)
        #print (line)
        #print (val)
        LIN = line, val
        #print (LIN)
        newstr = "".join((line))
        print (newstr)
        #while True:        # infinite loop
        for PER in TET:
            #print (line)
            PER = PER.rstrip('\n')
            length2 =len(PER)
            #print (length2)
            #print (line)
#            print (PER)
            C_PER  = str(line.count(PER))
#            print (C_PER)
            for R in C_PER:
                R1 = "".join(R)
                f2.write(R1+ ",")
        f2.write(val,)
        f2.write('\t')
        f2.write(line)
        f2.write('\n')
    #exit()
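
A likely cause of the empty rows, offered as a sketch rather than a confirmed diagnosis: a file object is a one-shot iterator, so after the inner `for PER in TET` loop reads `myfile` to the end for the first sequence, it yields nothing for the later ones. The example below uses `io.StringIO` as a stand-in for the open file (an assumption made so the snippet runs on its own) and shows the behaviour plus two common remedies.

```python
import io

# StringIO stands in for the open 'myfile' handle (illustration only).
tet = io.StringIO("AA\nAC\nAG\n")

first_pass = [line.strip() for line in tet]
second_pass = [line.strip() for line in tet]   # iterator is already exhausted

print(first_pass)    # ['AA', 'AC', 'AG']
print(second_pass)   # []

# Remedy 1: rewind the handle before iterating over it again.
tet.seek(0)
rewound_pass = [line.strip() for line in tet]

# Remedy 2: read the words once into a list and loop over the list
# as many times as needed.
tet.seek(0)
kmers = [line.strip() for line in tet]
counts = [[seq.count(k) for k in kmers] for seq in ("AAAATTTT", "CCCCGGGG")]
print(counts)        # [[2, 0, 0], [0, 0, 0]]
```

In the posted script, either remedy would apply: call `TET.seek(0)` just before the inner loop, or read `myfile` into a list before the outer loop starts.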
Mar 1 '18 #1
dwblas
Output I got:
2,0,0,1,0,0,0,0,0,0,0,0,0,0,0,2,99 AAAATTTT
77 CCCCGGGG
88 ATATATCGCGCG

Output needed:
2,0,0,1,0,0,0,0,0,0,0,0,0,0,0,2,99 AAAATTTT
x,x,x,x,x,x,x,x,x,x,x,x,x,x,x,x,77 CCCCGGGG
x,x,x,x,x,x,x,x,x,x,x,x,x,x,x,x,88 ATATATCGCGCG
(where x = the corresponding counts, as in the first line)

That's nice, but how are we to help you get this from an unknown input? What do all these numbers mean (2,0,0,1,0,0,0,0,0,0,0,0,0,0,0,2), and what about x,x,x,x,x,x,x,x,x,x,x,x,x,x,x,x? Counting occurrences is relatively simple, but there just isn't enough info here.
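
One possible reading of the numbers, stated as an assumption: they are the counts of the sixteen dinucleotides in the order that `product('ACGT', repeat=2)` yields them (AA, AC, AG, AT, CA, ..., TT), followed by the value after the `#`. A minimal sketch under that reading, with the three sample lines inlined instead of read from DNA_seq_val.txt; `composition_row` is a hypothetical helper name, not something from the original post.

```python
from itertools import product

def composition_row(line, k=2):
    """Build one output row: k-mer counts, the value, a tab, the sequence."""
    seq, value = line.strip().split("#")
    # Every k-letter word over ACGT, in itertools.product order.
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    # str.count counts non-overlapping matches.
    counts = [str(seq.count(kmer)) for kmer in kmers]
    return ",".join(counts + [value]) + "\t" + seq

# Sample lines taken from the question.
sample = ["AAAATTTT#99", "CCCCGGGG#77", "ATATATCGCGCG#88"]
rows = [composition_row(line) for line in sample]
for row in rows:
    print(row)
```

Because `str.count` counts non-overlapping matches, AA scores 2 in AAAATTTT, which agrees with the first row shown in the question; an overlapping count would give 3. Joining whole counts with `",".join` also avoids splitting a count of 10 or more into separate digits, which the `for R in C_PER` loop in the posted code would do.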
Mar 1 '18 #2
