Looping two files and count string occurrences of 2nd file in lines of first file

I need to generate the permutations of the four nucleotides (A, T, G, C) for di-composition (e.g. AA, AT, AG, AC), tri-composition (AAA, AAT, AAC, AAG), tetra, penta and so on, one length at a time. Then, for a second file that contains sequences with associated values, I need to count the occurrences of each permutation in each sequence. I have generated the permutation list. Now I need to loop through the sequences only (splitting each sequence from its value), count each of the permutations generated above, and write the output to a new file. But I'm getting counts for only the first sequence and not for the others.

The logic of the program I tried to follow is:

Generate the permutations of ATCG in file1 (e.g. AT AG AC AA ...)
Read the generated file1 and the sequence#value file (DNA_seq_val.txt)
Read the sequences and separate the sequences from the values
Loop through the sequences for the permutations and write their occurrence counts with the value (each separated by a comma) to the results file.

The input test file is DNA_seq_val.txt:
AAAATTTT#99
CCCCGGGG#77
ATATATCGCGCG#88
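
The first three steps above can be sketched without an intermediate file. This is only an illustration under assumptions: the helper names `kmers` and `parse_record` are hypothetical and do not appear in the original program, and each input line is assumed to contain exactly one `#` separator.

```python
from itertools import product

def kmers(k, alphabet="ACGT"):
    """Return every length-k string over the alphabet, in product order."""
    return ["".join(p) for p in product(alphabet, repeat=k)]

def parse_record(line):
    """Split 'SEQUENCE#value' into (sequence, value), dropping the newline."""
    seq, value = line.rstrip("\n").split("#", 1)
    return seq, value

print(kmers(2)[:4])                   # ['AA', 'AC', 'AG', 'AT']
print(len(kmers(3)))                  # 64
print(parse_record("AAAATTTT#99\n"))  # ('AAAATTTT', '99')
```

Keeping the patterns in a list rather than a file means they can be looped over once per sequence.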

The output I got is:
2,0,0,1,0,0,0,0,0,0,0,0,0,0,0,2,99 AAAATTTT
77 CCCCGGGG
88 ATATATCGCGCG

The output I need is:
2,0,0,1,0,0,0,0,0,0,0,0,0,0,0,2,99 AAAATTTT
x,x,x,x,x,x,x,x,x,x,x,x,x,x,x,x,77 CCCCGGGG
x,x,x,x,x,x,x,x,x,x,x,x,x,x,x,x,88 ATATATCGCGCG
(where x = the corresponding counts, as in the first line)
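
For reference, the 16 numbers in the first row appear to be the counts of the 16 dinucleotides in the order `product('ACGT', repeat=2)` yields them (AA, AC, AG, AT, CA, ..., TT), followed by the value. A small sketch that reproduces that row, assuming `str.count` is the intended way of counting (the function name `count_row` is hypothetical):

```python
from itertools import product

def count_row(seq, k=2):
    """Count each length-k pattern in seq with str.count (non-overlapping)."""
    patterns = ["".join(p) for p in product("ACGT", repeat=k)]
    return [seq.count(pat) for pat in patterns]

row = count_row("AAAATTTT")
print(",".join(str(n) for n in row))
# 2,0,0,1,0,0,0,0,0,0,0,0,0,0,0,2
```

Note that `str.count` counts non-overlapping matches, which is why AAAA contributes 2 for AA rather than 3. If overlapping matches are what is wanted, a different counting method would be needed.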


 
from itertools import product
import os

f2 = open('TRYYY', 'a')

#********Generate the permutations start********
per = product('ACGT', repeat=2)    # ACGT = nucleotides; 2 = dinucleotides (replace 2 with 3 for trinucleotides and so on)
f = open('myfile', 'w')
p = ""
for p in per:
    p = "".join(p)
    f.write(p + "\n")
f.close()
#********Generate the permutations ENDS********

with open('DNA_seq_val.txt', 'r+') as SEQ, open('myfile', 'r+') as TET: #open two files
    SEQ_lines = sum(1 for line in open('DNA_seq_val.txt'))        #count lines in sequences file
    #print (SEQ_lines)
    compo_lines = sum(1 for line in open('myfile'))        #count lines in composition
    #print (compo_lines)
    for lines in SEQ:
        line,val1 = lines.split("#")
        val2 = val1.rstrip('\n')
        val = str(val2)
        line = line.rstrip('\n')
        length =len(line)
        #print (line)
        #print (val)
        LIN = line, val
        #print (LIN)
        newstr = "".join((line))
        print (newstr)
        #while True:        # infinite loop
        for PER in TET:
            #print (line)
            PER = PER.rstrip('\n')
            length2 =len(PER)
            #print (length2)
            #print (line)
            #print (PER)
            C_PER  = str(line.count(PER))
            #print (C_PER)
            for R in C_PER:
                R1 = "".join(R)
                f2.write(R1+ ",")
        f2.write(val,)
        f2.write('\t')
        f2.write(line)
        f2.write('\n')
    #exit()
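
A likely reason only the first sequence gets counts: `TET` is a file object, and `for PER in TET` reads it to the end during the first pass of the outer loop. For the second and third sequences the inner loop has nothing left to read, so only the value and the sequence are written. Separately, `for R in C_PER` walks over the characters of the count string, so a count of 12 would be written as `1,2,`. The sketch below is one possible rewrite, not the only fix; `format_rows` and `count_file` are hypothetical names, and the output layout (counts, value, tab, sequence) is copied from the program above.

```python
from itertools import product

def format_rows(lines, k=2):
    """Yield one 'c1,c2,...,value<TAB>sequence' string per input line."""
    # Build the pattern list once; a list can be looped over repeatedly,
    # unlike a file object, which is exhausted after one full pass.
    patterns = ["".join(p) for p in product("ACGT", repeat=k)]
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue  # skip blank lines
        seq, value = raw.split("#", 1)
        # One string per count, so multi-digit counts stay intact.
        counts = [str(seq.count(pat)) for pat in patterns]
        yield ",".join(counts + [value]) + "\t" + seq

def count_file(in_path, out_path, k=2):
    """Hypothetical wrapper: read in_path and write the rows to out_path."""
    with open(in_path) as src, open(out_path, "w") as dst:
        for row in format_rows(src, k):
            dst.write(row + "\n")

sample = ["AAAATTTT#99\n", "CCCCGGGG#77\n", "ATATATCGCGCG#88\n"]
for row in format_rows(sample):
    print(row)
```

Calling `count_file('DNA_seq_val.txt', 'TRYYY')` would then write one full row per sequence. A smaller change to the original program would be to call `TET.seek(0)` just before the inner loop so the pattern file is re-read for every sequence, at the cost of re-reading the file each time.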

Mar 1 '18 #1


dwblas


*Output I got is --
2,0,0,1,0,0,0,0,0,0,0,0,0,0,0,2,99 AAAATTTT
77 CCCCGGGG
88 ATATATCGCGCG

Output Needed is 2,0,0,1,0,0,0,0,0,0,0,0,0,0,0,2,99 AAAATTTT
x,x,x,x,x,x,x,x,x,x,x,x,x,x,x,x,77 CCCCGGGGx
x,x,x,x,x,x,x,x,x,x,x,x,x,x,x,x,88 ATATATCGCGCG
(where x= corresponding counts as in first line)

That's nice, but how are we to help you get this from an unknown input and what do all these numbers mean, 2,0,0,1,0,0,0,0,0,0,0,0,0,0,0,2, and what about x,x,x,x,x,x,x,x,x,x,x,x,x,x,x,x??? Counting occurrences is relatively simple but there just isn't enough info here.

Mar 1 '18 #2
