Perl - How to create all possible sequences from a word

Before giving any solution, we would like to know what you have tried so far.
But, just to give you a hint to start with, all you need to do is to use nested foreach() loops for all the required arrays to get a set of possible combinations.

-Nithin

Hi Nithin,
Thank you for your reply. Yes, I have been trying exactly what you suggested with nested foreach() loops, but I find it really confusing to implement as a program. I am not a regular programmer, so I would be happy to modify existing code of this sort to fit my needs. That's why I posted my question in this forum. I also understand that such nested loops will consume more computation time, so if any advanced programmer has a heuristic approach to this problem, I would very much like to see it.

Oct 20 '08 #3

Icecrack

174

Expert 100+

If you had searched this forum, you would have found something along the lines of what you want.

All I know is that I will not attempt to help you unless some code is posted.

Oct 20 '08 #4

4,059

Expert 2GB

I totally agree with what other people have posted: you need to show some effort before we post some code. That is my personal rule also, 99% of the time. I reserve that last 1% for a question I find interesting enough to violate my own rule.

My approach to solving the problem is not heuristic, but you could say it's part of a heuristic approach to discovering other ways to do the same thing that are hopefully more efficient.

I applied very linear logic to solving the problem, which is almost always how I start. There could be shortcuts or better ways to do this. In fact, I am almost sure there are, but I might not be smart enough to figure them out.

use strict;
use warnings;
use Data::Dumper;

my %hash = (
   V => ['GTT','GTC','GTA','GTG'],
   A => ['GCT','GCC','GCA','GCG'],
   E => ['GAA','GAG'],
   F => ['TTT','TTC'],
   G => ['GGT','GGC','GGA','GGG'],
   H => ['CAT','CAC'],
);

my $word = 'VAEFGH';
my @order = split //, $word;   # the letters of the word, in sequential order

# Collect the length of the codon array for each letter of the word
my @loops;
foreach my $letter (@order) {
   push @loops, scalar @{$hash{$letter}};
}

# Sort the lengths and take the smallest, which I assume is the number of possible permutations
my $i = (sort {$a <=> $b} @loops)[0];

my @perms;   # stores the permutations
foreach my $j (0..$i-1) {   # one pass per permutation
   my $perm = '';   # builds a single permutation
   foreach my $letter (@order) {
      $perm .= $hash{$letter}[$j];   # the $j-th codon for this letter
   }
   push @perms, $perm;
}

print Dumper \@perms;

Now let's hear why the results are not correct ;)

Results using Data::Dumper to print the permutations:

$VAR1 = [
          'GTTGCTGAATTTGGTCAT',
          'GTCGCCGAGTTCGGCCAC'
        ];

I assumed the number of possible permutations is limited by the shortest array of sequences. If that assumption is wrong, you can use the longest array as the number of loops you need to make, but you will have to test each array to make sure you have not gone past the end of the shorter arrays.

If this is school work (which I am sure it is) you should not submit this code as your own work, that would be unethical. Hopefully that is important to you.

Oct 21 '08 #5

4,059

Expert 2GB

This version fills in blank spots with '---':

use strict;
use warnings;
use Data::Dumper;

my %hash = (
   V => ['GTT','GTC','GTA','GTG'],
   A => ['GCT','GCC','GCA','GCG'],
   E => ['GAA','GAG'],
   F => ['TTT','TTC'],
   G => ['GGT','GGC','GGA','GGG'],
   H => ['CAT','CAC'],
);

my $word = 'VAEFGH';
my @order = split //, $word;   # the letters of the word, in sequential order

# Collect the length of the codon array for each letter of the word
my @loops;
foreach my $letter (@order) {
   push @loops, scalar @{$hash{$letter}};
}

# This time take the largest length, so the longest array sets the number of loops
my $i = (sort {$a <=> $b} @loops)[-1];

my @perms;
foreach my $j (0..$i-1) {
   my $perm = '';
   foreach my $letter (@order) {
      # shorter arrays have no $j-th codon, so use the filler instead
      $perm .= $hash{$letter}[$j] || '---';
   }
   push @perms, $perm;
}

print Dumper \@perms;

Change this:

|| '---';

to this to ignore blank spots:

|| '';

Oct 21 '08 #6

Hi Kevin,
Thanks for your help. I just saw your post this morning. Thanks for the code; I will check it out and let you know the outcome very soon.

Regards,
B.Nataraj

Oct 21 '08 #7

4,059

Expert 2GB

The code I posted does not find all possible combinations for the "word". After reviewing the thread, I am not sure that's what you were after anyway. But if you really did want all permutations, that would be a different thing, and the output would grow exponentially.
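To put a number on that growth, here is a minimal sketch that multiplies the number of codons available at each position of the word. The helper name count_combinations is hypothetical (it does not appear in the thread); the codon table is copied from the posts above and covers only those six amino acids.

```perl
use strict;
use warnings;

# Codon table copied from the earlier posts (six amino acids only).
my %codons = (
   V => ['GTT','GTC','GTA','GTG'],
   A => ['GCT','GCC','GCA','GCG'],
   E => ['GAA','GAG'],
   F => ['TTT','TTC'],
   G => ['GGT','GGC','GGA','GGG'],
   H => ['CAT','CAC'],
);

# count_combinations is a hypothetical helper name, not from the thread.
# It multiplies the number of codon choices at each position of the word.
sub count_combinations {
   my ($word, $table) = @_;
   my $count = 1;
   foreach my $letter (split //, $word) {
      die "No codons known for '$letter'\n" unless exists $table->{$letter};
      $count *= scalar @{ $table->{$letter} };
   }
   return $count;
}

print count_combinations('VAEFGH', \%codons), "\n";   # 4*4*2*2*4*2 = 512
```

Because the count is a product, each extra letter in the word multiplies the total by that letter's number of codons, so even a modest protein sequence quickly yields more sequences than can reasonably be printed or stored.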

Oct 21 '08 #8

Icecrack

174

Expert 100+

You should read and not scan; I have learned this the hard way.

:P

And after reading your script, that's what I thought: it was not what he wanted.

Oct 21 '08 #9

4,059

Expert 2GB

hehehe.... I do read the threads. I've been posting on forums for more than 10 years and learned long ago not to get into a thread half-baked. If he really wants all permutations, that should actually be easier to code, if more intensive, since it will generate a lot more sequences. The unbalanced lengths of the arrays are what made me think he wants only the combinations obtained from looping through the set of arrays once. Maybe not though.

Oct 21 '08 #10

Hi Kevin,
Yes, I want all possible permutations while keeping the original length of the word constant. The output should not be shortened (I mean that inserting a dash or a space is not permitted in my problem). This is originally a bioinformatics problem. I did not describe it in the language of bioinformatics because I believed most people in this forum might not be familiar with bioinformatics or molecular biology. But after seeing your interest in this problem, I should explain it in bioinformatics terms. The approach I am taking is called backtranslation: I am trying to translate from a protein sequence to a DNA sequence. A three-letter codon (DNA sequence, e.g. ATT, GCC) codes for a single amino acid, say A, V, E or F. More than one codon can represent an individual amino acid, but the reverse is not true. I hope this gives you some idea of my problem. Could you please suggest modifications to your code (which I believe covers 70% of my problem) to make it 100% successful?

Thanks and regards,
B.Nataraj
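The many-to-one relationship described above can be illustrated by inverting the codon table from the earlier posts. This is only a sketch under that small table: the names %amino_for and translate are hypothetical, and codons outside the six amino acids listed in the thread are reported as '?'.

```perl
use strict;
use warnings;

# Codon table copied from the earlier posts (six amino acids only).
my %codons = (
   V => ['GTT','GTC','GTA','GTG'],
   A => ['GCT','GCC','GCA','GCG'],
   E => ['GAA','GAG'],
   F => ['TTT','TTC'],
   G => ['GGT','GGC','GGA','GGG'],
   H => ['CAT','CAC'],
);

# Invert the table: every codon maps back to exactly one amino acid.
my %amino_for;
foreach my $aa (keys %codons) {
   $amino_for{$_} = $aa for @{ $codons{$aa} };
}

# translate is a hypothetical helper: it reads a DNA string three letters
# at a time and looks up each codon; unknown codons become '?'.
sub translate {
   my ($dna) = @_;
   return join '',
      map { exists $amino_for{$_} ? $amino_for{$_} : '?' }
      $dna =~ /(.{3})/g;
}

print translate('GTTGCTGAATTTGGTCAT'), "\n";   # VAEFGH
print translate('GTGGCGGAGTTCGGGCAC'), "\n";   # VAEFGH
```

Two different DNA strings translate to the same word, which is why backtranslation has to enumerate every codon choice, while translation in the forward direction has a single answer.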

Oct 21 '08 #11

4,059

Expert 2GB

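Generating all the possible permutations is simple:

use strict;
use warnings;

my @V=('GTT','GTC','GTA','GTG');
my @A=('GCT','GCC','GCA','GCG');
my @E=('GAA','GAG');
my @F=('TTT','TTC');
my @G=('GGT','GGC','GGA','GGG');
my @H=('CAT','CAC');

my $i = 1;   # running count of the sequences printed

# One nested loop per letter of the word; the innermost codon varies fastest.
# The loop variables are prefixed with "c" (codon) to avoid a lexical $a,
# which is the name sort uses for its comparison variable.
foreach my $cv (@V) {
   foreach my $ca (@A) {
      foreach my $ce (@E) {
         foreach my $cf (@F) {
            foreach my $cg (@G) {
               foreach my $ch (@H) {
                  print "$i $cv$ca$ce$cf$cg$ch\n";
                  $i++;
               }
            }
         }
      }
   }
}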

$i is there just to show how many permutations there are, in this case 512.
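The nested loops above are tied to one six-letter word. A minimal sketch of the same enumeration for a word of any length follows; it grows the list of partial sequences one letter at a time. The helper name back_translate is hypothetical, and the table only covers the six amino acids used in this thread.

```perl
use strict;
use warnings;

# Codon table copied from the earlier posts (six amino acids only).
my %codons = (
   V => ['GTT','GTC','GTA','GTG'],
   A => ['GCT','GCC','GCA','GCG'],
   E => ['GAA','GAG'],
   F => ['TTT','TTC'],
   G => ['GGT','GGC','GGA','GGG'],
   H => ['CAT','CAC'],
);

# back_translate is a hypothetical helper name, not from the thread.
# It starts with one empty sequence and, for each letter of the word,
# extends every partial sequence with every codon for that letter.
sub back_translate {
   my ($word, $table) = @_;
   my @seqs = ('');
   foreach my $letter (split //, $word) {
      my $choices = $table->{$letter}
         or die "No codons known for '$letter'\n";
      @seqs = map { my $prefix = $_; map { $prefix . $_ } @$choices } @seqs;
   }
   return @seqs;
}

my @all = back_translate('VAEFGH', \%codons);
print scalar(@all), "\n";   # 512
print "$all[0]\n";          # GTTGCTGAATTTGGTCAT
print "$all[-1]\n";         # GTGGCGGAGTTCGGGCAC
```

The sequences come out in the same order as the nested loops, with the last letter's codon varying fastest. The sketch keeps every sequence in memory, so for long words it may be better to print each sequence as it is produced.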

You may also want to look into BioPerl, which is written for exactly this type of work.

Oct 21 '08 #12

Hi Kevin,
Great help indeed, and thanks. I am in the middle of my wet lab work, which I do in addition to my computational work, so I am unable to check your code and report back immediately. I will do it this evening and let you know.

Regards,
B.Nataraj

Oct 21 '08 #13

4,059

Expert 2GB

I should add that this is a much less interesting problem than I had originally anticipated, so I am back to my personal rule. You need to show some effort in solving your coding problems before I will post any more code. I am no more interested in learning bioinformatics than you are in learning Perl programming.

You are welcome,
Kevin(ADC)

Oct 21 '08 #14

Hi Kevin,
The code works just the way I wanted. It feels so simple now, but I panicked when I first saw the problem, and the urge to get it done quickly made me post here. Once again, I am a weekend programmer, and for a regular programmer like you it is simple. Anyway, thanks a lot for your help.

I now have to generalize your code for any input protein sequence. I may have to make it read a text file line by line and convert each line the way the program does now. I hope I can do it within three days, and I will come back to you with my code if I get stuck somewhere, so you won't need to deviate from your personal rule then :-)
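One possible shape for that generalization, offered only as a sketch: read one protein word per line from a filehandle and expand each into its DNA sequences. The helper names back_translate and process_handle are hypothetical, the table is limited to the six amino acids from this thread, and an in-memory filehandle stands in for a real text file so the example runs on its own.

```perl
use strict;
use warnings;

# Codon table copied from the earlier posts (six amino acids only).
my %codons = (
   V => ['GTT','GTC','GTA','GTG'],
   A => ['GCT','GCC','GCA','GCG'],
   E => ['GAA','GAG'],
   F => ['TTT','TTC'],
   G => ['GGT','GGC','GGA','GGG'],
   H => ['CAT','CAC'],
);

# Hypothetical helper: extend every partial sequence with every codon
# of the next letter, one letter at a time.
sub back_translate {
   my ($word, $table) = @_;
   my @seqs = ('');
   foreach my $letter (split //, $word) {
      my $choices = $table->{$letter}
         or die "No codons known for '$letter'\n";
      @seqs = map { my $prefix = $_; map { $prefix . $_ } @$choices } @seqs;
   }
   return @seqs;
}

# Hypothetical helper: one protein word per line, blank lines skipped.
# Returns a hash reference of word => array reference of DNA sequences.
sub process_handle {
   my ($fh, $table) = @_;
   my %result;
   while (my $line = <$fh>) {
      $line =~ s/^\s+|\s+$//g;   # trim whitespace and line endings
      next if $line eq '';
      my $word = uc $line;
      $result{$word} = [ back_translate($word, $table) ];
   }
   return \%result;
}

# Assumption: for a real file, open(my $fh, '<', 'proteins.txt') would be
# used instead, where proteins.txt is a made-up file name.
my $input = "EF\nHE\n\nVA\n";
open(my $fh, '<', \$input) or die "Cannot open input: $!";
my $by_word = process_handle($fh, \%codons);
close $fh;

foreach my $word (sort keys %$by_word) {
   print "$word: @{ $by_word->{$word} }\n";
}
```

Words containing a letter that is missing from the table make the sketch die with a message; extending %codons to the full genetic code would be the first step before using it on real protein sequences.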

Regards,
B.Nataraj

Oct 21 '08 #15
