global search of motif

Hi all,

I am writing a script to find the exact positions of different variants of a motif in a sequence. I have two arrays: one with the sequence and one with the variants of the motif (e.g. small_motif_a CGTCGCACAGC). The problem is that my output makes no sense: it reports positions that do not even lie within my sequence, and it finds only one position. There must be something wrong with the search in the loop...

#!/usr/bin/perl -w
use strict;
use Bio::Perl;

# This script will find the exact location of motifs in a DNA sequence

my $dnafilename;
my $varfilename;
my $refseq;
my $pos;
my $motif;
my $refseqtemp;
my @dna;
my $dna;
my @var;
my $var;
@var=();
@dna=();


print "Enter the filename of your input file with the sequence:= ";    # File that contains your DNA sequence
chomp ($dnafilename=<STDIN>);
open(DNAINPUT,'<',$dnafilename) or die ("$dnafilename Can not open file\n");

print "Enter the filename of your input file with the variants:= ";    # File that contains your different variants
chomp ($varfilename=<STDIN>);
open (VARINPUT,'<',$varfilename) or die ("$varfilename Can not open file\n");

@dna=<DNAINPUT>;    # stores the sequence in an array
# print @dna;

while (<VARINPUT>)
{    # stores the variants in an array
    chomp;
    push @var, $_;
}

$refseq= $dna[0];
chomp $refseq;
$pos=0;    # position of the first base


my $temp;
$temp=1;

foreach my $thing (@var)
{
    $motif=$thing;
    $refseqtemp = $refseq;    # start for every motif with the whole sequence
    print "$motif\n";
    $motif =~ s/^\w*\s//;
    print "$motif\n";
    while(length($refseq)>0 && $temp>0)    # as long as there is a rest sequence (>0)
    {
        $temp=0;
        my $length = length($refseq);
        print "length now $length\n";

        if ($refseq=~ /($motif)/ig)    # search for the motif (variant)
        {
            print "found\n";
            $pos= $pos + length($')-1;
            print "pos $pos\n";
            $pos= $pos + length($&);
            $refseq= $';
            $dna=$dna+1;
            $temp=1;
        }
        else {
            print "Sorry, no match\n";
        }
    }


}

Maybe someone has an idea... any help appreciated

Cheers,
Anja
Oct 20 '10 #1
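
A few things in the loop look like they could produce exactly these symptoms. `$'` holds the text after the match, so `length($')` counts the bases that follow the hit; the offset of the hit is the length of the text before it (`` $` ``). `$pos` is never reset between motifs, `$refseq` is overwritten with `$'` and never restored from `$refseqtemp`, and once `$temp` has been set to 0 by a failed search the `while` loop is not entered again for any later variant. Also note that `$dna[0]` is only the first line of the file, which would be the header line if the file is in FASTA format.

Below is a minimal sketch of one way to collect all positions, not a drop-in replacement. It assumes the variants are plain strings (hence `\Q...\E`), that 1-based positions are wanted, and that overlapping hits should be reported. The helper name `find_motif_positions` and the sample sequence and variants are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: returns the 1-based start position of every
# occurrence of $motif in $seq (case-insensitive, overlaps included).
sub find_motif_positions {
    my ($seq, $motif) = @_;
    return () unless defined $motif && length $motif;   # nothing to search for

    my @starts;
    # \Q...\E treats the motif as literal text (assumption: no regex syntax in it)
    while ($seq =~ /\Q$motif\E/gi) {
        push @starts, $-[0] + 1;   # $-[0] is the 0-based offset where the match starts
        pos($seq) = $-[0] + 1;     # resume one base after the hit start to catch overlaps
    }
    return @starts;
}

# Made-up data standing in for the two input files
my $refseq = 'AACGTCGCACAGCTTCGTCGCACAGCAA';
my @var    = ('small_motif_a CGTCGCACAGC', 'small_motif_b GGGGG');

foreach my $line (@var) {
    my $motif = $line;
    $motif =~ s/^\w*\s+//;         # drop the leading name, as in the original script

    # The sequence is passed by value, so every variant searches the whole sequence
    my @starts = find_motif_positions($refseq, $motif);
    if (@starts) {
        print "$motif found at: @starts\n";
    }
    else {
        print "$motif: no match\n";
    }
}
```

Because the search runs on a copy of the sequence inside the sub, nothing has to be chopped off or restored between variants, and no running `$pos` counter is needed. If the variants are meant to be regular expressions rather than literal strings, the `\Q...\E` would have to go.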