Bytes | Software Development & Data Engineering Community
global search of motif

Hi all,

I am writing a script to find the exact positions of different variants of a motif in a sequence. I have two arrays: one with the sequence and one with the different variants of the motif (e.g. small_motif_a CGTCGCACAGC). The problem is that my output makes no sense: it reports positions that do not even lie within my sequence, and it finds only one position. There must be something wrong with the search in the loop...

    #!/usr/bin/perl -w
    use strict;
    use Bio::Perl;

    # This script will find the exact location of motifs in a DNA sequence

    my $dnafilename;
    my $varfilename;
    my $refseq;
    my $pos;
    my $motif;
    my $refseqtemp;
    my @dna;
    my $dna;
    my @var;
    my $var;
    @var=();
    @dna=();


    print "Enter the filename of your input file with the sequence:= ";  # File that contains your DNA sequence
    chomp ($dnafilename=<STDIN>);
    open(DNAINPUT,'<',$dnafilename) or die ("$dnafilename Can not open file\n");

    print "Enter the filename of your input file with the variants:= ";  # File that contains your different variants
    chomp ($varfilename=<STDIN>);
    open (VARINPUT,'<',$varfilename) or die ("$varfilename Can not open file\n");

    @dna=<DNAINPUT>;  # stores the sequence in an array
    # print @dna;

    while (<VARINPUT>)
    {  # stores the variants in an array
        chomp;
        push @var, $_;
    }

    $refseq= $dna[0];
    chomp $refseq;
    $pos=0;  # position of the first base


    my $temp;
    $temp=1;

    foreach my $thing (@var)
    {
        $motif=$thing;
        $refseqtemp = $refseq;  # start for every motif with the whole sequence
        print "$motif\n";
        $motif =~ s/^\w*\s//;
        print "$motif\n";
        while(length($refseq)>0 && $temp>0)  # as long as there is a rest sequence (>0)
        {
            $temp=0;
            my $length = length($refseq);
            print "length now $length\n";

            if ($refseq=~ /($motif)/ig)  # search for the motif (variant)
            {
                print "found\n";
                $pos= $pos + length($')-1;
                print "pos $pos\n";
                $pos= $pos + length($&);
                $refseq= $';
                $dna=$dna+1;
                $temp=1;
            }
            else {
                print "Sorry, no match\n";
            }
        }


    }

Maybe someone has an idea... any help appreciated

Cheers,
Anja
Oct 20 '10 #1
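
A few things in the posted loop would explain both symptoms. length($') is the length of the text after the match, so adding it to $pos does not give the offset of the hit (that would be the length of the text before the match). $refseq is overwritten with $' and never restored, so $refseqtemp is unused and every later variant only sees what is left of the sequence. $temp is set to 0 at the top of the loop and only set back to 1 on a hit, so after the first miss the while loop is never entered again. $pos is also never reset between variants.

Here is a minimal sketch of one possible approach, not the only one: let the regex engine walk the sequence with /g in a while loop and read each hit's offset from @-, instead of chopping the sequence up by hand. The sub name find_motif_positions and the sample sequence are made up for illustration. It assumes each variant line has the form "name motif", that the motif should be matched literally (hence \Q...\E), and that 1-based positions are wanted.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return the 1-based start positions of every
# occurrence of $motif in $seq (case-insensitive, literal match).
sub find_motif_positions {
    my ($seq, $motif) = @_;
    return () unless defined $motif && length $motif;

    my @positions;
    # The zero-width lookahead (?=...) consumes no characters, so the
    # search resumes one base further on and overlapping hits are found.
    # $-[0] is the 0-based offset where the last successful match began.
    while ($seq =~ /(?=\Q$motif\E)/gi) {
        push @positions, $-[0] + 1;
    }
    return @positions;
}

# Made-up sample data; in the original script these would come from
# the sequence file and the variants file.
my $refseq = 'ACGTCGCACAGCTTCGTCGCACAGCA';
my @var    = ('small_motif_a CGTCGCACAGC', 'small_motif_b TTTT');

for my $line (@var) {
    # Assumption: the name and the motif are separated by whitespace.
    my ($name, $motif) = split ' ', $line, 2;

    # The sequence itself is never modified, so every variant is
    # searched against the whole sequence.
    my @hits = find_motif_positions($refseq, $motif);

    if (@hits) {
        print "$name ($motif): @hits\n";
    }
    else {
        print "$name ($motif): no match\n";
    }
}
```

With the sample data this reports small_motif_a at positions 2 and 15 and no match for small_motif_b. If overlapping hits should not be counted, drop the (?=...) and match /\Q$motif\E/gi directly. Also note that $dna[0] only holds the first line of the sequence file, so if the sequence spans several lines you would probably want to chomp and join all of them first.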