Finding the mismatches

I will describe the complete problem I am facing. Here is the input file I am using:

sxoght: #query  hit     score   probability     qstart  qend    qorientation    tstart  tend    matches mismatches      gapOpening      gaps
@SNPSTER4_104_308EFAAXX:1:1:1694:128
GGGATAAGAGAGGTGCATGTTGGTATTTAAGGTAGT
1 alignment(s) -- reports limited to 10 alignment(s)

sxoght: SNPSTER4_104_308EFAAXX:1:1:1694:128     gi|122939163|ref|NM_000165.3|   -10     1.000000        1       36      +       1595    1630    35      1   00
Score = -10, P(A|R) = 1.000000
Query:          1 GGGATAAGAGAGGTGCATGTTGGTATTTAAGGTAGT 36
                  |||||||||||||||||||||||||||||| |||||
Sbjct:       1595 GGGATAAGAGAGGTGCATGTTGGTATTTAAAGTAGT 1630

@SNPSTER4_104_308EFAAXX:1:1:1608:94
GCAGTTTTAAGTTATTAGTTTTTAAAATCAGTACTT
14 alignment(s) -- reports limited to 10 alignment(s)

sxoght: SNPSTER4_104_308EFAAXX:1:1:1608:94      gi|113412254|ref|XR_018775.1|   0       0.090884        1       36      +       1578    1613    36      0   00
Score = 0, P(A|R) = 0.090884
Query:          1 GCAGTTTTAAGTTATTAGTTTTTAAAATCAGTACTT 36
                  ||||||||||||||||||||||||||||||||||||
Sbjct:       1578 GCAGTTTTAAGTTATTAGTTTTTAAAATCAGTACTT 1613

This is a big file, and the whole file looks like this. What I am trying to do is grep the header (sxoght) lines for display in columns, and also display where there is a mismatch in the alignment between Query and Sbjct. For this input file, the expected results should look like:

>gi|122939163|ref|NM_000165.3| 1595 1630 SNPSTER4_104_308EFAA +XX:1:1:1694:128 1 36 36 -10 1 1.000000 35 mismatch : 1625.GA

>gi|113412254|ref|XR_018775.1| 1578 1613 SNPSTER4_104_308EFAA +XX:1:1:1608:94 1 36 36 0 1 0.090884 36 mismatch : 1581.GT 1612.TG

The code I have written is:

#!/usr/bin/perl
open(FILE,"transalign.output") or die "can not open file";
while($var=<FILE>){
        $str1=();$str2=();
        if($var=~/^sxoght:/){
                @ar=split(/\s+/,$var);
        #print ">$ar[2]\t$ar[8]\t$ar[9]\t$ar[1]\t$ar[5]\t$ar[10]\t$ar[6]\t$ar[3]\t$ar[4]\t$ar[11]\n";
        }
        if($var=~/^Query:/){
                $str1=$var;
                $str1=~s/^Query:\s+//g;
                $str1=~s/\d+\s+//g;
                $str1=~s/\s+//g;
        }
        if($var=~/^Sbjct:/){
                $str2=$var;
                $str2=~s/^Sbjct:\s+//g;
                $str2=~s/\d+\s+//g;
                $str2=~s/\s+//g;
        }
        for($i=0;$i<=length($str1);$i++)
        {
        if(substr($str1,$i,1) ne substr($str2,$i,1)){
                print substr($str1,$i,1);
                print substr($str2,$i,1);
                print "$i\n";
        }
        }
}

I am not able to use strict and warnings, because using them doesn't allow me to access the scalar variables outside the loop. In my code, I am trying to extract the mismatch positions first, so that I can then subtract them from the start and end values already stored in @ar. I am having problems with the for loop. I know where I am going wrong, but I don't know how to correct it. Please help!
Nov 17 '08 #1
2 Replies


numberwhun
First, there are extremely few cases where you cannot use warnings and strict. They help catch the little syntactical errors you will undoubtedly make. If you need to access variables outside of the loop where they get populated, then you need to declare them outside of that loop.
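
As a minimal sketch of that point (the two sample lines and the names `$query` and `$sbjct` are made up for illustration, not taken from your script): a variable declared with `my` before the loop keeps its value between iterations and is still visible after the loop, so strict has nothing to complain about.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample lines standing in for the real input file.
my @lines = (
    "Query:          1 GGGATAAG 8\n",
    "Sbjct:       1595 GGGATAAA 1602\n",
);

# Declared once, outside the loop, so the values survive between
# iterations and are still visible after the loop ends.
my ($query, $sbjct) = ('', '');

for my $line (@lines) {
    # Capture only the sequence that follows the start coordinate.
    if ($line =~ /^Query:\s+\d+\s+(\S+)/) { $query = $1 }
    if ($line =~ /^Sbjct:\s+\d+\s+(\S+)/) { $sbjct = $1 }
}

# Both variables are usable here, outside the loop.
print "query=$query sbjct=$sbjct\n";
```

Had the `my` been placed inside the loop body instead, each iteration would start with fresh, empty variables, which is the behaviour that usually gets blamed on strict.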

I always have a section near the top of my script, just after all the use statements, that is specifically for variable declarations. Try that, run your code, and get rid of any syntax errors. Then, please elaborate on what you mean by "trouble with the loop". You need to be much more specific, even citing any errors.

Regards,

Jeff
Nov 17 '08 #2

nithinpes
Declare $str1 and $str2 outside the while loop.
my ($str1,$str2)=("","");   # start empty so the ne "" checks never see undef
my @ar;
while(my $var=<FILE>){

Modify the relevant section as below:
if($var=~/^Query:/){
        $str1=$var;
        $str1=~s/^Query:\s+//g;
        $str1=~s/\d+\s+//g;     # drop the start and end coordinates
        $str1=~s/\s+//g;
        $str2="";
}
if($var=~/^Sbjct:/){
        $str2=$var;
        $str2=~s/^Sbjct:\s+//g;
        $str2=~s/\d+\s+//g;
        $str2=~s/\s+//g;
}
# compare only once both sequences of an alignment have been read
if(($str1 ne "") && ($str2 ne "")){
        for(my $i=0;$i<length($str1);$i++){
                if(substr($str1,$i,1) ne substr($str2,$i,1)){
                        my $add=$i+$ar[8];      # $ar[8] is tstart from the sxoght: line
                        print " $add.";
                        print substr($str1,$i,1);
                        print substr($str2,$i,1);
                }
        }
        $str1=$str2="";
        print "\n";
}

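For reference, here is a self-contained sketch of the same idea that runs under strict and warnings. It is only one way to do it, and several things in it are assumptions: the sub name `mismatches`, the in-memory `@lines` array (used here instead of opening transalign.output), a trimmed sxoght: line with the trailing gap columns left off, '+' orientation, and alignments without gaps.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return "position.QS" strings, one per mismatching column.
# $tstart is the subject coordinate of the first aligned base.
sub mismatches {
    my ($query, $sbjct, $tstart) = @_;
    my @found;
    # Compare only the columns present in both strings.
    my $len = length($query) < length($sbjct) ? length($query) : length($sbjct);
    for my $i (0 .. $len - 1) {
        my $q = substr($query, $i, 1);
        my $s = substr($sbjct, $i, 1);
        push @found, ($tstart + $i) . ".$q$s" if $q ne $s;
    }
    return @found;
}

# First record from the question, held in memory for this sketch.
my @lines = (
    'sxoght: SNPSTER4_104_308EFAAXX:1:1:1694:128 gi|122939163|ref|NM_000165.3| -10 1.000000 1 36 + 1595 1630 35 1',
    'Query:          1 GGGATAAGAGAGGTGCATGTTGGTATTTAAGGTAGT 36',
    'Sbjct:       1595 GGGATAAGAGAGGTGCATGTTGGTATTTAAAGTAGT 1630',
);

my (@ar, $query, @report);
for my $line (@lines) {
    if ($line =~ /^sxoght:\s+[^#]/) {
        # Skip the "#query hit score ..." column-title line; keep the fields.
        @ar = split /\s+/, $line;
    }
    elsif ($line =~ /^Query:\s+\d+\s+(\S+)/) {
        $query = $1;
    }
    elsif ($line =~ /^Sbjct:\s+\d+\s+(\S+)/ && defined $query) {
        my $sbjct = $1;
        my @mm = mismatches($query, $sbjct, $ar[8]);   # $ar[8] is tstart
        push @report, ">$ar[2] $ar[8] $ar[9] $ar[1] mismatch : @mm";
        undef $query;   # wait for the next Query: line
    }
}

print "$_\n" for @report;
```

For the record above this prints the hit, its start and end, the read name, and `mismatch : 1625.GA`, since the only differing column is 30 bases past tstart 1595. The remaining columns from the expected output can be added to the string from the other `@ar` fields.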
- Nithin
Nov 18 '08 #3
