Bytes IT Community

average by user defined cutoff

P: 55
Hi all,
I am trying to calculate average values from different parts of the same data file. For example, given the numbers 1 to 10, I want the average of the first 3 values, then of the next 4 values, then of the last 3 values, and finally the average of those three averages. I have written some code, but I guess it is not a very good way to calculate it, and sometimes I get garbage values as well.

Here is a data file:
    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
So when the user specifies the different parts as arguments, the average of each part should be calculated.

Here is my Perl code:
    #!/usr/bin/perl

    use strict;
    use warnings;

    my $file = $ARGV[0];
    my $cut1 = $ARGV[1];
    my $cut2 = $ARGV[2];
    my $cut3 = $ARGV[3];
    my (@tt, $result1, $result2, $result3);

    open (A, $file) or die "Cannot open $file: $!";
    my ($count, $total) = (0, 0);   # initialise both counters, not just $count
    my ($val, $result);

    while (<A>)
    {
        my @temp = split (' ', $_);   # split ' ' ignores leading whitespace
        $val = $temp[0];              # the number is the first field on the line
        $total = $total + $val;
        $count++;

        # end of the first part
        if ($count == $cut1)
        {
            $result1 = sprintf("%.3f", $total/$count);
            push(@tt, $result1);
            my $s1 = $total;
            my $r1 = $total/$count;
            print "$s1\t$count\t$r1\n";
        }

        # end of the second part
        if ($count == ($cut2 + $cut1))
        {
            $result2 = sprintf("%.3f", ($total - ($result1*$cut1))/($count - $cut1));
            push(@tt, $result2);
            my $s2 = ($total - ($result1*$cut1));
            my $c2 = ($count - $cut1);
            my $r2 = $s2/$c2;
            print "$s2\t$c2\t$r2\n";
        }

        # end of the third part
        if ($count == ($cut3 + $cut2 + $cut1))
        {
            # parenthesise the numerator so the whole difference is divided
            $result3 = ($total - (($result1*$cut1) + ($result2*$cut2))) / ($count - ($cut1 + $cut2));
            push(@tt, $result3);
            my $s3 = ($total - (($result1*$cut1) + ($result2*$cut2)));
            my $c3 = $count - ($cut1 + $cut2);
            my $r3 = $s3/$c3;
            print "$s3\t$c3\t$r3\n";
        }
    }
    close A;

    # average of the per-part averages
    my $add = 0;
    foreach my $r (@tt)
    {
        $add = $add + $r;
    }
    print "$add/" . scalar(@tt) . "\t";
    my $final = sprintf("%.3f", $add/scalar(@tt));
    print "$final\n";

Right now I can take 3 user cutoffs, but I want the program to accept any number of cutoffs when calculating the averages.
I guess there must be a better way to calculate the averages from different sections of the same data file.
Here I have used just the numbers 1 to 10 as an example, but my actual data files have 16900 lines, and I have to calculate the averages using different parts of the file.
Any help in this regard is appreciated.
Thanks
Kumar
Jun 7 '09 #1
5 Replies


KevinADC
Expert 2.5K+
P: 4,059
If you have a file with just numbers on each line, why are you using split?

    my @temp = split (/\s+/,$_);
Anyway, something like this seems easier:

  1. #!/usr/bin/perl                                                                                                                                              
  2.  
  3. use strict;                                                                                                                                                 
  4. use warnings;                                                                                                                                               
  5.  
  6. my $file = $ARGV[0];
  7. my $cut1 = $ARGV[1];
  8. my $cut2 = $ARGV[2];
  9. my $cut3 = $ARGV[3];
  10.  
  11. my (@cut1, @cut2, @cut3);
  12.  
  13. open (my $IN, $file) or die "$!";
  14. push @cut1, chomp <$IN> for (1..$cut1);
  15. push @cut2, chomp <$IN> for ($cut1+1..$cut2);
  16. push @cut3, chomp <$IN> for ($cut2+1..$cut3);
  17. close $IN;
  18.  
  19. average(\@cut1,\@cut2,\@cut3);
  20.  
  21. sub average {
  22.    my @arrays = @_;
  23.    my $sum;
  24.    foreach my $list (@arrays) {
  25.       $sum += $_ for @{$list};
  26.       my $avg = $sum / @{$list};
  27.       print "Average = $avg\n";
  28.       $sum = 0;
  29.    }
  30. }
Jun 7 '09 #2

P: 55
Thanks KevinADC for the reply,
I ran your code on a file with the numbers 1 to 10, but it throws an error: "Can't modify <HANDLE> in chomp at new.pl line 14, near "<$IN> for "
Execution of new.pl aborted due to compilation errors."
I looked into the error, and when I removed the chomp it compiled fine, but now I get "Illegal division by zero at new.pl line 27." and I have not been able to get rid of it.

Thanks
Kumar
Jun 8 '09 #3
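
Both errors reported above can be reproduced without a data file. The first comes from `chomp`, which modifies its argument in place and returns the number of characters removed, so it needs a variable rather than `<$IN>` directly. The second looks like a consequence of the range bounds: if the arguments are segment lengths (3 4 3, an assumption taken from the example in the question), then `$cut2+1 .. $cut3` is an empty range and the third array stays empty. A minimal sketch of the range arithmetic, with the array names chosen only for illustration:

```perl
use strict;
use warnings;

# Cutoffs given as segment lengths, as in the 3 4 3 example from the question.
my ($cut1, $cut2, $cut3) = (3, 4, 3);

# Ranges as written in reply #2: the upper bounds are used as if they
# were absolute line numbers, although the arguments are lengths.
my @first  = (1 .. $cut1);           # 1..3
my @second = ($cut1 + 1 .. $cut2);   # 4..4
my @third  = ($cut2 + 1 .. $cut3);   # 5..3, which is an empty list

# Ranges with cumulative upper bounds.
my @second_fixed = ($cut1 + 1 .. $cut1 + $cut2);                   # 4..7
my @third_fixed  = ($cut1 + $cut2 + 1 .. $cut1 + $cut2 + $cut3);   # 8..10

printf "as posted:  %d %d %d\n", scalar @first, scalar @second, scalar @third;
printf "cumulative: %d %d %d\n", scalar @first, scalar @second_fixed, scalar @third_fixed;
```

This prints `as posted:  3 1 0` and `cumulative: 3 4 3`. An empty third segment means dividing a sum by zero elements, which matches the "Illegal division by zero" message.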

KevinADC
Expert 2.5K+
P: 4,059
Oops, my bad. This new version of the code assumes (for a 10-line file) that cut1, cut2 and cut3 are equal to 3, 4 and 3 respectively; if not, the for() loop conditions need to be adjusted.

    use strict;
    use warnings;

    my $file = $ARGV[0];
    my $cut1 = $ARGV[1];
    my $cut2 = $ARGV[2];
    my $cut3 = $ARGV[3];

    my (@cut1, @cut2, @cut3);

    open (my $IN, $file) or die "$!";
    for (1 .. $cut1) {
       chomp ($_ = <$IN>);
       push @cut1,$_;
    }
    for ($cut1+1 .. $cut1+$cut2){
       chomp ($_ = <$IN>);
       push @cut2, $_;
    }
    for ($cut1+$cut2+1 .. $cut1+$cut2+$cut3){
       chomp ($_ = <$IN>);
       push @cut3, $_;
    }
    close $IN;

    average(\@cut1,\@cut2,\@cut3);

    sub average {
       my @arrays = @_;
       my $sum;
       foreach my $list (@arrays) {
          $sum += $_ for @{$list};
          my $avg = $sum / @{$list};
          print "Average = $avg\n";
          $sum = 0;
       }
    }
Jun 8 '09 #4
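
The original question also asked for any number of cutoffs, not just three. One possible way to do that, shown here as a rough sketch rather than as the approach used in the thread, is to loop over a list of segment lengths and keep only a running sum per segment, so a 16900-line file never has to be held in arrays. The helper name `segment_averages` is made up for this example, and the in-memory filehandle only stands in for the real data file.

```perl
use strict;
use warnings;

# Hypothetical helper: read consecutive segments of the given lengths
# from an open filehandle and return one average per segment.
sub segment_averages {
    my ($fh, @lengths) = @_;
    my @averages;
    for my $len (@lengths) {
        my ($sum, $n) = (0, 0);
        while ($n < $len) {
            my $line = <$fh>;
            last unless defined $line;     # file ended before the segment was full
            chomp $line;
            next unless $line =~ /\S/;     # skip blank lines
            $sum += $line;
            $n++;
        }
        die "No values left for a segment of length $len\n" unless $n;
        push @averages, $sum / $n;
    }
    return @averages;
}

# Demo on an in-memory file holding the numbers 1 to 10, with cutoffs 3 4 3.
my $data = join "\n", 1 .. 10;
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";
my @avg = segment_averages($fh, 3, 4, 3);
close $fh;

print "Average = $_\n" for @avg;
```

With a real file, the handle would be opened on `$ARGV[0]` and the remaining command-line arguments passed as the lengths, for example `segment_averages($fh, @ARGV[1 .. $#ARGV])`.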

P: 55
Thanks so much Kevin,
The output for the numbers 1 to 10, "Average = 2; Average = 5.5; Average = 9", is correct. One last small request: if at the end I want to take the average of these values, how do I modify the subroutine?

Thanks
Kumar
Jun 8 '09 #5

KevinADC
Expert 2.5K+
P: 4,059
Change the subroutine to something like this:

    sub average {
       my @arrays = @_;
       my $sum = 0;
       my @averages;
       my $total = 0;
       foreach my $list (@arrays) {
          $sum += $_ for @{$list};
          my $avg = $sum / @{$list};
          push @averages, $avg;   # store the segment average, not the raw sum
          print "Average = $avg\n";
          $sum = 0;
       }
       ($total += $_) for @averages;
       printf "Total Average = %.3f\n", $total / @averages;
    }

Next time, please show some effort on your part first to solve the problem.
Jun 8 '09 #6
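
The request in #5 is for the average of the per-segment averages. That is not always the same number as the plain average of all values: the two agree for the symmetric 3 4 3 split (both give 5.5) but differ when the segments have unequal lengths. A small illustration, assuming a 2 and 8 split of the numbers 1 to 10 chosen only for this example:

```perl
use strict;
use warnings;
use List::Util qw(sum);

# Two segments of the numbers 1 to 10 with unequal lengths (2 and 8 values).
my @segments = ([1, 2], [3 .. 10]);

# Unweighted: average each segment, then average those averages.
my @averages = map { sum(@$_) / @$_ } @segments;
my $mean_of_averages = sum(@averages) / @averages;

# Weighted: total sum over total count, i.e. the plain average of all values.
my $total   = sum(map { @$_ } @segments);
my $count   = sum(map { scalar @$_ } @segments);
my $overall = $total / $count;

printf "Mean of averages = %.3f\n", $mean_of_averages;
printf "Overall average  = %.3f\n", $overall;
```

This prints `Mean of averages = 4.000` and `Overall average  = 5.500`, so it is worth deciding which of the two the final figure is meant to be.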
