# Average by user-defined cutoff

P: 55

Hi all,

I am trying to calculate the average value of different parts of the same data file. For example, given the numbers 1 to 10, I want to calculate the average of the first 3 values, then of the next 4 values, then of the last 3 values, and finally the average of those three averages. I have written some code, but I guess it is not a very good way to calculate it, and sometimes I get garbage values as well.

Here is a data file:

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10

When the user specifies the sizes of the different parts as arguments, the averages should be calculated. Here is my Perl code:

    #!/usr/bin/perl
    use strict;
    use warnings;

    my $file = $ARGV[0];
    my $cut1 = $ARGV[1];
    my $cut2 = $ARGV[2];
    my $cut3 = $ARGV[3];
    my (@tt,$result1,$result2,$result3);

    open (A,$file);
    my ($count,$total) = 0;
    my ($val,$result);

    while (<A>) {
        my @temp = split (/\s+/,$_);
        $val = $temp[0];
        $total = $total + $val;
        $count++;

        if ($count == $cut1)
        {
            $result1 = sprintf("%.3f",$total/$count);
            push(@tt,$result1);
            my $s1 = $total;
            my $r1 = $total/$count;
            print "$s1\t$count\t$r1\n";
        }

        if ($count == ($cut2 + $cut1))
        {
            $result2 = sprintf("%.3f",($total-($result1*$cut1))/($count-$cut1));
            push(@tt,$result2);
            my $s2 = ($total - ($result1*$cut1));
            my $c2 = ($count - $cut1);
            my $r2 = $s2/$c2;
            print "$s2\t$c2\t$r2\n";
        }

        if ($count == ($cut3+$cut2+$cut1))
        {
            $result3 = $total - (($result1*$cut1) + ($result2*$cut2)) / ($count - ($cut1+$cut2));
            push(@tt,$result3);
            my $s3 = ($total - (($result1*$cut1) + ($result2*$cut2)));
            my $c3 = $count - ($cut1+$cut2);
            my $r3 = $s3/$c3;
            print "$s3\t$c3\t$r3\n";
        }
    }

    my $add = 0;
    foreach my $r (@tt) {
        $add = $add + $r;
    }
    print "$add/scalar(@tt)\t";
    my $final = sprintf("%.3f",$add/scalar(@tt));
    print "$final\n";

Right now I can take 3 user cutoffs, but I would like the program to take any number of cutoffs when calculating the averages. I guess there must be a better way to calculate the averages of different sections of the same data file. Here I have used just the numbers 1 to 10 as an example, but my actual data files have 16900 lines, and I have to calculate the averages using different parts of the file.

Any help in this regard is appreciated.

Thanks
Kumar

Jun 7 '09 #1
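On the request for any number of cutoffs: one possible approach is to treat the cutoffs as a list of section sizes and walk through the values with a running start index. The following is only a minimal sketch under that assumption. The helper name `section_averages` is hypothetical, and the values are held in memory instead of being read from the poster's file.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: average consecutive sections of @$values,
# where @$sizes gives the number of values in each section.
sub section_averages {
    my ($values, $sizes) = @_;
    my @averages;
    my $start = 0;
    for my $size (@$sizes) {
        die "Section size must be a positive integer\n"
            unless $size =~ /^\d+$/ && $size > 0;
        my $end = $start + $size - 1;
        die "Not enough values for a section of $size\n" if $end > $#$values;
        my $sum = 0;
        $sum += $_ for @{$values}[$start .. $end];
        push @averages, $sum / $size;
        $start = $end + 1;    # the next section begins right after this one
    }
    return @averages;
}

# Demo with the numbers 1..10 and sections of 3, 4 and 3 values.
my @values   = (1 .. 10);
my @averages = section_averages(\@values, [3, 4, 3]);
printf "Average = %.3f\n", $_ for @averages;
```

To use this with a real file, the sizes could be taken from the command line with `my ($file, @sizes) = @ARGV;` and the values read one per line into `@values`. For 16900 lines, holding everything in memory should not be a problem.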
Expert 2.5K+ P: 4,059

If you have a file with just one number on each line, why are you using split?

    my @temp = split (/\s+/,$_);

Anyway, something like this seems easier:

    #!/usr/bin/perl
    use strict;
    use warnings;

    my $file = $ARGV[0];
    my $cut1 = $ARGV[1];
    my $cut2 = $ARGV[2];
    my $cut3 = $ARGV[3];

    my (@cut1, @cut2, @cut3);

    open (my $IN, $file) or die "$!";
    push @cut1, chomp <$IN> for (1..$cut1);
    push @cut2, chomp <$IN> for ($cut1+1..$cut2);
    push @cut3, chomp <$IN> for ($cut2+1..$cut3);
    close $IN;

    average(\@cut1,\@cut2,\@cut3);

    sub average {
       my @arrays = @_;
       my $sum;
       foreach my $list (@arrays) {
          $sum += $_ for @{$list};
          my $avg = $sum / @{$list};
          print "Average = $avg\n";
          $sum = 0;
       }
    }

Jun 7 '09 #2

P: 55

Thanks KevinADC for the reply.

I ran your code on a file with the numbers 1 to 10, but it throws an error:

    Can't modify <HANDLE> in chomp at new.pl line 14, near "<$IN> for "
    Execution of new.pl aborted due to compilation errors.

I looked into the error, and when I removed the chomp the code compiled, but now I get a different error that I have not been able to get rid of:

    Illegal division by zero at new.pl line 27.

Thanks
Kumar

Jun 8 '09 #3
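Both errors can be reproduced in isolation. The sketch below assumes the cutoffs were given as 3, 4 and 3, which the post does not state. `chomp` changes its argument in place, so it needs a variable and cannot work on the value returned by `<$IN>`. A range whose start is greater than its end, such as `5 .. 3`, is an empty list, which would leave the last array empty and make the division by its length fail.

```perl
use strict;
use warnings;

# Assumed cutoffs for illustration: sections of 3, 4 and 3 lines.
my ($cut1, $cut2, $cut3) = (3, 4, 3);

# A range whose start is greater than its end is an empty list in Perl.
my @second = ($cut1 + 1 .. $cut2);    # 4 .. 4 gives one element
my @third  = ($cut2 + 1 .. $cut3);    # 5 .. 3 gives no elements
print "second range: ", scalar(@second), " element(s)\n";
print "third range:  ", scalar(@third),  " element(s)\n";

# chomp needs something it can modify. Assigning to a variable inside
# the call gives it one; $raw stands in for a line read from a file.
my $raw = "42\n";
chomp(my $line = $raw);
print "chomped value: '$line'\n";
```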

Expert 2.5K+ P: 4,059

Oops, my bad. This new version of the code assumes (for a 10-line file) that cut1, cut2 and cut3 are equal to 3, 4 and 3 respectively. If not, the for() loop conditions need to be adjusted.

    use strict;
    use warnings;

    my $file = $ARGV[0];
    my $cut1 = $ARGV[1];
    my $cut2 = $ARGV[2];
    my $cut3 = $ARGV[3];

    my (@cut1, @cut2, @cut3);

    open (my $IN, $file) or die "$!";
    for (1 .. $cut1) {
       chomp ($_ = <$IN>);
       push @cut1, $_;
    }
    for ($cut1+1 .. $cut1+$cut2) {
       chomp ($_ = <$IN>);
       push @cut2, $_;
    }
    for ($cut1+$cut2+1 .. $cut1+$cut2+$cut3) {
       chomp ($_ = <$IN>);
       push @cut3, $_;
    }
    close $IN;

    average(\@cut1,\@cut2,\@cut3);

    sub average {
       my @arrays = @_;
       my $sum;
       foreach my $list (@arrays) {
          $sum += $_ for @{$list};
          my $avg = $sum / @{$list};
          print "Average = $avg\n";
          $sum = 0;
       }
    }

Jun 8 '09 #4

P: 55

Thanks so much, Kevin. The output for the numbers 1 to 10 is correct:

    Average = 2
    Average = 5.5
    Average = 9

One last small request: if I want to take the average of these values at the end, how do I modify the subroutine?

Thanks
Kumar

Jun 8 '09 #5

Expert 2.5K+ P: 4,059

Change the subroutine to something like this:

    sub average {
       my @arrays = @_;
       my $sum;
       my @averages;
       my $total;
       foreach my $list (@arrays) {
          $sum += $_ for @{$list};
          my $avg = $sum / @{$list};
          push @averages, $avg;   # keep each section's average, not its sum
          print "Average = $avg\n";
          $sum = 0;
       }
       ($total += $_) for @averages;
       printf "Total Average = %.3f\n", $total / @averages;
    }

Next time, please show some effort on your part first to solve the problem.

Jun 8 '09 #6
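One point worth keeping in mind: an average of section averages weights every section equally, so it can differ from the plain average of all values when the sections have different sizes. For the 1 to 10 example with sections of 3, 4 and 3 the two happen to agree at 5.5. The sketch below uses made-up data and a hypothetical `mean` helper to show a case where they differ.

```perl
use strict;
use warnings;

# Hypothetical helper: arithmetic mean of a list of numbers.
sub mean {
    my @numbers = @_;
    die "Cannot average an empty list\n" unless @numbers;
    my $sum = 0;
    $sum += $_ for @numbers;
    return $sum / @numbers;
}

# Made-up data: two sections of different sizes.
my @sections = ([1, 2], [3, 4, 5, 6]);

my @section_means = map { mean(@$_) } @sections;    # 1.5 and 4.5
my $mean_of_means = mean(@section_means);           # (1.5 + 4.5) / 2
my $overall_mean  = mean(map { @$_ } @sections);    # 21 / 6

printf "Mean of section means = %.3f\n", $mean_of_means;
printf "Overall mean          = %.3f\n", $overall_mean;
```

Which of the two is wanted depends on the analysis. The thread asks for the average of the section averages, which is the first figure.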