
# sort and compare

P: 55 Hi All, I am trying to sort a column by numerical value, but it seems to me that Perl sorts on the first character of each value instead. For example, some of the data in my file is 23, 5467, 21, 64, 654, 342, 10. When I used the sort function on an array of these numbers, it sorted on the leading digits and not on the whole value: it gave 10, 21, 23, 342, 5467, 64, 654, but I am trying to get them ordered by value, like 5467, 654, 342, 64, 23, 21, 10. Any help will be appreciated. Thanks, Kumar Oct 12 '08 #1
12 Replies

Expert Mod 2.5K+ P: 3,503 So, you want it sorted in two ways: first by the length of the number, and then a sub-sort to put the numbers of each length in order, largest to smallest (reverse sort)? What code have you tried thus far? You need to show us what you have done before we can help you. BTW, please put your code in code tags. An example is shown in the "Reply Guidelines" box to the right of the reply window. Regards, Jeff Oct 12 '08 #2

Expert 2.5K+ P: 4,059 To sort numerically you have to use the numeric comparison operator <=>:

    @out = sort {$a <=> $b} @in;

Otherwise Perl sorts the numbers as ASCII strings, which is the default sort. It's covered in the sort function's documentation. To reverse the order, transpose $a and $b. Oct 12 '08 #3
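To make the difference concrete, here is a minimal sketch that runs the default string sort and the numeric sort side by side on the numbers from the original question. The variable names are illustrative only and are not from any poster's script.

```perl
use strict;
use warnings;

# The numbers from the original question.
my @in = (23, 5467, 21, 64, 654, 342, 10);

# Default sort compares elements as strings, so "5467" lands before "64".
my @as_strings = sort @in;

# <=> compares numerically; this gives ascending order.
my @ascending = sort { $a <=> $b } @in;

# Transposing $a and $b reverses the order (largest first).
my @descending = sort { $b <=> $a } @in;

print "string:     @as_strings\n";   # 10 21 23 342 5467 64 654
print "ascending:  @ascending\n";    # 10 21 23 64 342 654 5467
print "descending: @descending\n";   # 5467 654 342 64 23 21 10
```

The string result matches the order reported in the question, and the descending numeric result is the order that was asked for.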

P: 55 Thanks for your reply, here is my code:

    my %hash;
    open (FH, "cluster4.info");
    while(<FH>)
    {
        my $line = $_;
        chomp $line;
        if ($line=~/^#/)
            {next;}
        $line=~/.*\s(\d+).*=\s+(\d+)/;
        my $cnum = $1;my $mem = $2;
        $hash{$cnum} = $mem;
    }

    foreach $value (sort {$hash{$b} cmp $hash{$a} }
               keys %hash)
    {
        print "$value $hash{$value}\n";
    }

Thanks, Kumar Oct 13 '08 #4

P: 55 And here is my sample data:

    Cluster 1  Number of members =  23
    Cluster 2  Number of members =  285
    Cluster 3  Number of members =  4
    Cluster 4  Number of members =  28
    Cluster 5  Number of members =  1
    Cluster 6  Number of members =  24
    Cluster 7  Number of members =  54
    Cluster 8  Number of members =  246
    Cluster 9  Number of members =  1435

Thanks, Kumar Oct 13 '08 #5
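As a rough sketch of how this sample data could be read into a hash and then ordered by member count, the script below parses the nine lines shown above and sorts the keys by their values with <=>. The helper name `parse_clusters` and the stricter regular expression are assumptions made for illustration; they are not taken from the posted script.

```perl
use strict;
use warnings;

# Hypothetical helper: turn "Cluster N  Number of members =  M" lines
# into a hash of cluster number => member count.
sub parse_clusters {
    my @lines = @_;
    my %members;
    for my $line (@lines) {
        next if $line =~ /^#/;    # skip comment lines
        # Use $1 and $2 only when the match actually succeeded.
        if ($line =~ /^Cluster\s+(\d+)\s+Number of members\s*=\s*(\d+)/) {
            $members{$1} = $2;
        }
    }
    return %members;
}

# The sample data from the post, held in an array instead of a file.
my @sample = (
    "Cluster 1  Number of members =  23",
    "Cluster 2  Number of members =  285",
    "Cluster 3  Number of members =  4",
    "Cluster 4  Number of members =  28",
    "Cluster 5  Number of members =  1",
    "Cluster 6  Number of members =  24",
    "Cluster 7  Number of members =  54",
    "Cluster 8  Number of members =  246",
    "Cluster 9  Number of members =  1435",
);

my %members = parse_clusters(@sample);

# Sort the keys by their values, numerically, largest first.
my @order = sort { $members{$b} <=> $members{$a} } keys %members;

print "$_ $members{$_}\n" for @order;
```

With cmp in place of <=> the values would be compared as strings, so 54 would sort ahead of 285.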

Expert 2.5K+ P: 4,059 Is there a question you have about your code? Oct 13 '08 #6

P: 55 Hi All, Thanks for your help. I solved the problem of sorting and printing; there is only a small issue left. While printing the data I want to put a new line (only once) when the value is less than 100, but when I use the if condition it prints the new line for all the values, which I don't want. Any help? I am posting my code, sample data and result output.

    open (FH, "sample4.out") or die "Check the input file";
    my $cnt = 0;
    while (<FH>)
    {
        $n1 = $_;
        if($n1=~ /cluster t.(\d+) has (\d+)/)
        {
            $cid = $1;
            $nc  = $2;
            $hash{$cid} = $nc;
            $cnt++;
        }
    }
    print "#Total number of cluster = $cnt\n";
    close (FH);

    $count = 1;
    foreach $value (sort {$hash{$b} <=> $hash{$a} } keys %hash)
    {
        print "$count Cluster $value : Number of members = $hash{$value}\n";
        if($hash{$value} >= 100){print "\n";}
        $count++;
    }

Sample data:

    #Total number of cluster = 20
    Cluster 17  Number of members =  1
    Cluster 19  Number of members =  1
    Cluster 5  Number of members =  1
    Cluster 13  Number of members =  2
    Cluster 3  Number of members =  4
    Cluster 16  Number of members =  9
    Cluster 1  Number of members =  23
    Cluster 6  Number of members =  24
    Cluster 4  Number of members =  28
    Cluster 7  Number of members =  54
    Cluster 18  Number of members =  57
    Cluster 20  Number of members =  92
    Cluster 10  Number of members =  101
    Cluster 14  Number of members =  158
    Cluster 15  Number of members =  210
    Cluster 8  Number of members =  246
    Cluster 2  Number of members =  285
    Cluster 12  Number of members =  525
    Cluster 9  Number of members =  1435
    Cluster 11  Number of members =  5744

Result:

    #Total number of cluster = 20
    1 Cluster 11 : Number of members = 5744

    2 Cluster 9 : Number of members = 1435

    3 Cluster 12 : Number of members = 525

    4 Cluster 2 : Number of members = 285

    5 Cluster 8 : Number of members = 246

    6 Cluster 15 : Number of members = 210

    7 Cluster 14 : Number of members = 158

    8 Cluster 10 : Number of members = 101

    9 Cluster 20 : Number of members = 92
    10 Cluster 18 : Number of members = 57
    11 Cluster 7 : Number of members = 54
    12 Cluster 4 : Number of members = 28
    13 Cluster 6 : Number of members = 24
    14 Cluster 1 : Number of members = 23
    15 Cluster 16 : Number of members = 9
    16 Cluster 3 : Number of members = 4
    17 Cluster 13 : Number of members = 2
    18 Cluster 17 : Number of members = 1
    19 Cluster 19 : Number of members = 1
    20 Cluster 5 : Number of members = 1

Thanks, Kumar Oct 13 '08 #7

Expert 100+ P: 174 Can you restate that? I don't understand what you're getting at. Oct 13 '08 #8

Expert 2.5K+ P: 4,059 I'm confused. The data you posted does not seem to be the data your script is working with.

    if($n1=~ /cluster t.(\d+) has (\d+)/)

The data is not of the form:

    cluster t.nn has nn

where 'nn' is some integer. Oct 13 '08 #9

P: 55 Sorry for the confusion. What I am actually trying to do is this: when the result is printed, if the number of elements (the last column value) is less than 100, put a new line in the result at that point. So the result should look something like this:

    #Total number of cluster = 17

    1 Cluster 12 : Number of members = 5309
    2 Cluster 8 : Number of members = 1697
    3 Cluster 17 : Number of members = 683
    4 Cluster 1 : Number of members = 400
    5 Cluster 7 : Number of members = 218
    6 Cluster 5 : Number of members = 207
    7 Cluster 16 : Number of members = 173
    8 Cluster 2 : Number of members = 100

    9 Cluster 9 : Number of members = 79
    10 Cluster 10 : Number of members = 54
    11 Cluster 3 : Number of members = 53
    12 Cluster 6 : Number of members = 16
    13 Cluster 13 : Number of members = 4
    14 Cluster 14 : Number of members = 3
    15 Cluster 11 : Number of members = 2
    16 Cluster 15 : Number of members = 1
    17 Cluster 4 : Number of members = 1

But my code puts a new line after every line once the number of elements is less than 100, so my result looks like this:

    #Total number of cluster = 17

    1 Cluster 12 : Number of members = 5309
    2 Cluster 8 : Number of members = 1697
    3 Cluster 17 : Number of members = 683
    4 Cluster 1 : Number of members = 400
    5 Cluster 7 : Number of members = 218
    6 Cluster 5 : Number of members = 207
    7 Cluster 16 : Number of members = 173
    8 Cluster 2 : Number of members = 100

    9 Cluster 9 : Number of members = 79

    10 Cluster 10 : Number of members = 54

    11 Cluster 3 : Number of members = 53

    12 Cluster 6 : Number of members = 16

    13 Cluster 13 : Number of members = 4

    14 Cluster 14 : Number of members = 3

    15 Cluster 11 : Number of members = 2

    16 Cluster 15 : Number of members = 1

    17 Cluster 4 : Number of members = 1

I know that I am printing the new line inside the loop, but I tried different combinations and none of them worked. Again, sorry for the confusion. Thanks, Kumar Oct 13 '08 #10

Expert 100+ P: 174 Try this, if this is what you are asking for:

    foreach $value (sort {$hash{$b} <=> $hash{$a} } keys %hash)
    {
        if ($hash{$value} >= 100){print "\n";}
        print "$count Cluster $value : Number of members = $hash{$value}\n";
        $count++;
    }

Oct 13 '08 #11

Expert 100+ P: 410 If you want to insert an extra newline only once, before the displayed values drop below 100, use this approach.

    $count = 1;

    # Keys whose value is >= 100, smallest value first.
    my @high = sort { $hash{$a} <=> $hash{$b} } grep { $hash{$_} >= 100 } keys %hash;
    # Append an extra newline to the smallest value that is still >= 100.
    $hash{$high[0]} .= "\n" if @high;

    foreach $value (sort {$hash{$b} <=> $hash{$a} } keys %hash)
    {
        print "$count Cluster $value : Number of members = $hash{$value}\n";
        $count++;
    }

Oct 13 '08 #12

Expert 100+ P: 174 OK, I think I get what you mean now: you want the first line that is below 100 to print a new line before it, and that's it. Try:

    $flag = 0;    # set once, before the loop, so it is not reset on every pass
    foreach $value (sort {$hash{$b} <=> $hash{$a} } keys %hash)
    {
        # Print the blank line only the first time a value below 100 is seen.
        if (($hash{$value} < 100) && ($flag == 0))
        {
            print "\n";
            $flag = 1;
        }
        print "$count Cluster $value : Number of members = $hash{$value}\n";
        $count++;
    }

Oct 13 '08 #13
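The "print the separator only once" idea from this thread can also be written as a small self-contained script. This is only a sketch under a few assumptions: the helper name `report_lines`, the threshold parameter, and the tie-break on cluster number are all hypothetical additions, and the data is a five-cluster subset of the figures posted in #10.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the report lines, with one blank line
# inserted before the first entry whose count is below $threshold.
sub report_lines {
    my ($members, $threshold) = @_;
    my @lines;
    my $separated = 0;    # declared outside the loop so it persists
    my $rank      = 1;

    # Largest count first; ties fall back to the cluster number.
    my @order = sort { $members->{$b} <=> $members->{$a} || $a <=> $b }
                keys %$members;

    for my $id (@order) {
        if (!$separated && $members->{$id} < $threshold) {
            # Skip the separator when nothing has been printed above it.
            push @lines, "\n" if @lines;
            $separated = 1;
        }
        push @lines, "$rank Cluster $id : Number of members = $members->{$id}\n";
        $rank++;
    }
    return @lines;
}

# A subset of the figures posted in #10.
my %members = (12 => 5309, 7 => 218, 2 => 100, 9 => 79, 13 => 4);
print report_lines(\%members, 100);
```

Returning the lines instead of printing inside the loop keeps the separator logic easy to check on its own; the flag is what guarantees a single blank line no matter how many entries fall below the threshold.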