In part one we discussed the default sort function. In part two we will discuss more advanced techniques you can use to sort data. Some of these techniques might introduce methods or syntax unfamiliar to a less experienced Perl coder, so I will post links to online resources you can read if necessary. Experienced Perl coders might find nothing new or useful in this article.
Short Review
In part one I showed you some very basic syntax you can use to sort data:
    @sorted = sort (@array); # default lexical ascending order sort
    @sorted = reverse sort @array; # default lexical sort in reverse/descending order
    @sorted = sort {$a cmp $b} @array; # same as first example
    @sorted = sort {$b cmp $a} @array; # same as second example
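As a quick check of the review above, here is a minimal sketch that runs all four forms on one list. The sample words are hypothetical and chosen only for illustration. It also shows why the default order can surprise you with mixed-case data: without a locale in effect, uppercase letters sort before lowercase ones.

```perl
use strict;
use warnings;

# Hypothetical sample data, chosen only for illustration.
my @array = qw(pear Apple banana apple Cherry);

my @default    = sort @array;                # default lexical ascending
my @explicit   = sort { $a cmp $b } @array;  # same order, spelled out
my @reversed   = reverse sort @array;        # default order, reversed
my @descending = sort { $b cmp $a } @array;  # same order, spelled out

print "ascending:  @default\n";   # Apple Cherry apple banana pear
print "descending: @reversed\n";  # pear banana apple Cherry Apple
```

The capitalized words come first in the ascending result, which is the behavior the case-insensitive sort in the next section works around.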
Processing Data While Sorting
Processing the data inside the sort comparison is a good option when you have little data to sort or little processing to do on each element. It takes less planning and experience to get working and is generally efficient enough for most applications. For the first example we will use a list of names:
    my @names = qw(Jim JANE fred Andrew Chris albert sally Martha);
To sort the names without regard to case, convert both values to lower case inside the comparison:
    my @sorted = sort {lc $a cmp lc $b} @names;
The sorted list:
albert
Andrew
Chris
fred
JANE
Jim
Martha
sally
Note that the data has not been changed, only the order. It is still in the same format of mixed case characters. Each value is aliased to $a or $b and converted to lower case only during the sort comparison. Because sort compares most elements more than once, some of the data will be converted to lower case more than one time during the sort. This can have a big impact on how long the sort takes if the processing is slow. In this example the cost of the lc operator is negligible; even with a list of a million names the sort should still go pretty fast. But if you need to use regular expressions, sort by different fields, or sort references to data, it can pay big dividends to pre-process the data before sorting.
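To see the repeated conversion for yourself, here is a small sketch that counts how often the lower-casing runs during the sort. The counted_lc wrapper and the $calls counter are hypothetical helpers added only for this measurement; they are not part of the original example.

```perl
use strict;
use warnings;

my @names = qw(Jim JANE fred Andrew Chris albert sally Martha);

my $calls = 0;    # hypothetical counter, added only to observe the sort

# Hypothetical wrapper around lc that counts how often it runs.
sub counted_lc {
    my ($string) = @_;
    $calls++;
    return lc $string;
}

my @sorted = sort { counted_lc($a) cmp counted_lc($b) } @names;

print "names in list:        ", scalar(@names), "\n";
print "lc calls during sort: $calls\n";
print "@sorted\n";
```

The exact count depends on the sort implementation in your version of Perl, but it will be larger than the number of names, since every comparison converts two values.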
Another way is to use a subroutine instead of a code block:
    my @sorted = sort lowercase @names;
    sub lowercase {
        return lc $a cmp lc $b;
    }
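One practical benefit of a named subroutine is that the same comparison can be reused wherever it is needed. The sketch below applies the lowercase subroutine to two lists and also combines it with reverse. The @cities list is hypothetical data added for illustration.

```perl
use strict;
use warnings;

# The comparison is written once and reused by every sort below.
sub lowercase {
    return lc $a cmp lc $b;
}

my @names  = qw(Jim JANE fred Andrew Chris albert sally Martha);
my @cities = qw(paris Oslo berlin Madrid);    # hypothetical second list

my @sorted_names  = sort lowercase @names;
my @sorted_cities = sort lowercase @cities;
my @reverse_names = reverse sort lowercase @names;

print "@sorted_names\n";
print "@sorted_cities\n";
print "@reverse_names\n";
```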
You can give sorting subroutines meaningful names so their function is understood, e.g. by_age, by_date, by_name, by_number. Generally I find a code block easiest for simple sorts like the previous examples and reserve subroutines for more complex sorts. Assume we have an array of hashes; the keys are first_name, last_name, and age. The following subroutine sorts by last name, then by first name when the last names match, then by age from oldest to youngest when both names match.
    my @sorted = sort by_name_and_age @array_of_hashes;
    sub by_name_and_age {
        # each || is reached only when the comparison before it returns 0 (a tie)
        return $a->{last_name} cmp $b->{last_name}
            || $a->{first_name} cmp $b->{first_name}
            || $b->{age} <=> $a->{age}; # $b before $a: descending age
    }
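To try the multi-key sort, here is a self-contained sketch with a few records. The people and ages are invented for this example; only the key names (first_name, last_name, age) come from the text above.

```perl
use strict;
use warnings;

# Hypothetical records, invented for this example.
my @array_of_hashes = (
    { first_name => 'Mary',  last_name => 'Smith', age => 41 },
    { first_name => 'John',  last_name => 'Smith', age => 25 },
    { first_name => 'Alice', last_name => 'Jones', age => 30 },
    { first_name => 'John',  last_name => 'Smith', age => 52 },
);

# Last name ascending, then first name ascending, then age descending.
sub by_name_and_age {
    return $a->{last_name}  cmp $b->{last_name}
        || $a->{first_name} cmp $b->{first_name}
        || $b->{age}        <=> $a->{age};
}

my @sorted = sort by_name_and_age @array_of_hashes;

for my $person (@sorted) {
    print "$person->{last_name}, $person->{first_name} ($person->{age})\n";
}
# Prints:
# Jones, Alice (30)
# Smith, John (52)
# Smith, John (25)
# Smith, Mary (41)
```

The two John Smith records tie on both names, so the age comparison decides their order and the older one comes first.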
Part three of Sorting Data with Perl will discuss methods of sorting after processing.
This article is protected under the Creative Commons License.