Help for a new Perl User - Looking for suggestion

I am new to Perl and I am trying to figure out how to solve this
problem. If anyone can give me some suggestions, I would greatly
appreciate it.

I am trying to read a log file and generate some statistics from it.
For simplicity purposes (only) I have edited some of my logs and code.

Below is an example of a log file which has 3 columns (separated by
commas). The first column contains a time stamp, the second column
contains a numeric identifier and the last column contains a NYSE
Symbol:
10:00,123,KO
10:00,124,KO
10:00,123,KO
10:00,123,KO
10:00,125,T
10:00,125,T
10:20,123,KO
10:20,123,KO
10:20,126,YY
10:20,123,KO
10:20,129,PP
10:40,145,YY
10:40,147,MM
11:00,123,KO
11:00,124,KO
11:00,123,KO
11:00,123,KO
11:00,125,T
11:00,125,T
11:20,123,KO

I am trying to determine at each time interval, how many times the
numeric identifier appears. For example, I would like my output to
look something like this:

TIME   NUMERIC IDENTIFIER  OCCURRENCES  SYMBOL
10:00  123                 3            KO
10:00  124                 1            KO
10:00  125                 2            T
10:20  123                 3            KO
10:20  126                 1            YY
10:20  129                 1            PP
10:40  145                 1            YY
10:40  147                 1            MM
11:00  123                 3            KO
11:00  125                 2            T
11:00  124                 1            KO
11:20  123                 1            KO

Please keep in mind that my log file contains roughly 70,000 rows of
data.
I have been working on this for some time and I am ABLE to determine the
total number of messages in the time period (see below), but I am
UNABLE to further break it out by numeric identifier. Can anyone
recommend how I would do this? Below is how far I have gotten in the
code so far:

SAMPLE OF CODE:

$samplelog = "test2.log";
open (IN, "$samplelog");

while ($rcd = <IN>){
    @fields = split(/,/,$rcd);
    $time{$fields[0]}++;
}

foreach $key (sort keys(%time)) {
    print "$key $time{$key} \n";
}

Thank you
Jul 19 '05 #1
In article <ef*************************@posting.google.com> , scadav
<dk*******@yahoo.com> wrote:
[quoted post snipped]

Use the entire record as your key. Use chomp first to remove the
newline at the end. Then split apart the record, which is also the key
to the hash, to do the printing:

Jim 48% cat scadav.pl
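#!/usr/local/bin/perl

use strict;
use warnings;

my %time;    # count of each distinct "time,id,symbol" record

# Read the sample records that follow __END__.
while (my $rcd = <DATA>) {
    chomp($rcd);
    $time{$rcd}++;
}

# Split each key back into its fields for printing.
foreach my $entry (sort keys(%time)) {
    my ($tim, $num, $id) = split(/,/, $entry);
    printf " %5s %3d %3d %s\n", $tim, $num, $time{$entry}, $id;
}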
__END__
10:00,123,KO
10:00,124,KO
10:00,123,KO
10:00,123,KO
10:00,125,T
10:00,125,T
10:20,123,KO
10:20,123,KO
10:20,126,YY
10:20,123,KO
10:20,129,PP
10:40,145,YY
10:40,147,MM
11:00,123,KO
11:00,124,KO
11:00,123,KO
11:00,123,KO
11:00,125,T
11:00,125,T
11:20,123,KO

Jim 49% ./scadav.pl
 10:00 123   3 KO
 10:00 124   1 KO
 10:00 125   2 T
 10:20 123   3 KO
 10:20 126   1 YY
 10:20 129   1 PP
 10:40 145   1 YY
 10:40 147   1 MM
 11:00 123   3 KO
 11:00 124   1 KO
 11:00 125   2 T
 11:20 123   1 KO
Jim 50%

If you wish to break down your data in different ways or sort it
differently, then you need to consider more complicated solutions such
as defining a hash-of-hashes or a hash-of-arrays to store your
multi-level data.
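
As a rough illustration of the hash-of-hashes idea, here is a minimal sketch. It assumes the same comma-separated record layout as the sample log; the subroutine name count_by_time_and_id, the inner keys "count" and "symbol", and the "busiest identifier first" ordering are all made up for this example and are not part of the program above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): build a hash-of-hashes
# keyed first by time stamp, then by numeric identifier.
sub count_by_time_and_id {
    my ($lines) = @_;
    my %stats;
    for my $line (@$lines) {
        chomp(my $rcd = $line);               # work on a copy of the line
        next unless length $rcd;              # skip blank lines
        my ($tim, $num, $sym) = split /,/, $rcd;
        $stats{$tim}{$num}{count}++;          # inner hashes spring into existence
        $stats{$tim}{$num}{symbol} = $sym;    # remember the symbol for printing
    }
    return \%stats;
}

# A few records in the format of the sample log.
my @sample = ("10:00,123,KO\n", "10:00,125,T\n", "10:00,123,KO\n", "10:20,123,KO\n");
my $stats  = count_by_time_and_id(\@sample);

# Within each time stamp, list the identifiers with the most occurrences first;
# ties are broken by the numeric identifier.
for my $tim (sort keys %$stats) {
    my $ids = $stats->{$tim};
    for my $num (sort { $ids->{$b}{count} <=> $ids->{$a}{count} || $a <=> $b } keys %$ids) {
        printf " %5s %3d %3d %s\n", $tim, $num, $ids->{$num}{count}, $ids->{$num}{symbol};
    }
}
```

Because the counts are stored per time stamp, the same structure can also be used to total all identifiers in an interval or to pick out a single time without re-reading the log.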

Some more pointers for those new to Perl:

1. Always put 'use strict' and 'use warnings' at the beginning of your
program, then declare all of your variables with 'my' or 'our' or use a
package name for globals.

2. Always check the results of an open call (and all other system
calls, as well):

open (IN, $samplelog) or die("Can't open $samplelog: $!");

3. There is no need to put double-quotes around $samplelog in the above.

4. Post further miscellaneous Perl questions to comp.lang.perl.misc, as
this newsgroup is defunct, but be sure to check the guidelines for
that newsgroup before doing so:

http://mail.augustmail.com/~tadmc/cl...uidelines.html
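
Pointers 1 to 3 can be combined into one short sketch. Note that it swaps in the three-argument form of open with a lexical filehandle, which is a substitution made here and not what the pointers themselves show; the subroutine name count_records and the idea of passing the log name on the command line are likewise assumptions for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: count identical records in the file at $path.
sub count_records {
    my ($path) = @_;

    # Three-argument open with a lexical filehandle (a substitution for the
    # bareword IN used in the thread); die with the OS error on failure.
    open(my $in, '<', $path) or die("Can't open $path: $!");

    my %time;
    while (my $rcd = <$in>) {
        chomp($rcd);
        $time{$rcd}++;
    }
    close($in) or die("Can't close $path: $!");
    return \%time;
}

# Assumption: the log file name is given as the first command-line argument.
if (@ARGV) {
    my $samplelog = $ARGV[0];
    my $time = count_records($samplelog);
    print "$_ $time->{$_}\n" for sort keys %$time;
}
```

It could be run as, for example, "perl scadav.pl test2.log"; with a missing or unreadable file the die message reports the reason held in $!.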
Jul 19 '05 #2
Jim Gibson <jg*****@mail.arc.nasa.gov> wrote in
news:020720041255540732%jg*****@mail.arc.nasa.gov:
[quoted text snipped]

Thank you for your assistance and suggestions; it is much appreciated.
Jul 19 '05 #3
