Hi,
I need to write a script which reads some data and reports the findings.
Just to give you an idea the structure is similar to the following.
Data input example:
HEADING 1
**********
ColumnA ColumnB ColumnC ColumnD ColumnE
Pete Male Marketing Single 40
Kate Female Marketing Married 30
John Male Sales Married 38
Pete Male Sales Single 52
John Male Sales Single 24
HEADING 2
**********
ColumnF ColumnG ColumnH ColumnI
whatever
whatever
whatever
whatever
Report Output example:
# of Pete's =
# of Males =
# of Salespeople =
# of Singles =
# of over 35s =
Since this is the first time I'm even writing such a script I would
appreciate some pointers.
1) Do I use arrays or associative arrays for this? Why or why not?
2) Is it possible for someone to give me a code example of counting how many
Singles we have?
3) What happens when I have read all the data under HEADING 1 and need to
move onto HEADING 2?
That is, how do I accomplish the jump from what I think is one loop onto the
next?
I imagine that there will be many more posts following this one so there's
no need to get into too much detail. Some guidance would be nice as I will
need to utilise Google and my references for the rest.
Thanks in advance.
Ga Mu,
Great stuff - thanks very much. :)
The headings differentiate blocks of data so once we count everything under
HEADING 1 we move onto HEADING 2 then HEADING 3 etc.
Does this help a bit?
"Ga Mu" <Ng******@SPcomcast.netAM> wrote in message
news:xl44b.306860$Ho3.43264@sccrnsc03...
Troll wrote:
1) Do I use arrays or associative arrays for this? Why or why not?
Use hashes (aka associative arrays) because they work so well for counting occurrences of words. A hash element that does not exist yet is treated as zero the first time it is incremented, so, assuming you have already declared the hash %names ('my %names;') and we are in your parsing loop and have extracted the person's name into $name, all you need is:
$names{$name}++; # increment the count for this name.
2) Is it possible for someone to give me a code example of counting how
many Singles we have?
You could count everything with hashes.
Prior to your parsing loop:
my (%names, %sexes, %depts, %m_statuses, %ages);
Within your parsing loop:
# extract four words and a number into scalars:
my ($name, $sex, $dept, $m_status, $age) =
    /^(\w+) (\w+) (\w+) (\w+) (\d+)$/;
# increment counts for each:
$names{$name}++;
$sexes{$sex}++;
$depts{$dept}++;
$m_statuses{$m_status}++;
$ages{$age}++;
After your parsing loop:
$names{'Pete'} gives the number of Petes.
$sexes{'Male'} gives the number of Males.
$depts{'Sales'} gives the number of sales people.
$m_statuses{'Single'} gives the number of single people.
$ages{'25'} gives the number of 25 year-olds.
To print a list of all names and the number of occurrences of each:
foreach my $key (keys %names) {
    print "$key: $names{$key}\n";
}
This will output something like:
John: 2
Pete: 3
Kate: 1
This list could have been sorted by either name or count. Do a 'perldoc -f' for 'keys' and 'sort'.
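As a minimal sketch of the counting and sorting described above, here are the five sample rows from the original post run through the same regex. The @rows array and the @by_name / @by_count names are made up for illustration; a real script would read the lines from a file instead.

```perl
use strict;
use warnings;

# Sample rows copied from the HEADING 1 block of the original post.
my @rows = (
    'Pete Male Marketing Single 40',
    'Kate Female Marketing Married 30',
    'John Male Sales Married 38',
    'Pete Male Sales Single 52',
    'John Male Sales Single 24',
);

my (%names, %m_statuses);
for my $row (@rows) {
    # Skip anything that is not four words followed by a number.
    my ($name, $sex, $dept, $m_status, $age) =
        $row =~ /^(\w+) (\w+) (\w+) (\w+) (\d+)$/ or next;
    $names{$name}++;
    $m_statuses{$m_status}++;
}

# Sorted by name, then by count (highest first, ties broken by name).
my @by_name  = sort keys %names;
my @by_count = sort { $names{$b} <=> $names{$a} || $a cmp $b } keys %names;

print "$_: $names{$_}\n" for @by_name;
print "# of Singles = $m_statuses{Single}\n";
```

With this sample data the Singles count comes out as 3, because Pete appears twice as Single.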
3) What happens when I have read all the data under HEADING 1 and need to move onto HEADING 2?
That is, how do I accomplish the jump from what I think is one loop onto the next?
Can't answer that, as you don't provide enough detail. What is the significance of the headings? Would the results be the same if the headings were completely ignored or do the headings signify some distinction between blocks of data?
Greg
Ga Mu,
Pls disregard last post.
With regard to the jump between HEADINGS, will it be enough to do something
like:
while (<>) {
    ....
    if (/HEADING 1/ .. /HEADING 2/) {
        # line falls between HEADING 1 and HEADING 2 in the text, inclusive.
        # then do the string extraction
        # then increment stuff
    } elsif (/HEADING 2/ .. /HEADING 3/) {
        # line falls between HEADING 2 and HEADING 3 in the text, inclusive.
        # then do the string extraction
        # then increment stuff
    }
}
etc?
I quite like the code example you provided - actually found a similar one in http://www.oreilly.com/catalog/perlw...pter/ch08.html
Up until now I was under the impression that I would have to use split - can
you elaborate why you chose a different approach?
One other task I have to do is similar to: If a line contains Single in the column then get the single person's name.
I sort of came up with:
foreach $m_statuses{'Single'}
print $names{$name}
but that's probably totally wrong. Can you advise?
Thanks again.
Troll wrote:
Ga Mu,
Pls disregard last post.
With regard to the jump between HEADINGS, will it be enough to do something like:
while (<>) {
    ...
    if (/HEADING 1/ .. /HEADING 2/) {
        # line falls between HEADING 1 and HEADING 2 in the text, inclusive.
        # then do the string extraction
        # then increment stuff
    } elsif (/HEADING 2/ .. /HEADING 3/) {
        # line falls between HEADING 2 and HEADING 3 in the text, inclusive.
        # then do the string extraction
        # then increment stuff
    }
}
etc?
I am unclear as to the distinction between blocks. Is there a separate
group of totals for each heading, or is everything totalled up together?
If the latter, then simply ignore the headings. If the former, then you
could parse out the heading name and use a multidimensional hash, i.e.,
replace this:
$names{$name}++;
with this:
$names{$heading}{$name}++;
I quite like the code example you provided - actually found a similar one in http://www.oreilly.com/catalog/perlw...pter/ch08.html
Up until now I was under the impression that I would have to use split - can you elaborate why you chose a different approach?
Either method produces the same results. If you plan on incorporating
error checking, m// lets you define a specific format, e.g., four
words and a number, whereas split simply breaks a string up into a list.
Whichever method makes you happy.
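To make the split versus m// difference concrete, here is a small sketch, assuming fields are separated by whitespace as in the sample data. The variable names are only for illustration. Both approaches return the same five fields for a data row, but only the regex refuses a line such as a heading.

```perl
use strict;
use warnings;

my $line = 'Kate Female Marketing Married 30';

# split: break on runs of whitespace, with no format checking at all.
my @by_split = split ' ', $line;

# m//: only succeeds for four words followed by a number.
my @by_match = $line =~ /^(\w+)\s+(\w+)\s+(\w+)\s+(\w+)\s+(\d+)$/;

my $same = "@by_split" eq "@by_match" ? 'yes' : 'no';
print "Same fields from both: $same\n";

# A heading line still splits without complaint, but the regex rejects it
# and returns an empty list.
my @heading_split = split ' ', 'HEADING 1';
my @heading_match = 'HEADING 1' =~ /^(\w+)\s+(\w+)\s+(\w+)\s+(\w+)\s+(\d+)$/;
printf "split gave %d fields, match gave %d\n",
    scalar @heading_split, scalar @heading_match;
```

If split is used, the script needs its own check (for example on the number of fields) before trusting the values.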
One other task I have to do is similar to:
If a line contains Single in the column then get the single person's name.
I sort of came up with:
foreach $m_statuses{'Single'}
print $names{$name}
but that's probably totally wrong. Can you advise?
Yes, it is totally wrong. $m_statuses{'Single'} is a scalar. It is the
count of lines where the marital status is 'Single'. Your foreach loop
above would produce a syntax error. Although it is not what you're
after, a valid foreach loop could look like this:
foreach my $sex ( keys %sexes ) {
    #
    # $sex will be 'Female' for one iteration of the loop and 'Male'
    # for the other. (Unless you have more than two sexes...)
    #
}
Perhaps a more meaningful foreach loop would look like this:
foreach my $age ( keys %ages ) {
    #
    # For each iteration, $age will be one of the ages that was found in
    # the data -->> IN NO PARTICULAR ORDER <<-- unless you sort it.
    #
}
To do what you propose, i.e., print the name of all single people, you
would have to include the logic for that in the parsing loop:
# extract four words and a number into scalars:
my ($name, $sex, $dept, $m_status, $age) =
    /^(\w+) (\w+) (\w+) (\w+) (\d+)$/;
# increment counts for each:
$names{$name}++;
$sexes{$sex}++;
$depts{$dept}++;
$m_statuses{$m_status}++;
$ages{$age}++;
# take special actions:
print "$name is single.\n" if $m_status eq 'Single';
print "$name is over the hill!\n" if $age >= 40;
Hope this helps!
Greg
Thanks again !
1)
Sorry for being too vague. With regard to the HEADINGS they separate blocks
of data. But because the column names will be different [data is different]
then I'm not quite sure I could use:
$names{$heading}{$name}++;
So I'm looking at creating separate my () definitions for each HEADING and
just wanted to confirm how to jump out of one HEADING loop and start with
the next.
For example, under HEADING 1 we have these columns:
Name, Sex, Dept, M_Status, Age
and under HEADING 2 we have:
Address, Phone#, Mobile#, Salary
So at the beginning of the script I would have
my (%names, %sexes, %depts, %m_statuses, %ages);
my (%addresses, %phones, %mobiles, %salaries);
#then I have my while (<>) and parsing here
#I have my output at the end
Is that a little clearer?
2)
With my last question regarding the printing of the names of single people,
if we include a print statement in the parsing loop would that give us
something like:
Pete is single.
John is single.
while the parsing is still running?
What I'm after is hopefully feeding that output into something else
[@array?] which can then print a list of the names [line by line] at the end
of the script, something like:
#this is the output structure
Number of Petes =
Number of Males =
Singles are:
Pete
John
Number of Salespeople =
Does this make sense?
Thanks Greg.
Troll wrote:
Thanks again !
1) Sorry for being too vague. With regard to the HEADINGS they separate blocks of data. But because the column names will be different [data is different] then I'm not quite sure I could use:
$names{$heading}{$name}++;
So I'm looking at creating separate my () definitions for each HEADING and just wanted to confirm how to jump out of one HEADING loop and start with the next.
For example, under HEADING 1 we have these columns:
Name, Sex, Dept, M_Status, Age
and under HEADING 2 we have:
Address, Phone#, Mobile#, Salary
So at the beginning of the script I would have
my (%names, %sexes, %depts, %m_statuses, %ages);
my (%addresses, %phones, %mobiles, %salaries);
#then I have my while (<>) and parsing here
#I have my output at the end
Is that a little clearer?
Yes. Much clearer. There are a couple of different ways you could do
this. One is to use a single loop that reads through the file and uses
a state variable (e.g., $heading) to keep track of where you are in the
parsing process. The other is to have a separate loop for each heading.
Again, six of one, half a dozen of another. It's more a matter of
preference than anything else.
An example of the first approach:
my $heading  = 'initial';
my $fin_name = '/usr/local/blah/blah/blah';
open my $fin, '<', $fin_name or die "Can't open $fin_name: $!\n";
while (<$fin>) {
    # check for a new heading
    # I am assuming single word heading names
    if ( /HEADING (\S+)/ ) {
        $heading = $1;    # set $heading equal to word extracted above
    # take appropriate action based on the heading we are under
    } elsif ( $heading eq 'NAMES' ) {
        my ( $name, $sex, $dept, $m_status, $age ) =
            /(\w+) (\w+) (\w+) (\w+) (\d+)/;
        # update counts, append to lists, etc...
    } elsif ( $heading eq 'ADDRESSES' ) {
        # I am assuming the address field is limited to 30 characters
        # here:
        my ( $address, $phone, $mobile, $salary ) =
            /(.{30}) (\S+) (\S+) (\d+)/;
        # update counts, append to lists, etc...
    }
}
close $fin;
And the second approach:
my $fin_name = '/usr/local/blah/blah/blah';
open my $fin, '<', $fin_name or die "Can't open $fin_name: $!\n";
# scan for first heading
while (<$fin>) {
    last if /HEADING NAMES/;
}
# parse the names, etc...
while (<$fin>) {
    last if /HEADING ADDRESSES/;
    my ( $name, $sex, $dept, $m_status, $age ) =
        /(\w+) (\w+) (\w+) (\w+) (\d+)/;
    # update counts, append to lists, etc...
}
# parse the addresses, etc...
# for brevity, I am assuming only two headings
while (<$fin>) {
    my ( $address, $phone, $mobile, $salary ) =
        /(.{30}) (\S+) (\S+) (\d+)/;
    # update counts, append to lists, etc...
}
close $fin;
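For reference, here is a runnable sketch of the single-loop, state-variable approach applied to the layout shown in the original post. It rests on a few assumptions: every block starts with a "HEADING <word>" line, the first text line after a heading holds the column names, and rows of asterisks are decoration. $want_header and %rows_seen are hypothetical names, and the data is read from an in-memory string so the example runs without a file.

```perl
use strict;
use warnings;

my $data = <<'END';
HEADING 1
**********
ColumnA ColumnB ColumnC ColumnD ColumnE
Pete Male Marketing Single 40
Kate Female Marketing Married 30

HEADING 2
**********
ColumnF ColumnG ColumnH ColumnI
whatever
END

my (%names, %rows_seen);
my $heading     = '';
my $want_header = 0;    # true until the column-name line has been skipped

open my $in, '<', \$data or die "Can't read sample data: $!";
while ( my $line = <$in> ) {
    chomp $line;
    next if $line =~ /^\s*$/;       # blank line before a heading
    next if $line =~ /^\*+\s*$/;    # row of asterisks
    if ( $line =~ /^HEADING\s+(\S+)/ ) {
        $heading     = $1;          # remember which block we are in
        $want_header = 1;
        next;
    }
    if ($want_header) {             # first text line after a heading
        $want_header = 0;
        next;
    }
    $rows_seen{$heading}++;
    if ( $heading eq '1' ) {
        my ($name) = $line =~ /^(\w+) \w+ \w+ \w+ \d+$/ or next;
        $names{$name}++;
    }
}
close $in;

print "Block $_: $rows_seen{$_} data rows\n" for sort keys %rows_seen;
```

Because the asterisk rows are skipped wherever they appear, the same loop copes with the underline coming before or after the column names.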
2) With my last question regarding the printing of the names of single people, if we include a print statement in the parsing loop would that give us something like:
Pete is single.
John is single.
while the parsing is still running?
Yes.
What I'm after is hopefully feeding that output into something else [@array?] which can then print a list of the names [line by line] at the end of the script, something like:
#this is the output structure
Number of Petes =
Number of Males =
Singles are:
Pete
John
Number of Salespeople =
Does this make sense?
Yes. It would be easy to create a list/array of, e.g., single people.
Prior to the loop, declare the array. Within the loop, test each person
for being single. If they are, push them onto the list:
# prior to your parsing loop, declare array @singles:
my @singles;
# within your parsing loop, after parsing out name, status, etc.:
push @singles, $name if $m_status eq 'Single';
# after loop, to print the list of singles:
print "Single persons:\n";
foreach my $single_person ( @singles ) {
    print " $single_person\n";
}
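Put together as a small self-contained sketch, using the sample rows from the original post (the @rows array, %seen and @unique are assumptions added for illustration):

```perl
use strict;
use warnings;

my @rows = (
    'Pete Male Marketing Single 40',
    'Kate Female Marketing Married 30',
    'John Male Sales Married 38',
    'Pete Male Sales Single 52',
    'John Male Sales Single 24',
);

my @singles;
for my $row (@rows) {
    my ($name, $sex, $dept, $m_status, $age) = split ' ', $row;
    push @singles, $name if $m_status eq 'Single';
}

# Optional: keep only the first occurrence of each name.
my %seen;
my @unique = grep { !$seen{$_}++ } @singles;

print "Singles are:\n";
print "  $_\n" for @unique;
```

Without the grep step a name is listed once per matching row, so Pete would appear twice here.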
Greg
Wow. I don't know how you get the time to respond to my queries in such
detail. It is greatly appreciated.
I just came back from work and it's like 2:30 am so I'll crash out soon and
have a closer read tomorrow [especially of the HEADINGS part].
With the push @array stuff I actually got to this today in my readings. I
saw an example of appending an array onto another array with a push and I
was wondering if we could just substitute a $variable for one of the arrays.
I'm glad you confirmed this. :)
I was also wondering if doing this at the beginning of the script:
my (%names, %sexes, %depts, %m_statuses, %ages) # declaring things locally
would be considered bad practice. I thought that one should declare things
as my ( ) if one is using things within a loop so as not to impact anything
external to the loop. But if one uses variables/arrays both within and
outside the loops, should we then still declare stuff as my ( )?
Maybe I'm just confused about my ( )...
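On the my ( ) question, a minimal sketch may help. A variable declared with my is visible from its declaration to the end of the enclosing block or file, so a hash declared at the top of the script can be used both inside and after the loops. The names below are made up for illustration.

```perl
use strict;
use warnings;

my %depts;        # file scope: visible inside the loop and after it
my $total = 0;

for my $dept (qw(Sales Marketing Sales)) {
    my $label = uc $dept;    # block scope: a fresh $label on every iteration
    $depts{$label}++;
    $total++;
}

# %depts and $total are still available here; $dept and $label are not.
print "SALES: $depts{SALES}, total: $total\n";
```

Under "use strict", referring to $label after the loop would be a compile-time error, which is usually what you want.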
Greg, if you could possibly keep an eye on this thread for the next few days
I would be very much in your debt. Your help has been invaluable so far in
allowing me to visualise quite a few things.
Thanks very much.
Now time for some stupid Qs:
Let's say that the data I have is in a file called employees.
How can I call this file so that I can parse it?
1) Can I do:
@HRdata = `cat employees`;
while (<@HRdata>) {
2) With regard to the HEADING sections, the script has to be able to
recognise the different sections by the following rules:
                          # there's a blank line before each heading
HEADING 1                 # this is the name of the heading - this is a string
                          # with a special character and a blank space as part of it
ColumnA ColumnB ColumnC   # these are the column names - these are strings which
                          # can also include a blank space if they have 2 or more words
*******                   # a sort of an underlining pattern
I guess this is to make sure that one does not include any silly heading
data as part of the arrays created and the parsing only takes place on
'real' data. Can you pls advise? Or do you need more info? I'm more in
favour of creating separate 'if' loops due to my 'newbie' status. I'll get
lost otherwise...
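On question 1, a sketch of reading the file directly rather than through cat. One note on the snippet above: while (<@HRdata>) does not read lines, because Perl treats <...> around an array as a filename glob; an array would be walked with foreach instead. The temporary file below only stands in for the real employees file so that the example runs anywhere.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Stand-in for the real "employees" file (an assumption for this sketch).
my ($tmp_fh, $file) = tempfile();
print {$tmp_fh} "Pete Male Marketing Single 40\n",
                "Kate Female Marketing Married 30\n";
close $tmp_fh or die "Can't close $file: $!";

# Open the file for reading and process it one line at a time.
open my $fh, '<', $file or die "Can't open $file: $!";
my $count = 0;
while ( my $line = <$fh> ) {
    chomp $line;
    $count++;    # the parsing and counting would go here
}
close $fh;
unlink $file;

print "Read $count lines\n";
```

With the real file, the tempfile lines go away and $file is simply 'employees'. Alternatively, keep while (<>) and pass the filename on the command line.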
Thanks.
I'm getting heaps of the following errors when I run my script:
Use of uninitialized value in hash element at ...
The beginning of my script looks like:
my(%names, %sexes, %depts);
%names = ("name" => "0");
%sexes = ("sex" => "0");
%depts = ("dept" => "0");
$names = '0';
$sexes = '0';
$depts = '0';
$name = '0';
$sex = '0';
$dept = '0';
while (<>)
#and the parsing loop here...
The hash errors relate to only these 3 lines which are part of the parsing
loop:
$names{$name}++;
$sexes{$sex}++;
$depts{$dept}++;
Can you run over the variable declarations/initializations for me as I'm not
sure I'm doing this right?
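One likely cause, assuming the parsing loop uses the regex from earlier in the thread: on lines the pattern does not match (headings, the row of asterisks, column names, blank lines) the list assignment sets $name, $sex and $dept to undef, whatever they were initialised to beforehand, and using undef as a hash key triggers that warning. A sketch of one way to guard against it ($skipped and @lines are hypothetical names):

```perl
use strict;
use warnings;

my (%names, %sexes, %depts);    # empty hashes need no further initialisation
my $skipped = 0;

my @lines = (
    'HEADING 1',
    '**********',
    'ColumnA ColumnB ColumnC ColumnD ColumnE',
    'Pete Male Marketing Single 40',
    'John Male Sales Married 38',
);

for my $line (@lines) {
    my ($name, $sex, $dept, $m_status, $age) =
        $line =~ /^(\w+) (\w+) (\w+) (\w+) (\d+)$/;
    unless ( defined $name ) {  # no match: heading, underline or column names
        $skipped++;
        next;
    }
    $names{$name}++;
    $sexes{$sex}++;
    $depts{$dept}++;
}

print "Counted ", scalar( keys %names ), " names, skipped $skipped lines\n";
```

The '0' assignments at the top of the script are not needed for counting, and %names = ("name" => "0") adds a key called "name" that would show up in the report.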
Thanks.
"Troll" <ab***@microsoft.com> wrote in message
news:eh*******************@news-server.bigpond.net.au... Now time for some stupid Qs:
Let's say that the data I have is in a file called employees. How can I call this file so that I can parse it?
1) Can I do: @HRdata = `cat employees`; while (<@HRdata>) {
2) With regard to the HEADING sections, the script has to be able to recognise the different sections by the following rules: # there's a blank
line before each heading HEADING 1 # this is the name of the
heading - this is a string with a special character and a blank space as part of it ColumnA ColumnB ColumnC # these are the column names - these are strings which also can inlude a blank space if they have 2 or more words ******* # a sort of an underlining pattern
I guess this is to make sure that one does not include any silly heading data as part of the arrays created and the parsing only takes place on 'real' data. Can you pls advise? Or do you need more info? I'm more in favour of creating separate 'if' loops due to my 'newbie' status. I'll get lost otherwise...
Thanks. "Troll" <ab***@microsoft.com> wrote in message news:uR*******************@news-server.bigpond.net.au... Wow. I don't know how you get the time to respond to my queries in such detail. It is greatly appreciated. I just came back from work and it's like 2:30 am so I'll crash out soon and have a closer read tomorrow [especially of the HEADINGS part].
With the push @array stuff I actually got to this today in my readings.
I saw an example of appending an array onto another array with a push and
I was wondering if we could just substitute a $variable for one of the arrays. I'm glad you confirmed this. :)
I was also wondering if doing this at the beginning of the script:
my (%names, %sexes, %depts, %m_statuses, %ages) # declaring
things locally
would be considered bad practice. I thought that one should declare
things as my ( ) if one is using things within a loop so as not to impact anything external to the loop. But if one uses variables/arrays both within and outside the loops, should we then still declare stuff as my ( )? Maybe I'm just confused about my ( )...
Greg, if you could possibly keep an eye on this thread for the next few days I would be very much in your debt. Your help has been invaluabe so far
in allowing me to visualise quite a few things.
Thanks very much.
"Ga Mu" <Ng******@SPcomcast.netAM> wrote in message news:uR*******************@rwcrnsc52.ops.asp.att.n et... Troll wrote: > Thanks again ! > > 1) > Sorry for being too vague. With regard to the HEADINGS they separate blocks > of data. But because the column names will be different [data is different] > then I'm not quite sure I could use: > $names{$heading}{$name}++; > > So I'm looking at creating separate my () definitions for each
HEADING and > just wanted to confirm how to jump out of one HEADING loop and start with > the next. > > For example, under HEADING 1 we have these columns: > Name, Sex, Dept, M_Status, Age > > and under HEADING 2we have: > Address, Phone#, Mobile#, Salary > > So at the beginning of the script I would have > my (%names, %sexes, %depts, %m_statuses, %ages) > my (%addresses, %phones, %mobiles, %salaries) > #then I have my while (<>) and parsing here > #I have my output at the end > > Is that a little more clearer?
Yes. Much clearer. There are a couple of different ways you could do this. One is to use a single loop that reads through the file and
uses a state variable (e.g., $heading) to keep track of where you are in
the parsing process. The other is to have a separate loop for each
heading. Again, six of one, half a dozen of another. It's more a matter of preference than anything else.
An example of the first approach:
my $heading = 'initial';
my $fin_name = '/usr/local/blah/blah/blah';
# 'or' rather than '||': with '||' the die would attach to $fin_name, not to the open
open(FIN, '<', $fin_name) or die "Can't open $fin_name: $!\n";
while (<FIN>) {
    # check for a new heading
    # I am assuming single word heading names
    if ( /HEADING (\S+)/ ) {
        $heading = $1; # set $heading equal to word extracted above
    # take appropriate action based on the heading we are under
    } elsif ( $heading eq 'NAMES' ) {
        ( $name, $sex, $dept, $m_status, $age ) = /(\w+) (\w+) (\w+) (\w+) (\d+)/;
        # update counts, append to lists, etc...
    } elsif ( $heading eq 'ADDRESSES' ) {
        # I am assuming the address field is limited to 30 characters here:
        ( $address, $phone, $mobile, $salary ) = /(.{30}) (\S+) (\S+) (\d+)/;
        # update counts, append to lists, etc...
    }
}
And the second approach:
my $heading = 'initial';
my $fin_name = '/usr/local/blah/blah/blah';
open(FIN, '<', $fin_name) or die "Can't open $fin_name: $!\n";
# scan for first heading
# (a bare <FIN> only sets $_ when it is the whole while condition, so assign it explicitly)
while ( defined( $_ = <FIN> ) && ! /HEADING NAMES/ ) { }
# parse the names, etc...
while ( defined( $_ = <FIN> ) && ! /HEADING ADDRESSES/ ) {
    ( $name, $sex, $dept, $m_status, $age ) = /(\w+) (\w+) (\w+) (\w+) (\d+)/;
    # update counts, append to lists, etc...
}
# parse the addresses, etc...
# for brevity, I am assuming only two headings
while ( <FIN> ) {
    ( $address, $phone, $mobile, $salary ) = /(.{30}) (\S+) (\S+) (\d+)/;
    # update counts, append to lists, etc...
}
> 2)
> With my last question regarding the printing of the names of single people, if we include a print statement in the parsing loop would that give us something like:
> Pete is single.
> John is single.
> while the parsing is still running?
Yes.
> What I'm after is hopefully feeding that output into something else [@array?] which can then print a list of the names [line by line] at the end of the script, something like:
> #this is the output structure
> Number of Petes =
> Number of Males =
> Singles are:
> Pete
> John
> Number of Salespeople =
>
> Does this make sense?
Yes. It would be easy to create a list/array of, e.g., single people. Prior to the loop, declare the array. Within the loop, test each person for being single. If they are, push them onto the list:
# prior to your parsing loop, declare array @singles:
my @singles;
# within your parsing loop, after parsing out name, status, etc.:
push @singles, $name if $m_status eq 'Single';
# after loop, to print the list of singles:
print "Single persons:\n";
foreach my $single_person ( @singles ) {
    print " $single_person\n";
}
Greg
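Putting the counting and the list together: a minimal sketch, assuming the five sample rows from the original question are fed in as an in-memory list rather than read from a file. The variable names follow Greg's examples; the `@lines` array is only a stand-in for the real input.

```perl
use strict;
use warnings;

# Sample rows copied from the original question.
my @lines = (
    'Pete Male Marketing Single 40',
    'Kate Female Marketing Married 30',
    'John Male Sales Married 38',
    'Pete Male Sales Single 52',
    'John Male Sales Single 24',
);

my ( %names, %m_statuses );
my @singles;

for my $line (@lines) {
    # Skip anything that does not look like a data row.
    my ( $name, $sex, $dept, $m_status, $age ) =
        $line =~ /^(\w+)\s+(\w+)\s+(\w+)\s+(\w+)\s+(\d+)/
        or next;

    $names{$name}++;
    $m_statuses{$m_status}++;
    push @singles, $name if $m_status eq 'Single';
}

print "Number of Petes = $names{Pete}\n";
print "Number of Singles = $m_statuses{Single}\n";
print "Singles are:\n";
print "  $_\n" for @singles;
```

The hashes answer "how many", the array answers "who", and both are filled in the same pass over the data.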
Troll wrote:
Greg, I decided to give you a glimpse at the code itself so as to make it clearer. Just be aware that the variable/array names have changed but the general idea is the same. The hash errors refer to the variables in the increment section.
#!/usr/bin/perl -w
open(NET, "netstat|") || die ("Cannot run netstat: $!");
my(%UDP4localaddresses, %UDP4remoteaddresses, %UDP4states);
$UDP4localaddress = '0';
$UDP4remoteaddress = '0';
$UDP4state = '0';
Why are you doing this (above)? This is initializing three variables to zero. These three variables have nothing to do with the three variables of the same name in the while loop, which are declared afresh with my().
$UDP4localaddresses = '0';
$UDP4remoteaddresses = '0';
$UDP4states = '0';
Why are you doing this (above)? This is initializing three scalars to zero. These three scalars have the same name, but have nothing else to do with the hashes of the same name.
$UDP4localaddresses{$UDP4localaddress} = '0';
$UDP4remoteaddresses{$UDP4remoteaddress} = '0';
$UDP4states = ($UDP4state} = '0';
Hash elements do not need to be initialized: an element that does not exist yet counts as zero the first time you increment it. That is what makes hashes perfect for counting occurrences of unknown words, numbers, etc. And even if you had to initialize them, you are initializing $UDP4localaddresses{0} to zero.
while (<NET>) {
my($UDP4localaddress, $UDP4remoteaddress, $UDP4state)= /(\s+) (\s+) (\s+)$/;
#increments start here
$UDP4localaddresses{$UDP4localaddress}++;
$UDP4remoteaddresses{$UDP4remoteaddress}++;
$UDP4states = ($UDP4state}++;
If the increments above are failing, it is probably because your m// is failing and one or more of the keys (the variables inside the {}) are undefined. Try putting a print statement before the increments and print each of the variables you are extracting, then play with the regular expression until you get values for ALL of them.
}
#here comes the output
Can you pls criticise my futile attempt to get this going? As one can see, I'm not that clear on initializations...
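The advice above about printing the extracted values before incrementing can be sketched roughly as follows. This is only an illustration: the three input lines are made up, the shortened hash names are hypothetical, and the pattern uses \S+ (non-whitespace) on the assumption that the last three columns are what is wanted.

```perl
use strict;
use warnings;

# Hypothetical input: two lines that fit the pattern and one that does not.
my @lines = (
    'redhat:ssh winxp:1099 ESTABLISHED',
    'Active connections',
    'redhat:smtp winxp:1100 TIME_WAIT',
);

my ( %local, %remote, %states );
my @unmatched;

for my $line (@lines) {
    # In a condition, the list assignment yields the number of captured
    # values, so a failed match is false and nothing gets incremented.
    if ( my ( $l_addr, $r_addr, $state ) = $line =~ /(\S+)\s+(\S+)\s+(\S+)$/ ) {
        print "local=$l_addr remote=$r_addr state=$state\n";
        $local{$l_addr}++;
        $remote{$r_addr}++;
        $states{$state}++;
    }
    else {
        # Report the offending line instead of using undefined keys.
        push @unmatched, $line;
        warn "no match: $line\n";
    }
}
```

Only incrementing inside the successful branch means an unexpected line shows up as a "no match" message rather than as an "uninitialized value" warning further down.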
Troll wrote:
Now time for some stupid Qs:
Let's say that the data I have is in a file called employees. How can I call this file so that I can parse it?
1) Can I do:
@HRdata = `cat employees`;
while (<@HRdata>) {
The above is considered bad practice, especially if the file is large. Why read the entire file into memory when you can read, process, and discard a line at a time..? To open and read a file:
open (FIN, '<employees') || die "blah blah blah...";
while (<FIN>) {
}
2) With regard to the HEADING sections, the script has to be able to recognise the different sections by the following rules:
# there's a blank line before each heading
HEADING 1 # this is the name of the heading - this is a string with a special character and a blank space as part of it
ColumnA ColumnB ColumnC # these are the column names - these are strings which also can include a blank space if they have 2 or more words
******* # a sort of an underlining pattern
while (<FIN>) {
    if ( /^$/ ) {
        # this is a blank line, don't do anything
    } elsif ( /HEADING (.+)/ ) {
        # this is a heading, with the heading name in $1
    } elsif ( (($name, $sex, $status, $age) = /(\w+) (\w+) (\w+) (\d+)/) == 4 ) {
        # this line contains three words and a number, do whatever
        # (I'm not really sure if this will work. My Linux box is
        # down and I have no way of testing.)
    }
} # end of while(<FIN>)
I guess this is to make sure that one does not include any silly heading data as part of the arrays created and the parsing only takes place on 'real' data. Can you pls advise? Or do you need more info? I'm more in favour of creating separate 'if' loops due to my 'newbie' status. I'll get lost otherwise...
"if loops"...? How does one make an if loop?
Thanks again :)
Will I get these errors:
Use of uninitialized value in print at ./netstat.pl line 16, <NET> line 1.
Use of uninitialized value in print at ./netstat.pl line 17, <NET> line 1.
Use of uninitialized value in print at ./netstat.pl line 18, <NET> line 1.
....etc
if an undefined value is passed, for example, to $UDP4localaddress?
Because if that's the case then all I need to do is to make sure that
whatever I'm passing as part of the m()// is correctly split and defined as
a string, digit, word etc, yes?
Ga Mu wrote:
> while (<FIN>) {
> if ( /^$/ ) {
> # this is a blank line, don't do anything
next if /^\s*$/; # skip blank lines (or lines consisting of white space only)
> } elsif ( /HEADING (\.+)/ ) {
> # this is a heading, with the heading name in $1
if (/ .....) {
    # this is a heading
    next;
}
> } elsif ( (($name, $sex, $status, $age) = /(\s+) (\s+) (\s+) (\d+)/) ==
if (......) {
    # bla bla
    next;
}
next skips the rest of the loop body and moves on to the next iteration of the while loop.
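The `next`-based layout suggested above might look like this when filled in. It is a sketch only: the input lines are hypothetical (shaped like the sample in the original question, with four columns), and the patterns are assumptions about what a heading and a data row look like.

```perl
use strict;
use warnings;

# Hypothetical input, shaped like the sample in the original question.
my @lines = (
    '',
    'HEADING 1',
    '**********',
    'Pete Male Single 40',
    'Kate Female Married 30',
);

my ( $heading, %statuses );

for (@lines) {
    next if /^\s*$/;          # blank line: nothing to do
    next if /^\*+$/;          # underline row of asterisks

    if (/^HEADING (.+)/) {    # heading: remember which block we are in
        $heading = $1;
        next;
    }

    # Data row: three words and a number.
    if ( my ( $name, $sex, $status, $age ) = /^(\w+) (\w+) (\w+) (\d+)$/ ) {
        $statuses{$status}++;
        next;
    }
}

print "Under heading $heading: $statuses{Single} single, $statuses{Married} married\n";
```

Each test either handles the line and jumps to the next one, or falls through to the following test, so there is no chain of elsif branches to keep balanced.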
--
Kind regards, feel free to mail: mail(at)johnbokma.com (or reply)
virtual home: http://johnbokma.com/ ICQ: 218175426
John web site hints: http://johnbokma.com/websitedesign/
Troll wrote: Thanks again :)
Will I get these errors: Use of uninitialized value in print at ./netstat.pl line 16, <NET> line 1. Use of uninitialized value in print at ./netstat.pl line 17, <NET> line 1. Use of uninitialized value in print at ./netstat.pl line 18, <NET> line 1. ...etc
if an undefined value is passed, for example, to $UDP4localaddress? Because if that's the case then all I need to do is to make sure that whatever I'm passing as part of the m()// is correctly split and defined as a string, digit, word etc, yes?
Exactly. Experiment with your re in the m// until you get values.
Troll wrote: Thanks again. No reading files into memory from now on [unless necessary] :)
The data will actually be read from stdin in the form of
$ netstat | netstat.pl
or
$ netstat.pl < netstat
Will something like this suffice?
#!/usr/bin/perl -w
while (<STDIN>) {
With no file names on the command line, <> reads from STDIN, so all you need is:
while (<>) {
}
"if loops"...? How does one make an if loop?
What I meant here is that I'll create 4 separate 'if' sections [with their own elsif branches], one for each HEADING section [there are 4 of them]. So I think I meant 'if' statements...is that better or I am still confusing my terminology?
Makes more sense...
Thanks very much.
I'm having a bit of drama within my parsing loop.
If I'm trying to look for a specific pattern [ie. tcp] then I am able to
find it [by printing a 'found' message]. This message is then printed each
and every time 'tcp' is found [for a total of 6 times on 6 separate lines].
The script then finishes.
But if I'm trying to increment the number of times this pattern was found I
get the dreaded error:
Use of uninitialized value in hash element at ...
Here's the code extract:
while (<>) {
my($Proto)=
/(\s+)*$/;
if (/tcp/) {
print 'found';
$Protos{$Proto}++;
where am I failing ?
OK, I had some luck getting the first value incremented but no more.
Version which works:
*****************
if (/tcp/) {
my($Proto)=
/^(\w+)/;
$Protos{$Proto}++;
}
print "TCP = $Protos{'tcp'}\n";
#output section
TCP = 6 # all is correct here
Version which does not work:
**********************
if (/tcp/) {
my($Proto, $RecvQ)=
/^(\w+) (\s+)/;
$Protos{$Proto}++;
$RecvQs{$RecvQ)++;
}
print "TCP = $Protos{'tcp'}\n";
print "RecvQ = $RecvQs{'0'}\n";
#output section
TCP = 6 # all is correct here
Use of uninitialized value in concatenation (.) or string at... # error
time - this refers to the 2nd print statement
RecvQ = # this is blank
I have tried reading the second parameter as a (\s+) and as a (\d+) with no
luck. If you run netstat you will probably see that all items in the RecvQ
column are 0.
What have I done wrong now?
Can a number of whitespaces be represented by:
/^(\w+) (\s+)/; # this is a word followed by some spaces followed by a
string
or is the above only ONE whitespace?
Troll wrote: OK, I had some luck getting the first value incremented but no more.
Version which works: ***************** if (/tcp/) { my($Proto)= /^(\w+)/; $Protos{$Proto}++; } print "TCP = $Protos{'tcp'}\n";
#output section TCP = 6 # all is correct here
Version which does not work: ********************** if (/tcp/) { my($Proto, $RecvQ)= /^(\w+) (\s+)/; $Protos{$Proto}++; $RecvQs{$RecvQ)++; } print "TCP = $Protos{'tcp'}\n"; print "RecvQ = $RecvQs{'0'}\n";
#output section TCP = 6 # all is correct here Use of uninitialized value in concatenation (.) or string at... # error time - this refers to the 2nd print statement RecvQ = # this is blank
I have tried reading the second parameter as a (\s+) and as a (\d+) with no luck. If you run netstat you will probably see that all items in the RecvQ column are 0. What have I done wrong now?
I guess you want (\S+), i.e. non-whitespace. If it is always digits you
should use (\d+). If the number of spaces between proto and recvq can be
more than one you should use something like:
(\w+)\s+(\d+)
Print the values of $Proto and $RecvQ to see what was actually captured.
Also, you can't be sure there is any $RecvQs{'0'}, so check that it is
defined before printing. The same goes for %Protos.
print "TCP = ...." if defined $Protos{'tcp'};
print "RecvQ = ..." if defined $RecvQs{'0'};
Can a number of whitespaces be represented by: /^(\w+) (\s+)/; # this is a word followed by some spaces followed by a string
Nope. \s+ means one or more whitespace characters, not a string.
As written, your pattern is a word followed by exactly one space, followed by a captured run of whitespace.
See above.
HTH
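A small sketch of the two suggestions above, the `(\w+)\s+(\d+)` pattern and the `defined` guard before printing. The two input lines are invented for illustration and the column spacing is an assumption.

```perl
use strict;
use warnings;

# Hypothetical netstat-style lines; the gaps between columns vary in width.
my @lines = (
    'tcp        0      0 redhat:ssh',
    'tcp        0     12 redhat:smtp',
);

my ( %Protos, %RecvQs );

for my $line (@lines) {
    # \s+ skips any amount of whitespace; only the parenthesised parts are kept.
    my ( $Proto, $RecvQ ) = $line =~ /^(\w+)\s+(\d+)/ or next;
    $Protos{$Proto}++;
    $RecvQs{$RecvQ}++;
}

# Guard each print so a key that was never seen does not trigger
# an "uninitialized value" warning.
for my $key ( 'tcp', 'udp' ) {
    if ( defined $Protos{$key} ) {
        print "$key = $Protos{$key}\n";
    }
    else {
        print "$key = 0 (never seen)\n";
    }
}
print "RecvQ 0 = $RecvQs{'0'}\n" if defined $RecvQs{'0'};
```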
John
Troll wrote:
Here's the code extract:
while (<>) {
my($Proto)= /(\s+)*$/;
Your m// above is saying: find one or more whitespace characters, zero
or more times, terminated by an end-of-line.
if (/tcp/) {
print 'found';
$Protos{$Proto}++;
This m// has nothing to do with the value, if any, that was extracted
into $Proto. It is looking at the last line read for "tcp".
I'll continue in your next post...
Looks like I had some typos there but after correcting them it's still a no
go :(
/^(\w+) (\s+)/;
was changed to
/^(\w+)(\s+)(\S+)(\s+)(\S+)/;
# looking for word(s), 1 or more spaces, non-space(s), space(s),
non-space(s)
Still get the same output tho:
#output section
TCP = 6 # all is correct here
Use of uninitialized value in concatenation (.) or string at... # error
time - this refers to the 2nd print statement
RecvQ = # this is blank
What am I missing?
You just saved me some more stress John. Thanks !
This (\w+)\s+(\d+) did the trick.
Can you pls elaborate on the difference between including stuff in brackets
or not?
Is it always in brackets except for SPACE searches?
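On the question of what the brackets do: parentheses in a pattern capture whatever matched inside them, and anything outside parentheses must still match but is not returned. A minimal sketch, assuming a made-up line with four spaces between the two fields:

```perl
use strict;
use warnings;

# Hypothetical input: a word, four spaces, a number.
my $line = 'tcp' . ( ' ' x 4 ) . '0';

# Two pairs of parentheses, so two values come back: the word and the number.
my @two = $line =~ /^(\w+)\s+(\d+)/;

# Three pairs: the whitespace is now captured as well.
my @three = $line =~ /^(\w+)(\s+)(\d+)/;

print scalar(@two),   " captures: '$two[0]' and '$two[1]'\n";
print scalar(@three), " captures, the middle one is ", length( $three[1] ), " spaces\n";
```

So the rule is not about spaces as such: put parentheses around the parts you want to keep, and leave them off the parts that only need to be skipped over.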
Troll wrote: Looks like I had some typos there but after correcting them it's still a no go :( /^(\w+) (\s+)/; was changed to /^(\w+)(\s+)(\S+)(\s+)(\S+)/; # looking for word(s), 1 or more spaces, non-space(s), space(s), non-space(s)
Still get the same output tho: #output section TCP = 6 # all is correct here Use of uninitialized value in concatenation (.) or string at... # error time - this refers to the 2nd print statement RecvQ = # this is blank
What am I missing?
Post a valid line, i.e. an actual line of the input you are trying to match.
John
Troll wrote: OK, I had some luck getting the first value incremented but no more.
Version which works: ***************** if (/tcp/) { my($Proto)= /^(\w+)/; $Protos{$Proto}++; } print "TCP = $Protos{'tcp'}\n";
#output section TCP = 6 # all is correct here
Version which does not work: ********************** if (/tcp/) { my($Proto, $RecvQ)= /^(\w+) (\s+)/;
The above re says: find a word and extract it into $Proto, skip exactly
one space, then find a run of whitespace and extract it into $RecvQ.
Did you mean to use an upper-case S, meaning non-whitespace..?
$Protos{$Proto}++; $RecvQs{$RecvQ)++; } print "TCP = $Protos{'tcp'}\n"; print "RecvQ = $RecvQs{'0'}\n";
#output section TCP = 6 # all is correct here Use of uninitialized value in concatenation (.) or string at... # error time - this refers to the 2nd print statement RecvQ = # this is blank
I have tried reading the second parameter as a (\s+) and as a (\d+) with no luck. If you run netstat you will probably see that all items in the RecvQ column are 0. What have I done wrong now?
If you are trying to parse this:
Proto Recv-Q Send-Q Local Address Foreign Address State
tcp 0 0 redhat:ssh winxp:1099 ESTABLISHED
how about this:
my ($proto,$rxQ,$txQ,$l_addr,$r_addr,$state) =
/^(\w+) (\d+) (\d+) (\S+) (\S+) (\w+)/;
which says (with whitespace in between each):
find a word and extract into $proto,
find a number and extract into $rxQ,
ditto for $txQ,
find NON-whitespace and extract into $l_addr,
ditto for $r_addr,
find a word and extract into $state.
Use \S+ for the addresses because they contain numbers, letters, and a
colon. Neither \w nor \d would match these.
Can a number of whitespaces be represented by: /^(\w+) (\s+)/; # this is a word followed by some spaces followed by a string or is the above only ONE whitespace?
\s (lower-case) DOES NOT mean a string, it means whitespace.
\S (upper-case) means non-whitespace.
If you have access to "The Camel Book" by ORA, try reading the section
on pattern matching. It's clear you're not getting how to construct a
meaningful regular expression.
Ga Mu wrote:
If you are trying to parse this:
Proto Recv-Q Send-Q Local Address Foreign Address State
tcp 0 0 redhat:ssh winxp:1099 ESTABLISHED
how about this:
my ($proto,$rxQ,$txQ,$l_addr,$r_addr,$state) = /^(\w+) (\d+) (\d+) (\S+) (\S+) (\w+)/;
WHOOPS!
That re should have been:
/^(\w+) +(\d+) +(\d+) +(\S+) +(\S+) +(\w+)/
I just tested it and it works (my Linux box is back up).
Alternatively, you could use "\s+" instead of " +". The former means
one or more whitespace characters (space, tab, newline) , the latter (I
think) means find one or more space characters only (no tab or newline).
which says (with whitespace in between each):
find a word and extract into $proto, find a number and extract into $rxQ, ditto for $txQ, find NON-whitespace and extract into $l_addr, ditto for $r_addr, find a word and extract into $state.
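As a rough end-to-end illustration of the corrected pattern, here it is in its \s+ form, applied to a header row and two data rows. The rows and their column spacing are assumptions modelled on the sample above, and the counting hashes are named after Greg's variables.

```perl
use strict;
use warnings;

# Sample lines in the layout quoted above; the column spacing is assumed.
my @lines = (
    'Proto Recv-Q Send-Q Local Address           Foreign Address         State',
    'tcp        0      0 redhat:ssh              winxp:1099              ESTABLISHED',
    'tcp        0      0 redhat:smtp             winxp:1100              TIME_WAIT',
);

my ( %protos, %states );

for my $line (@lines) {
    # The header row has no digits in its second column, so it fails
    # the match and is skipped.
    my ( $proto, $rxQ, $txQ, $l_addr, $r_addr, $state ) =
        $line =~ /^(\w+)\s+(\d+)\s+(\d+)\s+(\S+)\s+(\S+)\s+(\w+)/
        or next;

    $protos{$proto}++;
    $states{$state}++;
}

print "$_ = $protos{$_}\n" for sort keys %protos;
print "$_ = $states{$_}\n" for sort keys %states;
```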
Greg,
Your last bit made me laugh cause it is exactly how I feel. Still need a lot
of work to understand regexes.
But thanks to the last posts from yourself and John and 2 links I found,
this is much clearer now.
My apologies to both of you for being such a pain :(
:)
Mine was failing somewhere on the 4th parameter search ie. the first (\S+)
but that's because I was trying to do:
print "Dells = $LocalAddresses{'Dell'}\n";
whereas there was no value of Dell being passed anywhere - there was only
Dell:smtp and Dell:32769...DOH...
Also, what's the purpose of having something like this at the beginning of
the script:
my (%protos,%rxQs,%txQs,%l_addrs,%r_addrs,%states)
cause when I remarked it with # there was no diff to the way the script
runs.
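On why commenting out the `my (...)` line made no difference: without `use strict`, Perl quietly creates any undeclared variable as a package global the first time it is used, so the script runs the same either way. The declaration starts to matter once `use strict` is switched on. A minimal sketch with made-up data (the `@seen` list is hypothetical):

```perl
use strict;
use warnings;

# File scope: visible in the loop below and in the output section after it.
my %protos;

# Hypothetical protocol column values, standing in for parsed netstat lines.
my @seen = ( 'tcp', 'tcp', 'udp', 'tcp' );

for my $line (@seen) {
    # Loop scope: a fresh $proto on each pass, gone when the pass ends.
    my $proto = $line;
    $protos{$proto}++;
}

# Under strict, a misspelled name such as %protoz would stop the script
# at compile time instead of silently creating a second, empty hash.
print "TCP = $protos{tcp}\n";
print "UDP = $protos{udp}\n";
```

Declaring the hashes once at the top is therefore fine: they need to outlive the loop so the output section can read them, while the per-line scalars are better declared inside the loop.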
Troll wrote: Will do - thanks. I came across the use strict before [looking at other ppls script examples] but never got the chance to read up on it yet.
On a TO DO list now...
Put it on *top* of the list. Speaking of top: don't top post, and do snip
things that are no longer relevant.
John
Sorry. I tried to improve the visibility a bit as the thread was scrolling
but I see your point.
A NG etiquette refresher needed...
Will keep things in order now and *snip* [what does that stand for?] them
when necessary
Troll wrote: Sorry. I tried to improve the visibility a bit as the thread was scrolling but I see you point. A NG etiquette refresher needed... Will keep things in order now and *snip* [what does that stand for?] them when necessary
cut. It is quite common when a large part is removed to state what was
removed, e.g.:
[cut perl example]
or
[snip perl example]
Sometimes <> is used instead of []. Or even ...
Most newsreaders provide scrolling by pressing the space bar. Reading a
top post and scrolling down to understand what it is referring to (and
back up and down etc.) is always harder than reading from top to bottom. Most
postings fit on a screen after careful cutting.
John
John and Greg,
Thanks for the help today [again].
I'm sure to have some more Qs tomorrow but right now I need to rewrite the
code from my laptop to an external Solaris box. This will also mean my
variable definitions will change. This little task will take me some time,
especially as the vi editor I have to use is a less friendly version than
the one which comes with my RH9. I then need to test the code on the
external system so I don't see myself posting any more today until same time
tomorrow.
Cheers,
T
"John Bokma" <po********@castleamber.com> wrote in message
news:3f*********************@news.kabelfoon.nl... Troll wrote:
Sorry. I tried to improve the visibility a bit as the thread was
scrolling but I see you point. A NG etiquette refresher needed... Will keep things in order now and *snip* [what does that stand for?]
them when necessary
cut. It is quite common when a large part is removed to state what was removed, e.g.:
[cut perl example]
or
[snip perl example]
Sometimes <> is used instead of []. Or even ...
Most newsreaders provide scrolling by pressing the space bar. Reading a top post and scrolling down to understand to what it is referring (and back up and down etc) is always harder than reading bottom down. Most postings fit on a screen after careful cutting.
-- Kind regards, feel free to mail: mail(at)johnbokma.com (or reply) virtual home: http://johnbokma.com/ ICQ: 218175426 John web site hints: http://johnbokma.com/websitedesign/
Back again...
I have to say that I'm now in possession of a mostly working script - thanks
to both of you and some Google links.
How can I get the total number of items that have been passed here?
$UDP6LocalAddresses{$UDP6LocalAddress}++;
Troll wrote: Back again... I have to say that I'm now in possession of a mostly working script - thanks to the both of you and some Google links.
How can I get the total number of items that has been passed here? $UDP6LocalAddresses{$UDP6LocalAddress}++;
print "$UDP6LocalAddresses{$UDP6LocalAddress}\n";
Or do you mean the total over all addresses?
There are two ways: summing all the hash values, or keeping an additional
counter. For the latter, add this before your loop:
my $total_count = 0;
and after each $UDP6LocalAddresses{$UDP6LocalAddress}++; do
$total_count++;
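Put together, the extra-counter approach might look like the sketch below. The @addresses list is a hypothetical stand-in for whatever the parsing loop extracts from the input; only the hash and $total_count come from the thread.

```perl
use strict;
use warnings;

# Hypothetical addresses standing in for the values parsed from the input.
my @addresses = ('*.520', '*.520', '::1.123');

my %UDP6LocalAddresses;
my $total_count = 0;    # declared once, before the loop

for my $UDP6LocalAddress (@addresses) {
    $UDP6LocalAddresses{$UDP6LocalAddress}++;    # count per address
    $total_count++;                              # count over all addresses
}

# Report each distinct address with its count, then the overall total.
print "$_: $UDP6LocalAddresses{$_}\n" for sort keys %UDP6LocalAddresses;
print "total: $total_count\n";
```

The counter only stays correct if it is incremented in exactly the same places as the hash, so any line that should not be counted has to be skipped before both statements.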
Yeah, I was trying this but kept getting a global symbol error of some sort:
Global symbol "" requires explicit package name
If I use the total_count method, it unfortunately counts my section headings as well
[I ended up simplifying the section searches a bit].
Troll wrote: Yeah, I was trying this but kept on getting a Global parameter error of some sort:
Global symbol "" requires explicit package name
If I use the total_count method it unfortunately counts my section headings as well [I ended up simplifying the section searches a bit]
my $sum = 0;
foreach my $value (values %UDP6LocalAddresses) {
    $sum += $value;
}
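The "Global symbol ... requires explicit package name" message is what use strict reports when a variable is used without having been declared, so presumably one of the variables was missing its my. A self-contained sketch with every variable declared follows; the sample counts are hypothetical, and the List::Util line is an alternative to the loop, not something suggested in the thread.

```perl
use strict;
use warnings;
use List::Util qw(sum);

# Hypothetical counts, as the parsing loop might have left them.
my %UDP6LocalAddresses = ('*.520' => 2, '::1.123' => 1);

# Loop form: every variable is declared with "my", so strict is satisfied.
my $sum = 0;
foreach my $value (values %UDP6LocalAddresses) {
    $sum += $value;
}

# Alternative using the core List::Util module; the leading 0 keeps the
# result defined even when the hash is empty.
my $sum_alt = sum(0, values %UDP6LocalAddresses);

print "loop: $sum, List::Util: $sum_alt\n";
```

Summing the values afterwards has the advantage that the total can never drift out of step with the hash.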
OK, I fixed it.
It ain't the most pretty of solutions but it works.
if (/UDP: IPv6/../^$/) {
    my($UDP6LocalAddress) = /^\s+(\S+)/;
    $UDP6LocalAddresses{$UDP6LocalAddress}++;
    if (/^$/ || /UDP: IPv6/ || /Local Address/ || /-------/) {
        #do nothing
    } else {
        $AddressCount++;
    }
}
The following is now totally obsolete, but I might leave it in anyway - maybe I'll sort this out later when I do some more reading:
my($UDP6LocalAddress) = /^\s+(\S+)/;
$UDP6LocalAddresses{$UDP6LocalAddress}++;
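In the fix above, the hash still appears to be incremented for heading, separator and blank lines, and only $AddressCount is protected. One possible tidy-up is to skip those lines before touching either variable, so that the hash and the counter always agree. This is a sketch under assumptions: the sample input in @lines is hypothetical and only loosely modelled on the patterns used in the thread.

```perl
use strict;
use warnings;

# Hypothetical sample input; the real data layout is not shown in the thread.
my @lines = (
    "UDP: IPv4",
    "   Local Address",
    "--------------------",
    "      *.123",
    "",
    "UDP: IPv6",
    "   Local Address",
    "--------------------",
    "      *.520",
    "      *.520",
    "      ::1.123",
    "",
);

my %UDP6LocalAddresses;
my $AddressCount = 0;

for (@lines) {
    # Only consider lines from the section heading up to the next blank line.
    next unless /UDP: IPv6/ .. /^$/;

    # Skip the heading, column title, separator and blank lines first.
    next if /^$/ || /UDP: IPv6/ || /Local Address/ || /-------/;

    # Skip anything that does not look like an indented address line.
    my ($UDP6LocalAddress) = /^\s+(\S+)/ or next;

    $UDP6LocalAddresses{$UDP6LocalAddress}++;
    $AddressCount++;
}

print "$_: $UDP6LocalAddresses{$_}\n" for sort keys %UDP6LocalAddresses;
print "addresses counted: $AddressCount\n";
```

With the filtering done up front, the address extraction and the hash increment are no longer obsolete: they feed the per-address counts, while $AddressCount gives the total.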
Thanks John - will give it a try.