Counting Punctuation Characters in a text file

watcher00
4 New Member
Hi

I'm a complete newbie at Perl and I was wondering if I could get some help completing an exercise I've come across.

I need to count the punctuation marks in a text file and then output a list of every punctuation mark that occurs, with its frequency printed next to it, e.g.

2 hyphens
3 commas
4 periods
etc.

I'm not even sure where to begin. The exercise does give a hint, but I'm not sure how it helps me. It says: "you will need to have some way of tying in the punctuation characters to its equivalent English name. Think ASCII and hash. Only those punctuation characters within a standard ASCII character set need to be checked for"

Any help with this would be appreciated.

Thanks
Mar 6 '07 #1
KevinADC
4,059 Recognized Expert Specialist
A place to start is the perlfaq4 entry "How can I count the number of occurrences of a substring within a string?"

From there maybe you can figure out the "Think ASCII and hash" part.
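
For the "ASCII" half of that hint, a minimal sketch of how the candidate characters could be enumerated. It assumes "punctuation" means the 32 printable, non-alphanumeric, non-space characters of 7-bit ASCII; the sub name `ascii_punctuation` is made up for illustration and is not part of the exercise.

```perl
use strict;
use warnings;

# Return every punctuation character in the 7-bit ASCII range.
# The four code ranges are the ASCII blocks that hold punctuation:
# 33-47, 58-64, 91-96 and 123-126.
sub ascii_punctuation {
    return map { chr($_) } ( 33 .. 47, 58 .. 64, 91 .. 96, 123 .. 126 );
}

my @punct = ascii_punctuation();
printf "%d punctuation characters: %s\n", scalar @punct, join( ' ', @punct );
```

Each of these characters can then serve as a hash key whose value is its English name.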
Mar 6 '07 #2
docsnyder
88 New Member
@watcher00

I closed "http://perldoc.perl.org/perlfaq4.html#How-can-I-count-the-number-of-occurrences-of-a-substring-within-a-string%3f" very fast, because it's not quite a "quick" reference. You should study it, of course, but in order to get immediate help, take this as a hint on how to proceed:

    $text  = "Hello! This is a sentence, and an example. Two commas, one exclamation mark and two dots.";
    # The period is a regex metacharacter, so it is escaped by hand
    @marks = ( '\.', ',', '!' );

    for $mark ( @marks ) {
        $count{$mark} = () = $text =~ m(($mark))g;
        printf("mark '$mark' occurs %d times\n", $count{$mark});
    }

Be aware of Perl's regex metacharacters, though: they must be escaped (like "\.")!
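
To see why the escaping matters, here is a small sketch comparing an unescaped and an escaped period. The helper `count_pattern` and the sample text are invented for illustration; `quotemeta` is a Perl built-in that adds the backslashes for you.

```perl
use strict;
use warnings;

# Count how often a regex pattern matches in a string.
sub count_pattern {
    my ( $text, $pattern ) = @_;
    my $count = () = $text =~ /$pattern/g;
    return $count;
}

my $text = "One. Two. Three!";

# Unescaped, '.' is a metacharacter that matches any character except a newline,
# so this counts every character in the string.
printf "unescaped '.': %d\n", count_pattern( $text, '.' );

# quotemeta() backslash-escapes the dot so that it matches only a literal period.
printf "escaped   '.': %d\n", count_pattern( $text, quotemeta('.') );
```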

Enjoy!

Greetz, Doc
Mar 7 '07 #3
watcher00
4 New Member
Thanks guys

Here is what I've come up with. I would appreciate any comments and suggestions.

    #!c:/Perl/bin/perl.exe

    print("Please type in the file name\n");
    $file= <STDIN>;
    open(FILE, "$file") || die "Couldn't open file: $!";

    $text  = join(//,<FILE>);
    @marks = ( '!', '"', '\'', '\(', '\)', ',', '-', '\.', '/', ':', ';', '\?' );

    %names = (
        "!" => "exclamation marks",
        "\"" => "double quotes",
        "\'" => "single quotes",
        "\\(" => "opening parenthesis",
        "\\)" => "closing parenthesis",
        "," => "commas",
        "-" => "hyphens",
        "\\." => "periods",
        "/" => "forward slashs",
        ":" => "colons",
        ";" => "semi-colons",
        "\\?" => "question marks"
    );

    for $mark ( @marks )
    {
        $count{$mark} = () = $text =~ m(($mark))g;
        printf("%d $names{$mark}\n", $count{$mark});
    }
Mar 7 '07 #4
miller
1,089 Recognized Expert Top Contributor
Looks good watcher00.

A few stylistic changes that I would suggest:

1) Always "use strict;". It's a good habit to follow.
2) join takes a string separator, not a regular expression. Use an empty string instead of an empty pattern; in your code the empty pattern is evaluated as a match, so each of the lines of the file will be joined by '1'.
3) Take advantage of the \Q alias for quotemeta to avoid having to escape the metacharacters in the regex manually.
4) This is a personal preference, but I never use () as the delimiter for a pattern. I much prefer either // or {}, as I think they are easier to read.
5) Take advantage of keys so you don't have to define your list more than once.
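
Point 2 is easy to check by experiment. The following is only a sketch: the sub names `join_with_pattern` and `join_with_string` and the sample lines are made up, and `$_` is set to an empty string purely to keep the match from warning about an undefined value.

```perl
use strict;
use warnings;

# What the original script did: the empty pattern // is evaluated as a
# match against $_, and its true result (1) becomes the separator.
sub join_with_pattern {
    local $_ = '';    # give the match a defined target to avoid a warning
    return join( //, @_ );
}

# What was intended: join the lines with an empty string.
sub join_with_string {
    return join( '', @_ );
}

my @lines = ( "first,\n", "second.\n" );
print join_with_pattern(@lines);    # a stray "1" appears between the lines
print join_with_string(@lines);     # the lines are concatenated unchanged
```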

These stylistic changes result in the following code:

    #!c:/Perl/bin/perl.exe

    use strict;

    print("Please type in the file name\n");
    my $file = <STDIN>;

    open(FILE, "$file") || die "Couldn't open file: $!";
    # Read the whole file into one string
    my $text = join '', <FILE>;
    close FILE;

    # Punctuation character => English name used in the report
    my %names = (
        '!' => "exclamation marks",
        '"' => "double quotes",
        "'" => "single quotes",
        '(' => "opening parentheses",
        ')' => "closing parentheses",
        ',' => "commas",
        '-' => "hyphens",
        '.' => "periods",
        '/' => "forward slashes",
        ':' => "colons",
        ';' => "semi-colons",
        '?' => "question marks"
    );

    my %count;

    for my $mark ( keys %names ) {
        # \Q...\E quotes any regex metacharacters in $mark
        $count{$mark} = () = $text =~ m/(\Q$mark\E)/g;
        printf("%d $names{$mark}\n", $count{$mark});
    }

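
An alternative to running one regex per mark is a single pass over the text that tallies every punctuation character into a hash, which is one way to read the exercise's "Think ASCII and hash" hint. This is only a sketch under some assumptions: it needs Perl 5.14 or later for the `/a` modifier (which restricts `[[:punct:]]` to ASCII), the sub name `count_punctuation` and the sample sentence are invented, and the name table is deliberately incomplete.

```perl
use strict;
use warnings;

# English names for a few ASCII punctuation marks (extend as needed).
my %names = (
    '!' => 'exclamation marks',
    ',' => 'commas',
    '-' => 'hyphens',
    '.' => 'periods',
    '?' => 'question marks',
);

# Tally every ASCII punctuation character in one pass over the text.
sub count_punctuation {
    my ($text) = @_;
    my %count;
    $count{$1}++ while $text =~ /([[:punct:]])/ag;
    return %count;
}

my %count = count_punctuation("Well, well... is it so? Yes - it is!");

# keys returns the marks in no particular order, so sort for stable output.
for my $mark ( sort keys %count ) {
    # Fall back to the raw character when no English name is registered.
    my $name = $names{$mark} // "'$mark'";
    printf "%d %s\n", $count{$mark}, $name;
}
```

Only marks that actually occur end up in the hash, so nothing is printed with a count of zero.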
- Miller
Mar 7 '07 #5
watcher00
4 New Member
Thanks a lot for your help, guys. As I'm still learning Perl, I'm not too sure exactly how this line of code works:

    $count{$mark} = () = $text =~ m/(\Q$mark\E)/g

Specifically, why does the " = () = " part need to be in there? What does it do?
Mar 8 '07 #6
miller
1,089 Recognized Expert Top Contributor
Ok, let's briefly talk about "$text =~ m/(\Q$mark\E)/g":

In scalar (boolean) context, this returns true or false depending on whether the regex matches. In a while loop it keeps returning true, resuming after the previous match each time, until no matches remain; that is the effect of the global 'g' modifier.

    if ($text =~ m/(\Q$mark\E)/g)

or

    while ($text =~ m/(\Q$mark\E)/g) {

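
As a rough illustration of the while-loop form, the sketch below records where each match starts. The sub name `match_positions` and the sample string are hypothetical; `pos` is the Perl built-in that reports where the last `/g` match ended.

```perl
use strict;
use warnings;

# Collect the offset of every literal occurrence of $mark in $text.
sub match_positions {
    my ( $text, $mark ) = @_;
    my @positions;
    while ( $text =~ /\Q$mark\E/g ) {
        # pos() is the offset just past the current match,
        # so subtract the length of the mark to get its start.
        push @positions, pos($text) - length($mark);
    }
    return @positions;
}

print join( ', ', match_positions( 'a.b.c.', '.' ) ), "\n";
```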
In a list context, this statement will return all of the captured groups.

    my @marks = ($text =~ m/(\Q$mark\E)/g);

Therefore, the statement you're asking about forces the match into list context by assigning its result to the empty list (), and then evaluates that list assignment in scalar context. A list assignment in scalar context yields the number of elements on its right-hand side, in other words the number of matches.

    $count{$mark} = () = $text =~ m/(\Q$mark\E)/g

is equivalent to:

    my @array = ($text =~ m/(\Q$mark\E)/g);
    $count{$mark} = scalar(@array);

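
Putting the three contexts side by side may help. This is just a sketch with an invented sample string; note that a scalar-context `/g` match leaves a search position behind, so it is reset before the next match.

```perl
use strict;
use warnings;

my $text = 'a, b, c, d';

# Scalar context: the match only reports success (1) or failure.
my $matched = $text =~ /,/g;

# Reset the /g search position left behind by the scalar-context match.
pos($text) = undef;

# List context: the match returns every matched substring.
my @commas = $text =~ /,/g;

# List assignment evaluated in scalar context yields the number of
# elements on its right-hand side, even though () stores none of them.
my $count = () = $text =~ /,/g;

print "matched=$matched list=@commas count=$count\n";
```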
As always, to better understand the code, try experimenting:

    my $test = 'this is a test. of foo. of bar. of baz. boo yay!';
    my @array = $test =~ m/(\Q.\E)/g;
    my $count = @array;
    print "$count\n";
    # Outputs 4

- Miller
Mar 8 '07 #7
