counting different number of characters

  1. #/usr/bin/perl 
  2.  
  3.  use strict;
  4.  use warnings;
  5.  
  6.  
  7.  #  Setup the counters
  8.  
  9.  my $shed_counter= 0;          # question mark counter
  10.  my $doubleshed_counter = 0;      # period counter
  11.  my $ga_counter = 0;    # comma counter
  12.  
  13.  
  14.  # Open a file
  15.  
  16.  open(FILE, "world affairs.txt");
  17.  
  18.  
  19.  # Process the file
  20.  
  21.  while(<FILE>)
  22.  {
  23.      if( $_ = m/\།/)
  24.      {
  25.          $shed_counter++;
  26.      }
  27.  
  28.      if( $_ = m/\།+/)
  29.      {
  30.          $doubleshed_counter++;
  31.      }
  32.  
  33.      if( $_ = m/\ག/)
  34.      {
  35.          $ga_counter++;
  36.      }
  37.  }
  38.  
  39.  print("Number of question marks:  $qm_counter\n");
  40.  print("Number of commas:  $comma_counter\n");
  41.  print("Number of periods:    $period_counter\n");
  42.  
These are the three characters I want to count in the input text (the whole file):
1) །
2) ག
3) །།
Is it possible?

Thanks,
chungku
Jan 28 '10 #1
1 Reply


Yes, it's possible.

But your characters all show up as boxes in this listing, so nobody can verify whether you are using the right regular expression. You write about question marks, commas and periods, but these characters are definitely not question marks, commas or periods; otherwise we would see them properly here. For example, I can see the comma after "FILE" and the period after "affairs" in line 16: open(FILE, "world affairs.txt")

Other programmers probably won't see the characters correctly either, unless they use exactly the same editor and locale settings you used.
So do us and the others a favor and code professionally:
use plain ASCII characters instead of look-alike national characters, and if you want to count national characters, write them as hexadecimal Unicode values instead of pasting them directly into the source code.

Example:
You can match characters by their hex value with \x. Just put \x before the hexadecimal number (for values above FF, put the number in braces, as in \x{0F0D}). For instance, if you wish to delete all backspace characters, use this:

perl -pi -e "s|\x08||g" file.txt
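
Applied to the characters in the question, the same idea might look like the sketch below. It assumes the two characters are U+0F0D (Tibetan mark shad) and U+0F42 (Tibetan letter ga); that mapping is not given in the thread, so check it against your own text. The sample string is made up for illustration.

```perl
use strict;
use warnings;

# Assumed code points (not stated in the thread):
# U+0F0D TIBETAN MARK SHAD, U+0F42 TIBETAN LETTER GA.
my $shad = qr/\x{0F0D}/;
my $ga   = qr/\x{0F42}/;

# Sample built from escapes only, so the source file stays pure ASCII:
# ga, shad, ga, shad, shad
my $sample = "\x{0F42}\x{0F0D}\x{0F42}\x{0F0D}\x{0F0D}";

# A /g match in list context returns every match; assigning that list to ()
# and reading the assignment in scalar context gives the number of matches.
my $shad_count = () = $sample =~ /$shad/g;
my $ga_count   = () = $sample =~ /$ga/g;

print "shad: $shad_count, ga: $ga_count\n";   # prints "shad: 3, ga: 2"
```

Written this way, the pattern reads the same in every editor, whatever font or locale is in use.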


By the way, why do you store your counts in $shed_counter, $doubleshed_counter and $ga_counter, but print $comma_counter, $period_counter and $qm_counter? The latter variables are never declared or initialized, so with "use strict" the script will not even compile. It looks like a programming error. You probably want to print the first three variables.
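
A corrected version of the script could look roughly like this. It is only a sketch under assumptions the thread does not confirm: the file is UTF-8 encoded, the characters are U+0F0D (shad) and U+0F42 (ga), you want every occurrence counted rather than every matching line, and a shad that is part of a double shad should not also count as a single one. The sub name count_marks and the hash keys are made up for the example. Note also that it binds the match with =~, whereas the original "$_ = m/.../" assigns the match result to $_ and so overwrites the line.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: counts occurrences on an already opened, decoded filehandle.
sub count_marks {
    my ($fh) = @_;
    my %count = (shad => 0, double_shad => 0, ga => 0);
    while (my $line = <$fh>) {
        # Two shads in a row
        $count{double_shad}++ while $line =~ /\x{0F0D}{2}/g;
        # A shad with no other shad directly before or after it
        $count{shad}++ while $line =~ /(?<!\x{0F0D})\x{0F0D}(?!\x{0F0D})/g;
        # Letter ga
        $count{ga}++ while $line =~ /\x{0F42}/g;
    }
    return %count;
}

# Process a file only when one is named on the command line
if (@ARGV) {
    my $path = shift @ARGV;
    # Three-argument open with a decoding layer, so the regexes see characters, not bytes
    open(my $fh, '<:encoding(UTF-8)', $path) or die "Cannot open $path: $!";
    my %count = count_marks($fh);
    close($fh);

    print "Number of shads:        $count{shad}\n";
    print "Number of double shads: $count{double_shad}\n";
    print "Number of ga:           $count{ga}\n";
}
```

Saved under a name of your choice (count.pl here is just a placeholder), it would be run as: perl count.pl "world affairs.txt"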
Jan 29 '10 #2
