word count problem

Can anyone help me add a function that will count the occurrences of each word in an input file? Here's the code I have so far. It counts the number of characters, words, and lines, but I also need the number of occurrences of each word.

#include <fstream>
#include <iostream>
#include <string>
#include <cstdlib>

using namespace std;

string getInputFileName(); // a function to prompt for the complete file name

int numCharsInFile( ifstream &in, int &numLines ); // a function to count the
                                                   //    number of characters and
                                                   //    lines in a text file

int numWordsInFile( ifstream &in, int &numWords ); // a function to count words in file

int main ()
{
  char c;
  int nLines,          // number of lines in the text file
      nChars,          // number of characters in the text file
      avgCharsPerLine, // average number of characters per line
      nWords;          // number of words in the text file

  ifstream inFile; // handle for the input text file

  string fileName; // complete file name including the path

  fileName = getInputFileName(); // prompt and obtain the full file name

  inFile.open(fileName.c_str()); // try to open the file

  if( !inFile.is_open() )    // test for unsuccessful file opening
  {
    cerr << "Cannot open file: " << fileName << endl << endl;
    exit (EXIT_FAILURE);     // report the failure to the caller
  }

  nChars = numCharsInFile( inFile, nLines ); // determine the number of lines
                                             //    and characters in the file
  nWords = numWordsInFile( inFile, nWords ); // determine the number of words

  avgCharsPerLine = nChars / nLines; // integer division; nLines is at least 1

  cout << "The number of characters in the file: " << fileName
       << " is = " << nChars << endl << endl;

  cout << "The number of lines in the file: " << fileName
       << " is = " << nLines << endl << endl;

  cout << "The number of Words in the file: " << fileName
       << " is = " << nWords << endl << endl;

  cout << "The average number of characters per line in the text file: "
       << fileName << " is: " << avgCharsPerLine << endl << endl;

  cin >> c;       // wait for a character so the console window stays open
  inFile.close(); // close the input file

  return 0;
}

string getInputFileName()
{
  string fName; // fully qualified name of the file

  cout << "Please enter the fully qualified name of the " << endl
       << "input text file (i.e. including the path): ";
  cin >> fName; // cannot handle blanks in a file name or path
  cout << endl;

  return fName;
}

int numCharsInFile( ifstream &in, int &numLines )
{
  int numChars = 0;

  char ch; // character holder

  numLines = 0; // initialize the number of lines to zero

  while ( in.get(ch) ) // get the next character from the file
                       //   the function get will also get whitespace
                       //   i.e. blanks, tabs and end of line characters
  {
    if ( ch != ' ' )
    {
      if ( ch != '\n' )
        numChars++; // count ch if it is NOT '\n' AND NOT a blank space
      else
        numLines++; // increase the count of lines by one if ch IS '\n'
    }
  }
  numLines += 1; // count the last line, assuming the file does not end with '\n'
  return numChars;
}

int numWordsInFile( ifstream &in, int &nWords )
{
  in.clear();                  // clear the end-of-file state left by the previous pass

  in.seekg(0, ios_base::beg);  // rewind to the start of the file

  int numWords = 0;

  char ch;

  // Counts separator characters, so this assumes words are separated
  // by exactly one blank, tab, or newline.
  while ( in.get(ch) )
  {
    if ( ch == ' ' || ch == '\n' || ch == '\t' )
      numWords++;
  }

  return numWords + 1; // add one for the last word, assuming no trailing whitespace
}
Nov 5 '07 #1
scruggsy
I'm not going to write the code, but think about it: if you're going to count the occurrences of each distinct word, you'll need to remember those words. So as you read words in, you'll need to store them so that subsequent words can be compared to them. How you store them is up to you, as is how you compare them. STL containers can be a big help there. Take a look at std::set if you're not familiar with it; it's a container which can't hold duplicate elements, which lets you easily determine whether a word has already been seen in the file. Another good way to do this might be to just store each word as it is read, then sort the words in alphabetical order: recurring words will appear next to each other, making it easy to count them.
Nov 5 '07 #2
Laharl
std::map would probably be better than std::set, since that way you can map strings (words) to integers (frequency counts).
Nov 5 '07 #3
weaknessforcats
Also keep in mind that the >> operator stops on whitespace. You can fetch one word by:
string str;
inFile >> str;   // inFile is the open ifstream; reads up to the next whitespace
Also, you are not required to declare your variables at the beginning of each function. It looks like you have a C background and are just starting out on C++.
Nov 6 '07 #4
