I would greatly appreciate it if any one of you kind souls could take some time to help me out with an interesting bug in my program. I have tried many times to find the source of the problem unsuccessfully and believe that a second set of eyes will do wonders. Thanks in advance for reading my post.
*********************************
So, after quite some work, I was able to get this program to run (I am a bit of a Java novice). However, the output is not what I am looking for. Here's the assignment for background info:
LAB ASSIGNMENT A19.3
CountWords
Background:
1. This lab assignment will count the occurrences of words in a text file. Here are some special cases that you must take into account:
- Hyphenated-words w/out space = 1 word
- Hyphenated - words w/ space = 2 words
- Apostrophes in words = 1 word
2. You are encouraged to use a combination of all the programming tools you have learned so far, such as:
Data Structures: Array classes, String class, ArrayList class
Algorithms: sorting, searches, text file processing
Assignment:
1. Your instructor will provide you with a data file (such as test.txt, Lincoln.txt, or dream.txt) to analyze. Parse the file and print out the following statistical results:
– Total number of unique words used in the file.
– Total number of words in a file.
– The top 30 words which occur the most frequently, sorted in descending order by count.
For example:
1 103 the
2 97 of
3 59 to
4 43 and
5 36 a
6 32 be
7 32 we
8 26 will
9 24 that
10 21 is
... rest of top 30 words ...
Number of words used = 525
Total # of words = 1577
Now, time for my code:
wordCounter.java:
import java.util.*;
import java.io.*;

public class wordCounter
{
    private String inFileName;
    private int i;
    private ArrayList <String> sortedWords = new ArrayList <String> ();
    private ArrayList <String> uniqueWords = new ArrayList <String> ();
    private ArrayList <Word> indivCount = new ArrayList <Word> ();

    public wordCounter(String fn)
    {
        inFileName = fn;
    }

    public void readData(ArrayList <String> fileWords)
    {
        Scanner in;
        try
        {
            in = new Scanner(new File(inFileName));
            int i = 0;
            while(in.hasNext())
            {
                fileWords.add(in.next().toLowerCase());
                i++;
            }
        }
        catch(IOException x)
        {
            System.out.println("Error: " + x.getMessage());
        }
    }

    public void sortList(ArrayList <String> a)
    {
        for(int position = 0; position < a.size(); position++)
        {
            String key = a.get(position);

            while(position > 0 && a.get(position - 1).compareTo(key) > 0)
            {
                a.set(position, a.get(position - 1));
                position--;
            }

            a.set(position, key);
        }
        sortedWords = a;
    }

    public int findUnique(ArrayList <String> fileWords)
    {
        uniqueWords = fileWords;

        while(i < uniqueWords.size() - 1)
        {
            if(uniqueWords.get(i).compareTo(uniqueWords.get(i+1)) == 0)
            {
                uniqueWords.remove(i+1);
            }
            else
            {
                i++;
            }
        }
        return uniqueWords.size();
    }

    public int returnWordTotal(ArrayList <String> a)
    {
        return a.size();
    }

    public void top30()
    {
        indivCount.add(new Word(sortedWords.get(0), 1));

        for(int x = 0; x < sortedWords.size() - 1; x++)
        {
            if(sortedWords.get(x).compareTo(sortedWords.get(x+1)) == 0)
            {
                int count = indivCount.get(x).getCount();
                indivCount.get(x).setCount(count++);
            }
            else
            {
                indivCount.add(new Word((sortedWords.get(x)), 1));
            }

            //indivCount.add(new Word(sortedWords.get(x).getWord(), (sortedWords.get(x).getCount() + 1)));
        }
    }

    public void Sort()
    {
        mergeSort(indivCount, 0, indivCount.size() - 1);
    }

    private void merge(ArrayList <Word> a, int first, int mid, int last)
    {
        //same as in QuadSortComparableProject
        //use a temporary array and then put back into original
        int i = first;
        int j = 1 + mid;
        ArrayList <Word> temp = new ArrayList <Word> ();

        while(i <= mid && j <= last)
        {
            if(a.get(i).compareTo(a.get(j)) < 0)
            {
                temp.add(a.get(i));
                i++;
            }
            else
            {
                temp.add(a.get(j));
                j++;
            }
        }

        if(i > mid)
        {
            for(int x = j; x <= last; x++)
            {
                temp.add(a.get(x));
            }
        }
        else if(j > last)
        {
            for(int y = i; y <= mid; y++)
            {
                temp.add(a.get(y));
            }
        }

        for(int q = 0; q < temp.size(); q++)
        {
            a.set(first + q, temp.get(q));
        }
    }

    public void mergeSort(ArrayList <Word> a, int first, int last)
    {
        //same as in QuadSortComparableProject
        if(first != last)
        {
            int mid = (first + last)/2;
            mergeSort(a, first, mid);
            mergeSort(a, mid + 1, last);
            merge(a, first, mid, last);
        }
    }

    public void displayWord()
    {
        System.out.printf("%8s", "Count");
        System.out.printf("%15s", "Word");
        System.out.println("");
        for(int i = 0; i < 30; i++) //30 used instead of: indivCount.size()
        {
            System.out.print(i+1);
            System.out.printf("%8s", ((Word)indivCount.get(i)).getCount());
            System.out.printf("%14s", ((Word)indivCount.get(i)).getWord());
            System.out.println("");
            if((i+1)%5 == 0)
            {
                System.out.println("");
            }
        }
    }

}
Now, Word.java:
public class Word implements Comparable <Word>
{
    private String myWord;
    private int myCount; //word occurrences

    public Word(String word, int count)
    {
        myWord = word;
        myCount = count;
    }

    public int getCount()
    {
        return myCount;
    }

    public void setCount(int count)
    {
        myCount = count;
    }

    public String getWord()
    {
        return myWord;
    }

    public void setWord(String word)
    {
        myWord = word;
    }

    public int compareTo(Word other)
    {
        if(myCount > other.myCount)
        {
            return 1;
        }
        else if(myCount < other.myCount)
        {
            return -1;
        }
        else
        {
            return 0;
        }
    }

}
And finally, my tester file, wordCounterTester.java:
import java.util.ArrayList;

public class wordCounterTester
{
    private static ArrayList <String> fileWords = new ArrayList <String> ();

    public static void main(String[] args)
    {
        wordCounter myCounter = new wordCounter("dream.txt");
        myCounter.readData(fileWords);
        myCounter.sortList(fileWords);
        System.out.println("Total # of words in file: " + myCounter.returnWordTotal(fileWords));
        System.out.println("Total # of unique words in file: " + myCounter.findUnique(fileWords));
        myCounter.top30();
        myCounter.Sort();
        myCounter.displayWord();
    }

}
Here is a sample output for MLK Jr's "I have a dream" speech (dream.txt):
Total # of words in file: 1580
Total # of unique words in file: 587
Count Word
1 1 you
2 1 york.
3 1 york
4 1 years
5 1 wrote
6 1 wrongful
7 1 would
8 1 work
9 1 words
10 1 withering
11 1 with
12 1 winds
13 1 will
14 1 whose
15 1 who
16 1 white
17 1 whirlwinds
18 1 which
19 1 where
20 1 when
21 1 were
22 1 we
23 1 waters
24 1 was
25 1 warm
26 1 wallow
27 1 walk,
28 1 walk
29 1 vote.
30 1 vote
Obviously, the program doesn't print the top 30 recurring words. It seems to print the last 30 unique words in alphabetical order. This is NOT right!! I have traced through my code many times and see no reason that the output should be wrong. I want it to look like the sample output in the assn. at the top of this post. Attached is the txt file I used.
SO: if anyone can take the time to help me sort this out, I would much appreciate any guidance. All I need is another set of eyes to help me identify the problem.
THANKS IN ADVANCE.
We really should not help with homework assignments; please note this for future posts! However, look at what you're calling to print in wordCounterTester.java: you're calling the unique method...
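One possible reading of that hint, sketched under the assumption that the trouble is list aliasing: in Java, assigning an ArrayList to another variable (as sortList does with sortedWords = a, and findUnique does with uniqueWords = fileWords) copies only the reference, not the elements. Removing duplicates through one name therefore removes them for every other name as well. AliasDemo and removeAdjacentDuplicates are made-up names for illustration; the loop mirrors the one in findUnique.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class; not part of the posted program.
class AliasDemo {
    // Removes adjacent duplicates in place (same idea as findUnique on a sorted list).
    static int removeAdjacentDuplicates(List<String> words) {
        int i = 0;
        while (i < words.size() - 1) {
            if (words.get(i).equals(words.get(i + 1))) {
                words.remove(i + 1);
            } else {
                i++;
            }
        }
        return words.size();
    }

    public static void main(String[] args) {
        List<String> sorted =
                new ArrayList<>(Arrays.asList("a", "a", "be", "the", "the", "the"));

        // Plain assignment: both names now refer to the SAME list object.
        List<String> alias = sorted;
        // Copy constructor: a separate list with the same elements.
        List<String> copy = new ArrayList<>(sorted);

        removeAdjacentDuplicates(alias);

        System.out.println(sorted.size()); // 3: the duplicates vanished from 'sorted' too
        System.out.println(copy.size());   // 6: the copy still holds every occurrence
    }
}
```

If top30 runs on a list that has already been de-duplicated this way, every word is seen exactly once, which would be consistent with the all-ones output above. Copying the list first (new ArrayList<>(fileWords)) keeps the original occurrences available for counting.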
NOTE: this is not a homework assignment for a grade. It is simply a problem my teacher has given my class as an optional review. So, helping me figure out where I went wrong would allow me to further my currently limited knowledge of Java.
Again, if anyone is willing to help me find the error in my program that leads to this incorrect output, that would be much appreciated! Thanks again.
I have traced through my code numerous times, and it seems to me that the output should be correct. I believe a new set of eyes is all that is necessary to help me solve this OPTIONAL problem.
NeoPa 32,556
Recognized Expert Moderator MVP
With homework assignments it is actually ok to help when the member has posted what they have already tried. We are always very keen to help anybody learn. What we want to avoid is posting something where the teacher (or professor or whatever) feels that we have done the assignment (or part of it) for them.
Hints and general advice, for instance on debugging techniques that are usable in various circumstances, are not a problem. Care should be taken, of course, not simply to hand out answers.
Please check your PMs though slapsh0t, as PMing experts directly is certainly not so acceptable.
Frinavale 9,735
Recognized Expert Moderator Expert
I don't see how NeoPa's answer was the "best answer" considering that it has nothing to do with the original problem or question. So, I have reset the best answer.
@slapsh0t11
It's very hard to read through so much code to try and figure out where you're going wrong. It's a lot easier for experts and members to look at a specific line of code or function rather than making them look through an entire application. In the future, try to reduce the question's size. Locate what you think the source of the problem is, and only post code that is relevant.
That being said...
Have you considered using a HashTable to solve your problem?
I know that your assignment requirements list what you're allowed to use, but you could easily implement a quick HashTable class using classes listed in the assignment.
You could create a new Key/Value pair for each word that you find and store it in the HashTable...if the key already exists for the word then add 1 to the Value.
If you don't know what a HashTable is then you should look it up :)
It's fairly similar to what you have really but you wouldn't be using a "Word Class".
You'd just use a HashTable with the keys being the words in the file and the values being the count of the words in the file.
It would look something like (pseudo-code):
- If theHashTable.Keys collection contains the word then:
  - Get the Value
  - Add One to the value
  - Store the value back in the hash table at the key (the word)
- If theHashTable.Keys collection does not contain the word then:
  - Add a new key/value pair to the HashTable:
    - the key being the word, the value being "1" (because there has only been one found so far).
- Get the next word and Loop
Now when you want to find out which words are unique you'd just loop through the hash table keys and check the value for each key...if the value is "1" then you know that it's unique.
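The hash table steps above could be sketched in Java roughly as follows. This is only an illustration: it uses java.util.HashMap as the hash table (an assumption on my part; java.util.Hashtable would work the same way here), and HashCountSketch, countWords and countOccurringOnce are made-up names.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch class; names are illustrative only.
class HashCountSketch {
    // Builds a word -> count table following the pseudo-code steps.
    static Map<String, Integer> countWords(String[] words) {
        Map<String, Integer> counts = new HashMap<>();
        for (String word : words) {
            if (counts.containsKey(word)) {
                // Key exists: get the value, add one, store it back.
                counts.put(word, counts.get(word) + 1);
            } else {
                // First time this word is seen.
                counts.put(word, 1);
            }
        }
        return counts;
    }

    // Counts the words whose value is 1, i.e. words that occur exactly once.
    static int countOccurringOnce(Map<String, Integer> counts) {
        int once = 0;
        for (int value : counts.values()) {
            if (value == 1) {
                once++;
            }
        }
        return once;
    }

    public static void main(String[] args) {
        String[] words = {"we", "will", "be", "free", "we", "will"};
        Map<String, Integer> counts = countWords(words);
        System.out.println("Count for 'we': " + counts.get("we"));
        System.out.println("Different words: " + counts.size());
        System.out.println("Words occurring once: " + countOccurringOnce(counts));
    }
}
```

Note that the number of keys (counts.size()) is the number of different words in the file, which may be what the assignment means by "unique words"; the value-equals-1 check counts only the words that appear a single time.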
If you don't want to use a HashTable then you should at least be using an ArrayList of Word objects (as opposed to an ArrayList of String objects). What good is an ArrayList of String Objects to you anyways?
You would populate this ArrayList in the readData method. The catch here is to only create a new Word object for each Unique word that you find.
So you'd do something like (again pseudo-code):
- Get the next word in the file
- If ArrayList of Word Objects contains this word then:
  - Get the Word Object for the word from the ArrayList
  - Retrieve the current count for the word (using the getCount() method)
  - Add One to the current count value
  - Store the new value back in the Word (using the setCount() method)
- If the ArrayList of Word Objects does not contain the word then:
  - Create a new Word Object for the word.
  - Store the Word Object in the ArrayList of Word Objects
- Loop...
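A rough Java sketch of the ArrayList-of-Word steps above, assuming a cut-down Word class (nested here only so the example is self-contained) and made-up helper names find, countWords and sortByCountDescending. The final sort uses the library's List.sort rather than a hand-written merge sort, purely to keep the sketch short.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch class; names are illustrative only.
class WordListSketch {
    // Cut-down stand-in for the Word class posted above.
    static class Word {
        private final String word;
        private int count;

        Word(String word, int count) {
            this.word = word;
            this.count = count;
        }

        String getWord() { return word; }
        int getCount() { return count; }
        void setCount(int count) { this.count = count; }
    }

    // Linear search: returns the Word for 'text', or null if not stored yet.
    static Word find(List<Word> words, String text) {
        for (Word w : words) {
            if (w.getWord().equals(text)) {
                return w;
            }
        }
        return null;
    }

    // One Word object per different word; repeated words only bump the count.
    static List<Word> countWords(String[] tokens) {
        List<Word> words = new ArrayList<>();
        for (String token : tokens) {
            Word existing = find(words, token);
            if (existing != null) {
                // getCount() + 1, not count++: a post-increment would pass the old value.
                existing.setCount(existing.getCount() + 1);
            } else {
                words.add(new Word(token, 1));
            }
        }
        return words;
    }

    // Largest count first, as the assignment's sample output shows.
    static void sortByCountDescending(List<Word> words) {
        words.sort((a, b) -> Integer.compare(b.getCount(), a.getCount()));
    }

    public static void main(String[] args) {
        String[] tokens = {"we", "will", "be", "free", "we", "will", "we"};
        List<Word> words = countWords(tokens);
        sortByCountDescending(words);
        for (Word w : words) {
            System.out.println(w.getCount() + " " + w.getWord());
        }
    }
}
```

The linear search makes this slower than a hash table on large files, but it stays within the tools the assignment lists.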
While you're looping you should be checking for the special conditions that your assignment outlines (that word space hyphen space word requirement is a little weird) and removing any punctuation that may be attached to words (I would think the word "walk" and the word "walk," would be considered the same word...but then again that word<space>-<space>word thing is weird...so check your requirements)
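For the punctuation part, here is one possible way to clean each token before counting it. It is a sketch under my own assumptions: words are plain ASCII, punctuation is stripped only from the two ends of a token (so inner hyphens and apostrophes survive), and a token that ends up empty, such as a lone "-", is skipped by the caller. TokenCleaner and clean are made-up names.

```java
// Hypothetical helper; not part of the posted program.
class TokenCleaner {
    // Lower-cases the token and strips non-alphanumeric characters from both ends.
    // Returns "" when nothing is left (for example a lone "-").
    static String clean(String token) {
        String lower = token.toLowerCase();
        // ^[^a-z0-9]+  : run of punctuation at the start
        // [^a-z0-9]+$  : run of punctuation at the end
        return lower.replaceAll("^[^a-z0-9]+|[^a-z0-9]+$", "");
    }

    public static void main(String[] args) {
        String[] samples = {"walk,", "York.", "don't", "well-known", "-", "\"Free"};
        for (String s : samples) {
            String cleaned = clean(s);
            if (cleaned.isEmpty()) {
                System.out.println(s + " -> (skipped)");
            } else {
                System.out.println(s + " -> " + cleaned);
            }
        }
    }
}
```

With this rule "walk" and "walk," count as the same word, "well-known" stays one word, and "word - word" becomes two words because the lone hyphen is dropped. A trailing apostrophe (as in a plural possessive) is stripped as well, so check that against your requirements.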
-Frinny