Word Count not Counting Right...

I have an assignment due in a few weeks, and since I am done with the upcoming one, I decided to start on the next one early rather than rush later. The first part requires me to do a character count on a text document, which I will then have to encode using Huffman coding.

Below is the code I wrote to count and display each character in the document and its frequency.

import java.io.*;
import java.util.*;

public class SourceModel
{
    public static void main(String [] Args)
    {
        int j;
        String str;

        Map<Integer, Integer> m = new HashMap<Integer, Integer>();

        try
        {
            BufferedReader in = new BufferedReader(new FileReader("DecOfInd.txt"));

            while ((str = in.readLine()) != null)
            {
                for(int i = 0; i < str.length(); i++)
                {
                    if(!m.containsKey((int)str.charAt(i)))
                    {
                        m.put((int)str.charAt(i),1);
                    }
                    else
                    {
                        j = m.get((int)str.charAt(i));
                        j++;

                        m.put((int)str.charAt(i),j);
                    }
                }
            }
        }
        catch (IOException e)
        {

        }

        System.out.println();
        System.out.println(m.size() + " distinct letters:");
        System.out.println(m);

        int count = 0;

        for(int i :m.keySet())
        {
            System.out.println(i + " = " + m.get(i));
            count += m.get(i);
        }
        System.out.println("Total Number of Characters: "+count);
    }
}

The code compiles without errors, and on the first run everything looked fine: it displayed the characters and their counts. But just to make sure I got it right, I ran the Unix command "wc" on the document, and it seems I have missing characters or something.

Unix gives:
$ wc DecOfInd.txt
29 1369 8458 DecOfInd.txt

My program gives:
Total Number of Characters: 8429

Would any kind soul like to help me find out what's wrong with my code? I seem to have 29 missing characters.

Many thanks in advance.
Kenneth :)
Aug 1 '07 #1
r035198x
13,262 8TB
Why did you use Map<Integer, Integer> instead of Map<Character, Integer> ?
Aug 1 '07 #2
KWSW
72
I was trying to use the ASCII code as the key.
Aug 1 '07 #3
JosAH
11,448 Expert 8TB
wc also counts the end-of-line characters (there are 29 lines in your file). Your
method doesn't (readLine() returns each line as a String with the \n stripped), so
8429 + 29 == 8458

kind regards,

Jos
Aug 1 '07 #4
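
The arithmetic in #4 can be checked with a small sketch. It is not taken from the thread: the class name LineTerminatorDemo and the sample string are made up for illustration, and a StringReader stands in for DecOfInd.txt.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical demo class; not part of the original assignment code.
class LineTerminatorDemo {

    // Counts characters line by line, as the first listing does.
    // readLine() returns each line without its line terminator.
    static int countViaReadLine(String text) {
        try (BufferedReader in = new BufferedReader(new StringReader(text))) {
            int count = 0;
            String line;
            while ((line = in.readLine()) != null) {
                count += line.length();
            }
            return count;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Counts every character, line terminators included, one read() at a time.
    static int countViaRead(String text) {
        try (Reader in = new StringReader(text)) {
            int count = 0;
            while (in.read() != -1) {
                count++;
            }
            return count;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String sample = "first line\nsecond line\n"; // made-up two-line input
        System.out.println("readLine(): " + countViaReadLine(sample));
        System.out.println("read():     " + countViaRead(sample));
    }
}
```

The two counts differ by one per \n-terminated line, which is the same gap as 8458 - 8429 = 29 for a 29-line file.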
r035198x
13,262 8TB
Duh .
Aug 1 '07 #5
JosAH
11,448 Expert 8TB
<vigorously starts kicking dust towards the general direction of r035198x/>

:-P

kind regards,

Jos ;-)
Aug 1 '07 #6
prometheuzz
197 Expert 100+
@OP: and if you run your code on other OSes, you can get different results again. On Windows, for example, the line separator consists of two characters: \r\n.
Aug 1 '07 #7
KWSW
72
Ah, thanks for that info... hmm, will read() work instead, since it doesn't read lines like readLine() does?
Aug 1 '07 #8
JosAH
11,448 Expert 8TB
Yup, that's the way to go: you want to reproduce the original file content
after Huffman decompression, so you should take care of every single byte when
you compress the stuff. readLine() is a no-no here, unless you add those
end-of-line characters to the Huffman tables/tree yourself.

kind regards,

Jos
Aug 1 '07 #9
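
One way to "take care of every single byte", as #9 puts it, is to count raw bytes from an InputStream instead of characters from lines. The sketch below is only an illustration under assumptions: the class name ByteFrequency and the fixed 256-entry table are choices made here, not something from the thread, and a ByteArrayInputStream stands in for the real file.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper; the name and table layout are assumptions.
class ByteFrequency {

    // Returns a 256-entry table: freq[v] is how often byte value v occurs.
    static int[] count(InputStream in) {
        int[] freq = new int[256];
        try {
            int b;
            // InputStream.read() returns 0..255 for a byte, or -1 at end of stream,
            // so line terminators are counted like any other byte.
            while ((b = in.read()) != -1) {
                freq[b]++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return freq;
    }

    public static void main(String[] args) {
        byte[] sample = "ab\nab\n".getBytes(StandardCharsets.US_ASCII); // made-up input
        int[] freq = count(new ByteArrayInputStream(sample));
        int total = 0;
        for (int value = 0; value < freq.length; value++) {
            if (freq[value] > 0) {
                System.out.println(value + " = " + freq[value]);
                total += freq[value];
            }
        }
        System.out.println("Total Number of Bytes: " + total);
    }
}
```

For a real file, the stream would typically be a FileInputStream wrapped in a BufferedInputStream. Because read() yields values in the range 0..255, bytes of 0x80 and above index the table directly, with no negative keys to worry about.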
KWSW
72
OK, thanks for the heads up... will give it a try tomorrow... :)
Aug 1 '07 #10
KWSW
72
Following the advice here, I have modified my code to write the frequency of each character, keyed by its ASCII code, to a file.

import java.io.*;
import java.util.*;

public class SourceModel
{


    public static void main(String [] Args)
    {

        Map<Integer,Integer> m = new HashMap<Integer,Integer>();

        String sourceFile = "DecOfInd.txt";
        String outputFile = "DisProb_"+sourceFile;

        String str = "";

        try
        {

            File source = new File(sourceFile);
            FileInputStream in = new FileInputStream(source);


            int size = (int)source.length();
            byte[] text = new byte[size];

            System.out.println("The size of the file is "+source.length());

            int b = in.read(text);
            int count;

            for (int i = 0; i < size ; i++)
            {
                if(m.containsKey((int)text[i]))
                {
                    count = m.get((int)text[i]);
                    count++;
                    m.put((int)text[i],count);
                }
                else
                {
                    m.put((int)text[i],1);
                }
            }

            in.close();
        }

        catch (IOException e)
        {

        }

        // Writing Character Frequency To File

        try
        {
            BufferedWriter out = new BufferedWriter(new FileWriter(outputFile));

            int k = 0;

            for(int i : m.keySet())
            {
                k += m.get(i);
                out.write(i + " = " + m.get(i) + '\n');
            }

            out.write("Total Number of Characters: "+k+'\n');

            out.close();
        }
        catch (IOException e)
        {

        }


        System.out.println();
        System.out.println(m.size() + " distinct letters:");
        System.out.println(m);

        int j = 0; int feq;

        for(int i : m.keySet())
        {
            j += m.get(i);
            System.out.println(i + " = " + m.get(i));
        }

        System.out.println("Total Number of Characters: "+j);

    }
}
While it runs fine on my machine (Win XP), it gives me problems on the school's server:

$ javac -classpath . SourceModel.java
SourceModel.java:11: not a statement
        Map<Integer,Integer> m = new HashMap<Integer,Integer>();
           ^
SourceModel.java:11: ';' expected
        Map<Integer,Integer> m = new HashMap<Integer,Integer>();
                   ^
SourceModel.java:63: ';' expected
            for(int i : m.keySet())
                      ^
SourceModel.java:72: illegal start of expression
        }
        ^
SourceModel.java:85: ';' expected
        for(int i : m.keySet())
                  ^
SourceModel.java:93: illegal start of expression
    }
    ^
SourceModel.java:91: ';' expected
        System.out.println("Total Number of Characters: "+j);
                                                             ^
7 errors
Is there something I need to do to make it Unix-friendly?
Aug 2 '07 #11
JosAH
11,448 Expert 8TB
It has nothing to do with Unix or not: your school machine is running Java 1.4
or earlier; your home machine is running Java 1.5 or later.

kind regards,

Jos
Aug 2 '07 #12
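
The errors point at the generic Map declaration and the for-each loops, both of which were introduced in Java 5, so a 1.4 compiler cannot parse them. If upgrading the school's JDK is not an option, the counting logic can be written with pre-Java-5 constructs only. The following is a rough sketch under that assumption, not the poster's code: the class name SourceModelPre5 and its helper methods are invented here, and it uses a raw Map, explicit Integer wrappers, and an Iterator.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

// Hypothetical pre-Java-5 variant; names are assumptions for illustration.
class SourceModelPre5 {

    // Builds a byte-value -> count table without generics or autoboxing.
    // new Integer(...) is deprecated on modern JDKs, but Integer.valueOf(int)
    // does not exist before Java 5, so it is used here on purpose.
    static Map countBytes(byte[] text) {
        Map m = new HashMap();
        for (int i = 0; i < text.length; i++) {
            Integer key = new Integer(text[i] & 0xFF); // mask to 0..255
            Integer old = (Integer) m.get(key);
            int count = (old == null) ? 1 : old.intValue() + 1;
            m.put(key, new Integer(count));
        }
        return m;
    }

    // Sums all counts with an explicit Iterator instead of a for-each loop.
    static int total(Map m) {
        int sum = 0;
        for (Iterator it = m.values().iterator(); it.hasNext();) {
            sum += ((Integer) it.next()).intValue();
        }
        return sum;
    }

    public static void main(String[] args) {
        byte[] sample = { 'a', 'b', '\n', 'a', '\n' }; // made-up input
        Map m = countBytes(sample);
        for (Iterator it = m.keySet().iterator(); it.hasNext();) {
            Integer key = (Integer) it.next();
            System.out.println(key + " = " + m.get(key));
        }
        System.out.println("Total Number of Characters: " + total(m));
    }
}
```

A current compiler accepts this too, although it will warn about raw types and the deprecated Integer constructor.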
KWSW
72
Thanks for the quick reply... time to email my lecturer...
Aug 2 '07 #13