I have been trying to implement an inverted index in Java for the past few days, but I can't get it to work. The terms and term frequencies come out fine, but I am unable to retrieve the document IDs. I don't understand how to use two TreeMaps, or how to wrap one TreeMap inside another.
I am attaching the code here.

import java.util.*;
import java.io.*;

public class invertindex {

    public static void main(String[] args)
    {
        TreeMap<String, Integer> t1 = new TreeMap<String, Integer>();
        // TreeMap<String, TreeSet> t2 = new TreeMap<String, TreeSet>();
        readFile(t1);
        //print(t1);
    }

    public static int getWord(String word, TreeMap<String, Integer> t1)
    {
        if (t1.containsKey(word))
        {
            return t1.get(word);
        }
        else {
            return 0;
        }
    }

    public static void readFile(TreeMap<String, Integer> t1)
    {
        // t1.clear();
        Scanner File;
        String word;
        Integer count;
        String Docs[] = {"words.txt", "words2.txt", "words3.txt", "words4.txt",};
        try
        {
            for (int x = 0; x < Docs.length; x++)
            {
                t1.clear();

                File f = new File(Docs[x]);
                BufferedReader br = new BufferedReader(new FileReader(f));

                // File = new Scanner(new FileReader(Docs[x]));

                String str = "";
                while ((str = br.readLine()) != null)
                {
                    // word = File.next( );
                    StringTokenizer stk = new StringTokenizer(str, " ,.-");
                    while (stk.hasMoreTokens())
                    {
                        word = stk.nextToken();
                        word = word.toLowerCase();

                        count = getWord(word, t1) + 1;
                        t1.put(word, count);
                    }
                }

                print(t1);
            }
        }
        catch (Exception e)
        {
            System.err.println(e);
            return;
        }
    }

    public static void print(TreeMap<String, Integer> t1)
    {
        System.out.println("(Term, TermFrequency)");
        System.out.println("--------------------");

        for (String word : t1.keySet())
        {
            System.out.printf("(%s,%d);", word, t1.get(word));
        }
    }
}
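For reference, here is one way the "TreeMap inside a TreeMap" idea could look. This is only a minimal sketch, not a definitive implementation: the outer map goes from term to an inner map, and the inner map goes from document ID to that term's frequency in the document. The class name InvertedIndexSketch and the build method are hypothetical names chosen for illustration, and the document texts are made-up placeholders passed as strings instead of being read from files, so the example runs on its own. Note also that the posted code calls t1.clear() for every file, so nothing from earlier files survives; the sketch keeps one shared index across all documents instead.

```java
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

public class InvertedIndexSketch {

    // Builds: term -> (document ID -> term frequency in that document).
    // "docs" maps a document ID (e.g. a file name) to its text; this input
    // shape is an assumption made so the sketch needs no files on disk.
    public static TreeMap<String, TreeMap<String, Integer>> build(Map<String, String> docs) {
        TreeMap<String, TreeMap<String, Integer>> index = new TreeMap<>();
        for (Map.Entry<String, String> doc : docs.entrySet()) {
            String docId = doc.getKey();
            // Same delimiters as in the posted code.
            StringTokenizer stk = new StringTokenizer(doc.getValue(), " ,.-");
            while (stk.hasMoreTokens()) {
                String word = stk.nextToken().toLowerCase();
                // Get the inner map for this term, creating it the first time the term is seen.
                TreeMap<String, Integer> postings =
                        index.computeIfAbsent(word, k -> new TreeMap<>());
                // Add 1 to this document's count for the term (starts at 1 if absent).
                postings.merge(docId, 1, Integer::sum);
            }
        }
        return index;
    }

    public static void main(String[] args) {
        // Placeholder contents; in the real program these would come from the files.
        Map<String, String> docs = new TreeMap<>();
        docs.put("words.txt", "The cat sat. The dog ran.");
        docs.put("words2.txt", "A cat, a dog - and a bird.");

        TreeMap<String, TreeMap<String, Integer>> index = build(docs);
        for (Map.Entry<String, TreeMap<String, Integer>> e : index.entrySet()) {
            // e.getValue().keySet() holds the document IDs containing the term.
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
    }
}
```

With this layout, index.get(term).keySet() gives the document IDs for a term, and index.get(term).get(docId) gives the frequency in that document. If only the IDs are needed, the inner map could be swapped for a TreeSet<String>, as the commented-out t2 declaration in the posted code hints.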