
Obtaining Information from a file

I am trying to write a program that reads a segment of text from a file and reports how many times particular words appear in it. The file is laid out like this:

A common application
in many areas of data processing
involves searching the contents of a file
for occurrences
of a particular word.
end_of_data
processing
of
word
end_of_data

The lines before the first "end_of_data" are the text to be searched. The lines between the first "end_of_data" and the second one are the words to look for in that text.

For this example, the program I am writing would print something like this:

processing appears in line(s) 2 a total of 1 time(s)
of appears in line(s) 2, 3, 5 a total of 3 time(s)
word appears in line(s) 5 a total of 1 time(s)

I have the first part of the text kept in an ArrayList using a loop like this:


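import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Scanner;

public static void main(String[] args) throws IOException
{
    ArrayList<String> strList = new ArrayList<String>();
    Fill(strList);
    System.out.println(strList);
}

// Reads lines from data.txt into list, stopping at the first "end_of_data" marker.
public static void Fill(ArrayList<String> list) throws IOException
{
    Scanner input = new Scanner(new File("data.txt"));
    String end = "end_of_data";

    // Checking hasNextLine() avoids a NoSuchElementException if the marker is missing.
    while (input.hasNextLine())
    {
        String text = input.nextLine();
        if (text.equals(end))
        {
            break;
        }
        list.add(text);
    }
    input.close();
}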




How should I go about getting the search words out of the file? In this example I just want the words processing, of, and word. I've been trying to work out some kind of loop to read them, but I can't seem to come up with anything.

Also, how should I store these words so that I can use them later to search the text?

Thanks in advance. :)
Mar 30 '07 #1
r035198x
Since you now have the text in an ArrayList, look at the methods ArrayList provides to see how you can count the frequencies. You may also want to take a look at regular expressions.
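One way the two suggestions could fit together is sketched below. It is only an illustration under some assumptions, not the only approach: the class name WordSearch and the helpers readSection, findOccurrences and report are made up for this example, matching is case-sensitive and whole-word only, and the sample text is read from a String so the sketch runs without data.txt. The same Scanner is reused for both sections, so the second call to readSection picks up the search words.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Scanner;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class WordSearch {
    static final String END = "end_of_data";

    // Reads lines until the end marker (or the end of the input) and returns them.
    static List<String> readSection(Scanner in) {
        List<String> lines = new ArrayList<>();
        while (in.hasNextLine()) {
            String line = in.nextLine();
            if (line.trim().equals(END)) {
                break;
            }
            lines.add(line);
        }
        return lines;
    }

    // Maps each search word to the 1-based line number of every whole-word match.
    // A line number appears once per occurrence, so the list size is the total count.
    static Map<String, List<Integer>> findOccurrences(List<String> text, List<String> words) {
        Map<String, List<Integer>> hits = new LinkedHashMap<>();
        for (String raw : words) {
            String word = raw.trim();
            if (word.isEmpty()) {
                continue; // ignore blank lines in the word section
            }
            // \b is a word boundary; Pattern.quote treats the word as literal text.
            Pattern p = Pattern.compile("\\b" + Pattern.quote(word) + "\\b");
            List<Integer> lineNumbers = new ArrayList<>();
            for (int i = 0; i < text.size(); i++) {
                Matcher m = p.matcher(text.get(i));
                while (m.find()) {
                    lineNumbers.add(i + 1);
                }
            }
            hits.put(word, lineNumbers);
        }
        return hits;
    }

    // Builds one report line; duplicate line numbers are collapsed for display.
    static String report(String word, List<Integer> lineNumbers) {
        List<String> distinct = new ArrayList<>();
        for (Integer n : new LinkedHashSet<>(lineNumbers)) {
            distinct.add(String.valueOf(n));
        }
        return word + " appears in line(s) " + String.join(", ", distinct)
                + " a total of " + lineNumbers.size() + " time(s)";
    }

    public static void main(String[] args) {
        // Sample data from the question; a real program would use
        // new Scanner(new File("data.txt")) instead.
        String sample = "A common application\n"
                + "in many areas of data processing\n"
                + "involves searching the contents of a file\n"
                + "for occurrences\n"
                + "of a particular word.\n"
                + "end_of_data\n"
                + "processing\nof\nword\n"
                + "end_of_data\n";
        Scanner in = new Scanner(sample);
        List<String> text = readSection(in);   // lines before the first marker
        List<String> words = readSection(in);  // lines between the two markers
        in.close();
        for (Map.Entry<String, List<Integer>> e : findOccurrences(text, words).entrySet()) {
            System.out.println(report(e.getKey(), e.getValue()));
        }
    }
}
```

With the sample data this prints the three lines shown in the question. Storing the search words in a second list keeps them in file order, and a LinkedHashMap keeps the report in that same order. A word that never appears is reported with an empty line list and a total of 0.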
Mar 30 '07 #2
