Bytes | Software Development & Data Engineering Community

Web crawlers in Java

1 New Member
Can anybody give me some insight into Java crawlers and what it takes to code a web crawler?
Dec 13 '09 #1
sukatoa
539 Contributor
Know what data you are going to extract.
Work out an algorithm for extracting it.
Be careful with object handling, especially on a 24/7 system: a crawler uses a lot of memory, and usage keeps growing if objects are not released properly.
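
A minimal sketch of the points above, not a full crawler. It assumes the data to extract is the set of links on a page, and that a hard cap on remembered URLs is an acceptable way to keep memory bounded. The class names CrawlSketch and BoundedFrontier are made up for this example, and the regex is a naive stand-in for a real HTML parser.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: decides WHAT is extracted (here: href values).
class CrawlSketch {
    // Naive pattern for href="..." attributes; assumes double-quoted values.
    private static final Pattern HREF =
            Pattern.compile("href\\s*=\\s*\"([^\"]+)\"", Pattern.CASE_INSENSITIVE);

    // Returns every href value found in the HTML, in document order.
    static List<String> extractLinks(String html) {
        List<String> links = new ArrayList<>();
        Matcher m = HREF.matcher(html);
        while (m.find()) {
            links.add(m.group(1));
        }
        return links;
    }
}

// Hypothetical URL queue with a fixed budget, so memory cannot grow forever.
class BoundedFrontier {
    private final int maxSize;
    private final Deque<String> queue = new ArrayDeque<>();
    private final Set<String> seen = new HashSet<>();

    BoundedFrontier(int maxSize) {
        this.maxSize = maxSize;
    }

    // Accepts a URL only if it is new and fewer than maxSize distinct URLs
    // have been accepted so far; returns false otherwise.
    boolean offer(String url) {
        if (seen.size() >= maxSize || !seen.add(url)) {
            return false;
        }
        queue.addLast(url);
        return true;
    }

    // Next URL to fetch, or null when the queue is empty.
    String poll() {
        return queue.pollFirst();
    }

    int pending() {
        return queue.size();
    }
}
```

The crawl loop would poll() a URL, fetch it, run extractLinks on the body and offer() each result. Because the seen set never holds more than maxSize entries, a long-running process stays within a known memory budget.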

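With threads, be careful with instances that are shared between threads. Java passes object references by value, so every thread that receives a reference works on the same object.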
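
For the threading point, a minimal sketch assuming several worker threads share one "already visited" set. SharedVisitedSet and the example.invalid URLs are made up for this example. The set comes from ConcurrentHashMap.newKeySet() (Java 8+), so add() is atomic and only one worker can claim a given URL.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical shared state: every worker holds a reference to the SAME instance.
class SharedVisitedSet {
    // Thread-safe set backed by a ConcurrentHashMap.
    private final Set<String> visited = ConcurrentHashMap.newKeySet();

    // Returns true only for the first thread that claims this URL.
    boolean claim(String url) {
        return visited.add(url);
    }

    int size() {
        return visited.size();
    }

    // Demo: 'workers' threads all try to claim the same 'urls' URLs.
    // Returns the total number of successful claims.
    static int demo(int workers, int urls) {
        SharedVisitedSet shared = new SharedVisitedSet();
        AtomicInteger claims = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        for (int w = 0; w < workers; w++) {
            pool.submit(() -> {
                for (int i = 0; i < urls; i++) {
                    if (shared.claim("http://example.invalid/page" + i)) {
                        claims.incrementAndGet();
                    }
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return claims.get();
    }
}
```

With a plain HashSet in place of the concurrent set, two workers could both see a URL as new and fetch it twice, or corrupt the set's internal state.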

Choose which database system you are going to use to store the extracted data.
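
One way to keep that choice open while you write the crawler is to hide storage behind a small interface. This is only a sketch: PageStore and InMemoryPageStore are made-up names, and the in-memory map stands in for whatever database you pick later (a JDBC-backed class could implement the same interface).

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical storage contract; the crawler only depends on this.
interface PageStore {
    // Stores (or overwrites) the data extracted from one URL.
    void save(String url, String extracted);

    // Looks up previously stored data for a URL, if any.
    Optional<String> find(String url);
}

// Placeholder implementation for development and tests; nothing is persisted.
class InMemoryPageStore implements PageStore {
    private final Map<String, String> rows = new ConcurrentHashMap<>();

    @Override
    public void save(String url, String extracted) {
        rows.put(url, extracted);
    }

    @Override
    public Optional<String> find(String url) {
        return Optional.ofNullable(rows.get(url));
    }
}
```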

Good luck.
Dec 13 '09 #2
