Need to crawl website

I am trying to crawl my site to get a list of all the links. I am using regular expressions to get the href attributes from the pages and reading the page source using the xmlhttp module.

Is there an efficient way to loop through the links?
I am looping through the links and skipping duplicates, but it is taking over 2 hours to crawl my site!
What am I doing wrong? What is making it so slow?

Thanks again
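
For reference, the loop described above can be sketched roughly as follows. This is only a minimal sketch in Python, not the poster's xmlhttp code: the names `crawl`, `fetch`, `fetch_page`, `max_pages` and `HREF_RE` are made up for illustration. It assumes two common causes of slowness, checking duplicates by scanning a list and queuing the same page under several spellings (relative paths, "#fragment" suffixes). The thread does not say whether either applies here.

```python
import re
from collections import deque
from urllib.parse import urljoin, urldefrag, urlparse
from urllib.request import urlopen

# Simple href matcher; a regex is what the post describes, though an HTML
# parser would be more robust on messy markup.
HREF_RE = re.compile(r'href\s*=\s*["\']([^"\']+)["\']', re.IGNORECASE)


def fetch(url):
    """Download a page and return its text (hypothetical helper)."""
    with urlopen(url, timeout=10) as resp:
        return resp.read().decode("utf-8", errors="replace")


def crawl(start_url, fetch_page=fetch, max_pages=10000):
    """Breadth-first crawl of one host; returns the set of URLs found."""
    host = urlparse(start_url).netloc
    seen = {start_url}          # set membership is O(1), unlike a list scan
    queue = deque([start_url])  # pages still waiting to be fetched
    fetched = 0
    while queue and fetched < max_pages:
        url = queue.popleft()
        try:
            html = fetch_page(url)
        except OSError:
            continue  # skip pages that fail to load
        fetched += 1
        for href in HREF_RE.findall(html):
            # Resolve relative links and drop "#fragment" so the same page
            # is not queued under several spellings.
            link = urldefrag(urljoin(url, href)).url
            if urlparse(link).netloc != host:
                continue  # stay on the starting site
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return seen
```

Each page is fetched at most once because a link is added to `seen` at the moment it is queued, not when it is downloaded. If the time is mostly spent waiting on the network rather than on duplicate checks, fetching several pages in parallel would be the next thing to try.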
Nov 13 '05 #1
There is a great program out there called Xenu.exe:
http://home.snafu.de/tilman/xenulink.html

I have tried it and really liked it. One of the best programs out there.

Does that help???
---
Please immediately let us know (by phone or return email) if (a) this email
contains a virus
(b) you are not the intended recipient
(c) you consider this email to be spam.
We have done our utmost to make sure that
none of the above are applicable. THANK YOU
Checked by AVG anti-virus system (http://www.grisoft.com).
Version: 6.0.691 / Virus Database: 452 - Release Date: 26/05/2004
Nov 13 '05 #2

