.Net Web Spiders

et al,

Does anyone have any good links as to where to start in writing a
program like this?

I want to make something custom so that I can pull certain websites,
extract certain data from them, and store it in SQL Server to search
against.

I'm sure there's lots out there, I just can't seem to find it.

There might be programs out there that do this, but none that I
have seen do a decent job. And besides, I want to get some experience
in VB.Net programming.

The first version can be just a simple spider that scans 1 website (I
have a test website w/ 6 pages on it). I figure this is small enough
for a start.
Any links anyone can provide would be greatly appreciated.

thx.

Nov 21 '05 #1
Hi,

You are probably not the first one. You can go in at least two directions:

use the AxWebBrowser control, or use, let's say, HttpWebRequest.

To search a document, the best way is in my opinion MSHTML.
The AxWebBrowser is easy; MSHTML is really hard stuff for a
newbie.
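
For a feel of the HttpWebRequest-style direction (fetch the raw HTML yourself, then parse it), here is a minimal sketch. It is in Python rather than VB.Net, only to show the shape of the loop: `urlopen` plays the role HttpWebRequest would play, the standard-library `HTMLParser` stands in for MSHTML, and `sqlite3` stands in for SQL Server. The names `LinkAndTitleParser`, `fetch`, `crawl`, and `store`, the `pages` table, and the choice of the page title as the "data to extract" are all assumptions made for illustration.

```python
import sqlite3
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse
from urllib.request import urlopen


class LinkAndTitleParser(HTMLParser):
    """Collects the href of every <a> tag and the text of <title>."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.title = ""
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)
        elif tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def fetch(url):
    """Download one page over HTTP and return its text."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def crawl(start_url, fetch_page=fetch, max_pages=20):
    """Breadth-first crawl that stays on the start URL's host."""
    host = urlparse(start_url).netloc
    queue, seen, pages = [start_url], {start_url}, {}
    while queue and len(pages) < max_pages:
        url = queue.pop(0)
        parser = LinkAndTitleParser()
        parser.feed(fetch_page(url))
        pages[url] = parser.title.strip()
        for href in parser.links:
            # Resolve relative links and drop any #fragment.
            link = urljoin(url, href).split("#")[0]
            if urlparse(link).netloc == host and link not in seen:
                seen.add(link)
                queue.append(link)
    return pages


def store(pages, db_path=":memory:"):
    """Save url/title pairs so they can be searched with SQL."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS pages (url TEXT PRIMARY KEY, title TEXT)"
    )
    conn.executemany("INSERT OR REPLACE INTO pages VALUES (?, ?)", pages.items())
    conn.commit()
    return conn
```

Calling `store(crawl("http://your-test-site/"))` would walk the small test site and return a connection you can query. The `fetch_page` parameter exists so the crawl logic can be tried against canned HTML before pointing it at a real server.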

HttpWebRequest
http://msdn.microsoft.com/library/de...classtopic.asp

webbrowser
http://support.microsoft.com/?kbid=311303

some faqs
http://support.microsoft.com/default...b;EN-US;311284

mshtml
http://msdn.microsoft.com/library/de...ng/hosting.asp

I hope this helps a little bit.

Cor

Nov 21 '05 #2
Thanks. I'll do some reading and give one, if not all of them, a shot.
thx.

Nov 21 '05 #3
