Web Crawler

All,

I am trying to create a crawler that goes out to a site, collects
information, and stores it. I have addressed most of the issues in
collecting the information, but I am stuck on one particular issue.

It appears that the page posts back to itself to move to the next result
set. The following is from the page's "View Source":

<td align="right"><b>Page&nbsp;</b></td>
<td width="14" align="center"><input
name="_ctl1:PageControlTop:PageNumberEdit" type="text" value="1"
maxlength="3" id="_ctl1_PageControlTop_PageNumberEdit" size="2"
style="width:30px;" /></td>
<td><b>&nbsp;of 21</b></td>
<td>
<a id="_ctl1_PageControlTop_GoToBtn"
href="javascript:__doPostBack('_ctl1$PageControlTop$GoToBtn','')"><b>Go</b></a>

</td>
<td>&nbsp;</td>
<td align="right">

Next&nbsp;</td>
<td width="21"><a id="_ctl1_PageControlTop_NextBtn"
href="javascript:__doPostBack('_ctl1$PageControlTop$NextBtn','')"><img
src="/images/buttons/next_button.gif" width="21" height="15" border="0"
alt="Next"></a>

</td>

Any help would be greatly appreciated. This is a Windows
application written in C#; I can also use VB.NET.

A Plan without Action is a DayDream
Action without a Plan is a Nightmare

*** Sent via Developersdex http://www.developersdex.com ***
Nov 17 '05 #1
Submit a POST to the form's URL from code (I think you know how to post
from code).
For example, if the link is:
__doPostBack('_ctl1$PageControlTop$GoToBtn','')
send
__EVENTTARGET=_ctl1:PageControlTop:GoToBtn
and an empty __EVENTARGUMENT (the second argument is '')
as the post data.
Hope this helps.
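
A minimal sketch of that mapping, for illustration only: it pulls the two arguments out of a `__doPostBack` link and turns them into the two form fields. The class name `PostBackLink` and the method `TryParse` are hypothetical, and replacing `$` with `:` simply follows the colon form shown above; whether this particular site expects colons is an assumption.

```csharp
using System;
using System.Text.RegularExpressions;

static class PostBackLink
{
    // Matches __doPostBack('target','argument') and captures both values.
    static readonly Regex Pattern = new Regex(
        @"__doPostBack\('(?<target>[^']*)'\s*,\s*'(?<argument>[^']*)'\)");

    // Returns false when the href is not a __doPostBack link.
    public static bool TryParse(string href, out string eventTarget, out string eventArgument)
    {
        Match m = Pattern.Match(href);
        // Assumption: the server wants ':' separators, as in the reply above.
        eventTarget = m.Success ? m.Groups["target"].Value.Replace('$', ':') : null;
        eventArgument = m.Success ? m.Groups["argument"].Value : null;
        return m.Success;
    }

    static void Main()
    {
        string href = "javascript:__doPostBack('_ctl1$PageControlTop$NextBtn','')";
        string target, argument;
        if (TryParse(href, out target, out argument))
        {
            Console.WriteLine("__EVENTTARGET=" + target);
            Console.WriteLine("__EVENTARGUMENT=" + argument);
        }
    }
}
```

The same call works for the "Go" link; only the target name changes.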
Nov 17 '05 #2
Crow,

Thank you. I will look into posting the page back to itself.
I don't know how to do it yet, but at least I now know what
to look for. Once again, thanks.

Nov 17 '05 #3
Crow,

I thought I did. I was using the following:
HttpWebRequest urlRequest = (HttpWebRequest)WebRequest.Create(newUrl);
HttpWebResponse urlResponse = (HttpWebResponse)urlRequest.GetResponse();
But how do you attach the current page and view state to the request
so I can post it back to itself?
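
One possible way to get at the view state, as a rough sketch: ASP.NET pages carry it in a hidden form field named `__VIEWSTATE`, so it can be read out of the HTML that the first GET returned. The class name `ViewStateReader` is hypothetical, the sample HTML is made up, and the regular expression assumes the `name` attribute appears before `value`, which may not hold for every page.

```csharp
using System;
using System.Text.RegularExpressions;

static class ViewStateReader
{
    // Assumes name="__VIEWSTATE" comes before value="..." inside the input tag.
    static readonly Regex Pattern = new Regex(
        "<input[^>]*name=\"__VIEWSTATE\"[^>]*value=\"(?<value>[^\"]*)\"",
        RegexOptions.IgnoreCase);

    // Returns the raw field value, or an empty string when the field is missing.
    public static string Extract(string html)
    {
        Match m = Pattern.Match(html);
        return m.Success ? m.Groups["value"].Value : string.Empty;
    }

    static void Main()
    {
        // Hypothetical fragment; a real value is a much longer base64 string.
        string html = "<input type=\"hidden\" name=\"__VIEWSTATE\" value=\"dDwtMTIz\" />";
        Console.WriteLine(Extract(html));
    }
}
```

The extracted value has to be URL-encoded before it goes into the post data, since base64 can contain `+`, `/` and `=`.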

Nov 17 '05 #4
To clarify, I set the request method to POST:
urlRequest.Method = "POST";
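
Setting the method alone is probably not enough; the form fields also have to be written into the request body. A minimal sketch under some assumptions: the names `PostBackClient`, `BuildBody` and `Post` are hypothetical, the view state value in `Main` is made up, and a real page may also need its other form fields (such as the page number box) and any session cookies sent along.

```csharp
using System;
using System.IO;
using System.Net;
using System.Text;

static class PostBackClient
{
    // Builds an application/x-www-form-urlencoded body for the postback.
    public static string BuildBody(string eventTarget, string eventArgument, string viewState)
    {
        return "__EVENTTARGET=" + WebUtility.UrlEncode(eventTarget)
             + "&__EVENTARGUMENT=" + WebUtility.UrlEncode(eventArgument)
             + "&__VIEWSTATE=" + WebUtility.UrlEncode(viewState);
    }

    // Posts the body to the page's own URL and returns the response HTML.
    public static string Post(string pageUrl, string body)
    {
        // The body is already URL-encoded, so it contains only ASCII characters.
        byte[] data = Encoding.ASCII.GetBytes(body);

        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(pageUrl);
        request.Method = "POST";
        request.ContentType = "application/x-www-form-urlencoded";
        request.ContentLength = data.Length;

        using (Stream stream = request.GetRequestStream())
        {
            stream.Write(data, 0, data.Length);
        }

        using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
        using (StreamReader reader = new StreamReader(response.GetResponseStream()))
        {
            return reader.ReadToEnd();
        }
    }

    static void Main()
    {
        // Only prints the body; calling Post(pageUrl, body) would send it.
        Console.WriteLine(BuildBody("_ctl1:PageControlTop:NextBtn", "", "dDwtMTIz"));
    }
}
```

The HTML returned by `Post` should contain a fresh view state, which would be needed for the next page in turn.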

Nov 17 '05 #6
You can easily write a web crawler with SW Explorer Automation
(http://home.comcast.net/~furmana/SWIEAutomation.htm).

SW Explorer Automation (SWEA) creates an object model (automation
interface) for any Web application running in Internet Explorer. It
lets you visually generate test scripts based on the defined object
model.

Nov 17 '05 #8
