I'm working on a web scraping application that needs to log into a website
before it can get the data to scrape. I've always been confused about how
the HttpWebRequest and HttpWebResponse objects work together with cookies,
and was hoping that someone here could clear it up for me!
Here are the steps I need to accomplish:
1) Set two cookies containing information I already have
2) Request the login page URL, and save a third cookie that is given to me
from that page
3) Submit a form via POST, along with the 3 cookies
4) Save a fourth cookie that is set after the form is submitted with valid
information
I don't need the entire code to do this, but rather some help with the order
to do it. Do I create a request and THEN set the cookies? Can I use the
same CookieContainer for each request? If someone could point me in the
right direction it would be greatly appreciated!
Luis Esteban Valencia Muñoz wrote: [...] Do I create a request and THEN set the cookies? Can I use the same CookieContainer for each request? [...]
This topic has been beaten to death by now -- just search for
WebRequest and CookieContainer on Google: http://tinyurl.com/9yp6r
Just make sure to use one CookieContainer instance throughout your web
conversation. It will pick up all cookies, and you don't need to copy
cookies from responses back to it -- the CookieContainer picks them up
automatically.
You can set any request header (including cookies) as long as the request
has not been submitted yet, that is, before you call GetRequestStream()
or GetResponse().
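As a rough sketch of the single-container advice, the snippet below seeds two cookies, simulates the server setting a third, and shows that all three would go out on the next request. The host example.com and the cookie names are made up for illustration, and SetCookies only stands in for what HttpWebRequest does by itself when a real response arrives on a request that has this container assigned.

```csharp
using System;
using System.Net;

// Hypothetical site and cookie names, used only for illustration.
var site = new Uri("http://example.com/");

// One container for the whole conversation.
var jar = new CookieContainer();

// Step 1: seed the two cookies you already have.
jar.Add(site, new Cookie("known1", "a"));
jar.Add(site, new Cookie("known2", "b"));

// Step 2 (simulated): a response to a request that uses this container
// would have its Set-Cookie values stored automatically. SetCookies does
// the same thing by hand so the sketch runs without a network.
jar.SetCookies(site, "session=xyz; path=/");

// Step 3: any later request sharing the container sends all three cookies.
string header = jar.GetCookieHeader(site);
Console.WriteLine(header);
```

In the real flow you would assign jar to request.CookieContainer on the GET for the login page and again on the POST, so the fourth cookie ends up in the same container once the POST response comes back.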
Cheers,
-- http://www.joergjooss.de
This example might be helpful: http://msdn.microsoft.com/library/de...ainerTopic.asp
The key is that you need to create a CookieContainer object and assign it to
the request's CookieContainer property before you send the request. The
property is null by default, and in that case the response's Cookies
collection will not hold the cookies the server sent.
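A small sketch of that ordering, assuming a made-up login URL: it only creates the request object and attaches a container, so nothing is sent over the network. The pragma is there because newer .NET versions flag WebRequest.Create as obsolete.

```csharp
using System;
using System.Net;

#pragma warning disable SYSLIB0014 // WebRequest.Create is flagged obsolete on newer .NET

// Hypothetical URL, for illustration only. Creating the request object
// does not contact the server.
var request = (HttpWebRequest)WebRequest.Create("http://example.com/login");

// No container is attached by default, so cookie handling is off.
bool containerMissingByDefault = request.CookieContainer == null;
Console.WriteLine(containerMissingByDefault);

// Attach the container before calling GetRequestStream() or GetResponse().
var jar = new CookieContainer();
request.CookieContainer = jar;
bool containerAttached = ReferenceEquals(request.CookieContainer, jar);
Console.WriteLine(containerAttached);

// From here, request.GetResponse() would store the server's cookies in jar
// and expose them on the response's Cookies collection.
```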
--Buddy