Debugging ScreenScrape Code

Hi All,
I have a very small screen-scrape application that has a small
problem. When I run the app while Fiddler (an HTTP tool for viewing
what is sent in the requests/responses,
http://www.fiddlertool.com) is running, the app works and I am able to log in to
the (intranet) website. If I run the app without Fiddler running,
it does not work (the app returns the HTML of the login page instead of the
target page).

Here is the code. Thanks in advance.

Note: it may be easier to copy and paste this code into Notepad to
view it.

/*
* User: Mccollid
* Date: 10/3/2005
* Time: 11:25 AM
*
*/
using System;
using System.Drawing;
using System.Windows.Forms;
using System.Net;
using System.IO;
using System.Text;
using System.Web;

namespace ScreenScraper
{
/// <summary>
/// Description of MainForm.
/// </summary>
public class MainForm : System.Windows.Forms.Form
{
private System.Windows.Forms.Button button1;
private System.Windows.Forms.TextBox textBox1;
private string LOGIN_URL;
private string USERNAME;
private string PASSWORD;
private string SECRET_PAGE_URL;
private string COOKIEHOLDER;

public MainForm()
{
//
// The InitializeComponent() call is required for Windows Forms designer support.
//

InitializeComponent();

//
// TODO: Add constructor code after the InitializeComponent() call.
//
}

[STAThread]
public static void Main(string[] args)
{
Application.Run(new MainForm());
}
void Button1Click(object sender, System.EventArgs e)
{
this.textBox1.Text="Connecting...";
this.LOGIN_URL="http://loginpage";
this.SECRET_PAGE_URL="http://targetpage";
this.USERNAME ="UserName";
this.PASSWORD ="Password";

HttpWebRequest webrequest=WebRequest.Create(LOGIN_URL) as
HttpWebRequest;
StreamReader responseReader=new
StreamReader(webrequest.GetResponse().GetResponseStream());

string responseData = responseReader.ReadToEnd();
//this.textBox1.Text=responseData;

//extract PathInfo value and build our post data
string pathInfo=ExtractPathInfo(responseData);
MessageBox.Show(pathInfo,pathInfo);
//string postData=String.Format("pathInfo={0}&username={1}&password={2}&Login=Login", pathInfo, USERNAME, PASSWORD);
string postData=String.Format("username={1}&password={2}&pathInfo={0}",
pathInfo, USERNAME, PASSWORD);

this.textBox1.Text=postData;

//have a cookie container ready to receive the forms auth cookie
CookieContainer cookies=new CookieContainer();

//now post to the login form
webrequest=WebRequest.Create(LOGIN_URL) as HttpWebRequest;
webrequest.Method="Post";
webrequest.Credentials = CredentialCache.DefaultCredentials;
webrequest.UserAgent="User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322; .NET CLR 1.0.3705)";
webrequest.Accept="Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, application/x-shockwave-flash, application/vnd.ms-excel, application/vnd.ms-powerpoint, application/msword, */*";
webrequest.ContentType="application/x-www-form-urlencoded";

webrequest.AllowAutoRedirect=true;
webrequest.CookieContainer=cookies;

webrequest.Referer="http://unatime.merck.com/unatime/action/home";


//write the form values into the request message
StreamWriter requestWriter = new
StreamWriter(webrequest.GetRequestStream());
requestWriter.Write(postData);
requestWriter.Close();
//we don't need the contents of the response, just the cookie

try
{
webrequest.GetResponse().Close();
}
catch (WebException ee)
{
// MessageBox.Show(ee.Message);
// this.textBox1.Text=ee.Message;
}

//webrequest.GetResponse().Close();

//now we can send our cookie along with a request for the protected page
webrequest = WebRequest.Create(SECRET_PAGE_URL) as HttpWebRequest;
webrequest.Method="Post";
webrequest.Credentials = CredentialCache.DefaultCredentials;
webrequest.UserAgent="User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322; .NET CLR 1.0.3705)";
webrequest.Accept="Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, application/x-shockwave-flash, application/vnd.ms-excel, application/vnd.ms-powerpoint, application/msword, */*";
webrequest.ContentType="application/x-www-form-urlencoded";
webrequest.AllowAutoRedirect=true;
webrequest.CookieContainer=cookies;
webrequest.Referer="http://unatime.merck.com/unatime/action/home";
responseReader=new
StreamReader(webrequest.GetResponse().GetResponseStream());

//StreamReader readStream = new StreamReader(webrequest.GetResponse().GetResponseStream(), Encoding.UTF8);

//and read the response
responseData = responseReader.ReadToEnd();
responseReader.Close();

//Response.Write(responseData);
this.textBox1.Text=responseData;

}
private string ExtractPathInfo(string s)
{
string viewStateNameDelimiter="pathInfo";
string valueDelimiter="value=\"";

int viewStateNamePosition=s.IndexOf(viewStateNameDelimiter);
int viewStateValuePosition=s.IndexOf(valueDelimiter,
viewStateNamePosition);

int viewStateStartPosition=viewStateValuePosition +
valueDelimiter.Length;
//int viewStateEndPosition=s.IndexOf("\"", viewStateStartPosition);
int viewStateEndPosition=s.IndexOf("\"", viewStateStartPosition);
return
HttpUtility.UrlEncodeUnicode(s.Substring(viewStateStartPosition,
viewStateEndPosition-viewStateStartPosition));

}

}
}

Nov 17 '05 #1
You are creating a cookie container, which starts out empty, and
assigning it to each web request. However, when you get your web
response, you aren't saving the cookies into the cookie container.
Therefore, on every call, you are failing to catch the cookies. It works
with Fiddler running because that utility is catching the cookies for you.
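
To make the point about the empty container concrete, here is a minimal sketch that needs no network access. It assumes a made-up host (intranet.example) and a made-up cookie (authToken), and it uses a hand-built CookieCollection as a stand-in for HttpWebResponse.Cookies; the class and method names are hypothetical too. It shows a fresh container holding nothing, and then the Cookie header a later request would carry once the response cookies have been stored.

```csharp
using System;
using System.Net;

public static class CookieDemo
{
    // Stores cookies taken from a response in the shared container and
    // returns the Cookie header a later request to `uri` would carry.
    public static string HeaderAfterLogin(CookieContainer jar, Uri uri, CookieCollection fromResponse)
    {
        jar.Add(uri, fromResponse);
        return jar.GetCookieHeader(uri);
    }

    public static void Main()
    {
        Uri site = new Uri("http://intranet.example/");   // hypothetical host
        CookieContainer jar = new CookieContainer();
        Console.WriteLine(jar.Count);                      // nothing stored yet

        // Stand-in for HttpWebResponse.Cookies after a successful login.
        CookieCollection fromResponse = new CookieCollection();
        fromResponse.Add(new Cookie("authToken", "abc123"));  // made-up cookie

        Console.WriteLine(HeaderAfterLogin(jar, site, fromResponse));
    }
}
```

As far as I know, a request that has a CookieContainer assigned also records the response cookies in that container on its own, so the explicit Add may be redundant in practice; it is spelled out here because it is the step described above. Either way, the container only helps the requests it is actually assigned to.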

--
--- Nick Malik [Microsoft]
MCSD, CFPS, Certified Scrummaster
http://blogs.msdn.com/nickmalik

Disclaimer: Opinions expressed in this forum are my own, and not
representative of my employer.
I do not answer questions on behalf of my employer. I'm just a
programmer helping programmers.
--
"Dan McCollick" <mc*********@hotmail.com> wrote in message
news:11**********************@g44g2000cwa.googlegroups.com...

Nov 17 '05 #2
When I try to add cookies, my request times out. I have indicated
where I think the problem lies (about halfway down), but I truly do
not understand why.

HttpWebRequest wrequest= (HttpWebRequest) WebRequest.Create(LOGIN_URL)
as HttpWebRequest;

StreamReader responseReader = new
StreamReader(wrequest.GetResponse().GetResponseStream());

string responseData =responseReader.ReadToEnd().ToString();
responseReader.Close();

string pathInfo = ExtractPathInfo(responseData);
string postData=String.Format("username={1}&password={2}&pathInfo={0}",
pathInfo, USERNAME, PASSWORD);

wrequest=(HttpWebRequest) WebRequest.Create(LOGIN_URL) as
HttpWebRequest;
wrequest.Method="Post";
wrequest.ContentType = "application/x-www-form-urlencoded";
wrequest.CookieContainer= new CookieContainer();

MessageBox.Show("Cookie Section");
//cookies collected
//WHEN EXECUTING THIS CODE APP TIMES OUT
HttpWebResponse wresponse = (HttpWebResponse) wrequest.GetResponse();
wresponse.Cookies =
wrequest.CookieContainer.GetCookies(wrequest.RequestUri);
this.textBox1.Text+=wresponse.StatusDescription;

StreamWriter rwriter= new StreamWriter(wrequest.GetRequestStream());
rwriter.Write(postData);
rwriter.Close();
MessageBox.Show("Target");
wrequest = (HttpWebRequest) WebRequest.Create(SECRET_PAGE_URL) as
HttpWebRequest;

//wrequest.CookieContainer= new CookieContainer();
responseReader = new
StreamReader(wrequest.GetResponse().GetResponseStream());

responseData=responseReader.ReadToEnd();
responseReader.Close();

this.textBox1.Text +=responseData.ToString();

this.textBox1.Text+="Done";

Nov 17 '05 #3
I am having trouble getting HttpWebRequest.Method = "Post" to take effect. When I
have this in the code and run the app through Fiddler, it still
shows that a GET was sent. Am I calling this wrong? (It is
wrequest.method="post"; in my code.)

Thanks
Dan

Nov 17 '05 #4
Also, when I check Fiddler, it says that three cookies are
being returned. Yet if I do a
MessageBox.Show(wresponse.getcookies(wrequest.request.uri).count.toString());
I only get 2. I can't figure out why this code works
sometimes, then doesn't work, then works again, then doesn't...

Nov 17 '05 #5
Hi Dan,

First suggestion: use a different variable for the second request than for the
first one.

Second suggestion: create the cookie container in a separate variable
(like you were doing in the first snippet you posted).

Third suggestion: copy the cookies OUT of the first response into the cookie
container. Then assign these cookies IN to the second request.

You are discarding the cookies every time. The only reason it works at all
is by accident, because you've been using the same variable. However,
memory (session) cookies won't transfer, only persistent cookies will, and your site
expects the memory cookie to hold the login token (pretty normal behavior).
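
The three suggestions above could be sketched roughly as follows. This is an illustration under assumptions, not a drop-in fix: the class name LoginFlow and the helpers BuildPostData and FetchProtectedPage are made up, the URLs are the placeholders from the original post, and the pathInfo value passed in Main is a dummy rather than one scraped from the login page.

```csharp
using System;
using System.IO;
using System.Net;
using System.Text;

public static class LoginFlow
{
    // Builds the url-encoded form body; field names follow the original post.
    // Expects raw (not yet encoded) values.
    public static string BuildPostData(string pathInfo, string user, string password)
    {
        return "username=" + Uri.EscapeDataString(user)
             + "&password=" + Uri.EscapeDataString(password)
             + "&pathInfo=" + Uri.EscapeDataString(pathInfo);
    }

    public static string FetchProtectedPage(string loginUrl, string targetUrl, string postData)
    {
        // Suggestion 2: one container, held in its own variable.
        CookieContainer cookies = new CookieContainer();

        // Suggestion 1: a dedicated variable for the login request.
        HttpWebRequest loginRequest = (HttpWebRequest)WebRequest.Create(loginUrl);
        loginRequest.Method = "POST";
        loginRequest.ContentType = "application/x-www-form-urlencoded";
        loginRequest.CookieContainer = cookies;

        // The form body has to be written before GetResponse() is called.
        byte[] body = Encoding.ASCII.GetBytes(postData);
        loginRequest.ContentLength = body.Length;
        using (Stream requestStream = loginRequest.GetRequestStream())
        {
            requestStream.Write(body, 0, body.Length);
        }

        // Suggestion 3: copy the cookies OUT of the login response...
        using (HttpWebResponse loginResponse = (HttpWebResponse)loginRequest.GetResponse())
        {
            cookies.Add(loginResponse.Cookies);
        }

        // ...and hand them IN to a second, separate request (a plain GET here).
        HttpWebRequest pageRequest = (HttpWebRequest)WebRequest.Create(targetUrl);
        pageRequest.CookieContainer = cookies;

        using (HttpWebResponse pageResponse = (HttpWebResponse)pageRequest.GetResponse())
        using (StreamReader reader = new StreamReader(pageResponse.GetResponseStream()))
        {
            return reader.ReadToEnd();
        }
    }

    public static void Main()
    {
        // Dummy values; the real pathInfo would come from the login page.
        string postData = BuildPostData("pathInfoValue", "UserName", "Password");
        try
        {
            Console.WriteLine(FetchProtectedPage("http://loginpage", "http://targetpage", postData));
        }
        catch (WebException ex)
        {
            Console.WriteLine("Request failed: " + ex.Message);
        }
    }
}
```

Keeping the two requests in separate variables makes it harder to call GetResponse() and GetRequestStream() on the wrong object or in the wrong order, which may be related to the timeout reported earlier in the thread.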

--
--- Nick Malik [Microsoft]
MCSD, CFPS, Certified Scrummaster
http://blogs.msdn.com/nickmalik

Disclaimer: Opinions expressed in this forum are my own, and not
representative of my employer.
I do not answer questions on behalf of my employer. I'm just a
programmer helping programmers.
--
"Dan McCollick" <mc*********@hotmail.com> wrote in message
news:11**********************@g49g2000cwa.googlegroups.com...

Nov 17 '05 #6
Thank you so much. Works great now.

Nov 17 '05 #7
