Hi,
I have to do the following and have been racking my brain with
various solutions that have had not-so-great results.
I want to use System.Net.WebClient to submit data to a form (log a
user in) and then redirect to the correct article.
Here is the scenario.
If you are not logged into the site, for certain articles you are
redirected to an shtml login page. The login.shtml page posts to
another URL for authentication and then lets you in. If you have clicked
on an article that you have to log in to, then you are sent to the
login page with an appended URL, http://www.domainname.com?orq:http:/..._2653091.shtml.
I have tried setting a WebClient request to the URL that the above
login form posts to, but I keep getting Method Not Allowed.
Any ideas?
more info required, but here is a typical login:
1) you request a page with webclient
2) you are returned a redirect header to the login page.
3) your code detects the login redirect, then posts the required form data to
the login page (manually view the login page to get the form fields required
and method).
note: an asp.net login site requires that you actually do a get to the
login page to get valid viewstate to post back. other systems may also
require scraping of the get data before doing the actual post.
4) a successful post to the login will return a cookie value you must send
on subsequent requests, and a redirect header to the originally requested
page.
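Steps 2 and 3 above depend on noticing that the response is a redirect to the login page. Here is a minimal sketch of that check. The class name, method names and the "/login.shtml" path are assumptions for illustration, not something taken from the site in question.

```csharp
using System;
using System.Net;

public static class LoginRedirect
{
    // True for the common 3xx codes that carry a Location header.
    public static bool IsRedirect(HttpStatusCode status)
    {
        int code = (int)status;
        return code == 301 || code == 302 || code == 303 || code == 307;
    }

    // True when the response redirects to the login page.
    // requested: the URL that was asked for (used to resolve relative Location values)
    // location:  the raw Location header value
    // loginPath: path of the login page, e.g. "/login.shtml" (assumed)
    public static bool IsLoginRedirect(HttpStatusCode status, Uri requested,
                                       string location, string loginPath)
    {
        if (!IsRedirect(status) || string.IsNullOrEmpty(location))
            return false;

        Uri target = new Uri(requested, location);
        return string.Equals(target.AbsolutePath, loginPath,
                             StringComparison.OrdinalIgnoreCase);
    }
}
```

With HttpWebRequest, setting AllowAutoRedirect to false should leave the 3xx response visible, so the StatusCode and the Location header can be passed to a check like this instead of being followed silently.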
-- bruce (sqlwork.com)
I have an example of this here: http://odetocode.com/Articles/162.aspx
It's basically posting to the login form, getting the cookie back, and
then making sure to send the cookie along when requesting the
protected content.
--
Scott http://www.OdeToCode.com/blogs/scott/
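The "get the cookie back, then send it along" part is what CookieContainer handles when the same instance is attached to each request. This is a small offline sketch of that behaviour, not code from the linked article. The class name, cookie name and example.com URLs are made up for illustration.

```csharp
using System;
using System.Net;

public static class CookieDemo
{
    // Stores a Set-Cookie value as if it had come from `origin`, then returns
    // the Cookie header the container would attach to a request for `next`.
    public static string CookieHeaderFor(Uri origin, string setCookie, Uri next)
    {
        CookieContainer cookies = new CookieContainer();
        cookies.SetCookies(origin, setCookie);
        return cookies.GetCookieHeader(next);
    }
}
```

In a real login the container is filled by the POST response rather than by hand. The point is only that a cookie stored for one URL on a host is offered again for other URLs on that host, and not for unrelated hosts.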
Scott,
FYI - that was one of the best articles on the subject I ever read.
I was completely stuck on this issue about 6 months ago and I implemented it
straight away using the concepts you presented here.
Excellent work and explanation.
--
Joe Fallon
Thanks, Joe. I appreciate the feedback.
--
Scott http://www.OdeToCode.com/blogs/scott/
Thanks for the example. I had seen your example earlier and had tried
it, and I always get to one particular point that I cannot seem to get
beyond. There are two hidden fields, both called web.fixed_values, that
appear to be something like a view state, but the page is shtml. I
have been able to pull down the site, etc., but every time I try to
post my data (with or without the web.fixed_values) I always get the
response Method Not Allowed. Below is the code I am using along with
the site I am trying to access with my account. Any further help on
this would be greatly appreciated.
// Requires: using System; using System.IO; using System.Net; using System.Web;
// userName and password are not shown in this post; they are assumed to be
// string fields defined elsewhere on the page class.
private void Page_Load(object sender, System.EventArgs e)
{
    string LOGIN_URL = "http://augustachronicle.com/login.shtml";
    string cookieAge = "31536000";

    try
    {
        // fetch the login page so its hidden fields can be read
        HttpWebRequest webRequest = WebRequest.Create(LOGIN_URL) as HttpWebRequest;
        StreamReader responseReader =
            new StreamReader(webRequest.GetResponse().GetResponseStream());
        string responseData = responseReader.ReadToEnd();
        responseReader.Close();

        // get the web fixed values (the helpers return them URL-encoded)
        string fixedvalue1 = ExtractFixedValues1(responseData);
        string fixedvalue2 = ExtractFixedValues2(responseData);

        // user-supplied values are URL-encoded as well
        string postData = String.Format(
            "web.fixed_values={0}&web.fixed_values={1}&ACTION=Login&USER={2}&PASS={3}&cookie_age={4}",
            fixedvalue1, fixedvalue2,
            HttpUtility.UrlEncode(userName), HttpUtility.UrlEncode(password),
            cookieAge);

        // have a cookie container ready to receive the forms auth cookie
        CookieContainer cookies = new CookieContainer();

        // now post to the login form
        // NOTE: this posts back to LOGIN_URL itself. The first post says the
        // login form posts to another URL, so the form's action attribute may
        // be the right target.
        webRequest = WebRequest.Create(LOGIN_URL) as HttpWebRequest;
        webRequest.Method = "POST";
        webRequest.ContentType = "application/x-www-form-urlencoded";
        webRequest.CookieContainer = cookies;

        // write the form values into the request message
        StreamWriter requestWriter = new StreamWriter(webRequest.GetRequestStream());
        requestWriter.Write(postData);
        requestWriter.Close();

        // we don't need the contents of the response, just the cookie it issues
        webRequest.GetResponse().Close();

        // now we can send our cookie along with a request for the protected page
        webRequest = WebRequest.Create(
            "http://augustachronicle.com/stories/112404/usc_FBC--SpurrierProfile.shtml")
            as HttpWebRequest;
        webRequest.CookieContainer = cookies;
        responseReader = new StreamReader(webRequest.GetResponse().GetResponseStream());

        // and read the response
        responseData = responseReader.ReadToEnd();
        responseReader.Close();

        Response.Write(responseData);
    }
    catch (Exception ex)
    {
        Response.Write(ex.ToString());
    }
}

// Returns the URL-encoded value of the first web.fixed_values field.
private string ExtractFixedValues1(string s)
{
    int firstEnd;
    return ExtractFixedValue(s, 0, out firstEnd);
}

// Returns the URL-encoded value of the second web.fixed_values field.
private string ExtractFixedValues2(string s)
{
    int firstEnd, secondEnd;
    ExtractFixedValue(s, 0, out firstEnd);          // skip past the first field
    return ExtractFixedValue(s, firstEnd, out secondEnd);
}

// Finds the next web.fixed_values field at or after startAt and returns its
// value. Assumes value="..." follows the name inside the tag; IndexOf throws
// if the field is missing, which Page_Load's catch block will report.
private string ExtractFixedValue(string s, int startAt, out int endPosition)
{
    string nameDelimiter = "web.fixed_values";
    string valueDelimiter = "value=\"";

    int namePosition = s.IndexOf(nameDelimiter, startAt);
    int valuePosition = s.IndexOf(valueDelimiter, namePosition);
    int startPosition = valuePosition + valueDelimiter.Length;
    endPosition = s.IndexOf("\"", startPosition);

    // UrlEncode instead of UrlEncodeUnicode: the latter emits %uXXXX escapes
    // that many non-IIS servers do not understand
    return HttpUtility.UrlEncode(
        s.Substring(startPosition, endPosition - startPosition));
}
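One thing that may be worth checking, though nothing in this thread confirms it as the cause: the first post says login.shtml posts to another URL, while the code above posts back to login.shtml itself. Servers commonly answer 405 when a POST reaches a resource that only accepts GET. The sketch below reads the form's action attribute and the repeated hidden fields from the page HTML. The class name and the regex approach are assumptions for illustration, and a real HTML parser would be more robust.

```csharp
using System;
using System.Collections.Generic;
using System.Net;
using System.Text.RegularExpressions;

public static class LoginFormParser
{
    // Returns the absolute URL the first <form> posts to, or the page URL
    // itself when no action attribute is found.
    public static Uri GetFormAction(Uri pageUrl, string html)
    {
        Match m = Regex.Match(html,
            @"<form\b[^>]*\baction\s*=\s*[""']([^""']*)[""']",
            RegexOptions.IgnoreCase);
        if (!m.Success || m.Groups[1].Value.Length == 0)
            return pageUrl;
        return new Uri(pageUrl, m.Groups[1].Value); // resolves relative actions
    }

    // Returns the value of every <input> with the given name, in page order,
    // so repeated names such as web.fixed_values are all kept.
    public static List<string> GetInputValues(string html, string name)
    {
        List<string> values = new List<string>();
        foreach (Match tag in Regex.Matches(html, @"<input\b[^>]*>", RegexOptions.IgnoreCase))
        {
            Match n = Regex.Match(tag.Value,
                @"\bname\s*=\s*[""']([^""']*)[""']", RegexOptions.IgnoreCase);
            if (!n.Success || n.Groups[1].Value != name)
                continue;

            Match v = Regex.Match(tag.Value,
                @"\bvalue\s*=\s*[""']([^""']*)[""']", RegexOptions.IgnoreCase);
            values.Add(v.Success ? WebUtility.HtmlDecode(v.Groups[1].Value) : "");
        }
        return values;
    }
}
```

Unlike the index-based helpers, this does not depend on value= appearing after name= inside the tag, and it returns raw (not URL-encoded) values, so they still need encoding before going into the POST body.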
Everything looks like it is in order, Nathan. I'd examine the HTTP
traffic between your program and the server to make sure it all
matches exactly, even little things like the User-Agent header. I had one
financial site reject HttpWebRequests until I set the UserAgent
property to look just like IE. I guess it was a weak attempt at
preventing screen scraping programs.
--
Scott http://www.OdeToCode.com/blogs/scott/
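As a rough illustration of making the request headers resemble a browser's, here is a sketch that sets them on an HttpWebRequest before it is sent. The class name and every header value, including the User-Agent string, are example assumptions. The real values should be copied from a captured browser request.

```csharp
using System;
using System.Net;

public static class BrowserLikeRequest
{
    // Example value only; replace with the string your own browser sends.
    public const string ExampleUserAgent =
        "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)";

    // Builds (but does not send) a request carrying browser-style headers.
    public static HttpWebRequest Create(string url, string referer)
    {
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);

        // These headers are exposed as properties on HttpWebRequest
        request.UserAgent = ExampleUserAgent;
        request.Accept = "text/html, */*";
        request.Referer = referer;

        // Other headers go through the Headers collection
        request.Headers["Accept-Language"] = "en-us";
        return request;
    }
}
```

Nothing goes over the network until GetResponse or GetRequestStream is called, so the request can be inspected or adjusted first.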
Scott,
Thanks for the information. I added a user agent to make it look like
IE, but I still get the 405 Method Not Allowed error message. What is
the best way to monitor the HTTP traffic between my application and
the remote site? Are there any tools I can download to show me what
is going back and forth?
Thanks in advance,
n8
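Besides an external monitor, the failed response itself can carry a hint. The HTTP specification calls for a 405 response to include an Allow header listing the methods the resource accepts, although not every server sends it. This is a minimal sketch for inspecting that header. The class and method names are assumptions.

```csharp
using System;
using System.Net;

public static class MethodNotAllowed
{
    // Parses an Allow header such as "GET, HEAD" and reports whether it
    // lists the given method. HTTP method names are case-sensitive.
    public static bool AllowsMethod(string allowHeader, string method)
    {
        if (string.IsNullOrEmpty(allowHeader))
            return false;

        foreach (string token in allowHeader.Split(','))
        {
            if (string.Equals(token.Trim(), method, StringComparison.Ordinal))
                return true;
        }
        return false;
    }

    // Summarises a failed request: status line plus the Allow header, if any.
    public static string Describe(WebException ex)
    {
        HttpWebResponse response = ex.Response as HttpWebResponse;
        if (response == null)
            return "no HTTP response: " + ex.Status;

        string allow = response.Headers["Allow"] ?? "(not sent)";
        return (int)response.StatusCode + " " + response.StatusDescription
               + "; Allow: " + allow;
    }
}
```

Catching WebException around GetResponse and writing out Describe(ex) would show whether the URL being posted to accepts POST at all.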
You might try a program called httplook. I think it is http://www.httplook.com; if not, google for it...
Also, if you get a fix - please let us know.
One I've used with success is Fiddler. http://www.fiddlertool.com/fiddler/
--
Scott http://www.OdeToCode.com/blogs/scott/
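Fiddler works as a local HTTP proxy, so a program's requests only show up if they pass through it. A sketch of pointing an HttpWebRequest at such a proxy explicitly follows. The class name is an assumption, and the host and port must match your own setup (Fiddler's default listener is 127.0.0.1:8888).

```csharp
using System;
using System.Net;

public static class DebugProxy
{
    // Builds (but does not send) a request that will be routed through the
    // given proxy, e.g. a local traffic monitor.
    public static HttpWebRequest CreateViaProxy(string url, string proxyHost, int proxyPort)
    {
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
        request.Proxy = new WebProxy(proxyHost, proxyPort);
        return request;
    }
}
```

This can help when the code runs under an account that does not pick up the desktop user's proxy settings, which may be the case for an ASP.NET page like the one in this thread.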
Scott,
I loaded the Fiddler tool and traced the HTTP traffic. Everything
getting sent looks no different than when I go directly to it. Am I
to assume that they have a way of blocking screen scrapes and, if so,
how would I explain this?
Thanks,
n8
Hmm - I'm running out of ideas n8.
I know there are sites out there blocking scrapers, but they usually
either block an IP or use client side script and DHTML to try to screw
up programs. If your app is sending the same traffic as the browser
that wouldn't be an issue.
So, my last idea is this:
Last year I had a site that would occasionally reject my web request
from a screen scraping program. It was in a loop moving through a
paged result set, and I couldn't figure out the random failures. On a
whim I put in a few Thread.Sleep calls to slow the scraper down
between requests and it never failed. I'm not sure if they monitored
requests by IP to only allow so many per second or minute or what,
though it was definitely timing related.
I guess the only other thing I'd do is really double check those HTTP
payloads and make sure everything matches - the headers, the POST data
is properly encoded, the cookie is sent, etc. etc.
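To make those two suggestions concrete, here is a rough sketch: one helper that URL-encodes the POST fields, and one that pauses between requests. The field names, the delay, and the class name are placeholders, not anything taken from the site in question.

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using System.Threading;

public static class ScrapeHelpers
{
    // Joins name/value pairs into an application/x-www-form-urlencoded body.
    // Note: Uri.EscapeDataString writes a space as %20 rather than '+'.
    public static string BuildFormBody(IEnumerable<KeyValuePair<string, string>> fields)
    {
        StringBuilder body = new StringBuilder();
        foreach (KeyValuePair<string, string> field in fields)
        {
            if (body.Length > 0) body.Append('&');
            body.Append(Uri.EscapeDataString(field.Key));
            body.Append('=');
            body.Append(Uri.EscapeDataString(field.Value));
        }
        return body.ToString();
    }

    // Runs each request in turn, sleeping between them to slow the scraper down.
    public static void RunThrottled(IList<Action> requests, int delayMilliseconds)
    {
        for (int i = 0; i < requests.Count; i++)
        {
            requests[i]();
            if (i < requests.Count - 1) Thread.Sleep(delayMilliseconds);
        }
    }
}
```

The encoded body is the thing to compare against what the browser sends in the Fiddler trace; the right delay is guesswork and depends on the site.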
HTH!
--
Scott http://www.OdeToCode.com/blogs/scott/
A different approach. Since I have been racking my head against the
wall with this approach, I thought I would try another. I thought I
would create the cookies on the fly that the site requires for the
user account and everything would be great. I can create the cookies
exactly, BUT if I change the domain property or use the domain
property the cookie does not get written; if I leave the property out (do
not use it), the cookie gets written as localhost. How do I get around
this so I can set the domain name property?
thanks again
n8
I remember trying a similar approach once, but I believe it is a
security feature that doesn't let us create a cookie from another
domain. The IE ActiveX control wouldn't let me pass cookies in at all
programmatically. Argh.
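That restriction is on the browser side, though. If the request comes from your own code through HttpWebRequest, a CookieContainer should accept a cookie for whatever host you are calling. A minimal sketch, where the cookie name, value, and host are placeholders rather than anything from the real site:

```csharp
using System;
using System.Net;

public static class CookieSketch
{
    // Builds a container holding one cookie scoped to the given host.
    public static CookieContainer CreateContainer(string name, string value, string host)
    {
        CookieContainer container = new CookieContainer();
        // Cookie(name, value, path, domain): the domain must be set
        // before a cookie can be added to a container directly.
        container.Add(new Cookie(name, value, "/", host));
        return container;
    }

    // Builds (but does not send) a request that will carry those cookies.
    public static HttpWebRequest CreateRequest(string url, CookieContainer container)
    {
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
        request.CookieContainer = container;
        return request;
    }
}
```

Whether the site accepts a hand-made cookie is a separate question; if the value is tied to a server-side session, it will probably still need to come from a real login response.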
--
Scott http://www.OdeToCode.com/blogs/scott/