ScreenScraping and Viewstate

I'm writing a screenscraper in Visual Basic .NET that scrapes an ASP.NET
website. I've used a tool that echoes what my browser submits to the website
and what my scraper submits to the website. The submissions are identical
EXCEPT for the viewstate. I'm having a horrible time finding the right
encoding.

I can successfully parse the viewstate from the page. My parsed result
contains lots of + signs and ends with two = signs. When looking at the
browser submission, I see that these have been changed to %2B and %3D
respectively. I've tried running this viewstate string through the
HttpUtility.UrlEncodeUnicode method but no luck; my results still do not match
the web browser submission. Instead, UrlEncodeUnicode changes the +
and = to lowercase %2b and %3d.

Can someone explain the encoding to me? When I look at View > Encoding in IE
for the page I'm trying to scrape, I see the encoding is set to UTF-8.
Am I correct in thinking that ALL I have to do is parse the viewstate, encode
it properly, and send it right back to the server?

There are no cookies involved on this site. Thanks.

Rob Reagan
ro*@nospam.digitallabsinc.com
Nov 19 '05 #1
When you scrape the page, the viewstate value is HTML-encoded, so you must
first HTML-decode it. When you post the viewstate value back, it must
be URL-encoded.
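
As a rough illustration of those two steps, here is a minimal sketch in Python rather than VB.NET (in .NET the corresponding calls would be HttpUtility.HtmlDecode and HttpUtility.UrlEncode). The page fragment, the viewstate value, and the helper names extract_viewstate and encode_form are all hypothetical, made up for this example. Note that percent-escapes are case-insensitive, so %2b and %2B decode to the same character; the lowercase output by itself should not be what the server rejects.

```python
import html
import re
from urllib.parse import quote, unquote, urlencode

# Hypothetical page fragment; a real ASP.NET page carries a much longer value.
# The "&#43;" entity stands in for an HTML-encoded "+" (an assumption made
# here to show why the decode step matters).
PAGE = '<input type="hidden" name="__VIEWSTATE" value="dDwt&#43;MTA4/w==" />'


def extract_viewstate(page: str) -> str:
    """Pull the __VIEWSTATE value out of the markup and HTML-decode it."""
    match = re.search(r'name="__VIEWSTATE"[^>]*?\svalue="([^"]*)"', page)
    if match is None:
        raise ValueError("no __VIEWSTATE field found")
    return html.unescape(match.group(1))


def encode_form(fields: dict) -> str:
    """URL-encode form fields for an application/x-www-form-urlencoded body."""
    # quote (rather than the default quote_plus) with no safe characters,
    # so "+", "/" and "=" inside values all become percent-escapes.
    return urlencode(fields, quote_via=quote, safe="")


viewstate = extract_viewstate(PAGE)
body = encode_form({"__VIEWSTATE": viewstate})
print(viewstate)
print(body)

# Upper- and lowercase hex digits in an escape decode identically.
print(unquote("%2b") == unquote("%2B"))
```

The body string is what would go in the POST request, alongside the other form fields the browser sends.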

-- bruce (sqlwork.com)

Nov 19 '05 #2
This explains it all:
http://odetocode.com/Articles/162.aspx
--
Joe Fallon

Nov 19 '05 #3
