ScreenScraping and Viewstate


From: Rob Reagan (RobReagan_at_discussions.microsoft.com)
Date: 12/07/04


Date: Tue, 7 Dec 2004 07:09:02 -0800

I'm writing a screen scraper in Visual Basic .NET that scrapes an ASP.NET
website. I've used a tool that echoes what my browser submits to the website
and what my scraper submits to the website. The submissions are identical
EXCEPT for the viewstate. I'm having a horrible time finding the right
encoding.

I can successfully parse the viewstate from the page. My parsed result
contains lots of + signs and ends with two = signs. When looking at the
browser submission, I see that these have been changed to %2B and %3D
respectively. I've tried running this viewstate string through the
HttpUtility.UrlEncodeUnicode method but no luck; my results still do not match
the web browser submission. Instead, the UrlEncodeUnicode method changes the +
and = to lowercase %2b and %3d.
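
To make the encoding described above concrete, here is a minimal sketch. It is in Python rather than VB.NET purely for illustration, and the viewstate value is a made-up stand-in, not one from the real site. It shows + and = being percent-encoded as %2B and %3D, and checks that the lowercase forms decode to the same string. Hex digits in percent-escapes are case-insensitive under the URI specification, so the case difference by itself should not change what the server decodes.

```python
from urllib.parse import quote, unquote

# Hypothetical viewstate value: Base64 text containing '+' and ending in '=='.
viewstate = "dDw+MTIz+abc=="

# Percent-encode every reserved character; safe="" means '/' is escaped too.
encoded = quote(viewstate, safe="")
print(encoded)

# Simulate an encoder that emits lowercase hex digits in its escapes.
lower = encoded.replace("%2B", "%2b").replace("%3D", "%3d")

# Both spellings decode back to the identical original string.
print(unquote(lower) == unquote(encoded) == viewstate)
```

If the two submissions still differ after decoding, the difference probably lies somewhere other than the case of the escapes, for example in a + that was sent unescaped and read by the server as a space.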

Can someone explain the encoding to me? When I look at View > Encoding in IE
for the page I'm trying to scrape, I see the encoding is set to UTF-8.
Am I correct in thinking that ALL I have to do is parse the viewstate, encode
it properly, and send it right back to the server?
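
The parse, encode, and send-back round trip asked about above can be sketched as follows. This is again Python for illustration only, under stated assumptions: the HTML fragment, the viewstate value, and the txtSearch field name are all hypothetical, and the regular expression assumes the name attribute appears before the value attribute in the hidden input.

```python
import re
from urllib.parse import urlencode, parse_qs

# Hypothetical page fragment; a real ASP.NET page carries a much longer value.
html = '<input type="hidden" name="__VIEWSTATE" value="dDw+MTIz+abc==" />'

def extract_viewstate(page):
    """Return the value of the __VIEWSTATE hidden field, or None if absent."""
    # Assumes name="..." precedes value="..." inside the same tag.
    match = re.search(r'name="__VIEWSTATE"[^>]*\bvalue="([^"]*)"', page)
    return match.group(1) if match else None

viewstate = extract_viewstate(html)

# Build an application/x-www-form-urlencoded body; urlencode escapes '+' and '='.
# "txtSearch" is a made-up form field standing in for the page's real inputs.
body = urlencode({"__VIEWSTATE": viewstate, "txtSearch": "example"})
print(body)

# Decoding the body the way a server would recovers the original viewstate.
print(parse_qs(body)["__VIEWSTATE"][0] == viewstate)
```

The body would then be sent as the POST payload with a Content-Type of application/x-www-form-urlencoded, alongside whatever other form fields the browser submits.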

There are no cookies involved on this site. Thanks.

Rob Reagan
rob@nospam.digitallabsinc.com


