Screen Scraping a Password Protected Site
- From: Gregory A Greenman <see@xxxxxxxxx>
- Date: Sat, 16 Dec 2006 16:52:32 -0600
I'm trying to screen scrape a site that requires a password. If I
access the site's login page in my browser and view the source, I
see that it does not contain a viewstate.
When my program posts the login information, the response I get
is the same page I see after logging in with my browser: it says
"Welcome" followed by my name. However, the cookie collection
returned doesn't contain any cookies (response.Cookies.Count =
0).
When I access other pages, the login screen is returned instead
of the desired page.
Obviously, I need to somehow maintain the session in subsequent
calls, but how do I do that when there are no cookies and there
is no viewstate?
If I use Fiddler to watch what happens when I access the site from
my browser, the first entry for the site (where the result is 200
and the host says "CONNECT") shows "SessionID: empty" for the
request under Session Inspector - TextView. For the response it
shows "SessionID: " followed by several bytes of data. Subsequent
200/CONNECT entries carry that same data in both the request and
the response. I suspect this is what I need to maintain my
session. If anyone can help me figure out how to get this
information and use it, I'll be very grateful.
(I'm using VB in VS2003.)
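One thing worth ruling out first, sketched here under assumptions:
HttpWebResponse.Cookies is only populated when the request had a
CookieContainer assigned, so a count of 0 does not by itself prove
that the site sets no cookies. The "SessionID" that Fiddler shows
on CONNECT entries may well be the SSL handshake's session
identifier rather than anything the application uses. The sketch
below shares one CookieContainer across requests. The URLs, the
form field names, and the Fetch helper are hypothetical
placeholders, not taken from the actual site.

```vbnet
Imports System
Imports System.IO
Imports System.Net
Imports System.Text

Module ScrapeSketch

    ' One container shared by every request, so any cookies set at
    ' login are sent back automatically on later requests.
    Private cookies As New CookieContainer

    ' Hypothetical URLs; substitute the real site's addresses.
    Private Const LoginUrl As String = "https://example.com/login"
    Private Const PageUrl As String = "https://example.com/members/page"

    ' Hypothetical helper: GET when postData is Nothing, otherwise POST.
    Function Fetch(ByVal url As String, ByVal postData As String) As String
        Dim request As HttpWebRequest = _
            CType(WebRequest.Create(url), HttpWebRequest)

        ' Without this assignment, response.Cookies stays empty.
        request.CookieContainer = cookies

        If Not postData Is Nothing Then
            Dim body As Byte() = Encoding.ASCII.GetBytes(postData)
            request.Method = "POST"
            request.ContentType = "application/x-www-form-urlencoded"
            request.ContentLength = body.Length
            Dim output As Stream = request.GetRequestStream()
            output.Write(body, 0, body.Length)
            output.Close()
        End If

        Dim response As HttpWebResponse = _
            CType(request.GetResponse(), HttpWebResponse)
        Dim reader As New StreamReader(response.GetResponseStream())
        Try
            Return reader.ReadToEnd()
        Finally
            reader.Close()
            response.Close()
        End Try
    End Function

    Sub Main()
        ' Hypothetical form field names; use the ones in the login form.
        Fetch(LoginUrl, "username=me&password=secret")

        ' Shows how many cookies the login left behind for this site.
        Console.WriteLine(cookies.GetCookies(New Uri(LoginUrl)).Count)

        ' The same container is reused, so the session cookie (if any)
        ' goes out with this request.
        Console.WriteLine(Fetch(PageUrl, Nothing))
    End Sub

End Module
```

If the count printed after the login is still 0, the site is
probably tracking the session some other way (for example, in the
URL), and the raw headers of the login response in Fiddler would
be the next place to look.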
Thanks.
--
Greg
----
http://www.spencerbooksellers.com
greg00 -at- spencersoft -dot- com