RE: System.Net.Webclient screen scraping: how to gracefully handle 403 (and other) errors?


Hello KF,

Based on your description, you're using the WebClient class to request many
web pages programmatically from ASP.NET page code. However, because some of
those pages raise an exception, the loop in your ASP.NET page breaks. Is that
correct?

As for the 403 error, it normally means that a security authorization check
failed on the server side. I'm not sure whether any other particular scenario
applies here. However, if all you want is to capture and ignore such errors
and continue the loop, you can add a try/catch block around the call to your
WebClient's DownloadXXX method. If an exception is caught, you can simply
ignore it and skip the current iteration, e.g.

=======================
foreach (DataRow dr in s.Tables[0].Rows)
{
    counter++;

    System.Net.WebClient wc = new System.Net.WebClient();
    string strData;

    try
    {
        // Build the article URL from the first column of the current row.
        strData = wc.DownloadString(
            "http://whatever.org/article.asp?articleid=" + dr[0].ToString());
    }
    catch (Exception)
    {
        // Ignore the failed request and move on to the next row.
        continue;
    }

    ....................................
}
=========================
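
If you would rather tell a 403 apart from other failures than swallow every
exception, one possible refinement is to catch WebException and look at the
HTTP status code on its Response. The following is only a minimal sketch under
my own assumptions: the names ScrapeHelper, ReadStatusCode and DownloadAll are
hypothetical, and writing a message to Console.Error is a stand-in for whatever
logging your page really uses.

```csharp
using System;
using System.Collections.Generic;
using System.Net;

// Hypothetical helper class; none of these names come from the original post.
static class ScrapeHelper
{
    // Returns the HTTP status code carried by a WebException, or null when the
    // request failed before any HTTP response arrived (DNS failure, timeout...).
    // The response is disposed here, so call this once per exception.
    public static HttpStatusCode? ReadStatusCode(WebException ex)
    {
        using (var response = ex.Response as HttpWebResponse)
        {
            return response != null ? response.StatusCode : (HttpStatusCode?)null;
        }
    }

    // Downloads each URL in turn and skips the ones that fail.
    // Returns a map from URL to page text for the successful requests only.
    public static Dictionary<string, string> DownloadAll(IEnumerable<string> urls)
    {
        var pages = new Dictionary<string, string>();
        using (var wc = new WebClient())
        {
            foreach (string url in urls)
            {
                try
                {
                    pages[url] = wc.DownloadString(url);
                }
                catch (WebException ex)
                {
                    HttpStatusCode? status = ReadStatusCode(ex);
                    if (status == HttpStatusCode.Forbidden)
                    {
                        Console.Error.WriteLine("403 Forbidden, skipping: " + url);
                    }
                    else
                    {
                        // Other HTTP errors report their numeric code; transport
                        // failures report the WebExceptionStatus instead.
                        string reason = status.HasValue
                            ? ((int)status.Value).ToString()
                            : ex.Status.ToString();
                        Console.Error.WriteLine("Skipping " + url + ": " + reason);
                    }
                }
            }
        }
        return pages;
    }
}
```

In your loop you would build the list of URLs from the DataRow values and pass
it to DownloadAll. Catching only WebException means that programming errors
(for example a null reference) still surface instead of being silently ignored.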

Does this work for your scenario?

Sincerely,

Steven Cheng

Microsoft MSDN Online Support Lead






This posting is provided "AS IS" with no warranties, and confers no rights.
