Re: How to read contents of html table with .net?


"Vadym Stetsyak" <vadym_s@xxxxxxx> wrote in message
news:uWweLX3yGHA.1292@xxxxxxxxxxxxxxxxxxxxxxx
Hello, Jim!

JS> I have a need to read the contents of an html table on a remote web
JS> page into a variable. I guess this is called screen scraping but not
JS> sure. I'm not sure where to start or what the best practices are to
JS> accomplish this. For instance, I have a healthcare app that needs to
JS> check a gov't web page for a user's license no. periodically. There is
JS> no login and I can put the user info in the request URL no problem but
JS> not sure how to read the response data in the tables. What is the
JS> namespace and class(es) I should be looking at?

After receiving the table, you can parse it. You can use an XML parser for
this (System.Xml).
--
Regards, Vadym Stetsyak
www: http://vadmyst.blogspot.com

Beware that most web pages aren't written as well-formed, valid XML (HTML
isn't as strict as XML). The XML parser might not work in that case.
Googling for "screen scraping .NET" should get you some alternatives.
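
To make both points concrete, here is a minimal C# sketch that reads a table
with System.Xml and reports failure when the markup isn't well-formed XML.
The TableReader class and its method names are made up for this example. It
assumes the table markup has already been downloaded into a string, that the
fragment carries no XML namespace, and that it uses no HTML-only entities
such as &nbsp; (which an XML parser would also reject).

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

// Hypothetical helper, not part of any library.
public static class TableReader
{
    // Returns one list of cell texts per <tr>.
    // Throws XmlException if the markup is not well-formed XML.
    public static List<List<string>> ReadRows(string tableMarkup)
    {
        var doc = new XmlDocument();
        doc.LoadXml(tableMarkup);

        var rows = new List<List<string>>();
        foreach (XmlNode tr in doc.SelectNodes("//tr"))
        {
            var cells = new List<string>();
            // Collect header and data cells in document order.
            foreach (XmlNode cell in tr.SelectNodes("td|th"))
            {
                cells.Add(cell.InnerText.Trim());
            }
            rows.Add(cells);
        }
        return rows;
    }

    // Same as ReadRows, but returns false (and an empty list)
    // instead of throwing when the markup is not well-formed.
    public static bool TryReadRows(string tableMarkup, out List<List<string>> rows)
    {
        try
        {
            rows = ReadRows(tableMarkup);
            return true;
        }
        catch (XmlException)
        {
            rows = new List<List<string>>();
            return false;
        }
    }
}
```

Calling TableReader.TryReadRows on a clean fragment such as
<table><tr><td>Jane Doe</td><td>12345</td></tr></table> succeeds, while a
fragment containing an unclosed <br> makes LoadXml throw, so TryReadRows
returns false. That second case is the signal to fall back to a parser that
tolerates real-world HTML.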

/claes




