Re: Parsing HTML pages

"MisterKen" <MisterKen@xxxxxxxxxxxxxxxxxxxxxxxxx> wrote in message
news:538B602D-D2A2-417D-B777-278C67C6BCDA@xxxxxxxxxxxxxxxx
If I have the HTML from a web page loaded into a string, how would I use
regex to return sections from within that HTML string?

I want to be able to get the "text" back between two different tags.
Basically I want to scrape some web pages and populate a database.

Does anybody have a snippet of code that could help me get the "text"?

Is it XHTML? If so you can just read it as an XmlDocument.
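
To make that suggestion concrete, here is a rough sketch of both routes. It is written in Python using only the standard library, as an analogue of loading the page into an XmlDocument, not the .NET API itself. The sample page, the namespace prefix "x" and the helper names texts_via_xml and texts_via_regex are all made up for illustration. The XML route assumes the page really is well-formed XHTML; the regex route is a fallback that will break on nested tags of the same name.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical sample page; the XML route assumes well-formed XHTML.
page = """<html xmlns="http://www.w3.org/1999/xhtml">
  <body>
    <h1>Price list</h1>
    <p class="item">Widget</p>
    <p class="item">Gadget</p>
  </body>
</html>"""

# XHTML elements live in this namespace; "x" is just a local prefix.
XHTML_NS = {"x": "http://www.w3.org/1999/xhtml"}


def texts_via_xml(xhtml, tag):
    """Parse the page as XML and return the text inside every <tag> element."""
    root = ET.fromstring(xhtml)
    return [
        "".join(el.itertext()).strip()
        for el in root.iterfind(f".//x:{tag}", XHTML_NS)
    ]


def texts_via_regex(html, tag):
    """Fallback for pages that are not well-formed XML.

    Non-greedy match between an opening and closing tag. This does not
    handle nested tags of the same name.
    """
    pattern = re.compile(
        rf"<{tag}\b[^>]*>(.*?)</{tag}>", re.IGNORECASE | re.DOTALL
    )
    return [m.strip() for m in pattern.findall(html)]


print(texts_via_xml(page, "p"))
print(texts_via_regex(page, "h1"))
```

If the page is not valid XML (unclosed tags, undeclared entities such as &nbsp;), the parse call will raise an error, and that is usually the point where people fall back to a regex or a tolerant HTML parser.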

