Re: extract text from html



Patrick,

If your goal is simply to remove the HTML tags from a string, I wrote a
function for this purpose using a regex:

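Private Function stripHTML(ByVal strHTML As String) As String
    ' Match "<", then as few characters as possible (newlines included), up to the next ">"
    Dim objRegExp As New System.Text.RegularExpressions.Regex("<(.|\n)+?>")
    ' Replace every matched tag with an empty string
    Return objRegExp.Replace(strHTML, "")
End Function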

I use this in a WinForms app that scrapes websites for useful information
with a WebClient.
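
One thing to watch for: because the function above replaces each tag with an empty string, the sample from the question comes out as "HelloThere buddyplease" rather than with the words separated. Below is a minimal sketch of a variant that substitutes a space for each tag, decodes entities, and collapses whitespace. The name StripHtmlToText and the module are hypothetical, and it assumes .NET Framework 4.0 or later for System.Net.WebUtility. Like the original regex, it will be fooled by a ">" inside an attribute value and does not drop script or style contents, and putting a space where an inline tag was can split a word that a tag sat in the middle of.

```vb
Imports System.Net
Imports System.Text.RegularExpressions

Module StripHtmlDemo

    ' Hypothetical helper, not part of the original post.
    Function StripHtmlToText(ByVal html As String) As String
        ' Replace each tag with a space so adjacent words do not run together
        Dim noTags As String = Regex.Replace(html, "<(.|\n)+?>", " ")
        ' Decode entities such as &amp; and &lt; (assumes .NET 4.0+)
        Dim decoded As String = WebUtility.HtmlDecode(noTags)
        ' Collapse runs of whitespace into single spaces and trim the ends
        Return Regex.Replace(decoded, "\s+", " ").Trim()
    End Function

    Sub Main()
        ' Sample input taken from the question
        Console.WriteLine(StripHtmlToText("<Bold>Hello</Bold>There buddy<p>please"))
        ' prints: Hello There buddy please
    End Sub

End Module
```

For anything beyond simple fragments like this, a real HTML parser is probably the safer choice.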



hth



Michel Posseth [MCP]


"Patrick" <praft@xxxxxxxxx> wrote in message
news:%23UkeVbv$FHA.1600@xxxxxxxxxxxxxxxxxxxxxxx
> I've got some text with a few HTML tags, such as the following
> <Bold>Hello</Bold>There buddy<p>please .....
>
> I need to be able to extract just the text, which would be
> Hello there buddy please....
>
> Note, this is a Windows App, and not a Web App.
> Any ideas anyone?
>

