Replacing html tags



Hello all,
I'm not all that bad at Regex, but I'm stumped on how to approach my
problem.

I need to parse a string and remove all HTML tags except hyperlinks.

I can remove all the HTML tags using:
Regex.Replace(inputText, @"<(/?[^\>]+)>", "");
But this also removes any hyperlinks, which I need to keep.
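
One possible way to keep the hyperlinks with a single Regex.Replace call is a negative lookahead that refuses to match <a ...> and </a>. This is only a sketch: the TagStripper class and StripAllButLinks method names are made up for illustration, and it assumes no attribute value contains a literal ">".

```csharp
using System;
using System.Text.RegularExpressions;

public static class TagStripper
{
    // Matches any tag except <a ...> and </a>.
    // The lookahead rejects "a" or "/a" only when followed by whitespace,
    // "/" or ">", so tags such as <abbr> or <address> are still removed.
    private static readonly Regex NonAnchorTag = new Regex(
        @"<(?!/?a(?=[\s/>]))[^>]+>",
        RegexOptions.IgnoreCase);

    // Hypothetical helper: strips every tag except hyperlink tags.
    public static string StripAllButLinks(string inputText)
    {
        return NonAnchorTag.Replace(inputText, "");
    }
}
```

Calling TagStripper.StripAllButLinks on a string leaves the opening and closing anchor tags in place and drops everything else that looks like a tag.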

I've also written a regex for finding hyperlinks:
<a[\s]href=["'][^"]+[.\s]*["'][^<]+[.\s]*</a>
but my problem is trying to put all this together.

I've thought of using Regex.Matches and checking each instance but
can't get that to work.
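
The check-each-instance idea can also be done without Regex.Matches, by passing a MatchEvaluator to Regex.Replace and deciding per tag whether to keep it. A minimal sketch, assuming the same "no > inside attribute values" limitation; the LinkKeeper class and its method names are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LinkKeeper
{
    // Hypothetical helper: true when the tag text is <a ...>, <a> or </a>.
    private static bool IsAnchorTag(string tag)
    {
        return Regex.IsMatch(tag, @"^</?a(\s|/|>)", RegexOptions.IgnoreCase);
    }

    // Visits every tag-like match and keeps it only if it is an anchor tag.
    public static string StripAllButLinks(string inputText)
    {
        return Regex.Replace(
            inputText,
            @"<[^>]+>",
            m => IsAnchorTag(m.Value) ? m.Value : "");
    }
}
```

The evaluator runs once per match, so the keep-or-drop rule lives in ordinary code and is easy to extend to other tags later.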

Any ideas and/or code would be great - I'm used to C# but VB's cool as
well.

Cheers in advance,
max



