Re: Replacing html tags


Woohoo! This is a great control library. Glad you posted it here as it saved
me from writing a lot of code using the WebBrowser control to do some
similar HTML manipulation.


--
Thanks again,
Mark Fitzpatrick
Former Microsoft FrontPage MVP 199?-2006


"Chris Fulstow" <chrisfulstow@xxxxxxxxxxx> wrote in message
news:1159974982.558465.258060@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
You could do this with the HTML Agility Pack:
http://www.codeplex.com/Wiki/View.aspx?ProjectName=htmlagilitypack

I think it comes with an example that strips HTML tags, which you could
probably adapt quite quickly to keep <a> tags.

jumblesale wrote:
Hello all,
I'm not all that bad at Regex, but I'm stumped on how to approach my
problem.

I need to parse a string and remove all html tags except hyperlinks.

I can remove all the html tags using:
Regex.Replace(inputText, @"<(/?[^\>]+)>", "");
But this also removes any hyperlinks, which I need to keep.

I've also written a regex for finding hyperlinks:
<a[\s]href=["'][^"]+[.\s]*["'][^<]+[.\s]*</a>
but my problem is trying to put all this together.

I've thought of using Regex.Matches and checking each instance but
can't get that to work.

Any ideas and/or code would be great - I'm used to C# but VB's cool as
well.

Cheers in advance,
max
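
One possible way to put the two pieces together, offered as a rough sketch rather than a definitive answer: keep the tag-stripping pattern from the question, but add a negative lookahead so that opening and closing <a> tags are skipped. The names TagStripper and StripAllButLinks are made up for illustration and do not come from the thread. Like the original pattern, this assumes no tag contains a literal ">" inside an attribute value, and it does nothing special for comments or script blocks; for messy real-world HTML a parser such as the HTML Agility Pack mentioned above is likely the safer route.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, named for illustration only.
public static class TagStripper
{
    // "<"               start of a tag
    // (?!/?a[\s>])      negative lookahead: do NOT match if the tag name is
    //                   exactly "a" (optionally preceded by "/"), i.e. <a ...>,
    //                   <a> or </a>. Tags such as <abbr> still match, because
    //                   the character after "a" must be whitespace or ">".
    // [^>]+>            the rest of the tag up to the closing ">"
    private static readonly Regex NonAnchorTag =
        new Regex(@"<(?!/?a[\s>])[^>]+>", RegexOptions.IgnoreCase);

    // Removes every tag except <a ...> and </a>; text between tags is kept.
    public static string StripAllButLinks(string inputText)
    {
        return NonAnchorTag.Replace(inputText, "");
    }
}

// Example call (input uses single-quoted attributes to keep the literal short):
//   TagStripper.StripAllButLinks(
//       "<p>Hello <b>world</b>, see <a href='http://example.com'>this link</a>.</p>");
```

The lookahead avoids a separate Regex.Matches pass: a single Replace call removes the unwanted tags and leaves the hyperlinks untouched.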


