Re: regex puzzle!
From: Niki Estner (niki.estner_at_cube.net)
Date: 11/24/04
Date: Wed, 24 Nov 2004 11:55:33 +0100
"G. Stewart" <galenstewart@yahoo.com> wrote in
news:258fa3a8.0411240026.670605c5@posting.google.com...
> Niki:
>
> Thanks. The HTML source that I am extracting from does not include
> <html>, <head>, or <body> tags. Just the block or in-line element
> tags: <p>, <i>, <em>, <a>, etc.
>
> What I want to do is to extract a snippet or preview of the source
> block, while preserving all the html tags in the snippet/preview,
> including formatting and links. Any ideas?
The point is that matching parentheses is possible with regexes, but it's
quite tricky (i.e., I'd have to look it up in a book myself...). However, I
still don't see why you need that. Consider an input like this:
"This text contains <i>italic</i>,<em>bold</em> and even <a
...>hyperlinked</a> text"
If you extract 20 characters from it, not counting tag-characters (using a
regex like the one I've suggested in my previous post) you'd get:
"This text contains <i>i"
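
A minimal sketch of that extraction step in Python. The regex from the previous post is not quoted here, so the `TOKEN` pattern and the `preview` function below are hypothetical stand-ins, not the original suggestion. The idea is to split the input into tags and single text characters, copy tags through unchanged, and count only the text characters:

```python
import re

# Hypothetical tokenizer (an assumption, not the regex from the earlier post):
# each match is either a whole tag "<...>" or one character of visible text.
TOKEN = re.compile(r"<[^>]*>|[^<]")

def preview(html, limit):
    """Return the prefix of `html` holding `limit` visible characters.

    Tags are copied through and do not count toward the limit.
    Entities such as &amp; are counted character by character here.
    """
    out = []
    visible = 0
    for match in TOKEN.finditer(html):
        token = match.group(0)
        if token.startswith("<"):
            out.append(token)      # a tag: keep it, don't count it
        elif visible == limit:
            break                  # next text character would exceed the limit
        else:
            out.append(token)      # a visible character: keep and count it
            visible += 1
    return "".join(out)

# The href value is a placeholder for the elided attributes in the example.
sample = ('This text contains <i>italic</i>,<em>bold</em> and even '
          '<a href="#">hyperlinked</a> text')
print(preview(sample, 20))  # This text contains <i>i
```

Because the loop stops only when it meets the next text character, any tags that directly follow the last counted character (for example a closing `</i>`) are still included.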
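Now, if you put this in an HTML element like:
"<span>This text contains <i>i</span>..."
you'd get HTML (not XML) that browsers will accept. One caveat: only the
closing tag for <p> is formally optional. <i>, <em> and <a> do require one, and
although browsers tolerate the omission, the unclosed formatting may carry over
into the text that follows, so it's safer to append the missing closing tags.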
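
If relying on browser leniency is not acceptable, the tags still open at the cut can be closed explicitly. The following Python sketch is one way to do that, not something proposed in this thread: `close_open_tags`, the `TAG` pattern and the `VOID` set are all hypothetical names, and the pattern assumes simple, well-formed tags without `>` inside attribute values.

```python
import re

# Assumed tag pattern: optional "/", tag name, attributes, optional trailing "/".
TAG = re.compile(r"<(/?)([a-zA-Z][a-zA-Z0-9]*)[^>]*?(/?)>")
# Elements that never take a closing tag (a partial list, for illustration).
VOID = {"br", "hr", "img", "input", "meta", "link"}

def close_open_tags(fragment):
    """Append closing tags for every element left open in `fragment`."""
    stack = []
    for match in TAG.finditer(fragment):
        closing, name, self_closing = match.group(1), match.group(2).lower(), match.group(3)
        if name in VOID or self_closing:
            continue                      # nothing to close later
        if closing:
            if name in stack:
                # Pop back to (and including) the matching opening tag.
                while stack.pop() != name:
                    pass
        else:
            stack.append(name)
    # Close the innermost element first.
    return fragment + "".join("</%s>" % name for name in reversed(stack))

print(close_open_tags("This text contains <i>i"))  # This text contains <i>i</i>
```

The result can then be wrapped in a `<span>` as above without leaving any inline element unclosed.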
Niki