Re: DOM text extraction



I got to this using createRange and then pulling the metrics out of the
split strings. While this gives me a good "planar" textual surface, I
am more interested in representing all my text within a tree.

I tried to recurse through IHTMLElement.children, but I face the
following problem: when do I actually ask for the text? Given the
following example:
<p>
some text<a href="...">toto</a>
</p>

If I asked for text at each level I would get something like this:

p (some text toto)
|__ a (toto)

I would thus have redundant information: "toto" shows up both under the
a and under its parent p.
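
To make the duplication concrete, here is a minimal sketch in Python
rather than against the MSHTML interfaces. It uses the standard
html.parser module as a stand-in for the DOM. The names Node,
TreeBuilder and dump are made up for illustration, and full_text() is
only my approximation of what asking an element for its text returns.

```python
from html.parser import HTMLParser


class Node:
    """Hypothetical stand-in for an element; not an MSHTML type."""

    def __init__(self, tag):
        self.tag = tag
        self.children = []  # child elements only
        self.chunks = []    # text fragments and child nodes, in document order

    def full_text(self):
        # Concatenate the text of this element and all of its descendants,
        # which is assumed to be what a per-element text query returns.
        parts = []
        for chunk in self.chunks:
            parts.append(chunk.full_text() if isinstance(chunk, Node) else chunk)
        return " ".join(p.strip() for p in parts if p.strip())


class TreeBuilder(HTMLParser):
    """Builds a Node tree from markup using the standard-library parser."""

    def __init__(self):
        super().__init__()
        self.root = Node("#root")
        self.stack = [self.root]

    def handle_starttag(self, tag, attrs):
        node = Node(tag)
        self.stack[-1].children.append(node)
        self.stack[-1].chunks.append(node)
        self.stack.append(node)

    def handle_endtag(self, tag):
        if len(self.stack) > 1:
            self.stack.pop()

    def handle_data(self, data):
        self.stack[-1].chunks.append(data)


def dump(node, depth=0):
    # Ask for the text at every level while recursing through the children.
    lines = []
    for child in node.children:
        lines.append("  " * depth + f"{child.tag} ({child.full_text()})")
        lines.extend(dump(child, depth + 1))
    return lines


builder = TreeBuilder()
builder.feed('<p>\nsome text<a href="...">toto</a>\n</p>')
builder.close()
lines = dump(builder.root)
print("\n".join(lines))
```

Running it prints one line for the p and one indented line for the a,
and "toto" appears in both, which is exactly the duplication I want to
get rid of.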

How can I avoid this?

_Stephane
