Re: Accessing HTML DOM from cscript.exe


From: Dave Methvin (news0110_at_methvin.com)
Date: 11/21/04


Date: Sun, 21 Nov 2004 14:53:22 -0500

If you have mastered XPath expressions then I bow before you. :-)

If XMLDOM works for the particular HTML documents you're trying to parse,
then I would think you could use its XPath support to find the node you
want.
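
For what it's worth, here is a rough sketch of what that could look like from JScript, assuming the page is well-formed enough (XHTML, say) for MSXML to load it at all. The helper names loadXml and selectText are made up for illustration, and the "Msxml2.DOMDocument.6.0" ProgID assumes MSXML 6 is installed; setProperty, selectSingleNode and the text property are the regular MSXML calls.

```javascript
// Hypothetical helper: parse a string with MSXML (Windows Script Host only).
// Assumes MSXML 6 is installed; throws if the markup is not well-formed XML.
function loadXml(markup) {
    var xmlDoc = new ActiveXObject("Msxml2.DOMDocument.6.0");
    xmlDoc.async = false;                 // load synchronously
    if (!xmlDoc.loadXML(markup)) {
        throw new Error("Not well-formed: " + xmlDoc.parseError.reason);
    }
    return xmlDoc;
}

// Hypothetical helper: return the text of the first node matching an
// XPath expression, or null when nothing matches.
function selectText(xmlDoc, xpath) {
    // Older MSXML versions default to XSLPattern, so ask for XPath explicitly.
    xmlDoc.setProperty("SelectionLanguage", "XPath");
    var node = xmlDoc.selectSingleNode(xpath);
    return node ? node.text : null;
}
```

Under cscript you would call something like selectText(loadXml(markup), "//title"). If loadXML rejects the document, that is the signal to fall back to the IE approach.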

The good (pragmatic) thing about using IE is that it will make some sense
out of whatever mangled HTML it gets. I would probably just walk the DOM
tree or use getElementBy* to find what I wanted. It isn't as elegant as
XPath but I would guess it's more likely to succeed on real-life malformed
HTML docs.
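
To make "walk the DOM tree" a bit more concrete, here is a minimal sketch. The function names walkDom and findByTag are my own, not part of any API. It relies only on the standard firstChild, nextSibling, nodeType and nodeName properties, so it should not matter how the host exposes collections.

```javascript
// Depth-first walk over a DOM subtree. Uses only firstChild/nextSibling,
// so it does not depend on how childNodes collections are indexed.
function walkDom(node, visit) {
    visit(node);
    for (var child = node.firstChild; child; child = child.nextSibling) {
        walkDom(child, visit);
    }
}

// Collect every element under root whose tag name matches (case-insensitive).
function findByTag(root, tagName) {
    var wanted = tagName.toUpperCase();
    var found = [];
    walkDom(root, function (node) {
        // nodeType 1 is an element node; text and comment nodes are skipped
        if (node.nodeType === 1 && String(node.nodeName).toUpperCase() === wanted) {
            found.push(node);
        }
    });
    return found;
}
```

With an IE document object in hand, findByTag(doc.documentElement, "a") would collect the links. The visitor callback is where any more complex per-node logic would go.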

"Kaushik Sridharan" <skaushik@hotmail.com> wrote in message
news:ab6981a.0411172211.699db739@posting.google.com...
"Dave Methvin" <news0110@methvin.com> wrote in message
news:<uivnPNLzEHA.1300@TK2MSFTNGP14.phx.gbl>...
> Or you could fire up an IE instance and use its parser if that's easier.
>
> var ieo = WScript.CreateObject("InternetExplorer.Application");
> // ieo.visible = 1;
> ieo.navigate("http://www.microsoft.com");
> while (ieo.Busy || ieo.ReadyState != 4) { WScript.Sleep(10); } // wait for load (4 = complete)
> var doc = ieo.document;
> // show the contents of the <title> tag
> WScript.Echo(doc.getElementsByTagName("title")(0).innerText);
> ieo.Quit();
> WScript.Quit(0);

Excellent! Thank you!

Dave's pattern-matching approach isn't going to help, I'm afraid. I
really need to navigate the DOM and do some complex stuff -- I used
title extraction just as an example. But InternetExplorer.Application
suits my needs perfectly.

Another question: Can I use XPath expressions to find nodes in an HTML
DOM tree? From what I understand, you can use XPath only with XML
documents.

(Just to give some context, I am trying to do some sophisticated
screen-scraping which requires more than simple pattern matching.)

Thanks again for your earlier response,

-K


