Re: Reading IE browser contents?



I suspect that automation is probably the best way to
get access to a running instance of IE,
rather than trying to go through IE's hWnd.

The following demo shows how to get a running instance
of IE and get access to its DOM. You'll need to
understand a bit about the DOM (document object
model). The Document object represents the loaded page.
It contains the Body object, the various page elements, etc.
There's also the Window object (Document.parentWindow).
In early-bound code they're named slightly differently:
HTMLDocument, HTMLBody, etc.
The DOM is what you work with in DHTML. You can
use it to get access to every detail of the loaded webpage.
Each page element is an object in the document hierarchy.
Each has a style property that provides access to its CSS
properties, and so on.

To run this demo -

1) Start a project and put a button on a form.
2) Reference the following in Project -> References:

"Microsoft Shell Controls and Automation"
(This is the Shell object, which represents all the
"active desktop" functionality. You can read about it in
MSDN under "Shell Object".)

"Microsoft Internet Controls"
(This has both the WebBrowser control (WB) and the
InternetExplorer object. You'll need it to get the IE object.)

"Microsoft HTML Object Library"
(This will give you "intellisense" auto-completion
for the DOM hierarchy of objects.)

3) Put the following code into your form:

------------------------------------

Private Sub Command1_Click()
    Dim ie2 As InternetExplorer
    Dim Doc As HTMLDocument

    ' Look for an open window whose LocationName is "test.html".
    Set ie2 = GetIE("test.html")
    If Not ie2 Is Nothing Then
        ' The Document property is the root of the page's DOM.
        Set Doc = ie2.document
        Debug.Print Doc.body.innerHTML
        Set Doc = Nothing
    End If
    Set ie2 = Nothing
End Sub

Private Function GetIE(sWebpageName As String) As InternetExplorer
    Dim SH As Shell
    Dim IE As InternetExplorer
    Dim Wins As ShellWindows
    Dim i2 As Long

    Set SH = New Shell
    Set Wins = SH.Windows
    ' ShellWindows holds both Explorer folder windows and IE windows.
    For i2 = 0 To Wins.Count - 1
        Set IE = Wins.Item(i2)
        ' Item can return Nothing (for example if a window closes mid-loop).
        If Not IE Is Nothing Then
            If IE.LocationName = sWebpageName Then Exit For
            Set IE = Nothing
        End If
    Next
    ' IE is Nothing at this point if no window matched.
    Set GetIE = IE
    Set IE = Nothing
    Set Wins = Nothing
    Set SH = Nothing
End Function
---------------------------------

4) Create an HTML file named test.html, with some
kind of HTML content, and open it in IE.

5) Run the project and click the button. You
should see the innerHTML of test.html appear in
the debug window.

Explanation:

The Shell object has a ShellWindows collection. Oddly
enough, ShellWindows is a collection of Explorer folder
windows AND IE windows. All are returned as IE objects.
(This is part of the "WebView" nonsense that MS started
with "active desktop". Folder windows are treated as IE
instances.)

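So if, say, you have test.html open and you also have
www.somewhere.com/index.html open, and you also have
the Windows folder open, the LocationName values
returned in the GetIE loop will be "test.html", "index.html"
and "Windows". (That assumes the pages have no <title>. As
far as I know, IE reports a page's title as its LocationName
when there is one, so match on the title in that case.)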
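
If you're not sure what names your open windows report,
a quick check is to dump them all to the debug window.
This is only a sketch: ListShellWindows is a name I made
up, it assumes the same references as the demo, and it
uses a late-bound Object so that it doesn't matter what
kind of window each item turns out to be.

------------------------------------

Private Sub ListShellWindows()
    Dim SH As Shell
    Dim Wins As ShellWindows
    Dim Win As Object   ' late-bound: folder window or IE window
    Dim i As Long

    Set SH = New Shell
    Set Wins = SH.Windows
    For i = 0 To Wins.Count - 1
        Set Win = Wins.Item(i)
        ' Skip empty slots rather than fail on them.
        If Not Win Is Nothing Then
            Debug.Print i, Win.LocationName, Win.LocationURL
        End If
    Next
    Set Win = Nothing
    Set Wins = Nothing
    Set SH = Nothing
End Sub
---------------------------------

Call it from the button click and compare the printed
LocationName values with the string you pass to GetIE.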

The GetIE function loops through all open folder and
IE windows checking for a match to your webpage name.
If a match is found it returns that IE object. (You could
also use LocationURL, but that would have to be cleaned
up if the page is local because it gets converted to
URL format. For instance, "C:\Program files" would be
returned as "file:///C:/Program%20files".)
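
If you do want to match on LocationURL, the cleanup could
look something like the following. It's a minimal sketch,
not a full URL parser: UrlToLocalPath is a name I made up,
it only handles the "file:///" prefix, and it decodes
single-byte %XX escapes only (no UTF-8 sequences).

------------------------------------

Private Function UrlToLocalPath(ByVal sUrl As String) As String
    Dim sOut As String
    Dim sHex As String
    Dim i As Long

    ' Only local file URLs are handled; return anything else unchanged.
    If LCase$(Left$(sUrl, 8)) <> "file:///" Then
        UrlToLocalPath = sUrl
        Exit Function
    End If
    sUrl = Mid$(sUrl, 9)   ' drop the "file:///" prefix

    i = 1
    Do While i <= Len(sUrl)
        sHex = Mid$(sUrl, i + 1, 2)
        If Mid$(sUrl, i, 1) = "%" And sHex Like "[0-9A-Fa-f][0-9A-Fa-f]" Then
            ' Turn a %XX escape back into its character.
            sOut = sOut & Chr$(CLng("&H" & sHex))
            i = i + 3
        Else
            sOut = sOut & Mid$(sUrl, i, 1)
            i = i + 1
        End If
    Loop

    ' URLs use forward slashes; local paths use backslashes.
    UrlToLocalPath = Replace(sOut, "/", "\")
End Function
---------------------------------

With that, UrlToLocalPath("file:///C:/Program%20files")
gives back "C:\Program files", which you can compare
against a normal path.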

See the properties of the IE object in the object
browser to see where I got the LocationName property.
Once you have the IE object then you just reference its
Document property to get the document object. From
there you have access to the whole shebang.
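
As a small taste of what the Document object gives you,
here is a sketch that prints the page title and the
address of every link, then changes the page background
through the style property. PrintLinks is a name I made
up, and it assumes the page has finished loading so that
Doc.body is available.

------------------------------------

Private Sub PrintLinks(Doc As HTMLDocument)
    Dim Link As Object   ' late-bound; each item is a link element

    Debug.Print "Title: " & Doc.Title
    Debug.Print "Links: " & Doc.links.length
    ' The links collection holds the page's hyperlink elements.
    For Each Link In Doc.links
        Debug.Print Link.href
    Next
    ' Each element has a style property; this one sets a CSS value.
    Doc.body.Style.backgroundColor = "yellow"
End Sub
---------------------------------

In the demo you could call PrintLinks Doc right after
the Debug.Print line in Command1_Click.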


