extracting readable text

From: Ben S. Terry (fake_email_at_spammers_suck.com)
Date: 10/12/04


Date: Tue, 12 Oct 2004 17:16:44 -0600

Hello,

I would like to extract just the readable text from an HTML document, i.e.
the text that is shown in the browser window. I believe using the
WebBrowser control is the best way to do this, but I can't seem to find an
interface on the control for getting at just the readable text. Does anyone
have any thoughts on how to do this?

Ben S. Terry
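
For reference, one way to approach the question without the control at all is to parse the HTML and keep only the text nodes. The following is a minimal sketch in Python using the standard library's html.parser, not the WebBrowser control; the class and function names (VisibleTextExtractor, extract_readable_text) are made up for illustration, and it only approximates what a browser shows (it ignores CSS visibility and collapses whitespace crudely). If the control is preferred, the DOM reached through its Document property has a body element with an innerText property, which may be closer to what is wanted.

```python
from html.parser import HTMLParser


class VisibleTextExtractor(HTMLParser):
    """Hypothetical helper: collects text nodes, skipping non-rendered elements."""

    # Elements whose text content is not shown in the page body.
    SKIPPED = {"script", "style", "title"}

    def __init__(self):
        super().__init__(convert_charrefs=True)  # decode entities like &amp;
        self._skip_depth = 0
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep non-empty text that is outside any skipped element.
        if self._skip_depth == 0 and data.strip():
            self._chunks.append(data.strip())

    def text(self):
        # Assumption: a single space between text nodes is good enough.
        return " ".join(self._chunks)


def extract_readable_text(html):
    """Return an approximation of the text a browser would display."""
    parser = VisibleTextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()
```

Calling extract_readable_text on a page's source returns the heading and paragraph text joined by spaces, with script, style, and title contents left out. Text split across inline tags (for example a word partly wrapped in bold) will pick up an extra space, so treat the result as a rough approximation.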


