InStr and HTML Screen Scraping problem


From: Phil396 (anonymous_at_discussions.microsoft.com)
Date: 03/10/04


Date: Wed, 10 Mar 2004 09:00:48 -0800

I am trying to fix an application that used to work
but no longer does. The VB program takes an HTML
file and parses out the data to fill variables.
The problem is that the HTML file is malformed,
which causes the VB program to scrape the wrong
values into the variables. Here is the HTML:

<TD width="3%">rk&nbsp; </TD>
<TD width="3%">9&nbsp;
<TD width="29%">Jim Ray&nbsp; </TD>

Here is the code that scrapes the HTML file

nStartSpot = InStr(nEndSpot + 1, LCase(m_DataString), "<td")

nEndSpot = InStr(nStartSpot + 1, LCase(m_DataString), "</td>")

This works for the other VB variables.
The code expects a </td> for each value in
the HTML table, but as you can see there
is no </td> after <TD width="3%">9&nbsp;
so I tried this:
     
nStartSpot = InStr(nEndSpot + 1, LCase(m_DataString), "<td")

nEndSpot = InStr(nStartSpot + 1, LCase(m_DataString), "&nbsp;")
        
but with no success. Debugging could potentially crash
the system, so I do not want to step through the code.
I also know that &nbsp; will be converted to an empty
string when the program removes the HTML tags. Any help
understanding why my solution did not fix the problem
would be appreciated.
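
For reference, a minimal sketch of one possible workaround for the missing </td>: treat a cell as ending at whichever comes first, the next "</td>" or the next "<td". This is only an illustration under assumptions. FindCellEnd and DumpCells are hypothetical names that do not appear in the original program, and how the real program extracts and cleans the cell text between nStartSpot and nEndSpot is not shown in the post.

```vb
' Sketch only: FindCellEnd and DumpCells are hypothetical helpers,
' not part of the original application.

' Returns the position of the "<" that ends the cell starting at
' nStartSpot: either its own "</td>" or, if that is missing, the
' next "<td". Returns 0 if neither is found.
Private Function FindCellEnd(ByVal sLower As String, _
                             ByVal nStartSpot As Long) As Long
    Dim nClose As Long
    Dim nNextOpen As Long

    nClose = InStr(nStartSpot + 1, sLower, "</td>")
    nNextOpen = InStr(nStartSpot + 1, sLower, "<td")

    If nClose = 0 Then
        FindCellEnd = nNextOpen
    ElseIf nNextOpen = 0 Then
        FindCellEnd = nClose
    ElseIf nNextOpen < nClose Then
        ' The next cell opens before any closing tag: unclosed cell.
        FindCellEnd = nNextOpen
    Else
        FindCellEnd = nClose
    End If
End Function

' Walks every cell and prints its raw text (opening tag included).
Public Sub DumpCells(ByVal sHtml As String)
    Dim sLower As String
    Dim nStartSpot As Long
    Dim nEndSpot As Long

    ' Lower-case once instead of on every InStr call.
    sLower = LCase$(sHtml)
    nEndSpot = 1

    Do
        ' Search from nEndSpot itself (not nEndSpot + 1), because
        ' nEndSpot may already point at the next "<td".
        nStartSpot = InStr(nEndSpot, sLower, "<td")
        If nStartSpot = 0 Then Exit Do

        nEndSpot = FindCellEnd(sLower, nStartSpot)
        ' Last cell unclosed: take everything to the end of the string.
        If nEndSpot = 0 Then nEndSpot = Len(sHtml) + 1

        Debug.Print Mid$(sHtml, nStartSpot, nEndSpot - nStartSpot)
    Loop
End Sub
```

Calling DumpCells m_DataString from the Immediate window should print one chunk per <TD>, including the unclosed one. Note that the sketch searches from nEndSpot rather than nEndSpot + 1; with the original nEndSpot + 1 the next "<td" would be skipped whenever the end position is that "<td" itself.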


