Re: Parsing large amounts of data (200,000 entries) with XML?

From: Joseph M. Ferris (josephmferris_at_cox.net)
Date: 03/17/04


Date: Wed, 17 Mar 2004 08:34:46 -0700

Bonj wrote:

> Unfortunately the fact that DLL hell exists at all with it is enough to put me off - it doesn't make it any better that it's capable of being cured.
> The fact is if I go down this route I'm likely to end up having produced
> something that works beautifully and then my boss will just look at me
> like an idiot because it doesn't work on his PC because he hasn't got
> the right version of the library. DLL-hell puts it in a whole different
> ball-park really - if you are competing with things that have to accept
> that DLL-hell may exist, well it's a whole different area of
> competition. Learn the simple golden rule - *early binding sucks*!!!!

It is just a dependency, like other dependencies that exist within an
application. You can't really expect every user to have every component
at any given time. If you build it with the 3.0 version of the parser,
it will work on your boss's machine. If you build it with 4.0, you will
most likely have to distribute a dependency with the build. Of course,
if you build an install package, the dependencies will be resolved and
included with the distribution.

Your golden rule does not cut it for me, though. It is definitely an
opinion that I don't share. Early binding is faster, since calls are
resolved at compile-time instead of being looked up by name at run-time;
it also provides IntelliSense and catches misspelled member names and
mismatched types at compile-time. I only use late binding when I
absolutely need to: when creating objects in ASP, when working with COM+
(although it is not always required), and when creating an unknown object
that implements a known interface.
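
To make the distinction concrete, here is a minimal sketch of the same load done both ways. It assumes a project reference to "Microsoft XML, v3.0" for the early-bound version (that reference exposes the library name MSXML2); the function names and the path argument are made up for illustration and are not part of the code further down.

```vb
Option Explicit

' Early binding: needs a project reference to "Microsoft XML, v3.0"
' (assumed here). The compiler knows the type, so member names are
' checked at compile time.
Private Function LoadEarlyBound(ByVal strPath As String) As Boolean

Dim xdoEarly As MSXML2.DOMDocument

     Set xdoEarly = New MSXML2.DOMDocument
     xdoEarly.async = False
     LoadEarlyBound = xdoEarly.Load(strPath)

End Function

' Late binding: no project reference is needed. The variable is a plain
' Object, so a misspelled member only fails when the line executes.
Private Function LoadLateBound(ByVal strPath As String) As Boolean

Dim xdoLate As Object

     Set xdoLate = CreateObject("MSXML2.DOMDocument")
     xdoLate.async = False
     LoadLateBound = xdoLate.Load(strPath)

End Function
```

Both functions return True when the document loaded and parsed. The early-bound one will not compile without the reference, which is exactly the trade-off being argued about here.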

> Should, should, should. If everything that 'should' happen did happen, just think what a magnificently perfect situation we'd be in.

Granted. I am a realist, so I use 'should' wherever appropriate.
Ironically, 'should' is used a lot more often when talking about
Microsoft products than it *should* be. <g>

> If you can get it working with late-binding that'd be amazing. Thanks.

I believe that this should work for you, then:

Option Explicit

Private Function CreateDOM() As Object

Dim xdoLocal As Object

     Set xdoLocal = CreateObject("MSXML.DOMDocument")
     xdoLocal.async = False
     xdoLocal.validateOnParse = False
     xdoLocal.resolveExternals = False
     Set CreateDOM = xdoLocal

End Function

Private Sub Form_Load()

Dim xdoSource As Object
Dim xdoStyleSheet As Object
Dim xdoOutput As Object
Dim lngFree As Long
Dim str As String

     ' Load the source data and the stylesheet from the application folder.
     Set xdoSource = CreateDOM
     xdoSource.Load App.Path & "\test.xml"

     Set xdoStyleSheet = CreateDOM
     xdoStyleSheet.Load App.Path & "\test.xsl"

     ' Apply the stylesheet; transformNode returns the result as a string.
     str = xdoSource.transformNode(xdoStyleSheet)
     Debug.Print "xdoSource.transformNode: " & vbNewLine & str

     ' Write the generated HTML to a file in the application folder.
     lngFree = FreeFile

     Open App.Path & "\test.html" For Output As #lngFree

     Print #lngFree, str

     Close #lngFree

End Sub
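
If the worry is which parser version is installed on the target machine, one possible variation of CreateDOM is to try version-specific ProgIDs from newest to oldest and fall back to the version-independent one. This is only a sketch: the name CreateAnyDOM is made up, and the list of ProgIDs is an assumption about what may be installed (these are the MSXML2 names, not the ProgID used above).

```vb
Private Function CreateAnyDOM() As Object

Dim varProgID As Variant
Dim xdoLocal As Object

     ' CreateObject raises a run-time error when a ProgID is not
     ' registered, so errors are suppressed while probing.
     On Error Resume Next
     For Each varProgID In Array("MSXML2.DOMDocument.4.0", _
                                 "MSXML2.DOMDocument.3.0", _
                                 "MSXML2.DOMDocument")
         Set xdoLocal = CreateObject(varProgID)
         If Not xdoLocal Is Nothing Then Exit For
     Next varProgID
     On Error GoTo 0

     ' Apply the same settings as CreateDOM once a parser was found.
     If Not xdoLocal Is Nothing Then
         xdoLocal.async = False
         xdoLocal.validateOnParse = False
         xdoLocal.resolveExternals = False
     End If

     ' Returns Nothing when no parser could be created.
     Set CreateAnyDOM = xdoLocal

End Function
```

The caller would need to test the result with Is Nothing before calling Load, since none of the listed versions may be present.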

Additionally, let me give you the new XSL that I worked out to support
the non-breaking spaces that Larry was looking for:

<?xml version="1.0"?>
<!DOCTYPE xsl:stylesheet [
<!ENTITY nbsp "&#160;">
]>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
     <xsl:template match="/">
         <html>
             <body>
                 <table border="2">
                     <tr>
                         <td>Name</td>
                         <td>Address Line 1</td>
                         <td>Address Line 2</td>
                         <td>City</td>
                         <td>State</td>
                         <td>Postal Code</td>
                         <td>Phone</td>
                     </tr>
                     <xsl:for-each select="customerlist/customeritem">
                         <tr>
                             <td><xsl:value-of select="name_last"/>, <xsl:value-of select="name_first"/>&nbsp;</td>
                             <td><xsl:value-of select="address1"/>&nbsp;</td>
                             <td><xsl:value-of select="address2"/>&nbsp;</td>
                             <td><xsl:value-of select="city"/>&nbsp;</td>
                             <td><xsl:value-of select="state"/>&nbsp;</td>
                             <td><xsl:value-of select="postal"/>&nbsp;</td>
                             <td><xsl:value-of select="phone"/>&nbsp;</td>
                         </tr>
                     </xsl:for-each>
                 </table>
             </body>
         </html>
     </xsl:template>
</xsl:stylesheet>

And so you don't need to look for it in the other post, here is the XML
again:

<?xml version="1.0"?>
<customerlist>
      <customeritem>
          <name_last>Smith</name_last>
          <name_first>Bill</name_first>
          <address1>123 Any Lane</address1>
          <address2></address2>
          <city>New York</city>
          <state>NY</state>
          <postal>10003</postal>
          <phone>917.222.2324</phone>
      </customeritem>
      <customeritem>
          <name_last>Johnson</name_last>
          <name_first>Amy</name_first>
          <address1>341 Durango Curve</address1>
          <address2>Suite 123</address2>
          <city>Oakland</city>
          <state>CA</state>
          <postal>98765</postal>
          <phone>512.215.5412</phone>
      </customeritem>
      <customeritem>
          <name_last>Hurtz</name_last>
          <name_first>Richard</name_first>
          <address1>4923 Viagra Way</address1>
          <address2></address2>
          <city>Upton</city>
          <state>CA</state>
          <postal>99999</postal>
          <phone>874.654.2541</phone>
      </customeritem>
      <customeritem>
          <name_last>Doe</name_last>
          <name_first>Jane</name_first>
          <address1>123 Your Street</address1>
          <address2></address2>
          <city>Your City</city>
          <state>AL</state>
          <postal>23423</postal>
          <phone>976.654.2154</phone>
      </customeritem>
      <customeritem>
          <name_last>Johnson</name_last>
          <name_first>Howard</name_first>
          <address1>9834 Main St.</address1>
          <address2>Box 12</address2>
          <city>Fort Lauderdale</city>
          <state>FL</state>
          <postal>12343</postal>
          <phone>984.354.2465</phone>
      </customeritem>
</customerlist>

That should get you started. ;-)

--Joseph