Re: XMLTextReader reading too many characters

Just wanted to follow up on this. Still not finding the problem. See the
end of this message for a copy of the XML document from the previous
message. In this message I'm going to post a code snippet which I hope
might spark something in someone's mind about the problem I'm having.
Actually, this could run as a full-blown command/console program, but it's
only about an eighth or so of the full code. What I left out is the code for
entities other than DistrictPrecinctParts.

Anyhow, thanks for any input.

John

--------------------------------------------------------

Option Explicit On
Option Strict On



Imports System.Data
Imports System.Data.SqlClient
Imports System.Configuration.ConfigurationSettings


Module Module1
Private numErrors As Int32
Private tw As IO.TextWriter
Private numErrorLineNumber As Int64
Private blnAdvanceLine As Boolean

Sub Main()
Dim myS As IO.Stream
Dim strFilePath, strName, strValue, strMode, strPath As String
Dim blnContinue As Boolean = True
Dim trans As SqlTransaction
Dim cmd As SqlCommand = New SqlCommand
Dim conn As SqlClient.SqlConnection

strMode = AppSettings("mode").ToString
If strMode = "test" Then
conn = New SqlClient.SqlConnection(AppSettings("testconnection"))
strPath = AppSettings("testXMLFilePath").ToString
Else 'prod
conn = New SqlClient.SqlConnection(AppSettings("prodconnection"))
strPath = AppSettings("prodXMLFilePath").ToString
End If
Dim key_district, key_precinctpart As String


If IO.File.Exists(strPath & "VoterViewImportErrors.Log") Then
IO.File.Delete(strPath & "VoterViewImportErrors.Log")
End If
tw = IO.File.CreateText(strPath & "VoterViewImportErrors.Log")
strFilePath = strPath & AppSettings("XMLFileName")

Try
myS = IO.File.OpenRead(strFilePath)
Catch err As SystemException
tw.WriteLine(Now.ToString & vbTab & err.Message)
GoTo ExitRoutine
End Try

Dim xr As Xml.XmlTextReader = New Xml.XmlTextReader(myS)

Try
myS.Position = 0
conn.Open()
trans = conn.BeginTransaction("LoadXML")
cmd.Transaction = trans
cmd.Connection = conn
cmd.CommandTimeout = 1800
Catch err As SystemException
tw.WriteLine(Now.ToString & vbTab & err.Message)
End Try

blnAdvanceLine = False
numErrors = 0
numErrorLineNumber = -1

xr.Read()
Do Until Not blnContinue
strName = xr.Name.ToLower
Select Case strName
Case "xml"
xr.Read()
GoTo skipNext
Case "voterview"
xr.Read()
GoTo skipNext
Case "districtprecinctparts"
Try
xr.Read()
Do While Trim(xr.Name) = "" And Not xr.EOF
xr.Read()
Loop
If xr.Name.ToLower <> "districtprecinctpart" Then
GoTo skipNext
End If
nextDPP:
xr.Read()
Do While Trim(xr.Name) = "" And Not xr.EOF
xr.Read()
Loop
key_district = xr.ReadString
If key_district = "" Then key_district = "null"
xr.Read()
key_precinctpart = xr.ReadString
If key_precinctpart = "" Then key_precinctpart = "null"

cmd.CommandText = "insert into district_precinct_part_new values(" & _
key_district & ", " & _
key_precinctpart & ", getDate(), null, null" & _
")"

cmd.ExecuteNonQuery()
contDPP:
xr.Read()
Do Until xr.Name.ToLower = "districtprecinctpart" Or xr.EOF
xr.Read()
Loop
xr.Read()
Do Until xr.Name.ToLower <> "" Or xr.EOF
xr.Read()
Loop
If xr.Name.ToLower = "districtprecinctpart" Then GoTo nextDPP
tw.WriteLine(Now.ToString & vbTab & "districtprecinctpart done")
Catch err As SystemException
If blnAdvanceLine Then
errMessage(xr.Name, xr.ReadString, xr.LineNumber, xr.LinePosition, _
myS.Position, err, cmd.CommandText)
tw.WriteLine(Now.ToString & vbTab & _
"Errors report limited to 100 on single line.")
GoTo skipNext
Else
errMessage(xr.Name, xr.ReadString, xr.LineNumber, xr.LinePosition, _
myS.Position, err, cmd.CommandText)
GoTo contDPP
End If
End Try
Case Else
Try
xr.Read()
Catch err As SystemException
tw.WriteLine(Now.ToString & vbTab & err.Message)
End Try
End Select
skipNext:
If xr.EOF Then blnContinue = False
tw.Flush()
Loop

myCommit:
Try
trans.Commit()
tw.WriteLine(Now.ToString & vbTab & "transaction committed!")
Catch err As SystemException
tw.WriteLine(Now.ToString & vbTab & "commit failed!")
End Try

ExitRoutine:
If Not xr Is Nothing AndAlso xr.ReadState <> Xml.ReadState.Closed Then xr.Close()
Try
myS.Close()
Catch
End Try
Try
tw.Close()
Catch
End Try
If Not trans Is Nothing Then trans.Dispose()
If conn.State = ConnectionState.Open Then conn.Close()
conn.Dispose()
xr = Nothing
myS = Nothing
tw = Nothing
End Sub

Private Sub errMessage(ByVal strName As String, ByVal strValue As String, _
ByVal ln As Int64, ByVal lp As Int64, ByVal pos As Int64, _
ByVal err As SystemException, Optional ByVal ct As String = "")
Dim strMsg As String

strMsg = Now.ToString & vbTab & "LOAD ERROR -- Name: " & strName & vbTab & _
"Value: " & strValue & vbTab & "Line Number: " & ln.ToString & _
vbTab & "Line Position: " & lp.ToString & vbTab & "Stream Position: " & _
pos.ToString & vbTab & _
"Err: " & Replace(Replace(err.Message, Chr(10), " "), Chr(13), "")
If ct = "" Then
tw.WriteLine(strMsg)
Else
tw.WriteLine(strMsg)
tw.WriteLine(Now.ToString & vbTab & "LOAD ERROR -- " & ct)
End If

If ln = numErrorLineNumber Then
If numErrors > 100 Then 'arbitrary choice of 100, seems to be enough
'need to set a flag indicating that the line needs to be advanced in the file
blnAdvanceLine = True
numErrors = 0 'reset the count
numErrorLineNumber = -1
Else
numErrors += 1
'should only need to set flag once, but this is an easy method
blnAdvanceLine = False
End If
Else
numErrorLineNumber = ln
End If

End Sub

End Module



"JohnB" <do_not_spam_me_john.bidondo@xxxxxxxxxxx> wrote in message
news:%23D1m3UE2HHA.5980@xxxxxxxxxxxxxxxxxxxxxxx
I'm stumped though I have an idea of what might be happening. I would
appreciate any help someone might give/suggest.

I have a well formed XML document. Here is an example below. The real
thing (a file) is almost a gig in size.

<?xml version="1.0" encoding="UTF-8"?><VoterView
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"><DistrictPrecinctParts><DistrictPrecinctPart>

<key_district>182496</key_district><key_precinctpart>447484</key_precinctpart></DistrictPrecinctPart><DistrictPrecinctPart>

<key_district>71</key_district><key_precinctpart>181435</key_precinctpart></DistrictPrecinctPart><DistrictPrecinctPart>

<key_district>182431</key_district><key_precinctpart>181435</key_precinctpart></DistrictPrecinctPart><DistrictPrecinctPart>

<key_district>182520</key_district><key_precinctpart>181523</key_precinctpart></DistrictPrecinctPart><DistrictPrecinctPart>

<key_district>3</key_district><key_precinctpart>181011</key_precinctpart></DistrictPrecinctPart><DistrictPrecinctPart>

<key_district>3</key_district><key_precinctpart>181012</key_precinctpart></DistrictPrecinctPart>
</DistrictPrecinctParts></VoterView>

Okay, the document is big, this is just a small sample. Anyway, the
problem is that I go through and read the file, parsing out the elements
into strings using ReadString but it gets to about the sample section of
the file and throws an exception every time. Darned if I know why, though
again I have my suspicions. Here is the exception information I'm
getting.

"System.Xml.XmlException: The 'key_precinctpart' start tag on line '79'
does not match the end tag of 'key_precinctpart<'. Line 79, position 65.
at System.Xml.XmlTextReader.ParseTag()
at System.Xml.XmlTextReader.ParseBeginTagExpandCharEntities()
at System.Xml.XmlTextReader.Read()
at System.Xml.XmlReader.GetTextContent()
at System.Xml.XmlReader.ReadString()
at VoterXMLDataLoad.Module1.Main() in D:\VS Net
Projects\VOTERXMLDataLoad\Module1.vb:line 293"

Looking at the XML at line 79, position 65, it's all well formed and there
is nothing in the data that would indicate a problem. Any clues would be
helpful, I'm out of ideas. My one idea was that the stream needs to be
flushed, but since I'm not flushing it for other sections of the document
which work, why would I need to in this section?
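
For comparison, here is a minimal sketch of the same ReadString approach,
driven by node type instead of by repeated name checks. It assumes the two
element names from the sample above and reads from an in-memory string rather
than the real file; the module name and the sample string are made up for
illustration. It does not explain the exception, but it is small enough to
point at just the section of the file that fails.

```vbnet
Imports System
Imports System.IO
Imports System.Xml

Module ReadStringSketch   ' hypothetical module name, illustration only
    Sub Main()
        ' Assumed sample: one record shaped like the excerpt above.
        Dim sample As String = _
            "<VoterView><DistrictPrecinctParts><DistrictPrecinctPart>" & _
            "<key_district>71</key_district>" & _
            "<key_precinctpart>181435</key_precinctpart>" & _
            "</DistrictPrecinctPart></DistrictPrecinctParts></VoterView>"

        Dim xr As New XmlTextReader(New StringReader(sample))
        Dim key_district As String = "null"
        Dim key_precinctpart As String = "null"

        Try
            ' Read() returns False at end of input, so no separate EOF test is needed.
            Do While xr.Read()
                If xr.NodeType = XmlNodeType.Element Then
                    Select Case xr.Name
                        Case "key_district"
                            ' ReadString consumes the text and stops on the end tag.
                            key_district = xr.ReadString()
                        Case "key_precinctpart"
                            key_precinctpart = xr.ReadString()
                            ' A full record has been read at this point.
                            Console.WriteLine(key_district & ", " & key_precinctpart)
                    End Select
                End If
            Loop
        Finally
            xr.Close()
        End Try
    End Sub
End Module
```

To try it against the real data, the StringReader could be swapped for the
file stream; only the loop matters here.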

thanks

John


