Re: XmlReaders and fragments



Lee Crabtree wrote:
When reading fragments, it seems like XmlReaders try to read too much. I'm working on a file parser for a new file format, and I've run into a problem. The format has an XML fragment for a header, then a (frequently) large amount of binary data beneath. In certain situations, there may be XML fragments further down in the file. An example of the file might look like this:

<header>
<name>some name</name>
<size>1000</size>
<stuff>more data</stuff>
<otherstuff>19.2</otherstuff>
</header>...some huge block of binary data...

The header isn't a fixed size. In writing a little test to try and parse out the header, I ran into what seemed like a really weird decision on the part of the XmlReader. When I try to stream the data through using ReadOuterXml (to get the header for processing later), it would throw an exception regarding invalid characters MUCH further down in the file. In other words, the reader had gotten past the ending element of the fragment, then kept going. ReadOuterXml is just supposed to get the tags and children of the current node, which in this case would have been the node labelled "header".

When positioned on an element, ReadOuterXml() reads everything up to and including the end tag and then positions the reader on the next node. If binary data follows the end tag, the reader has to parse it to find that next node, and you get an error. I am not sure what you expect: ConformanceLevel.Fragment only means there is no requirement to have exactly one root element; it does not mean binary data is allowed.
If you want to consume an element without the reader moving past the end tag, try ReadSubtree(). It gives you a second XmlReader that consumes only that element; once you close it, the main reader is positioned on the end tag, not after it.
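
A minimal sketch of the ReadSubtree() approach follows. The class name HeaderDemo, the helper ReadHeaderXml, and the sample input string are hypothetical and only serve the illustration; the non-XML characters after the end tag stand in for the binary block in the real file.

```csharp
using System;
using System.IO;
using System.Xml;

static class HeaderDemo
{
    // Hypothetical helper: returns the outer XML of the first element in the
    // input and reports where the main reader is left afterwards.
    public static string ReadHeaderXml(TextReader input, out XmlNodeType positionAfter)
    {
        // Fragment conformance lifts the single-root requirement only;
        // it does not make non-XML content legal.
        var settings = new XmlReaderSettings { ConformanceLevel = ConformanceLevel.Fragment };

        using (XmlReader reader = XmlReader.Create(input, settings))
        {
            reader.MoveToContent(); // move to the first element (<header> here)

            var doc = new XmlDocument();
            // The subtree reader exposes only the current element and its children.
            using (XmlReader sub = reader.ReadSubtree())
            {
                doc.Load(sub);
            }

            // After the subtree reader is closed, the main reader is documented
            // to sit on the element's end tag, not on the node after it.
            positionAfter = reader.NodeType;
            return doc.OuterXml;
        }
    }

    static void Main()
    {
        // Assumed sample data: a header followed by characters that are not valid XML.
        string sample = "<header><name>some name</name><size>1000</size></header>\u0001\u0002 binary";

        XmlNodeType position;
        string header = ReadHeaderXml(new StringReader(sample), out position);

        Console.WriteLine(header);
        Console.WriteLine(position);
    }
}
```

This only controls which node the reader is on. XmlReader typically buffers its input, so the position of the underlying stream may already be past the header by the time the element has been read; locating the start of the binary data would need to be handled separately.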


--

Martin Honnen --- MVP XML
http://JavaScript.FAQTs.com/