Handling multiple schemas and large files in XML



Hi

I hope that this is the correct place to post this question.

I'm looking at developing an application which will enable me to import
and process some data that is made available to me as XML.

One complication is that the providers of the data have published two
different schema versions. Whilst effectively describing the same data,
the 2nd schema is a significant refactoring of the first and so is
almost totally different in structure. I also can't rule out the
possibility that they will issue further versions too. I'd ideally like
to be able to handle both of these schemas, and I'd also like to be able
to add support for new versions with the minimum of fuss.
From knowledge of the application domain, I am also fairly sure that the
essential data will be stable across schema versions.

I originally considered defining a class for each schema version and
using the XmlSerializer class to construct the appropriate one from the
XML document. However, this is where another potential issue rears its
head: the XML files are rather large: 50+ MB and over 1 million lines.

I suspect that using the XmlSerializer with documents of this size is
probably not appropriate. Am I correct?

Thankfully, it's not necessary to load the entire document in one go as
the user won't need to visualise *all* the data at once. Instead, they
will home in on a section of the data and drill down for detail in
tree-like fashion. Because of this, the application's internal object
model can represent just the data that the user is interested in.

Bearing this in mind, I could construct the object model by using an
XmlTextReader and analysing XmlTextReader.NodeType. The downside to this
is that, AIUI, I will then have to handle the schema differences manually.
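
To make the idea concrete, here is a rough sketch of what I have in mind.
Everything specific in it is made up for illustration: the Record class,
the RecordScanner helper, and the element and attribute names (record/id/name
for the first schema, item/ref/label for the second) are placeholders, not
the real schemas. It only shows the general shape: stream through the file,
keep just the records of interest, and map each schema's names onto one
internal class by hand.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

// Hypothetical internal object: the stable data both schema versions describe.
public class Record
{
    public string Id;
    public string Name;
}

public static class RecordScanner
{
    // Streams through the XML and returns only the records whose id matches,
    // so the whole document is never held in memory at once.
    public static List<Record> Find(TextReader input, string wantedId)
    {
        List<Record> found = new List<Record>();
        XmlTextReader reader = new XmlTextReader(input);
        try
        {
            while (reader.Read())
            {
                if (reader.NodeType != XmlNodeType.Element)
                    continue;

                // Made-up schema v1: <record id="..." name="..."/>
                if (reader.LocalName == "record")
                    AddIfWanted(found, reader.GetAttribute("id"),
                                reader.GetAttribute("name"), wantedId);
                // Made-up schema v2: <item ref="..." label="..."/>
                else if (reader.LocalName == "item")
                    AddIfWanted(found, reader.GetAttribute("ref"),
                                reader.GetAttribute("label"), wantedId);
            }
        }
        finally
        {
            reader.Close();
        }
        return found;
    }

    // Both schema branches funnel into the same internal class here.
    private static void AddIfWanted(List<Record> found, string id,
                                    string name, string wantedId)
    {
        if (id != wantedId)
            return;
        Record r = new Record();
        r.Id = id;
        r.Name = name;
        found.Add(r);
    }
}
```

Calling RecordScanner.Find(new StreamReader(path), someId) would then give
me just the matching records. The if/else on element names is exactly the
manual schema handling I'd like to avoid, since every new schema version
would mean another branch.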

I'd appreciate it if anyone could suggest better approaches. I'm fairly
new to both .NET and XML so please point out if I'm completely off the
mark here. Any suggestions at all are greatly appreciated.

TIA
MikeB