Re: What is the best way to parse and validate an arbitrary XML document



"Confused XML hacker" <Confused XML hacker@xxxxxxxxxxxxxxxxxxxxxxxxx> wrote
in message news:F7576F6D-6775-4319-A1F8-CD963532AD7B@xxxxxxxxxxxxxxxx
My application needs to be able to parse and validate either a DTD or
schema based document without knowing in advance which form of grammar a
document is using. (New documents presented to my system are schema based
while the older ones are DTD - conversion is not an option as these
documents represent legally binding contracts and they must be processed
as is).

In the .Net 1.1 version of my code I used an XmlValidatingReader instance
configured with ValidationType.Auto which handled both document types. Now
that I am porting the code to 2.0 I am trying to use the new
XmlReader.Create method but these readers must be configured as either DTD
validating or schema validating and cannot do both at once.

I found several examples on the web of nested XmlReaders and have tried
this approach (see following code) but the data in the <!DOCTYPE> node of a
DTD based instance seems to be lost as it passes through the readers. My
application needs the PUBLIC name intact to determine the grammar version.

XmlReaderSettings settings;

// Inner reader: DTD validation over the raw stream.
settings = new XmlReaderSettings ();
settings.XmlResolver = GetResolver ();
settings.ValidationType = ValidationType.DTD;
settings.ProhibitDtd = false;

if (eventHandler != null)
    settings.ValidationEventHandler += eventHandler;

XmlReader inner = XmlReader.Create (stream, settings);

// Outer reader: schema validation layered over the inner reader.
settings = new XmlReaderSettings ();
settings.Schemas = GetSchemaSet ();
settings.ValidationType = ValidationType.Schema;
settings.ProhibitDtd = false;

if (eventHandler != null)
    settings.ValidationEventHandler += eventHandler;

XmlReader reader = XmlReader.Create (inner, settings);

document.Load (reader);
return (document);

Can someone please confirm whether I am using the right approach and
whether the problems with DOCTYPE are bugs, and if so whether there is a
workaround? If not then it looks like I will have to remain with the old
and now 'obsolete' XmlValidatingReader interface which handles this
scenario correctly.

The changes to the XML API in 2.0 seem to forget that there is a
substantial amount of information coded using DTD based grammars that will
be with us for many years. Some of the contracts I process represent 10 or
20 year deals. Not everyone using XML is writing web services where the XML
is simply a transient wire format. Efficient support for legacy documents
and mixed DTD/schema environments is commercially very important.

Could you first attempt it using a schema, and if that fails, using DTD? Or
should I read your post more carefully?
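
For what it's worth, here is a rough sketch of a variation on that idea:
rather than validating twice, peek at the prolog first and pick the
validation type from whether a DOCTYPE is present. This is untested against
your documents and rests on several assumptions: the stream is seekable,
"has a DOCTYPE" really does mean "DTD based" in your corpus, and the class
and method names (GrammarSniffer, TryReadDocType, CreateValidatingReader)
are made up for illustration. It uses DtdProcessing, which exists from .NET
4.0 on; on 2.0 the equivalent setting would be ProhibitDtd = false.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Schema;

// Hypothetical helper, not part of the framework.
public static class GrammarSniffer
{
    // Reads only the prolog with a non-validating reader. Returns true if a
    // DOCTYPE is present; publicId receives its PUBLIC identifier, or null
    // if there is none. Assumes a seekable stream, which is rewound after.
    public static bool TryReadDocType(Stream stream, out string publicId)
    {
        publicId = null;
        long start = stream.Position;

        XmlReaderSettings probe = new XmlReaderSettings();
        probe.DtdProcessing = DtdProcessing.Parse;   // .NET 2.0: ProhibitDtd = false
        probe.XmlResolver = null;                    // do not fetch external DTDs while probing
        probe.ValidationType = ValidationType.None;
        probe.CloseInput = false;                    // keep the stream open for the real pass

        bool found = false;
        using (XmlReader reader = XmlReader.Create(stream, probe))
        {
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.DocumentType)
                {
                    // On a DocumentType node the PUBLIC and SYSTEM literals
                    // are exposed as attributes.
                    publicId = reader.GetAttribute("PUBLIC");
                    found = true;
                    break;
                }
                if (reader.NodeType == XmlNodeType.Element)
                    break;                           // past the prolog, no DOCTYPE
            }
        }

        stream.Position = start;                     // rewind for the validating pass
        return found;
    }

    // Builds a single validating reader whose ValidationType is chosen from
    // the probe result instead of nesting two readers.
    public static XmlReader CreateValidatingReader(
        Stream stream, XmlSchemaSet schemas, XmlResolver resolver,
        ValidationEventHandler handler, out string publicId)
    {
        bool hasDocType = TryReadDocType(stream, out publicId);

        XmlReaderSettings settings = new XmlReaderSettings();
        settings.XmlResolver = resolver;

        if (hasDocType)
        {
            settings.DtdProcessing = DtdProcessing.Parse;
            settings.ValidationType = ValidationType.DTD;
        }
        else
        {
            settings.ValidationType = ValidationType.Schema;
            if (schemas != null)
                settings.Schemas = schemas;
        }

        if (handler != null)
            settings.ValidationEventHandler += handler;

        return XmlReader.Create(stream, settings);
    }
}
```

In your code that would presumably become something like
CreateValidatingReader (stream, GetSchemaSet (), GetResolver (),
eventHandler, out publicId) followed by document.Load (reader). Because the
PUBLIC name is captured during the probe, it should stay available even if
the DOCTYPE node does not survive into the loaded document.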

John

