Re: Reading XML stream using unmanaged c++
- From: Joseph M. Newcomer <newcomer@xxxxxxxxxxxx>
- Date: Sat, 18 Nov 2006 10:01:58 -0500
Actually, XML is a great deal more powerful than its superficial syntax.
Since I've actually built non-tree implementations that read and write XML, including
cross-references (there actually is an XML document describing one way of doing this), I
can say that it is not only possible but easy.
You represent the hyperlinks as symbolic XML values and after reading the tree you make a
pass over the tree resolving the hyperlinks. I actually designed this as an explicit part
of an XML-like system we built in 1977, and its generalization to XML took only a couple
days to fully implement. If I'd used one of the XML packages that support this, it would
have taken hours, but we chose a slightly different set of goals.
<doc:node label="keyword" text="text goes here"/>
<doc:node text="this is some text"/> <doc:link ref="keyword"/> <doc:node text="more text"/>
When we generated the actual text, the node whose link name matched the doc:link ref was
serialized out in the HTML simply by going to that HTML node and serializing its text to
the HTML stream.
This meant that we only needed one definition of anything, and we could include its text
"by reference" anywhere we wanted it. The result was a highly-cross-linked internal XML
structure.
We had to add tests to keep circular references from becoming infinite, so if we hit a
circular reference we created an internal hyperlink (#link) within the HTML stream.
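The by-reference serialization and the circular-reference test described above can be sketched roughly as follows. This is a minimal illustration, not the proprietary code: the Item, Body and Definitions types, the render function, and the "[undefined: ...]" placeholder are all names invented for the sketch, and it assumes the XML has already been read into a label-to-body table. It does no HTML escaping and does not emit the anchor targets that the "#label" links would point at.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical model: a body is a run of literal text and links to labels.
struct Item {
    bool isLink;        // true for a doc:link, false for literal text
    std::string value;  // the text itself, or the label being referenced
};
using Body = std::vector<Item>;
using Definitions = std::map<std::string, Body>;  // label -> defining body

// Appends the HTML for `body` to `out`. `active` holds the labels whose
// expansion is currently in progress, which is what detects a cycle.
void render(const Body& body, const Definitions& defs,
            std::set<std::string>& active, std::string& out) {
    for (const Item& item : body) {
        if (!item.isLink) {
            out += item.value;
            continue;
        }
        const auto def = defs.find(item.value);
        if (def == defs.end()) {
            // Undefined reference: placeholder text chosen for this sketch.
            out += "[undefined: " + item.value + "]";
        } else if (active.count(item.value) != 0) {
            // Circular reference: emit an internal hyperlink instead of recursing.
            out += "<a href=\"#" + item.value + "\">" + item.value + "</a>";
        } else {
            active.insert(item.value);
            render(def->second, defs, active, out);  // include the text by reference
            active.erase(item.value);
        }
    }
}
```

Because a label is only in the active set while its own expansion is in progress, the same definition can still be included many times in different places; only a reference back into an expansion that is still open becomes a link.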
The entire document was about 1500 HTML pages, but it was all automatically generated. One
of the project authors is now working on an open-source XML system for doing this (our
work was proprietary), and the notion of internal non-tree XML is fundamental to the
design. (Don't confuse representation with structure, the mistake the DOM people made).
****
On Sat, 18 Nov 2006 12:11:10 GMT, Daniel James <wastebasket@xxxxxxxxxxxxxxxx> wrote:
In article news:<7o8pl2d94hsk3gnmhhfr65s2gmboh0tt0l@xxxxxxx>, Joseph M.
Newcomer wrote:
The biggest problem with DOM is that it is tree-oriented with each child
having a pointer to its parent.
So is XML, by its very nature.
The pointer-to-parent is the mistake. You don't need it, since it can be derived during
the walk, and it limits the ability to manipulate the internal structure.
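As a rough sketch of what "derived during the walk" can mean: the node below stores only its children, and the walker hands each visit the parent it arrived through. The Node type and the walk function are hypothetical names for illustration, and the sketch assumes the structure is acyclic (a dag), since a plain recursive walk would not terminate on a cycle.

```cpp
#include <functional>
#include <memory>
#include <string>
#include <vector>

// Hypothetical node: it knows its children but stores no parent pointer,
// so the same subtree can sit under several parents.
struct Node {
    std::string name;
    std::vector<std::shared_ptr<Node>> children;
};

// Depth-first walk. `parent` is whichever node this visit came through
// (nullptr for the root), so it is known without being stored in the node.
void walk(const Node& node, const Node* parent,
          const std::function<void(const Node&, const Node*)>& visit) {
    visit(node, parent);
    for (const auto& child : node.children) {
        walk(*child, &node, visit);
    }
}
```

A node shared by two parents is simply visited twice, once with each parent, which is exactly the case a single stored parent pointer cannot describe.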
****
****
This makes it impossible to create interesting structures such as shared
subtrees.
These can't be represented in the structure of XML, either.
Yes, they can be. Take a look at the XLink specification:
http://www.xml.com/pub/a/2000/09/xlink/part2.html
****
****
I recently did a truly massive application in XML where we were building
internal hyperlinks to multiple subtrees; this is not possible in DOM.
How do you represent the internal hyperlinks? I'm not sure what you're doing
here, but it seems that your links must be represented by part of the CDATA
within the XML document -- not by any part of the XML itself?
No CDATA anywhere. Not needed. See the XLink document for the key ideas.
****
****
In that case you're complaining that DOM can't represent something that is
outside the scope of the document structure that it was designed to model ...
no, of course it can't do that. Whyever would you think it should?
See the XLink specification.
****
****
DOM was designed by people who only saw trivial problems. Real problems
can't use it.
DOM was designed to model the structure of an XML document. It does that fairly
well (not perfectly). You seem to want to model some other sort of structure
that you express partly in XML and partly in the data themselves -- as your
structure isn't all expressed by the XML, so it's no wonder that it can't all
be represented in DOM.
No, I just want to not be constrained by the limits of DOM. There is nothing intrinsic in
XML that requires it be tree-structured; that is just a textual representation of
information. The *structure* of the information is not the *textual representation*, and
in fact you can create textual representations of dags and cyclic structures, which I know
because I was doing it 30 years ago. DOM confuses the ideas and says "the map IS the
country". Not so.
The system we did was called the Linear Graph (LG) system. After we used it for a few
years, one of the team, David Lamb, designed IDL (his PhD dissertation). Both LG and IDL
have as their basis the concept that the map is not the country; that the textual
representation is merely a flattening of the intrinsic structure, and the textual
representation contains the necessary hyperlinks so the intrinsic structure can be
re-created by the reader. For example, see Nestor, Newcomer, Giannini & Stone, "IDL: The
Language and its Implementation" (Prentice-Hall, 1990). XML has all the power we need;
DOM artificially restricts this power by confusing textual representation with content.
****
****
There are real problems in this world whose data structures can be modelled in
XML -- they're not uncommon -- and if you can represent the model in XML you
can represent it in DOM. It's only when the structure of the model can't be
completely expressed in XML that DOM starts to creak around the edges.
Not true. I can represent dags and cyclic structures in XML but not in DOM. And I can
express it completely in XML (no CDATA). I know I can do this, because I've done it (as
recently as two years ago). It takes a slightly more sophisticated reader and writer, but
XML does not preclude richer structures being represented. It took two days to code the
better reader/writer. Anything that simple isn't Rocket Science. (All you have to do is
recognize labels and references, and keep a list of unresolved references and resolve them
when the definitions are found, and ultimately handle undefined references--not a big
deal).
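The bookkeeping in that parenthetical can be sketched along these lines. It is only an illustration under assumptions: the Node and Resolver types and their member names are invented here, the XML parsing itself is left out (the reader is assumed to call define for each labelled element and refer for each reference as it encounters them), and the caller is assumed to keep every Node alive at a stable address.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical node: a link node gets `target` filled in once resolved.
struct Node {
    std::string label;
    std::string text;
    Node* target = nullptr;  // non-owning pointer to the referenced node
};

class Resolver {
public:
    // The reader found a labelled definition: record it and patch every
    // reference that was waiting for this label.
    void define(Node& node) {
        defined_[node.label] = &node;
        const auto range = pending_.equal_range(node.label);
        for (auto it = range.first; it != range.second; ++it) {
            it->second->target = &node;
        }
        pending_.erase(range.first, range.second);
    }

    // The reader found a reference: resolve it now if the label is known,
    // otherwise remember it as unresolved.
    void refer(Node& link, const std::string& label) {
        const auto it = defined_.find(label);
        if (it != defined_.end()) {
            link.target = it->second;
        } else {
            pending_.emplace(label, &link);
        }
    }

    // After the whole document is read, whatever is still pending is undefined.
    std::vector<std::string> undefinedLabels() const {
        std::vector<std::string> labels;
        for (const auto& entry : pending_) {
            if (labels.empty() || labels.back() != entry.first) {
                labels.push_back(entry.first);
            }
        }
        return labels;
    }

private:
    std::map<std::string, Node*> defined_;       // label -> defining node
    std::multimap<std::string, Node*> pending_;  // label -> waiting links
};
```

Forward and backward references are handled the same way, and since targets are plain pointers rather than owned children, the resolved structure can be a dag or contain cycles.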
****
Cheers,
Daniel.
Joseph M. Newcomer [MVP]
email: newcomer@xxxxxxxxxxxx
Web: http://www.flounder.com
MVP Tips: http://www.flounder.com/mvp_tips.htm