Re: Can't read CString after serialization
- From: Joseph M. Newcomer <newcomer@xxxxxxxxxxxx>
- Date: Tue, 11 Mar 2008 20:12:16 -0500
Remember that the machines of that era were 1000x slower than a low-end consumer machine
and had 18-bit or 24-bit addresses; "huge" machines (like the one I used) had 4MB of memory (I
tell people "The first 4MB module I purchased required a fork lift to install", which is
true, but not quite accurate; it required an "extended fork" fork lift to install it). We
actually had an XML-like notation we used (I think I may have invented XML in 1974 while
working on my dissertation, simply because I needed a textual representation of complex
graphs). When we started the PQCC (Production Quality Compiler-Compiler) project at CMU
in 1977, I insisted we have a textual representation of the data structures. Steve Hobbs
designed what would have been the DTD and wrote the first parser. Parsing it was slow
enough that we had a tagged-binary representation we could also use. David Dill (noted
for his work on Verified Voting) wrote the tagged-binary I/O module for us, around 1978.
The tagged binary version was smaller than the textual version, and read in much faster.
Eventually I got to the point where the binary reader was blindingly fast compared to the
text reader, about three iterations from our first implementation.
We used it to represent compiler intermediate state. Our parse trees, flow graphs,
generated-code lists, etc. were all represented in this format. I wrote the second
generation tooling that did the equivalent of parsing a DTD and creating a table-driven
interpreter that would convert the external form to internal form, as well as truly pretty
pictures of the data structures. I later wrote the same package for IDL when we were at
Tartan Laboratories (a compiler company in Pittsburgh). John Nestor and I wrote the
interface to the implementation language we were using. David Lamb and John Nestor took
the work we had done on the "LG" (Linearized Graph) package and invented the IDL language,
which gave us platform-independent representations of very rich data structures (it was
David's PhD dissertation). We could write out the IDL text from a compiler on one machine
and read it into a code generator hosted on another machine. John Nestor and Paola
Gianinni wrote the formal (denotational semantic) description of IDL, which supported
multiple inheritance and subclassing. Don Stone helped us implement the open-source
portable version described in our book.
joe
On Tue, 11 Mar 2008 11:13:23 -0700, "Tom Serface" <tom.nospam@xxxxxxxxxxxxx> wrote:
That is very cool and may prove (easily) to be more compact than XML. Of
course XML is slower and bigger, but much easier to read in Notepad so easy
to debug and update. I guess it all depends on the size of the data. If
the data is several megabytes or grows continuously, a better "database"
type file format is warranted, but if it's something like configuration data
or start up data or last state data then something extremely readable is
usually a great way to implement it imo.
Tom
"Joseph M. Newcomer" <newcomer@xxxxxxxxxxxx> wrote in message
news:bpedt3195v7nk2c453f4vc8egsn3h5riop@xxxxxxxxxx
Prior to XML, we loved "tagged binary" format. In tagged-binary, you would write out a
block which had a <type,length> pair, then each field had a <field-tag, length> pair. The
nice thing was that it was backward-compatible without much effort, for certain kinds of
backward-compatibility. Two approaches could be used (and the same applies to XML): throw
away any tag you don't recognize (loses information), silently (pop up a warning that a
newer version file is being read; upon save, pop up a warning that information has been
discarded, do you want to overwrite, but no popups during read). Alternatively, keep a
list of the unknown tags and write out verbatim on output (might work, depends on what
info is in the tags). In the latter case, a header bit says that the file had been
processed by an earlier version, and the newer version knew whether or not it could trust
certain values, or if they had to be recomputed, discarded, etc.
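The second approach (keep unknown tags and write them back verbatim) can be sketched
roughly as follows. This is only an illustration, not the original PQCC/IDL code: the
Person record, the TAG_NAME and TAG_AGE values, the 2-byte tag and 4-byte length, and the
use of host byte order are all assumptions made for the example, and the outer
<type,length> block header is left out for brevity.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <utility>
#include <vector>

using Buffer = std::vector<std::uint8_t>;

// Hypothetical field tags for this sketch only.
enum : std::uint16_t { TAG_NAME = 1, TAG_AGE = 2 };

struct Person {
    std::string name;
    std::uint32_t age = 0;
    // Fields this reader version does not understand, kept verbatim.
    std::vector<std::pair<std::uint16_t, Buffer>> unknown;
};

// Append one <field-tag, length> header followed by the raw payload bytes.
// Assumption: host byte order; a portable format would fix the byte order.
void writeField(Buffer& out, std::uint16_t tag, const void* data, std::uint32_t len) {
    std::uint8_t hdr[6];
    std::memcpy(hdr, &tag, 2);
    std::memcpy(hdr + 2, &len, 4);
    out.insert(out.end(), hdr, hdr + 6);
    const std::uint8_t* p = static_cast<const std::uint8_t*>(data);
    out.insert(out.end(), p, p + len);
}

Buffer writePerson(const Person& p) {
    Buffer out;
    writeField(out, TAG_NAME, p.name.data(), static_cast<std::uint32_t>(p.name.size()));
    writeField(out, TAG_AGE, &p.age, 4);
    // Unknown fields are written back exactly as they were read.
    for (const auto& u : p.unknown)
        writeField(out, u.first, u.second.data(), static_cast<std::uint32_t>(u.second.size()));
    return out;
}

// Returns false if the buffer is truncated or a known field has a bad length.
bool readPerson(const Buffer& in, Person& p) {
    std::size_t pos = 0;
    while (pos < in.size()) {
        if (in.size() - pos < 6) return false;            // incomplete header
        std::uint16_t tag;
        std::uint32_t len;
        std::memcpy(&tag, in.data() + pos, 2);
        std::memcpy(&len, in.data() + pos + 2, 4);
        pos += 6;
        if (in.size() - pos < len) return false;          // incomplete payload
        const std::uint8_t* payload = in.data() + pos;
        switch (tag) {
        case TAG_NAME:
            p.name.assign(reinterpret_cast<const char*>(payload), len);
            break;
        case TAG_AGE:
            if (len != 4) return false;
            std::memcpy(&p.age, payload, 4);
            break;
        default:
            // The length lets us step over a field we do not recognize.
            p.unknown.emplace_back(tag, Buffer(payload, payload + len));
            break;
        }
        pos += len;
    }
    return true;
}
```

Because every field carries its own length, an older reader can step over a field added
by a newer writer without knowing its layout. Whether it is safe to write that field back
unchanged still depends on what the field means, as noted above.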
The nice thing about tagged binary was that you could easily write a simple program to
parse the file and show it (for example, a columnar display of tag id, tag value in
decimal and hex, etc.). We could write out complex structures because we used a technique
similar to based pointers, writing pointers out as relative offsets. If the target had
not been written out, a set of fixup blocks were appended to the end of the file and what
was stored was a flagged value that said "I'm a fixup reference" or "I'm a real offset".
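A minimal sketch of the offset-plus-fixup idea, under heavy assumptions: the Node, Image
and Loaded types, the FIXUP_FLAG and NULL_REF values, and the choice to measure offsets
in whole records from the start of the image are all invented for this example and are
not the layout the LG or IDL packages actually used.

```cpp
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// In-memory node with a real pointer (hypothetical example type).
struct Node {
    std::int32_t value;
    const Node* next;
};

constexpr std::uint32_t FIXUP_FLAG = 0x80000000u;  // high bit set: "I'm a fixup reference"
constexpr std::uint32_t NULL_REF   = 0x7FFFFFFFu;  // reserved value for a null pointer

struct Image {
    std::vector<std::uint32_t> words;   // one record per node: value, then reference
    std::vector<std::uint32_t> fixups;  // appended at the end: fixup index -> record offset
};

Image writeNodes(const std::vector<const Node*>& order) {
    Image img;
    std::unordered_map<const Node*, std::uint32_t> written;  // node -> record offset
    std::unordered_map<const Node*, std::uint32_t> pending;  // node -> fixup index
    for (const Node* n : order) {
        std::uint32_t ref = NULL_REF;
        if (n->next != nullptr) {
            auto it = written.find(n->next);
            if (it != written.end()) {
                ref = it->second;                            // "I'm a real offset"
            } else {
                // Target not written yet: store a flagged fixup index instead.
                auto ins = pending.emplace(n->next, static_cast<std::uint32_t>(pending.size()));
                ref = FIXUP_FLAG | ins.first->second;
            }
        }
        written[n] = static_cast<std::uint32_t>(img.words.size() / 2);
        img.words.push_back(static_cast<std::uint32_t>(n->value));
        img.words.push_back(ref);
    }
    // Emit the fixup table once every node has a known offset.
    img.fixups.assign(pending.size(), NULL_REF);
    for (const auto& kv : pending) {
        auto it = written.find(kv.first);
        if (it != written.end()) img.fixups[kv.second] = it->second;
    }
    return img;
}

// Loaded form: the pointer becomes an index into the result, or -1 for null.
struct Loaded {
    std::int32_t value;
    std::int32_t next;
};

std::vector<Loaded> readNodes(const Image& img) {
    std::vector<Loaded> out;
    for (std::size_t i = 0; i + 1 < img.words.size(); i += 2) {
        std::uint32_t ref = img.words[i + 1];
        if (ref & FIXUP_FLAG) {
            const std::uint32_t idx = ref & ~FIXUP_FLAG;
            ref = idx < img.fixups.size() ? img.fixups[idx] : NULL_REF;
        }
        const std::int32_t next = (ref == NULL_REF) ? -1 : static_cast<std::int32_t>(ref);
        out.push_back(Loaded{static_cast<std::int32_t>(img.words[i]), next});
    }
    return out;
}
```

A backward reference can be written directly because its target already has an offset;
only forward references (including cycles) need an entry in the fixup table, which is
why the table can simply be appended after the last node.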
We built huge systems using these techniques. I specified this technique back in 1977,
and it was ultimately used in our IDL system [Nestor, Newcomer, et al., "IDL: The Language
and its Implementation", Prentice-Hall, 1989].
joe
On Tue, 11 Mar 2008 08:09:50 -0700, "Tom Serface"
<tom.nospam@xxxxxxxxxxxxx> wrote:
One of the biggest problems is similar to what OP is experiencing. It is just too hard to
read the resulting file so debugging it is very difficult. If you don't want to use XML
(perhaps too much overhead) even doing a binary file with a format you know is better
since you can open the file in binary mode in VS and check it out. The serialized file
structure was always difficult for me to use especially when adding new items for future
releases. OK, I'll stop preaching now.
Tom
"Giovanni Dicanio" <giovanni.dicanio@xxxxxxxxxxx> wrote in message
news:ObKaEr1gIHA.4396@xxxxxxxxxxxxxxxxxxxxxxx
"Tom Serface" <tom.nospam@xxxxxxxxxxxxx> ha scritto nel messaggio
news:4C5C98CB-BE7E-4A22-9B81-2085E04C6046@xxxxxxxxxxxxxxxx
These kinds of problems are why I typically don't use serializing any[...]
more. I like to use XML
Hi Tom,
I do agree with you.
The CArchive-based serialization mechanism of MFC template containers is not good IMHO;
for example, I don't much like the fact that we must define template specializations of
global helper functions for the particular type to serialize.
G
email: newcomer@xxxxxxxxxxxx
Web: http://www.flounder.com
MVP Tips: http://www.flounder.com/mvp_tips.htm