RE: Object serialization and NetworkStream - extraneous characters in output
From: Steven Cheng[MSFT] (v-schang_at_online.microsoft.com)
Date: 12/13/04
- Next message: Travis: "Re: Unsafe code: Converting "byte *" to "[] byte""
- Previous message: Etienne Fortin: "Localization of Exception messages"
- In reply to: jwallison: "Object serialization and NetworkStream - extraneous characters in output"
- Next in thread: jwallison: "Re: Object serialization and NetworkStream - extraneous characters in output"
- Reply: jwallison: "Re: Object serialization and NetworkStream - extraneous characters in output"
- Messages sorted by: [ date ] [ thread ]
Date: Mon, 13 Dec 2004 04:16:57 GMT
Hi Jim,
Thanks for your posting. From your description, you're using the .NET
XmlSerializer to serialize a class instance out to a NetworkStream, and on
the other side, when you retrieve the stream and read the XML content
out, you find an additional header "o;?" at the beginning of the XML
stream, yes?
As for the problem you mentioned, I think it is most likely an encoding
problem. A Unicode text stream can start with a header, the byte order
mark (BOM), which indicates the stream's encoding type. Those three extra
characters are the header for UTF-8; with other encodings such as UTF-16
you will get a different value (an ASCII stream has no such header), and
"o;?" is what the header looks like when it is decoded with the wrong
encoding. To verify this, you can open a UTF-8 text file in a hex editor
such as UltraEdit and you'll find the header. It is composed of three
bytes, 239, 187, 191 (EF BB BF). None of them is an ASCII character (all
are above 127), so they show up as junk such as "ï»¿" if you print them
with a single-byte encoding. For example:
byte[] bytes = {239,187,191};
// ISO-8859-1 maps each byte to one character, so this shows "ï»¿";
// Encoding.ASCII would show "???" because all three bytes are above 127
MessageBox.Show(System.Text.Encoding.GetEncoding("iso-8859-1").GetString(bytes));
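If it helps to see the header values side by side, here is a minimal sketch that asks each encoding for its header through Encoding.GetPreamble(). The class name BomTable and the helper PreambleHex are only illustrative names, and it prints to the console instead of a MessageBox so it can run as a plain console program:

```csharp
using System;
using System.Text;

static class BomTable
{
    // Formats an encoding's preamble (its byte order mark) as hex, e.g. "EF-BB-BF".
    public static string PreambleHex(Encoding encoding)
    {
        byte[] preamble = encoding.GetPreamble();
        return preamble.Length == 0 ? "(none)" : BitConverter.ToString(preamble);
    }

    static void Main()
    {
        Console.WriteLine("UTF-8:     " + PreambleHex(Encoding.UTF8));             // EF-BB-BF
        Console.WriteLine("UTF-16 LE: " + PreambleHex(Encoding.Unicode));          // FF-FE
        Console.WriteLine("UTF-16 BE: " + PreambleHex(Encoding.BigEndianUnicode)); // FE-FF
        Console.WriteLine("ASCII:     " + PreambleHex(Encoding.ASCII));            // (none)
    }
}
```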
So, when you use XmlSerializer to serialize an object into a stream
through a writer that uses a Unicode encoding (UTF-8 for instance), the
header is added as the first three bytes. However, if you read the XML
back from the stream as UTF-8, you won't see these three bytes: a UTF-8
reader such as StreamReader detects the header and skips it, and
Encoding.UTF8.GetString turns it into the single invisible character
U+FEFF, so you get the text behind the header. Here is a simple code
snippet to show this:
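===============================
byte[] buffer = null;
XmlSerializer serializer = new XmlSerializer(typeof(userInfo));
// userInfo is a simple class with public userName, age and email members
userInfo ui = new userInfo();
ui.userName = "steven cheng";
ui.age = 20;
ui.email = "steven@microsoft.com";
MemoryStream ms = new MemoryStream();
// Encoding.UTF8 makes the writer emit the three-byte header first
StreamWriter sw = new StreamWriter(ms, System.Text.Encoding.UTF8);
serializer.Serialize(sw, ui);
sw.Flush(); // make sure everything has reached the MemoryStream
// ToArray returns only the bytes written; GetBuffer would also return
// the unused tail of the internal buffer
buffer = ms.ToArray();
// shows junk ("???") in front of the xml because ASCII is the wrong
// encoding for these bytes
MessageBox.Show(System.Text.Encoding.ASCII.GetString(buffer));
// won't display the junk: with UTF-8 (the correct encoding) the header
// decodes to the single invisible character U+FEFF
MessageBox.Show(System.Text.Encoding.UTF8.GetString(buffer));
==================================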
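If the receiving side cannot be changed, another option is not to send the header at all. As a rough sketch (not tested against your actual NetworkStream setup), the UTF8Encoding constructor takes a flag that controls whether the header is emitted, and a StreamWriter built on new UTF8Encoding(false) writes plain UTF-8 without the three bytes. The class name BomWriteDemo and the helper Encode are illustrative names only:

```csharp
using System;
using System.IO;
using System.Text;

static class BomWriteDemo
{
    // Writes text to memory through a StreamWriter and returns exactly the bytes written.
    public static byte[] Encode(string text, Encoding encoding)
    {
        using (MemoryStream ms = new MemoryStream())
        {
            using (StreamWriter sw = new StreamWriter(ms, encoding))
            {
                sw.Write(text);
            } // disposing the writer flushes it
            return ms.ToArray(); // ToArray still works after the stream is closed
        }
    }

    static void Main()
    {
        byte[] withBom = Encode("<a/>", Encoding.UTF8);
        byte[] withoutBom = Encode("<a/>", new UTF8Encoding(false));
        Console.WriteLine(withBom.Length);    // 7: 3 header bytes + 4 bytes of text
        Console.WriteLine(withoutBom.Length); // 4: text only
    }
}
```

The same writer can be handed to serializer.Serialize in place of the one created with Encoding.UTF8.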
So, if the problem occurs in the Java client that receives this stream, I
suggest you check the Java code to see whether it reads the stream and
converts the bytes to a string using the correct encoding type (UTF-8). I
suspect that it is using a default non-UTF-8 encoding to read the bytes,
so that the "o;?" comes out.
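To make the receiving side concrete, here is a small sketch of the two usual ways to get rid of the header when decoding. It is written in C# to match the snippet above; the manual check works on raw bytes only, so the same three comparisons should carry over to the Java client. BomReadDemo, DecodeWithReader and DecodeManually are illustrative names, not part of any library:

```csharp
using System;
using System.IO;
using System.Text;

static class BomReadDemo
{
    // Decodes UTF-8 bytes through a StreamReader, which skips a leading UTF-8 header.
    public static string DecodeWithReader(byte[] data)
    {
        using (StreamReader reader = new StreamReader(new MemoryStream(data), Encoding.UTF8))
        {
            return reader.ReadToEnd();
        }
    }

    // Decodes UTF-8 bytes by hand: skip EF BB BF if present, then decode the rest.
    public static string DecodeManually(byte[] data)
    {
        int offset = 0;
        if (data.Length >= 3 && data[0] == 0xEF && data[1] == 0xBB && data[2] == 0xBF)
        {
            offset = 3;
        }
        return Encoding.UTF8.GetString(data, offset, data.Length - offset);
    }

    static void Main()
    {
        byte[] data = { 0xEF, 0xBB, 0xBF, (byte)'<', (byte)'a', (byte)'/', (byte)'>' };
        Console.WriteLine(DecodeWithReader(data) == "<a/>");     // True
        Console.WriteLine(DecodeManually(data) == "<a/>");       // True
        Console.WriteLine(Encoding.UTF8.GetString(data).Length); // 5: U+FEFF plus 4 characters
    }
}
```

The last line shows why decoding the raw bytes directly is not quite the same as using a reader: the header survives as one invisible character at the start of the string, which some XML parsers reject.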
Please have a look at the points above. If anything is unclear, please
feel free to post here.
HTH.
Regards,
Steven Cheng
Microsoft Online Support
Get Secure! www.microsoft.com/security
(This posting is provided "AS IS", with no warranties, and confers no
rights.)