RE: Object serialization and NetworkStream - extraneous characters in output

From: Steven Cheng[MSFT] (v-schang_at_online.microsoft.com)
Date: 12/13/04


Date: Mon, 13 Dec 2004 04:16:57 GMT

Hi Jim,

Thanks for your posting. From your description, you're using .NET's
XmlSerializer to serialize a class instance to a NetworkStream, and on the
other side, when you retrieve the stream and read the XML content out, you
find an additional header "o;?" at the beginning of the XML stream, yes?

As for the problem you mentioned, I think it is most likely an encoding
problem. A Unicode text stream can start with a header (the byte order
mark, or BOM) that indicates the stream's encoding. "o;?" is how the UTF-8
header shows up when it is decoded incorrectly; other encodings such as
UTF-16 have a different header, and an ASCII stream has no such header.
To verify this, you can open a Unicode (UTF-8) text file in a hex editor
such as UltraEdit and you'll find the header: it is composed of the three
bytes 239, 187, 191 (0xEF, 0xBB, 0xBF). None of them is a valid 7-bit
ASCII value, so they show up as stray characters (for example "o;?", "???"
or "ï»¿", depending on the decoder) if you print them with the wrong
encoding. For example:

byte[] bytes = {239,187,191};
MessageBox.Show(System.Text.Encoding.ASCII.GetString(bytes));

So, when you use XmlSerializer to serialize an object into a stream
through a writer that uses a Unicode encoding (UTF-8 for instance), the
header is written as the first three bytes. If you read the XML back from
the stream with the UTF-8 encoding, you won't see these three bytes as
stray characters: a UTF-8 reader such as StreamReader detects and skips
the header, and Encoding.UTF8.GetString decodes it to the single invisible
character U+FEFF. Here is a simple code snippet to show this:

===============================
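// Needs: System.IO, System.Text, System.Windows.Forms,
// System.Xml.Serialization
byte[] buffer = null;

XmlSerializer serializer = new XmlSerializer(typeof(userInfo));

userInfo ui = new userInfo();
ui.userName = "steven cheng";
ui.age = 20;
ui.email = "steven@microsoft.com";

MemoryStream ms = new MemoryStream();

// Encoding.UTF8 makes the writer emit the 3-byte header (BOM) first
StreamWriter sw = new StreamWriter(ms, System.Text.Encoding.UTF8);

serializer.Serialize(sw, ui);
sw.Flush(); // make sure everything has reached the MemoryStream

// ToArray() returns only the bytes written; GetBuffer() would also
// return the unused part of the internal buffer
buffer = ms.ToArray();

// shows stray characters in front of the xml, because ASCII is the wrong
// encoding for the three header bytes
MessageBox.Show(System.Text.Encoding.ASCII.GetString(buffer));

// no visible stray characters, since UTF-8 (the correct encoding) decodes
// the header as the single invisible character U+FEFF
MessageBox.Show(System.Text.Encoding.UTF8.GetString(buffer));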
==================================

So, if the problem occurs in your Java client that receives this stream,
I suggest you check the Java code to see whether it reads the stream and
converts the bytes to a string using the correct encoding (UTF-8). I
suspect it is using a default single-byte encoding such as ASCII to read
the bytes, so that the "o;?" comes out.
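
As a rough sketch of what the Java side could look like (this is an
assumption on my part, not your actual client code; the class name
BomAwareReader and the method readUtf8 are made up for illustration): wrap
the incoming stream in an InputStreamReader with an explicit UTF-8 charset.
Note that Java's UTF-8 decoder does not drop the header by itself; it hands
it back as the character U+FEFF, so the sketch removes it by hand.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

class BomAwareReader {
    // Reads the whole stream as UTF-8 text and drops a leading BOM (U+FEFF), if any.
    static String readUtf8(InputStream in) throws IOException {
        // Name the charset explicitly instead of relying on the platform default.
        Reader reader = new InputStreamReader(in, StandardCharsets.UTF_8);
        StringBuilder sb = new StringBuilder();
        char[] chunk = new char[4096];
        int n;
        while ((n = reader.read(chunk)) != -1) {
            sb.append(chunk, 0, n);
        }
        // The decoder keeps the BOM as the first character, so strip it here.
        if (sb.length() > 0 && sb.charAt(0) == '\uFEFF') {
            sb.deleteCharAt(0);
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the bytes received from the socket: BOM followed by XML.
        byte[] received = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, '<', 'a', '/', '>'};
        System.out.println(readUtf8(new ByteArrayInputStream(received))); // prints <a/>
    }
}
```

With a real socket you would pass socket.getInputStream() instead of the
ByteArrayInputStream. Keep in mind that this loop reads until the sender
closes the connection, so a protocol that keeps the socket open needs its
own way of marking the end of each message.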

Please have a look at the above. If anything is unclear, please feel free
to post here.
HTH.

Regards,

Steven Cheng
Microsoft Online Support

Get Secure! www.microsoft.com/security
(This posting is provided "AS IS", with no warranties, and confers no
rights.)


