Re: Non-ascii characters in VS.NET service




"Jon Skeet [C# MVP]" <skeet@xxxxxxxxx> wrote in message
news:MPG.203630616d52d59b98d83c@xxxxxxxxxxxxxxxxxxxxxxx
Mike Schilling <mscottschilling@xxxxxxxxxxx> wrote:
I've created a simple .NET 1.1 web service using VS.NET 2003: it has one
method that takes a string parameter. It iterates through the input string,
turning each character into hex and appending it to an output string, and
returns the result.

How is it turning the character into hex?

ret += "0x" + ((int)s[i]).ToString("X");


I now send this service SOAP messages containing non-ASCII characters in the
field that becomes the input string. Each SOAP message has an XML header
that correctly describes the format of the non-ASCII characters. (I've
tried both iso-8859-1 and utf-8).

What do you mean by "an XML header"? It should just be in the XML
declaration.

Exactly. Each SOAP message specifies the correct encoding in its XML
declaration, as shown below.


For some reason, each XML character that's non-ASCII has been turned into a
question mark "?" in the input string. Actually, a UTF-8 character that
contains two bytes becomes two question marks.

That suggests that whatever's producing the XML file is wrong, *or*
that you're looking at the XML in an inappropriate editor. How are you
looking at the XML?

<?xml version="1.0" encoding="iso-8859-1"?>
<soap:Envelope xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
               xmlns:xsd="http://www.w3.org/2001/XMLSchema"
               xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <ToHex xmlns="http://tempuri.org/">
      <s>aéìæf</s>
    </ToHex>
  </soap:Body>
</soap:Envelope>

NNTP is likely to garble the non-ASCII characters, but in octal the bytes
of the string inside the <s> tags are

141 351 354 346 146

verified by using od -b.

In iso-8859-1, these are respectively

a, e with an acute accent, i with a grave accent, ae, f
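
The byte values above can be cross-checked with a short sketch. This is an illustration, not part of the service: the class name DecodeCheck and the ToHex helper are made up here, and the helper only mirrors the formatting line quoted earlier. It decodes the five bytes as iso-8859-1 and, for comparison, as ASCII. What the ASCII decode does with bytes above 0x7F may depend on the framework version, so no output is claimed for that line.

```csharp
using System;
using System.Text;

class DecodeCheck
{
    // Hypothetical helper: same per-character formatting as the
    // ToHex line quoted earlier in the thread.
    static string ToHex(string s)
    {
        StringBuilder ret = new StringBuilder();
        for (int i = 0; i < s.Length; i++)
        {
            ret.Append("0x").Append(((int)s[i]).ToString("X"));
        }
        return ret.ToString();
    }

    static void Main()
    {
        // The five bytes reported by od -b (octal 141 351 354 346 146).
        byte[] raw = { 0x61, 0xE9, 0xEC, 0xE6, 0x66 };

        // In iso-8859-1 every byte maps to the code point of the same value.
        string latin1 = Encoding.GetEncoding("iso-8859-1").GetString(raw);
        Console.WriteLine(ToHex(latin1)); // 0x610xE90xEC0xE60x66

        // For comparison: the same bytes run through an ASCII decoder.
        // Bytes above 0x7F are not valid ASCII, so the result shows
        // what a wrong-encoding decode would hand to the method.
        string ascii = Encoding.ASCII.GetString(raw);
        Console.WriteLine(ToHex(ascii));
    }
}
```

If the second line shows the same value for every non-ASCII byte, that would be consistent with the request body being decoded with the wrong encoding somewhere before it reaches the method, though it does not prove where that happens.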

And I can parse the same file locally and observe that it's correct (e.g.
with the program below). It only acts oddly when processed by the web
service.

using System;
using System.Xml;

namespace XMLParser
{
    class ParseXML
    {
        static void Main(string[] args)
        {
            // Load the same SOAP message from disk; XmlDocument honours
            // the encoding named in the XML declaration.
            XmlDocument doc = new XmlDocument();
            doc.Load("c:\\java\\toHex.xml");
            dumpStrings(doc);
            Console.WriteLine("<Done>");
        }

        // Recursively print the value of every character-data node.
        private static void dumpStrings(XmlNode node)
        {
            if (node is XmlCharacterData)
            {
                Console.Out.WriteLine(node.Value);
            }
            else
            {
                for (XmlNode child = node.FirstChild;
                     child != null;
                     child = child.NextSibling)
                {
                    dumpStrings(child);
                }
            }
        }
    }
}


