Re: Unicode to ASCII string conversion
From: Jay B. Harlow [MVP - Outlook] (Jay_Harlow_MVP_at_msn.com)
Date: 09/14/04
- Next message: Jay B. Harlow [MVP - Outlook]: "Re: Decompiler.NET reverse engineers your CLS compliant code"
- Previous message: Jay B. Harlow [MVP - Outlook]: "Re: Decompiler.NET reverse engineers your CLS compliant code"
- In reply to: Cor Ligthert: "Re: Unicode to ASCII string conversion"
- Next in thread: Jay B. Harlow [MVP - Outlook]: "Re: Unicode to ASCII string conversion"
- Messages sorted by: [ date ] [ thread ]
Date: Tue, 14 Sep 2004 15:52:45 -0500
Cor,
Read my post. ;-) I only discussed reading & writing strings to ASCII, ANSI,
and UTF8 files (7 & 8 bit encodings).
You are correct: System.String & System.Char are UTF-16 (16-bit Unicode).
Files, however, can be ANSI, ASCII, UTF7, UTF8, EBCDIC, UTF16, and many other
encodings.
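A minimal sketch of that distinction, assuming a plain console module (the
module and variable names are only illustrative): the encoding applies when
the string is turned into bytes, and whatever comes back from the bytes is an
ordinary UTF-16 String again.

```vb
Imports System
Imports System.Text

Module EncodingSketch
    Sub Main()
        ' "abcdëfg", built with ChrW so the sample does not depend on how
        ' the source file itself was saved.
        Dim original As String = "abcd" & ChrW(&HEB) & "fg"

        ' String -> bytes: this is the only place an encoding applies.
        Dim asciiBytes As Byte() = Encoding.ASCII.GetBytes(original)
        Dim utf8Bytes As Byte() = Encoding.UTF8.GetBytes(original)

        Console.WriteLine(asciiBytes.Length)     ' 7 (one byte per character)
        Console.WriteLine(utf8Bytes.Length)      ' 8 (ë takes two bytes in UTF8)

        ' bytes -> String: the result is a UTF-16 String again.
        Dim fromAscii As String = Encoding.ASCII.GetString(asciiBytes)
        Dim fromUtf8 As String = Encoding.UTF8.GetString(utf8Bytes)

        Console.WriteLine(fromAscii = original)  ' False (the ë did not survive)
        Console.WriteLine(fromUtf8 = original)   ' True
    End Sub
End Module
```

The same applies when a StreamWriter or StreamReader does the conversion: the
encoding describes the bytes in the file, not the String in memory.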
FWIW: VS.NET 2005 (.NET 2.0, aka Whidbey, due out in 2005) appears to
support UTF-32 encoding for files.
http://msdn2.microsoft.com/library/ts575t62.aspx
Hope this helps
Jay
"Cor Ligthert" <notfirstname@planet.nl> wrote in message
news:%23DvHNymmEHA.3968@TK2MSFTNGP11.phx.gbl...
> Jay,
>
> Because of Ger's answer, I have now become curious. I did not see it in
> your message, but what is the solution? Ger said he wanted a straight
> string to string conversion and explicitly no byte array, yet now I
> understand that he can convert Unicode to an 8-bit ANSI String in VB.NET?
> (And I am not talking about writing a file with 8-bit chars by decoding
> the char.)
>
> I showed in this thread, with a link to an MSDN page, that a String always
> contains 16-bit Chars.
>
> Is that documentation wrong, or do I not understand it, or is it maybe
> something completely different?
>
> Cor
>
>
>
>
> "Jay B. Harlow [MVP - Outlook]" <Jay_Harlow_MVP@msn.com> wrote in message
> news:%23RwyVzlmEHA.3328@TK2MSFTNGP10.phx.gbl...
>> Ger,
>> > Ah, now I think I get the idea. So when I convert a (Unicode) string
>> > into an ascii byte array, and then the byte array back into a string,
>> > I still have Unicode, right?
>> Correct, just remember that you will lose some characters going to &
>> from ASCII.
>>
>> > So that is of no use when you want to write ASCII to a
>> > filestream.
>> If you need an ASCII file, then use an ASCII encoding. It really depends
>> on what is going to read the file again.
>>
>> I would recommend going with an ANSI encoding (see below) or UTF8
>> encoding. With ASCII you will lose all extended characters (ASCII is a
>> 7-bit encoding); with ANSI you will lose characters that are outside of
>> your regional ANSI code page. UTF8 preserves all Unicode characters. I
>> would recommend ANSI encoding if the file is going to be opened by casual
>> users in Notepad. I would recommend UTF8 if full Unicode support is
>> required. ANSI & UTF8 are both 8-bit encodings.
>>
>>
>> > Is the code below then writing ASCII output to my filestream?
>>
>> Yes, that code is writing ASCII, as you included the ASCII encoding on
>> the StreamWriter constructor.
>>
>> The text file itself will contain ASCII characters; when you subsequently
>> open that text stream and read it (with a StreamReader) it will be
>> converted back to Unicode strings. When reading the file back, try to use
>> the same encoding as was written. For example, if you wrote ANSI, then use
>> ANSI to read; if you wrote UTF8, then use UTF8 to read. This matters
>> because ANSI & UTF8 encode characters 128 to 255 differently. Remember
>> that Encoding.UTF8 is used on the StreamWriter if you do not give one. If
>> you are reading text files created by Notepad, then you want
>> Encoding.Default.
>>
>> I would recommend:
>>
>> > Dim wOutput As New StreamWriter(fsOutput, System.Text.Encoding.Default)
>>
>> This will write the file in your current ANSI code page, as defined by
>> the regional settings in Windows Control Panel, which preserves the
>> extended characters that code page supports.
>>
>> Remember that ANSI is an 8-bit encoding that is dependent on region (code
>> page), while ASCII is a 7-bit encoding. ASCII does not support extended
>> characters such as ë; such a character will be converted into either a
>> plain e or a ?.
>>
>> Hope this helps
>> Jay
>>
>> "Ger" <ger.rietman@rathernospam.sailsoft.nl> wrote in message
>> news:uU3WK3kmEHA.2772@tk2msftngp13.phx.gbl...
>> > Ah, now I think I get the idea. So when I convert a (unicode) string
>> > into an ascii byte array, and then the byte array back into a string,
>> > I still have Unicode, right? So that is of no use when you want to
>> > write ASCII to a filestream.
>> >
>> > Is the code below then writing ASCII output to my filestream?
>> >
>> > Dim UnicodeString As String = "abcdëfg"
>> > Dim fsOutput as New FileStream(..)
>> > Dim wOutput As New StreamWriter(fsOutput, System.Text.Encoding.ASCII)
>> > wOutput.WriteLine(UnicodeString)
>> >
>> > Thank you for your reply.
>> >
>> > /Ger
>> >
>> >
>> > "Cor Ligthert" <notfirstname@planet.nl> wrote in message
>> > news:eWAgM%23imEHA.3564@tk2msftngp13.phx.gbl...
>> >> Ger,
>> >>
>> >> > Thanks for your reply, but this returns a byte array. I meant
>> >> > straightforward string-to-string conversion. It is possible of
>> >> > course to write a simple function to do this using the encoding
>> >> > class, but I was just wondering why the framework does not support
>> >> > the "direct string-to-string".
>> >>
>> >> In .NET, a "String" is always a string of Unicode Chars. What you
>> >> call an "ascii string" is always a byte array.
>> >>
>> >> Therefore, as an answer, there is nothing more than what Herfried
>> >> suggested. You could create an array of objects which contain bytes,
>> >> but that is no solution in my opinion.
>> >>
>> >> I hope this helps you get the idea.
>> >>
>> >> Cor
>> >>
>> >>
>> >>
>> >
>> >
>>
>>
>
>