Re: Unicode to ASCII string conversion

From: Ger (ger.rietman_at_rathernospam.sailsoft.nl)
Date: Tue, 14 Sep 2004 16:31:43 +0200

Thank you very much, guys, for your help and for clearing up the issue for me.
I will go for Jay's solution and use ANSI 8-bit.
/Ger

"Jay B. Harlow [MVP - Outlook]" <Jay_Harlow_MVP@msn.com> wrote in message
news:%23RwyVzlmEHA.3328@TK2MSFTNGP10.phx.gbl...
> Ger,
> > Ah, now I think I get the idea. So when I convert a (Unicode) string into
> > an ASCII byte array, and then the byte array back into a string, I still
> > have Unicode, right?
> Correct, just remember that you will lose some characters going to & from
> ASCII.
>
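
A minimal sketch of the round trip discussed above, assuming a console
project. The module and function names (AsciiRoundTrip, RoundTrip) are
made up for illustration; only the Encoding.ASCII calls come from the
framework. ChrW(&HEB) is used for ë so the result does not depend on how
the source file itself is saved.

```vb
Imports System
Imports System.Text

Module AsciiRoundTrip
    ' Encodes a String to ASCII bytes and decodes the bytes back to a String.
    Function RoundTrip(ByVal input As String) As String
        Dim asciiBytes As Byte() = Encoding.ASCII.GetBytes(input)
        Return Encoding.ASCII.GetString(asciiBytes)
    End Function

    Sub Main()
        Dim original As String = "abcd" & ChrW(&HEB) & "fg" ' "abcdëfg"
        Dim result As String = RoundTrip(original)
        ' The result is still an ordinary (Unicode) String, but the ë could
        ' not be represented in 7 bit ASCII, so it did not survive the trip.
        Console.WriteLine(result)
        Console.WriteLine(result = original) ' False
    End Sub
End Module
```

Plain characters such as "abc" come back unchanged; only the characters
outside the ASCII range are affected.
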
> > So that is of no use when you want to write ASCII to a
> > filestream.
> If you need an ASCII file, then use an ASCII encoding. It really depends on
> what is going to read the file again.
>
> I would recommend going with an ANSI encoding (see below) or UTF8 encoding.
> With ASCII you will lose all extended characters (ASCII is a 7 bit
> encoding); with ANSI you will lose characters that are outside of your
> regional ANSI code page. UTF8 preserves all Unicode characters. I would
> recommend ANSI encoding if the file is going to be opened by casual users
> in Notepad. I would recommend UTF8 if full Unicode support is required.
> ANSI & UTF8 are both 8 bit encodings.
>
>
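
To make the difference between the three encodings concrete, here is a
small sketch that prints the bytes each one produces for ë. It assumes
ISO-8859-1 (Latin-1) as a stand-in for a Western European ANSI code page,
because Encoding.Default varies from machine to machine; the names
EncodingComparison and ToHex are invented for this example.

```vb
Imports System
Imports System.Text

Module EncodingComparison
    ' Returns the bytes produced for the given text as a hex string.
    Function ToHex(ByVal enc As Encoding, ByVal input As String) As String
        Return BitConverter.ToString(enc.GetBytes(input))
    End Function

    Sub Main()
        Dim sample As String = ChrW(&HEB).ToString() ' ë
        ' Assumption: Latin-1 stands in for the regional ANSI code page.
        Dim latin1 As Encoding = Encoding.GetEncoding("iso-8859-1")
        Console.WriteLine("ASCII:   " & ToHex(Encoding.ASCII, sample))
        Console.WriteLine("Latin-1: " & ToHex(latin1, sample))
        Console.WriteLine("UTF8:    " & ToHex(Encoding.UTF8, sample)) ' C3-AB
    End Sub
End Module
```

The 8 bit code page stores ë as the single byte EB, while UTF8 needs the
two bytes C3 AB for the same character, and ASCII has no byte for it at all.
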
> > Is the code below then writing ASCII output to my filestream?
>
> Yes, that code is writing ASCII, as you included the ASCII encoding on the
> StreamWriter constructor.
>
> The text file itself will contain ASCII characters; when you subsequently
> open that text stream and read it (with a StreamReader) it will be
> converted back to Unicode strings. When reading the file back, try to use
> the same encoding as was written. For example, if you wrote ANSI, then use
> ANSI to read; if you wrote UTF8, then use UTF8 to read, because ANSI & UTF8
> encode characters 128 to 255 differently. Remember that Encoding.UTF8 is
> used on the StreamWriter if you do not give one; if you are reading text
> files created by Notepad, then you want Encoding.Default.
>
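
The advice about reading with the same encoding can be sketched like this.
It is only an illustration: a MemoryStream replaces the real file,
ISO-8859-1 is again assumed as a stand-in for the ANSI code page, and
ReadBackDemo and WriteThenRead are hypothetical names.

```vb
Imports System
Imports System.IO
Imports System.Text

Module ReadBackDemo
    ' Writes text to an in-memory stream with one encoding and reads it
    ' back with another, returning what the reader produced.
    Function WriteThenRead(ByVal input As String, ByVal writeEnc As Encoding,
                           ByVal readEnc As Encoding) As String
        Dim buffer As New MemoryStream()
        Dim writer As New StreamWriter(buffer, writeEnc)
        writer.Write(input)
        writer.Flush()
        buffer.Position = 0 ' rewind so the reader starts at the first byte
        Dim reader As New StreamReader(buffer, readEnc)
        Return reader.ReadToEnd()
    End Function

    Sub Main()
        Dim original As String = "abcd" & ChrW(&HEB) & "fg" ' "abcdëfg"
        ' Assumption: Latin-1 stands in for the regional ANSI code page.
        Dim latin1 As Encoding = Encoding.GetEncoding("iso-8859-1")
        ' Same encoding on both sides: the text comes back intact.
        Console.WriteLine(WriteThenRead(original, latin1, latin1) = original)        ' True
        ' Written as an 8 bit code page, read as UTF8: the ë is damaged.
        Console.WriteLine(WriteThenRead(original, latin1, Encoding.UTF8) = original) ' False
    End Sub
End Module
```
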
> I would recommend:
>
> > Dim wOutput As New StreamWriter(fsOutput, System.Text.Encoding.Default)
>
> This will write the file in your current ANSI code page, as defined by the
> regional settings in Windows Control Panel, which will preserve the
> extended characters that code page contains.
>
> Remember that ANSI is an 8 bit encoding that is dependent on region (code
> page), while ASCII is a 7 bit encoding. ASCII does not support extended
> characters such as ë; it will be converted into either a normal e or a ?.
>
> Hope this helps
> Jay
>
> "Ger" <ger.rietman@rathernospam.sailsoft.nl> wrote in message
> news:uU3WK3kmEHA.2772@tk2msftngp13.phx.gbl...
> > Ah, now I think I get the idea. So when I convert a (Unicode) string into
> > an ASCII byte array, and then the byte array back into a string, I still
> > have Unicode, right? So that is of no use when you want to write ASCII to
> > a filestream.
> >
> > Is the code below then writing ASCII output to my filestream?
> >
> > Dim UnicodeString As String = "abcdëfg"
> > Dim fsOutput As New FileStream(..)
> > Dim wOutput As New StreamWriter(fsOutput, System.Text.Encoding.ASCII)
> > wOutput.WriteLine(UnicodeString)
> >
> > Thank you for your reply.
> >
> > /Ger
> >
> >
> > "Cor Ligthert" <notfirstname@planet.nl> wrote in message
> > news:eWAgM%23imEHA.3564@tk2msftngp13.phx.gbl...
> >> Ger,
> >>
> >> > Thanks for your reply, but this returns a byte array. I meant
> >> > straightforward string-to-string conversion. It is possible, of course,
> >> > to write a simple function to do this using the Encoding class, but I
> >> > was just wondering why the framework does not support the "direct
> >> > string-to-string".
> >>
> >> In dotNet a "String" is always a string of Unicode Chars. What you call
> >> an "ASCII string" is always a byte array.
> >>
> >> Therefore, as an answer, there is nothing more than what Herfried
> >> suggested. You can create an array of objects which contains bytes, but
> >> that is no solution in my opinion.
> >>
> >> I hope this helps you get the idea.
> >>
> >> Cor
> >>
> >>
> >>
> >
> >
>
>


