Re: REWARD: chr() not working for Chinese "Locale"

From: Jon Skeet [C# MVP] (skeet_at_pobox.com)
Date: 10/13/04


Date: Wed, 13 Oct 2004 08:31:11 +0100

John Cody <newsgroups@noospammmax-soft.com> wrote:
> Basically, I have a routine that encrypts a string. Each character of the
> resulting string can have a value of 0-255. It seems that with a "Latin"
> based character set, characters 128 to 255 are all unique, so no matter what
> value the encryption routine generated for each character, it was
> reversible - meaning the original string could be de-crypted back to its
> original value.

Urgh. That's a nasty idea. It may well work, but it's not at all nice.
Base64 is a much better way to convert what is essentially binary data
into a string which is likely to survive whatever you throw at it.
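
As a rough sketch of the difference (in Python rather than VB, purely for
illustration; GBK stands in for whatever code page the Chinese locale
actually used, which is an assumption, and cipher_bytes is a made-up
stand-in for the encryption output):

```python
import base64

# Hypothetical "encrypted" output: every possible byte value, 0-255.
cipher_bytes = bytes(range(256))

# Treating each byte as a character and pushing it through a legacy
# code page loses data: characters the code page lacks become '?'.
as_chars = cipher_bytes.decode("latin-1")
mangled = as_chars.encode("gbk", errors="replace")
lossy = mangled.count(b"?") > 1  # more '?' than the one genuine 0x3F

# Base64 uses only ASCII letters, digits, '+', '/' and '=', so the text
# passes through the same code page unchanged and decodes exactly.
b64_text = base64.b64encode(cipher_bytes).decode("ascii")
survived = b64_text.encode("gbk").decode("gbk")
restored = base64.b64decode(survived)
```

The Base64 text only ever contains ASCII, which is why it comes back
intact where the one-character-per-byte string does not.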

Actually, ISO-8859-1 sort of has a hole between 128 and 159: the
standard assigns no printable characters there, and Unicode uses those
values for the C1 control codes. (I used to disagree with this based on
the Unicode spec stating that the first 256 values of Unicode were the
same as ISO-8859-1, but now I'm not so sure. It's a shame that the spec
itself costs money...) What you're probably referring to is something
like Code Page 1252, which is the same as ISO-8859-1 for every value
other than the "hole", where it mostly has printable characters instead.
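
A small sketch of that hole, again in Python for illustration only (the
codec names "latin-1" and "cp1252" are Python's, not anything from the
original code):

```python
# Bytes 128-159 are the "hole": no printable characters in ISO-8859-1.
hole = bytes(range(128, 160))

# Python's latin-1 codec maps each byte to the code point of the same
# number, so here the hole decodes to the C1 control characters.
latin1 = hole.decode("latin-1")
all_controls = all(0x80 <= ord(c) <= 0x9F for c in latin1)

# Code page 1252 puts printable characters in most of the hole;
# errors="replace" covers the few positions it leaves unassigned.
cp1252 = hole.decode("cp1252", errors="replace")
euro = bytes([128]).decode("cp1252")  # the euro sign, U+20AC

# Outside the hole (160-255) the two encodings agree.
upper = bytes(range(160, 256))
same_above_hole = upper.decode("latin-1") == upper.decode("cp1252")
```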
 
> However, it appears that when a "cero" character set is used, there are a
> lot of invalid/duplicate characters in the 128-255 range, so when the
> encryption routine tries to set a character to a value in this range, say
> 213 using CHR(213), it would get transposed to some default character code
> like 63, which obviously would corrupt the string so it could not be
> de-crypted using ASC(x), because it would not be the same value as the
> original character it is supposed to be.
>
> But, by using chrw(x) and Ascw(x), proper unique characters are
> generated.

It's still basically a bad idea to keep binary data as a string like
that though. Base64 encoding is easy and doesn't add *very* much bulk
to the data.
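
To put a rough number on the bulk: Base64 emits 4 characters for every
3 bytes of input (padded up to a multiple of 4), so the text is about a
third larger than the data. A quick check in Python, where
base64_length is a helper name made up for this example:

```python
import base64
import math

def base64_length(n_bytes: int) -> int:
    """Length of padded Base64 text for n_bytes of input."""
    return 4 * math.ceil(n_bytes / 3)

# Check the formula against the real encoder for a few sizes.
for n in (0, 1, 2, 3, 10, 100, 1000):
    assert len(base64.b64encode(bytes(n))) == base64_length(n)

overhead_1000 = base64_length(1000) / 1000  # 1336 / 1000
```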

-- 
Jon Skeet - <skeet@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too

