Re: problem with asc function


guoqi zheng <no@xxxxxxxxx> wrote:
> > That is converting the character to a byte using the default encoding
> > of the current thread - is that definitely what you want to do?
>
> I think I know why I get the error but you don't: we have
> different codepages/encodings.

Almost certainly.

> I am trying to decode a yEnc-encoded binary post from a newsgroup; those
> bytes will be written directly to an output stream. I am not sure what
> encoding was originally used to encode the binary data; what I know is that
> I need to remove \r\n, convert the rest back to bytes and write them to a
> binary file.

Without knowing the encoding, you can't recognise the "\r\n".
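
To illustrate why the encoding matters here, a minimal Python sketch (not VB.NET, and not from the original thread; the variable names are made up for the example) showing that the same "\r\n" becomes different byte sequences under two encodings:

```python
# The same two characters, CR and LF, encoded two different ways.
line_break = "\r\n"

# In ASCII (and ISO-8859-1, UTF-8) they are the single bytes 13 and 10.
as_ascii = line_break.encode("ascii")

# In little-endian UTF-16 each character takes two bytes.
as_utf16 = line_break.encode("utf-16-le")

print(list(as_ascii))
print(list(as_utf16))
```

A search for bytes 13, 10 would find the line break in the first case but not in the second.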

> Actually, this has always been my question: what encoding should I use to
> convert the bytes I receive from NNTP to a string? I use ISO-8859-1 for now.

Well, it sounds to me like you don't really need to convert the bytes
at all. If you assume that the "\r\n" are encoded as bytes 13 and 10
respectively, you should be able to do it all without ever treating it
as text data.
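
As a rough sketch of that byte-only approach, in Python rather than VB.NET and assuming the line breaks really are the byte pair 13, 10 (the function name `strip_crlf` and the sample data are hypothetical):

```python
def strip_crlf(raw: bytes) -> bytes:
    """Remove every CR LF pair (bytes 13, 10) without decoding to text."""
    return raw.replace(b"\r\n", b"")


# Hypothetical payload: arbitrary binary bytes split across two lines.
raw = b"\x8a\x8b\r\n\xff\x00\r\n"
cleaned = strip_crlf(raw)
print(list(cleaned))
```

Because nothing is ever converted to a string, no codepage or encoding setting can change the result.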

If you *have* to treat it as character data, using 8859-1 is probably a
good bet. In theory I believe it doesn't define characters for bytes
128-159, but in practice I believe the encoding maps them to Unicode
code points 128-159.
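
A quick way to check that kind of one-to-one mapping, sketched here with Python's "iso-8859-1" codec (this shows Python's behaviour only; whether the .NET encoding behaves identically is the assumption stated above):

```python
# Every possible byte value, 0 through 255.
all_bytes = bytes(range(256))

# Decode as ISO-8859-1 and look at the resulting Unicode code points.
text = all_bytes.decode("iso-8859-1")
code_points = [ord(ch) for ch in text]

# Encode back to bytes to see whether anything was lost.
round_tripped = text.encode("iso-8859-1")

print(code_points == list(range(256)))
print(round_tripped == all_bytes)
```

If both comparisons hold, binary data survives a trip through a string unchanged, which is what makes the encoding usable for this purpose.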

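Alternatively, just convert each character to a byte directly, using
CByte(AscW(c)), which uses the character's Unicode value rather than
the thread's default encoding.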

I've just had a look at the yEnc spec, and unfortunately it seems to
have been written by someone who doesn't appreciate the difference
between binary data and text data, and also doesn't understand that
ASCII doesn't have any values > 127...

--
Jon Skeet - <skeet@xxxxxxxxx>
http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet
If replying to the group, please do not mail me too