UTF8/UTF7/ASCII problem while reading from text file

From: Lenard Gunda (frenzy_at_fbi.hu)
Date: 08/06/04


Date: Fri, 6 Aug 2004 14:32:58 +0300

hi!

I have the following problem. I need to read data from a TXT file that our
company receives.
I would like to use StreamReader and process the file line by line with
ReadLine, but I run into the following problem.

The file contains characters with codes above 127 (outside the ASCII range),
but it is still plain text (not UTF7/8 or the like). It might also contain +
signs. As a result:

UTF8 encoding doesn't read the characters above 127
UTF7 encoding reads everything OK, except that it eats the + signs and some
characters after them
ASCII encoding reads the + signs OK, but the characters above 127
disappear.
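
The UTF8 and ASCII symptoms can be reproduced in a few lines. This is only an illustrative sketch in Python rather than C#: the sample bytes are made up, and errors="ignore" is an assumption used to imitate a decoder that silently drops bytes it cannot interpret. The .NET encoding classes may differ in detail, and the UTF7 case is left out because its handling of stray bytes varies between implementations.

```python
# Made-up sample: "café 1+2" stored with one byte per character (0xE9 = é).
raw = b"caf\xe9 1+2"

# errors="ignore" imitates a decoder that silently drops bytes it cannot
# interpret. This is an assumption for illustration, not .NET's exact behaviour.
as_utf8 = raw.decode("utf-8", errors="ignore")
as_ascii = raw.decode("ascii", errors="ignore")

print(repr(as_utf8))
print(repr(as_ascii))
```

In both cases the + sign survives, but the byte above 127 is lost, because a lone 0xE9 is neither valid UTF-8 nor valid ASCII.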

Because the file arrives in this form, I have no control over its format.
The best idea so far is to write my own ReadLine method that reads the file
byte by byte and converts using UTF7, while taking special care to feed the
+ character (ASCII code 43) to an ASCII encoder. This way I could build a
string from each line that contains exactly what's in the file.
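
The byte-by-byte idea can be sketched roughly as follows, in Python rather than C# for brevity. The helper name read_lines_bytewise and the sample data are hypothetical. The sketch also assumes that every byte maps to the character with the same numeric code (which amounts to ISO-8859-1); the file's real code page is not known, so that mapping is an assumption, not a fact about the file.

```python
import os
import tempfile


def read_lines_bytewise(path):
    """Read a file one byte at a time and split it into lines.

    Assumption: each byte maps to the character with the same code
    (equivalent to ISO-8859-1). The real code page may be different.
    """
    lines = []
    current = []
    with open(path, "rb") as f:
        while True:
            b = f.read(1)
            if not b:           # end of file
                break
            if b == b"\n":      # line feed ends the current line
                lines.append("".join(current))
                current = []
            elif b != b"\r":    # drop carriage returns so CRLF files work
                current.append(chr(b[0]))
    if current:                 # last line without a trailing newline
        lines.append("".join(current))
    return lines


# Made-up sample file containing a byte above 127 and a + sign.
fd, sample_path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"caf\xe9 1+2\r\nsecond line\n")

lines = read_lines_bytewise(sample_path)
os.remove(sample_path)
print(lines)
```

Because no byte is treated specially apart from the line endings, the + sign needs no extra handling in this sketch.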

But is there a nicer way, or is doing it manually the only option?

thanx

-Lenard
