Re: Missing characters after file rewrite using File.OpenText



Zark3 <Zark3net@xxxxxxxxx> wrote:
Unsure if this is the best group to place this, but here it is anyway
;).
I've got a large text file that needs rewriting into a different
format, and decided to try it using C#, which usually does my
programming tricks... However, this time I've got a difference of
opinion with the result :(
In words using accents and special chars (e.g. façade [c cedilla] or
één [e acute]) the result of my efforts just omits these characters.
Not the words entirely, just those letters (e.g. façade turns into
faade). Pretty much, my question is why? I'm probably just forgetting
to set a text-encoding variable somewhere, but I can't seem to find out
where it should go.

If your input file isn't in UTF-8, you should specify the encoding when
you create your StreamReader.

If your output file isn't meant to be UTF-8, you should specify the
encoding when you create your StreamWriter.
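
For example, here is a minimal sketch of passing the encoding explicitly on both sides. It assumes the input is ISO-8859-1 (Latin-1) and the output should be UTF-8; both are guesses about your files, and the Reencode class and Rewrite method are names made up for illustration. File.OpenText always decodes as UTF-8 and has no overload that takes an encoding, so the sketch constructs the StreamReader directly instead.

```csharp
using System;
using System.IO;
using System.Text;

static class Reencode
{
    // Copies a text file line by line, decoding the input with
    // inputEncoding and encoding the output with outputEncoding.
    // (Hypothetical helper; the names are not from the original post.)
    public static void Rewrite(string inputPath, string outputPath,
                               Encoding inputEncoding, Encoding outputEncoding)
    {
        // StreamReader(string, Encoding) replaces File.OpenText(path),
        // which would assume UTF-8.
        using (StreamReader reader = new StreamReader(inputPath, inputEncoding))
        // StreamWriter(string, bool, Encoding): false means overwrite, not append.
        using (StreamWriter writer = new StreamWriter(outputPath, false, outputEncoding))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                // Reformat the line here as needed before writing it out.
                writer.WriteLine(line);
            }
        }
    }
}
```

It could be called as Reencode.Rewrite("in.txt", "out.txt", Encoding.GetEncoding("iso-8859-1"), new UTF8Encoding(false)), where the file names are placeholders. If the input actually uses some other code page, swap in the matching Encoding; the wrong choice will produce wrong or missing characters again.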

--
Jon Skeet - <skeet@xxxxxxxxx>
http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet
If replying to the group, please do not mail me too