Re: wofstream

From: tom_usenet (tom_usenet_at_hotmail.com)
Date: 07/23/04


Date: Fri, 23 Jul 2004 16:18:07 +0100

On Mon, 19 Jul 2004 12:34:56 +0400, "Vladimir" <voinkovv@mail.ru>
wrote:

>Hi all,
>
>I found that wide char file stream doesn't write national symbols. That is
>unlike narrow stream. Is it a bug or there is something I should know? I
>noticed also that the stream writes wide char as byte. Is it UTF8? How can
>I switch text encoding? Could you point me some docs or samples. Thanks!

To change the encoding you have to imbue the filestream with a locale
that includes a codecvt facet for the conversion. Unfortunately, MSVC
doesn't come with any except the text one (that does \n <-> \r\n
conversions), and the not very useful default wide-stream one, which
just converts each wchar_t to a char by chopping off the high byte!
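
A rough sketch of the imbue step, under assumptions that go beyond the
above: it relies on std::codecvt_utf8 from <codecvt>, which only arrived
with C++11 (and is deprecated since C++17, though still commonly
shipped). The function name write_utf8 and its parameters are made up
for illustration.

```cpp
#include <codecvt>   // std::codecvt_utf8 (C++11; deprecated since C++17)
#include <fstream>
#include <locale>
#include <string>

// Illustrative helper (hypothetical name): writes `text` to `path` as UTF-8
// by imbuing the stream with a locale that carries a UTF-8 codecvt facet.
bool write_utf8(const std::string& path, const std::wstring& text) {
    std::wofstream out;
    // Imbue before opening, so nothing passes through the default facet.
    // The locale takes ownership of the facet allocated with new.
    out.imbue(std::locale(std::locale::classic(),
                          new std::codecvt_utf8<wchar_t>));
    out.open(path, std::ios::binary);
    if (!out) return false;
    out << text;
    out.close();
    return !out.fail();
}
```

Where wchar_t is 16 bits, std::codecvt_utf8<wchar_t> covers only the
Basic Multilingual Plane, so treat this as a starting point rather than
a complete answer.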

The best solution is to buy Dinkumware's CoreX library
(www.dinkumware.com); it includes support for a huge number of
encodings. If that's not an option, and UTF8 is all you really need,
then you can download a UTF8 conversion facet from here:

http://www.rrsd.com/boost/index.htm
with docs here:
http://www.rrsd.com/boost/libs/serialization/doc/codecvt.html
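
To give an idea of what such a facet does internally, here is a minimal
write-only sketch. It is not the facet from the link above; the names
utf8_out_facet and write_with_facet are hypothetical, and it assumes
each wchar_t holds one whole code point (surrogates are rejected).

```cpp
#include <cwchar>    // std::mbstate_t
#include <fstream>
#include <locale>
#include <string>

// Hypothetical output-only facet: one wchar_t in, one UTF-8 sequence out.
class utf8_out_facet : public std::codecvt<wchar_t, char, std::mbstate_t> {
protected:
    result do_out(std::mbstate_t&,
                  const wchar_t* from, const wchar_t* from_end,
                  const wchar_t*& from_next,
                  char* to, char* to_end, char*& to_next) const override {
        static const unsigned char lead_mark[] = {0, 0x00, 0xC0, 0xE0, 0xF0};
        from_next = from;
        to_next = to;
        while (from_next != from_end) {
            const unsigned long cp = static_cast<unsigned long>(*from_next);
            // Reject out-of-range values and lone surrogates.
            if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) return error;
            const int len = cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
            if (to_end - to_next < len) return partial;  // output buffer full
            // Lead byte carries the length marker plus the top payload bits.
            *to_next++ = static_cast<char>(lead_mark[len] | (cp >> (6 * (len - 1))));
            // Continuation bytes carry 6 payload bits each, prefixed with 10.
            for (int shift = 6 * (len - 2); shift >= 0; shift -= 6)
                *to_next++ = static_cast<char>(0x80 | ((cp >> shift) & 0x3F));
            ++from_next;
        }
        return ok;
    }
    // UTF-8 is stateless, so there is nothing to flush at the end.
    result do_unshift(std::mbstate_t&, char* to, char*,
                      char*& to_next) const override {
        to_next = to;
        return noconv;
    }
    bool do_always_noconv() const noexcept override { return false; }
    int do_encoding() const noexcept override { return 0; }    // variable width
    int do_max_length() const noexcept override { return 4; }  // bytes per wchar_t
};

// Illustrative helper: imbue the facet, then write `text` to `path`.
bool write_with_facet(const std::string& path, const std::wstring& text) {
    std::wofstream out;
    out.imbue(std::locale(std::locale::classic(), new utf8_out_facet));
    out.open(path, std::ios::binary);
    out << text;
    out.close();
    return !out.fail();
}
```

Reading would need a matching do_in, and on a 16-bit wchar_t platform
surrogate pairs would have to be combined first; both are left out to
keep the sketch short.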

Tom


