Re: Using MBCS in a UNICODE defined project




This looks odd:
if( fgets( buffer, sizeof(buffer)/sizeof(buffer[0]), f ) )
Yes, it is a bit odd.
sizeof(buffer)/sizeof(buffer[0]) is the same as sizeof(buffer) here,
because buffer is always char (I am not using the generic _T API).
I just wanted to show "the right way" of calculating the character count.
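
A minimal sketch of why the division matters once the buffer is not char. COUNT_OF, narrow_capacity, wide_capacity and wide_size_in_bytes are names made up for this illustration; they are not from the original code.

```c
#include <stddef.h>
#include <wchar.h>

/* Number of elements in a true array (does not work on a pointer). */
#define COUNT_OF(a) (sizeof(a) / sizeof((a)[0]))

/* Element count you would pass to fgets for a char buffer. */
size_t narrow_capacity(void)
{
    char buffer[256];
    return COUNT_OF(buffer);    /* equals sizeof(buffer), since sizeof(char) == 1 */
}

/* Element count you would pass to fgetws for a wchar_t buffer. */
size_t wide_capacity(void)
{
    wchar_t buffer[256];
    return COUNT_OF(buffer);    /* still 256 elements */
}

/* Size of the same wide buffer in bytes: the wrong value to pass to fgetws. */
size_t wide_size_in_bytes(void)
{
    wchar_t buffer[256];
    return sizeof(buffer);      /* 256 * sizeof(wchar_t) */
}
```

With char the two values coincide, which is why the expression looks redundant in the original call. With wchar_t (or TCHAR in a UNICODE build) passing sizeof(buffer) would overstate the capacity.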


Does fgets really count characters the way MSDN says? Or does it use the
word "characters" with the meaning of "bytes", as would happen when copying
and pasting from the C or C++ standard?
I see no contradiction between MSDN and the C++ standard.
MSDN uses "character" in the sense of "what the programmer understands by
character" (a code unit, in Unicode lingo), not "what the user understands
by character."
So for fgets, MSDN is basically saying number of chars, which is the same
as number of bytes.
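
A small sketch to make the point concrete. fgets_byte_count is a hypothetical helper written for this post: it writes some bytes to a temporary file, reads them back with fgets, and reports how many chars landed in the buffer. The sample bytes in the usage note are assumed to be GBK-encoded Chinese text.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write `bytes` to a temp file, read it back with
   fgets(out, n, f), and return how many chars (bytes) fgets stored. */
size_t fgets_byte_count(const char *bytes, int n)
{
    char out[64];
    size_t got = 0;
    FILE *f;

    assert(n > 0 && n <= (int)sizeof out);

    f = tmpfile();
    if (f == NULL)
        return 0;               /* could not create the temp file */

    fputs(bytes, f);
    rewind(f);

    /* fgets stores at most n - 1 chars plus the terminating '\0'. */
    if (fgets(out, n, f) != NULL)
        got = strlen(out);

    fclose(f);
    return got;
}
```

Calling fgets_byte_count("\xD6\xD0\xCE\xC4", 4) passes two user-visible characters (four bytes in GBK) and a limit of 4. fgets stops after 3 bytes, in the middle of the second character, because it counts chars and knows nothing about multibyte sequences.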


MSDN says that Visual Studio 2005 added an optional ccs specification to
the second parameter of _tfopen but you didn't use it. If you did use it,
it's not exactly obvious if there would be some effect on the way fgets and
fgetws and _fgetts count characters.
ccs does not affect how _fgetts counts characters.
The ccs flag in _tfopen affects the code page of the opened file; _fgetts
counts characters after the code page conversion (if one happens).
So opening a UTF-8 file (using ccs) and reading it with fgetws means the
UTF-8 stream is read and converted to UTF-16, and the count for fgetws
is the number of UTF-16 code units (16 bits each).
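
To illustrate what "number of code units after conversion" means, here is a rough sketch that counts how many UTF-16 code units a well-formed UTF-8 string turns into. utf16_units_for_utf8 is a made-up name and this is not how the CRT does the conversion internally; it only shows why the fgetws count differs from the byte count in the file.

```c
#include <assert.h>
#include <stddef.h>

/* Count the UTF-16 code units needed for a well-formed, NUL-terminated
   UTF-8 string. No validation is done: malformed input gives a
   meaningless count. */
size_t utf16_units_for_utf8(const char *s)
{
    const unsigned char *p;
    size_t units = 0;

    assert(s != NULL);

    for (p = (const unsigned char *)s; *p != 0; ++p) {
        if ((*p & 0xC0) == 0x80)
            continue;           /* continuation byte, already accounted for */
        /* A lead byte >= 0xF0 starts a 4-byte sequence, which needs a
           surrogate pair (2 units); everything else fits in 1 unit. */
        units += (*p >= 0xF0) ? 2 : 1;
    }
    return units;
}
```

For example, the six UTF-8 bytes "\xE4\xB8\xAD\xE6\x96\x87" (two Chinese characters) become two code units, so they take up 2 of the count given to fgetws, not 6.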


optional ccs specification to the
second parameter of _tfopen but you didn't use it
Because the request is to get Chinese MBCS strings and pass them as
is to a function. So I did not want any encoding smartness from ccs.


--
Mihai Nita [Microsoft MVP, Windows - SDK]
http://www.mihai-nita.net
------------------------------------------
Replace _year_ with _ to get the real email