Re: Is it difficult to read UTF-8 .txt in VC++?

On Jul 7, 3:58 am, David Wilkinson <no-re...@xxxxxxxxxxxx> wrote:
Sin Jeong-hun wrote:
I want to read a line from a .txt file (which is saved in
Unicode (UTF-8)) and show the string in a MessageBox.
With C#, it is very easy.

StreamReader sr=new StreamReader("c:\\a.txt",Encoding.UTF8);
string line=sr.ReadLine();
sr.Close();
MessageBox.Show(line);

But with unmanaged Visual C++, I couldn't even find a simple example.
I tried to use wifstream.getline but it didn't read Asian characters
correctly.
I've searched the newsgroups for an answer, but all I found was
that I might have to buy a third-party library just to do that.
Do I really need to buy a library just to read a simple UTF-8 text file?
(Assuming, of course, that I don't know how to write an equivalent
library myself.)

If not, could you please tell me the VC++ equivalent of the C#
code above?
Thank you.

Sin:

UTF-8 uses 8-bit code units, so you should use ifstream, not wifstream.
Use MultiByteToWideChar() with CP_UTF8 to convert to UTF-16. If your app
is ANSI compiled (I hope not), then use WideCharToMultiByte() to convert
the UTF-16 to the local code page.

--
David Wilkinson
Visual C++ MVP
Thank you, I'll try that.
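
The suggestion above (read the bytes with ifstream, then convert with MultiByteToWideChar) might look roughly like the following. This is only a minimal sketch, not code posted in the thread: the helper names ReadFirstLineUtf8 and Utf8ToWide are made up for illustration, the path c:\a.txt is taken from the C# snippet, and stripping a leading UTF-8 byte order mark is an added assumption (some editors, such as Notepad, write one). The Windows-only part is guarded by _WIN32.

```cpp
#include <fstream>
#include <string>

#ifdef _WIN32
#include <windows.h>
#pragma comment(lib, "user32.lib")  // MSVC-specific: link the library that provides MessageBoxW
#endif

// Hypothetical helper: reads the first line of a file as raw UTF-8 bytes.
// Drops a leading UTF-8 byte order mark and a trailing '\r' if present.
bool ReadFirstLineUtf8(const char* path, std::string& line) {
    std::ifstream in(path, std::ios::binary);
    if (!in || !std::getline(in, line)) {
        return false;  // file missing or empty
    }
    if (line.compare(0, 3, "\xEF\xBB\xBF") == 0) {
        line.erase(0, 3);  // skip the BOM (assumption: the file may start with one)
    }
    if (!line.empty() && line[line.size() - 1] == '\r') {
        line.erase(line.size() - 1);  // CRLF line ending read in binary mode
    }
    return true;
}

#ifdef _WIN32
// Hypothetical helper: converts UTF-8 bytes to a UTF-16 std::wstring.
std::wstring Utf8ToWide(const std::string& utf8) {
    if (utf8.empty()) {
        return std::wstring();
    }
    // First call asks how many wchar_t units are needed.
    int len = MultiByteToWideChar(CP_UTF8, 0, utf8.data(),
                                  static_cast<int>(utf8.size()), NULL, 0);
    if (len <= 0) {
        return std::wstring();  // conversion failed
    }
    std::wstring wide(len, L'\0');
    // Second call performs the conversion into the buffer.
    MultiByteToWideChar(CP_UTF8, 0, utf8.data(),
                        static_cast<int>(utf8.size()), &wide[0], len);
    return wide;
}

int main() {
    std::string line;
    if (ReadFirstLineUtf8("c:\\a.txt", line)) {
        // Call the W version explicitly so this works even in an ANSI build.
        MessageBoxW(NULL, Utf8ToWide(line).c_str(), L"First line", MB_OK);
    }
    return 0;
}
#endif
```

Opening the stream in binary mode keeps the bytes untouched, so the only decoding step is the explicit MultiByteToWideChar call. In an ANSI build that needs a narrow string, the wide result would have to go through WideCharToMultiByte with CP_ACP, and characters outside the local code page would be lost.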

