Re: How to read UTF-8 chars using VBA



As I stressed above, you have to read the file in binary mode (not with
Line Input), and then convert the bytes from UTF-8 to Unicode (UTF-16, which
is what VB strings hold).

The following code assumes the utf.txt file contains properly flagged UTF-8
content (e.g. as saved by Notepad). Note that this means there will be a
3-byte UTF-8 marker (the bytes EF BB BF) at the start of the file. The
UTF-8 -> Unicode conversion will convert this to the Unicode marker &HFEFF.

'============== Start Form1 ===========
Private Const CP_UTF8 = 65001

Private Declare Function MultiByteToWideChar Lib "kernel32" ( _
ByVal CodePage As Long, ByVal dwFlags As Long, _
ByVal lpMultiByteStr As Long, ByVal cchMultiByte As Long, _
ByVal lpWideCharStr As Long, ByVal cchWideChar As Long) As Long

Public Function sUTF8ToUni(bySrc() As Byte) As String
' Converts a UTF-8 byte array to a Unicode string
Dim lBytes As Long, lNC As Long, lRet As Long

lBytes = UBound(bySrc) - LBound(bySrc) + 1
lNC = lBytes
sUTF8ToUni = String$(lNC, Chr(0))
lRet = MultiByteToWideChar(CP_UTF8, 0, VarPtr(bySrc(LBound(bySrc))), _
lBytes, StrPtr(sUTF8ToUni), lNC)
sUTF8ToUni = Left$(sUTF8ToUni, lRet)
End Function

Private Function ConvertUTF8File(sUTF8File As String) As String
Dim iFile As Integer, bData() As Byte, sData As String, lSize As Long

' Get the incoming data size
lSize = FileLen(sUTF8File)
If lSize > 0 Then
ReDim bData(0 To lSize - 1)

' Read the existing UTF-8 file
iFile = FreeFile()
Open sUTF8File For Binary As #iFile
Get #iFile, , bData
Close #iFile

' Convert all the data to Unicode (all VB Strings are Unicode)
sData = sUTF8ToUni(bData)
Else
sData = ""
End If
ConvertUTF8File = sData
End Function

Private Sub Form_Load()
Dim vLine As Variant, sFileBody As String

' Load the UTF-8 file body into a Unicode string variable
sFileBody = ConvertUTF8File("utf.txt")
' Remove the leading Unicode marker from the body (i.e. the &HFEFF sequence)
sFileBody = Mid$(sFileBody, 2)
Debug.Print sFileBody
' Now add the separate lines to our list box
For Each vLine In Split(sFileBody, vbCrLf)
List1.AddItem CStr(vLine)
Next vLine
End Sub
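
' --- Optional, hypothetical helper (not part of the original listing) ---
' Form_Load above drops the first character unconditionally, which is only
' right when the file really starts with the marker. This is a minimal
' sketch of a guarded version; the name StripLeadingMarker is an assumption.
Private Function StripLeadingMarker(ByVal sText As String) As String
' AscW fails on an empty string, so check the length first (VB's And does
' not short-circuit, hence the nested If).
If Len(sText) > 0 Then
' AscW returns a signed Integer and &HFEFF is also a negative Integer
' literal in VB, so the two compare equal for the &HFEFF marker.
If AscW(sText) = &HFEFF Then
StripLeadingMarker = Mid$(sText, 2)
Exit Function
End If
End If
' No marker present: return the text unchanged
StripLeadingMarker = sText
End Function
' Possible use: sFileBody = StripLeadingMarker(ConvertUTF8File("utf.txt"))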
'============== End Form1 ===========
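
Since the question was about VBA: the Declare in the listing is written for
VB6 and 32-bit VBA, where a pointer fits in a Long. In 64-bit Office (VBA7)
that is no longer true, so the declaration needs PtrSafe and LongPtr. What
follows is only a sketch of a conditional declaration that should cover both
cases; I am assuming the rest of the code stays as it is (VarPtr and StrPtr
already return LongPtr under VBA7).

```vb
' Sketch: replaces the single Declare at the top of the listing.
#If VBA7 Then
' Office 2010 and later, 32-bit or 64-bit
Private Declare PtrSafe Function MultiByteToWideChar Lib "kernel32" ( _
ByVal CodePage As Long, ByVal dwFlags As Long, _
ByVal lpMultiByteStr As LongPtr, ByVal cchMultiByte As Long, _
ByVal lpWideCharStr As LongPtr, ByVal cchWideChar As Long) As Long
#Else
' VB6 and older VBA hosts
Private Declare Function MultiByteToWideChar Lib "kernel32" ( _
ByVal CodePage As Long, ByVal dwFlags As Long, _
ByVal lpMultiByteStr As Long, ByVal cchMultiByte As Long, _
ByVal lpWideCharStr As Long, ByVal cchWideChar As Long) As Long
#End If
```

In a VBA UserForm there is no Form_Load event, so the body of Form_Load would
probably go into UserForm_Initialize instead.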

Tony Proctor

"MSK" <mannaikarthik@xxxxxxxxx> wrote in message
news:1165332779.981591.277710@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
How did you load it? Can you please pass me the code? Did you use any
built-in methods to convert it?

I tried to load like the following

Open <filename> For Input As #1
Do While Not EOF(1)
Line Input #1, ABC
Loop
Close #1

Using the ABC (String) variable, I filled the combo box.

Can we also get the y, w (circumflex) chars as they are?

MSK


