Converting text between various encodings
- From: "Paul Randall" <paulr90@xxxxxxx>
- Date: Mon, 16 Mar 2009 22:34:51 -0700
Hi,
I'm experimenting with converting text strings between encodings such as
Unicode (UTF-16), UTF-8, and UTF-7. One tool that seems to do at least some
of this is the OlePrn.OleCvt object (available on W2K and later, I think),
which has ToUnicode and ToUtf8 methods. TLViewer shows that ToUnicode takes
two arguments, the string to be converted and a long integer that I believe
is the code page of that string, and returns a Unicode string. It shows that
ToUtf8 takes one argument, apparently a little-endian Unicode string.
I'm hoping someone can post a URL for documentation on the OlePrn.OleCvt
object. TLViewer shows a lot of functionality in the type library, with
coclasses that expose a lot of printer functionality. But I'm mostly
interested in whether the ToUnicode method can convert only from UTF-8 or
from any encoding for which my computer has a code page number. Maybe this
function:
Function fsUTF8ToUnicode(sUTF8String, lCodePage)
should really be called an any-encoding-to-Unicode routine.
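One way to probe the code page question is a small experiment, sketched here under two assumptions that are not confirmed by any documentation: that ToUnicode expects its input as a string holding one byte per character, and that it returns an empty string rather than raising an error when it cannot convert. The byte &HE9 is "é" (U+00E9) in Windows-1252 and "й" (U+0439) in Windows-1251, so if the code page argument is honored the two calls should report different code points.

```vbscript
Option Explicit

' Hypothetical probe: feed ToUnicode a single byte that maps to different
' characters in different code pages and report what comes back.
Dim oCvt, sByte, vCodePage, sResult
Set oCvt = CreateObject("OlePrn.OleCvt")

' Assumption: input is one byte per character, so byte E9 is ChrW(&HE9).
sByte = ChrW(&HE9)

For Each vCodePage In Array(1252, 1251)
    sResult = oCvt.ToUnicode(sByte, CLng(vCodePage))
    If Len(sResult) = 0 Then
        MsgBox "Code page " & vCodePage & ": empty result"
    Else
        ' AscW returns a signed Integer; mask it to 0..65535 before Hex.
        MsgBox "Code page " & vCodePage & " -> U+" & _
            Hex(AscW(sResult) And &HFFFF&)
    End If
Next
```

If both calls report the same code point, the method is probably ignoring the code page, or the one-byte-per-character assumption about its input is wrong.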
Anyhow, here is a simple script that converts a Unicode character to UTF-8
and back to Unicode:
Option Explicit

Dim sUnicode: sUnicode = ChrW(&H2018)   'U+2018, left single quotation mark
Dim sUTF8, sNewUnicode

'Unicode -> UTF-8
sUTF8 = fsUnicodeToUTF8(sUnicode)
MsgBox "Original Unicode = " & sUnicode & _
    " (&H" & Hex(AscW(sUnicode)) & ")" & _
    vbCrLf & "UTF8: " & sUTF8

'UTF-8 -> Unicode; 65001 is the code page number for UTF-8
sNewUnicode = fsUTF8ToUnicode(sUTF8, 65001)
MsgBox "Original Unicode = " & sUnicode & _
    " (&H" & Hex(AscW(sUnicode)) & ")" & _
    vbCrLf & "UTF8: " & sUTF8 & vbCrLf & _
    "New Unicode = " & sNewUnicode & _
    " (&H" & Hex(AscW(sNewUnicode)) & ")"

Function fsUnicodeToUTF8(sUnicodeString)
    fsUnicodeToUTF8 = CreateObject("OlePrn.OleCvt").ToUtf8(sUnicodeString)
End Function 'fsUnicodeToUTF8(sUnicodeString)

'List of code page numbers for various encodings:
'http://www.motobit.com/help/scptutl/cl68.htm
Function fsUTF8ToUnicode(sUTF8String, lCodePage)
    fsUTF8ToUnicode = CreateObject("OlePrn.OleCvt").ToUnicode(sUTF8String, lCodePage)
End Function 'fsUTF8ToUnicode(sUTF8String, lCodePage)
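Because MsgBox shows the UTF-8 result as whatever characters happen to match its code units, a hex dump makes the conversion easier to check. The helper below is a minimal sketch; fsHexDump is a hypothetical name, not part of OlePrn.OleCvt. The UTF-8 encoding of U+2018 is the three bytes E2 80 98, so if ToUtf8 returns one character per byte (an assumption), the dump should show exactly those values with a length of 3.

```vbscript
Option Explicit

Dim sUTF8
sUTF8 = CreateObject("OlePrn.OleCvt").ToUtf8(ChrW(&H2018))
MsgBox "Length: " & Len(sUTF8) & vbCrLf & _
    "Code units: " & fsHexDump(sUTF8)

' Hypothetical helper: returns the 16-bit code unit of each character
' in s as space-separated hex, padded to at least two digits.
Function fsHexDump(s)
    Dim i, sHex, sOut
    sOut = ""
    For i = 1 To Len(s)
        ' AscW returns a signed Integer; mask it to 0..65535 before Hex.
        sHex = Hex(AscW(Mid(s, i, 1)) And &HFFFF&)
        If Len(sHex) < 2 Then sHex = "0" & sHex
        sOut = sOut & sHex & " "
    Next
    fsHexDump = Trim(sOut)
End Function
```

The same helper can be applied to the string returned by ToUnicode to confirm that the round trip gives back a single code unit, 2018.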
-Paul Randall