Re: Reading text files with UTF-8 byte order mark



FriendOfBOB wrote:
I have a simple VBA application that reads text files using an ADO 2.x library. The application that creates the text file has been "enhanced" to support Unicode, and now places a UTF-8 byte order mark (BOM) in the first three bytes of the file, hex values EF BB BF. ADO 2.x doesn't seem to "speak" Unicode, and doesn't interpret these bytes as a BOM. Instead, the three characters are included in the first row, first field value ... a field name in my case.

All I can find is an ANSI versus OEM setting in the schema.ini file. Am I missing something, or does ADO 2.x reading text files simply not support Unicode?

While you can set the CharacterSet key in the schema.ini file to any
valid encoding (try 1200 for UTF-16 or 65001 for UTF-8), both the
OLEDB and the ODBC Text drivers choke on the UTF-8 BOM in the first
line.
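
To make the symptom concrete, here is a small Python sketch (not ADO, only an illustration of the bytes involved). The sample file content and the helper name first_field are made up for the example. It shows how the three BOM bytes end up glued to the first field name when a reader does not treat them as a BOM.

```python
import codecs

# Hypothetical file content: UTF-8 BOM (EF BB BF), a header row, one data row.
raw = codecs.BOM_UTF8 + "Name,City\r\nAlice,Berlin\r\n".encode("utf-8")


def first_field(data: bytes, encoding: str) -> str:
    """Decode the bytes and return the first field of the first line."""
    text = data.decode(encoding)
    return text.splitlines()[0].split(",")[0]


# Read as ANSI (Windows-1252): the BOM shows up as three visible characters.
ansi_view = first_field(raw, "cp1252")

# Read as plain UTF-8: the BOM becomes one invisible U+FEFF character.
utf8_view = first_field(raw, "utf-8")

# Read with a BOM-aware codec: the BOM is dropped and the name is clean.
bom_aware = first_field(raw, "utf-8-sig")

print(repr(ansi_view), repr(utf8_view), repr(bom_aware))
```

Only the last variant yields the plain field name; the first two are roughly what a reader without BOM handling would hand back as the first column name.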

You may put a (misleading) ColNameHeader=True in the schema.ini, if
your VBA application can write headers or dummies to the first row.
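
For reference, a schema.ini with the two settings mentioned above might look roughly like this. The file name data.txt and the Format line are assumptions for illustration only; adjust them to your actual file.

```ini
; Section name is the text file this entry applies to (hypothetical name).
[data.txt]
; Assumed comma-delimited layout.
Format=CSVDelimited
; Code page 65001 = UTF-8.
CharacterSet=65001
; Treat the first row (the one carrying the BOM) as column names.
ColNameHeader=True
```

The schema.ini has to sit in the same folder as the text file. Note that this only moves the BOM into the header row; it does not make the driver recognize it.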

