RE: Translated characters stored in database



Hello,

I tested your code with this binary data in the infile.txt:

20 21 22 23 24 25 26 27 28 29 30 7B 7C 7D 7E 7F 80 82 83 84 85 86 87 88 89
8A C0 C1 C2 C3 C4 C5 F0 F1 F2 F3 FE FF
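
To reproduce this input, the bytes above can be written to infile.txt with a short program. This is a minimal sketch only: the class name WriteInfile and the helper parseHex are made up for illustration and were not part of the test itself.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper class: writes the test bytes listed above to infile.txt.
class WriteInfile {
    // The 38 byte values from the post, as space-separated hex.
    static final String TEST_BYTES =
        "20 21 22 23 24 25 26 27 28 29 30 7B 7C 7D 7E 7F 80 82 83 84 85 86 87 88 89 "
        + "8A C0 C1 C2 C3 C4 C5 F0 F1 F2 F3 FE FF";

    // Convert space-separated hex pairs into raw bytes.
    static byte[] parseHex(String hex) {
        String[] parts = hex.trim().split("\\s+");
        byte[] out = new byte[parts.length];
        for (int i = 0; i < parts.length; i++) {
            out[i] = (byte) Integer.parseInt(parts[i], 16);
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        // No line terminator is written, so readLine() returns the whole file.
        Files.write(Paths.get("infile.txt"), parseHex(TEST_BYTES));
    }
}
```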

I then inserted the data into the database with the code below. The result
follows. Note that the characters are probably not displaying correctly, but
the hex values should:

select text, cast (text as varbinary(40)) from texttable

text
----------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------
!"#$%&'()0{|}~??????????ÀÁÂÃÄÅðñòóþÿ

0x20212223242526272829307B7C7D7E7F8082838485868788898AC0C1C2C3C4C5F0F1F2F3FEFF


This is what I would expect. Are you seeing different results? If so, what
data are you seeing?

Thanks,
Kamil

Kamil Sykora
Microsoft Developer Support - Web Data

Please reply only to the newsgroups.
This posting is provided "AS IS" with no warranties, and confers no rights.


Are you secure? For information about the Strategic Technology Protection
Program and to order your FREE Security Tool Kit, please visit
http://www.microsoft.com/security.



--------------------
| From: "=?Utf-8?B?QmlyY2hCYXJsb3c=?="
<BirchBarlow@xxxxxxxxxxxxxxxxxxxxxxxxx>
| Subject: Translated characters stored in database
| Date: Sun, 22 May 2005 21:26:07 -0700
| Lines: 43
|
| Is this a known issue? I am using the MS JDBC driver to connect to SQL
| Server 2000 SP3a. The driver version is either SP2 or SP3; I tested both
| with the same results. The database collation is Latin1. The JVM is Sun
| Java 1.3.1 or 1.4.2.
|
| When SendStringParametersAsUnicode=false is set in the connection string,
| the driver translates the characters in the 0x80 to 0x9f range. The
| translation does not occur when the parameter is set to true. I have also
| tested the Data Direct driver, and it performs no character translation
| with the parameter set to either true or false. I have enclosed a sample
| piece of code to illustrate the issue.
|
| I need to set the parameter to false because there is a significant
| performance hit when it is true. My application also requires support for
| the full Latin1 (Windows 1252) character set.
|
| It looks like the character translation is a result of dropping the upper
| byte of the Unicode character, which is 0x00 for all characters except
| 0x80 - 0x9f. If you look at the Unicode values of the Windows 1252
| character set, you will see that the 0x80 to 0x9f characters have a
| nonzero value in the upper byte, so simply truncating the upper byte
| produces a character translation.
| Example: Character 0x80 has the Unicode value 0x20AC and is translated to
| 0xAC in the database.
|
| table1:
| varchar(300);
|
| file: infile.txt is a binary file containing the character bytes 0x20 to 0xff
|
| Class.forName("com.microsoft.jdbc.sqlserver.SQLServerDriver");
| con = DriverManager.getConnection(
|     "jdbc:microsoft:sqlserver://myserver;SelectMethod=cursor;"
|     + "SendStringParametersAsUnicode=false;user=test;password=test;"
|     + "DatabaseName=TestDB");
|
| BufferedReader r = new BufferedReader(new FileReader("infile.txt"));
| String text = r.readLine();
|
| PreparedStatement pstmt = null;
| String sqlQuery = "insert into table1 (text) values (?)";
|
| pstmt = con.prepareStatement(sqlQuery);
| pstmt.setString(1, text);
| pstmt.execute();
|
| pstmt.close();
| pstmt=null;
| con.close();
| con=null;
|
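
The truncation suspected in the quoted post can be illustrated with a few lines of Java. This is only a sketch of the hypothesized mechanism, not the driver's actual code. The class and method names are hypothetical, and it assumes the JVM provides the windows-1252 charset.

```java
import java.nio.charset.Charset;

// Hypothetical illustration: decode a Windows-1252 byte to Unicode, then
// keep only the low byte, as the quoted post suspects the driver does.
class Cp1252Truncation {
    // Decode a single windows-1252 byte to its UTF-16 code unit.
    static int decode(int b) {
        String s = new String(new byte[] { (byte) b }, Charset.forName("windows-1252"));
        return s.charAt(0);
    }

    // Drop the upper byte of a 16-bit Unicode value.
    static int truncate(int codeUnit) {
        return codeUnit & 0xFF;
    }

    public static void main(String[] args) {
        int[] samples = { 0x41, 0x80, 0x85, 0xC0 };
        for (int b : samples) {
            int cp = decode(b);
            System.out.printf("byte 0x%02X -> U+%04X -> low byte 0x%02X%n",
                    b, cp, truncate(cp));
        }
    }
}
```

For 0x80 this gives U+20AC and a low byte of 0xAC, matching the example in the post, while bytes such as 0x41 and 0xC0 come back unchanged because their upper byte is 0x00.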
