Re: Unicode/UTF-8 decoding
- From: "Bill Nguyen" <billn_nospam_please@xxxxxxxx>
- Date: Tue, 5 Jun 2007 16:46:07 -0700
Göran,
I think you are correct. However, there is not much I can do, since I cannot
change the host server parameters.
I am using SQLyog to access MySQL remotely. What I need is to be able to
read the data in its correct encoding. Is that possible with .NET?
Thanks
Bill
"Göran Andersson" <guffa@xxxxxxxxx> wrote in message
news:eWl7rU4pHHA.4100@xxxxxxxxxxxxxxxxxxxxxxx
Bill Nguyen wrote:
I set UTF-8 as the default encoding in MySQL.
I don't really know how this works, but the IE and Firefox browsers can
decode it easily.
This is the test:
I put the lines below in an HTML document and viewed it in IE, and it
worked (make sure to set the encoding to UTF-8 in the View menu).
I have included test.htm for your testing. (The text is in Vietnamese.)
So I think what I need is a utility with the same function, which
might already be available out there. Any help is greatly
appreciated.
Bill
----------------
<html>
<head></head>
<body>
Virginia Hamilton Adair / Lâm Thá»< Mỹ Dạ
Lấp lánh há»"n thÆ¡ Viá»?t trên sân ga Tokyo chiá»?u cuá»'i nÄfm
</body>
</html>
"Göran Andersson" <guffa@xxxxxxxxx> wrote in message
news:%23rhR3M0pHHA.1776@xxxxxxxxxxxxxxxxxxxxxxx
Bill Nguyen wrote:
Below is some text I extracted from a MySQL database. How can I decode
it so that I can read it as Unicode?
Thanks
Bill
------------
Virginia Hamilton Adair / Lâm Thá»< Mỹ Dạ
Lấp lánh há»"n thÆ¡ Viá»?t trên sân ga Tokyo chiá»?u cuá»'i nÄfm
This text looks as if it has been decoded with a different encoding than
was used to encode it. It might be possible to recreate the data if you
know which encodings were used to encode and decode it. Then you might be
able to encode it back to its previous state and use the proper encoding
to decode it. There is a great risk that some data has been lost,
though, and that you can't recreate the original data at this stage.
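
The re-encoding idea described above can be sketched in a few lines. This is only an illustration in Python rather than .NET, and it assumes the wrong encoding was Windows-1252 (cp1252), which is a guess and not something confirmed in this thread. The function name repair_mojibake is made up for the example.

```python
def repair_mojibake(garbled: str, wrong_encoding: str = "cp1252") -> str:
    """Undo one round of UTF-8 bytes having been decoded as `wrong_encoding`.

    Assumption: the text was valid UTF-8 that got decoded with the wrong
    encoding exactly once, and no bytes were dropped along the way.
    """
    # Step 1: encode back to the previous state (the original bytes).
    raw = garbled.encode(wrong_encoding)
    # Step 2: decode those bytes with the proper encoding.
    return raw.decode("utf-8")


# Simulate the damage: UTF-8 bytes misread as Windows-1252.
original = "Lâm Thị Mỹ Dạ"
garbled = original.encode("utf-8").decode("cp1252")

print(garbled)                   # mis-decoded form
print(repair_mojibake(garbled))  # recovered text, if nothing was lost
```

Both steps raise an exception (UnicodeEncodeError or UnicodeDecodeError) when the assumption does not hold, which is usually a sign that data has already been lost.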
If you want to store Unicode strings in the MySQL database, it has to be
set up to use Unicode as its character set.
--
Göran Andersson
_____
http://www.guffa.com
------------------------------------------------------------------------
> Virginia Hamilton Adair / Lâm Th? M? D?
> L?p lánh h?n tho Vi?t trên sân ga Tokyo chi?u cu?i nam
You are doing exactly what I was talking about. If you read the data using
the wrong encoding, then save it using the same encoding, you can then
open it using the correct encoding, provided that the process hasn't
removed any data.
If you have set up your MySQL database to use Unicode and still get the
string out in that form, the error occurred before you even saved the
string in the database. What you have done is basically:
unicode -> bytes -> wrong encoding -> MySQL -> wrong encoding -> html ->
bytes -> browser -> unicode
While this gives the correct result for some strings, some byte values used
in UTF-8 don't represent a character by themselves in the wrong encoding,
so if you continue to store mis-decoded strings as Unicode, you will sooner
or later experience corrupted strings.
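
To illustrate how such corruption can arise, here is a small Python sketch (Python only for demonstration, not .NET). It again assumes the wrong encoding is Windows-1252, which is a guess. In Python's cp1252 codec the byte 0x81 has no character assigned, and the UTF-8 form of the Vietnamese letter U+1EC1 happens to end in that byte.

```python
# "chiều" written with an escape so the code point is unambiguous.
# U+1EC1 is encoded as the bytes E1 BB 81 in UTF-8.
word = "chi\u1ec1u"
utf8_bytes = word.encode("utf-8")

# A strict decode as Windows-1252 fails, because 0x81 is unassigned there.
try:
    utf8_bytes.decode("cp1252")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# A lenient decode substitutes U+FFFD, so the original byte is gone.
garbled = utf8_bytes.decode("cp1252", errors="replace")

# Reversing the process can no longer rebuild the original string.
restored = (
    garbled.encode("cp1252", errors="replace")
    .decode("utf-8", errors="replace")
)

print(strict_ok)
print(restored == word)
```

Once a replacement character has been stored in place of the original byte, no choice of encodings will bring the letter back.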
--
Göran Andersson
_____
http://www.guffa.com