Re: Unicode


From: Bart Duncan [MSFT] (bartd_at_online.microsoft.com)
Date: 05/24/04


Date: Mon, 24 May 2004 16:37:11 GMT

Some additional info you might find useful:

---------------
QUESTION: What is Unicode? Why should I use it?

ANSWER: Read the Books Online articles "Code Pages and Sort Orders",
"Using Unicode Data", "Unicode Data", and "Difference Between Unicode and
Character Sets". They describe what Unicode is, what the benefits are, and
how Unicode data differs from non-Unicode (character set) data.

---------------
QUESTION: What do I need to do to switch to Unicode on SQL Server?

ANSWER: On the server side this is relatively simple. Everywhere you store
or manipulate textual data, switch from the non-Unicode data types
CHAR/VARCHAR/TEXT to their Unicode counterparts NCHAR/NVARCHAR/NTEXT.
Examples of places where you may need to change data types include table
columns, stored procedure/trigger variables and parameters, CONVERT or CAST
to a character data type, and user-defined data types. Also look for uses
of the T-SQL functions ASCII() and CHAR() and substitute the Unicode
versions UNICODE() and NCHAR().
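
As a rough sketch of the changes described above (the table, column, and
variable names are hypothetical and only for illustration), the switch
might look like this:

```sql
-- Hypothetical table and column names, for illustration only.
CREATE TABLE dbo.CustomerNotes (
    NoteID  int           NOT NULL PRIMARY KEY,
    Author  nvarchar(50)  NOT NULL,  -- was varchar(50)
    Region  nchar(2)      NOT NULL   -- was char(2)
);

-- Variables and CAST/CONVERT targets change the same way.
DECLARE @author nvarchar(50);                -- was varchar(50)
SET @author = CAST(12345 AS nvarchar(50));   -- was CAST(12345 AS varchar(50))

-- UNICODE() and NCHAR() replace ASCII() and CHAR().
SELECT UNICODE(N'A') AS code_point,  -- 65
       NCHAR(65)     AS ch;          -- A
```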

T-SQL string manipulation functions have been enhanced to support Unicode
wherever possible. The following functions accept Unicode arguments,
respect the two-byte character boundaries of Unicode strings, and use the
SQL Server Unicode collation for string comparisons when the input
parameters are Unicode: CHARINDEX, LEFT, LEN, UPPER, LOWER, LTRIM, RTRIM,
PATINDEX, REPLACE, QUOTENAME, REPLICATE, REVERSE, STUFF, SUBSTRING, UNICODE.
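
A small sketch of a few of these functions on a Unicode string follows. The
variable name is made up for the example. Positions and lengths are counted
in characters, not bytes:

```sql
DECLARE @s nvarchar(20);
SET @s = N'Unicode';

SELECT CHARINDEX(N'code', @s) AS pos,       -- 4
       SUBSTRING(@s, 4, 4)    AS part,      -- code
       LEFT(@s, 3)            AS prefix,    -- Uni
       REVERSE(@s)            AS reversed;  -- edocinU
```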

String-related functions that do not work with Unicode: The SOUNDEX
algorithm is defined around English phonetic rules, so it won't be
meaningful on Unicode strings unless the string contains only the Latin
characters A-Z/a-z (this is due to limitations of the Soundex algorithm
itself, not SQL Server's implementation of the function). The ASCII()
function is explicitly defined as returning the non-Unicode character code
of the character passed in, so use the counterpart UNICODE() function for
Unicode strings where you would use the ASCII function on non-Unicode
strings. The same is true of the CHAR function; NCHAR is its Unicode
counterpart. These 3 functions (ASCII, CHAR, SOUNDEX) can be passed Unicode
parameters, but Unicode arguments will be implicitly converted to
non-Unicode strings (with possible loss of Unicode characters) before
processing, because these functions operate on non-Unicode strings by
definition. Also, be aware that the DATALENGTH function returns a byte
count, not a character count; if you are using DATALENGTH to determine
string length, you should be using the LEN function instead.
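
To make the LEN/DATALENGTH and ASCII/UNICODE differences concrete, here is a
minimal sketch. The variable names are hypothetical, and the ASCII() result
is deliberately not shown because it depends on the code page the value is
converted to:

```sql
DECLARE @u nvarchar(20), @omega nchar(1);
SET @u = N'abcd';
SET @omega = NCHAR(937);   -- GREEK CAPITAL LETTER OMEGA (U+03A9)

SELECT LEN(@u)        AS char_count,  -- 4
       DATALENGTH(@u) AS byte_count;  -- 8 (two bytes per character)

SELECT UNICODE(@omega) AS unicode_code,  -- 937
       ASCII(@omega)   AS ascii_code;    -- converted to non-Unicode first;
                                         -- result depends on the code page
```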

Whenever you use a Unicode string literal in a query you must prefix the
constant with a capital N. For example, N'abcd‚' will be treated as
Unicode, while 'abcd‚' will be converted to the non-Unicode code page of
the current database before processing, which will result in a loss of any
Unicode characters that don't have equivalents in the SQL Server code page.
Prefix all Unicode strings with N on the client (when sending data to SQL
Server, for example in INSERT statements) and in code that executes on the
server (in triggers or stored procedures, for example). See Q239530 INF:
Unicode String Constants in SQL Server Require N Prefix for details.
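
The effect of the N prefix can be seen with a short test like the one below.
This is only a sketch: the table variable and column name are made up, the
script has to be saved and sent in an encoding that preserves the Greek
letter, and what the unprefixed literal turns into depends on the code page
of the current database.

```sql
DECLARE @t TABLE (name nvarchar(20));

INSERT INTO @t (name) VALUES (N'Ω');  -- stays Unicode: U+03A9
INSERT INTO @t (name) VALUES ('Ω');   -- converted to the database code page
                                      -- first; may become a substitute
                                      -- character if Ω has no equivalent

-- The first row reports code point 937. The second row reports whatever
-- the code page conversion produced.
SELECT name, UNICODE(name) AS code_point FROM @t;
```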

--------------------
From: "SriSamp" <ssampath@sct.co.in>
References: <uaBNR2WQEHA.2216@TK2MSFTNGP12.phx.gbl>
Subject: Re: Unicode
Date: Mon, 24 May 2004 16:33:56 +0530
Message-ID: <erjj85XQEHA.1892@TK2MSFTNGP09.phx.gbl>
Newsgroups: microsoft.public.sqlserver.programming

What is Unicode: http://www.unicode.org/standard/WhatIsUnicode.html
NVARCHAR is used to store Unicode values and VARCHAR to store non-Unicode
data. NVARCHAR uses two bytes per character, so for the same string it
typically takes twice the storage of VARCHAR.

-- 
HTH,
SriSamp
Please reply to the whole group only!
http://www32.brinkster.com/srisamp
"Ramesh" <ramesh@mail.punecity.com> wrote in message
news:uaBNR2WQEHA.2216@TK2MSFTNGP12.phx.gbl...
> Dear all
> can anyone plz explain me
> What is Unicode? and the difference between datatype nvarchar and varchar.
>
> Thnx
> Regards
> Ramesh :)
>
>

