RE: Collation settings for ASCII code page
From: Bart Duncan [MSFT] (bartd_at_online.microsoft.com)
Date: 05/27/04
- In reply to: Rayha: "Collation settings for ASCII code page"
Date: Thu, 27 May 2004 00:12:06 GMT
When you select a collation in SQL you are determining several things:
a. The code page used to store non-Unicode data
b. The string sort and comparison behavior for non-Unicode data
c. The string sort and comparison behavior for Unicode data
A couple of examples:
Latin1_General_BIN - Uses a binary sort order for both Unicode and
non-Unicode data. Uses code page 1252 to store non-Unicode data.
Greek_CI_AS - Uses more "natural" sort algorithms for Unicode and
non-Unicode data. Uses code page 1253 (Greek) to store non-Unicode data.
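To make the difference between a binary sort and a more "natural" sort concrete, here is a rough sketch in Python. It is only an analogy: Python's sorting stands in for SQL Server's collation rules, and the word list is made up for illustration.

```python
# Sketch only: Python sorting is used as a stand-in for collation behavior.
words = ["apple", "Banana", "cherry"]

# Binary-style ordering: compare the raw code page 1252 byte values.
# Uppercase letters have lower byte values than lowercase ones.
binary_order = sorted(words, key=lambda w: w.encode("cp1252"))

# Case-insensitive ordering: a rough analogue of a _CI_ collation.
natural_order = sorted(words, key=str.casefold)

print(binary_order)   # ['Banana', 'apple', 'cherry']
print(natural_order)  # ['apple', 'Banana', 'cherry']
```

A real _CI_AS collation applies linguistic rules that go well beyond case folding, so treat this as an illustration of the idea rather than of the exact ordering SQL Server would produce.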
The code page component of a collation determines which subset of
characters you can store without using Unicode. Most code pages are
single-byte, which means that each character is stored in a single byte
(8 bits) -- there are also double-byte code pages for Asian languages but
I'll ignore them for the sake of simplicity. 8 bits lets you represent
256 different values, so every single-byte code page can be used to store
at most 256 distinct characters. The whole point of a code page is to
define the meaning of each of those 256 available character codes. For
example, code point 0xD9 (decimal 217) means GREEK CAPITAL LETTER OMEGA
in code page 1253, but LATIN CAPITAL LETTER U WITH GRAVE in code page
1252.
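The 0xD9 example can be checked outside SQL Server. The sketch below uses Python's built-in "cp1252" and "cp1253" codecs, which I am assuming match the Windows code pages of the same numbers; the helper name describe is made up for this example.

```python
import unicodedata

def describe(byte_value, code_page):
    """Return the Unicode name of a single byte decoded under a code page."""
    char = bytes([byte_value]).decode(code_page)
    return unicodedata.name(char)

# The same stored byte means different characters in different code pages.
print(describe(0xD9, "cp1253"))  # GREEK CAPITAL LETTER OMEGA
print(describe(0xD9, "cp1252"))  # LATIN CAPITAL LETTER U WITH GRAVE
```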
There is no such thing as an "ASCII code page" in SQL Server. ASCII is a
subset of *every* code page used in SQL. Specifically, ASCII defines the
contents of the 7-bit character space (character codes 0 through 127).
This range includes the assignment of the standard English
characters A-Z, a-z, and 0-9; and these characters are always located at
the same code points in every code page that SQL supports. The
differences between the code pages only affect the "extended" character
codes from 128 through 255.
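One way to see this for yourself is to compare two code pages byte by byte. This is a minimal sketch using Python's "cp1252" and "cp1253" codecs as stand-ins for the code pages SQL Server uses; the function name differing_bytes is hypothetical.

```python
def differing_bytes(code_page_a, code_page_b):
    """List byte values that decode to different characters in two code pages."""
    diffs = []
    for value in range(256):
        raw = bytes([value])
        # errors="replace" turns unassigned bytes into U+FFFD instead of raising.
        char_a = raw.decode(code_page_a, errors="replace")
        char_b = raw.decode(code_page_b, errors="replace")
        if char_a != char_b:
            diffs.append(value)
    return diffs

diffs = differing_bytes("cp1252", "cp1253")
print(min(diffs) >= 128)  # True: every difference is in the extended range
print(0xD9 in diffs)      # True: the omega / U-with-grave byte from earlier
```

The first 128 byte values never show up in the list because both code pages agree with ASCII there.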
Hope this helps a bit, although I'm sorry that it doesn't provide a more
direct answer to your question. Maybe someone at IBM can clarify exactly
what their documentation means when they say that they require an "ASCII
code page" at the database layer. If all it means is that the code page
has to agree with the ASCII standard for character codes 0-127, then it
shouldn't matter what code page you select in SQL because all of them
meet this requirement.
Regarding your second question, there is no remapping utility provided
with SQL that will automatically update individual character codes and
set them to a different manually-defined value.
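Since nothing is built in, the remapping would have to be done by a script of your own. The following is only a sketch of what that could look like in Python, working on text that has already been pulled out of the database. The function names, the sample string, and the mapping table are all hypothetical; the real mapping would come from your own list of illegal characters.

```python
def find_non_ascii(text):
    """Return (position, character) pairs for characters outside codes 0-127."""
    return [(i, ch) for i, ch in enumerate(text) if ord(ch) > 127]

def remap(text, mapping):
    """Replace characters according to a manually defined mapping."""
    return text.translate(str.maketrans(mapping))

# Hypothetical mapping and sample value, for illustration only.
mapping = {"Ù": "U", "é": "e"}
sample = "Résumé Ùber"

print([pos for pos, _ in find_non_ascii(sample)])  # [1, 5, 7]
print(remap(sample, mapping))                      # Resume Uber
```

Characters that are not listed in the mapping are left untouched, so running find_non_ascii on the result shows whether anything was missed.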
Bart
------------
Bart Duncan
Microsoft SQL Server Support
Please reply to the newsgroup only - thanks.
This posting is provided "AS IS" with no warranties, and confers no
rights.
--------------------
Thread-Topic: Collation settings for ASCII code page
thread-index: AcRDLhlVtb6EW6vSTvSOGgA9kLNuVA==
X-WN-Post: microsoft.public.sqlserver.dts
From: "=?Utf-8?B?UmF5aGE=?=" <raymond.haugli@kongsberg.com>
Subject: Collation settings for ASCII code page
Date: Wed, 26 May 2004 07:31:14 -0700
Lines: 13
Message-ID: <D8A468AC-1346-4703-8210-DFF12F68DBA4@microsoft.com>
Hi.
I'm using an IBM Rational product called ClearQuest with our MS SQL
Server 2000. By default my databases have the collation set to
'Latin1_General_BIN'. I discovered that ClearQuest requires an ASCII code
page to run properly from both Windows and Unix clients.
I'm a bit new to this admin stuff for SQL Server, so bear with me.
1st problem: Which collation do I choose to get an ASCII code page? There
is no collation with a name that fits ASCII.
2nd problem: After examining the databases with an IBM Rational tool, I
found nearly 2000 instances of characters that are illegal according to
the ASCII code page. Is there a mapping tool included in MS SQL Server
2000? I would rather not do this manually. I have all the illegal
characters listed in XML files, if that helps.
An alternative solution is to export the complete database to Excel (or a
text file), convert all illegal characters, and import it back into a new
database. But I would still need to set an ASCII code page for this database.
br /Raymond