Re: Using a character GUID for the primary key - Any issues to know ab

Tech-Archive recommends: Repair Windows Errors & Optimize Windows Performance

From: Greg Linwood (g_linwoodQhotmail.com)
Date: 10/12/04


Date: Wed, 13 Oct 2004 08:50:40 +1000

Hi David

You used the word "disks" (plural), but the use of an identity based key on
a clustered index ensures that only one disk will ever be used - the disk
that the current page at the end of the table resides on. GUIDs can use
multiple disks concurrently and due to their random nature, use all disks in
a multi disk system evenly. In my experience, the single-disk'edness of
identity based clustered keys hurts performance once the numbers start
getting big.

There's another big gotcha waiting just around the corner as well if you use
identities. If you do start hitting really high levels of transactional
activity, you might want to start using partitioned views or distributed
partitioned views to spread disk load (eg via other partitioning columns
such as year, lastname etc). What they don't teach you in school is that
using Identity stops you from later using partitioning if you need to in SQL
2000 (not in SQL 05) because the view based partitioning approach available
in SQL 2000 requires multiple base tables & which table's going to hold the
identity once the table's partitioned? The reason this problem goes away in
SQL 2005 is that table based partitioning (rather than view based
partitioning) is introduced.

There are other issues that GUIDs can solve - their randomness, if used with
appropriate fill factors & pad_index is the most effective way of avoiding
table / index fragmentation. Identities unbalance the btree structure of
index nodes because inserts don't occur throughout the table - they're
simply monotonically added to the end. Hence, the index needs constant
maintenance. If you use a random key such as GUID (or other) and leave a
little space in the pages, you won't have to worry so much about
fragmentation / index re-builds. This can be very important in large scale
sites given SQL 2000's current limitations with index rebuilding (off line).

I actually use both identities and GUIDs regularly, but I've constantly
witnessed GUIDs getting an unwarranted bad wrap in newsgroups. The two major
draw-backs that I worry about with GUIDs are purely their size (twice
bigint, 4x int) and that SQL Server blows up when they're sorted in order
by's.

Regards,
Greg Linwood
SQL Server MVP

"David Gugick" <davidg-nospam@imceda.com> wrote in message
news:esc8URJsEHA.3520@TK2MSFTNGP11.phx.gbl...
> chris.cichocki@nospam4me.com wrote:
>> We are about to begin a large development effort using .Net and SQL
>> Server, and we are wondering about how to implement unique id's for
>> all our tables.
>>
>> The two basic choices are using an auto-incrementing numeric field,
>> or a character field that contains a GUID (36 long character field
>> that is created through the .Net Framework). I know SQL Server has a
>> data type of "unique identifier", but we would actually set the data
>> type to "nchar" and then take the value of the GUID that we get from
>> .Net and assign the value.
>>
>> Also, our DBA's have a policy of creating the primary key as a
>> clustered index. The tables could get quite large (10's of millions
>> of rows).
>>
>> The question is, are there any performance or other issues to be
>> concerned about when using a 36 long nchar field as the primary key
>> value in a clustered index?
>>
>> Thanks in advance!
>> Chris
>
> Use an INT or BIGINT. No reasons to use a GUID based on your comments.
> Using an IDENTITY will only save space, increase row density, and help
> insert operations by keeping the disk heads in the same area of the
> disk (if a clustered index is used). I don't believe hot-spots are much
> of an issue any more now that insert row locking is used.
>
> Plus, every FK will be smaller as well.
>
>
> --
> David Gugick
> Imceda Software
> www.imceda.com
>



Relevant Pages

  • Re: Using a character GUID for the primary key - Any issues to know ab
    ... > - the disk that the current page at the end of the table resides on. ... > GUIDs can use multiple disks concurrently and due to their random ... > other partitioning columns such as year, ... The reason this problem goes away in SQL ...
    (microsoft.public.sqlserver.server)
  • Re: Shared Disk and Active Node in SQL 2000
    ... I would place whatever local disk file you have on a clustered file share resource. ... Principal SQL Infrastructure Consultant ... point in time there will be only one active node that hosts SQL instance. ... The application connects to SQL instance and runs T-SQL query "select * from ...
    (microsoft.public.sqlserver.clustering)
  • Re: New to clustering and need some clarification on disk resource
    ... So I have seperate cluster groups for MSCS, MSDTC, SQL, SQL Analysis. ... All have their own physical disk and MSCS and MSDTC have network name and IP ... Microsoft SQL Server MVP ... Is the shared disk for the cluster groups that I will install SQL Server ...
    (microsoft.public.sqlserver.clustering)
  • Re: "True" hardware requirements for SPPS
    ... And yes 4GB is all that Windows 2003 Server Standard Edition will use. ... > Also are you talking about SBS Premium which includes SQL 2000? ... > control or not within SPS document libraries will have a huge impact on ... I would recomend on your Box B that you at least mirror your disk. ...
    (microsoft.public.sharepoint.portalserver)
  • Re: MSDTC in Windows 2008 Cluster...
    ... and just added the Data disk. ... I then created the MSDTC resource in a new ... clicked and moved it to my SQL application group in the cluster. ...
    (microsoft.public.sqlserver.clustering)