Thursday, March 29, 2012

Datatype <uniqueidentifier>

When creating table, now we usually include the colunm(Id) whose datatype is uniqueidentifier.

My first question is why choosing datatype as uniqueidentifier instead of int[identity(1,1)] if no replication needed.

My 2nd question is for the uniqueidentifier column, which index is appropriate for the column, cluster or non-cluster?

And the third question is how to set the default for the uniqueidentifier.which function is better, NewID() or NEWSEQUENTIALID()?

Thanks in advance.

uniqueidentifier is a type of GUID which means that it is unique. It is larger than int style fields and unless you are using NEWSEQUENTIALID() the new values are going to be random (so don't cluster on it unless you use that - or your insertion point will be random in your index). Even with NEWSEQUENTIALID() the values will not normally be consecutive, just increasing.

Generally I would use an identity column for a single table with no multi-source issues (uniqueidentifier is good if you are taking records from various places and merging them). It is smaller (4bytes for int, 8bytes for bigint as opposed to 16bytes for uniqueidentifier), and it is much easier to type in a SQL statement. It is also easier for the processor and memory access to deal with as a value.

As noted it is not a good idea to cluster on an random uniqueidentifier. Even if it increasing why do you want to cluster on it. Generally you do not query by a range of GUIDs. As clustering determines the grouping of the records on disk it is normally better to cluster by something that matches your common querying - to minimise the amount of disk access to return the record. Single record access (if you do retrieve by GUID) is not really affected by clustering as you are only after one record on one datapage.

|||

Here is some additional information that will help you resolve your question. (Generally, avoid GUIDs in situations where Replication is not involved.

GUID -Identity and Primary Keys
http://sqlteam.com/item.asp?ItemID=2599

GUID -Is not Always GOOD
http://bloggingabout.net/blogs/wellink/archive/2004/03/15/598.aspx

GUID -The Cost of GUIDs as Primary Keys
http://www.informit.com/articles/article.asp?p=25862&rl=1

GUID -Uniqueidentifier vs. IDENTITY
http://sqlteam.com/item.asp?ItemID=283

|||

1. Identity is better to use if you never planned for replication. It is one of the biggest datatype (16 Bytes) in the sql server. If you are going to use to identitfy your row then better use it Identity (int/smallint/bigint,1,1) is more enough. But GUID allocates 16 Bytes on each row.

2. Creating a index on GUID is really bad idea unless its required. Since the datatype is huge the indexes will occupy more space and the manipulation also slow.

3. Setting default value ColumnName UniqueIdentifier Default NEWID()