Re: Snapshot Relational Data into New Tables
From: Raterus (moc.liamtoh_at_suretar.reverse)
Date: 09/28/04
- Next message: Roji. P. Thomas: "Re: Select statement and conditional value"
- Previous message: Henrik Dahl: "Re: Does interger ID columns force serialization at insertion time"
- In reply to: David Portas: "Re: Snapshot Relational Data into New Tables"
- Messages sorted by: [ date ] [ thread ]
Date: Tue, 28 Sep 2004 12:25:01 -0400
Thank you very much for all your help on this. You've given me lots to take in and right now I'm trying to implement it in my current situation. So far it is going good!
--Michael
"David Portas" <REMOVE_BEFORE_REPLYING_dportas@acm.org> wrote in message news:KvqdnSXS5us_6sXcRVn-rA@giganews.com...
> The problem is more fundamental than "how do I find a key?". Ask yourself
> "what information do I need to store about a customer?" If you are only
> storing customer names then apparently you don't need to differentiate
> between two customers of the same name. If you did you would presumably
> capture information to allow you to distinguish betwen them.
>
> Now consider a simplified version of what a naive schema for a Customer
> table might look like:
>
> CREATE TABLE Customers (customer_id INTEGER IDENTITY PRIMARY KEY,
> customer_name VARCHAR(50) NOT NULL)
>
> Maybe you see something like this in your data:
>
> customer_id customer_name
> ----------- --------------------
> 1 George W Bush
> 2 John Doe
> 3 John Doe
> 4 John Doe
>
> But why did you store John Doe multiple times? The IDENTITY key is different
> but that isn't telling you any extra information about Mr John Doe. Was it
> the same John Doe? Or two of them on different occassions? Or three of them?
> If you cared you would add something to identify them. If you're only
> interested in the name then the following will do just as well:
>
> CREATE TABLE Customers (customer_id INTEGER IDENTITY PRIMARY KEY /* The
> surrogate key */, customer_name VARCHAR(50) NOT NULL UNIQUE /* The natural
> key */)
>
> customer_id customer_name
> ----------- --------------------
> 1 George W Bush
> 2 John Doe
>
> Or, if it's important to know how many times John Doe was a customer, add a
> Trasaction Count column.
>
> Hopefully it's obvious from this example that the surrogate key should
> always be redundant because there should always be another key. Expert
> opinions will divide on just when you should use IDENTITY or whether it is
> necessary at all. What's for sure is that a table whose *only* key is
> IDENTITY lacks any data integrity and that has dire consequences when you
> try to make sense of the data in queries.
>
> > Are there situations where natural keys are impossible
>
> The only situation I can think of is stuff like server and application logs.
> These are event driven and it's often hard to enforce the data integrity in
> real time. I view this as an ETL problem rather than a data modelling one
> but sometimes it can be difficult to avoid if you use a table as an error
> log for example.
>
> --
> David Portas
> SQL Server MVP
> --
>
>
- Next message: Roji. P. Thomas: "Re: Select statement and conditional value"
- Previous message: Henrik Dahl: "Re: Does interger ID columns force serialization at insertion time"
- In reply to: David Portas: "Re: Snapshot Relational Data into New Tables"
- Messages sorted by: [ date ] [ thread ]
Relevant Pages
|