Re: Snapshot Relational Data into New Tables

From: Raterus (moc.liamtoh_at_suretar.reverse)
Date: 09/28/04


Date: Tue, 28 Sep 2004 12:25:01 -0400

Thank you very much for all your help on this. You've given me lots to take in and right now I'm trying to implement it in my current situation. So far it is going good!

--Michael

"David Portas" <REMOVE_BEFORE_REPLYING_dportas@acm.org> wrote in message news:KvqdnSXS5us_6sXcRVn-rA@giganews.com...
> The problem is more fundamental than "how do I find a key?". Ask yourself
> "what information do I need to store about a customer?" If you are only
> storing customer names then apparently you don't need to differentiate
> between two customers of the same name. If you did you would presumably
> capture information to allow you to distinguish betwen them.
>
> Now consider a simplified version of what a naive schema for a Customer
> table might look like:
>
> CREATE TABLE Customers (customer_id INTEGER IDENTITY PRIMARY KEY,
> customer_name VARCHAR(50) NOT NULL)
>
> Maybe you see something like this in your data:
>
> customer_id customer_name
> ----------- --------------------
> 1 George W Bush
> 2 John Doe
> 3 John Doe
> 4 John Doe
>
> But why did you store John Doe multiple times? The IDENTITY key is different
> but that isn't telling you any extra information about Mr John Doe. Was it
> the same John Doe? Or two of them on different occassions? Or three of them?
> If you cared you would add something to identify them. If you're only
> interested in the name then the following will do just as well:
>
> CREATE TABLE Customers (customer_id INTEGER IDENTITY PRIMARY KEY /* The
> surrogate key */, customer_name VARCHAR(50) NOT NULL UNIQUE /* The natural
> key */)
>
> customer_id customer_name
> ----------- --------------------
> 1 George W Bush
> 2 John Doe
>
> Or, if it's important to know how many times John Doe was a customer, add a
> Trasaction Count column.
>
> Hopefully it's obvious from this example that the surrogate key should
> always be redundant because there should always be another key. Expert
> opinions will divide on just when you should use IDENTITY or whether it is
> necessary at all. What's for sure is that a table whose *only* key is
> IDENTITY lacks any data integrity and that has dire consequences when you
> try to make sense of the data in queries.
>
> > Are there situations where natural keys are impossible
>
> The only situation I can think of is stuff like server and application logs.
> These are event driven and it's often hard to enforce the data integrity in
> real time. I view this as an ETL problem rather than a data modelling one
> but sometimes it can be difficult to avoid if you use a table as an error
> log for example.
>
> --
> David Portas
> SQL Server MVP
> --
>
>



Relevant Pages

  • Re: Snapshot Relational Data into New Tables
    ... storing customer names then apparently you don't need to differentiate ... But why did you store John Doe multiple times? ... The only situation I can think of is stuff like server and application logs. ...
    (microsoft.public.sqlserver.programming)
  • Re: Combining Values
    ... The customer requires it. ... Commerce Server requires it. ... John Doe | Address 2 ... Do this in the reporting layer if at ...
    (comp.databases.ms-sqlserver)
  • Re: Combining Values
    ... The customer requires it. ... Commerce Server requires it. ... the client wants. ... John Doe | Address 2 ...
    (comp.databases.ms-sqlserver)
  • SubReport - Lined Up with Main Report Detail - Repeats any Multivalues??!?!?!
    ... In the main report, I put the customer name and address, and the Order ... John Doe 2/2/2007 ... It lists the three ...
    (microsoft.public.access.reports)
  • Re: [fw-wiz] PIX Logging Analysis
    ... I use ipaudit-web http://ipaudit.sourceforge.net/ipaudit-web/ ... Snort is good but you will get alot of false alarms that if given to ... your customer will cause panic. ... msyslog has worked really well to examine the logs through a php web ...
    (Firewall-Wizards)