Re: Validating Surrogate Keys
From: Dave Wickert [MSFT] (dwickert_at_online.microsoft.com)
Date: 11/09/04
- Next message: Dave Wickert [MSFT]: "Re: One To Many Dim Design Question"
- Previous message: Dave Wickert [MSFT]: "Re: track down table usage"
- In reply to: TP: "Re: Validating Surrogate Keys"
- Next in thread: Desmond: "Re: Validating Surrogate Keys"
- Messages sorted by: [ date ] [ thread ]
Date: Tue, 9 Nov 2004 15:53:50 -0800
Ref: the importance of partitioning -- two resources.
1) There is an entire chapter in the SQL Server 2000 Resource Kit on this
topic (Chapter 18: Using Partitions in a SQL Server 2000 Data Warehouse) --
also at:
http://msdn.microsoft.com/library/default.asp?url=/library/techart/partitionsindw.htm
2) SSAS Performance Guide:
http://www.microsoft.com/technet/prodtechnol/sql/2000/maintain/ansvcspg.mspx
-- Dave Wickert [MSFT] dwickert@online.microsoft.com Program Manager BI SystemsTeam SQL BI Product Unit (Analysis Services) -- This posting is provided "AS IS" with no warranties, and confers no rights. "TP" <thom@pantazi.net> wrote in message news:7d353aba.0411090339.495ccde7@posting.google.com... > Dave, > > Thanks for the great ideas. We do not delete from our dimensions so > the timestamp function would absolutely work. Having said that, my > DBA mentioned using partitioned data. We have never done this, can > you point me to a good resource describing data partitioning on SQL > 2000? > > Again, I appreciate your input. One final question that I know I am > going to be asked is "Is this method an industry standard?" If it is, > could you possibly refer to support documentation? If not, do you > know if there is a standard method for ensuring referential integrity > in data warehouses? > > Thom > > "Dave Wickert [MSFT]" <dwickert@online.microsoft.com> wrote in message news:<uU038FhxEHA.536@TK2MSFTNGP10.phx.gbl>... > > Sure. Assuming that have an index built on the dimension tables using the > > surrogate keys, it should be fairly straightforward to do an outer join and > > look for NULL keys being returned. Since you only really want to know if it > > exists in the dimension table, make sure that you only use the surrogate > > itself -- in which case, it should be an index scan of the dimension > > table -- which should run quickly. > > > > Assuming that you never delete dimension table entries, then you could have > > it run even faster if you timestamp the fact table. Once you've scanned a > > fact table entry, you never need to go back and check it again (that would > > be the best case). > > > > Alternatively you could look at partitioning your fact tables. Again, this > > would be a mechanism to ensure that you only scan a fact table record once. > > If partitioned into a separate table, for example, one table per month, then > > you would only need to validate the latest month. > > > > Just an idea. > > > > -- > > Dave Wickert [MSFT] > > dwickert@online.microsoft.com > > Program Manager > > BI SystemsTeam > > SQL BI Product Unit (Analysis Services) > > -- > > This posting is provided "AS IS" with no warranties, and confers no rights. > > > > "TP" <thom@pantazi.net> wrote in message > > news:7d353aba.0411081958.4b3879a7@posting.google.com... > > > Thanks, Dave. We aren't using Analysis Services. We are using Cognos > > > to build our cubes, it doesn't such a feature. What do you think of > > > the script idea? > > > > > > "Dave Wickert [MSFT]" <dwickert@online.microsoft.com> wrote in message > > news:<ekhnpwexEHA.3376@TK2MSFTNGP12.phx.gbl>... > > > > Using Analysis Services, we fully expect that RI will not necessarily be > > > > in-place with your DW. Thus when we are building cubes from this data, > > > > Analysis Services has an optional exception reporting facility where it > > will > > > > log off to a separate file any records which are processed into cubes > > with > > > > missing dimension entries. This allows you to collect (and report on) > > which > > > > facts have missing surrogate keys. > > > > -- > > > > Dave Wickert [MSFT] > > > > dwickert@online.microsoft.com > > > > Program Manager > > > > BI SystemsTeam > > > > SQL BI Product Unit (Analysis Services) > > > > -- > > > > This posting is provided "AS IS" with no warranties, and confers no > > rights. > > > > > > > > "TP" <thom@pantazi.net> wrote in message > > > > news:7d353aba.0411081318.645fafc5@posting.google.com... > > > > > We have a small data warehouse (less than 50 million rows in roughly > > > > > 75 gigabytes) however, it is growing steadily. We have noticed that > > > > > we get much better performance on queries when there are no foreign > > > > > key constraints. We have demonstrated a 14% improvement. I am aware > > > > > that many data warehouses are designed without foreign key constraints > > > > > to improve the load performance of the ETL process. We have yet to > > > > > need that, as a matter of fact until now, we have used a script to > > > > > build the constraints before the ETL process and drop them after it. > > > > > > > > > > The process of rebuilding the constraints is time consuming. As more > > > > > data is put into the warehouse the time to build the constraints will > > > > > continue to become longer. At some point we anticipate it impacting > > > > > our batch window. It has been my recommendation that we completely > > > > > remove the constraints and leave them off. > > > > > > > > > > Since our use for the constraints is to check the integrity of the ETL > > > > > process, I was wondering if anyone has a methodology for validating > > > > > the surrogate keys besides building foreign key constraints. We have > > > > > considered building a series of test scripts that will perform selects > > > > > against the data looking for invalid key values. > > > > > > > > > > Thanks, > > > > > Thom
- Next message: Dave Wickert [MSFT]: "Re: One To Many Dim Design Question"
- Previous message: Dave Wickert [MSFT]: "Re: track down table usage"
- In reply to: TP: "Re: Validating Surrogate Keys"
- Next in thread: Desmond: "Re: Validating Surrogate Keys"
- Messages sorted by: [ date ] [ thread ]
Relevant Pages
|