RE: ETL / Data mart / Cube question

Tech Tip: Click here to run a free scan for Windows Errors and optimize PC performance



Hi Joe,

I'm involved in a Data Warehouse project. Client is already invested in SQL
Server 2005.

Directly coming to the point..
You must have setup server(s) for your data warehouse. I wanted to know
whether you were involved in Server Sizing / Data Sizing. If yes, then can
you please guide me as to which method / template did you use to size your
data and based on it what hardware (with configuration) did you suggest /
decide?

Please help this is very critical and urgent!

Regards,
Bharat Tamhankar


"Joe" wrote:




We are in process of using new Kimball toolkit book as our guide
to create another DM. Since we only have one data source, we think we
need to pull in all the tables into the staging area. At first we wanted
to just pull in minimal data to populate the Dimension/Fact tables and any
information needed to derive columns.

However the business area wants a complete snapshot of ALL THE
data for this given subject matter. Their reason is that sometimes our
CUBES don't have all the information they want, and they want to be able
to query the data mart tables against the staging area tables using SQL or
Access. Since our data mart load times is first Saturday of the month -
this argues in favor of this. If the load process was every day or once a
week - it might be better to just query across the data mart and the
reporting (replicated production) database - the the data mart tables
would be older than the information they were linking against on the
replicated report DB using the 1st Saturday approach.



Questions:

1. Is this the correct approach?

2. According to a Microsoft diagram, the load goes from Source to
Staging to OLAP DB - which is really just another OLTP DB but the tables
are modeled as Dimension/Fact tables, right?

3. Our existing data mart which everyone mostly uses CUBES to
access, has a big manual step. After all the ETL happens and the Dim/Fact
tables have been updated - we are having to open up SSAS and process the
cubes, the first business day after the 1st Saturday, to get them
refreshed - can't this be automated within SSIS?

4. There have been some major production changes to our existing
data source for our existing production data mart. All the ETL was done
on SQL2k with stored proc's and DTS was only used as the policeman to
control the flow. All the SCD processes were done and are working fine
using SP's. We now have to make changes and are faced with having two
complete different systems to manage - one with an ETL using the old
method and the 2nd using SSIS. Some of the CUBES from the original data
mart are going to be combined with our new cubes we are creating in our
new system. To try and make it clearer - we have for example a Client
dimension and a Client fact table that first is refreshed monthly with the
old ETL, and now our new subject matter is now also going to affect the
Client dimension and Client fact table that was created the first project.
The old production changes it appears may be many weeks to adjust using
the old ETL methods - so it begs the question, since all the ETL is
working fine with DTS as flow master - is it perhaps better to get it all
working with SSIS calling all the SP's and incorporate all our new ETL
within SSIS using the new features that SSIS can offer?

5. When adding a 3rd subject matter data mart - is it a 3rd ETL
package or just enhancing the one we have?







____________________

Joe Horton

SQL 2005 Business Intelligence

Aging & Disability Services Administration





.



Relevant Pages

  • Re: ETL / Data mart / Cube question
    ... CUBES don't have all the information they want, and they want to be able ... Since our data mart load times is first Saturday of the month - ... After all the ETL happens and the Dim/Fact ... refreshed - can't this be automated within SSIS? ...
    (microsoft.public.sqlserver.datawarehouse)
  • Re: ETL system
    ... BULK INSERT'd them into the ETL, and then did INSERT INTO...SELECT into my ... Data Mart. ... > a different server than the destination? ...
    (microsoft.public.sqlserver.datawarehouse)
  • Re: ETL system
    ... > The way I've implimented such is BCP'd out the data from the OLTP system, ... > BULK INSERT'd them into the ETL, and then did INSERT INTO...SELECT into my ... > Data Mart. ...
    (microsoft.public.sqlserver.datawarehouse)