Re: Newbie dim table question
- From: "Phil" <phil.austin@xxxxxxxxxxxxxxxxxxxxxxx>
- Date: Mon, 7 Apr 2008 16:32:47 +0100
Having a 1:1 relationship sort of defeats the purpose of having a dimension
table tbh - its still a relational database after all. Your end users won't
thank you for it either!
You should cetainly choose to reduce the number of rows in a dimension
table, having large dimension tables are the cause of some of the biggest
headaches in data warehousing.
How you populate the dimension table depends on the level of granularity
you've decided on (ie what is represented by one row in the dimension table)
which, depends on what the dimension itself represents. In the classic
example of a Date dimension each row usually represents one day so you have
365 rows for each year (366 for leap years of course). You might have
millions of fact table rows in a day but only one row for a Date dimension.
Other classic types of dimension granularity would be a Product dimension
with one row per product or an Employee dimension with one row per employee.
Which dimensions you choose to create should reflect the requirements of
your users as much as possible.
Generally its wisest to choose a level of granularity that gives you a lot
of detail - ie choose the lowest possible. That is almost always an order of
magnitude smaller than your fact table. The level of detail available will
obviously depend on your source data - so doing a SELECT DISTINCT on each
column of source data is a good place to start, once you've worked out what
dimensions you need.
Dimensional modelling (which is what you're essentially asking about) is a
huge topic, but one of its best proponents is Ralph Kimball - its worth
looking out his 'Data Warehouse Toolkit' on amazon here:
http://www.amazon.co.uk/Data-Warehouse-Toolkit-Complete-Dimensional/dp/0471200247/ref=pd_bbs_sr_1?ie=UTF8&s=books&qid=1207582270&sr=8-1.
HTH
--
Phil
http://www.clarity-integration.com
http://www.phil-austin.blogspot.com
"Carl Henthorn" <CarlHenthorn@xxxxxxxxxxxxxxxxxxxxxxxxx> wrote in message
news:70017C02-BC13-4FBC-A556-1796415403D5@xxxxxxxxxxxxxxxx
I have a table that I am converting to a cube with almost 1 million rows in
it. I am curious if in the dimension table, is it good design to have a
1:1
relationship with the fact table rows, or should I distinct the data in
the
dimension table, and point to it from the fact table?
thanks!!
.
- Prev by Date: Updating data in a data warehouse
- Next by Date: Storing natural keys in the dimension tables, to aid ETL matching
- Previous by thread: Updating data in a data warehouse
- Next by thread: Storing natural keys in the dimension tables, to aid ETL matching
- Index(es):
Relevant Pages
|
Loading