Re: UDM and Star Schema
- From: Nestor <n3570r@xxxxxxxxx>
- Date: Thu, 23 Aug 2007 15:09:03 -0700
Hi Jeje,
Thanks for the recommendation, I've been ponder about this and thought that's the best approach as well. It's great to hear from someone who agrees with the design. At the end of the day there should be one tha caters to real time reporting and another for historical analysis reporting. It's a pity that the vision of 'one version of truth' might have to be compromised in this case though.
Cheers.
On Wed, 22 Aug 2007 17:58:46 -0700, Jeje <willgart@xxxxxxxxxxx> wrote:
Like marco says,
use views in your source database to apply some business rules and cleansing process.
for the choice of MOLAP or ROLAP...
historical partitions generally are stored in MOLAP mode, because you don't update them frequently.
the "Today" partition can be in ROLAP mode and use indexed views.
But I recommend the low latency MOLAP mode. provides good performance by keeping the cache longer, but you suffer few seconds of latency between an update and the cube update.
if you want to be more realtime the HOLAP mode provides updated result all the time.
SCD dimension should not impact historical data. but your business key must be unique other the time to insure usage of both real time and historical content.
but regarding the data cleansing process you have to do and other technical constraints.
Try to split your solution in 2 projects:
Real time project --> contains only "today" data, there is no SCD dimension
Historical project -> contains all other data and the SCD dimensions
generally doing analysis against the history will not answer the same questions as the real time access.
"Nestor" <n3570r@xxxxxxxxx> wrote in message news:op.txgz9yzc83lx0t@xxxxxxxxxxxxxxxxxxxxxHi Jeje,
Thanks for the recommendation. The tricky part is will need the system to fulfil real time scenarios, have the capability to store historical data as well as being able to due with very large SCDs.
Proactive caching when set up properly does changes from MOLAP to ROLAP and to MOLAP again according to set up (some reference http://msdn2.microsoft.com/en-us/library/ms174769.aspx). I suppose at the end of the day I might need to identify what needs to be real time (hence ROLAP structure) and what needs to be historical + scheduled (hence MOLAP and Snowflaked).
I did some intensive research to UDM and basically 'concluded' that the concept of data cleansing doesn't sit very well in it. Seems like I need to break out the system into 2 conceptual areas in order to fulfil all that requirements...
Thanks.
Regards,
David Ong
On Tue, 21 Aug 2007 16:40:56 -0700, Jeje <willgart@xxxxxxxxxxx> wrote:
SSAS 2005 like any other OLAP tool on the market can works with flat schemas and more complicated schemas.
but using star or snowflake schema is far better for a DW project and when you create a cube.
SSAS 2005 is able to create the tables for you, so instead of creating the tables in the database then creating the cube, you create your dimensions and cubes first and SSAS2005 will generates the tables for you.
this greatly simplify the prototype step of your project.
proactive caching don't switch from MOLAP to ROLAP.
proactive caching can detect changes in the source database and then update the cube automatically or can do an incremental loading automatically.
there is a lot of options available here for different usage and requirements.
but there is no switch from MOLAP to ROLAP.
the ETL process is not required if your cubes are directly connect to the operational database.
today, a lot of databases schemas are more accessible then before and the data quality is good, so accessing the source database is a good option.
it's also a good option when you have only 1 database to play with (and not a lot of databases to synchronize)
but if you have to sync. multiple databases, or you have to do a lot of cleansing / transformations, then an ETL process is required.
another option we can found today:
the transaction made to the operational system are also replicated into the datawarehouse in real time (using triggers, procedures or workflow systems) so the DW database himself become a realtime system.
"Nestor" <n3570r@xxxxxxxxx> wrote in message news:op.txe6yasa83lx0t@xxxxxxxxxxxxxxxxxxxxxHi everyone,
I have a general question to datawarehousing and it'll be great if someone can drop some tips here. I'm about to start designing a corporate real time datawarehouse with SQL Server 2005 and is deciding on the approach for this.
SQL Server 2005 now allows us to actually build a datawarehouse without actually restructuring the datasources into star/snowflake schemas. I'm curious to know if this will have any performance impact on the actual implementation? This project will probably require the usage of proactive caching as well as medium latency molap storage structure.
By using proactive caching, SSAS will intelligently switch between ROLAP and MOLAP depending on the data changes in the production database. What is confusing to me is, what happens to ETL then? Does this mean that if SSAS is made to query directly from the production database, then ETL is totally eliminated?
Any recommendations will be appreciated.
Thanks.
Nes
-- Using Opera's revolutionary e-mail client: http://www.opera.com/mail/
--
Using Opera's revolutionary e-mail client: http://www.opera.com/mail/
.
- Follow-Ups:
- Re: UDM and Star Schema
- From: Nestor
- Re: UDM and Star Schema
- References:
- UDM and Star Schema
- From: Nestor
- Re: UDM and Star Schema
- From: Jeje
- Re: UDM and Star Schema
- From: Nestor
- Re: UDM and Star Schema
- From: Jeje
- UDM and Star Schema
- Prev by Date: Re: version/source control in datawarehouse project
- Next by Date: Re: version/source control in datawarehouse project
- Previous by thread: Re: UDM and Star Schema
- Next by thread: Re: UDM and Star Schema
- Index(es):
Relevant Pages
|
Loading