Re: Better approach for common business problem ? (long post)
From: Chris Webb (OnlyForPostingToNewsgroups_at_hotmail.com)
Date: 10/18/04
- Next message: Mark Hill: "Re: Seibel Analytics Module vs others"
- Previous message: Jamie Thomson: "Re: Template for documenting cubes"
- In reply to: Tom VdP: "Re: Better approach for common business problem ? (long post)"
- Next in thread: Chris Webb: "Re: Better approach for common business problem ? (long post)"
- Reply: Chris Webb: "Re: Better approach for common business problem ? (long post)"
- Messages sorted by: [ date ] [ thread ]
Date: Mon, 18 Oct 2004 02:29:02 -0700
Hi Tom,
I'm glad to hear we're making progress...!
First of all, regarding the aggregates, the fact that performance didn't
increase very much when you built the new ones only strengthens my opinion
that your main problem is the amount of summing up you need to do. The fact
that you are generating thousands of subqueries is also significant - if you
haven't tried these properties already in your connection string, try using
the following together or separately:
Cache Ratio=0.01
Cache Ratio2=0.01
Cache Policy = 5 or 6 or 7 (in theory 7 should be the one that makes the
difference)
I've found that various combinations of these reduce the number of
subqueries sent to the server, and therefore speed up overall query
performance. And also, as I think I said in an earlier mail, make sure you
have turned off query logging on the server - with this many subqueries,
logging every single one can introduce a fairly big overhead.
Regarding the extra levels, the reason I proposed two hidden levels rather
than a single quarter level between month and year is that it should increase
the opportunities you have for optimisation - you have two sets of
aggregations you could hit instead of one. But this is something you'll need
to test yourself to see if the impact is worth the extra work.
Last of all, partitioning should always have positive impact on this type of
query, especially if you are partitioning by month. By setting the data slice
property on a partition in your scenario, you're saying to the AS engine that
instead of having to search through the whole cube for data for the month
January 1999 (for example), it can go directly to the partition you have
specified. Therefore the amount of time actually taken to read the data off
disk is greatly reduced. There are two possible reasons why you're seeing a
performance decrease, in my opinion:
1) You haven't set the data slice property properly (although I'm sure you
have!)
2) When you reprocessed the cube to include the new partitions, and reran
your test queries, the cache on the server was empty (as a result of the
reprocess) and you were comparing this performance with queries run on your
old cube when the cache was pre-populated with useful data. What I generally
do is stop and start the AS service and close and reopen the client
connection before each query run when I'm doing performance testing to try to
reduce the impact of caching as much as possible.
However, as we already established when looking at your aggregations, it's
the summing up of data and not the reading from disk that is your problem, so
I would imagine that any benefit partitioning will bring you will be
relatively minor. I would definitely keep it in your cube design though.
Regards,
Chris
- Next message: Mark Hill: "Re: Seibel Analytics Module vs others"
- Previous message: Jamie Thomson: "Re: Template for documenting cubes"
- In reply to: Tom VdP: "Re: Better approach for common business problem ? (long post)"
- Next in thread: Chris Webb: "Re: Better approach for common business problem ? (long post)"
- Reply: Chris Webb: "Re: Better approach for common business problem ? (long post)"
- Messages sorted by: [ date ] [ thread ]
Relevant Pages
|