Re: DMX Query response time
From: Peter Kim [MS] (peterkim_at_online.microsoft.com)
Date: 09/15/04
- Previous message: Adam Machanic: "Re: DeDupe Query"
- In reply to: Poch Reyes: "Re: DMX Query response time"
- Next in thread: Poch Reyes: "Re: DMX Query response time"
- Reply: Poch Reyes: "Re: DMX Query response time"
- Messages sorted by: [ date ] [ thread ]
Date: Wed, 15 Sep 2004 12:02:46 -0700
Yes, but mostly training time rather than query performance. Setting
MINIMUM_LEAF_CASES to lower and/or COMPLEXITY_PENALTY to a lower number
effectively makes the tree learn more details (i.e., more splits) which
causes training time longer. The prediction is obtained by traversing the
tree. As the tree gets longer, the traversal time gets longer. However, the
impact for the prediction should be ignorable compared to training.
-- Peter Kim This posting is provided "AS IS" with no warranties, and confers no rights. "Poch Reyes" <pochreyes@hotmail.com> wrote in message news:%23c8Gv6wmEHA.632@TK2MSFTNGP12.phx.gbl... > Hi Peter, > > Generally, would playing around with the setting of MINIMUM_LEAF_CASES > and/or COMPLEXITY_PENALTY improve/degrade DMX query performance? Model > Tranining time? > > Thanks? > > Poch > > "Peter Kim [MS]" <peterkim@online.microsoft.com> wrote in message > news:eKOmr85iEHA.1048@tk2msftngp13.phx.gbl... >> We have not published any paper on the optimizations, but I'll try to >> answer your questions here. >> In SQL Server 2000, the patterns (contents) found during training in the >> server are remoted to the client and cached into client memory until it's >> not used any longer. So, the prediction happens in the client; in your >> case, inside the SQL server since you're invoking the Analysis Services >> client component (MSOLAP) inside the SQL Server engine. High disk >> activity may be introduced by the SQL server to deal with temp result >> between MSOLAP and its query processor (it's just my guess), but DM >> prediction wouldn't cause any disk IO since all prediction happens in >> memory and your prediction input is on-the-fly using SELECT. >> >> Yes, 64bit SQL Server 2000 uses the same code path in 32bit. So, the >> limitation that I mentioned below remains the same for both 32bit and >> 64bit. >> >> Yes, SQL Server 2005 Beta2 include all algorithms in SQL 2000 and more. >> >> -- >> Peter Kim >> This posting is provided "AS IS" with no warranties, and confers no >> rights. >> >> "Poch Reyes" <pochreyes@hotmail.com> wrote in message >> news:uOjq4TxiEHA.2992@TK2MSFTNGP12.phx.gbl... >>> Hi Peter, >>> >>> In terms of implementation, are traned models stored in memory? I'm just >>> wondering because whenever we the query below we notice high disk >>> activities, CPU utilization is about 50%. What's the best step to take >>> to improve the query performance? >>> >>> The limitation you mentioned below, will this still be the case if we go >>> 64-bit SQL Server 2000? >>> >>> Worse case, to meet or come even close to our target we may need to run >>> Yukon Beta 2 mining - are all the alogorithms available in Beta 2? >>> >>> Poch >>> >>> "Poch Reyes" <pochreyes@hotmail.com> wrote in message >>> news:uS5yy9liEHA.1656@TK2MSFTNGP09.phx.gbl... >>>> Hi Peter, >>>> >>>> Do you have data mining optimization papers you can share with us? We >>>> are planning to setup a SQL Server 2005 beta box to check out the >>>> difference in query/processing times... >>>> >>>> Thanks, >>>> >>>> Poch >>>> >>>> "Peter Kim [MS]" <peterkim@online.microsoft.com> wrote in message >>>> news:%23C6bNScbEHA.2520@TK2MSFTNGP12.phx.gbl... >>>>>I see two potential sources of slow response time, but unfortunately >>>>>neither of issues is addressed fully until SQL 2005 beta2 (which will >>>>>be available shortly). >>>>> >>>>> 1. I guess you had to use OPENROWSET() from SQL to MSOLAP in order to >>>>> make your procedure reusable in the server as SPROC that accepts >>>>> parameters. This would introduce another expensive layer of rowset >>>>> traffic between the SQL engine and the provider. In SQL Server 2005, >>>>> DMX does support parameterized query and SPROC as well. But, no such >>>>> feature was available in SQL Server 2000. The only option may be to >>>>> move the SPROC logic to middle-tier or client app so that you may be >>>>> able to use MSOLAP directly. >>>>> >>>>> 2. In SQL Server 2000, the optimization for TopCount() in DMX was >>>>> limited. It can be very slow when you have lots of distinct keys in >>>>> the nested table (distinct Service in your app). You could try run the >>>>> DMX query directly against MSOLAP to figure out what the pure overhead >>>>> of DMX is. In SQL Server 2005 beta2, a better optmization has been >>>>> applied and the performance has improved in order of magnitude in this >>>>> type of query. >>>>> >>>>> -- >>>>> Peter Kim >>>>> This posting is provided "AS IS" with no warranties, and confers no >>>>> rights. >>>>> >>>>> "Noldz" <enoldz@hotmail.com> wrote in message >>>>> news:2f4bb01c46d37$362f87d0$a501280a@phx.gbl... >>>>>> ALTER PROC sp_CrossSell @MobileNo Varchar(15), >>>>>> @Service Varchar(100) = '' >>>>>> AS >>>>>> DECLARE @Services AS VARCHAR(8000) >>>>>> DECLARE @DMX NVARCHAR(4000) >>>>>> >>>>>> EXECUTE dbo.sp_GetTexterLastServiceV2 @MobileNo, >>>>>> @Service ,@OutServices = @Services OUTPUT >>>>>> >>>>>> --//THIS WAS DONE IN THIS MANNER BECAUSE >>>>>> --//OPENROWSET Function does not accept variables for its >>>>>> arguments. >>>>>> SET @DMX = 'SELECT A.* FROM >>>>>> OPENROWSET( >>>>>> ''MSOLAP'', >>>>>> ''Provider=MSOLAP;Data Source=Proj88A;Mining >>>>>> Location=D:\proj88a\msanalysissrvcs\data\Proj88_DM'', >>>>>> ''SELECT FLATTENED >>>>>> TOPCOUNT( >>>>>> (SELECT Service, $Support as >>>>>> [Support] ,100 * $probability as [Confidence], 100 * >>>>>> $adjustedProbability as Lift >>>>>> FROM PREDICT([KeyHits], >>>>>> INCLUDE_STATISTICS, EXCLUSIVE) >>>>>> ), [Lift], 5) >>>>>> FROM [Cross Selling Service] PREDICTION >>>>>> JOIN >>>>>> (SELECT (@Services) AS ServiceList) AS >>>>>> Input >>>>>> ON [Cross Selling Service].[KeyHits]. >>>>>> [Service] = Input.ServiceList.[Service]'') AS A' >>>>>> >>>>>> --print @service >>>>>> SET @DMX = REPLACE(@DMX, '@Services' , @Services) >>>>>> >>>>>> --PRINT @DMX >>>>>> EXECUTE SP_EXECUTESQL @DMX >>>>>> >>>>>>>-----Original Message----- >>>>>>>Can you send a sample query? >>>>>>> >>>>>>>-- >>>>>>> >>>>>>>-Jamie MacLennan >>>>>>>SQL Server Data Mining >>>>>>>This posting is provided "AS IS" with no warranties, and >>>>>> confers no rights. >>>>>>>"Noldz" <enoldz@hotmail.com> wrote in message >>>>>>>news:2dc9001c46a2f$df9a3910$a301280a@phx.gbl... >>>>>>>> Hi, >>>>>>>> >>>>>>>> Is there a way to make a DMX Query response time >>>>>> faster? >>>>>>>> Currently our query response time is 0.6 sec/hit. Our >>>>>>>> requirement is 200 hits/sec. >>>>>>>> >>>>>>>> Noldz >>>>>>>> >>>>>>> >>>>>>> >>>>>>>. >>>>>>> >>>>> >>>>> >>>> >>>> >>> >>> >> >> > >
- Previous message: Adam Machanic: "Re: DeDupe Query"
- In reply to: Poch Reyes: "Re: DMX Query response time"
- Next in thread: Poch Reyes: "Re: DMX Query response time"
- Reply: Poch Reyes: "Re: DMX Query response time"
- Messages sorted by: [ date ] [ thread ]
Relevant Pages
|