Re: DMX Query response time
From: Noldz (noldz_at_discussions.microsoft.com)
Date: 09/20/04
- Next message: Poch Reyes: "Re: DMX Query response time"
- Previous message: Adam Machanic: "Re: Impossible query?"
- In reply to: Peter Kim [MS]: "Re: DMX Query response time"
- Next in thread: Poch Reyes: "Re: DMX Query response time"
- Reply: Poch Reyes: "Re: DMX Query response time"
- Reply: Jamie MacLennan \(MS\): "Re: DMX Query response time"
- Messages sorted by: [ date ] [ thread ]
Date: Mon, 20 Sep 2004 02:09:08 -0700
We are planning to test our Recommendation System to SQL 2005 . Is the
migration from SQL & Analysis Server 2000 to 2005 is as easy as backup &
restores?
"Peter Kim [MS]" wrote:
> In SQL 2000, the prediction is done in client-side. You don't need to have
> multiple servers with copies of the model for that. Having multiple clients
> (where your app that talks to the server through MSOLAP is located) will
> increase your prediction throughput although an individual query response
> time doesn't change.
>
> In SQL 2005, all predictions are performed in the server. Also, numerous
> optimization has been applied, specially for the scenario that you have.
> Besides, the programmability and integration with other components that SQL
> 2005 has added is tremendous. SQL 2005 is now at beta2 stage and available
> to download for MSDN subscribers. It expected to be released early next
> year.
>
> --
> Peter Kim
> This posting is provided "AS IS" with no warranties, and confers no rights.
>
> "Poch Reyes" <Poch Reyes@discussions.microsoft.com> wrote in message
> news:9A55FBB0-ADC1-418A-9361-F79A2FA59888@microsoft.com...
> > Hi Peter,
> >
> > Given this limitation of SQL Server data mining, can you suggest ways to
> > improve query performance? Do we upgrade our server processor/RAM? Do we
> > split processing of requests to different machines with their own copies
> > of
> > trained models - that way we balance request processing to multiple
> > machines?
> >
> > Thanks,
> >
> > Poch
> >
> > "Peter Kim [MS]" wrote:
> >
> >> Yes, but mostly training time rather than query performance. Setting
> >> MINIMUM_LEAF_CASES to lower and/or COMPLEXITY_PENALTY to a lower number
> >> effectively makes the tree learn more details (i.e., more splits) which
> >> causes training time longer. The prediction is obtained by traversing the
> >> tree. As the tree gets longer, the traversal time gets longer. However,
> >> the
> >> impact for the prediction should be ignorable compared to training.
> >>
> >> --
> >> Peter Kim
> >> This posting is provided "AS IS" with no warranties, and confers no
> >> rights.
> >>
> >> "Poch Reyes" <pochreyes@hotmail.com> wrote in message
> >> news:%23c8Gv6wmEHA.632@TK2MSFTNGP12.phx.gbl...
> >> > Hi Peter,
> >> >
> >> > Generally, would playing around with the setting of MINIMUM_LEAF_CASES
> >> > and/or COMPLEXITY_PENALTY improve/degrade DMX query performance? Model
> >> > Tranining time?
> >> >
> >> > Thanks?
> >> >
> >> > Poch
> >> >
> >> > "Peter Kim [MS]" <peterkim@online.microsoft.com> wrote in message
> >> > news:eKOmr85iEHA.1048@tk2msftngp13.phx.gbl...
> >> >> We have not published any paper on the optimizations, but I'll try to
> >> >> answer your questions here.
> >> >> In SQL Server 2000, the patterns (contents) found during training in
> >> >> the
> >> >> server are remoted to the client and cached into client memory until
> >> >> it's
> >> >> not used any longer. So, the prediction happens in the client; in your
> >> >> case, inside the SQL server since you're invoking the Analysis
> >> >> Services
> >> >> client component (MSOLAP) inside the SQL Server engine. High disk
> >> >> activity may be introduced by the SQL server to deal with temp result
> >> >> between MSOLAP and its query processor (it's just my guess), but DM
> >> >> prediction wouldn't cause any disk IO since all prediction happens in
> >> >> memory and your prediction input is on-the-fly using SELECT.
> >> >>
> >> >> Yes, 64bit SQL Server 2000 uses the same code path in 32bit. So, the
> >> >> limitation that I mentioned below remains the same for both 32bit and
> >> >> 64bit.
> >> >>
> >> >> Yes, SQL Server 2005 Beta2 include all algorithms in SQL 2000 and
> >> >> more.
> >> >>
> >> >> --
> >> >> Peter Kim
> >> >> This posting is provided "AS IS" with no warranties, and confers no
> >> >> rights.
> >> >>
> >> >> "Poch Reyes" <pochreyes@hotmail.com> wrote in message
> >> >> news:uOjq4TxiEHA.2992@TK2MSFTNGP12.phx.gbl...
> >> >>> Hi Peter,
> >> >>>
> >> >>> In terms of implementation, are traned models stored in memory? I'm
> >> >>> just
> >> >>> wondering because whenever we the query below we notice high disk
> >> >>> activities, CPU utilization is about 50%. What's the best step to
> >> >>> take
> >> >>> to improve the query performance?
> >> >>>
> >> >>> The limitation you mentioned below, will this still be the case if we
> >> >>> go
> >> >>> 64-bit SQL Server 2000?
> >> >>>
> >> >>> Worse case, to meet or come even close to our target we may need to
> >> >>> run
> >> >>> Yukon Beta 2 mining - are all the alogorithms available in Beta 2?
> >> >>>
> >> >>> Poch
> >> >>>
> >> >>> "Poch Reyes" <pochreyes@hotmail.com> wrote in message
> >> >>> news:uS5yy9liEHA.1656@TK2MSFTNGP09.phx.gbl...
> >> >>>> Hi Peter,
> >> >>>>
> >> >>>> Do you have data mining optimization papers you can share with us?
> >> >>>> We
> >> >>>> are planning to setup a SQL Server 2005 beta box to check out the
> >> >>>> difference in query/processing times...
> >> >>>>
> >> >>>> Thanks,
> >> >>>>
> >> >>>> Poch
> >> >>>>
> >> >>>> "Peter Kim [MS]" <peterkim@online.microsoft.com> wrote in message
> >> >>>> news:%23C6bNScbEHA.2520@TK2MSFTNGP12.phx.gbl...
> >> >>>>>I see two potential sources of slow response time, but unfortunately
> >> >>>>>neither of issues is addressed fully until SQL 2005 beta2 (which
> >> >>>>>will
> >> >>>>>be available shortly).
> >> >>>>>
> >> >>>>> 1. I guess you had to use OPENROWSET() from SQL to MSOLAP in order
> >> >>>>> to
> >> >>>>> make your procedure reusable in the server as SPROC that accepts
> >> >>>>> parameters. This would introduce another expensive layer of rowset
> >> >>>>> traffic between the SQL engine and the provider. In SQL Server
> >> >>>>> 2005,
> >> >>>>> DMX does support parameterized query and SPROC as well. But, no
> >> >>>>> such
> >> >>>>> feature was available in SQL Server 2000. The only option may be to
> >> >>>>> move the SPROC logic to middle-tier or client app so that you may
> >> >>>>> be
> >> >>>>> able to use MSOLAP directly.
> >> >>>>>
> >> >>>>> 2. In SQL Server 2000, the optimization for TopCount() in DMX was
> >> >>>>> limited. It can be very slow when you have lots of distinct keys in
> >> >>>>> the nested table (distinct Service in your app). You could try run
> >> >>>>> the
> >> >>>>> DMX query directly against MSOLAP to figure out what the pure
> >> >>>>> overhead
> >> >>>>> of DMX is. In SQL Server 2005 beta2, a better optmization has been
> >> >>>>> applied and the performance has improved in order of magnitude in
> >> >>>>> this
> >> >>>>> type of query.
> >> >>>>>
> >> >>>>> --
> >> >>>>> Peter Kim
> >> >>>>> This posting is provided "AS IS" with no warranties, and confers no
> >> >>>>> rights.
> >> >>>>>
> >> >>>>> "Noldz" <enoldz@hotmail.com> wrote in message
> >> >>>>> news:2f4bb01c46d37$362f87d0$a501280a@phx.gbl...
> >> >>>>>> ALTER PROC sp_CrossSell @MobileNo Varchar(15),
> >> >>>>>> @Service Varchar(100) = ''
> >> >>>>>> AS
> >> >>>>>> DECLARE @Services AS VARCHAR(8000)
> >> >>>>>> DECLARE @DMX NVARCHAR(4000)
> >> >>>>>>
> >> >>>>>> EXECUTE dbo.sp_GetTexterLastServiceV2 @MobileNo,
> >> >>>>>> @Service ,@OutServices = @Services OUTPUT
> >> >>>>>>
> >> >>>>>> --//THIS WAS DONE IN THIS MANNER BECAUSE
> >> >>>>>> --//OPENROWSET Function does not accept variables for its
> >> >>>>>> arguments.
> >> >>>>>> SET @DMX = 'SELECT A.* FROM
> >> >>>>>> OPENROWSET(
> >> >>>>>> ''MSOLAP'',
> >> >>>>>> ''Provider=MSOLAP;Data Source=Proj88A;Mining
> >> >>>>>> Location=D:\proj88a\msanalysissrvcs\data\Proj88_DM'',
> >> >>>>>> ''SELECT FLATTENED
> >> >>>>>> TOPCOUNT(
> >> >>>>>> (SELECT Service, $Support as
> >> >>>>>> [Support] ,100 * $probability as [Confidence], 100 *
> >> >>>>>> $adjustedProbability as Lift
> >> >>>>>> FROM PREDICT([KeyHits],
> >> >>>>>> INCLUDE_STATISTICS, EXCLUSIVE)
> >> >>>>>> ), [Lift], 5)
> >> >>>>>> FROM [Cross Selling Service] PREDICTION
> >> >>>>>> JOIN
> >> >>>>>> (SELECT (@Services) AS ServiceList) AS
> >> >>>>>> Input
> >> >>>>>> ON [Cross Selling Service].[KeyHits].
> >> >>>>>> [Service] = Input.ServiceList.[Service]'') AS A'
> >> >>>>>>
> >> >>>>>> --print @service
> >> >>>>>> SET @DMX = REPLACE(@DMX, '@Services' , @Services)
> >> >>>>>>
> >> >>>>>> --PRINT @DMX
> >> >>>>>> EXECUTE SP_EXECUTESQL @DMX
> >> >>>>>>
> >> >>>>>>>-----Original Message-----
> >> >>>>>>>Can you send a sample query?
> >> >>>>>>>
> >> >>>>>>>--
> >> >>>>>>>
> >> >>>>>>>-Jamie MacLennan
> >> >>>>>>>SQL Server Data Mining
> >> >>>>>>>This posting is provided "AS IS" with no warranties, and
> >> >>>>>> confers no rights.
> >> >>>>>>>"Noldz" <enoldz@hotmail.com> wrote in message
> >> >>>>>>>news:2dc9001c46a2f$df9a3910$a301280a@phx.gbl...
> >> >>>>>>>> Hi,
> >> >>>>>>>>
> >> >>>>>>>> Is there a way to make a DMX Query response time
> >> >>>>>> faster?
> >> >>>>>>>> Currently our query response time is 0.6 sec/hit. Our
> >> >>>>>>>> requirement is 200 hits/sec.
> >> >>>>>>>>
> >> >>>>>>>> Noldz
> >> >>>>>>>>
> >> >>>>>>>
> >> >>>>>>>
> >> >>>>>>>.
> >> >>>>>>>
> >> >>>>>
> >> >>>>>
> >> >>>>
> >> >>>>
> >> >>>
> >> >>>
> >> >>
> >> >>
> >> >
> >> >
> >>
> >>
> >>
>
>
>
- Next message: Poch Reyes: "Re: DMX Query response time"
- Previous message: Adam Machanic: "Re: Impossible query?"
- In reply to: Peter Kim [MS]: "Re: DMX Query response time"
- Next in thread: Poch Reyes: "Re: DMX Query response time"
- Reply: Poch Reyes: "Re: DMX Query response time"
- Reply: Jamie MacLennan \(MS\): "Re: DMX Query response time"
- Messages sorted by: [ date ] [ thread ]
Relevant Pages
|