Re: DMX Query response time

Tech-Archive recommends: Speed Up your PC by fixing your registry

From: Peter Kim [MS] (peterkim_at_online.microsoft.com)
Date: 09/15/04

  • Next message: Peter Kim [MS]: "Re: Subquery problem"
    Date: Wed, 15 Sep 2004 12:02:46 -0700
    
    

    Yes, but mostly training time rather than query performance. Setting
    MINIMUM_LEAF_CASES to lower and/or COMPLEXITY_PENALTY to a lower number
    effectively makes the tree learn more details (i.e., more splits) which
    causes training time longer. The prediction is obtained by traversing the
    tree. As the tree gets longer, the traversal time gets longer. However, the
    impact for the prediction should be ignorable compared to training.

    -- 
    Peter Kim
    This posting is provided "AS IS" with no warranties, and confers no rights.
    "Poch Reyes" <pochreyes@hotmail.com> wrote in message 
    news:%23c8Gv6wmEHA.632@TK2MSFTNGP12.phx.gbl...
    > Hi Peter,
    >
    > Generally, would playing around with the setting of MINIMUM_LEAF_CASES 
    > and/or COMPLEXITY_PENALTY improve/degrade DMX query performance? Model 
    > Tranining time?
    >
    > Thanks?
    >
    > Poch
    >
    > "Peter Kim [MS]" <peterkim@online.microsoft.com> wrote in message 
    > news:eKOmr85iEHA.1048@tk2msftngp13.phx.gbl...
    >> We have not published any paper on the optimizations, but I'll try to 
    >> answer your questions here.
    >> In SQL Server 2000, the patterns (contents) found during training in the 
    >> server are remoted to the client and cached into client memory until it's 
    >> not used any longer. So, the prediction happens in the client; in your 
    >> case, inside the SQL server since you're invoking the Analysis Services 
    >> client component (MSOLAP) inside the SQL Server engine. High disk 
    >> activity may be introduced by the SQL server to deal with temp result 
    >> between MSOLAP and its query processor (it's just my guess), but DM 
    >> prediction wouldn't cause any disk IO since all prediction happens in 
    >> memory and your prediction input is on-the-fly using SELECT.
    >>
    >> Yes, 64bit SQL Server 2000 uses the same code path in 32bit. So, the 
    >> limitation that I mentioned below remains the same for both 32bit and 
    >> 64bit.
    >>
    >> Yes, SQL Server 2005 Beta2 include all algorithms in SQL 2000 and more.
    >>
    >> -- 
    >> Peter Kim
    >> This posting is provided "AS IS" with no warranties, and confers no 
    >> rights.
    >>
    >> "Poch Reyes" <pochreyes@hotmail.com> wrote in message 
    >> news:uOjq4TxiEHA.2992@TK2MSFTNGP12.phx.gbl...
    >>> Hi Peter,
    >>>
    >>> In terms of implementation, are traned models stored in memory? I'm just 
    >>> wondering because whenever we the query below we notice high disk 
    >>> activities, CPU utilization is about 50%. What's the best step to take 
    >>> to improve the query performance?
    >>>
    >>> The limitation you mentioned below, will this still be the case if we go 
    >>> 64-bit SQL Server 2000?
    >>>
    >>> Worse case, to meet or come even close to our target we may need to run 
    >>> Yukon Beta 2 mining - are all the alogorithms available in Beta 2?
    >>>
    >>> Poch
    >>>
    >>> "Poch Reyes" <pochreyes@hotmail.com> wrote in message 
    >>> news:uS5yy9liEHA.1656@TK2MSFTNGP09.phx.gbl...
    >>>> Hi Peter,
    >>>>
    >>>> Do you have data mining optimization papers you can share with us? We 
    >>>> are planning to setup a SQL Server 2005 beta box to check out the 
    >>>> difference in query/processing times...
    >>>>
    >>>> Thanks,
    >>>>
    >>>> Poch
    >>>>
    >>>> "Peter Kim [MS]" <peterkim@online.microsoft.com> wrote in message 
    >>>> news:%23C6bNScbEHA.2520@TK2MSFTNGP12.phx.gbl...
    >>>>>I see two potential sources of slow response time, but unfortunately 
    >>>>>neither of issues is addressed fully until SQL 2005 beta2 (which will 
    >>>>>be available shortly).
    >>>>>
    >>>>> 1. I guess you had to use OPENROWSET() from SQL to MSOLAP in order to 
    >>>>> make your procedure reusable in the server as SPROC that accepts 
    >>>>> parameters. This would introduce another expensive layer of rowset 
    >>>>> traffic between the SQL engine and the provider. In SQL Server 2005, 
    >>>>> DMX does support parameterized query and SPROC as well. But, no such 
    >>>>> feature was available in SQL Server 2000. The only option may be to 
    >>>>> move the SPROC logic to middle-tier or client app so that you may be 
    >>>>> able to use MSOLAP directly.
    >>>>>
    >>>>> 2. In SQL Server 2000, the optimization for TopCount() in DMX was 
    >>>>> limited. It can be very slow when you have lots of distinct keys in 
    >>>>> the nested table (distinct Service in your app). You could try run the 
    >>>>> DMX query directly against MSOLAP to figure out what the pure overhead 
    >>>>> of DMX is. In SQL Server 2005 beta2, a better optmization has been 
    >>>>> applied and the performance has improved in order of magnitude in this 
    >>>>> type of query.
    >>>>>
    >>>>> -- 
    >>>>> Peter Kim
    >>>>> This posting is provided "AS IS" with no warranties, and confers no 
    >>>>> rights.
    >>>>>
    >>>>> "Noldz" <enoldz@hotmail.com> wrote in message 
    >>>>> news:2f4bb01c46d37$362f87d0$a501280a@phx.gbl...
    >>>>>> ALTER       PROC sp_CrossSell @MobileNo Varchar(15),
    >>>>>> @Service Varchar(100) = ''
    >>>>>> AS
    >>>>>> DECLARE @Services AS VARCHAR(8000)
    >>>>>> DECLARE @DMX NVARCHAR(4000)
    >>>>>>
    >>>>>> EXECUTE dbo.sp_GetTexterLastServiceV2 @MobileNo,
    >>>>>> @Service ,@OutServices = @Services OUTPUT
    >>>>>>
    >>>>>> --//THIS WAS DONE IN THIS MANNER BECAUSE
    >>>>>> --//OPENROWSET Function does not accept variables for its
    >>>>>> arguments.
    >>>>>> SET @DMX = 'SELECT A.* FROM
    >>>>>> OPENROWSET(
    >>>>>> ''MSOLAP'',
    >>>>>> ''Provider=MSOLAP;Data Source=Proj88A;Mining
    >>>>>> Location=D:\proj88a\msanalysissrvcs\data\Proj88_DM'',
    >>>>>> ''SELECT FLATTENED
    >>>>>> TOPCOUNT(
    >>>>>>   (SELECT Service, $Support as
    >>>>>> [Support] ,100 * $probability as [Confidence], 100 *
    >>>>>> $adjustedProbability as Lift
    >>>>>>    FROM PREDICT([KeyHits],
    >>>>>> INCLUDE_STATISTICS, EXCLUSIVE)
    >>>>>>   ), [Lift], 5)
    >>>>>> FROM [Cross Selling Service] PREDICTION
    >>>>>> JOIN
    >>>>>>   (SELECT (@Services) AS ServiceList) AS
    >>>>>> Input
    >>>>>> ON [Cross Selling Service].[KeyHits].
    >>>>>> [Service] = Input.ServiceList.[Service]'') AS A'
    >>>>>>
    >>>>>> --print @service
    >>>>>> SET @DMX = REPLACE(@DMX, '@Services' , @Services)
    >>>>>>
    >>>>>> --PRINT @DMX
    >>>>>> EXECUTE SP_EXECUTESQL @DMX
    >>>>>>
    >>>>>>>-----Original Message-----
    >>>>>>>Can you send a sample query?
    >>>>>>>
    >>>>>>>-- 
    >>>>>>>
    >>>>>>>-Jamie MacLennan
    >>>>>>>SQL Server Data Mining
    >>>>>>>This posting is provided "AS IS" with no warranties, and
    >>>>>> confers no rights.
    >>>>>>>"Noldz" <enoldz@hotmail.com> wrote in message
    >>>>>>>news:2dc9001c46a2f$df9a3910$a301280a@phx.gbl...
    >>>>>>>> Hi,
    >>>>>>>>
    >>>>>>>> Is there a way to make a DMX Query response time
    >>>>>> faster?
    >>>>>>>> Currently our query response time is 0.6 sec/hit. Our
    >>>>>>>> requirement is 200 hits/sec.
    >>>>>>>>
    >>>>>>>> Noldz
    >>>>>>>>
    >>>>>>>
    >>>>>>>
    >>>>>>>.
    >>>>>>>
    >>>>>
    >>>>>
    >>>>
    >>>>
    >>>
    >>>
    >>
    >>
    >
    > 
    

  • Next message: Peter Kim [MS]: "Re: Subquery problem"

    Relevant Pages

    • Re: CONTAINS performance
      ... That said, and with the query plan, I can start to give you more ... relational join in the context of the free-text optimization, ... SQL Server tables. ...
      (microsoft.public.sqlserver.fulltext)
    • Re: Indexing Service, Openquery and sp_executesql
      ... SQL Server version and sp are you running? ... > data from the indexing service catalog when pasted into ... > query analyzer, but failed when put against sp_executesql ... I would choose Microsoft Indexing ...
      (microsoft.public.sqlserver.fulltext)
    • Re: Problem using Access or Query Designer to run queries in SQL Serve
      ... >or Query Designer within Enterprise Manager, it works and I get data back. ... >ODBC Call Failed [ODBC SQL Server Driver] Timeout Expirederror in Access ... >[ODBC SQL Server Driver] Timeout Expired ...
      (microsoft.public.sqlserver.odbc)
    • Re: Transaction Isolation Level
      ... This means that while I initiate a transaction selecting all new orders it ... My query is executed on a SQL server 2005. ... the isolation level you should use is snapshot isolation. ...
      (comp.databases.ms-sqlserver)
    • Extreme performance issues (SQL Server 2000/ADO.NET/C#)
      ... This process runs very quickly if run through Query ... same exact stored procedures and views, run in the same exact order, through ... system that runs SQL Server (a 4-cpu Xeons system with 2gigs of physical ... When I execute these steps manually through query analyser,, ...
      (microsoft.public.dotnet.framework.adonet)