Re: Clustering - export

From: Jamie MacLennan (MS) (jamiemac_at_online.microsoft.com)
Date: 06/21/04

    Date: Mon, 21 Jun 2004 10:35:18 -0700
    
    

    You may find some answers in the FAQ at
    http://groups.msn.com/AnalysisServicesDataMining
    Is there any way of exporting the results that Microsoft Clustering
    produces?
      Yes, but the method depends on what you mean by "results." All of the
    cluster definitions are present in the content schema, which you can get
    with the query "SELECT * FROM <model>.CONTENT". See the clustervieweraddin
    at the site above for a sample.
      You can also get the cluster membership of individual cases by using the
    Cluster() function in a prediction query (see below).

    Is there any way of finding out the optimal number of clusters?
       Not directly, but you can examine the results. The difficulty is in how
    you define the "optimal" number of clusters. You need enough clusters to
    represent the data accurately, but not so many that you can't understand
    the model.

    Here's a suggested approach: a good cluster model should accurately cluster
    holdout data. Separate your data into two sets, one for training and one
    for testing. Create several models with different numbers of clusters. For
    each model, determine the ClusterProbability() of each case in the test
    set. The model with the highest overall cluster probability is the optimal
    one by this measure, as it best fits the holdout data.
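
    The comparison step could be scripted along these lines. This is only a
    sketch and not part of Analysis Services: it assumes the
    ClusterProbability() values for the test cases have already been retrieved
    for each model, and it takes "overall" to mean the average, which is just
    one possible reading. The function name pick_best_model and the model
    names and numbers are made up for illustration.

```python
from statistics import mean


def pick_best_model(probs_by_model):
    """Return (model_name, score) for the model whose holdout cases have
    the highest mean cluster probability.

    probs_by_model maps a model name to the list of ClusterProbability()
    values obtained for the test-set cases (assumed to be fetched already).
    """
    # Average the per-case probabilities; models with no cases are skipped.
    scores = {
        name: mean(probs)
        for name, probs in probs_by_model.items()
        if probs
    }
    if not scores:
        raise ValueError("no model has holdout probabilities")
    best = max(scores, key=scores.get)
    return best, scores[best]


# Hypothetical holdout probabilities for three models that differ only in
# their number of clusters.
holdout_probs = {
    "clusters_3": [0.62, 0.71, 0.55, 0.80],
    "clusters_5": [0.81, 0.77, 0.69, 0.90],
    "clusters_8": [0.58, 0.93, 0.49, 0.66],
}

best_name, best_score = pick_best_model(holdout_probs)
print(best_name, round(best_score, 4))
```

    Averaging keeps the score comparable across test sets of different sizes.
    If a few very poorly fitting cases should count more heavily, averaging
    the logarithm of each probability is a common alternative.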

    What is Cluster()?
       Cluster() is a function that returns the most likely cluster for a given
    case in a prediction query. For example, the following singleton query:

    SELECT Cluster() FROM MyClusterModel NATURAL PREDICTION JOIN
    (SELECT 'Male' AS Gender, 40 AS Age) AS t

    would find the most likely cluster for a 40-year-old male. The site above
    has a downloadable tool to assist in creating DMX (Data Mining Extensions)
    queries.

    -- 
    -Jamie MacLennan
    SQL Server Data Mining
    This posting is provided "AS IS" with no warranties, and confers no rights.
    "FatherB" <FatherB@discussions.microsoft.com> wrote in message
    news:84B1F7CD-26D6-46D2-8E9C-8AF6AA92922B@microsoft.com...
    > Hello!
    >
    > I have several questions regarding clustering:
    > - Is there any way of exporting the results that Microsoft Clustering
    > produces?
    > - Is there any way of finding out the optimal number of clusters?
    > - What is Cluster()?
    >
    > Thanks,
    > Bostjan
    
