Re: Understanding the contents of a cluster



Essentially, with this data set all you are going to get are random clusters
based on the clustering starting position. There is no information here.
Squares, cubes and circles could be any color, and blue and red could be
any shape except rectangle. The only "pattern" is that rectangles can only
be green. Since there are only ten cases, there's not much for the
clustering algorithm to work with.
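
To make the "no information" point concrete, here is a minimal sketch in plain Python (not DMX, and not anything the clustering algorithm itself runs) that cross-tabulates the ten training cases from the question. The function names crosstab and print_crosstab are made up for this illustration.

```python
from collections import Counter

# Training data copied from the original question: (CaseKey, Color, Shape).
cases = [
    (1, "Blue", "Square"), (2, "Red", "Square"), (3, "Green", "Square"),
    (4, "Blue", "Cube"), (5, "Red", "Cube"), (6, "Green", "Cube"),
    (7, "Blue", "Circle"), (8, "Red", "Circle"), (9, "Green", "Circle"),
    (10, "Green", "Rectangle"),
]

colors = ["Blue", "Red", "Green"]
shapes = ["Square", "Cube", "Circle", "Rectangle"]

# Count how many cases fall into each (color, shape) combination.
pair_counts = Counter((color, shape) for _, color, shape in cases)


def crosstab():
    """Return {shape: {color: count}} for every shape/color combination."""
    return {s: {c: pair_counts[(c, s)] for c in colors} for s in shapes}


def print_crosstab():
    """Print the shape-by-color counts as a small text table."""
    table = crosstab()
    print(f"{'':10}" + "".join(f"{c:>7}" for c in colors))
    for s in shapes:
        print(f"{s:10}" + "".join(f"{table[s][c]:>7}" for c in colors))


if __name__ == "__main__":
    print_crosstab()
```

Every Square, Cube and Circle cell holds exactly one case, so knowing the shape tells you nothing about the color and vice versa. The Rectangle row (0, 0, 1) is the only place where the counts differ.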

--

-Jamie MacLennan
SQL Server Data Mining
This posting is provided "AS IS" with no warranties, and confers no rights.
"Raman Iyer [MS]" <ramaniy@xxxxxxxxxxxxxxxxxxxx> wrote in message
news:uu4o7wfbFHA.2664@xxxxxxxxxxxxxxxxxxxxxxx
> This is due to the fact that we do "soft clustering" so a case can be
> present in all the clusters, with varying levels of
> probability/likelihood, and clusters are therefore overlapping. If you
> select the cluster label (use the Cluster() function) in your prediction
> query, you'll see the "hard cluster" that each case was assigned to (this
> would be the most likely cluster).
>
> --
> -Raman Iyer
> SQL Server Data Mining
> [This posting is provided "AS IS" with no warranties, and confers no
> rights.]
>
> "Earl Newcomer" <EarlNewcomer@xxxxxxxxxxxxxxxxxxxxxxxxx> wrote in message
> news:BDA7F994-AFDA-48BE-8768-DC1B3ACEC56C@xxxxxxxxxxxxxxxx
>> What does the node path mean when browsing a mining model cluster?
>> I created a sample data set of 10 rows each with a unique case id. I have
>> two attributes color and shape that describe each row. There are 3 colors
>> (Red, Blue, Green) and 4 shapes (Square, Circle, Cube, Rectangle). I
>> requested 4 clusters and the mining model created 4 clusters but I can't
>> decipher which cases the algorithm put in which clusters when both
>> attributes are included as "input & predictable".
>>
>> The training data is:
>> CaseKey Color Shape
>> 1 Blue Square
>> 2 Red Square
>> 3 Green Square
>> 4 Blue Cube
>> 5 Red Cube
>> 6 Green Cube
>> 7 Blue Circle
>> 8 Red Circle
>> 9 Green Circle
>> 10 Green Rectangle
>>
>> The node paths for each cluster show as:
>> Cluster 1 => Shape=Square, Color=Red, Shape=Circle
>> Cluster 2 => Shape=Cube, Shape=Square, Color=Blue, Color=Red
>> Cluster 3 => Shape=Rectangle, Color=Green, Shape=Cube
>> Cluster 4 => Shape=Circle, Color=Blue, Color=Red, Shape=Cube
>>
>> Is there a way to interpret the node paths to explain the cases and
>> probabilities reported for each cluster?
>>
>> Earl Newcomer
>> High Road Consulting, Inc.
>>
>
>
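
The "soft clustering" versus "hard cluster" distinction quoted above can be sketched in a few lines of plain Python. This is only an illustration of the idea, not the SQL Server implementation: the membership numbers are hypothetical values invented for the example, not output from any mining model, and hard_cluster is a made-up helper name.

```python
def hard_cluster(memberships):
    """Return the label of the most likely cluster for one case."""
    return max(memberships, key=memberships.get)


# Hypothetical soft memberships for a single case (illustrative values only).
# The case belongs to every cluster to some degree, so clusters overlap.
soft = {
    "Cluster 1": 0.40,
    "Cluster 2": 0.30,
    "Cluster 3": 0.10,
    "Cluster 4": 0.20,
}

# Clusters in which the case is present with non-zero probability.
present_in = [name for name, p in soft.items() if p > 0]

print("Present in:", ", ".join(present_in))
print("Hard cluster:", hard_cluster(soft))
```

With these made-up numbers the case shows up in all four clusters, but the hard assignment is the single most likely one, which is what the reply above says the cluster label in a prediction query reports.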

