Re: Multiple Questions and some feedback on Modelling



When you create a decision tree model, we automatically mark continuous
columns as REGRESSOR. When you change these columns to DISCRETIZED, the
REGRESSOR flag is not automatically removed. The flag requires a continuous
column, which is why processing fails with the error you quoted.

To clear the flag, switch to the Mining Models pane and choose the
discretized column in the decision tree model (I would say choose the
discretized column in the mining model column, but that's confusing). When
you do this, the _model_ column properties (as opposed to the _structure_
column properties) appear in the properties panel. Find the Modeling Flags
property and clear the REGRESSOR flag.

This should resolve the issue.
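
For anyone who finds the rule easier to read as code, here is a minimal
Python sketch of why the error appears. It is only an illustration:
StructureColumn, ModelColumn, find_regressor_conflicts and clear_regressor
are hypothetical names made up for this example, not part of any Analysis
Services API, and the check is a simplification of what the server does at
processing time.

```python
from dataclasses import dataclass, field

# Hypothetical, simplified stand-ins for structure and model columns.
# These are NOT Analysis Services types; they only mirror the idea that
# content type lives on the structure column and flags on the model column.


@dataclass
class StructureColumn:
    name: str
    content: str  # e.g. "Continuous", "Discretized", "Discrete"


@dataclass
class ModelColumn:
    source: StructureColumn
    flags: set = field(default_factory=set)


def find_regressor_conflicts(model_columns):
    """Return names of columns flagged REGRESSOR whose content is not Continuous."""
    return [
        col.source.name
        for col in model_columns
        if "REGRESSOR" in col.flags and col.source.content != "Continuous"
    ]


def clear_regressor(model_columns):
    """Remove the REGRESSOR flag from model columns that are not Continuous."""
    for col in model_columns:
        if col.source.content != "Continuous":
            col.flags.discard("REGRESSOR")


# The tree model is created while the column is still continuous,
# so the column picks up the REGRESSOR flag.
income = StructureColumn("YearlyIncome", "Continuous")
tree_columns = [ModelColumn(income, {"REGRESSOR"})]

# Changing the structure column does not touch the model column's flags.
income.content = "Discretized"
print(find_regressor_conflicts(tree_columns))  # ['YearlyIncome']

# Clearing the flag on the model column removes the conflict.
clear_regressor(tree_columns)
print(find_regressor_conflicts(tree_columns))  # []
```

The point of the sketch is that the flag lives on the model column, so
changing the content type of the structure column leaves it behind until you
clear it on the model column itself.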

--

-Jamie MacLennan
SQL Server Data Mining
This posting is provided "AS IS" with no warranties, and confers no rights.
"ImJoe" <ImJoe@xxxxxxxxxxxxxxxxxxxxxxxxx> wrote in message
news:0C8A0D2A-9262-4DEE-9CD7-7D7E2B777377@xxxxxxxxxxxxxxxx
> Thanks for the response. Some problems persist, though; see my comments
> below. I should add that the environment is SQL Server 2005 Enterprise
> Edition.
>
> "Dejan Sarka" wrote:
>
>> Inline are some answers.
>>
>> > 1. How to EDIT/CHANGE NUM of Clusters (seems defaults to 10)
>> Click on the Mining Models tab, right-click on the model and select the
>> Set Algorithm Parameters option.
> Good, it works.
>
>>
>> > 2. LIFT CHART for 2+ models instead of current model, how?
>>
>> You can have a single lift chart for multiple models that share the same
>> structure.
> Can't try it without resolving the problem of number 5 (see below).
>
>> > 3. using Clustering model, input includes firstName and lastName,
>> > characteristics indicates that both values are missing -- good
>> > and yet, probability histogram/bar has 60-70% 'mark', why not blank?
>> > since they are not determinants for prediction attribute -- income.
>>
>> Why do you need names in the model anyway?
> But what if you need to include them for the prediction result?
> This one is not that important for now.
>
>> > 4. Any way to change a model structure's source table (not edit ds
>> > view)?
>>
>> AFAIK no - create a new model. You could help yourself with ds view, as you
>> mentioned, or with a view in the relational database. You could refer to the
>> view in the structure and then change the view definition as needed.
> Probably I didn't get my point across. Say modelTreeX currently uses
> DimCustomer (a table), and now I change my mind and want modelTreeX to point
> to VwCustomer (a view) instead. How? In other words, how do I remap the
> model's "data origin"? I don't see an option to do that. Both of these
> entities already exist in the current DSV.
>
>> > 5. Have a model structure that contains three mining models, including a
>> > tree. One input is yearlyIncome, set to discretized, and for the tree
>> > model the force regressor parameter has no values set at all. Still,
>> > while processing all models I got an error, "xxx model, column xyz must
>> > be of continuous type since REGRESSOR flag is set on for it." But
>> > that flag has no value at all, which I construed as not set.
>> > And if I make the yearlyIncome column continuous, then my naive Bayes
>> > model in this structure complains. So, how do we deal with it?
>> > I must have this column.
>>
>> Set the attribute to discretized in the mining structure (use the Mining
>> Structure tab), and clear the regressor flag for this attribute for the DT
>> model (using the Mining Models tab).
> There is no option to 'clear that attribute'; removal is available.
> However, even after SAVING following the removal, this attribute still
> exists for a given model.
>
>> > 6. model status: how can you tell if it has been trained in BI dev
>> > studio?
>> > and similarly, define the word 'Process' here, as in Process a model?
>> > Is it Train model data only or more? And if more, what's that?
>> > same question goes for it for sql management studio, my initial thought
>> > was they are trained already for sql management studio because
>> > otherwise
>> > they won't be deployed at server, however, I still see the menu option
>> > "Process", would it imply that the model may have changed, hence, the
>> > need for re-train etc.?
>>
>> Process means to re-train, so the model does not have to change - you can
>> reprocess it because you have some new data. So disabling the option when
>> the model is processed for the 1st time wouldn't be very useful.
> "Disabling the option" is not what I intended. Were it me who designed the
> interface, I would more likely stick with data mining terms as much as
> possible, like "Train model" instead of "Process model". This one is not
> that important.
>
>> > 7. Legend: relationship between prediction probability and score?
>> > Case in mind, a naive Bayes model generated 75.51% prediction
>> > probability and the score is 0.78. Also, a tree model did not show a
>> > legend; could it be because my window settings are messed up? On a
>> > similar note, some model has a score of 10.78 or so, how come? Or
>> > possibly that model wasn't created properly. BOL does not seem to
>> > explain the term Score here.
>>
>> I will leave this one to the DM team:-)
>>
>> > 8. where can I find an updated copy of MS_DM_Tutorial if there's any
>> > revision?
>> > (for MS SQL Server 2005)
>> > by
>> > Seth Paul,Jamie MacLennan,Zhaohui Tang,Scott Oveson
>> > Microsoft Corporation
>> > 07/2004
>>
>> There is a tutorial in Books OnLine, and
>> http://www.sqlserverdatamining.com/DMCommunity/Tutorials/default.aspx.
>>
>> --
>> Dejan Sarka, SQL Server MVP
>> Mentor
>> www.SolidQualityLearning.com
>>
>>
>>

