Monday, 15 April 2013

classification - Attribute order influences results of Naive Bayes in Orange


I have a data file with 13 attributes and a binary class variable. In Orange Canvas, when I apply the 'naive bayes classifier' and check its performance with 'test learner', I find that the results depend on the order in which the attributes are selected in the 'select attributes' widget. The difference is not large; for illustration, accuracy goes from 0.78 to 0.76.

As the naive Bayes algorithm consists of multiplying estimated probabilities, the order of the terms should not matter. Closer examination revealed:

- This happens with relative frequencies estimation (not with Laplace).
- It does not happen with every data file, or with every rearrangement. It does happen when the first 3 variables are moved to the last 3 places.
- Our data file contains frequencies of 0.
- It appears that the difference is not due to different probability estimates. When calling the estimators from the command line, the order in which the attributes are presented in the data file does not matter.

The call looks like this:

    import Orange

    # data is the Orange data table loaded from the data file
    bayes_rl = Orange.classification.bayes.NaiveLearner(
        estimator_constructor=Orange.statistics.estimate.RelativeFrequency())
    bayes_relative = bayes_rl(data)
    print bayes_relative.conditional_distributions

Of course, I am assuming here that calling the classifier from the command line is equivalent to selecting the attributes visually in the same order as they appear in the file.

This makes me a bit insecure about what is going on. Is it some kind of rounding error?

The order of attributes can matter due to the limited precision of machine floating-point number representation, in particular when multiplying small (near zero) numbers. This could cause the behavior you see (the naive Bayes in Orange uses 32-bit floating-point precision).
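To illustrate the rounding effect, here is a minimal sketch in plain Python that does not use Orange. It emulates 32-bit multiplication by rounding to a 32-bit float after every step. The helper names `to_float32` and `product_float32` and the three factors are made up for this illustration; they are not taken from the data file in the question.

```python
import struct


def to_float32(x):
    """Round a 64-bit Python float to the nearest 32-bit float."""
    return struct.unpack("f", struct.pack("f", x))[0]


def product_float32(factors):
    """Multiply left to right, rounding to 32 bits after every step."""
    result = to_float32(1.0)
    for f in factors:
        result = to_float32(result * to_float32(f))
    return result


# Hypothetical "probabilities", chosen so the rounding is easy to follow.
a = 0.5 + 2.0 ** -13  # 0.5001220703125, exactly representable in 32 bits
c = 0.75

first = product_float32([a, a, c])   # (a * a) * c
second = product_float32([c, a, a])  # (c * a) * a

print(first == second)      # the two orders do not give the same float
print(abs(first - second))  # the gap is one unit in the last place
```

The two products contain the same factors and differ only in the last bit. A difference that small can only change a predicted class when the scores of the two classes are nearly tied, so this shows that order dependence is possible, not that it explains the whole accuracy change.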

classification orange
