python - scikit-learn cross_validation: need more info about the resulting score
I'm attempting to generate "which engine works best" information for a project I'm on. The general idea is simple: pick an engine, run cross-validation, generate a list of cross-validation results, and the one with the biggest score is "best." All tests are done on the same set of training data. Here's a snippet of the idea. I'd set this up in a loop, and instead of setting simple_clf to svm.SVC() I'd have a loop over the engines and run the rest of the code for each one. The base data is in featurevecs, and scorenums contains the corresponding score value, 0 to 9, that a particular base data item is supposed to produce.
    from sklearn import svm, grid_search, cross_validation
    from sklearn.cross_validation import train_test_split

    X_train, X_test, y_train, y_test = train_test_split(
        featurevecs, scorenums, test_size=0.333, random_state=0)

    # this would sit in a loop over the engine types; for now I'm making sure
    # the basic code works with a single engine
    simple_clf = svm.SVC()
    simple_clf = grid_search.GridSearchCV(simple_clf, clfparams, cv=3)
    simple_clf.fit(X_train, y_train)
    kf = cross_validation.KFold(len(X_train), k=5)
    scores = cross_validation.cross_val_score(simple_clf, X_test, y_test, cv=kf)
    print scores.mean(), scores.std() / 2
    # the loop would end here

My problem is that scores isn't usable for saying which engine is "best." All scores gives me is the .mean() and .std() that I print. And I don't want an engine's result to count only exact matches; I want "close" matches too. In this case, close means a numeric score within 1 of the expected score: if the expected score is 3, then either 2, 3 or 4 would be considered a match and counted in the result.
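Just to make that definition concrete, here's a tiny illustration of my own (not part of the real pipeline) of what a "within 1" accuracy would look like on a handful of predictions:

    import numpy as np

    y_true = np.array([3, 7, 0])
    y_pred = np.array([4, 7, 2])

    # a prediction counts as a match if it is within 1 of the expected score
    close_accuracy = np.mean(np.abs(y_true - y_pred) <= 1)
    print close_accuracy  # 2 of the 3 predictions are within 1, so 0.666...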
I looked through the documentation, and it seems the latest bleeding-edge version of scikit-learn has an additional metrics package that allows a custom score function to be passed to grid search, but I'm unsure if that's enough for what I need, because I'd need to be able to pass it to the cross_val_score function, not to grid_search, no? Regardless, that isn't an option, since I'm locked into the version of scikit-learn I have to use.
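From what I can tell reading the newer docs, that interface looks roughly like this (a sketch only, not something I can run on my version; make_scorer and the scoring argument only exist in later releases, and in current releases the imports live in sklearn.metrics and sklearn.model_selection; simple_clf, X_test and y_test are the objects from my snippet above):

    import numpy as np
    from sklearn.metrics import make_scorer
    from sklearn.model_selection import cross_val_score

    def within_one_score(y_true, y_pred):
        # fraction of predictions within 1 of the expected score
        return np.mean(np.abs(y_true - y_pred) <= 1)

    # wrap the custom metric and hand it to cross_val_score via scoring=
    close_scorer = make_scorer(within_one_score, greater_is_better=True)
    scores = cross_val_score(simple_clf, X_test, y_test, cv=5, scoring=close_scorer)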
I also noted a reference to cross_val_predict in the latest bleeding-edge version, which seems like exactly what I need, but once again I'm locked into the version I have to use.
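If I could use it, my understanding is it would look something like this (again a sketch of the newer API that I can't run on my version; in current releases the import comes from sklearn.model_selection):

    import numpy as np
    from sklearn.model_selection import cross_val_predict

    # get out-of-fold predictions for every training item, then apply my own
    # definition of a match to them
    y_pred = cross_val_predict(simple_clf, X_train, y_train, cv=5)
    close_accuracy = np.mean(np.abs(np.asarray(y_train) - y_pred) <= 1)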
What was done before the bleeding edge, when the definition of "good" for cross-validation wasn't the default exact match? It has certainly been done before. I just need to be pointed in the right direction.
I'm stuck at version 0.11 of scikit-learn because of corporate policy: I can only use approved software, the version was approved a while ago, and upgrading isn't an alternative for me.
Here's what I changed things to, using the helpful hint to look at cross_val_score in the 0.11 docs and finding that it can take a custom score function, and that I can write my own as long as it matches the expected parameters. This is the code I have now. It does what I'm looking for: generating results based not only on an exact match but also on a "close" match, where close is defined as within 1.
    # kludgey way of switching the test between exact match and close match
    from __future__ import print_function
    import numpy as np

    score_count = 0
    score_crossover_count = 0

    def my_custom_score_function(y_true, y_pred):
        # kludgey way of switching the test between exact match and close match
        global score_count, score_crossover_count
        if score_count < score_crossover_count:
            close_applies = False
        else:
            close_applies = True
        score_count += 1
        print(close_applies, score_crossover_count, score_count)
        deltas = np.abs(y_true - y_pred)
        count = 0
        for delta in deltas:
            if delta == 0:
                count += 1
            elif close_applies and delta == 1:
                count += 1
        answer = float(count) / float(len(y_true))
        return answer

And the code snippet from the main routine:
    fold_count = 5
    # kludgey way of switching the test between exact match and close match:
    # set the global variables used by the custom scorer function
    global score_count, score_crossover_count
    score_count = 0
    score_crossover_count = fold_count
    # simple cross validation
    simple_clf = svm.SVC()
    simple_clf = grid_search.GridSearchCV(simple_clf, clfparams, cv=3)
    simple_clf.fit(X_train, y_train)
    print('{0} '.format(test_type), end="")
    kf = cross_validation.KFold(len(X_train), k=fold_count)
    scores = cross_validation.cross_val_score(
        simple_clf, X_train, y_train, cv=kf, score_func=my_custom_score_function)
    print('accuracy (+/- 0) {1:0.4f} (+/- {2:0.4f}) '.format(
        scores, scores.mean(), scores.std() / 2), end="")
    scores = cross_validation.cross_val_score(
        simple_clf, X_train, y_train, cv=kf, score_func=my_custom_score_function)
    print('accuracy (+/- 1) {1:0.4f} (+/- {2:0.4f}) '.format(
        scores, scores.mean(), scores.std() / 2), end="")
    print("")
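A variant I'm considering to avoid the global-variable kludge (my own sketch, untested on 0.11, assuming score_func only needs a callable taking (y_true, y_pred)): two separate, stateless score functions, one per definition of a match, each passed to its own cross_val_score call.

    import numpy as np

    def exact_score(y_true, y_pred):
        # fraction of predictions that match the expected score exactly
        return np.mean(np.asarray(y_true) == np.asarray(y_pred))

    def within_one_score(y_true, y_pred):
        # fraction of predictions within 1 of the expected score
        return np.mean(np.abs(np.asarray(y_true) - np.asarray(y_pred)) <= 1)

    scores_exact = cross_validation.cross_val_score(
        simple_clf, X_train, y_train, cv=kf, score_func=exact_score)
    scores_close = cross_validation.cross_val_score(
        simple_clf, X_train, y_train, cv=kf, score_func=within_one_score)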
You can find the documentation for cross_val_score in 0.11 here. It lets you provide a custom score function through the score_func argument, although the interface is different from the newer releases. As an aside: why are you "locked into" your current version? scikit-learn is usually backward compatible for two releases.
python scikit-learn