Saturday, 15 January 2011

python - Loading svmlight style file when there are less features in the train set than in the test set




I have updated scikit-learn to the latest version, 0.15.2 (more specifically, I have created a new Anaconda environment). It seems that, in this version, a new ValueError has been defined in the method load_svmlight_files() in sklearn/datasets/svmlight_format.py (line 238):

    elif n_features < n_f:
        raise ValueError("n_features was set to {},"
                         " but input file contains {} features"
                         .format(n_features, n_f))

My problem is that I load a model and then want to load the test data. I use the shape of the model's coef_ attribute when loading the test data (via the n_features parameter of load_svmlight_file()). If the model has fewer features than the test data, the loading fails. Is there a way to handle this setting? I'm not sure when this exception was added, but it seems to be absent in the 0.14.1 release. As an additional question, why was this exception added?

    >>> from sklearn.externals import joblib
    >>> from sklearn.datasets import load_svmlight_file
    >>> clf = joblib.load('mymodel')
    >>> print clf.coef_.shape
    (11, 9862)
    >>> x, y = load_svmlight_file('test_data', n_features=clf.coef_.shape[1])
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File ".../anaconda/envs/test/lib/python2.7/site-packages/sklearn/datasets/svmlight_format.py", line 113, in load_svmlight_file
        zero_based, query_id))
      File ".../anaconda/envs/test/lib/python2.7/site-packages/sklearn/datasets/svmlight_format.py", line 248, in load_svmlight_files
        .format(n_features, n_f))
    ValueError: n_features was set to 9862, but input file contains 34912 features
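One possible workaround, offered only as a sketch: load the test file without n_features so the loader infers the width itself, then trim (or pad) the resulting sparse matrix to the model's width. This assumes that features beyond the model's width can simply be ignored, and that the train and test files use the same index base, so column i means the same feature in both. The helper name load_for_model and the inline sample data are hypothetical and not part of scikit-learn.

```python
from io import BytesIO

from scipy.sparse import csr_matrix
from sklearn.datasets import load_svmlight_file


def load_for_model(source, n_model_features):
    """Load svmlight data and force it to n_model_features columns.

    Hypothetical helper, not a scikit-learn function.
    """
    # Do not pass n_features here: let the loader infer the width,
    # which avoids the ValueError when the file is wider than the model.
    X, y = load_svmlight_file(source)
    n_rows, n_file_features = X.shape

    if n_file_features > n_model_features:
        # Drop the trailing columns the model was never trained on.
        X = X[:, :n_model_features]
    elif n_file_features < n_model_features:
        # Widen the matrix; the extra columns are implicitly all zero.
        X = csr_matrix((X.data, X.indices, X.indptr),
                       shape=(n_rows, n_model_features))
    return X, y


# Tiny in-memory file with one-based feature indices up to 7.
sample = b"1 1:0.5 3:1.0 7:2.0\n0 2:1.5 5:0.25\n"

# Pretend the model was trained on 4 features (stand-in for clf.coef_.shape[1]).
X_test, y_test = load_for_model(BytesIO(sample), 4)
print(X_test.shape)
```

With a real model, the second argument would be clf.coef_.shape[1]. Whether silently discarding the extra test features is acceptable depends on the application; a test set much wider than the model may also indicate that the train and test files were produced with different vocabularies.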

python machine-learning scikit-learn
