I have a multi-class classification problem and my dataset is skewed: I have 100 instances of a particular class and, say, 10 instances of some different class, so I want to split my dataset while keeping the ratio between classes. If I have 100 instances of a particular class and 30% of the records should go to the test set, then I want 30 of those 100 instances in the test set, 3 of the 10 instances of the other class, and so on.
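A quick sketch of the split being asked for, using scikit-learn's train_test_split with its stratify argument (the class counts here are illustrative, mirroring the 100-vs-10 example above):

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Imbalanced toy data: 100 instances of class 0, 10 instances of class 1
X = np.arange(110).reshape(-1, 1)
y = np.array([0] * 100 + [1] * 10)

# stratify=y keeps the class ratio identical in both splits
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# 30% of each class lands in the test set: 30 of class 0, 3 of class 1
print(np.bincount(y_test))   # [30  3]
print(np.bincount(y_train))  # [70  7]
```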
You can use StratifiedKFold from scikit-learn. From the online docs:

Stratified K-Folds cross-validation iterator

Provides train/test indices to split data in train/test sets.

This is a variation of KFold, which returns stratified folds. The folds are made by preserving the percentage of samples for each class.
>>> from sklearn import cross_validation
>>> X = np.array([[1, 2], [3, 4], [1, 2], [3, 4]])
>>> y = np.array([0, 0, 1, 1])
>>> skf = cross_validation.StratifiedKFold(y, n_folds=2)
>>> len(skf)
2
>>> print(skf)
sklearn.cross_validation.StratifiedKFold(labels=[0 0 1 1], n_folds=2, shuffle=False, random_state=None)
>>> for train_index, test_index in skf:
...     print("TRAIN:", train_index, "TEST:", test_index)
...     X_train, X_test = X[train_index], X[test_index]
...     y_train, y_test = y[train_index], y[test_index]
TRAIN: [1 3] TEST: [0 2]
TRAIN: [0 2] TEST: [1 3]
This will preserve your class ratio, so the splits maintain the class distribution; it also works fine with pandas DataFrames.
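Note that the cross_validation module used above belongs to an old scikit-learn release; in current versions the same iterator lives in sklearn.model_selection, takes n_splits instead of n_folds, and receives y in split() rather than in the constructor. A minimal sketch of the equivalent modern usage:

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

X = np.array([[1, 2], [3, 4], [1, 2], [3, 4]])
y = np.array([0, 0, 1, 1])

# n_splits replaces the old n_folds argument; labels go to split(), not __init__
skf = StratifiedKFold(n_splits=2)
for train_index, test_index in skf.split(X, y):
    X_train, X_test = X[train_index], X[test_index]
    y_train, y_test = y[train_index], y[test_index]
    # each test fold contains one sample of each class
    print("TRAIN:", train_index, "TEST:", test_index)
```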
As suggested by @eli_m, you can use StratifiedShuffleSplit, which accepts a split-ratio argument:
sss = StratifiedShuffleSplit (y, 3, test_size = 0.7, random_state = 0)
will produce splits where the test set holds 70% of the data, with the class ratio preserved in each split.
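The constructor call above is again the old cross_validation signature; in current scikit-learn, StratifiedShuffleSplit sits in sklearn.model_selection and y is passed to split(). A sketch with illustrative class counts (100 vs. 10, as in the question):

```python
import numpy as np
from sklearn.model_selection import StratifiedShuffleSplit

X = np.zeros((110, 2))
y = np.array([0] * 100 + [1] * 10)

# n_splits=3 draws three independent stratified shuffles,
# each holding out 70% of the data as the test set
sss = StratifiedShuffleSplit(n_splits=3, test_size=0.7, random_state=0)
for train_index, test_index in sss.split(X, y):
    # every test set keeps the 10:1 ratio: 70 of class 0, 7 of class 1
    print(np.bincount(y[test_index]))
```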