sklearn.ensemble.AdaBoostClassifier cannot accept SVM as base_estimator?

Question

I am doing a text classification task. Now I want to use ensemble.AdaBoostClassifier with LinearSVC as base_estimator. However, when I try to run the code

clf = AdaBoostClassifier(svm.LinearSVC(), n_estimators=50, learning_rate=1.0, algorithm='SAMME.R')
clf.fit(X, y)

An error occurred. TypeError: AdaBoostClassifier with algorithm='SAMME.R' requires that the weak learner supports the calculation of class probabilities with a predict_proba method

The first question is: can svm.LinearSVC() not calculate class probabilities? How can I make it calculate them?

Then I changed the algorithm parameter and ran the code again.

clf = AdaBoostClassifier(svm.LinearSVC(), n_estimators=50, learning_rate=1.0, algorithm='SAMME')
clf.fit(X, y)

This time I get TypeError: fit() got an unexpected keyword argument 'sample_weight'. The AdaBoostClassifier documentation describes sample_weight as: "Sample weights. If None, the sample weights are initialized to 1 / n_samples." Even if I assign an integer to n_samples, the error still occurs.

The second question is: what does n_samples mean, and how do I solve this problem?

I hope someone can help me.

Following @jme's comment, I then tried

clf = AdaBoostClassifier(svm.SVC(kernel='linear', probability=True), n_estimators=10, learning_rate=1.0, algorithm='SAMME.R')
clf.fit(X, y)

The program never produces a result, and the memory used on the server stays unchanged.

The third question is: how can I make AdaBoostClassifier work with SVC as base_estimator?

You generally want to use a weak classifier (even weaker than a linear SVM) in boosting, something like a shallow decision tree (or stump). The error messages you've received are fairly straightforward: no, you cannot use `LinearSVC` here, as the class doesn't have a `predict_proba` method. `SVC` does, if you use a linear kernel. — jme, Nov 24 '14 at 15:34
@jme I have tried `LinearSVC`, but as the task is text classification, I wonder whether I could use `SVM` in boosting to improve the accuracy. Do you think it is possible? — allenwang, Nov 25 '14 at 02:34
@jme I am a newbie in machine learning. Besides `ensemble`, do you know other methods that can improve the accuracy in text classification? — allenwang, Nov 25 '14 at 02:55

score 14 · Accepted Answer · answered Jan 11 '16 at 00:02

The right answer will depend on exactly what you're looking for. LinearSVC cannot predict class probabilities (required by the default algorithm used by AdaBoostClassifier) and, as the TypeError in the question shows, its fit method did not accept sample_weight in the scikit-learn version used there.

You should be aware that a Support Vector Machine does not natively predict class probabilities. When SVC reports them, they are computed using Platt scaling (or an extension of Platt scaling in the multi-class case), a technique which has known issues. If you need less "artificial" class probabilities, an SVM might not be the way to go.
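To make the Platt scaling idea concrete, here is a minimal pure-Python sketch. It assumes plain gradient descent on the log loss, not the optimizer or the smoothed targets from Platt's paper, and the function names and toy scores are made up for illustration. It fits a sigmoid P(y=1 | f) = 1 / (1 + exp(A*f + B)) on top of raw decision scores f:

```python
import math

def fit_platt(scores, labels, lr=0.1, n_iter=2000):
    """Fit P(y=1 | f) = 1 / (1 + exp(A*f + B)) by gradient descent on log loss.

    Hypothetical helper for illustration; not scikit-learn's implementation.
    """
    a, b = 0.0, 0.0
    n = len(scores)
    for _ in range(n_iter):
        grad_a = grad_b = 0.0
        for f, t in zip(scores, labels):
            p = 1.0 / (1.0 + math.exp(a * f + b))
            # Derivative of the log loss w.r.t. (A*f + B) is (t - p).
            grad_a += (t - p) * f
            grad_b += (t - p)
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b

def platt_proba(score, a, b):
    """Map a raw decision score to a probability with the fitted sigmoid."""
    return 1.0 / (1.0 + math.exp(a * score + b))

# Toy decision scores (e.g. signed distances to a hyperplane) and 0/1 labels.
scores = [-2.0, -1.0, -0.5, 0.2, 0.4, 1.0, 1.5, 2.5]
labels = [0, 0, 1, 0, 1, 1, 1, 1]

a, b = fit_platt(scores, labels)
probs = [platt_proba(s, a, b) for s in scores]
```

The probabilities are a monotone rescaling of the scores, so they are only as good as the fitted sigmoid, which is the sense in which they are "artificial".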

With that said, I believe the most satisfying answer given your question would be that given by Graham. That is,

from sklearn.svm import SVC
from sklearn.ensemble import AdaBoostClassifier

clf = AdaBoostClassifier(SVC(probability=True, kernel='linear'), ...)

You have other options. You can use SGDClassifier with a hinge loss function and set AdaBoostClassifier to use the SAMME algorithm (which does not require a predict_proba function, but does require support for sample_weight):

from sklearn.linear_model import SGDClassifier

clf = AdaBoostClassifier(SGDClassifier(loss='hinge'), algorithm='SAMME', ...)
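A fuller version of the SGDClassifier option might look like the sketch below. The synthetic data from make_classification is only a stand-in for the X and y in the question, and the `algorithm` keyword is passed only if the installed AdaBoostClassifier still accepts it, since that is assumed to vary between scikit-learn releases.

```python
import inspect

from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.linear_model import SGDClassifier
from sklearn.svm import LinearSVC

# Synthetic stand-in for the question's (X, y); purely illustrative.
X, y = make_classification(n_samples=200, n_features=20, random_state=0)

# LinearSVC has no predict_proba, which is what SAMME.R complained about.
linear_svc_has_proba = hasattr(LinearSVC(), "predict_proba")

# A linear SVM trained by SGD (hinge loss).
base = SGDClassifier(loss="hinge", random_state=0)

# SAMME needs the base estimator's fit() to accept sample_weight.
supports_weights = "sample_weight" in inspect.signature(base.fit).parameters

# Assumption: only pass `algorithm` when this release has the keyword.
kwargs = {"n_estimators": 10, "random_state": 0}
if "algorithm" in inspect.signature(AdaBoostClassifier.__init__).parameters:
    kwargs["algorithm"] = "SAMME"

clf = AdaBoostClassifier(base, **kwargs)
clf.fit(X, y)
train_accuracy = clf.score(X, y)
```

The same inspect.signature check on any candidate base estimator tells you up front whether it would hit the sample_weight TypeError from the question.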

Perhaps the best answer would be to use a classifier that has native support for class probabilities, like Logistic Regression, if you want to use the default algorithm provided for AdaBoostClassifier. You can do this using sklearn.linear_model.LogisticRegression or using SGDClassifier with a log loss function, as in the code provided by Kris.

Hope that helps. If you're curious about what Platt scaling is, check out the original paper by John Platt.

score 1 · Answer 2 · answered Mar 21 '15 at 07:11

You need to use a learner that has a predict_proba method. Since this isn't available in LinearSVC, try SVC with kernel set to 'linear' and probability=True:

clf = AdaBoostClassifier(svm.SVC(probability=True, kernel='linear'), n_estimators=50, learning_rate=1.0, algorithm='SAMME')
clf.fit(X, y)

I'm not sure whether this will yield results identical to LinearSVC. The documentation for LinearSVC says:

Similar to SVC with parameter kernel=’linear’, but implemented in terms of liblinear rather than libsvm, so it has more flexibility in the choice of penalties and loss functions and should scale better (to large numbers of samples).

It also mentions something about one-vs-all and one-vs-one in terms of how the two differ.

score 1 · Answer 3 · answered Jun 30 '17 at 10:42

Actually, LinearSVC can be used with AdaBoostClassifier without rescaling the SVM output through Platt scaling. That is the setting the AdaBoost.M1 algorithm [1] was originally designed for: a classifier that outputs labels in {-1, 1}. The default algorithm in AdaBoostClassifier is the SAMME.R variant of AdaBoost.SAMME [2] (selected with "SAMME.R" in the algorithm keyword argument), which is designed for multi-class classification.

However, an AdaBoost ensemble built on LinearSVC will be unable to provide predict_proba. On the other hand, if what you want is to keep the sign of the output, rather than fitting the SVM output to a sigmoid curve to obtain probabilities, then changing the algorithm from SAMME.R to SAMME is the easiest way to do it.

[1] Y. Freund, R. Schapire, “A Decision-Theoretic Generalization of On-Line Learning and an Application to Boosting”, 1995.
[2] J. Zhu, H. Zou, S. Rosset, T. Hastie, “Multi-class AdaBoost”, 2009.
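To illustrate the point about {-1, 1} outputs, here is a pure-Python sketch of the discrete boosting loop with threshold stumps as the weak learner. The helper names and the toy data are hypothetical; for two classes the SAMME estimator weight reduces to log((1 - err) / err). The sketch also shows where sample weights come from: they start at 1 / n_samples, where n_samples is just the number of training rows rather than a parameter you set, and the weak learner has to take them into account when fitting.

```python
import math

def stump_predict(x, threshold, sign):
    """Weak learner output is a hard label in {-1, 1}; no probabilities."""
    return sign if x > threshold else -sign

def fit_stump(xs, ys, weights):
    """Return (weighted_error, threshold, sign) with the lowest weighted error."""
    best = None
    for threshold in sorted(set(xs)):
        for sign in (1, -1):
            err = sum(w for x, y, w in zip(xs, ys, weights)
                      if stump_predict(x, threshold, sign) != y)
            if best is None or err < best[0]:
                best = (err, threshold, sign)
    return best

def adaboost_fit(xs, ys, n_rounds=3):
    n_samples = len(xs)
    weights = [1.0 / n_samples] * n_samples  # uniform initial sample weights
    ensemble = []
    for _ in range(n_rounds):
        err, threshold, sign = fit_stump(xs, ys, weights)
        err = min(max(err, 1e-10), 1 - 1e-10)  # avoid log(0) / division by zero
        alpha = math.log((1 - err) / err)      # two-class SAMME estimator weight
        ensemble.append((alpha, threshold, sign))
        # Up-weight the samples this stump got wrong, then renormalise.
        weights = [w * math.exp(alpha) if stump_predict(x, threshold, sign) != y else w
                   for x, y, w in zip(xs, ys, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble

def adaboost_predict(ensemble, x):
    """Final prediction is the sign of the weighted vote."""
    score = sum(alpha * stump_predict(x, t, s) for alpha, t, s in ensemble)
    return 1 if score > 0 else -1

# Toy 1-D data that no single stump classifies perfectly.
xs = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]

ensemble = adaboost_fit(xs, ys, n_rounds=3)
predictions = [adaboost_predict(ensemble, x) for x in xs]
```

A base estimator whose fit() cannot accept per-sample weights has no way to play the role of fit_stump here, which is why that keyword matters.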

score 0 · Answer 4 · edited May 23 '17 at 12:02

I just had a similar issue trying to use AdaBoostClassifier with LogisticRegression. The docs mention that the weak classifier (or base_estimator) must have a fit method that takes the optional sample_weight=... keyword argument, cf. question #18306416.

If you do want to use an SVM or logistic regression with AdaBoost, you can use sklearn's stochastic gradient descent classifier with loss='hinge' (SVM) or loss='log' (logistic; recent scikit-learn releases spell this loss 'log_loss'), e.g.

from sklearn.linear_model import SGDClassifier
from sklearn.ensemble import AdaBoostClassifier

clf = AdaBoostClassifier(SGDClassifier(loss='log'), ...)

YMMV

edited May 23 '17 at 12:02 by Community · answered Apr 15 '15 at 11:22 by Kris

Thank you for your answer. But if SGDClassifier(loss='log') is passed as the first parameter, how can I use an SVM classifier? Please show me some details; I really need your help. – allenwang, Apr 28 '15 at 03:10
