Logistic regression in sklearn & Gradient Descent
sklearn’s Logistic Regression implementation does not use Gradient Descent by default. Instead, it uses more sophisticated optimization techniques depending on the solver specified. These solvers are more efficient and robust for finding the optimal parameters for logistic regression. Here’s a summary:
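As a minimal sketch of this point (the synthetic dataset from `make_classification` is only for illustration), fitting `LogisticRegression` without a `solver` argument falls back to the library's default solver rather than plain gradient descent:

```python
# Minimal sketch: sklearn's LogisticRegression optimizes the cost function
# with a chosen solver, not vanilla gradient descent.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1_000, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# With no solver argument, sklearn uses its default ('lbfgs').
clf = LogisticRegression(max_iter=1_000)
clf.fit(X_train, y_train)
print(clf.solver, clf.score(X_test, y_test))
```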
**Optimisation function: Solvers in sklearn’s Logistic Regression**
`lbfgs` (the default):
- Stands for Limited-memory Broyden–Fletcher–Goldfarb–Shanno.
- A quasi-Newton method that approximates the second derivatives (the Hessian matrix) to find the minimum of the cost function efficiently.
`liblinear`:
- Uses a coordinate descent method for optimization.
- Ideal for small datasets or when `penalty='l1'` is used.
`sag` (Stochastic Average Gradient):
- An iterative solver similar to stochastic gradient descent (SGD), but it averages gradients over all samples.
- Efficient for large datasets.
`saga`:
- An improved version of `sag`, supporting both `l1` and `l2` penalties.
- Suitable for sparse and large datasets.
`newton-cg`:
- Uses Newton’s method with conjugate gradients.
- Efficient for datasets with many features.
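To see these solvers side by side, here is a hedged sketch that runs each one on the same synthetic data. The solver/penalty pairings follow sklearn's compatibility rules (for example, `l1` requires `liblinear` or `saga`); the dataset, `max_iter`, and scores are illustrative assumptions, not benchmarks.

```python
# Sketch: running the solvers discussed above on the same synthetic data.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=5_000, n_features=50, random_state=0)
# sag and saga converge much faster on standardized features.
X = StandardScaler().fit_transform(X)

configs = [
    {"solver": "lbfgs", "penalty": "l2"},
    {"solver": "liblinear", "penalty": "l1"},
    {"solver": "sag", "penalty": "l2"},
    {"solver": "saga", "penalty": "l1"},
    {"solver": "newton-cg", "penalty": "l2"},
]

for cfg in configs:
    clf = LogisticRegression(max_iter=2_000, **cfg)
    clf.fit(X, y)
    print(f"{cfg['solver']:>10} ({cfg['penalty']}): accuracy = {clf.score(X, y):.3f}")
```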