L1 regularization adds a penalty proportional to the absolute values of the model coefficients to the loss function. This penalty encourages sparsity-some coefficients become exactly zero-making it useful for feature selection.

Recall the geometry of the L1 and L2 unit balls: the L1 ball has corners on the coordinate axes, which is why an L1 penalty tends to push solutions onto those axes, i.e., toward coefficients that are exactly zero.

Loss Function:

  Loss = MSE + λ · Σᵢ |wᵢ|

where:

  • MSE = Mean Squared Error
  • λ = Regularization strength
  • wᵢ = Model weights
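As a minimal sketch, the loss above can be computed directly with NumPy (the function name `lasso_loss` and the tiny dataset are illustrative, not part of any library API):

```python
import numpy as np

def lasso_loss(X, y, w, lam):
    """Mean squared error plus an L1 penalty on the weights."""
    residuals = y - X @ w
    mse = np.mean(residuals ** 2)          # MSE term
    l1_penalty = lam * np.sum(np.abs(w))   # λ · Σ|wᵢ|
    return mse + l1_penalty

# Tiny example: 3 samples, 2 features, and weights that fit y exactly,
# so the loss reduces to the penalty term alone: 0.1 * (1 + 2)
X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
y = np.array([1.0, 2.0, 3.0])
w = np.array([1.0, 2.0])

print(lasso_loss(X, y, w, lam=0.1))
```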

Key Properties:

  • Adds penalty based on absolute value of coefficients.
  • Drives some coefficients to exactly zero, effectively removing less relevant features.
  • Produces a sparse model (subset of important features retained).
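The sparsity effect can be sketched on synthetic data (the feature count, noise level, and alpha here are arbitrary choices for illustration):

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
# Target depends only on the first two features; the other eight are noise
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=200)

model = Lasso(alpha=0.1)
model.fit(X, y)

# Most entries of coef_ are exactly 0; the informative features survive
print(model.coef_)
print(np.count_nonzero(model.coef_))
```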

Example (Lasso in scikit-learn):

from sklearn.linear_model import Lasso

# Initialize and fit Lasso model; alpha controls regularization strength
model = Lasso(alpha=0.1)
model.fit(X_train, y_train)  # X_train, y_train: your training features and targets

Use Case:

  • Ideal for feature selection when dealing with many predictors.
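One way to apply this for feature selection is scikit-learn's SelectFromModel wrapper, which drops columns whose Lasso coefficients are (near) zero. A sketch on synthetic data (dataset shape and alpha are illustrative assumptions):

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.feature_selection import SelectFromModel

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 20))
# Only feature 3 drives the target; the rest are noise
y = 5.0 * X[:, 3] + rng.normal(scale=0.5, size=200)

selector = SelectFromModel(Lasso(alpha=0.1)).fit(X, y)
X_selected = selector.transform(X)  # keeps only columns with nonzero coefficients
print(X_selected.shape)
```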
