Gradient descent in linear regression
In short, gradient descent iteratively updates a model's coefficients to minimize prediction error.
Gradient descent is an optimization algorithm used to minimize the cost function in linear regression by iteratively adjusting the model parameters (coefficients). Here’s how it works with linear regression:
1. Initialize Parameters: Start with initial guesses for the coefficients (weights), typically set to zero or small random values.
2. Compute Predictions: Use the current coefficients to make predictions for the dependent variable with the linear regression model: $\hat{y}^{(i)} = \theta_0 + \theta_1 x_1^{(i)} + \dots + \theta_n x_n^{(i)}$, where $\hat{y}^{(i)}$ is the prediction for the $i$-th observation.
3. Calculate the Cost Function: Compute the loss function, the sum of squared errors (SSE): $\text{SSE} = \sum_{i=1}^{m} \left( \hat{y}^{(i)} - y^{(i)} \right)^2$, where $m$ is the number of observations.
4. Compute the Gradient: Calculate the gradient of the SSE with respect to each coefficient. The gradient is a vector of partial derivatives indicating the direction and rate of change of the cost function: $\frac{\partial\,\text{SSE}}{\partial \theta_j} = 2 \sum_{i=1}^{m} \left( \hat{y}^{(i)} - y^{(i)} \right) x_j^{(i)}$. Here, $x_j^{(i)}$ is the value of the $j$-th feature for the $i$-th observation (with $x_0^{(i)} = 1$ for the intercept term).
5. Update the Coefficients: Adjust the coefficients in the opposite direction of the gradient to reduce the cost function. This is done using a learning rate $\alpha$, which controls the size of the steps taken: $\theta_j \leftarrow \theta_j - \alpha \, \frac{\partial\,\text{SSE}}{\partial \theta_j}$.
6. Iterate and Converge: Repeat steps 2 to 5 until the cost function converges to a minimum or a predefined number of iterations is reached. The algorithm converges when the changes in the cost function or the coefficients become very small, indicating that the minimum has been reached; the sketch after this list implements these steps.
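To make the steps concrete, here is a minimal runnable sketch of batch gradient descent for linear regression in Python with NumPy. It follows the SSE cost and gradient from steps 3 and 4; the function name, the learning rate, the iteration limit, and the convergence tolerance are illustrative assumptions, not values from the text.

```python
import numpy as np

def gradient_descent_linreg(X, y, lr=0.005, n_iters=5000, tol=1e-10):
    """Fit linear regression by batch gradient descent on the SSE cost.

    X: (m, n) feature matrix; y: (m,) target vector.
    lr, n_iters, and tol are illustrative defaults, not values from the text.
    """
    m, n = X.shape
    Xb = np.hstack([np.ones((m, 1)), X])  # prepend x_0 = 1 for the intercept
    theta = np.zeros(n + 1)               # step 1: initialize coefficients to zero
    prev_sse = np.inf
    for _ in range(n_iters):
        y_hat = Xb @ theta                # step 2: predictions from current coefficients
        residuals = y_hat - y
        sse = residuals @ residuals       # step 3: SSE = sum of squared errors
        grad = 2 * Xb.T @ residuals       # step 4: dSSE/dtheta_j = 2 * sum_i (residual_i * x_ij)
        theta -= lr * grad                # step 5: step against the gradient
        if abs(prev_sse - sse) < tol:     # step 6: stop when the cost barely changes
            break
        prev_sse = sse
    return theta

if __name__ == "__main__":
    # Synthetic data with known coefficients, y = 2 + 3x + noise (illustrative)
    rng = np.random.default_rng(0)
    X = rng.uniform(0.0, 1.0, size=(100, 1))
    y = 2.0 + 3.0 * X[:, 0] + rng.normal(0.0, 0.1, size=100)
    print(gradient_descent_linreg(X, y))  # roughly [2.0, 3.0]
```

One design note: because the SSE gradient is a sum rather than a mean, the largest stable learning rate shrinks as the number of observations grows. Many implementations therefore divide the cost and gradient by $m$ (i.e., minimize the mean squared error instead), which makes the learning rate far less sensitive to dataset size.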