Momentum is an optimisation technique that accelerates the Gradient Descent algorithm by incorporating the concept of inertia: a fraction of the previous update is added to the current update. This dampens oscillations and speeds up convergence, especially when the cost function has a complex landscape (surface). Formula:

$$v_t = \gamma\, v_{t-1} + \eta\, \nabla_\theta J(\theta)$$
$$\theta = \theta - v_t$$

Where:

  • $v_t$ is the velocity (the accumulated gradient).
  • $\gamma$ is the momentum factor.
  • $\nabla_\theta J(\theta)$ is the gradient of the cost function $J$ with respect to the parameters $\theta$.
  • $\eta$ is the learning rate.

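A minimal sketch of this update rule in Python (the function and variable names are illustrative, not necessarily those used in Momentum.py):

```python
def momentum_step(theta, velocity, grad, lr=0.01, gamma=0.9):
    """One momentum update: fold the gradient into the velocity, then step."""
    velocity = gamma * velocity + lr * grad  # v_t = gamma * v_{t-1} + eta * grad
    theta = theta - velocity                 # theta = theta - v_t
    return theta, velocity
```

The same code works on plain floats or NumPy arrays, since it only uses elementwise arithmetic.
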
In ML_Tools see: Momentum.py

Key Features of Momentum

Inertia Effect: Momentum uses past gradients to smooth out the updates, which helps navigate the parameter space more effectively.

Parameter Update Rule: The update rule maintains a velocity vector that accumulates gradients; the parameters are then updated with this velocity, which combines the current gradient with the previous velocity, as the sketch below illustrates.
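
As a rough illustration of why this helps, here is a self-contained run on an ill-conditioned quadratic cost, where plain gradient descent tends to oscillate along the steep axis (the matrix, learning rate, and step count below are assumptions chosen for the demo, not taken from the source):

```python
import numpy as np

# Quadratic cost J(theta) = 0.5 * theta^T A theta, with its minimum at the origin.
A = np.diag([1.0, 25.0])           # curvatures differ by 25x (ill-conditioned)
theta = np.array([2.0, 2.0])
velocity = np.zeros_like(theta)
lr, gamma = 0.02, 0.9

for _ in range(200):
    grad = A @ theta                          # gradient of the cost at theta
    velocity = gamma * velocity + lr * grad   # accumulate gradient into velocity
    theta = theta - velocity                  # update parameters with the velocity

print(theta)  # both coordinates end up close to the minimum at [0, 0]
```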

Hyperparameters

  • Learning Rate ($\eta$): Controls the size of the steps taken towards the minimum.
  • Momentum Coefficient ($\gamma$): Determines the contribution of the previous gradients to the current update. A typical value is 0.9.
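
One way to read the momentum coefficient (a standard back-of-the-envelope check, not from the source): if the gradient were constant, the velocity would converge to a geometric series, so $\gamma = 0.9$ amplifies the effective step size by up to $1/(1-\gamma) = 10\times$:

```python
lr, gamma, g = 0.01, 0.9, 1.0  # learning rate, momentum coefficient, constant gradient
v = 0.0
for _ in range(100):
    v = gamma * v + lr * g     # geometric accumulation of the same gradient
print(v)                       # ~0.1 == lr * g / (1 - gamma), i.e. 10x the raw step
```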