Lagrange multipliers are a mathematical method for solving constrained optimization problems: maximizing or minimizing a function subject to one or more constraints. Many machine learning models, such as SVMs and constrained maximum-likelihood estimators, rely on them.
General Idea
Suppose we want to minimize (or maximize) an objective $f(x)$ subject to a constraint $g(x) = 0$. We introduce a new variable $\lambda$ (the Lagrange multiplier) and define the Lagrangian:

$$\mathcal{L}(x, \lambda) = f(x) + \lambda\, g(x)$$

Then solve by setting the partial derivatives to zero:

$$\frac{\partial \mathcal{L}}{\partial x} = 0, \qquad \frac{\partial \mathcal{L}}{\partial \lambda} = 0,$$

where the second condition simply recovers the constraint $g(x) = 0$.
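To make the recipe concrete, here is a minimal worked example (my own illustration, not from the text): minimize $f(x, y) = x^2 + y^2$ subject to $x + y = 1$. For this quadratic objective and linear constraint, the stationarity conditions of the Lagrangian form a linear system we can solve directly:

```python
import numpy as np

# Minimize f(x, y) = x^2 + y^2 subject to g(x, y) = x + y - 1 = 0.
# Stationarity of L(x, y, lam) = f + lam * g gives a linear system:
#   dL/dx   = 2x + lam     = 0
#   dL/dy   = 2y + lam     = 0
#   dL/dlam = x + y - 1    = 0
A = np.array([[2.0, 0.0, 1.0],
              [0.0, 2.0, 1.0],
              [1.0, 1.0, 0.0]])
b = np.array([0.0, 0.0, 1.0])

x, y, lam = np.linalg.solve(A, b)
print(x, y, lam)  # 0.5 0.5 -1.0
```

The constrained minimum is $(x, y) = (1/2, 1/2)$ with multiplier $\lambda = -1$; in general the conditions are nonlinear and need a numerical solver, but the structure is the same.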
Why It Matters in ML
Lagrange multipliers let us handle constraints directly in optimization problems common in machine learning. Examples:
- **Support vector machines:** the optimization problem minimizes a margin-based loss subject to classification constraints of the form $y_i(w \cdot x_i + b) \ge 1$. Lagrange multipliers transform it into a dual problem, making the solution tractable.
- **Regularization:** can be seen as introducing a constraint on the weights (e.g., $\|w\|^2 \le C$). The Lagrange multiplier turns this constraint into a penalty term, as in ridge regression.
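The regularization point can be sketched numerically (an illustrative example with made-up data, not from the text): the penalty coefficient in ridge regression plays the role of the Lagrange multiplier for the norm constraint, and a positive penalty shrinks the weights relative to ordinary least squares:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true

lam = 0.1  # penalty coefficient, playing the role of the Lagrange multiplier

# Closed-form ridge solution: w = (X^T X + lam * I)^{-1} X^T y
w_ridge = np.linalg.solve(X.T @ X + lam * np.eye(3), X.T @ y)

# Unregularized least-squares solution for comparison
w_ols = np.linalg.solve(X.T @ X, X.T @ y)

# The penalty shrinks the weight norm, enforcing the implicit constraint
print(np.linalg.norm(w_ridge) < np.linalg.norm(w_ols))  # True
```

Larger `lam` corresponds to a tighter constraint $\|w\|^2 \le C$ (smaller $C$); the two formulations trace out the same family of solutions.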
Intuition
- Without constraints: move downhill on $f(x)$ until reaching the minimum.
- With constraints: you must stay on the surface defined by $g(x) = 0$.
- The Lagrange multiplier $\lambda$ balances the trade-off between optimizing $f$ and satisfying the constraint.
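The geometric picture behind this intuition is that, at the constrained optimum, $\nabla f$ is parallel to $\nabla g$ (that is exactly what $\nabla f + \lambda \nabla g = 0$ says). A quick numerical check, reusing the toy problem $f(x, y) = x^2 + y^2$ with $g(x, y) = x + y - 1$ whose optimum is $(1/2, 1/2)$:

```python
import numpy as np

# Gradients at the constrained optimum (0.5, 0.5):
grad_f = np.array([2 * 0.5, 2 * 0.5])  # grad f = (2x, 2y) -> (1, 1)
grad_g = np.array([1.0, 1.0])          # grad g = (1, 1) everywhere

# The 2D cross-product determinant is zero iff the gradients are parallel
det = grad_f[0] * grad_g[1] - grad_f[1] * grad_g[0]
print(np.isclose(det, 0.0))  # True
```

Away from the optimum, the gradients are not aligned, so sliding along the constraint surface can still decrease $f$.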