The concept of a cost function is central to Model Optimisation, particularly during the training phase of a model.

A cost function (also called a loss function or error function) is a mathematical function used in optimization and machine learning to measure the difference between predicted values and actual values. It quantifies the error or “cost” of a model’s predictions. The main goal of most learning algorithms is to minimize this cost function, thereby improving model accuracy.

Common examples include:

  • Mean Squared Error (MSE) for regression tasks.
  • Cross-Entropy Loss for classification tasks.
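Both of these can be computed in a few lines. The sketch below uses NumPy with made-up predictions and targets purely for illustration:

```python
import numpy as np

# Regression: Mean Squared Error between predictions and targets
y_true = np.array([3.0, -0.5, 2.0, 7.0])
y_pred = np.array([2.5, 0.0, 2.0, 8.0])
mse = np.mean((y_true - y_pred) ** 2)  # average squared residual

# Binary classification: cross-entropy between true labels and
# predicted probabilities (clipped to avoid log(0))
labels = np.array([1, 0, 1, 1])
probs = np.clip(np.array([0.9, 0.1, 0.8, 0.6]), 1e-12, 1 - 1e-12)
cross_entropy = -np.mean(
    labels * np.log(probs) + (1 - labels) * np.log(1 - probs)
)
```

In both cases a lower value means the predictions are closer to the actual values.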

Relation to Loss Function

  • The loss function measures the error for a single data point.
  • The cost function typically aggregates these errors over the entire dataset, often by taking an average.
  • See also: Loss versus Cost function.
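The distinction is easy to see in code. A minimal sketch with squared error and invented data:

```python
import numpy as np

y_true = np.array([1.0, 2.0, 3.0])
y_pred = np.array([1.5, 1.5, 3.5])

# Loss: the error of each individual prediction (squared error here)
per_sample_loss = (y_true - y_pred) ** 2  # one value per data point

# Cost: the per-sample losses aggregated over the whole dataset
cost = per_sample_loss.mean()
```

Here `per_sample_loss` holds one loss value per data point, while `cost` is the single number the optimizer actually minimizes.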

Parameter Space and Visualization

  • Cost functions depend on Model Parameters.
  • Plotting the cost function across the parameter space creates a surface that shows how different parameter values affect the cost.
  • This surface often contains peaks and valleys, making optimization challenging.
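For a model with a single parameter, the "surface" reduces to a curve that can be evaluated directly. A minimal sketch, using the MSE of a one-parameter model y = w·x on made-up data:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])
y = np.array([2.0, 4.0, 6.0])  # generated by y = 2x, so the minimum is at w = 2

# Evaluate the cost at each point of a 1-D parameter grid
ws = np.linspace(0.0, 4.0, 81)
costs = np.array([np.mean((w * x - y) ** 2) for w in ws])

best_w = ws[np.argmin(costs)]  # lowest point of the cost curve
```

Plotting `costs` against `ws` would show the bowl-shaped curve typical of MSE for linear models; with more parameters and nonlinear models, the same idea produces the higher-dimensional surfaces described above.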

Connection to Gradient Descent

  • Gradient-based optimization methods use the cost function’s derivatives to iteratively update parameters toward the minimum.
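As a sketch of this update rule on the one-parameter MSE J(w) = mean((w·x − y)²), whose derivative with respect to w is mean(2·x·(w·x − y)) (data and learning rate are illustrative):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])
y = np.array([2.0, 4.0, 6.0])  # true relationship: y = 2x

w = 0.0    # initial parameter guess
lr = 0.05  # learning rate (step size)
for _ in range(100):
    grad = np.mean(2 * x * (w * x - y))  # dJ/dw of the MSE cost
    w -= lr * grad                       # step opposite the gradient
# w converges toward 2.0, the minimizer of the cost
```

Each iteration moves `w` in the direction that decreases the cost, which is exactly the role the cost function plays during training.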

Caveats

  • The shape of the cost function depends on the dataset.
  • There is often no explicit formula for complex models.
  • Finding the global minimum can be challenging due to local minima and saddle points.

Summary

  • A cost function is a specific type of objective function used to measure model error.
  • It usually refers to empirical risk (average loss over all training samples).
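In symbols, with loss L, a model f with parameters θ, and n training pairs (x_i, y_i), the empirical risk reads:

```latex
J(\theta) = \frac{1}{n} \sum_{i=1}^{n} L\big(f(x_i; \theta),\, y_i\big)
```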

Example (Linear Regression):
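A minimal sketch of the MSE cost for a linear model y ≈ w·x + b, on invented data roughly following y = 2x + 1:

```python
import numpy as np

# Illustrative data approximately following y = 2x + 1 (values made up)
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.1, 2.9, 5.2, 6.8])

def cost(w, b):
    """MSE cost of the line y_hat = w*x + b over the dataset."""
    return np.mean((w * x + b - y) ** 2)

# The cost ranks parameter choices: a poor line gets a high cost,
# a line near the true relationship gets a low one
bad = cost(0.0, 0.0)    # poor fit
good = cost(2.0, 1.0)   # near the true line
```

Training linear regression amounts to searching the (w, b) parameter space for the pair that minimizes this function.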
