These metrics provide various ways to evaluate the performance of regression models, each with its strengths and weaknesses. The choice of metric often depends on the specific characteristics of the data and the goals of the analysis.

Common Regression Metrics

  1. Mean Absolute Error (MAE):

    • Definition: MAE measures the average absolute differences between predicted and actual values.
    • Interpretation: Lower values indicate better model performance, as they reflect smaller average prediction errors.
    • Formula: \text{MAE} = \frac{1}{n} \sum_{i=1}^{n} |y_i - \hat{y}_i|
    • Where:
      • n = number of observations
      • y_i = actual value
      • \hat{y}_i = predicted value
  2. Mean Squared Error (MSE):

    • Definition: MSE calculates the average of the squares of the errors (the differences between predicted and actual values).
    • Interpretation: Like MAE, lower values are better. However, MSE is more sensitive to outliers due to the squaring of errors, which can disproportionately affect the metric.
    • Formula: \text{MSE} = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2
    • Where:
      • n = number of observations
      • y_i = actual value
      • \hat{y}_i = predicted value
  3. Root Mean Squared Error (RMSE):

    • Definition: RMSE is the square root of MSE, providing an error metric in the same units as the target variable.
    • Interpretation: Lower RMSE values indicate better model performance, and it also emphasizes larger errors due to the squaring process.
    • Formula: \text{RMSE} = \sqrt{\text{MSE}} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2}
  4. R-squared (R²):

    • Definition: R² measures the proportion of the variance in the target variable that is explained by the model.
    • Interpretation: Values closer to 1 indicate a better fit; a value of 0 means the model does no better than always predicting the mean of the target.
    • Formula: R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}
    • Where:
      • \bar{y} = mean of the actual values
  5. Adjusted R-squared:

    • Definition: Adjusted R² modifies R² to account for the number of features in the model, penalizing features that do not improve the fit.
    • Interpretation: Higher values are better; unlike R², it increases only when a new feature improves the model more than would be expected by chance.
    • Formula: \text{Adjusted } R^2 = 1 - (1 - R^2) \frac{n - 1}{n - p - 1}
    • Where:
      • p = number of features (predictors)
  6. Median Absolute Error:

    • Definition: This metric measures the median of the absolute errors between predicted and actual values.
    • Interpretation: It provides a robust measure of prediction accuracy, especially in the presence of outliers.
    • Formula: \text{MedAE} = \text{median}(|y_1 - \hat{y}_1|, \ldots, |y_n - \hat{y}_n|)
  7. Explained Variance Score:

    • Definition: This metric measures the proportion of variance in the target variable that is predictable from the features.
    • Interpretation: Higher values indicate that the model explains a greater proportion of the variance in the target variable.
    • Formula: \text{Explained Variance} = 1 - \frac{\text{Var}(y - \hat{y})}{\text{Var}(y)}
    • Where:
      • \text{Var}(y) = variance of the actual values
      • \text{Var}(y - \hat{y}) = variance of the prediction errors
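
To connect these formulas to code, here is a minimal sketch that computes MAE, MSE, RMSE, the median absolute error, and the explained variance score directly with NumPy. The y_true and y_pred values below are made up purely for illustration; the scikit-learn snippets in the next section compute the same quantities with library functions.

import numpy as np

# Hypothetical actual and predicted values for illustration
y_true = np.array([3.0, -0.5, 2.0, 7.0])
y_pred = np.array([2.5, 0.0, 2.0, 8.0])

errors = y_true - y_pred
mae = np.mean(np.abs(errors))             # (1/n) * sum(|y_i - y_hat_i|)
mse = np.mean(errors ** 2)                # (1/n) * sum((y_i - y_hat_i)^2)
rmse = np.sqrt(mse)                       # square root of MSE
medae = np.median(np.abs(errors))         # median of the absolute errors
explained_var = 1 - np.var(errors) / np.var(y_true)  # 1 - Var(errors) / Var(y)

print("MAE:", mae, "MSE:", mse, "RMSE:", rmse)
print("MedAE:", medae, "Explained variance:", explained_var)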

Code snippets

Regression Metrics
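
The snippets below assume y_true and y_pred already exist. As a self-contained setup, you might obtain them from a fitted model roughly like this; the random data and LinearRegression here are placeholders for whatever data and model you are actually using.

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Hypothetical data; replace with your own features and target
X = np.random.rand(100, 3)
y = X @ np.array([1.5, -2.0, 0.7]) + np.random.normal(scale=0.1, size=100)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)
model = LinearRegression().fit(X_train, y_train)

y_true = y_test                 # actual values
y_pred = model.predict(X_test)  # model predictions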

  1. Mean Absolute Error (MAE). Lower is better.
from sklearn.metrics import mean_absolute_error
# Assuming y_true and y_pred are your true and predicted values respectively
mae = mean_absolute_error(y_true, y_pred)
print("Mean Absolute Error:", mae)
  2. Mean Squared Error (MSE). Lower is better; MSE is more sensitive to outliers than MAE.
from sklearn.metrics import mean_squared_error
mse = mean_squared_error(y_true, y_pred)
print("Mean Squared Error:", mse)
  3. Root Mean Squared Error (RMSE)
import numpy as np
rmse = np.sqrt(mean_squared_error(y_true, y_pred))
print("Root Mean Squared Error:", rmse)
  4. R-squared (R²). It typically ranges from 0 to 1, where 1 indicates perfect predictions (it can be negative for models that fit worse than simply predicting the mean). Higher R² values signify a better fit to the data.
from sklearn.metrics import r2_score
r2 = r2_score(y_true, y_pred)
print("R-squared:", r2)
  5. Median Absolute Error
from sklearn.metrics import median_absolute_error
median_abs_err = median_absolute_error(y_true, y_pred)
print("Median Absolute Error:", median_abs_err)
  6. Explained Variance Score
from sklearn.metrics import explained_variance_score
explained_var = explained_variance_score(y_true, y_pred)
print("Explained Variance Score:", explained_var)