Anomaly detection involves identifying Outliers. Detecting these anomalies is crucial for maintaining data integrity and improving model performance.

Methods for Detecting Anomalies

In ML_Tools see: Pycaret_Anomaly.ipynb

Process of Detection

Data Preparation

Anomaly Detection with a model: Use a chosen method to flag anomalies

Validation

  • Validate the detected anomalies by comparing them against known anomalies (if available) or using domain knowledge.
  • Adjust thresholds or methods based on validation results.

Visualization

  • Visualize the results using plots (e.g., scatter plots, box plots) to understand the distribution of data and the identified anomalies.
  • Boxplot: Displays the distribution and identifies outliers using the interquartile range (IQR).
  • Scatter Plot: Helps visually identify outliers.