Clustering methods can often detect outliers without strict statistical assumptions, because outliers either form small, distinct groups of their own or lie isolated from the major clusters.

| Method | Key Assumption | Strengths | Weaknesses | Typical Use Case |
| --- | --- | --- | --- | --- |
| DBSCAN | Clusters are areas of high density separated by low-density regions | No need to specify the number of clusters; can find arbitrarily shaped clusters; explicitly identifies noise (anomalies) | Struggles with varying densities; sensitive to parameter choice (epsilon, minPoints) | Spatial data clustering and density-based anomaly detection |
| Isolation Forest | Anomalies are easier to isolate via random splits | Efficient on large, high-dimensional datasets; requires fewer assumptions; scales well with data size | Not suited for small datasets; less interpretable than density-based methods | High-dimensional tabular data anomaly detection |
| Local Outlier Factor (LOF) | Anomalies have significantly lower local density than their neighbors | Good for local anomaly detection; adapts to density variations | Sensitive to the choice of k (number of neighbors); poor performance on high-dimensional data | Detecting subtle anomalies in medium-sized tabular datasets |
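
To make the comparison concrete, the following is a minimal sketch that runs all three methods on the same data. It assumes scikit-learn and NumPy are available. The synthetic dataset (two tight clusters plus one distant point) and every parameter value (`eps`, `min_samples`, `contamination`, `n_neighbors`) are illustrative assumptions, not recommendations.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.ensemble import IsolationForest
from sklearn.neighbors import LocalOutlierFactor

# Synthetic data (assumed for illustration): two tight clusters and one far-away point.
rng = np.random.default_rng(0)
cluster_a = rng.normal(loc=(0.0, 0.0), scale=0.3, size=(100, 2))
cluster_b = rng.normal(loc=(5.0, 5.0), scale=0.3, size=(100, 2))
far_point = np.array([[20.0, -20.0]])
X = np.vstack([cluster_a, cluster_b, far_point])
outlier_idx = len(X) - 1  # index of the planted outlier

# DBSCAN: points that belong to no dense region receive the label -1 (noise).
# eps corresponds to "epsilon" and min_samples to "minPoints" in the table.
dbscan_labels = DBSCAN(eps=1.0, min_samples=5).fit_predict(X)
dbscan_outliers = np.flatnonzero(dbscan_labels == -1)

# Isolation Forest: fit_predict returns -1 for points that are isolated quickly.
# contamination is the assumed fraction of outliers in the data.
iso = IsolationForest(contamination=0.01, random_state=0)
iso_outliers = np.flatnonzero(iso.fit_predict(X) == -1)

# Local Outlier Factor: fit_predict returns -1 for points whose local density
# is much lower than that of their k nearest neighbors (n_neighbors = k).
lof = LocalOutlierFactor(n_neighbors=20, contamination=0.01)
lof_outliers = np.flatnonzero(lof.fit_predict(X) == -1)

print("DBSCAN noise indices:     ", dbscan_outliers)
print("Isolation Forest outliers:", iso_outliers)
print("LOF outliers:             ", lof_outliers)
```

Note that the three methods expose outliers differently. DBSCAN reports noise as a by-product of clustering, so the number of flagged points follows from `eps` and `min_samples`. Isolation Forest and LOF, as configured here, flag roughly the fraction of points given by `contamination`, so they may mark a few ordinary points alongside the planted one.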