Data Leakage refers to the unintentional inclusion of information in the training data that would not be available at prediction time in a real-world scenario, leading to overly optimistic model performance. It occurs when the model has access to data it shouldn't during training, such as future information or samples from the test set, which results in misleading evaluation metrics and poor generalization to new data. A common form is preprocessing leakage: fitting a transformation (e.g., scaling or imputation) on the full dataset before splitting, so statistics of the test data seep into the training features.
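The contrast can be sketched in a few lines of plain Python. The toy dataset, split, and centering step below are illustrative assumptions, not a prescribed recipe; the point is only where the statistic is fitted:

```python
# Minimal sketch of preprocessing leakage: centering data with a mean
# computed on the full dataset (leaky) vs. on the training split only.

def mean(xs):
    return sum(xs) / len(xs)

data = [1.0, 2.0, 3.0, 4.0, 100.0]   # last point is a held-out "future" value
train, test = data[:4], data[4:]

# LEAKY: the centering statistic is computed on ALL data, so the training
# features already encode information about the unseen test point.
leaky_center = mean(data)

# CORRECT: fit the statistic on the training split only, then reuse it
# unchanged when transforming the test split.
safe_center = mean(train)

leaky_train = [x - leaky_center for x in train]
safe_train = [x - safe_center for x in train]

print(leaky_center)  # 22.0 — dragged upward by the held-out test point
print(safe_center)   # 2.5  — depends only on the training data
```

With the leaky statistic, evaluation on the test point rewards the model for information it could never have had in production, which is exactly the overly optimistic performance described above.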