PMML (Predictive Model Markup Language) is an XML-based standard developed by the Data Mining Group (DMG) that lets applications describe and exchange predictive models across different platforms and tools.
Purpose
- Provides a common representation for predictive analytics models.
- Enables model portability: train a model in one tool and deploy it in another without re-implementing the model logic by hand, provided both tools support the PMML version and model type in use.
What It Contains
- Model structure: Type (e.g., Decision Tree, Regression, Neural Network).
- Data dictionary: Input and output fields, data types.
- Transformations: Preprocessing steps applied to input fields before scoring (normalization, binning, etc.).
- Model parameters: Coefficients, splits, weights.
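To make the four parts above concrete, here is a minimal sketch that parses a small hand-written PMML document with Python's standard library and pulls out each part. The document itself is hypothetical: the field names (age, risk, age_scaled), the numbers, and the choice of PMML 4.4 are assumptions made for illustration, and the XML has not been validated against the DMG schema.

```python
import xml.etree.ElementTree as ET

# PMML 4.4 namespace; other PMML versions use a different suffix.
NS = {"p": "http://www.dmg.org/PMML-4_4"}

# Hypothetical document: field names and numbers are made up for illustration.
PMML_DOC = """<PMML xmlns="http://www.dmg.org/PMML-4_4" version="4.4">
  <Header description="Hypothetical example model"/>
  <DataDictionary numberOfFields="2">
    <DataField name="age" optype="continuous" dataType="double"/>
    <DataField name="risk" optype="continuous" dataType="double"/>
  </DataDictionary>
  <TransformationDictionary>
    <DerivedField name="age_scaled" optype="continuous" dataType="double">
      <NormContinuous field="age">
        <LinearNorm orig="0" norm="0"/>
        <LinearNorm orig="100" norm="1"/>
      </NormContinuous>
    </DerivedField>
  </TransformationDictionary>
  <RegressionModel functionName="regression">
    <MiningSchema>
      <MiningField name="age"/>
      <MiningField name="risk" usageType="target"/>
    </MiningSchema>
    <RegressionTable intercept="0.5">
      <NumericPredictor name="age_scaled" coefficient="2.0"/>
    </RegressionTable>
  </RegressionModel>
</PMML>
"""


def summarize(pmml_text):
    """Extract the data dictionary, transforms, model type, and parameters."""
    root = ET.fromstring(pmml_text)

    # Data dictionary: field names and their declared data types.
    fields = [
        (f.get("name"), f.get("dataType"))
        for f in root.findall("p:DataDictionary/p:DataField", NS)
    ]

    # Transforms: names of derived (preprocessed) fields.
    transforms = [
        d.get("name")
        for d in root.findall("p:TransformationDictionary/p:DerivedField", NS)
    ]

    # Model structure and parameters (this sketch only handles RegressionModel).
    model = root.find("p:RegressionModel", NS)
    table = model.find("p:RegressionTable", NS)
    coefficients = {
        p.get("name"): float(p.get("coefficient"))
        for p in table.findall("p:NumericPredictor", NS)
    }

    return {
        "fields": fields,
        "transforms": transforms,
        "model_type": model.tag.split("}")[1],  # strip the "{namespace}" prefix
        "intercept": float(table.get("intercept")),
        "coefficients": coefficients,
    }


summary = summarize(PMML_DOC)
for key, value in summary.items():
    print(f"{key}: {value}")
```

A real consumer would dispatch on the model element it finds (tree, regression, neural network, and so on) rather than assuming a regression model as this sketch does.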
Why It Is Important
- Facilitates interoperability in heterogeneous environments.
- Reduces the need for re-implementation.
- Commonly used in banking, insurance, and enterprise systems for deploying models.
Example Use
- Train a logistic regression model in R or Python → export as PMML → load into a Java-based scoring engine.
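The round trip in this example can be sketched in a few lines. The code below is not how any particular exporter or scoring engine works: it hand-writes a simplified PMML-style document from already-fitted logistic regression parameters, then reads it back and scores one record, with Python standing in for both the training tool and the Java-based engine. The function names (export_logistic, score), the feature names, the target name churn, and the coefficient values are all hypothetical, and the generated XML is an approximation of the PMML 4.4 layout that has not been validated against the DMG schema.

```python
import math
import xml.etree.ElementTree as ET

PMML_NS = "http://www.dmg.org/PMML-4_4"
ET.register_namespace("", PMML_NS)  # serialize PMML as the default namespace


def q(tag):
    """Qualify a tag name with the PMML namespace."""
    return f"{{{PMML_NS}}}{tag}"


def export_logistic(intercept, coefficients, target="churn"):
    """Producer side: write fitted parameters into a PMML-style document."""
    root = ET.Element(q("PMML"), version="4.4")
    ET.SubElement(root, q("Header"))

    # Data dictionary: one continuous input per coefficient, plus the target.
    dictionary = ET.SubElement(root, q("DataDictionary"))
    for name in coefficients:
        ET.SubElement(dictionary, q("DataField"), name=name,
                      optype="continuous", dataType="double")
    ET.SubElement(dictionary, q("DataField"), name=target,
                  optype="categorical", dataType="string")

    model = ET.SubElement(root, q("RegressionModel"),
                          functionName="classification",
                          normalizationMethod="logit")
    schema = ET.SubElement(model, q("MiningSchema"))
    for name in coefficients:
        ET.SubElement(schema, q("MiningField"), name=name)
    ET.SubElement(schema, q("MiningField"), name=target, usageType="target")

    # Table for the positive class carries the fitted parameters.
    table = ET.SubElement(model, q("RegressionTable"),
                          intercept=repr(intercept), targetCategory="1")
    for name, coefficient in coefficients.items():
        ET.SubElement(table, q("NumericPredictor"),
                      name=name, coefficient=repr(coefficient))
    # Reference class: empty table with a zero intercept.
    ET.SubElement(model, q("RegressionTable"), intercept="0", targetCategory="0")

    return ET.tostring(root, encoding="unicode")


def score(pmml_text, row):
    """Consumer side: rebuild the linear predictor and apply the logistic link."""
    ns = {"p": PMML_NS}
    root = ET.fromstring(pmml_text)
    table = root.find(".//p:RegressionTable[@targetCategory='1']", ns)
    y = float(table.get("intercept"))
    for predictor in table.findall("p:NumericPredictor", ns):
        y += float(predictor.get("coefficient")) * row[predictor.get("name")]
    return 1.0 / (1.0 + math.exp(-y))


# Hypothetical fitted parameters, as if taken from a trained model.
pmml_text = export_logistic(-1.0, {"tenure": -0.1, "monthly_charges": 0.02})
probability = score(pmml_text, {"tenure": 10, "monthly_charges": 100})
print(f"P(churn=1) = {probability:.3f}")
```

The point of the sketch is that the consumer needs nothing from the training environment except the document itself. In practice the export and scoring steps are normally handled by existing PMML libraries rather than hand-written XML.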