A single notebook that runs five libraries end to end on the same synthetic UK motor dataset: fit a CatBoost Poisson frequency model, extract actuarial factor tables via SHAP, audit the model for proxy discrimination under FCA Consumer Duty, monitor it for drift, and attach distribution-free conformal prediction intervals. The whole workflow runs in under three minutes on a free Colab GPU.
The dataset has a deliberate design: gender is correlated with region (urban areas skew female in the synthetic population) but has no causal path to claims. The fairness audit should detect this indirect channel — which is exactly the proxy discrimination pattern the FCA flags in Consumer Duty assessments — even though gender was never a model input.
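The design described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the notebook's actual generator: the variable names, the 65%/40% female shares, the base frequency of 0.08 and the urban loading of 1.4 are all assumed values chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5_000  # matches the 5,000 policies in the synthetic dataset

# Assumed values: half the portfolio is urban, and urban areas skew female.
urban = rng.random(n) < 0.5
female = rng.random(n) < np.where(urban, 0.65, 0.40)
exposure = rng.uniform(0.25, 1.0, size=n)  # policy-years

# Known Poisson DGP: the claim rate depends on region only.
# Gender never enters the rate, so it has no causal path to claims.
rate = 0.08 * np.where(urban, 1.4, 1.0)
claims = rng.poisson(rate * exposure)

# Marginally, the true rate differs by gender because gender tracks region.
print("mean rate, female:", rate[female].mean())
print("mean rate, male:  ", rate[~female].mean())

# Conditional on region, the difference vanishes.
print("urban female:", rate[female & urban].mean())
print("urban male:  ", rate[~female & urban].mean())
```

A model trained on region alone therefore produces predictions that differ by gender on average, which is the indirect channel a proxy discrimination audit looks for.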
Click the badge above. The notebook installs its own dependencies via pip and runs top to bottom with no additional setup.
| Step | Library | What it does |
|---|---|---|
| Frequency model | CatBoost | Poisson GBM with exposure offset |
| Factor tables | shap-relativities | Multiplicative rating relativities from SHAP values |
| Fairness audit | insurance-fairness | Proxy discrimination audit, D_proxy metric, Shapley attribution |
| Drift monitoring | insurance-monitoring | PSI/CSI per feature, A/E ratios, Murphy decomposition |
| Prediction intervals | insurance-conformal | Conformal intervals with finite-sample coverage guarantees |
All five libraries are published on PyPI and available at github.com/burning-cost.
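For readers unfamiliar with the PSI metric named in the drift monitoring row, the following is a generic NumPy sketch of the Population Stability Index for one numeric feature. It does not use or reproduce the insurance-monitoring API; the function name `psi`, the ten quantile bins and the `eps` floor are assumptions made for illustration.

```python
import numpy as np

def psi(reference, current, n_bins=10, eps=1e-6):
    """Population Stability Index of `current` relative to `reference`."""
    # Bin edges are quantiles of the reference sample.
    edges = np.quantile(reference, np.linspace(0, 1, n_bins + 1))
    # Use interior edges only, so out-of-range values fall in the end bins.
    idx_ref = np.digitize(reference, edges[1:-1])
    idx_cur = np.digitize(current, edges[1:-1])
    p = np.bincount(idx_ref, minlength=n_bins) / len(reference)
    q = np.bincount(idx_cur, minlength=n_bins) / len(current)
    # Floor the proportions so empty bins do not produce log(0).
    p = np.clip(p, eps, None)
    q = np.clip(q, eps, None)
    return float(np.sum((q - p) * np.log(q / p)))

rng = np.random.default_rng(0)
train_age = rng.normal(40, 10, 5_000)    # hypothetical reference feature
stable_age = rng.normal(40, 10, 5_000)   # same distribution
shifted_age = rng.normal(45, 10, 5_000)  # mean shifted by half a standard deviation

psi_stable = psi(train_age, stable_age)
psi_shifted = psi(train_age, shifted_age)
print(f"stable: {psi_stable:.4f}  shifted: {psi_shifted:.4f}")
```

A commonly quoted rule of thumb treats values below 0.1 as stable and values above 0.25 as a material shift, though these thresholds are conventions rather than statistical tests.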
- Synthetic UK motor data — 5,000 policies, known Poisson DGP
- CatBoost frequency model — standard rating factors, exposure as weight
- SHAP rating relativities — factor tables in GLM format with confidence intervals
- Proxy discrimination audit — detects indirect gender effect via region correlation
- Model monitoring — PSI per feature, A/E with Poisson CIs, Murphy decomposition
- Conformal prediction intervals — 90% coverage verified on held-out test set
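The coverage check in the last step can be illustrated with plain split conformal prediction. This is a sketch under stated assumptions, not the insurance-conformal implementation: it uses absolute residuals as the nonconformity score, a quadratic polynomial as a stand-in for the frequency model, and a made-up single-feature Poisson dataset.

```python
import math
import numpy as np

rng = np.random.default_rng(1)
n = 6_000
x = rng.uniform(0, 1, n)                           # hypothetical rating factor
y = rng.poisson(np.exp(-1.0 + 1.5 * x)).astype(float)

# Disjoint train / calibration / test splits.
train, calib, test = np.split(rng.permutation(n), [2_000, 4_000])

# Stand-in point predictor (the notebook uses CatBoost instead).
coef = np.polyfit(x[train], y[train], deg=2)

def predict(values):
    return np.polyval(coef, values)

# Nonconformity scores on the calibration set.
alpha = 0.10
scores = np.sort(np.abs(y[calib] - predict(x[calib])))

# Finite-sample quantile: the ceil((m + 1)(1 - alpha))-th smallest score.
m = len(scores)
k = math.ceil((m + 1) * (1 - alpha))
q_hat = scores[k - 1]

# Symmetric intervals, floored at zero because claim counts are non-negative.
lower = np.maximum(predict(x[test]) - q_hat, 0.0)
upper = predict(x[test]) + q_hat

coverage = np.mean((y[test] >= lower) & (y[test] <= upper))
print(f"empirical coverage on held-out test set: {coverage:.3f}")
```

The guarantee is marginal and holds on average over calibration draws, so the empirical figure on any one test set will fluctuate around or above 90% rather than match it exactly.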
pip install catboost "shap-relativities[ml]" insurance-fairness insurance-monitoring insurance-conformal polars shap scikit-learn statsmodels
jupyter notebook insurance-pricing-pipeline.ipynb

License: MIT