This repository was archived by the owner on Mar 26, 2026. It is now read-only.

burning-cost/insurance-pricing-demo


Insurance Pricing Pipeline Demo

Open In Colab

A single notebook that runs five libraries end to end on the same synthetic UK motor dataset: fit a CatBoost Poisson frequency model, extract actuarial factor tables via SHAP, audit the model for proxy discrimination under FCA Consumer Duty, monitor it for drift, and attach distribution-free conformal prediction intervals. The whole workflow runs in under three minutes on a free Colab GPU.

The dataset is deliberately constructed so that gender is correlated with region (urban areas skew female in the synthetic population) but has no causal path to claims. The fairness audit should therefore detect this indirect channel, which is the proxy discrimination pattern the FCA flags in Consumer Duty assessments, even though gender was never a model input.
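To make the design concrete, here is a minimal sketch of a data-generating process with the same shape. It is not the notebook's code: the function names, the 65%/35% gender split by region, and the claim rates are hypothetical values chosen for illustration. Only the structure (gender depends on region, claims depend on region and exposure but never on gender) follows the description above.

```python
import math
import random


def sample_poisson(rng, lam):
    """Draw one Poisson(lam) variate using Knuth's multiplication method."""
    limit = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def make_policies(n=5000, seed=0):
    """Generate synthetic policies; all numeric parameters are illustrative."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        urban = rng.random() < 0.5
        # Gender is correlated with region: urban areas skew female.
        p_female = 0.65 if urban else 0.35
        gender = "F" if rng.random() < p_female else "M"
        exposure = rng.uniform(0.25, 1.0)  # policy-years in force
        # The claim rate depends on region only; gender never enters.
        annual_rate = 0.12 if urban else 0.06
        claims = sample_poisson(rng, annual_rate * exposure)
        rows.append({"urban": urban, "gender": gender,
                     "exposure": exposure, "claims": claims})
    return rows


def claim_frequency(rows):
    """Claims per policy-year of exposure."""
    return sum(r["claims"] for r in rows) / sum(r["exposure"] for r in rows)
```

On a large enough sample, the marginal claim frequency differs by gender while the frequency within a single region does not. That is the pattern a proxy discrimination audit looks for: a model that prices on region alone still produces prices that vary with gender.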

Open in Colab

Click the badge above. The notebook installs its own dependencies via pip and runs top to bottom with no additional setup.

Libraries used

| Step | Library | What it does |
| --- | --- | --- |
| Frequency model | CatBoost | Poisson GBM with exposure offset |
| Factor tables | shap-relativities | Multiplicative rating relativities from SHAP values |
| Fairness audit | insurance-fairness | Proxy discrimination audit, D_proxy metric, Shapley attribution |
| Drift monitoring | insurance-monitoring | PSI/CSI per feature, A/E ratios, Murphy decomposition |
| Prediction intervals | insurance-conformal | Conformal intervals with finite-sample coverage guarantees |

All five libraries are published on PyPI and available at github.com/burning-cost.
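For readers unfamiliar with the drift-monitoring metrics named in the table, the sketch below computes a population stability index (PSI) and an actual-over-expected (A/E) ratio in plain Python. It does not use or reproduce the insurance-monitoring API. The function names, the bin-edge argument, and the `eps` floor for empty bins are assumptions made for this illustration.

```python
import math


def psi(reference, current, edges, eps=1e-6):
    """Population stability index between two samples of one feature.

    `edges` are ascending interior bin boundaries; a value v falls in
    the bin indexed by the number of edges strictly below it.
    """
    def bin_shares(values):
        counts = [0] * (len(edges) + 1)
        for v in values:
            counts[sum(v > e for e in edges)] += 1
        # Floor empty bins at eps so the logarithm stays finite.
        return [max(c / len(values), eps) for c in counts]

    ref = bin_shares(reference)
    cur = bin_shares(current)
    return sum((c - r) * math.log(c / r) for c, r in zip(cur, ref))


def actual_over_expected(claims, predicted):
    """Ratio of observed claim counts to model-predicted claim counts."""
    return sum(claims) / sum(predicted)


# Hypothetical driver ages at training time and in a later monitoring window.
reference_ages = [20, 25, 30, 35, 40, 45, 50, 55]
current_ages = [40, 45, 50, 55, 60, 65, 70, 75]
print(psi(reference_ages, current_ages, edges=[30, 50]))
print(actual_over_expected([0, 1, 0, 2], [0.5, 0.5, 1.0, 1.0]))
```

A PSI of zero means the binned distributions are identical. A commonly quoted rule of thumb treats values above roughly 0.25 as a material shift, though the threshold is a convention rather than a statistical test. An A/E ratio near 1 means the model's predicted claim count matches what was observed.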

Notebook structure

  1. Synthetic UK motor data — 5,000 policies, known Poisson DGP
  2. CatBoost frequency model — standard rating factors, exposure as weight
  3. SHAP rating relativities — factor tables in GLM format with confidence intervals
  4. Proxy discrimination audit — detects indirect gender effect via region correlation
  5. Model monitoring — PSI per feature, A/E with Poisson CIs, Murphy decomposition
  6. Conformal prediction intervals — 90% coverage verified on held-out test set
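Step 6 can be illustrated with a generic split conformal sketch. This is the textbook construction with absolute residuals as the nonconformity score, assumed here for simplicity. The insurance-conformal library may well use a different score better suited to Poisson counts, and none of the function names below come from it.

```python
import math


def conformal_quantile(scores, alpha=0.1):
    """Finite-sample-corrected (1 - alpha) quantile of calibration scores."""
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration points to guarantee coverage at this level.
        return math.inf
    return sorted(scores)[k - 1]


def prediction_interval(pred, q):
    """Symmetric interval around a prediction, clipped at zero for counts."""
    return max(0.0, pred - q), pred + q


def empirical_coverage(y_true, y_pred, q):
    """Share of held-out observations that fall inside their intervals."""
    hits = 0
    for y, pred in zip(y_true, y_pred):
        lo, hi = prediction_interval(pred, q)
        hits += lo <= y <= hi
    return hits / len(y_true)


# Usage: calibrate on one held-out split, then check coverage on another.
# cal_scores = [abs(y - p) for y, p in zip(y_cal, pred_cal)]
# q = conformal_quantile(cal_scores, alpha=0.1)
# print(empirical_coverage(y_test, pred_test, q))
```

The `(n + 1)` correction in the quantile is what gives the finite-sample guarantee: if calibration and test points are exchangeable, the interval covers a new observation with probability at least `1 - alpha`, whatever the underlying model. Measuring coverage on a separate test split, as the notebook does, checks that the guarantee holds in practice.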

Running locally

pip install catboost "shap-relativities[ml]" insurance-fairness insurance-monitoring insurance-conformal polars shap scikit-learn statsmodels
jupyter notebook insurance-pricing-pipeline.ipynb

Licence

MIT
