A full ML lifecycle project for predicting customer churn in banking, covering data ingestion, feature engineering, model training, evaluation, deployment, and a real-time prediction interface with AI-driven explanations.
This project also includes a bulk churn detection workflow that can automatically generate and send personalized retention emails using LangChain + Groq, orchestrated with LangGraph for stateful AI workflow management.
- Stages: Ingestion → Transformation → Preparation → Training → Evaluation → Registration
- Each stage produces artifacts that feed the next stage
- Ensures reproducibility, traceability, and version control
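The stage chain above can be pictured as functions that each consume the artifact of the previous stage. The sketch below is a plain-Python illustration only, not the project's actual pipeline definition; the stage functions, the sample rows, and the artifact keys are all hypothetical.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple

Artifact = Dict[str, Any]
Stage = Callable[[Artifact], Artifact]


def ingestion(_: Artifact) -> Artifact:
    # Hypothetical stand-in for loading raw customer rows.
    return {"rows": [
        {"age": 42, "balance": 1000.0, "exited": 0},
        {"age": 55, "balance": 0.0, "exited": 1},
    ]}


def transformation(artifact: Artifact) -> Artifact:
    # Derive one illustrative feature from the ingested rows.
    rows = [dict(r, has_balance=int(r["balance"] > 0)) for r in artifact["rows"]]
    return {"rows": rows}


def run_pipeline(stages: List[Stage],
                 initial: Optional[Artifact] = None) -> Tuple[Artifact, List[str]]:
    """Run stages in order; each stage receives the previous stage's artifact."""
    artifact: Artifact = initial or {}
    lineage: List[str] = []
    for stage in stages:
        artifact = stage(artifact)
        lineage.append(stage.__name__)  # record which stages ran, for traceability
    return artifact, lineage
```

Calling `run_pipeline([ingestion, transformation])` returns the final artifact together with the ordered list of stage names that produced it.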
- Handles class imbalance with SMOTE
- Feature selection: RFECV
- Models trained: LightGBM & XGBoost → best model selected
- Hyperparameter tuning & early stopping
- Evaluation metrics: Accuracy, Precision, Recall, F1-score, ROC-AUC
- Artifacts saved: model, selected features, evaluation metrics
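To illustrate the SMOTE step mentioned above, here is a minimal pure-Python sketch of the core idea: each synthetic minority sample is an interpolation between a minority point and one of its nearest minority neighbours. The project presumably relies on a library implementation; the function name `smote_oversample` and its parameters are assumptions made for this illustration.

```python
import math
import random
from typing import List, Sequence

Point = Sequence[float]


def smote_oversample(minority: List[Point], n_new: int,
                     k: int = 2, seed: int = 0) -> List[List[float]]:
    """Create n_new synthetic points by interpolating between minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    synthetic: List[List[float]] = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point (excluding itself)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic
```

Because every synthetic point lies on a segment between two real minority samples, the new points stay inside the region already occupied by the minority class.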
- Tracks experiments & metrics
- Models stored in S3
- Supports versioning and a staging-to-production workflow
- Models can be dynamically loaded for prediction
- FastAPI serves predictions via REST API
- Supports dynamic model loading from MLflow
- Dockerized for consistent deployment environments
- CI/CD automated using GitHub Actions
- Streamlit interface for user-friendly input
- Calls FastAPI endpoint for predictions
- Displays predicted class & probability
- Generative AI explanation:
  - Uses LangChain + Groq to generate business-friendly explanations
  - Explains why the model predicted churn for each customer
  - Helps non-technical users understand model decisions
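The frontend-to-API call described above can be sketched with the standard library alone. This is an assumption-laden illustration: the endpoint URL, the request fields, and the response keys `prediction` and `probability` are hypothetical and may not match the project's actual API.

```python
import json
from urllib import request

API_URL = "http://localhost:8000/predict"  # hypothetical endpoint


def build_request(features: dict, url: str = API_URL) -> request.Request:
    """Wrap customer features in a JSON POST request for the prediction API."""
    body = json.dumps(features).encode("utf-8")
    return request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def format_prediction(response_body: bytes) -> str:
    """Turn the (assumed) JSON response into the text shown to the user."""
    payload = json.loads(response_body)
    label = "Churn" if payload["prediction"] == 1 else "No churn"
    return f"{label} (probability {payload['probability']:.1%})"
```

With a running server, the request would be sent via `request.urlopen(build_request(features))` and the response body passed to `format_prediction`.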
- Detects churners in bulk using historical customer data
- Generates personalized retention emails automatically for each churner
- Emails use content from retention offer guide PDF
- Emails are sent via Gmail SMTP
- Orchestrated with a LangGraph workflow:
  - `usermsg` → checks user intent
  - `predict_churners` → predicts churn using ML model
  - `send_email` → generates & sends emails
  - `other_query` → handles non-churn queries
- Streamlit button workflow:
  - Identify churned customers
  - Display results in a dataframe
  - Generate & send personalized emails automatically
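The routing between the four workflow nodes named above (`usermsg`, `predict_churners`, `send_email`, `other_query`) can be sketched in plain Python. This is not the LangGraph API and not the project's code: the keyword-based intent check stands in for an LLM call, the churner record is made up, and the email is only built, not sent.

```python
from email.message import EmailMessage
from typing import Dict, List, TypedDict


class State(TypedDict, total=False):
    message: str
    intent: str
    churners: List[Dict[str, str]]
    emails: List[EmailMessage]
    reply: str


def usermsg(state: State) -> State:
    # Assumption: a keyword check stands in for LLM-based intent detection.
    state["intent"] = "churn" if "churn" in state["message"].lower() else "other"
    return state


def predict_churners(state: State) -> State:
    # Hypothetical fixed result in place of the ML model's predictions.
    state["churners"] = [{"name": "Asha", "email": "asha@example.com"}]
    return state


def send_email(state: State) -> State:
    emails: List[EmailMessage] = []
    for customer in state["churners"]:
        msg = EmailMessage()
        msg["To"] = customer["email"]
        msg["Subject"] = "We would like to keep you with us"
        msg.set_content(f"Dear {customer['name']}, we have a retention offer for you.")
        emails.append(msg)  # built only; actual delivery is out of scope here
    state["emails"] = emails
    return state


def other_query(state: State) -> State:
    state["reply"] = "This assistant only handles churn requests."
    return state


def run_workflow(message: str) -> State:
    """Route the message: churn intent -> predict -> email, otherwise fallback."""
    state = usermsg({"message": message})
    if state["intent"] == "churn":
        return send_email(predict_churners(state))
    return other_query(state)
```

In a real deployment the messages built in `send_email` could be handed to `smtplib` for delivery, and the conditional in `run_workflow` would correspond to a conditional edge in the graph.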
- Comprehensive logging for all stages
- Confusion matrices & metrics tracked in MLflow
- Ensures transparent and reproducible evaluation
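As a reference for how the threshold-based metrics listed earlier follow from a confusion matrix, here is a small self-contained sketch. The function names are illustrative, and ROC-AUC is left out because it needs predicted scores rather than hard labels.

```python
from typing import Dict, Sequence


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, int]:
    """Count true/false positives and negatives for binary labels (1 = churn)."""
    pairs = list(zip(y_true, y_pred))
    return {
        "tp": sum(1 for t, p in pairs if t == 1 and p == 1),
        "fp": sum(1 for t, p in pairs if t == 0 and p == 1),
        "fn": sum(1 for t, p in pairs if t == 1 and p == 0),
        "tn": sum(1 for t, p in pairs if t == 0 and p == 0),
    }


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Derive accuracy, precision, recall and F1 from the confusion counts."""
    c = confusion_counts(y_true, y_pred)
    total = c["tp"] + c["fp"] + c["fn"] + c["tn"]
    precision = c["tp"] / (c["tp"] + c["fp"]) if (c["tp"] + c["fp"]) else 0.0
    recall = c["tp"] / (c["tp"] + c["fn"]) if (c["tp"] + c["fn"]) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return {
        "accuracy": (c["tp"] + c["tn"]) / total if total else 0.0,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }
```

For churn data with a small positive class, recall and precision usually say more than accuracy, which is why all of them are tracked side by side.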
An integrated RAG-based chatbot that intelligently answers bank-related FAQs using live website data.
The chatbot:
- Dynamically scrapes official bank FAQ and offer pages using Selenium
- Splits and embeds content using HuggingFace Embeddings (`all-MiniLM-L6-v2`)
- Stores vectors in a Chroma vector database
- Uses LangChain + LangGraph + ChatGroq to:
  - Classify queries
  - Retrieve relevant content
  - Decide contextual relevance
  - Generate concise, factual, and context-based answers
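The retrieve-then-check-relevance steps above can be illustrated without any external services. In this sketch a bag-of-words vector stands in for the sentence embeddings and an in-memory list stands in for the vector database; the sample chunks, the `min_score` threshold, and all function names are assumptions made for the example.

```python
import math
import re
from collections import Counter
from typing import List, Tuple


def embed(text: str) -> Counter:
    # Toy embedding: token counts (a real system would use a sentence-embedding model).
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query: str, chunks: List[str],
             top_k: int = 1, min_score: float = 0.1) -> List[Tuple[float, str]]:
    """Return up to top_k chunks ranked by similarity, dropping irrelevant ones."""
    q = embed(query)
    scored = sorted(((cosine(q, embed(c)), c) for c in chunks), reverse=True)
    # The min_score cut-off plays the role of the "is this context relevant?" check.
    return [(score, chunk) for score, chunk in scored[:top_k] if score >= min_score]
```

An empty result signals that no stored content is relevant, in which case the chatbot can decline to answer rather than generate from unrelated context.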
- 🐍 Python, LightGBM, XGBoost – Model development
- 📊 DVC – Data pipeline & reproducibility
- 📈 MLflow – Experiment tracking & model registry
- 🐳 Docker – Containerization
- ⚙️ GitHub Actions – CI/CD deployment
- 🚀 FastAPI – Backend API
- 🖥️ Streamlit – Frontend UI
- 🤖 LangChain + Groq – AI-powered model explanations & email generation
- 📌 LangGraph – Orchestrates AI workflow & stateful decision logic