🌍 Life Expectancy Predictor

This project builds a regression model to estimate life expectancy across countries using a range of socioeconomic and health-related factors, including:

Gross Domestic Product (GDP)
Education level
Healthcare expenditure
Immunization rates
and other development indicators

📂 Dataset Source

The dataset was originally published on Kaggle by KumarRajarshi. It contains global life expectancy data compiled from the WHO, along with other development indicators.

📖 Dataset Description

This project aims to predict the life expectancy of a given country based on various health, economic, and demographic parameters. The dataset includes the following variables:

Country: Name of the country
Year: Year of the data record
Status: Development status (Developed or Developing)
Life Expectancy: Average expected lifespan in years
Adult Mortality: Adult mortality rate for both sexes (probability of dying between ages 15 and 60, per 1000 population)
Infant Deaths: Number of infant deaths per 1000 population
Alcohol: Alcohol consumption per capita (in liters)
Percentage Expenditure: Health expenditure as a percentage of GDP per capita
Hepatitis B: Immunization coverage percentage for Hepatitis B
Measles: Number of reported measles cases
BMI: Average Body Mass Index of the population
Under Five Deaths: Number of deaths of children under five years
Polio: Immunization coverage percentage for Polio
Total Expenditure: General government expenditure on health as a percentage of total government expenditure
Diphtheria: Immunization coverage percentage for Diphtheria
HIV/AIDS: Deaths per 1000 live births due to HIV/AIDS (ages 0–4)
GDP: Gross Domestic Product per capita
Population: Total population of the country
Thinness 1-19 years: Prevalence of thinness (%) among children and adolescents aged 10–19 (the column name says 1-19, but the Kaggle data dictionary gives ages 10 to 19)
Thinness 5-9 years: Prevalence of thinness among children aged 5–9
Income Composition of Resources: Human Development Index in terms of income composition of resources (index from 0 to 1)
Schooling: Average years of schooling in the population

This dataset provides a broad view of factors influencing life expectancy, enabling the development of predictive models to analyze health outcomes globally.
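As a first look at the variables listed above, the sketch below tidies column names and reports the share of missing values per column. It is a minimal illustration, not the notebook's actual code: the helper names `tidy_columns` and `missing_share` are hypothetical, and the small `sample` frame stands in for `Life_Expectancy_Data.csv`, whose exact headers are assumed rather than confirmed here.

```python
import pandas as pd


def tidy_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy with whitespace-trimmed, snake_case column names."""
    out = df.copy()
    out.columns = (
        out.columns.str.strip()
        .str.lower()
        .str.replace(r"[^0-9a-z]+", "_", regex=True)
        .str.strip("_")
    )
    return out


def missing_share(df: pd.DataFrame) -> pd.Series:
    """Fraction of missing values per column, largest first."""
    return df.isna().mean().sort_values(ascending=False)


# Tiny stand-in frame (made-up values); with the real file one would
# start from pd.read_csv("Life_Expectancy_Data.csv") instead.
sample = pd.DataFrame(
    {
        "Country": ["A", "B", "C"],
        " Life Expectancy ": [70.1, None, 81.3],
        "HIV/AIDS": [0.1, 0.3, 0.1],
        "Thinness 1-19 years": [1.2, 3.4, None],
    }
)

tidy = tidy_columns(sample)
print(list(tidy.columns))
print(missing_share(tidy))
```

Normalizing names up front keeps later column selections free of stray spaces and punctuation, and the missing-value summary helps decide which columns need imputation.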

📊 Project Highlights

✅ Data Cleaning & Preprocessing: Handled missing values, encoded categorical features, and normalized continuous variables.
📈 Correlation Analysis & Feature Selection: Used heatmaps and feature importance (via Random Forest) to select relevant predictors.
🧠 Modeling & Evaluation:
- Algorithms tested: Linear Regression, SVR, RandomForest, and ensemble models.
- Hyperparameter tuning via GridSearchCV.
- Final model: VotingRegressor combining RandomForest and LinearRegression.
- Best RMSE: 1.66 years

🔧 Tools & Libraries

pandas, numpy, matplotlib, seaborn, plotly
scikit-learn
Jupyter Notebook
