🏠 House Price Prediction using Machine Learning
📌 Project Overview
This project focuses on predicting house prices using multiple regression techniques and selecting the best-performing model through systematic evaluation and hyperparameter tuning. The goal is to build a robust, interpretable, and well-generalized regression pipeline suitable for real-world price prediction tasks.
🎯 Problem Statement
Predict house prices accurately from numerical and categorical features such as location, size, and other property characteristics.
🧠 Models Implemented
Linear Regression
Ridge Regression
Lasso Regression
Decision Tree Regressor
Random Forest Regressor
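A minimal sketch of how these five models might be instantiated side by side for comparison (the regularization strengths and random seed here are illustrative defaults, not the tuned values):

```python
from sklearn.linear_model import LinearRegression, Ridge, Lasso
from sklearn.tree import DecisionTreeRegressor
from sklearn.ensemble import RandomForestRegressor

# Candidate regressors evaluated in this project.
models = {
    "Linear Regression": LinearRegression(),
    "Ridge Regression": Ridge(alpha=1.0),
    "Lasso Regression": Lasso(alpha=0.1),
    "Decision Tree": DecisionTreeRegressor(random_state=42),
    "Random Forest": RandomForestRegressor(random_state=42),
}
```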
🛠️ Tech Stack & Tools
Python
Pandas, NumPy
Scikit-learn
Matplotlib
Jupyter Notebook
🔄 ML Pipeline
Data Cleaning & Missing Value Imputation
One-Hot Encoding for Categorical Features
Train–Test Split
Feature Scaling (StandardScaler)
Model Training & Evaluation
Overfitting Analysis
Ensemble Learning
Hyperparameter Tuning
Feature Importance Analysis
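The steps above can be wired into a single scikit-learn `Pipeline`. A condensed sketch follows; the file path, target column, and feature names are placeholders, so adjust them to the actual dataset:

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestRegressor
from sklearn.impute import SimpleImputer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

df = pd.read_csv("housing.csv")      # placeholder path
X = df.drop(columns=["price"])       # "price" is a placeholder target name
y = df["price"]

# Hypothetical column groups; adjust to the real dataset.
numeric_cols = ["sqft", "bedrooms", "bathrooms"]
categorical_cols = ["location", "property_type"]

preprocessor = ColumnTransformer([
    # Numeric features: median imputation, then standard scaling.
    ("num", Pipeline([("impute", SimpleImputer(strategy="median")),
                      ("scale", StandardScaler())]), numeric_cols),
    # Categorical features: mode imputation, then one-hot encoding.
    ("cat", Pipeline([("impute", SimpleImputer(strategy="most_frequent")),
                      ("encode", OneHotEncoder(handle_unknown="ignore"))]),
     categorical_cols),
])

pipeline = Pipeline([("prep", preprocessor),
                     ("model", RandomForestRegressor(random_state=42))])

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
pipeline.fit(X_train, y_train)
```

Bundling preprocessing into the pipeline ensures the imputer, encoder, and scaler are fit only on the training split, avoiding leakage into the test set.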
📊 Evaluation Metrics
R² Score
Mean Absolute Error (MAE)
Root Mean Squared Error (RMSE)
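All three metrics are available in scikit-learn. A short sketch, assuming the fitted `pipeline` and held-out split from the sketch above (the `np.sqrt` form of RMSE is used for compatibility across scikit-learn versions):

```python
import numpy as np
from sklearn.metrics import r2_score, mean_absolute_error, mean_squared_error

y_pred = pipeline.predict(X_test)

r2 = r2_score(y_test, y_pred)
mae = mean_absolute_error(y_test, y_pred)
rmse = np.sqrt(mean_squared_error(y_test, y_pred))  # RMSE = sqrt(MSE)

print(f"R²: {r2:.3f}  MAE: {mae:,.0f}  RMSE: {rmse:,.0f}")
```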
📈 Model Performance Summary

| Model | R² Score | RMSE |
|---|---|---|
| Linear / Ridge / Lasso | ~0.336 | ~48,665 |
| Decision Tree | -0.39 | ~70,474 |
| Random Forest (Base) | 0.352 | ~48,092 |
| Random Forest (Tuned) | 0.369 ⭐ | 47,447 ⭐ |

⚙️ Hyperparameter Tuning
Hyperparameter optimization was performed using RandomizedSearchCV with 5-fold cross-validation to improve generalization and reduce RMSE while keeping computational cost reasonable.
Best Parameters:
n_estimators: 300
max_depth: 10
min_samples_split: 2
min_samples_leaf: 2
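A sketch of the search described above, reusing `pipeline`, `X_train`, and `y_train` from the earlier snippet. The search space is illustrative (it contains the best values reported above), and the `model__` prefixes route the parameters to the Random Forest step inside the pipeline:

```python
from scipy.stats import randint
from sklearn.model_selection import RandomizedSearchCV

# Illustrative distributions, not the exact space used in the project.
param_distributions = {
    "model__n_estimators": randint(100, 500),
    "model__max_depth": [5, 10, 20, None],
    "model__min_samples_split": randint(2, 11),
    "model__min_samples_leaf": randint(1, 5),
}

search = RandomizedSearchCV(
    pipeline,
    param_distributions=param_distributions,
    n_iter=50,                             # sampled configurations
    cv=5,                                  # 5-fold cross-validation
    scoring="neg_root_mean_squared_error",
    random_state=42,
    n_jobs=-1,
)
search.fit(X_train, y_train)
print(search.best_params_)
```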
🔍 Feature Importance
Feature importance analysis was conducted using the tuned Random Forest model to identify the most influential features affecting house prices, improving model interpretability and validating domain relevance.
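One way to extract and plot the importances from the tuned model inside the fitted search, assuming a recent scikit-learn where `ColumnTransformer.get_feature_names_out()` resolves the one-hot-encoded column names:

```python
import pandas as pd
import matplotlib.pyplot as plt

best = search.best_estimator_  # tuned pipeline from the search
feature_names = best.named_steps["prep"].get_feature_names_out()
importances = best.named_steps["model"].feature_importances_

# Rank features by importance and keep the top 10 for readability.
top = (
    pd.Series(importances, index=feature_names)
      .sort_values(ascending=False)
      .head(10)
)
top.plot.barh()
plt.title("Top 10 feature importances (tuned Random Forest)")
plt.tight_layout()
plt.show()
```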
🏆 Final Model Selection
The tuned Random Forest Regressor was selected as the final model due to:
Highest R² score (0.369)
Lowest RMSE (47,447)
Better generalization than the linear baselines and the single Decision Tree
🚀 Key Learnings
Linear models may underperform on complex, non-linear datasets
Decision Trees tend to overfit without proper constraints
Ensemble models with tuning provide a better bias–variance tradeoff
Hyperparameter tuning is critical for production-ready ML models