savita102/House_Price_Prediction_ML

🏠 House Price Prediction using Machine Learning

📌 Project Overview

This project focuses on predicting house prices using multiple regression techniques and selecting the best-performing model through systematic evaluation and hyperparameter tuning. The goal is to build a robust, interpretable, and well-generalized regression pipeline suitable for real-world price prediction tasks.

🎯 Problem Statement

Accurately predict house prices based on various numerical and categorical features such as location, size, and property characteristics.

🧠 Models Implemented

Linear Regression

Ridge Regression

Lasso Regression

Decision Tree Regressor

Random Forest Regressor

🛠️ Tech Stack & Tools

Python

Pandas, NumPy

Scikit-learn

Matplotlib

Jupyter Notebook

🔄 ML Pipeline

Data Cleaning & Missing Value Imputation

One-Hot Encoding for Categorical Features

Train–Test Split

Feature Scaling (StandardScaler)

Model Training & Evaluation

Overfitting Analysis

Ensemble Learning

Hyperparameter Tuning

Feature Importance Analysis
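The preprocessing steps above can be sketched with scikit-learn as follows. This is a minimal illustration, not the project's actual notebook code: the toy DataFrame and its column names (`area_sqft`, `bedrooms`, `location`, `price`) are assumptions made up for the example, and the imputation strategies are one reasonable choice among several.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical toy data; the real dataset and its columns are not shown in this README.
df = pd.DataFrame({
    "area_sqft": [850, 1200, np.nan, 1500, 980, 2100, 1750, 1320],
    "bedrooms": [2, 3, 3, np.nan, 2, 4, 4, 3],
    "location": ["north", "south", "south", "east", "east", "north", "east", "south"],
    "price": [120000, 185000, 160000, 230000, 140000, 310000, 275000, 200000],
})
X = df.drop(columns="price")
y = df["price"]

# Split before fitting any transformer so that imputation and scaling
# statistics are learned from the training rows only.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42
)

numeric_steps = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("scale", StandardScaler()),
])
categorical_steps = Pipeline([
    ("impute", SimpleImputer(strategy="most_frequent")),
    ("onehot", OneHotEncoder(handle_unknown="ignore")),
])

preprocess = ColumnTransformer(
    [
        ("num", numeric_steps, ["area_sqft", "bedrooms"]),
        ("cat", categorical_steps, ["location"]),
    ],
    sparse_threshold=0,  # always return a dense array
)

X_train_prepared = preprocess.fit_transform(X_train)  # fit on train only
X_test_prepared = preprocess.transform(X_test)        # reuse train statistics
print(X_train_prepared.shape, X_test_prepared.shape)
```

Fitting the transformers on the training split and only calling `transform` on the test split keeps test-set information out of the model, which is why the split comes before scaling in the list above.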

📊 Evaluation Metrics

R² Score

Mean Absolute Error (MAE)

Root Mean Squared Error (RMSE)
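For reference, the three metrics can be computed by hand as in the sketch below. The function name `regression_metrics` and the sample numbers are made up for illustration; in practice `sklearn.metrics` provides equivalent functions. The second example shows how R² can become negative, which is the situation a score such as -0.39 describes: the predictions are worse than always predicting the mean price.

```python
import math

def regression_metrics(y_true, y_pred):
    """Return (r2, mae, rmse) for two equal-length sequences of numbers."""
    n = len(y_true)
    mean_true = sum(y_true) / n
    # Residual sum of squares: error left over after the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean of y_true.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n
    rmse = math.sqrt(ss_res / n)
    return r2, mae, rmse

# Made-up values, not taken from the project's data.
y_true = [100.0, 200.0, 300.0]
close_predictions = [110.0, 190.0, 310.0]
poor_predictions = [300.0, 100.0, 200.0]

print(regression_metrics(y_true, close_predictions))  # R² close to 1
print(regression_metrics(y_true, poor_predictions))   # R² below 0
```

RMSE and MAE are in the same units as the target (price), while R² is unitless, so the two kinds of metric answer different questions: how large the typical error is, and how much better the model is than a constant guess.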

📈 Model Performance Summary

| Model | R² Score | RMSE |
| --- | --- | --- |
| Linear / Ridge / Lasso | ~0.336 | ~48,665 |
| Decision Tree | -0.39 | ~70,474 |
| Random Forest (Base) | 0.352 | ~48,092 |
| Random Forest (Tuned) | 0.369 ⭐ | 47,447 ⭐ |

⚙️ Hyperparameter Tuning

Hyperparameter optimization was performed using RandomizedSearchCV with 5-fold cross-validation to improve generalization and reduce RMSE while keeping computational cost reasonable.

Best Parameters:

n_estimators: 300

max_depth: 10

min_samples_split: 2

min_samples_leaf: 2
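A randomized search of this kind might look like the sketch below. The search space, the `n_iter` budget, the RMSE-based scoring, and the synthetic data from `make_regression` are all assumptions for illustration; the README does not state the actual grid or scoring used. The space is only chosen so that the reported best values fall inside it.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import RandomizedSearchCV, train_test_split

# Synthetic stand-in for the housing data (assumption, not the real dataset).
X, y = make_regression(n_samples=200, n_features=8, noise=20.0, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Assumed search space; the project's real candidate values are not documented.
param_distributions = {
    "n_estimators": [50, 100, 300],
    "max_depth": [5, 10, None],
    "min_samples_split": [2, 5, 10],
    "min_samples_leaf": [1, 2, 4],
}

search = RandomizedSearchCV(
    estimator=RandomForestRegressor(random_state=42),
    param_distributions=param_distributions,
    n_iter=5,                               # small budget to keep the sketch fast
    cv=5,                                   # 5-fold cross-validation
    scoring="neg_root_mean_squared_error",  # higher is better, so RMSE is negated
    random_state=42,
)
search.fit(X_train, y_train)

best_params = search.best_params_
cv_rmse = -search.best_score_  # flip the sign back to a plain RMSE
test_r2 = search.best_estimator_.score(X_test, y_test)
print(best_params, round(cv_rmse, 2), round(test_r2, 3))
```

Unlike an exhaustive grid search, `RandomizedSearchCV` samples only `n_iter` combinations, which is what keeps the computational cost bounded as the search space grows.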

🔍 Feature Importance

Feature importance analysis was conducted using the tuned Random Forest model to identify the most influential features affecting house prices, improving model interpretability and validating domain relevance.
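As a rough illustration, impurity-based importances can be read from a fitted Random Forest as shown below. The feature names and the synthetic target are invented for the example; only the four hyperparameter values are taken from the best parameters reported above.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
feature_names = ["area_sqft", "bedrooms", "age_years", "noise"]  # hypothetical names
X = rng.normal(size=(300, 4))
# Synthetic target: the first feature dominates, the last has no influence at all.
y = 5.0 * X[:, 0] + 1.0 * X[:, 1] + 0.5 * X[:, 2] + rng.normal(scale=0.1, size=300)

model = RandomForestRegressor(
    n_estimators=300,
    max_depth=10,
    min_samples_split=2,
    min_samples_leaf=2,
    random_state=42,
)
model.fit(X, y)

# feature_importances_ holds one non-negative score per feature, summing to 1.
ranked = sorted(
    zip(feature_names, model.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, score in ranked:
    print(f"{name:>10}: {score:.3f}")
```

These scores measure how much each feature reduced impurity across the trees. They can favor features with many distinct values, so `sklearn.inspection.permutation_importance` on held-out data is a common cross-check.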

🏆 Final Model Selection

The tuned Random Forest Regressor was selected as the final model due to:

Highest R² score (0.369)

Lowest RMSE (47,447)

Better generalization than the linear baselines and the single Decision Tree

🚀 Key Learnings

Linear models may underperform on complex, non-linear datasets

Decision Trees tend to overfit without proper constraints

Tuned ensemble models provide a better bias–variance tradeoff than a single tree

Hyperparameter tuning matters: here it raised the Random Forest's R² from 0.352 to 0.369 and lowered RMSE from ~48,092 to 47,447
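The overfitting point can be checked by comparing train and test R² for an unconstrained tree and a depth-limited one. This is a sketch on synthetic data from `make_regression`, not a reproduction of the project's results; the depth limit of 3 is an arbitrary choice for the comparison.

```python
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Noisy synthetic data (assumption; stands in for the housing dataset).
X, y = make_regression(n_samples=200, n_features=10, noise=50.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

scores = {}
for label, depth in [("unconstrained", None), ("max_depth=3", 3)]:
    tree = DecisionTreeRegressor(max_depth=depth, random_state=0)
    tree.fit(X_train, y_train)
    # Store (train R², test R²); a large gap between them signals overfitting.
    scores[label] = (tree.score(X_train, y_train), tree.score(X_test, y_test))

for label, (train_r2, test_r2) in scores.items():
    print(f"{label:>14}: train R² = {train_r2:.3f}, test R² = {test_r2:.3f}")
```

An unconstrained tree keeps splitting until it memorizes the training rows, so its training score is essentially perfect while its test score is much lower. Limiting depth, or averaging many trees as a Random Forest does, trades some training fit for stability.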

About

This project focuses on predicting residential house prices based on various features. Multiple machine learning regression algorithms are implemented, evaluated, and compared to identify the best-performing model for accurate price estimation.
