House Price Prediction Using Machine Learning Algorithms:
A Comparative Study of Regression Models with Full-Stack Web Deployment
Keywords:
House Price Prediction, Gradient Boosting, Random Forest, Linear Regression, Ridge Regression, Decision Tree, scikit-learn, Flask, California Housing, Supervised Regression, Machine Learning Deployment
Abstract
This article presents an end-to-end machine learning system for residential house price prediction, trained and evaluated on a synthetic California housing dataset of 10,000 records with nine input features. Five supervised regression algorithms are compared: Linear Regression, Ridge Regression, Decision Tree, Random Forest, and Gradient Boosting Regressor. After a preprocessing pipeline (missing value imputation, label encoding of categorical variables, and outlier removal), an 80/20 stratified train-test split is applied. Evaluation with four standard metrics (R², Adjusted R², Mean Absolute Error, and Root Mean Squared Error) shows that the Gradient Boosting Regressor performs best, with R² = 0.8924, MAE = $20,267, and RMSE = $25,273. The best-performing model is deployed as a full-stack Flask web application with SQLite-backed user authentication, real-time prediction, seven interactive EDA visualizations, and a model comparison dashboard. The article provides mathematical formulations for all five algorithms, the system architecture, data flow diagrams, algorithmic pseudocode, and performance analyses presented as bar and distribution charts. The results indicate that ensemble tree-based methods reduce prediction error by up to 16% relative to linear baselines and can serve real-time predictions without GPU infrastructure.
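As an illustration of the four evaluation metrics named in the abstract, the following minimal Python sketch computes R², Adjusted R², MAE, and RMSE from paired true and predicted prices using their standard textbook definitions. The function name `regression_metrics`, its parameters, and the toy prices in the usage note are assumptions made for this example; they are not taken from the article's code or data.

```python
import math


def regression_metrics(y_true, y_pred, n_features):
    """Return R^2, adjusted R^2, MAE and RMSE for paired targets and predictions.

    Hypothetical helper for illustration only; uses the standard definitions.
    n_features is the number of input features p (nine in the article's dataset).
    """
    n = len(y_true)
    if n != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if n - n_features - 1 <= 0:
        raise ValueError("adjusted R^2 needs more than n_features + 1 samples")

    mean_y = sum(y_true) / n
    residuals = [t - p for t, p in zip(y_true, y_pred)]

    ss_res = sum(r * r for r in residuals)            # residual sum of squares
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)   # total sum of squares
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all targets are identical")

    r2 = 1.0 - ss_res / ss_tot
    # Adjusted R^2 penalises R^2 for the number of features p:
    # 1 - (1 - R^2) * (n - 1) / (n - p - 1)
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - n_features - 1)
    mae = sum(abs(r) for r in residuals) / n          # mean absolute error
    rmse = math.sqrt(ss_res / n)                      # root mean squared error

    return {"r2": r2, "adj_r2": adj_r2, "mae": mae, "rmse": rmse}
```

For example, calling `regression_metrics([100, 200, 300, 400], [110, 190, 310, 390], n_features=1)` on these made-up values returns a dictionary with the four scores. MAE and RMSE are in the same units as the target (dollars in the article), while R² and Adjusted R² are unitless.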
License
Copyright (c) 2026 Authors

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.