Sales Forecasting for Big Mart Using XGBoost Regression — A Machine Learning Approach
🔍 Project Overview
This project predicts sales for Big Mart stores across India from historical data. I built a machine learning model that uses factors such as product price, store type, location, and promotions to forecast future sales.
🎯 Business Problem
Big Mart needed a reliable way to forecast sales to:
- Avoid overstocking and stockouts
- Optimize inventory and staffing
- Plan promotions more effectively
A machine learning forecast gives each of these decisions a data-driven basis.
🛠 Tools & Techniques Used
- Python + Google Colab
- Libraries: Pandas, Seaborn, Matplotlib, XGBoost, Scikit-learn
- Model Used: XGBoost Regressor
- Data Source: Kaggle – Big Mart Sales Dataset
⚙️ Key Steps I Followed
- Data cleaning and handling missing values
- Feature encoding (Label Encoding for categories)
- Exploratory Data Analysis using Seaborn
- Model training with XGBoost
- Evaluation using R² Score (Train: 0.87, Test: 0.51)
📊 Key Findings
- Higher MRP items tend to have more sales
- Larger supermarkets in Tier 1 cities perform better
- Product visibility alone doesn't drive sales
- Holidays and promotions create strong seasonal effects
🤖 ML Model Performance
- XGBoost Regressor fit the training data well (train R² of 0.87)
- Test R² of about 0.51: a usable baseline, though the gap from the training score points to overfitting and leaves room for feature improvements
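For readers unfamiliar with the metric, R² compares the model's squared error with that of always predicting the mean: 1.0 is a perfect fit and 0.0 is no better than the mean. The sketch below computes it by hand on made-up numbers (the sales values are purely illustrative), then measures the train/test gap using the two scores reported in this post.

```python
def r2_manual(y_true, y_pred):
    """R² = 1 - (residual sum of squares / total sum of squares)."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Illustrative values only, not taken from the Big Mart data.
actual_sales = [100, 200, 300, 400]
predicted_sales = [120, 180, 330, 370]
example_r2 = r2_manual(actual_sales, predicted_sales)
print(f"Example R²: {example_r2:.3f}")

# Gap between the train and test scores reported in this post.
reported_train_r2 = 0.87
reported_test_r2 = 0.51
gap = round(reported_train_r2 - reported_test_r2, 2)
print(f"Train/test gap: {gap}")
```

A gap this large usually means the model has memorized part of the training set, which is why the test score is the one to watch when improving features.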
🧠 Final Thoughts
This project helped me understand how machine learning models can be used in retail to improve decision-making. I’d love to extend this by:
- Adding external factors (like weather or holidays)
- Deploying the model into a dashboard
- Exploring deep learning-based forecasting