A complete end-to-end customer segmentation project using K-Means Clustering to analyze customer behavior and identify meaningful groups for business decision-making.
Customer segmentation plays a key role in understanding consumer behavior and improving marketing strategies.
In this project, we use unsupervised machine learning to group customers based on their purchasing patterns.
This project demonstrates:
- Data preprocessing
- Exploratory Data Analysis (EDA)
- Feature selection
- Clustering using K-Means
- Data visualization
- Dataset Name: Mall Customers Dataset
- Source: Public dataset
- Features Used:
  - Gender
  - Age
  - Annual Income (k$)
  - Spending Score (1–100)
The goal is to segment customers into meaningful groups based on spending behavior and income, helping businesses:
- Target the right audience
- Improve customer engagement
- Optimize marketing strategies
- Programming Language: Python
- Libraries Used:
  - NumPy
  - Pandas
  - Matplotlib
  - Seaborn
  - Scikit-learn
- Loaded dataset using Pandas
- Checked data types and null values
- Removed unnecessary columns
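The preprocessing steps above can be sketched roughly as follows. This is a minimal illustration, not the project's exact code: the small DataFrame is made-up sample data standing in for the real file, and the column names (including the `CustomerID` identifier column that gets dropped) are assumptions based on the feature list above.

```python
import pandas as pd

# Made-up sample rows standing in for the real dataset; values are illustrative only.
# With the actual file you would load it with pd.read_csv(...) instead.
df = pd.DataFrame({
    "CustomerID": [1, 2, 3, 4, 5, 6],  # assumed identifier column
    "Gender": ["Male", "Female", "Female", "Male", "Female", "Male"],
    "Age": [19, 35, 26, 47, 52, 31],
    "Annual Income (k$)": [15, 40, 62, 78, 88, 120],
    "Spending Score (1-100)": [39, 81, 55, 17, 92, 46],
})

# Check the data type of each column and count missing values per column.
print(df.dtypes)
print(df.isnull().sum())

# Drop the identifier column, which carries no behavioral information.
df = df.drop(columns=["CustomerID"])
print(df.head())
```

If the null check reported missing values, they would need to be dropped or imputed before clustering, since K-Means cannot handle NaNs.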
- Distribution analysis of:
  - Age
  - Annual Income
  - Spending Score
- Gender-based insights
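One possible way to run the distribution and gender-based analysis with Pandas alone is sketched below. The data is randomly generated as a placeholder (it is not the Mall Customers data), and the column names are assumed to match the feature list above.

```python
import numpy as np
import pandas as pd

# Randomly generated placeholder data; NOT the real dataset.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "Gender": rng.choice(["Male", "Female"], size=n),
    "Age": rng.integers(18, 71, size=n),
    "Annual Income (k$)": rng.integers(15, 138, size=n),
    "Spending Score (1-100)": rng.integers(1, 101, size=n),
})

numeric_cols = ["Age", "Annual Income (k$)", "Spending Score (1-100)"]

# Summary statistics (count, mean, std, quartiles) for each numeric feature.
summary = df[numeric_cols].describe()
print(summary)

# Tabular histogram: number of customers in each 10-year age bin.
age_bins = pd.cut(df["Age"], bins=list(range(10, 81, 10)))
print(age_bins.value_counts().sort_index())

# Gender-based view: mean of each numeric feature per gender.
by_gender = df.groupby("Gender")[numeric_cols].mean()
print(by_gender)
```

The same columns could be passed to Matplotlib or Seaborn histogram functions for a graphical view of the distributions.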
Selected feature pairs for clustering:
- Age and Spending Score
- Annual Income and Spending Score
- Used the Elbow Method to choose the number of clusters
- Applied K-Means algorithm
- Labeled customer segments
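The clustering steps above might look like the sketch below. It runs on synthetic income and spending-score points generated around three made-up centres, so the "elbow" at `k = 3` is a property of this toy data, not a result from the project. Standardizing the features with `StandardScaler` is an extra step assumed here, not something the steps above state.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Synthetic (annual income, spending score) points around three made-up centres.
rng = np.random.default_rng(42)
centres = np.array([[25, 20], [60, 50], [100, 85]])
X = np.vstack([c + rng.normal(scale=5.0, size=(50, 2)) for c in centres])

# Put both features on the same scale so neither dominates the distance (assumed step).
X_scaled = StandardScaler().fit_transform(X)

# Elbow Method: fit K-Means for a range of k and record the within-cluster
# sum of squares (exposed by scikit-learn as `inertia_`).
k_values = list(range(1, 9))
inertias = []
for k in k_values:
    model = KMeans(n_clusters=k, n_init=10, random_state=42).fit(X_scaled)
    inertias.append(model.inertia_)

for k, wcss in zip(k_values, inertias):
    print(f"k={k}: WCSS={wcss:.2f}")

# Choose k where the curve stops dropping sharply; 3 is assumed for this toy data.
k_chosen = 3
kmeans = KMeans(n_clusters=k_chosen, n_init=10, random_state=42)
labels = kmeans.fit_predict(X_scaled)  # one cluster label per customer
print(np.bincount(labels))
```

Plotting `inertias` against `k_values` gives the usual elbow curve; the bend is normally picked by eye.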
- Cluster visualization using scatter plots
- Clear differentiation of customer groups
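A scatter plot of the clusters could be produced along these lines. This is a sketch on synthetic data, not the project's own plotting code. The output filename is hypothetical, and the non-interactive `Agg` backend is chosen only so the example runs without a display.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line to use plt.show()
import matplotlib.pyplot as plt
import numpy as np
from sklearn.cluster import KMeans

# Synthetic (annual income, spending score) points around three made-up centres.
rng = np.random.default_rng(7)
centres = np.array([[25, 20], [60, 50], [100, 85]])
X = np.vstack([c + rng.normal(scale=5.0, size=(50, 2)) for c in centres])

kmeans = KMeans(n_clusters=3, n_init=10, random_state=7).fit(X)

fig, ax = plt.subplots(figsize=(7, 5))
# Colour each customer by its assigned cluster label.
ax.scatter(X[:, 0], X[:, 1], c=kmeans.labels_, cmap="viridis", s=30)
# Mark the fitted cluster centres.
ax.scatter(kmeans.cluster_centers_[:, 0], kmeans.cluster_centers_[:, 1],
           c="red", marker="X", s=150, label="Centroids")
ax.set_xlabel("Annual Income (k$)")
ax.set_ylabel("Spending Score (1-100)")
ax.set_title("Customer segments (synthetic data)")
ax.legend()

# Hypothetical output location for the figure.
out_path = os.path.join(tempfile.gettempdir(), "customer_clusters.png")
fig.savefig(out_path, dpi=150)
plt.close(fig)
```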
- Customers were grouped into distinct clusters based on behavior
- Identified:
  - High-income, high-spending customers
  - Budget-conscious customers
  - Average spenders
- These insights can help businesses improve:
  - Personalized marketing
  - Customer retention
  - Product recommendations
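Segment descriptions like the ones listed above are typically read off a per-cluster profile. The sketch below shows one way to build such a profile. It uses synthetic data, so the numbers it prints say nothing about the real customers, and the column names are assumptions.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans

# Synthetic (annual income, spending score) points around three made-up centres.
rng = np.random.default_rng(1)
centres = np.array([[25, 20], [60, 50], [100, 85]])
X = np.vstack([c + rng.normal(scale=5.0, size=(50, 2)) for c in centres])

df = pd.DataFrame(X, columns=["Annual Income (k$)", "Spending Score (1-100)"])
df["Cluster"] = KMeans(n_clusters=3, n_init=10, random_state=1).fit_predict(X)

# Per-cluster size and average income/spending, ordered from lowest to highest spenders.
profile = (
    df.groupby("Cluster")
      .agg(
          customers=("Annual Income (k$)", "size"),
          mean_income=("Annual Income (k$)", "mean"),
          mean_spending=("Spending Score (1-100)", "mean"),
      )
      .sort_values("mean_spending")
)
print(profile.round(1))
```

A cluster with both high `mean_income` and high `mean_spending` would correspond to the high-income, high-spending segment, and so on for the other groups.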
git clone https://github.com/ProblemShooter/customer-segmentation-ml.git
cd customer-segmentation-ml