Week 6 Practical

The document outlines a Week 6 in-class practical focused on k-Means clustering techniques using the Mall_Customers.csv dataset. Students will preprocess data, determine the optimal number of clusters through the Elbow Method and Davies-Bouldin Index, and interpret clustering results. The practical includes tasks such as data exploration, training the k-Means algorithm, visualizing clusters, and deriving business insights from the clustering analysis.

Uploaded by

tpnvi95

Week 6 In-Class Practical

Objective:

This practical aims to introduce students to clustering techniques, particularly k-Means
clustering. Students will learn to preprocess data, determine the optimal number of clusters using
the Elbow Method and Davies-Bouldin Index (DBI), and interpret clustering results.

Dataset:

The dataset Mall_Customers.csv contains customer information, including:

• CustomerID
• Gender
• Age
• Annual Income (in $1000s)
• Spending Score (1-100)

Tasks:

Part 1: K-Means Clustering using the Elbow Method

1. Data Exploration and Preprocessing:

o Load the dataset and inspect its structure.
o Convert categorical columns (e.g., Gender) to factors.
o Rename columns for consistency.
o Select relevant features (Annual Income and Spending Score) and normalize
them.
2. Determine Optimal Clusters:
o Implement the Elbow Method by computing the Within-Cluster Sum of Squares
(WSS) for k = 1 to 10.
o Visualize the WSS values and identify the optimal number of clusters.
o Discussion Question: Based on the Elbow Method plot, what is the optimal k?
Justify your choice.
3. Train the K-Means Algorithm:
o Apply k-Means clustering using the chosen number of clusters.
o Extract cluster centroids and cluster assignments.
4. Visualize the Clusters:
o Use ggplot2 to create a scatter plot with clusters colored differently.
o Discussion Question: How do you interpret the centroids given that the data was
normalized?
5. Interpret the Clusters:
o Compute summary statistics for each cluster (mean Annual Income, Spending
Score, and gender distribution).
o Assign meaningful names to clusters (e.g., "Low Spenders - Low Income",
"Impulsive Buyers").
o Discussion Question: Based on your results, how would you use this information
for business decisions?
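
The Part 1 steps can be sketched in base R as follows. This is only one possible outline, not a model answer. It runs on synthetic data standing in for Mall_Customers.csv, so the generated values, the short column names (id, gender, income, score), and the choice of k = 5 are all assumptions made for illustration. With the real file, the data frame would come from read.csv() and k would be read off your own elbow plot.

```r
set.seed(42)  # reproducible synthetic data and k-means starts

# Synthetic stand-in for Mall_Customers.csv (assumption: five customer groups).
true_centers <- rbind(c(25, 20), c(25, 80), c(55, 50), c(90, 20), c(90, 80))
raw <- data.frame(
  CustomerID     = 1:200,
  Gender         = sample(c("Female", "Male"), 200, replace = TRUE),
  Annual.Income  = rep(true_centers[, 1], each = 40) + rnorm(200, sd = 5),
  Spending.Score = rep(true_centers[, 2], each = 40) + rnorm(200, sd = 5)
)

# Step 1: convert Gender to a factor, rename columns (hypothetical names),
# select the two features and normalize them to z-scores (mean 0, sd 1).
raw$Gender <- factor(raw$Gender)
names(raw) <- c("id", "gender", "income", "score")
scaled <- scale(raw[, c("income", "score")])

# Step 2: Elbow Method, total within-cluster sum of squares for k = 1..10.
wss <- sapply(1:10, function(k) {
  kmeans(scaled, centers = k, nstart = 25)$tot.withinss
})
if (interactive()) plot(1:10, wss, type = "b", xlab = "k", ylab = "WSS")

# Step 3: fit k-Means with the chosen k (5 is an assumption that matches
# how the synthetic data was generated).
k <- 5
fit <- kmeans(scaled, centers = k, nstart = 25)

# Step 4: centroids are in z-score units; undo the scaling so they can be
# read in the original units of income and score.
centroids <- sweep(fit$centers, 2, attr(scaled, "scaled:scale"), "*")
centroids <- sweep(centroids, 2, attr(scaled, "scaled:center"), "+")

# Step 5: per-cluster means and gender counts on the original scale.
raw$cluster <- factor(fit$cluster)
cluster_means <- aggregate(cbind(income, score) ~ cluster, data = raw, FUN = mean)
gender_counts <- table(raw$cluster, raw$gender)
print(cluster_means)
print(gender_counts)
```

The back-transformed centroids coincide with the per-cluster means of the unscaled features, which is one way to approach the discussion question on interpreting normalized centroids. The cluster scatter plot itself is left to ggplot2, as the task asks.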

Part 2: K-Means Clustering using the Davies-Bouldin Index (DBI)

1. Compute DBI for Different k values:

o Implement k-Means clustering for k = 2 to 10.
o Compute the DBI for each k and identify the optimal number of clusters (lower
DBI is better).
o Discussion Question: How does DBI compare to the Elbow Method for
determining k?
2. Train K-Means with Optimal k (based on DBI):
o Apply k-Means clustering using the best k obtained from DBI.
o Visualize the clusters using Principal Component Analysis (PCA) for
dimensionality reduction.
3. Interpretation and Business Insights:
o Compute and analyze cluster summary statistics.
o Assign meaningful cluster names.
o Discussion Question: How does the clustering result using DBI compare to that
of the Elbow Method?
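
The Part 2 steps can be sketched in the same way. The helper davies_bouldin() below is a hypothetical hand-written function, not part of any package. It assumes the usual definition of the index: for each cluster, take the worst ratio of summed within-cluster scatter to centroid distance against any other cluster, then average those ratios. The data is again synthetic and stands in for the normalized Annual Income and Spending Score columns.

```r
set.seed(42)  # reproducible synthetic data and k-means starts

# Synthetic stand-in for the two selected features (assumption).
true_centers <- rbind(c(25, 20), c(25, 80), c(55, 50), c(90, 20), c(90, 80))
features <- cbind(
  income = rep(true_centers[, 1], each = 40) + rnorm(200, sd = 5),
  score  = rep(true_centers[, 2], each = 40) + rnorm(200, sd = 5)
)
scaled <- scale(features)

# Davies-Bouldin Index for a numeric matrix x and a vector of cluster labels.
davies_bouldin <- function(x, cluster) {
  ids <- sort(unique(cluster))
  # Centroid of each cluster, one row per cluster.
  cents <- do.call(rbind, lapply(ids, function(i) {
    colMeans(x[cluster == i, , drop = FALSE])
  }))
  # Scatter S_i: mean Euclidean distance from members to their centroid.
  scatter <- sapply(seq_along(ids), function(i) {
    pts <- x[cluster == ids[i], , drop = FALSE]
    mean(sqrt(rowSums(sweep(pts, 2, cents[i, ])^2)))
  })
  # Separation M_ij: Euclidean distance between centroids.
  sep <- as.matrix(dist(cents))
  ratio <- outer(scatter, scatter, "+") / sep
  diag(ratio) <- -Inf  # exclude j == i from the maximum
  mean(apply(ratio, 1, max))
}

# Step 1: DBI for k = 2..10; the lowest value marks the preferred k.
dbi <- sapply(2:10, function(k) {
  davies_bouldin(scaled, kmeans(scaled, centers = k, nstart = 25)$cluster)
})
names(dbi) <- 2:10
best_k <- as.integer(names(dbi)[which.min(dbi)])

# Step 2: refit with the DBI-selected k and project onto principal components.
fit <- kmeans(scaled, centers = best_k, nstart = 25)
pca <- prcomp(scaled)
scores <- data.frame(pca$x[, 1:2], cluster = factor(fit$cluster))

# Optional plot, drawn only in an interactive session with ggplot2 installed.
if (interactive() && requireNamespace("ggplot2", quietly = TRUE)) {
  print(ggplot2::ggplot(scores, ggplot2::aes(PC1, PC2, colour = cluster)) +
          ggplot2::geom_point())
}
```

With only two input features, PCA amounts to a rotation of the same plane, so the projection mainly changes the axes. It starts to reduce dimensionality once more features, such as Age, are included in the clustering.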
