Tutorial Course

COMP 2204 · Machine Learning: Unsupervised Learning

Led by Carl Linnaeus Simulacrum

5 modules · Computing · Updated 6 days ago

K-means, hierarchical clustering, Apriori, and Eclat — finding structure in data that came without labels. Based on the K-means algorithm of Stuart Lloyd.

Module 1

K-Means Clustering

The question
K-means converges to a local minimum that depends on where the initial centres are placed. K-means++ spreads the initial centres apart, which makes a poor local minimum less likely but does not rule it out. The Elbow Method suggests a reasonable k rather than a provably optimal one. But is "minimise within-cluster sum of squares" what you mean by "cluster"?
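
The pieces named above can be sketched in a few dozen lines of plain Python. This is a minimal illustration under stated assumptions, not the course's reference implementation: the toy `points`, the function names, and the choice of ten restarts are all hypothetical, and a real exercise would more likely use a library such as scikit-learn. It shows k-means++ seeding, Lloyd's assign-and-update loop, and the within-cluster sum of squares (WCSS) per k that the Elbow Method inspects.

```python
import random

def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans_pp_init(points, k, rng):
    """k-means++ seeding: each new centre is drawn with probability
    proportional to its squared distance from the nearest chosen centre."""
    centres = [rng.choice(points)]
    while len(centres) < k:
        d2 = [min(sq_dist(p, c) for c in centres) for p in points]
        if sum(d2) == 0:  # every point already coincides with a centre
            centres.append(rng.choice(points))
        else:
            centres.append(rng.choices(points, weights=d2, k=1)[0])
    return centres

def assign(points, centres):
    """Index of the nearest centre for every point."""
    return [min(range(len(centres)), key=lambda j: sq_dist(p, centres[j]))
            for p in points]

def lloyd(points, k, rng, max_iter=100):
    """One run of Lloyd's algorithm; returns (centres, labels, wcss)."""
    centres = kmeans_pp_init(points, k, rng)
    for _ in range(max_iter):
        labels = assign(points, centres)
        new_centres = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            # An empty cluster keeps its previous centre.
            new_centres.append(
                tuple(sum(x) / len(members) for x in zip(*members))
                if members else centres[j]
            )
        if new_centres == centres:  # assignments are stable: converged
            break
        centres = new_centres
    labels = assign(points, centres)
    wcss = sum(sq_dist(p, centres[lab]) for p, lab in zip(points, labels))
    return centres, labels, wcss

def kmeans(points, k, n_init=10, seed=0):
    """Keep the best of n_init restarts, since each run is only a local minimum."""
    rng = random.Random(seed)
    return min((lloyd(points, k, rng) for _ in range(n_init)), key=lambda r: r[2])

# Hypothetical toy data: three well-separated blobs of four points each.
points = [(1, 1), (1, 2), (2, 1), (2, 2),
          (8, 8), (8, 9), (9, 8), (9, 9),
          (1, 8), (1, 9), (2, 8), (2, 9)]

# Elbow Method input: WCSS for each candidate k.
wcss_by_k = {k: kmeans(points, k)[2] for k in range(1, 6)}
```

On this toy data the WCSS falls steeply up to k = 3 and only slightly afterwards, which is the "elbow". Note that the objective is still sum of squares: the sketch inherits the assumption of compact, roughly spherical clusters that the question above asks about.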

Outcome
The student can implement K-means, apply the Elbow Method, and identify K-means' structural assumptions.

Sub-units
1.1 K-Means on the Mall Dataset
1.2 Essay: What Is a Cluster?

Module 2

Hierarchical Clustering

The question
Hierarchical clustering builds a full dendrogram without requiring k in advance. At each step, Ward linkage merges the pair of clusters whose union increases the total within-cluster variance the least. Looking at the dendrogram, which vertical line is the longest without being crossed by a horizontal merge line, and what does it suggest about the natural number of clusters?
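
The "longest vertical line" heuristic can be made concrete with a naive agglomerative sketch. It rests on several assumptions: the toy `points` and all function names are hypothetical, the merge height is taken to be the raw increase in within-cluster sum of squares (libraries such as SciPy report a rescaled distance instead), and the cubic-time search is only suitable for tiny inputs.

```python
def centroid(cluster):
    """Mean point of a cluster given as a list of coordinate tuples."""
    return tuple(sum(x) / len(cluster) for x in zip(*cluster))

def ward_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b merge."""
    ca, cb = centroid(a), centroid(b)
    d2 = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return len(a) * len(b) / (len(a) + len(b)) * d2

def ward_merge_heights(points):
    """Merge the cheapest pair repeatedly; return the cost of each merge in order."""
    clusters = [[p] for p in points]
    heights = []
    while len(clusters) > 1:
        cost, i, j = min(
            (ward_cost(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        merged = clusters[i] + clusters[j]
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)]
        clusters.append(merged)
        heights.append(cost)
    return heights

def k_from_largest_gap(heights):
    """Cut the dendrogram inside the largest gap between successive merge heights."""
    n = len(heights) + 1
    gaps = [heights[m + 1] - heights[m] for m in range(len(heights) - 1)]
    m = max(range(len(gaps)), key=gaps.__getitem__)
    return n - (m + 1)  # clusters remaining after merge number m + 1

# Hypothetical toy data: three well-separated blobs of four points each.
points = [(1, 1), (1, 2), (2, 1), (2, 2),
          (8, 8), (8, 9), (9, 8), (9, 9),
          (1, 8), (1, 9), (2, 8), (2, 9)]

heights = ward_merge_heights(points)
suggested_k = k_from_largest_gap(heights)
```

Here the within-blob merges all cost 1 or less, while the first merge between blobs costs 98, so the largest gap sits where three clusters remain. On real data the gaps are rarely this clear, which is why the dendrogram is read by eye and compared against K-means.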

Outcome
The student can produce a dendrogram, choose k from it, and compare to K-means.

Sub-units
2.1 Hierarchical Clustering

Module 3

Association Rule Learning: Apriori

The question
Support, confidence, lift. A rule {A, B} → C with lift = 4 means customers who buy A and B buy C four times as often as customers overall, that is, four times as often as would be expected if the purchases were independent. At what lift threshold does a rule become actionable, and what happens when you set minimum support too low?
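
The three metrics reduce to a few ratios of transaction counts. The sketch below computes them for a single rule on a hypothetical eight-basket dataset constructed so that the rule {A, B} → C has lift 4; the item names, the data, and the function names are illustrative assumptions, and this is not an implementation of Apriori's candidate generation.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def rule_metrics(transactions, antecedent, consequent):
    """Support, confidence and lift for the rule antecedent -> consequent."""
    both = support(transactions, set(antecedent) | set(consequent))
    conf = both / support(transactions, antecedent)
    lift = conf / support(transactions, consequent)
    return both, conf, lift

# Hypothetical baskets: C is bought in 2 of 8 baskets, always together with A and B.
transactions = [
    {"A", "B", "C"},
    {"A", "B", "C"},
    {"A"},
    {"B"},
    {"A", "D"},
    {"D"},
    {"B", "D"},
    {"D"},
]

rule_support, rule_confidence, rule_lift = rule_metrics(transactions, {"A", "B"}, {"C"})
```

The rule has confidence 1.0 and lift 4.0, yet it rests on only two baskets (support 0.25 of eight). That is the risk of a low minimum support: a high lift computed from a handful of transactions may be noise.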

Outcome
The student can implement Apriori, interpret the three metrics, and identify actionable rules.

Sub-units
3.1 Apriori on the Movie Dataset

Module 4

Eclat and Rule Evaluation

The question
Eclat is faster than Apriori on some datasets but produces only support — no confidence, no lift. Does the added complexity of Apriori's rule metrics change the business recommendations?
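
A minimal sketch of the Eclat idea, assuming a vertical layout in which each item maps to the set of transaction ids that contain it. Support for a larger itemset then comes from intersecting those id sets, with no rescan of the data. The function name, the toy baskets, and the absolute count threshold are hypothetical choices for illustration.

```python
def eclat(transactions, min_count):
    """Return {itemset tuple: support count} for every frequent itemset."""
    # Vertical layout: item -> set of transaction ids containing it.
    tidsets = {}
    for tid, items in enumerate(transactions):
        for item in items:
            tidsets.setdefault(item, set()).add(tid)

    frequent = {}

    def extend(prefix, candidates):
        # candidates: (item, tidset of prefix + item) pairs, all already frequent.
        for i, (item, tids) in enumerate(candidates):
            itemset = prefix + (item,)
            frequent[itemset] = len(tids)
            suffix = []
            for other, other_tids in candidates[i + 1:]:
                shared = tids & other_tids  # support via set intersection
                if len(shared) >= min_count:
                    suffix.append((other, shared))
            if suffix:
                extend(itemset, suffix)

    start = sorted(
        ((item, tids) for item, tids in tidsets.items() if len(tids) >= min_count),
        key=lambda pair: pair[0],
    )
    extend((), start)
    return frequent

# Hypothetical baskets, chosen only for illustration.
transactions = [
    {"A", "B", "C"},
    {"A", "B", "C"},
    {"A"},
    {"B"},
    {"A", "D"},
    {"D"},
    {"B", "D"},
    {"D"},
]

frequent_itemsets = eclat(transactions, min_count=2)
```

The output contains support counts only. Confidence and lift for a rule would have to be derived afterwards from these counts, which is the trade-off the question above asks the student to weigh.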

Outcome
The student can implement Eclat and evaluate when Apriori's extra metrics add value.

Sub-units
4.1 Eclat Implementation

Module 5

Unsupervised Learning in Practice

The question
A high silhouette score does not by itself mean good clusters. Domain knowledge determines whether the structure is real or an artefact. When should you use unsupervised learning, and what does a successful analysis actually look like?
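
For reference, the silhouette score can be computed directly from its definition. The sketch below assumes Euclidean distance, at least two clusters, and points that are not all identical; the toy data and both labellings are hypothetical. For each point, a is the mean distance to the other members of its own cluster and b is the smallest mean distance to any other cluster.

```python
from math import dist

def silhouette_score(points, labels):
    """Mean silhouette over all points (Euclidean distance, at least two clusters)."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)

    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # convention: a singleton cluster scores 0
            continue
        # a: mean distance to the other members of the point's own cluster
        # (dist(p, p) is 0, so dividing by len(own) - 1 excludes the point itself).
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(dist(p, q) for q in other) / len(other)
            for other_lab, other in clusters.items()
            if other_lab != lab
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)

# Hypothetical toy data: two well-separated blobs of four points each.
points = [(1, 1), (1, 2), (2, 1), (2, 2),
          (8, 8), (8, 9), (9, 8), (9, 9)]

by_blob = silhouette_score(points, [0, 0, 0, 0, 1, 1, 1, 1])
interleaved = silhouette_score(points, [0, 1, 0, 1, 0, 1, 0, 1])
```

The score rewards geometric separation and nothing else. It ranks the blob labelling far above the interleaved one, but it cannot say whether either grouping corresponds to anything meaningful in the domain.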

Outcome
The student can apply cluster quality metrics and take a defended position on unsupervised learning's appropriate uses.

Sub-units
5.1 Final Essay: When to Use Unsupervised Learning
