Universitas Scholarium: A Community of Scholars
Tutorial Course

COMP 2204 · Machine Learning: Unsupervised Learning

Led by Carl Linnaeus Simulacrum

5 modules · Computing · Updated 6 days ago

K-means, hierarchical clustering, Apriori, and Eclat — finding structure in data that came without labels. Based on the K-means algorithm of Stuart Lloyd.

  1. Module 1

    K-Means Clustering

    The question

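    K-means converges to a local minimum of the within-cluster sum of squares, and which minimum it reaches depends on where the initial centres are placed. K-means++ spreads the initial centres apart, which makes a poor local minimum less likely but does not rule it out. The Elbow Method suggests a value of k by looking for the point where adding another cluster stops reducing the within-cluster sum of squares by much. But is "minimise within-cluster sum of squares" what you mean by "cluster"?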

    Outcome

    The student can implement K-means, apply the Elbow Method, and identify K-means' structural assumptions.

    Sub-units

    1. 1.1 K-Means on the Mall Dataset
    2. 1.2 Essay: What Is a Cluster?
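
As a rough illustration of the Module 1 question, here is a minimal sketch of Lloyd-style K-means with K-means++ seeding, plus the within-cluster sum of squares (WCSS) that the Elbow Method plots against k. It is not the course's reference implementation: the function names, the restart count `n_init`, and the twelve toy points are all assumptions made for this example, and the Mall dataset is not reproduced here.

```python
import random

def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans_pp_init(points, k, rng):
    """K-means++ seeding: each new centre is drawn with probability
    proportional to its squared distance from the nearest chosen centre.
    Assumes there are more distinct points than centres."""
    centres = [rng.choice(points)]
    while len(centres) < k:
        d2 = [min(sq_dist(p, c) for c in centres) for p in points]
        centres.append(rng.choices(points, weights=d2, k=1)[0])
    return centres

def assign(points, centres):
    """Index of the nearest centre for every point."""
    return [min(range(len(centres)), key=lambda j: sq_dist(p, centres[j]))
            for p in points]

def kmeans(points, k, seed=0, max_iter=100):
    """Alternate assignment and centroid update until the centres stop moving."""
    rng = random.Random(seed)
    centres = kmeans_pp_init(points, k, rng)
    for _ in range(max_iter):
        labels = assign(points, centres)
        new_centres = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centres.append(tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                new_centres.append(centres[j])  # keep an empty cluster's centre
        if new_centres == centres:
            break
        centres = new_centres
    labels = assign(points, centres)
    wcss = sum(sq_dist(p, centres[lab]) for p, lab in zip(points, labels))
    return centres, labels, wcss

def elbow_curve(points, ks, n_init=10):
    """Best WCSS over n_init seeds for each k (hypothetical helper)."""
    return {k: min(kmeans(points, k, seed=s)[2] for s in range(n_init)) for k in ks}

# Made-up toy data: three well-separated 2x2 blobs.
POINTS = [(1, 1), (1, 2), (2, 1), (2, 2),
          (8, 8), (8, 9), (9, 8), (9, 9),
          (1, 8), (1, 9), (2, 8), (2, 9)]

wcss_by_k = elbow_curve(POINTS, range(1, 6))
for k, wcss in wcss_by_k.items():
    print(k, round(wcss, 2))
```

The restarts in `elbow_curve` reflect the point made in the question: a single run can settle in a poor local minimum even with K-means++ seeding, so the sketch keeps the best of several seeds. On data like this the WCSS should drop sharply up to k = 3 and flatten afterwards, which is the "elbow" the method looks for.
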
  2. Module 2

    Hierarchical Clustering

    The question

    Hierarchical clustering builds a full dendrogram without requiring k in advance. Ward linkage merges, at each step, the pair of clusters whose union increases the total within-cluster variance the least. Looking at the dendrogram: what is the longest vertical line that no horizontal merge line crosses, and what does it suggest about the natural number of clusters?

    Outcome

    The student can produce a dendrogram, choose k from it, and compare to K-means.

    Sub-units

    1. 2.1 Hierarchical Clustering
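
To make the Module 2 question concrete, the following is a naive sketch of agglomerative clustering with a Ward-style merge cost, followed by a "largest gap" rule for reading k off the merge heights. It is written under several assumptions: the helper names are hypothetical, the toy points are made up, and the heights are the raw increase in within-cluster sum of squares, which is scaled differently from the heights some libraries draw on their dendrograms.

```python
def centroid(cluster):
    """Mean point of a cluster given as a list of tuples."""
    return tuple(sum(x) / len(cluster) for x in zip(*cluster))

def ward_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b merge."""
    ca, cb = centroid(a), centroid(b)
    d2 = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return len(a) * len(b) / (len(a) + len(b)) * d2

def ward_merge_heights(points):
    """Repeatedly merge the cheapest pair; return the cost of every merge.
    Naive and slow, intended only for small examples."""
    clusters = [[p] for p in points]
    heights = []
    while len(clusters) > 1:
        cost, i, j = min(
            (ward_cost(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        heights.append(cost)
        merged = clusters[i] + clusters[j]
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)]
        clusters.append(merged)
    return heights

def k_from_largest_gap(heights):
    """Cut where consecutive merge heights jump the most.
    After merge number i + 1 there are n - (i + 1) clusters left."""
    n = len(heights) + 1
    gaps = [heights[i + 1] - heights[i] for i in range(len(heights) - 1)]
    i = max(range(len(gaps)), key=gaps.__getitem__)
    return n - (i + 1)

# Made-up toy data: three well-separated 2x2 blobs.
POINTS = [(1, 1), (1, 2), (2, 1), (2, 2),
          (8, 8), (8, 9), (9, 8), (9, 9),
          (1, 8), (1, 9), (2, 8), (2, 9)]

heights = ward_merge_heights(POINTS)
suggested_k = k_from_largest_gap(heights)
print(heights)
print(suggested_k)
```

The largest jump between consecutive merge heights plays the role of the longest uncrossed vertical line: it marks the point where the algorithm has to start joining clusters that are far apart. Treat the result as a suggestion to compare against K-means, not as a definitive answer.
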
  3. Module 3

    Association Rule Learning: Apriori

    The question

    Support, confidence, lift. For the rule {A, B} → C, lift = 4 means customers who buy A and B buy C four times as often as customers in general, that is, four times as often as expected if C were independent of A and B. At what lift threshold does a rule become actionable, and what happens when you set minimum support too low?

    Outcome

    The student can implement Apriori, interpret the three metrics, and identify actionable rules.

    Sub-units

    1. 3.1 Apriori on the Movie Dataset
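
The three metrics in the Module 3 question can be checked by hand on a tiny example. The sketch below computes support, confidence, and lift for a single rule; it does not perform Apriori's candidate generation or pruning. The eight transactions are invented for illustration and were chosen so that the rule {A, B} → C has lift 4, and the function names are assumptions, not part of any library.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def rule_metrics(transactions, antecedent, consequent):
    """Support, confidence and lift of the rule antecedent -> consequent.
    Assumes both sides occur in at least one transaction."""
    both = set(antecedent) | set(consequent)
    supp = support(transactions, both)
    conf = supp / support(transactions, antecedent)   # P(consequent | antecedent)
    lift = conf / support(transactions, consequent)   # relative to the base rate
    return supp, conf, lift

# Made-up baskets: C appears in 2 of 8, and always together with A and B.
TRANSACTIONS = [
    {"A", "B", "C"},
    {"A", "B", "C"},
    {"A"},
    {"B"},
    {"A", "D"},
    {"B", "D"},
    {"D"},
    {"A"},
]

supp, conf, lift = rule_metrics(TRANSACTIONS, {"A", "B"}, {"C"})
print(supp, conf, lift)
```

Here C is bought in a quarter of all baskets but in every basket that contains both A and B, so confidence is 1.0 and lift is 1.0 / 0.25 = 4. Note that the rule rests on only two transactions, which is the kind of fragile result a very low minimum support lets through.
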
  4. Module 4

    Eclat and Rule Evaluation

    The question

    Eclat is faster than Apriori on some datasets, but it reports only the support of frequent itemsets: no confidence, no lift. Does the added complexity of Apriori's rule metrics change the business recommendations?

    Outcome

    The student can implement Eclat and evaluate when Apriori's extra metrics add value.

    Sub-units

    1. 4.1 Eclat Implementation
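
For comparison with Module 3, here is a minimal sketch of the Eclat idea: store, for each item, the set of transaction IDs that contain it, and find frequent itemsets by intersecting those sets depth-first. The function name, the count-based threshold `min_support_count`, and the toy transactions are assumptions made for this example.

```python
def eclat(transactions, min_support_count):
    """Return {itemset (sorted tuple): support count} for all frequent itemsets."""
    # Vertical layout: item -> set of transaction IDs containing it.
    tidsets = {}
    for tid, transaction in enumerate(transactions):
        for item in transaction:
            tidsets.setdefault(item, set()).add(tid)

    frequent = {}

    def extend(prefix, candidates):
        # candidates: (item, tidset of prefix + item), all already frequent.
        for idx, (item, tids) in enumerate(candidates):
            itemset = prefix + (item,)
            frequent[itemset] = len(tids)
            suffix = []
            for other, other_tids in candidates[idx + 1:]:
                shared = tids & other_tids  # transactions containing both
                if len(shared) >= min_support_count:
                    suffix.append((other, shared))
            if suffix:
                extend(itemset, suffix)

    start = sorted(
        ((item, tids) for item, tids in tidsets.items()
         if len(tids) >= min_support_count),
        key=lambda pair: pair[0],
    )
    extend((), start)
    return frequent

# Made-up baskets, chosen for illustration only.
TRANSACTIONS = [
    {"A", "B", "C"},
    {"A", "B", "C"},
    {"A"},
    {"B"},
    {"A", "D"},
    {"B", "D"},
    {"D"},
    {"A"},
]

frequent = eclat(TRANSACTIONS, min_support_count=2)
for itemset, count in sorted(frequent.items()):
    print(itemset, count)
```

The output is support counts only, as the question says. Confidence and lift are ratios of supports, so they can still be computed afterwards from this table if a recommendation turns out to need them.
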
  5. Module 5

    Unsupervised Learning in Practice

    The question

    A high silhouette score does not by itself mean good clusters. Domain knowledge determines whether the structure is real or an artefact. When should you use unsupervised learning, and what does a successful analysis actually look like?

    Outcome

    The student can apply cluster quality metrics and take a defended position on unsupervised learning's appropriate uses.

    Sub-units

    1. 5.1 Final Essay: When to Use Unsupervised Learning
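
As one example of a cluster quality metric for Module 5, the sketch below computes the mean silhouette score from its usual definition: for each point, a is the mean distance to the other members of its own cluster, b is the smallest mean distance to any other cluster, and the score is (b - a) / max(a, b). The function name, the toy points, and both labelings are assumptions made for this example. It requires Python 3.8 or later for `math.dist`.

```python
import math

def silhouette_score(points, labels):
    """Mean silhouette over all points.
    Assumes at least two clusters and that not all points are identical."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)

    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # common convention for singleton clusters
            continue
        # The distance from p to itself is 0, so divide by len(own) - 1.
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        b = min(
            sum(math.dist(p, q) for q in other) / len(other)
            for other_lab, other in clusters.items()
            if other_lab != lab
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)

# Made-up toy data: three well-separated 2x2 blobs.
POINTS = [(1, 1), (1, 2), (2, 1), (2, 2),
          (8, 8), (8, 9), (9, 8), (9, 9),
          (1, 8), (1, 9), (2, 8), (2, 9)]

BLOB_LABELS = [0] * 4 + [1] * 4 + [2] * 4       # one label per blob
MIXED_LABELS = [i % 3 for i in range(12)]       # labels that ignore the blobs

good = silhouette_score(POINTS, BLOB_LABELS)
bad = silhouette_score(POINTS, MIXED_LABELS)
print(round(good, 3), round(bad, 3))
```

The score rewards compact, well-separated groups and nothing else. It cannot say whether the groups correspond to anything meaningful in the domain, which is why the question treats it as evidence to be argued over and not as a verdict.
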