Tutorial Course

COMP 2203 · Machine Learning: Classification

Led by Pearlian Causality Simulacrum

5 modules · Computing · Updated 1 week ago

Seven classifiers — from Bayes's posterior probability to Random Forest ensembles — with Python implementation and systematic evaluation.


Module 1

Logistic Regression and Classification Fundamentals

Led by Pearlian Causality Simulacrum

The question
The sigmoid function maps any linear combination to a probability between 0 and 1. Logistic regression uses this to estimate the probability of class membership directly. What does the decision boundary look like — and what does the confusion matrix reveal that accuracy hides?
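
As a minimal sketch of the ideas in this question, the following uses only the Python standard library. The weights, bias, and data points are hypothetical values chosen for illustration, not a fitted model. The decision boundary is the set of points where the linear combination equals zero, which is where the sigmoid outputs exactly 0.5.

```python
import math

def sigmoid(z):
    """Map any real number to a probability strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(weights, bias, x):
    """Estimated probability that x belongs to class 1."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return sigmoid(z)

def confusion_matrix(y_true, y_pred):
    """Return counts (tn, fp, fn, tp) for binary labels 0 and 1."""
    tn = fp = fn = tp = 0
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == 1:
            tp += 1
        elif t == 1 and p == 0:
            fn += 1
        elif t == 0 and p == 1:
            fp += 1
        else:
            tn += 1
    return tn, fp, fn, tp

# Hypothetical parameters: the boundary is the line 2*x1 - x2 + 0.5 = 0.
weights, bias = [2.0, -1.0], 0.5
X = [[1.0, 0.0], [0.0, 3.0], [-1.0, -1.0], [0.5, 1.0]]
y_true = [1, 0, 1, 1]

# Threshold the probability at 0.5 to get a class label.
y_pred = [1 if predict_proba(weights, bias, x) >= 0.5 else 0 for x in X]
tn, fp, fn, tp = confusion_matrix(y_true, y_pred)
accuracy = (tp + tn) / len(y_true)
print(y_pred, (tn, fp, fn, tp), accuracy)
```

Accuracy collapses the four counts into one number. The confusion matrix keeps the false negatives and false positives separate, which matters whenever the two kinds of error have different costs.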

Outcome
The student can implement logistic regression, produce a confusion matrix, and visualise the decision boundary.
Sub-units
1.1 Logistic Regression Implementation
1.2 Visualise the Decision Boundary
Module 2

K-Nearest Neighbours and SVM

Led by Pearlian Causality Simulacrum

The question
KNN predicts by majority vote among the k nearest training points, so all of its computation happens at prediction time. A linear SVM finds the hyperplane that maximises the margin between classes. When do these simple geometric classifiers fail, and what does it tell you about a dataset when one outperforms the other?
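
The two geometric ideas can be sketched in a few lines of standard-library Python. This is an illustration under stated assumptions, not a full SVM: `geometric_margin` only measures the margin of a hyperplane that is supplied by hand, and the data and the two candidate hyperplanes are hypothetical.

```python
import math
from collections import Counter

def knn_predict(X_train, y_train, x, k=3):
    """Classify x by majority vote among its k nearest training points."""
    by_distance = sorted(zip(X_train, y_train),
                         key=lambda pair: math.dist(pair[0], x))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

def geometric_margin(w, b, X, y):
    """Smallest signed distance from any point to the hyperplane w.x + b = 0.

    Labels must be +1 or -1. A negative result means at least one point
    lies on the wrong side of the hyperplane.
    """
    norm = math.sqrt(sum(wi * wi for wi in w))
    return min(
        yi * (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm
        for x, yi in zip(X, y)
    )

# Hypothetical, linearly separable toy data.
X_train = [(0, 0), (0, 1), (1, 0), (3, 3), (3, 4), (4, 3)]
y_train = [-1, -1, -1, 1, 1, 1]

near_origin = knn_predict(X_train, y_train, (0.5, 0.5))
near_corner = knn_predict(X_train, y_train, (3.5, 3.5))

# Two hand-picked separating hyperplanes. Both classify every point
# correctly, but they leave different amounts of room around the data.
margin_diagonal = geometric_margin((1, 1), -3.5, X_train, y_train)
margin_vertical = geometric_margin((1, 0), -2.0, X_train, y_train)
print(near_origin, near_corner, margin_diagonal, margin_vertical)
```

Both hyperplanes separate the training data, yet the diagonal one has the larger margin, which is the criterion an SVM optimises. KNN has no training step at all: every call to `knn_predict` sorts the whole training set by distance.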

Outcome
The student can implement KNN and SVM, explain margin maximisation, and identify when linear classifiers fail.
Sub-units
2.1 KNN and SVM
Module 3

Kernel SVM and Naive Bayes

Led by Pearlian Causality Simulacrum

The question
The RBF kernel implicitly maps data into an infinite-dimensional feature space where a linear boundary can often separate the classes; seen from the original space, that boundary is nonlinear. Naive Bayes applies Bayes's theorem assuming features are conditionally independent given the class, which is almost always false. Why does it still work?
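
A rough sketch of both ideas, using only the standard library. The Naive Bayes variant here is Gaussian Naive Bayes, which is one common choice and an assumption on my part; the training data, the `gamma` value, and the small variance-smoothing constant are all hypothetical.

```python
import math

def rbf_kernel(x, z, gamma=1.0):
    """Similarity exp(-gamma * ||x - z||^2): 1 for identical points,
    decaying towards 0 as the points move apart."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)

def fit_gaussian_nb(X, y):
    """Estimate a prior plus a per-feature mean and variance for each class."""
    model = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # 1e-9 is an assumed smoothing term that avoids division by zero.
        variances = [sum((v - m) ** 2 for v in col) / n + 1e-9
                     for col, m in zip(columns, means)]
        model[label] = (n / len(X), means, variances)
    return model

def predict_gaussian_nb(model, x):
    """Pick the class with the highest log posterior (up to a constant)."""
    def log_posterior(label):
        prior, means, variances = model[label]
        score = math.log(prior)
        # The "naive" step: per-feature log likelihoods are simply added,
        # which treats features as independent given the class.
        for v, m, var in zip(x, means, variances):
            score += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        return score
    return max(model, key=log_posterior)

# Hypothetical two-feature training set with two well-separated classes.
X = [(1.0, 2.0), (1.2, 1.8), (0.8, 2.2), (3.0, 0.5), (3.2, 0.7), (2.8, 0.3)]
y = [0, 0, 0, 1, 1, 1]
model = fit_gaussian_nb(X, y)
print(predict_gaussian_nb(model, (1.1, 2.0)), predict_gaussian_nb(model, (3.0, 0.6)))
print(rbf_kernel((0, 0), (1, 0)), rbf_kernel((0, 0), (3, 0)))
```

The kernel never constructs the high-dimensional feature vectors; it only returns the similarity between two points, which is all a kernel SVM needs. Naive Bayes only has to rank the classes correctly, so its probability estimates can be poorly calibrated while its predicted labels are still right.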

Outcome
The student can implement kernel SVM and Naive Bayes and explain their respective assumptions.
Sub-units
3.1 Kernel SVM and Naive Bayes
Module 4

Decision Tree and Random Forest Classification

Led by Pearlian Causality Simulacrum

The question
A random forest trains each tree on a different bootstrap sample and, at each split, considers only a random subset of the features. Why does this double randomness produce better generalisation, and what can feature importances tell you about the problem?
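
The two sources of randomness can be sketched without building any trees. This is a simplified illustration: the helper names, the dataset dimensions, and the seed are hypothetical, and the feature subset is drawn once per tree here for brevity, whereas standard implementations typically redraw it at every split.

```python
import random
from collections import Counter

def bootstrap_indices(n_rows, rng):
    """Sample n_rows row indices with replacement, so some rows repeat
    and others are left out of this tree's training set."""
    return [rng.randrange(n_rows) for _ in range(n_rows)]

def feature_subset(n_features, k, rng):
    """Choose k distinct feature indices without replacement."""
    return sorted(rng.sample(range(n_features), k))

def majority_vote(predictions):
    """Combine the per-tree class predictions into one forest prediction."""
    return Counter(predictions).most_common(1)[0][0]

rng = random.Random(0)  # fixed seed so the sketch is repeatable
n_rows, n_features = 10, 4
k = 2  # roughly sqrt(n_features), a common choice for classification

for tree_id in range(3):
    rows = bootstrap_indices(n_rows, rng)
    feats = feature_subset(n_features, k, rng)
    print(f"tree {tree_id}: rows={rows} features={feats}")

# Hypothetical per-tree predictions for a single input.
forest_prediction = majority_vote(["spam", "ham", "spam"])
print(forest_prediction)
```

Because each tree sees different rows and different features, the trees make partly different errors, and voting averages some of those errors away.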

Outcome
The student can implement both, compare decision boundaries, and extract feature importances.
Sub-units
4.1 Decision Tree and Random Forest
Module 5

Model Evaluation and Classifier Selection

Led by Pearlian Causality Simulacrum

The question
A tool that predicts "negative" for every case in a 1% prevalence dataset has 99% accuracy. Why is this useless — and which metric should guide a medical screening classifier where false negatives cost lives?
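
The arithmetic behind this question can be checked directly. The sketch below builds a hypothetical dataset of 1,000 cases with 1% prevalence and scores the always-negative predictor. Returning 0.0 when a ratio is undefined is a convention assumed here, not the only possible choice.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 for binary labels 0 and 1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)

    accuracy = (tp + tn) / len(pairs)
    # Assumed convention: report 0.0 when the denominator is zero.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Hypothetical screening data: 10 positives among 1,000 cases.
y_true = [1] * 10 + [0] * 990
y_pred = [0] * 1000  # predict "negative" for every case

metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

The predictor scores 99% accuracy while its recall is zero: it misses every one of the ten positive cases. For a screening test where false negatives are the costly error, recall is the number to watch.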

Outcome
The student can compute precision, recall, and F1 and justify a classifier selection for a specific context.
Sub-units
5.1 Final Essay: Classifier Selection
