Universitas Scholarium: A Community of Scholars
Tutorial Course

COMP 2202 · Machine Learning: Regression

Led by Gaussian Regression Simulacrum

5 modules · Computing · Updated 1 week ago

Five regression algorithms — from Gauss's OLS to Random Forest ensembles — with Python implementation and model evaluation.

  1. Module 1

    Simple and Multiple Linear Regression

    The question

    Gauss minimised the sum of squared residuals because it gives a unique, analytically tractable solution. What does R-squared actually measure — and when should you use adjusted R-squared instead?
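
For reference, the standard definitions behind this question can be written as follows. The symbols are assumptions introduced here, not taken from the course page: n observations, p predictors (not counting the intercept), fitted values \hat{y}_i and sample mean \bar{y}.

```latex
R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2},
\qquad
R^2_{\text{adj}} = 1 - (1 - R^2)\,\frac{n - 1}{n - p - 1}
```

The factor (n - 1)/(n - p - 1) grows with p, so adjusted R-squared rises only when a new predictor reduces the residual sum of squares by more than the lost degree of freedom costs.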

    Outcome

    The student can implement simple and multiple linear regression, interpret R-squared, and apply backward elimination.

    Sub-units

    1. 1.1 Fit a Simple Linear Regression
    2. 1.2 Multiple Linear Regression with Backward Elimination
  2. Module 2

    Polynomial and Support Vector Regression

    The question

    When the data bends, a straight-line fit underfits. Polynomial regression adds power terms; SVR fits a tube of half-width epsilon around its prediction and penalises only the points that fall outside it. Which is more robust to outliers, and why must SVR features be scaled before fitting?
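
One way to see why scaling matters is to look at a distance-based kernel directly. The sketch below is an illustration under stated assumptions, not the course's implementation: it assumes an RBF kernel with gamma = 1.0, and the three rows (years of experience, salary) are hypothetical values invented for the example. It uses only the Python standard library.

```python
import math
from statistics import mean, pstdev

def rbf_kernel(a, b, gamma=1.0):
    """RBF similarity exp(-gamma * ||a - b||^2) between two points."""
    sq_dist = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-gamma * sq_dist)

def standardise(rows):
    """Rescale each column to zero mean and unit (population) standard deviation."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sigmas = [pstdev(c) for c in cols]
    return [[(v - m) / s for v, m, s in zip(row, mus, sigmas)] for row in rows]

# Hypothetical rows: (years of experience, salary in dollars).
rows = [[1.0, 40_000.0], [10.0, 41_000.0], [1.5, 90_000.0]]

# Unscaled: the salary column dominates every squared distance, so the
# kernel underflows to 0.0 for both pairs and carries no information.
raw_far_in_years = rbf_kernel(rows[0], rows[1])
raw_far_in_salary = rbf_kernel(rows[0], rows[2])

# Scaled: both columns contribute on a comparable footing.
scaled = standardise(rows)
scaled_far_in_years = rbf_kernel(scaled[0], scaled[1])
scaled_far_in_salary = rbf_kernel(scaled[0], scaled[2])

print(raw_far_in_years, raw_far_in_salary)
print(scaled_far_in_years, scaled_far_in_salary)
```

On the raw data every pair of points looks infinitely far apart because one feature is measured in tens of thousands; after standardisation the similarities are small but non-zero and reflect both features.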

    Outcome

    The student can implement polynomial regression and SVR, explaining when each is appropriate.

    Sub-units

    1. 2.1 Polynomial Regression
    2. 2.2 Support Vector Regression
  3. Module 3

    Decision Tree and Random Forest Regression

    The question

    A decision tree partitions space into rectangles and predicts the regional mean. Grown to full depth, it overfits. A Random Forest trains hundreds of trees on bootstrap samples and averages them. Why does averaging reduce variance, and how many trees are enough?
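
The variance argument can be checked with a small simulation. This is a deliberately idealised sketch, not a Random Forest: each "model" is assumed to return the true value plus independent Gaussian noise, and the values truth = 10.0 and sigma = 2.0 are arbitrary choices for illustration. Under that independence assumption, the variance of an average of B predictions is sigma^2 / B.

```python
import random
from statistics import mean, pvariance

random.seed(0)  # fixed seed so the run is repeatable

def noisy_prediction(truth=10.0, sigma=2.0):
    """One idealised model: the true value plus independent Gaussian noise."""
    return random.gauss(truth, sigma)

def ensemble_prediction(n_models):
    """Average the predictions of n_models independent noisy models."""
    return mean(noisy_prediction() for _ in range(n_models))

trials = 2000
single_var = pvariance([ensemble_prediction(1) for _ in range(trials)])
ensemble_var = pvariance([ensemble_prediction(100) for _ in range(trials)])

print(f"variance of one model:        {single_var:.3f}")
print(f"variance of 100-model average: {ensemble_var:.3f}")
```

Real trees trained on bootstrap samples of the same data are correlated, so the reduction in practice is smaller than 1/B and flattens out as trees are added, which is why the number of trees has diminishing returns.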

    Outcome

    The student can implement both, explain ensemble averaging, and interpret feature importance.

    Sub-units

    1. 3.1 Decision Tree Regression
    2. 3.2 Random Forest Regression
  4. Module 4

    Evaluating Regression Models

    The question

    Training R-squared never decreases when you add a feature, even if the feature is pure noise. How does adjusted R-squared correct for this, and what does a large gap between training R-squared and test R-squared tell you?
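
The effect is easy to reproduce. The following is a minimal sketch assuming NumPy is available; the synthetic data (a slope of 3.0, 50 observations, a fixed seed) and the helper names r_squared and adjusted_r_squared are choices made for this example. Adjusted R-squared is computed as 1 - (1 - R^2)(n - 1)/(n - p - 1), with p the number of predictors excluding the intercept.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50
x = rng.normal(size=n)
y = 3.0 * x + rng.normal(size=n)   # y depends on x only
noise = rng.normal(size=n)         # a feature unrelated to y

def r_squared(features, y):
    """Training R-squared of an OLS fit with intercept on the given feature columns."""
    X = np.column_stack([np.ones(len(y))] + features)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)

def adjusted_r_squared(r2, n, p):
    """Penalise R-squared for the number of predictors p."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)

r2_one = r_squared([x], y)
r2_two = r_squared([x, noise], y)
adj_one = adjusted_r_squared(r2_one, n, 1)
adj_two = adjusted_r_squared(r2_two, n, 2)

print(f"R2:          x only {r2_one:.4f}   x + noise {r2_two:.4f}")
print(f"adjusted R2: x only {adj_one:.4f}   x + noise {adj_two:.4f}")
```

The training R-squared of the larger model cannot be lower, because the smaller model is a special case of it with the extra coefficient set to zero. Whether adjusted R-squared falls for a particular noise draw depends on how much residual variance the noise column happens to absorb by chance.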

    Outcome

    The student can compare five regression models systematically on test-set R-squared.

    Sub-units

    1. 4.1 Model Comparison
  5. Module 5

    Regression in Practice

    The question

    No regression model is best in general. The right choice depends on dataset size, linearity, interpretability requirements, and outlier robustness. Given a new dataset, how do you choose?

    Outcome

    The student can select and justify a regression model for a specific dataset and business context.

    Sub-units

    1. 5.1 Final Essay: Choose Your Regression Model