Tutorial Course

COMP 2303 · Data Science: Statistics

Led by Ronald Fisher Simulacrum

5 modules · Computing · Updated 6 days ago

Descriptive and inferential statistics for data science — from mean and variance through confidence intervals to hypothesis testing and the p-value crisis.

Module 1

Descriptive Statistics: Describing What You Have

Led by Ronald Fisher Simulacrum

The question
Mean, median, mode — three summaries of the same data. Variance, standard deviation — how spread out it is. Correlation — how two variables move together. Why does the sample variance formula divide by n-1, and why does it matter?
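
As a rough illustration of the summaries named above, the following sketch uses Python's standard `statistics` module on a small made-up data set. The values, the helper names `pearson_r` and `average_variance_estimates`, and the simulation settings are assumptions chosen for illustration, not course material. The simulation hints at why n-1 matters: dividing by n tends to underestimate the population variance when the mean is itself estimated from the sample.

```python
import random
import statistics

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # made-up illustrative values

mean = statistics.mean(data)            # arithmetic average
median = statistics.median(data)        # middle value of the sorted data
mode = statistics.mode(data)            # most frequent value
pop_var = statistics.pvariance(data)    # divides by n
sample_var = statistics.variance(data)  # divides by n - 1 (Bessel's correction)
sample_sd = statistics.stdev(data)      # square root of sample_var


def pearson_r(xs, ys):
    """Pearson correlation: co-variation scaled by both spreads."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def average_variance_estimates(n=5, trials=20000, seed=0):
    """Average the divide-by-n and divide-by-(n-1) variance estimates over
    many samples from a standard normal population (true variance 1)."""
    rng = random.Random(seed)
    biased_total = 0.0
    unbiased_total = 0.0
    for _ in range(trials):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        m = sum(sample) / n
        ss = sum((x - m) ** 2 for x in sample)  # squared deviations from the sample mean
        biased_total += ss / n
        unbiased_total += ss / (n - 1)
    return biased_total / trials, unbiased_total / trials


avg_biased, avg_unbiased = average_variance_estimates()
print(mean, median, mode, pop_var, sample_var, sample_sd)
print(avg_biased, avg_unbiased)
```

With samples of size 5, the divide-by-n average should settle near (n-1)/n = 0.8 of the true variance, while the divide-by-(n-1) average should settle near the true value of 1.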

Outcome
The student can compute the core descriptive statistics and explain Bessel's correction.
Sub-units
1.1 Compute the Basics
1.2 Why n-1?
Module 2

Distributions and the Normal Curve

Led by Ronald Fisher Simulacrum

The question
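The central limit theorem says that, for independent draws from a distribution with finite variance, the distribution of the sample mean approaches a normal distribution as the sample size grows, whatever the shape of the underlying distribution. This is a large part of why the normal distribution appears so often in statistics. What does "sampling distribution" mean, and when do you use t instead of z?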
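
One way to see a sampling distribution is to simulate it. The sketch below draws many samples from a strongly skewed exponential distribution and looks at how the sample means behave. The sample size, trial count, seed, and the helper names `z_score` and `simulate_sample_means` are assumptions chosen for illustration.

```python
import random
import statistics


def z_score(x, mu, sigma):
    """Number of standard deviations that x lies from the mean mu."""
    return (x - mu) / sigma


def simulate_sample_means(n=50, trials=10000, seed=1):
    """Draw `trials` samples of size n from an exponential distribution
    (mean 1, standard deviation 1, right-skewed) and return each sample's mean."""
    rng = random.Random(seed)
    return [
        sum(rng.expovariate(1.0) for _ in range(n)) / n
        for _ in range(trials)
    ]


n = 50
means = simulate_sample_means(n=n)
standard_error = 1.0 / n ** 0.5  # sigma / sqrt(n), with sigma = 1 here

# Standardise each sample mean and count how many land within +/-1.96,
# the range that holds about 95% of a standard normal distribution.
within = sum(abs(z_score(m, 1.0, standard_error)) <= 1.96 for m in means)
fraction_within = within / len(means)
print(statistics.mean(means), statistics.stdev(means), fraction_within)
```

If the sample means are close to normal, the printed fraction should be near 0.95 even though the individual observations are far from normally distributed. The z-score here uses the known population standard deviation; when it has to be estimated from a small sample, the t distribution is the usual choice.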

Outcome
The student can compute z-scores, apply the CLT, and choose between z and t distributions.
Sub-units
2.1 Z-scores and the Normal Table
Module 3

Confidence Intervals

Led by Ronald Fisher Simulacrum

The question
A 95% confidence interval means that, over repeated sampling, about 95% of intervals constructed this way contain the true parameter. It does NOT mean there is a 95% probability that the parameter lies in this particular interval. What is the difference, and why does it matter?
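
The "95% of intervals" reading can be checked by simulation. This sketch repeatedly samples from a population whose mean is known, builds a t-based interval each time, and counts how often the interval covers the true mean. The population parameters, sample size, seed, and function names are illustrative assumptions; the critical value 2.262 is the two-sided 95% value for 9 degrees of freedom from a standard t table, so the sketch only applies to samples of size 10.

```python
import random
import statistics

# Two-sided 95% critical value of t with 9 degrees of freedom (n = 10),
# taken from a standard t table.
T_CRIT_DF9 = 2.262


def t_interval(sample, t_crit):
    """Confidence interval for the mean: centre +/- t_crit * s / sqrt(n)."""
    n = len(sample)
    centre = statistics.mean(sample)
    margin = t_crit * statistics.stdev(sample) / n ** 0.5
    return centre - margin, centre + margin


def coverage(true_mean=50.0, true_sd=10.0, n=10, trials=5000, seed=2):
    """Fraction of simulated intervals that contain the true mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, true_sd) for _ in range(n)]
        low, high = t_interval(sample, T_CRIT_DF9)
        hits += low <= true_mean <= high
    return hits / trials


rate = coverage()
print(rate)
```

The coverage rate describes the procedure across many samples. Any single interval either contains the fixed true mean or it does not, which is why the 95% figure is not a probability statement about one particular interval.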

Outcome
The student can construct confidence intervals and explain precisely what 95% confidence means.
Sub-units
3.1 Construct and Interpret
Module 4

Hypothesis Testing

Led by Ronald Fisher Simulacrum

The question
p < 0.05: reject the null. But what does a p-value actually say — and what are the most dangerous ways people misread it?
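
To make the meaning of a p-value concrete, here is a minimal sketch of a one-sample z-test with a known standard deviation, followed by a simulation in which the null hypothesis is true by construction. The test choice, the numbers, the seed, and the function names are assumptions for illustration; the course may use a different test.

```python
import random
from statistics import NormalDist


def two_sided_p_value(sample_mean, mu0, sigma, n):
    """p-value of a one-sample z-test with known sigma: the probability,
    assuming the null mean mu0 is true, of a sample mean at least this
    far from mu0 in either direction."""
    z = (sample_mean - mu0) / (sigma / n ** 0.5)
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def false_positive_rate(n=30, trials=5000, alpha=0.05, seed=3):
    """Run the test repeatedly on data for which the null really is true
    and return the fraction of runs that reject it."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        p = two_sided_p_value(sum(sample) / n, mu0=0.0, sigma=1.0, n=n)
        rejections += p < alpha
    return rejections / trials


p_example = two_sided_p_value(sample_mean=10.392, mu0=10.0, sigma=2.0, n=100)
rate = false_positive_rate()
print(p_example, rate)
```

In the simulation roughly 5% of tests reject even though the null is always true. A p-value is computed under the assumption that the null holds, so it is not the probability that the null is true given the data.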

Outcome
The student can conduct a hypothesis test and explain the common misinterpretations of p-values.
Sub-units
4.1 Test the Claim
4.2 The p-value Problem
Module 5

Statistics in Practice: From Description to Decision

Led by Ronald Fisher Simulacrum

The question
The replication crisis was partly caused by over-reliance on p-values. A study reporting p=0.04 with n=12 is not the same evidence as one reporting p=0.04 with n=1200: the same p-value can correspond to very different effect sizes and levels of precision. What should accompany a p-value, and how do you design studies that produce trustworthy conclusions?
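
The contrast between n=12 and n=1200 can be put in numbers with a back-of-the-envelope calculation. The sketch below assumes a one-sample z-test and asks what standardised effect (Cohen's d) would produce a two-sided p-value of 0.04 at each sample size. The z approximation is crude for n=12, where a t-based calculation would be more appropriate, and the function names are hypothetical.

```python
from statistics import NormalDist


def implied_effect_size(p_value, n):
    """Standardised mean difference (Cohen's d) that would yield the given
    two-sided p-value in a one-sample z-test with n observations."""
    z = NormalDist().inv_cdf(1.0 - p_value / 2.0)
    return z / n ** 0.5


def ci_half_width(n, confidence=0.95):
    """Half-width of the confidence interval for d, in standard-deviation
    units, under the same z approximation."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    return z / n ** 0.5


for n in (12, 1200):
    d = implied_effect_size(0.04, n)
    print(n, round(d, 3), round(ci_half_width(n), 3))
```

Under these assumptions the small study implies an effect of roughly 0.59 standard deviations and the large one roughly 0.06, with interval widths that differ by the same factor of ten. Reporting the effect size and its interval alongside the p-value makes that difference visible.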

Outcome
The student can report effect sizes, check assumptions, and take a defended position on statistical best practice.
Sub-units
5.1 Final Essay: Beyond the p-value
