Tutorial Course

CRDS 1002 · Probability and Statistical Thinking

Led by R.A. Fisher Simulacrum

5 modules · ~30 hours · Interdisciplinary School · Updated 2 days ago

Probability and statistical thinking from frequentist, Bayesian, and subjective interpretations through distributions, hypothesis testing, correlation and causation, and Bayesian reasoning.

Module 1

Probability as a Language: Frequency, Bayesian, and Subjective Interpretations

Led by R.A. Fisher Simulacrum

The question
Probability is a number between 0 and 1 that expresses how likely something is. But what does "likely" mean? Three interpretations compete: the frequentist, the Bayesian, and the subjective. The frequentist says that probability is the long-run relative frequency of an event: flip a fair coin 10,000 times and it lands heads approximately 5,000 times, so the probability of heads is 0.5. The Bayesian says that probability is a degree of belief.
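The frequentist reading can be illustrated with a small simulation. The sketch below is only an illustration, not part of the course material: the function name `heads_frequency` and the fixed seed are assumptions chosen for repeatability. It shows the relative frequency of heads settling near 0.5 as the number of flips grows.

```python
import random


def heads_frequency(n_flips: int, seed: int = 0) -> float:
    """Return the fraction of heads in n_flips simulated fair-coin flips."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    heads = sum(rng.random() < 0.5 for _ in range(n_flips))
    return heads / n_flips


# The relative frequency wanders for small n and stabilises for large n.
for n in (10, 100, 10_000, 1_000_000):
    print(f"{n:>9} flips: relative frequency of heads = {heads_frequency(n):.4f}")
```

The exact figures depend on the seed; what matters is that the deviation from 0.5 shrinks as the number of flips increases.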

Outcome
The student can describe the three interpretations of probability (frequentist, Bayesian, subjective), state the three Kolmogorov axioms, define independence and conditional probability, and explain the difference between P(A|B) and P(B|A). (Probability as a language)
Sub-units
1.1 The Frequentist Interpretation: Probability as Long-Run Frequency
1.2 The Bayesian Interpretation: Probability as Degree of Belief
1.3 The Kolmogorov Axioms: The Mathematical Foundation
1.4 Independence and Conditional Probability
1.5 The Law of Total Probability and Bayes' Theorem in Action
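The difference between P(A|B) and P(B|A), and the definition of independence, can be checked by brute force on a tiny sample space. The sketch below assumes a single fair six-sided die; the helper names `prob` and `cond_prob` and the chosen events are illustrative, not taken from the course.

```python
from fractions import Fraction

OUTCOMES = frozenset(range(1, 7))  # sample space of one fair six-sided die


def prob(event):
    """P(event) when all six outcomes are equally likely."""
    return Fraction(len(event & OUTCOMES), len(OUTCOMES))


def cond_prob(a, b):
    """P(a | b) = P(a and b) / P(b), defined only when P(b) > 0."""
    return prob(a & b) / prob(b)


six = frozenset({6})
even = frozenset({2, 4, 6})
at_most_four = frozenset({1, 2, 3, 4})

# Kolmogorov axioms on this space: non-negativity, total probability 1,
# and additivity for disjoint events.
assert prob(six) >= 0
assert prob(OUTCOMES) == 1
assert prob(six | frozenset({1})) == prob(six) + prob(frozenset({1}))

print("P(six | even) =", cond_prob(six, even))   # 1/3
print("P(even | six) =", cond_prob(even, six))   # 1

# Independence: P(A and B) equals P(A) * P(B).
independent = prob(even & at_most_four) == prob(even) * prob(at_most_four)
print("even and at_most_four independent:", independent)  # True
```

Swapping the two events changes the answer from 1/3 to 1, which is the point of distinguishing P(A|B) from P(B|A).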
Module 2

Distributions, Means, and the Central Limit Theorem

Led by R.A. Fisher Simulacrum

The question
A single number tells you almost nothing. The average income in a country could be £30,000, but if half the population earns £10,000 and half earns £50,000, nobody earns anything close to the average and the figure misleads. The distribution tells you far more: not just where the centre is (the mean), but how spread out the values are (the variance), whether they are symmetrical (as in the normal distribution) or skewed (as in a typical income distribution), and how extreme the extremes can be (the tails).
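The income example can be reproduced in a few lines. This is a minimal sketch assuming a population of 100 people split evenly between the two incomes in the text; the comparison population in which everyone earns £30,000 is an added assumption for contrast.

```python
import statistics

# Half the population earns 10,000 and half earns 50,000 (figures from the text).
split_incomes = [10_000] * 50 + [50_000] * 50
# Hypothetical comparison: everyone earns exactly the average.
equal_incomes = [30_000] * 100

mean_split = statistics.mean(split_incomes)
mean_equal = statistics.mean(equal_incomes)
spread_split = statistics.pstdev(split_incomes)  # population standard deviation
spread_equal = statistics.pstdev(equal_incomes)
earning_the_mean = sum(1 for x in split_incomes if x == mean_split)

print("mean (split, equal):  ", mean_split, mean_equal)      # 30000 30000
print("spread (split, equal):", spread_split, spread_equal)  # 20000.0 0.0
print("people earning the mean in the split population:", earning_the_mean)  # 0
```

Both populations share the same mean, yet only the standard deviation reveals that they look nothing alike.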

Outcome
The student can describe three measures of centre and when each is appropriate, describe the normal distribution and the 68-95-99.7 rule, explain skewness and why it makes the mean misleading, and state the Central Limit Theorem and why it is foundational. (Distributions and the CLT)
Sub-units
2.1 Mean, Median, and Mode: Three Centres, Three Stories
2.2 Variance and Standard Deviation: Measuring Spread
2.3 The Normal Distribution: The Bell Curve and the 68-95-99.7 Rule
2.4 Skewness: When the Average Lies
2.5 The Central Limit Theorem: The Foundation of Statistical Inference
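The Central Limit Theorem named above can be seen in a simulation: means of samples drawn from a strongly skewed distribution still cluster in a roughly normal shape. The sketch below is an illustration under assumed settings (an exponential distribution with mean 1, samples of size 50, 5,000 repetitions, a fixed seed); none of these choices come from the course itself.

```python
import random
import statistics

rng = random.Random(42)   # fixed seed for a repeatable run
SAMPLE_SIZE = 50          # observations per sample
N_SAMPLES = 5_000         # number of sample means to collect


def sample_mean(n):
    """Mean of n draws from an exponential distribution (mean 1, sd 1, right-skewed)."""
    draws = [rng.expovariate(1.0) for _ in range(n)]
    return sum(draws) / n


means = [sample_mean(SAMPLE_SIZE) for _ in range(N_SAMPLES)]
centre = statistics.fmean(means)
spread = statistics.stdev(means)
within_2sd = sum(abs(m - centre) <= 2 * spread for m in means) / N_SAMPLES

print(f"mean of sample means: {centre:.3f} (theory: 1.000)")
print(f"sd of sample means:   {spread:.3f} (theory: {1 / SAMPLE_SIZE ** 0.5:.3f})")
print(f"share within 2 sd:    {within_2sd:.3f} (normal rule of thumb: about 0.95)")
```

The spread of the sample means shrinks like the population standard deviation divided by the square root of the sample size, and about 95% of them fall within two standard deviations of the centre, as the 68-95-99.7 rule suggests for a normal shape.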
Module 3

Hypothesis Testing: What a p-Value Actually Means

Led by R.A. Fisher Simulacrum

The question
A pharmaceutical company claims its new drug reduces blood pressure. A clinical trial shows a 5 mmHg reduction in the treatment group compared to the placebo group. Is this a real effect, or could it have happened by chance? Hypothesis testing is the framework for answering this question — and the p-value is the number at its centre. The p-value is also the most misunderstood number in science: it does NOT tell you the probability that the drug works.
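One way to make the "could it have happened by chance?" question concrete is a permutation test. The sketch below uses invented data: the two lists of blood pressure reductions are hypothetical values chosen so that the group means differ by 5 mmHg, and the permutation test is one illustrative technique, not necessarily the method the module teaches.

```python
import random

# HYPOTHETICAL reductions in blood pressure (mmHg); invented for illustration only.
treatment = [12, 9, 14, 7, 11, 13, 8, 10, 15, 11]
placebo = [6, 4, 8, 5, 7, 3, 9, 6, 5, 7]


def mean(values):
    return sum(values) / len(values)


observed = mean(treatment) - mean(placebo)  # 5.0 mmHg

# Null hypothesis: the group labels are irrelevant. Shuffle the labels many
# times and count how often chance alone yields a difference at least this large.
rng = random.Random(0)
pooled = treatment + placebo
n_treat = len(treatment)
N_PERMUTATIONS = 10_000
at_least_as_extreme = 0
for _ in range(N_PERMUTATIONS):
    rng.shuffle(pooled)
    diff = mean(pooled[:n_treat]) - mean(pooled[n_treat:])
    if diff >= observed:
        at_least_as_extreme += 1

# One-sided p-value, with the +1 correction so it is never exactly zero.
p_value = (at_least_as_extreme + 1) / (N_PERMUTATIONS + 1)
print(f"observed difference: {observed:.1f} mmHg")
print(f"one-sided p-value:   {p_value:.4f}")
```

The p-value printed here is the probability, assuming the null hypothesis is true, of seeing a difference at least as large as the observed one. It is not the probability that the drug works.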

Outcome
The student can define the null and alternative hypotheses, correctly interpret a p-value, distinguish statistical significance from practical significance, describe Type I and Type II errors, and explain three common misinterpretations of the p-value. (Hypothesis testing)
Sub-units
3.1 The Null and Alternative Hypotheses
3.2 The p-Value: What It Actually Says
3.3 Statistical Significance and the α Threshold
3.4 Type I and Type II Errors: False Positives and False Negatives
3.5 Three Things the p-Value Is Not
Module 4

Correlation, Causation, and Simpson's Paradox

Led by Pearson Simulacrum

The question
Correlation is not causation — every statistics student learns this. But few can explain precisely why, and fewer still can identify the mechanisms by which correlation misleads. Ice cream sales correlate with drowning deaths — but ice cream does not cause drowning (both increase in summer).
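The ice cream example can be simulated to show how a shared cause manufactures a correlation. In the sketch below every number is an assumption made for illustration: temperature drives both ice cream sales and drownings, and neither affects the other. The helper names `pearson_r` and `residuals` are likewise illustrative.

```python
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient r of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def residuals(ys, xs):
    """What is left of ys after removing a least-squares straight-line fit on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    return [(y - my) - slope * (x - mx) for x, y in zip(xs, ys)]


rng = random.Random(1)
# HYPOTHETICAL model: temperature is the only cause of both variables.
temps = [rng.uniform(0, 30) for _ in range(2000)]
ice_cream = [10 * t + rng.gauss(0, 30) for t in temps]
drownings = [0.2 * t + rng.gauss(0, 1) for t in temps]

raw_r = pearson_r(ice_cream, drownings)
adjusted_r = pearson_r(residuals(ice_cream, temps), residuals(drownings, temps))

print(f"r(ice cream, drownings):              {raw_r:.2f}")
print(f"r after removing temperature's effect: {adjusted_r:.2f}")
```

The raw correlation is strong even though the simulated ice cream sales have no effect on drownings; once temperature is accounted for, the association all but disappears.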

Outcome
The student can compute and interpret the correlation coefficient r, explain three causal structures (direct, confounding, mediation), explain why the randomised controlled trial is the gold standard for establishing causation, and describe Simpson's Paradox with an example. (Correlation and causation)
Sub-units
4.1 The Correlation Coefficient: Measuring Association
4.2 Spurious Correlation and the Confounding Variable
4.3 The Three Causal Structures: Direct, Confounding, and Mediation
4.4 The Randomised Controlled Trial: The Gold Standard
4.5 Simpson's Paradox: When Aggregation Reverses the Truth
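Simpson's Paradox is easiest to believe after checking the arithmetic. The sketch below uses invented recovery counts for two hypothetical treatments, A and B, split by case severity; the numbers are chosen only to produce the reversal and do not come from any study.

```python
from fractions import Fraction

# HYPOTHETICAL counts: (recovered, treated) per treatment and severity group.
counts = {
    "A": {"mild": (18, 20), "severe": (48, 80)},
    "B": {"mild": (68, 80), "severe": (10, 20)},
}


def rate(recovered, treated):
    """Exact recovery rate as a fraction."""
    return Fraction(recovered, treated)


def overall(treatment):
    """Recovery rate after pooling both severity groups."""
    groups = counts[treatment].values()
    return rate(sum(r for r, _ in groups), sum(t for _, t in groups))


for severity in ("mild", "severe"):
    a = rate(*counts["A"][severity])
    b = rate(*counts["B"][severity])
    print(f"{severity:>7}: A = {float(a):.0%}, B = {float(b):.0%}")

print(f"overall: A = {float(overall('A')):.0%}, B = {float(overall('B')):.0%}")
```

Treatment A wins within each severity group (90% against 85%, and 60% against 50%), yet B wins in the pooled figures (78% against 66%), because A was given mostly to severe cases and B mostly to mild ones.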
Module 5

Base Rate Neglect and Conditional Probability: Thinking with Bayes

Led by R.A. Fisher Simulacrum

The question
A test for a rare disease has 99% accuracy: it correctly flags 99% of people who have the disease and correctly clears 99% of people who do not. You test positive. What is the probability you have the disease? If you answered "99%," you have committed base rate neglect, the most common and consequential error in probabilistic reasoning. If the disease affects 1 in 10,000 people, the true probability is approximately 1%, because the few true positives are swamped by false positives from the much larger healthy population. This module teaches the Bayesian framework for updating beliefs with evidence, the antidote to base rate neglect.
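The "approximately 1%" figure can be verified directly with Bayes' theorem. The sketch below assumes that "99% accuracy" means both sensitivity and specificity are 0.99; the function name `posterior` and the population of one million used for the natural-frequency check are illustrative choices.

```python
def posterior(prior, sensitivity, specificity):
    """P(disease | positive test) via Bayes' theorem."""
    true_positive = sensitivity * prior
    false_positive = (1 - specificity) * (1 - prior)
    return true_positive / (true_positive + false_positive)


p = posterior(prior=1 / 10_000, sensitivity=0.99, specificity=0.99)
print(f"P(disease | positive) = {p:.4f}")  # 0.0098, i.e. about 1%

# The same answer as natural frequencies, for an imagined 1,000,000 people.
population = 1_000_000
sick = population // 10_000             # 100 people have the disease
healthy = population - sick             # 999,900 do not
sick_positive = sick * 99 // 100        # 99 true positives
healthy_positive = healthy // 100       # 9,999 false positives
by_counting = sick_positive / (sick_positive + healthy_positive)
print(f"{sick_positive} of {sick_positive + healthy_positive} positives are sick")
```

Counting heads gives 99 true positives among 10,098 positive results, the same value of roughly 1% that the formula returns.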

Outcome
The student can define base rate neglect, apply Bayes' theorem to a medical test scenario, convert between probabilities and natural frequencies, describe iterative Bayesian updating, and apply Bayesian reasoning to an everyday decision. (Bayesian thinking)
Sub-units
5.1 Base Rate Neglect: The Error the Brain Is Wired to Make
5.2 Bayes' Theorem in Full: The Calculation Step by Step
5.3 Natural Frequencies: The Human-Readable Version
5.4 Iterative Updating: When New Evidence Arrives
5.5 Bayesian Reasoning in Everyday Life
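Iterative updating means that yesterday's posterior becomes today's prior. The sketch below repeats the rare-disease test three times, under two assumptions that are not in the course text: sensitivity and specificity are both 0.99, and repeated test results are independent given the person's true disease status. The function name `update` is illustrative.

```python
def update(prior, sensitivity=0.99, specificity=0.99):
    """Belief in disease after one more positive result, starting from prior."""
    true_positive = sensitivity * prior
    false_positive = (1 - specificity) * (1 - prior)
    return true_positive / (true_positive + false_positive)


belief = 1 / 10_000  # base rate: 1 in 10,000
beliefs = []
for test_number in (1, 2, 3):
    belief = update(belief)  # the posterior becomes the next prior
    beliefs.append(belief)
    print(f"after positive test {test_number}: P(disease) = {belief:.4f}")
```

Under these assumptions one positive result leaves the probability near 1%, a second lifts it to roughly 50%, and a third to about 99%. In practice, repeated tests on the same person are often correlated, so real updates would be weaker.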
