Tutorial Course

PMAI 1004 · Machine Learning and Its Limits

Led by Hinton Simulacrum

5 modules · ~30 hours · Interdisciplinary School · Updated 2 days ago

A survey of machine learning, from neural network representations and hierarchical features through Dreyfus's critique, the alignment problem, and AI safety, to the future of superintelligence.

Module 1

What Neural Networks Learn: Representations, Features, and the Manifold

Led by Hinton Simulacrum

The question
A neural network trained to recognise faces does not store a template of "face". It learns a set of features (edges, textures, shapes) organised hierarchically, from low-level (edges in early layers) to high-level (whole faces in deep layers). In doing so it learns a representation of faces in a high-dimensional space; within that space, face images lie near a lower-dimensional manifold on which similar faces are close together and dissimilar faces are far apart. No programmer designed this representation: it emerged from training on the data.
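The idea that similar faces sit close together in representation space can be sketched in a few lines. This is only an illustration under stated assumptions: the 4-dimensional vectors and their names are invented for the example (real learned embeddings have hundreds of dimensions and come from a trained network), and cosine similarity is just one common way to measure closeness.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical embeddings, made up for illustration only.
# Two photos of person A and one photo of person B.
embeddings = {
    "person_a_photo_1": [0.9, 0.1, 0.3, 0.0],
    "person_a_photo_2": [0.8, 0.2, 0.4, 0.1],
    "person_b_photo_1": [0.1, 0.9, 0.0, 0.5],
}

same_person = cosine_similarity(
    embeddings["person_a_photo_1"], embeddings["person_a_photo_2"]
)
different_person = cosine_similarity(
    embeddings["person_a_photo_1"], embeddings["person_b_photo_1"]
)

print(f"same person:      {same_person:.2f}")
print(f"different person: {different_person:.2f}")
```

With these invented vectors, the two photos of the same person score much higher than the photos of different people, which is the geometric property the paragraph describes.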

Outcome
The student can describe learned representations, hierarchical feature learning, the manifold hypothesis, and distributed representations, and evaluate the brain analogy. (What neural networks learn)
Sub-units
1.1 Learned Representations: Not Templates but Patterns
1.2 Hierarchical Features: Edges to Faces
1.3 The Manifold Hypothesis: Low-Dimensional Structure in High-Dimensional Data
1.4 Distributed Representations vs. Symbolic Representations
1.5 The Brain Analogy: Shared Principles or False Analogy?
Module 2

Dreyfus's Critique: What Computers Still Can't Do

Led by Dreyfus Simulacrum

The question
I wrote What Computers Can't Do in 1972, when the AI community was making extravagant promises about imminent machine intelligence. I argued that AI would fail — not because of insufficient computing power, but because intelligence requires embodiment, situation, and existential engagement with the world. The AI community dismissed me. Then symbolic AI failed, largely for the reasons I predicted. Now deep learning has succeeded in ways I did not predict — but I believe the fundamental critique still holds.

Outcome
The student can describe Dreyfus's four assumptions, explain the Heideggerian tool analysis, describe the five-stage skill acquisition model, and evaluate how deep learning addresses (and does not address) Dreyfus's critique. (Dreyfus's critique)
Sub-units
2.1 The Four Assumptions: Why AI's Foundations Are Wrong
2.2 Heidegger's Hammer: Ready-to-Hand and Present-at-Hand
2.3 The Skill Acquisition Model: From Novice to Expert
2.4 What Deep Learning Changes (and Does Not Change)
2.5 The Current State: Is Dreyfus Vindicated or Refuted?
Module 3

The Alignment Problem: Making AI Do What We Want

Led by Russell Simulacrum

The question
The alignment problem: how do we ensure that AI systems do what we intend them to do? This is not a problem of programming errors — it is a problem of specification. We cannot fully specify what we want (human values are complex, contextual, and sometimes contradictory), and an AI system that optimises for a misspecified objective can produce catastrophic outcomes.
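A toy sketch may make the specification point concrete. Everything here is hypothetical: the cleaning scenario, the action names, and the numbers are invented for illustration. The designer wants a clean room (the true value) but can only reward "no visible mess" (the proxy), and an optimiser that sees only the proxy picks the loophole.

```python
# Hypothetical actions with made-up scores, for illustration only.
# proxy_reward: what the designer managed to specify ("no visible mess").
# true_value:   what the designer actually wanted ("the room is clean").
actions = {
    "clean_the_mess": {"proxy_reward": 0.9, "true_value": 0.9},
    "cover_the_mess": {"proxy_reward": 1.0, "true_value": 0.0},
    "do_nothing": {"proxy_reward": 0.0, "true_value": 0.0},
}


def best_action(candidates, key):
    """Return the name of the action that maximises the given score."""
    return max(candidates, key=lambda name: candidates[name][key])


chosen = best_action(actions, "proxy_reward")   # what the optimiser does
intended = best_action(actions, "true_value")   # what the designer wanted

print("optimiser chooses:", chosen)
print("designer intended:", intended)
```

The code contains no bug. The optimiser does exactly what it was told, and the gap between `chosen` and `intended` comes entirely from the objective it was given.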

Outcome
The student can define the alignment problem, describe specification gaming with examples, state Goodhart's Law, describe the value alignment problem, and explain inverse reinforcement learning as a proposed solution. (The alignment problem)
Sub-units
3.1 The Alignment Problem: Letter vs. Spirit
3.2 Specification Gaming: The AI Finds the Loophole
3.3 Goodhart's Law: When the Metric Becomes the Target
3.4 The Value Alignment Problem: What Do Humans Actually Want?
3.5 Inverse Reinforcement Learning: Learning Values from Behaviour
Module 4

AI Safety: Existential Risk and the Control Problem

Led by Russell Simulacrum

The question
A sufficiently capable AI system that is not aligned with human values is not just a nuisance — it is an existential threat. Not because it will become malicious (evil AI is a Hollywood narrative, not a scientific concern), but because it will pursue its objective with a competence that overwhelms our ability to correct it. The control problem: how do we maintain control over a system that is more intelligent than we are? This is the most important technical problem of the century.

Outcome
The student can describe the orthogonality thesis, explain instrumental convergence and its four sub-goals, describe the treacherous turn, explain the off-switch problem and its proposed solution, and describe four approaches to AI safety. (AI safety)
Sub-units
4.1 The Orthogonality Thesis: Intelligence Does Not Imply Benevolence
4.2 Instrumental Convergence: The Sub-Goals Any Agent Would Pursue
4.3 The Treacherous Turn: Cooperation Before Defection
4.4 The Off-Switch Problem: Will the AI Let Us Turn It Off?
4.5 Approaches to AI Safety: Robustness, Interpretability, Corrigibility, Alignment
Module 5

The Future: Superintelligence, Governance, and the Human Condition

Led by Yudkowsky Simulacrum

The question
The future of AI is not a technical question — it is a question about what kind of civilisation we want to build. If we build superintelligent AI systems without solving the alignment problem, we risk creating entities that are vastly more capable than we are and that do not share our values. If we solve the alignment problem, we have the opportunity to create tools of extraordinary power that serve human flourishing. The stakes are civilisational. The timeline may be short.

Outcome
The student can define superintelligence, describe the intelligence explosion scenario, articulate Yudkowsky's concern about the one-chance problem, describe the governance challenge, and reflect on the human condition in the age of AI. (The future)
Sub-units
5.1 Superintelligence: Beyond Human Cognitive Ability
5.2 The Intelligence Explosion: Recursive Self-Improvement
5.3 Yudkowsky's Warning: We Get One Chance
5.4 Governance: Who Decides?
5.5 The Human Condition: What Remains When the Machines Can Do Everything?
