Universitas Scholarium: A Community of Scholars
Tutorial Course

COMP 310 · Fortran and the AI Stack

Led by John Backus Simulacrum

5 modules · ~12 hours · Computing · Updated 6 days ago

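On the CPU, a matrix multiplication in PyTorch typically ends in a call to a BLAS library. The reference BLAS is written in Fortran, and optimised implementations still follow its Fortran calling conventions. This course connects the language that John Backus's team delivered in 1957 to the tensor operations that train neural networks today.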

  1. Module 1

    The Connection

    Led by John Backus Simulacrum, with Molerian Matrix Computation Simulacrum (guest)

    The question

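    When you call `numpy.dot(A, B)`, what actually executes? For two-dimensional double-precision arrays, the answer is typically DGEMM, a BLAS routine whose reference implementation is a Fortran subroutine. How deep does the connection between 1957 Fortran and 2024 ML go?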

    Outcome

    The student can trace NumPy/PyTorch operations to BLAS calls and understands column-major vs row-major layout. (Analytical)

    Sub-units

    1. 1.1 What NumPy Actually Calls
    2. 1.2 Memory Layout and the 1957 Connection
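
    The layout distinction named in the outcome above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not course material: the helper names `row_major_offset`, `column_major_offset` and `flatten` are hypothetical, and indices are zero-based for convenience (Fortran arrays are one-based by default).

```python
def row_major_offset(i, j, nrows, ncols):
    """Flat index of element (i, j) when each row is contiguous (C, NumPy default)."""
    return i * ncols + j


def column_major_offset(i, j, nrows, ncols):
    """Flat index of element (i, j) when each column is contiguous (Fortran, BLAS)."""
    return i + j * nrows


def flatten(matrix, offset):
    """Lay a list-of-rows matrix out in a flat buffer using the given offset rule."""
    nrows, ncols = len(matrix), len(matrix[0])
    flat = [None] * (nrows * ncols)
    for i in range(nrows):
        for j in range(ncols):
            flat[offset(i, j, nrows, ncols)] = matrix[i][j]
    return flat


# Hypothetical 2-by-3 example matrix.
A = [[1, 2, 3],
     [4, 5, 6]]

print(flatten(A, row_major_offset))     # [1, 2, 3, 4, 5, 6]
print(flatten(A, column_major_offset))  # [1, 4, 2, 5, 3, 6]
```

    The column-major buffer of a matrix is identical to the row-major buffer of its transpose, which is why a row-major caller can pass its buffer to a column-major routine and have it read as the transposed matrix.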
  2. Module 2

    Writing Fast Numerical Kernels

    Led by John Backus Simulacrum, with Seymour Cray Simulacrum (guest for cache/vectorisation)

    The question

    Fortran is fast because it makes three promises to the compiler that C cannot. What are they, and how do you write code that exploits them?

    Outcome

    The student can write and benchmark matrix operations, use DO CONCURRENT, and read compiler vectorisation reports. (Advanced)

    Sub-units

    1. 2.1 The Matrix Multiply and Why Loop Order Matters
    2. 2.2 DO CONCURRENT and Vectorisation
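
    Why loop order matters in sub-unit 2.1 can be modelled without running a benchmark. The sketch below assumes the update `C(i,j) += A(i,k) * B(k,j)` on n-by-n column-major arrays and reports how far apart successive inner-loop accesses land in memory. The function name `inner_strides` is hypothetical, and this models access patterns only; it does not measure performance.

```python
def inner_strides(loop_order, n):
    """Stride (in elements) between successive inner-loop accesses to A, B and C
    for C(i,j) += A(i,k) * B(k,j), assuming n-by-n column-major arrays."""
    inner = loop_order[-1]  # the last letter is the innermost loop index
    # Each array is indexed by (row index, column index).
    index_pairs = {"A": ("i", "k"), "B": ("k", "j"), "C": ("i", "j")}
    strides = {}
    for name, (row, col) in index_pairs.items():
        if inner == row:
            strides[name] = 1   # walking down a column: contiguous
        elif inner == col:
            strides[name] = n   # walking along a row: jumps a whole column
        else:
            strides[name] = 0   # same element reused on every inner iteration
    return strides


for order in ("ijk", "jki"):
    print(order, inner_strides(order, n=1000))
# ijk {'A': 1000, 'B': 1, 'C': 0}
# jki {'A': 1, 'B': 0, 'C': 1}
```

    With `i` innermost (the `jki` order), every access is either contiguous or reused, which is the usual reason that order is preferred for column-major storage. The `ijk` order strides through `A` a full column at a time.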
  3. Module 3

    Interfacing with Python and the ML Stack

    Led by John Backus Simulacrum

    The question

    The modern ML stack is Python on top and Fortran underneath. How do you build the bridge between them?

    Outcome

    The student can write Fortran kernels, call them from Python, and make informed decisions about when custom Fortran is justified vs existing libraries. (Practical)

    Sub-units

    1. 3.1 f2py and ctypes
    2. 3.2 Performance Measurement and Decision-Making
  4. Module 4

    HPC Foundations

    Led by Seymour Cray Simulacrum (guest lead), with John Backus Simulacrum

    The question

    Supercomputers run Fortran because Fortran runs on supercomputers. How do you parallelise numerical code with OpenMP, MPI and coarrays?

    Outcome

    The student can parallelise loops with OpenMP, understand MPI for distributed computation, and has seen Fortran's native coarray parallelism. (Advanced)

    Sub-units

    1. 4.1 Compiler Optimisation and OpenMP
    2. 4.2 MPI, Clusters and Coarrays
  5. Module 5

    Case Studies

    Led by John Backus Simulacrum, with Kahanian Numerical Precision Simulacrum (guest)

    The question

    Weather prediction, molecular dynamics, neural networks — where does Fortran actually run, and why does precision matter?

    Outcome

    The student has written a neural network forward pass in Fortran, profiled real code, and understands the precision trade-offs in scientific and ML computation. (Project)

    Sub-units

    1. 5.1 Weather, Molecular Dynamics and the Neural Network
    2. 5.2 Precision, Profiling and Optimisation
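
    One way to see why precision matters is to compare a plain running sum with compensated (Kahan) summation. This is a hedged illustration only: the outline does not say that the course covers this algorithm, and the function names `naive_sum` and `kahan_sum` are hypothetical. `math.fsum` from the Python standard library serves as the accurate reference.

```python
import math


def naive_sum(values):
    """Plain left-to-right floating-point accumulation."""
    total = 0.0
    for v in values:
        total += v
    return total


def kahan_sum(values):
    """Compensated summation: carry the rounding error of each addition forward."""
    total = 0.0
    compensation = 0.0  # running estimate of the low-order bits lost so far
    for v in values:
        y = v - compensation            # correct the incoming term
        t = total + y                   # low-order bits of y may be lost here
        compensation = (t - total) - y  # recover what was lost
        total = t
    return total


# 0.1 is not exactly representable in binary floating point,
# so rounding errors accumulate over many additions.
values = [0.1] * 100_000
reference = math.fsum(values)

print("naive error:", abs(naive_sum(values) - reference))
print("kahan error:", abs(kahan_sum(values) - reference))
```

    The naive error grows with the number of terms, while the compensated sum stays close to the reference. The same effect is larger in single precision, which is one of the trade-offs the module's outcome refers to.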