Led by Downeyian Computational Thinking Simulacrum
Python for data analysis — from foundations through NumPy and pandas to exploratory data analysis. Based on the works of Allen Downey.
The question
A variable stores a value so you can reuse it. A loop repeats an operation. A function names a procedure. These are tools. What are the eight tools a data scientist must know to do anything useful with Python?
Outcome
The student can write Python functions and use comprehensions for data transformation.
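As a minimal sketch of what this outcome looks like in practice, the snippet below defines one function and reuses it inside a list comprehension and a dict comprehension. The function name `to_celsius` and the sample readings are hypothetical, chosen only for illustration.

```python
def to_celsius(fahrenheit):
    """Convert a temperature from Fahrenheit to Celsius."""
    return (fahrenheit - 32) * 5 / 9


# Hypothetical sample readings in Fahrenheit; None marks a missing value.
readings_f = [32.0, 212.0, None, 50.0]

# List comprehension: transform each reading, skipping the missing one.
readings_c = [to_celsius(f) for f in readings_f if f is not None]

# Dict comprehension: map each original reading to its converted value.
lookup = {f: to_celsius(f) for f in readings_f if f is not None}

print(readings_c)
```

The comprehension combines a filter (`if f is not None`) and a transformation (`to_celsius(f)`) in one expression, which is the pattern most data-cleaning steps reduce to.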
Sub-units
The question
Most ML libraries in Python build on NumPy arrays or interoperate with them. Vectorised operations are often 10-100x faster than equivalent Python loops. Why, and what does broadcasting mean?
Outcome
The student can manipulate NumPy arrays and explain why vectorisation outperforms loops.
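A small sketch, assuming NumPy is installed, of the two ideas in the question. A vectorised expression hands the whole array to compiled code in one call, so the per-element work avoids the Python interpreter. Broadcasting lets arrays of different but compatible shapes be combined as if the smaller one were stretched to fit. The array values here are illustrative only.

```python
import numpy as np

values = np.arange(5, dtype=float)  # 0.0, 1.0, 2.0, 3.0, 4.0

# Loop version: one Python-level multiplication per element.
squared_loop = np.array([v * v for v in values])

# Vectorised version: a single expression; the loop runs in compiled code.
squared_vec = values * values

# Broadcasting: a (3, 1) column and a (4,) row combine into a (3, 4) grid.
column = np.array([[0.0], [10.0], [20.0]])  # shape (3, 1)
row = np.array([1.0, 2.0, 3.0, 4.0])        # shape (4,)
grid = column + row                          # shape (3, 4)

print(grid.shape)
```

Both versions of the squaring give the same result. The difference is where the loop runs, and the gap only becomes noticeable on arrays much larger than this one.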
Sub-units
The question
Eight operations cover roughly 80% of data wrangling: import, inspect, select, filter, handle missing values, transform, aggregate, merge. What does each one do, and why does it matter?
Outcome
The student can execute a complete data wrangling pipeline in pandas.
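The eight operations can be sketched end to end on a toy table, assuming pandas is installed. The inline CSV, the column names (`city`, `year`, `temp_c`), and the `regions` lookup table are all hypothetical stand-ins for a real dataset. Dropping rows is only one of several ways to handle missing values.

```python
import io

import pandas as pd

# Hypothetical inline CSV standing in for a real file; one temp_c is missing.
csv_text = """city,year,temp_c
Boston,2020,11.0
Boston,2021,
Denver,2020,10.0
Denver,2021,12.0
"""
# Hypothetical lookup table used for the merge step.
regions = pd.DataFrame({"city": ["Boston", "Denver"], "region": ["East", "West"]})

df = pd.read_csv(io.StringIO(csv_text))            # 1. import
n_missing = int(df["temp_c"].isna().sum())         # 2. inspect: count missing values
df = df[["city", "year", "temp_c"]]                # 3. select columns
df = df[df["year"] >= 2020]                        # 4. filter rows
df = df.dropna(subset=["temp_c"])                  # 5. handle missing values (drop them)
df = df.assign(temp_f=df["temp_c"] * 9 / 5 + 32)   # 6. transform: add a derived column
means = df.groupby("city", as_index=False)["temp_c"].mean()  # 7. aggregate per city
result = means.merge(regions, on="city", how="left")         # 8. merge with lookup

print(result)
```

Each step returns a new DataFrame rather than modifying the previous one in place, which keeps the pipeline easy to re-run from any point.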
Sub-units
The question
A visualisation is an argument. The choice of chart is a choice about what claim you are making. What are the right charts for distributions, relationships, and comparisons — and what makes a chart misleading?
Outcome
The student can produce histograms, scatter plots, box plots, and heatmaps with interpretation.
Sub-units
The question
EDA is what happens before modelling. You look at your data, form hypotheses, and discover things not in the problem statement. What does a complete EDA look like — and why is building a model on data you have not looked at the most common modelling error?
Outcome
The student can conduct and write up a complete EDA on a real dataset.
Sub-units