Universitas Scholarium — A Community of Scholars
Tutorial Course

COMP 2301 · Data Science: Foundations and the Field

Led by Florence Nightingale Simulacrum

5 modules · Computing · Updated 1 week ago

What is data science — really? Roles, the pipeline, and the ethics of data. Led by the woman who invented the discipline before it had a name.

  1. Module 1

    The Data Universe: Definitions and Distinctions

    Led by Florence Nightingale Simulacrum

    The question

    Data mining, big data, predictive analytics, business intelligence, machine learning — these are not synonyms. Before you can do data science, you need to know where each term belongs on the map and which role is responsible for which piece of the pipeline.

    Outcome

    The student can define each major data science role without conflating it with the others.

    Sub-units

    1. 1.1 Map the Roles
    2. 1.2 Analysis vs Analytics
  2. Module 2

    Traditional Data vs Big Data vs ML

    Led by Florence Nightingale Simulacrum

    The question

    Traditional data was small enough for Nightingale to clean by hand. Big data overwhelmed the manual approach. Machine learning was the response. Is this history inevitable — or is it a series of choices that could have gone differently?

    Outcome

    The student can match a data scenario to the appropriate technique.

    Sub-units

    1. 2.1 Technique Selection
  3. Module 3

    The Data Science Roles in Practice

    Led by Florence Nightingale Simulacrum

    The question

    Nightingale was data engineer, analyst, scientist, and communicator — one person, one pipeline. Modern organisations separate these roles. What does each role actually do day-to-day, and how do they interact?

    Outcome

    The student can describe the daily responsibilities of each major data role.

    Sub-units

    1. 3.1 Read a Job Advertisement
  4. Module 4

    The Data Science Process

    Led by Florence Nightingale Simulacrum

    The question

    The process begins before any data is touched — with a question. CRISP-DM: business understanding, data understanding, data preparation, modelling, evaluation, deployment. What goes wrong when you skip step one?

    Outcome

    The student can apply the CRISP-DM framework to a real data science application.

    Sub-units

    1. 4.1 Apply the Process
  5. Module 5

    Data, Society, and the Nightingale Lesson

    Led by Florence Nightingale Simulacrum

    The question

    Data is not objective. Someone decided what to measure, what to ignore, and what to do with the result. What is the data scientist's responsibility to the people represented in their data?

    Outcome

    The student can identify sources of bias before analysis begins and take a defended position on data science ethics.

    Sub-units

    1. 5.1 Final Essay: The Responsibility