Universitas Scholarium: A Community of Scholars
Tutorial Course

COMP 2309 · Data Science: Business Integration

Led by Vapnikian Statistical Learning Simulacrum

5 modules · Computing · Updated 1 week ago

SQL, Python, and Tableau integrated in a real business case — the complete absenteeism prediction pipeline from raw data to Tableau dashboard.

  1. Module 1

    SQL, Python, and Tableau: The Integration

    Led by Vapnikian Statistical Learning Simulacrum

    The question

    SQL for querying, Python for modelling, Tableau for communicating. What does each tool contribute that the others cannot — and what is the correct sequence in the absenteeism pipeline?

    Outcome

    The student can describe the three-layer ecosystem and when each tool is appropriate.

    Sub-units

    1. 1.1 The Three-Layer Pipeline
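
    The sequence named in the question (query, then model, then communicate) can be sketched end to end with the standard library alone. This is a minimal illustration, not the course's own pipeline: the in-memory SQLite table `absences`, its columns, the three toy rows, and the fixed-cutoff `score` function standing in for a fitted model are all assumptions made up for the example.

```python
import csv
import io
import sqlite3

# Layer 1 (SQL): an in-memory SQLite database stands in for the company
# database. The table name, columns, and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE absences (employee_id INTEGER, reason_code INTEGER, hours REAL)"
)
conn.executemany(
    "INSERT INTO absences VALUES (?, ?, ?)",
    [(1, 7, 4.0), (2, 0, 0.0), (3, 23, 8.0)],
)
rows = conn.execute(
    "SELECT employee_id, reason_code, hours FROM absences ORDER BY employee_id"
).fetchall()
conn.close()


# Layer 2 (Python): a placeholder for the modelling step. A real pipeline
# would call a fitted classifier here; this fixed rule is only a stand-in.
def score(hours, cutoff=3.0):
    return 1 if hours > cutoff else 0


scored = [(emp, reason, hours, score(hours)) for emp, reason, hours in rows]

# Layer 3 (Tableau): write a flat CSV, which Tableau can open as a text-file
# data source. An in-memory buffer is used so the sketch touches no files.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["employee_id", "reason_code", "hours", "flagged"])
writer.writerows(scored)
export = buffer.getvalue()
print(export)
```

    The point of the ordering is that each layer hands the next a narrower, cleaner artefact: SQL returns only the rows and columns needed, Python adds the prediction column, and Tableau receives a flat table it can plot without further logic.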
  2. Module 2

    The Absenteeism Business Case: Preprocessing

    Led by Vapnikian Statistical Learning Simulacrum

    The question

    28 reason codes, a date column, continuous and categorical features mixed together. What are the preprocessing decisions — and how do you create a binary target from a continuous outcome?

    Outcome

    The student can execute the full absenteeism preprocessing pipeline.

    Sub-units

    1. 2.1 Preprocess the Data
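
    The preprocessing decisions raised in the question can be sketched with pandas. This is one plausible approach under stated assumptions, not necessarily the one taught in the sub-unit: the column names (`reason_code`, `date`, `age`, `absence_hours`), the day/month/year date format, the six toy rows, and the choice of the median as the cutoff for the binary target are all hypothetical.

```python
import pandas as pd

# Toy rows invented for illustration; column names are assumptions.
raw = pd.DataFrame({
    "reason_code": [1, 7, 23, 0, 28, 13],
    "date": ["07/07/2015", "14/07/2015", "15/07/2015",
             "16/07/2015", "23/07/2015", "10/08/2015"],
    "age": [33, 50, 38, 39, 33, 28],
    "absence_hours": [4, 0, 2, 4, 2, 40],
})


def preprocess(df):
    out = df.copy()

    # Categorical feature: one-hot encode the reason codes, dropping the
    # first level so the dummy columns are not perfectly collinear.
    dummies = pd.get_dummies(
        out["reason_code"], prefix="reason", drop_first=True, dtype=int
    )

    # Date column: parse it, then keep month and weekday as features.
    parsed = pd.to_datetime(out["date"], format="%d/%m/%Y")
    out["month"] = parsed.dt.month
    out["weekday"] = parsed.dt.weekday  # Monday = 0

    # Binary target from the continuous outcome: 1 if hours exceed the
    # median (an assumed cutoff), 0 otherwise.
    cutoff = out["absence_hours"].median()
    out["excessive_absence"] = (out["absence_hours"] > cutoff).astype(int)

    # Drop the raw columns that have been replaced by derived ones.
    out = out.drop(columns=["reason_code", "date", "absence_hours"])
    return pd.concat([out, dummies], axis=1), cutoff


features, cutoff = preprocess(raw)
print(cutoff)
print(features["excessive_absence"].tolist())
```

    Cutting at the median gives roughly balanced classes, which keeps a later classifier from succeeding merely by predicting the majority class. With 28 reason codes, grouping related codes before encoding may be preferable to 27 separate dummy columns.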
  3. Module 3

    The Absenteeism Business Case: Modelling

    Led by Vapnikian Statistical Learning Simulacrum

    The question

    Fit a logistic regression, interpret the coefficient table, remove near-zero predictors. Which features most strongly predict excessive absence — and what does the model say about the role of age, reason, and commuting cost?

    Outcome

    The student can build, evaluate, and interpret a logistic regression for a business case.

    Sub-units

    1. 3.1 Build and Interpret
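
    Reading a coefficient table comes down to one transformation: a logistic regression coefficient is a change in log-odds, so exponentiating it gives an odds ratio. The sketch below shows that step and a simple near-zero filter. The coefficient values are placeholders invented for illustration, not results from the course's data, and the feature names, the `min_abs_coef` threshold, and the helper `coefficient_table` are likewise hypothetical. Comparing magnitudes across features assumes the inputs were standardised.

```python
import math

# Placeholder coefficients, invented for illustration only.
# They are NOT results from the absenteeism data.
coefficients = {
    "reason_group_1": 2.10,
    "transportation_expense": 0.60,
    "age": -0.17,
    "daily_work_load": -0.004,
}


def coefficient_table(coefs, min_abs_coef=0.05):
    """Return (name, coefficient, odds_ratio) rows, strongest first.

    Predictors whose absolute coefficient is below min_abs_coef are
    treated as near-zero and left out. The threshold is an assumption.
    """
    rows = []
    for name, beta in coefs.items():
        if abs(beta) < min_abs_coef:
            continue  # near-zero predictor: contributes almost nothing
        rows.append((name, beta, math.exp(beta)))
    rows.sort(key=lambda row: abs(row[1]), reverse=True)
    return rows


table = coefficient_table(coefficients)
for name, beta, odds_ratio in table:
    print(f"{name:<24} coef={beta:+.3f}  odds_ratio={odds_ratio:.3f}")
```

    An odds ratio above 1 means the feature raises the odds of excessive absence, and below 1 means it lowers them. In practice the model would be refitted after near-zero predictors are dropped, since the remaining coefficients can shift.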
  4. Module 4

    Visualisation and Communication: Tableau

    Led by Vapnikian Statistical Learning Simulacrum

    The question

    A manager who cannot read a regression table will act on a scatter plot with a clear message. How do you translate a logistic regression coefficient into a Tableau visualisation that drives an HR decision?

    Outcome

    The student can produce three business-interpretable visualisations from model predictions.

    Sub-units

    1. 4.1 Three Visualisations
  5. Module 5

    ChatGPT and AI Tools in the Data Science Workflow

    Led by Vapnikian Statistical Learning Simulacrum

    The question

    ChatGPT can write preprocessing code and generate EDA suggestions. Does this raise or lower the bar for what a data scientist needs to know? You now evaluate AI-generated code rather than write it — is that easier or harder?

    Outcome

    The student can use AI tools productively and evaluate their outputs critically.

    Sub-units

    1. 5.1 Final Essay: The AI-Augmented Data Scientist