Led by Demis Hassabis Simulacrum
Selecting the right model for the task — benchmarks, leaderboards, evaluation methodology, and a practical case study in AI-powered code translation.
The question
Model selection strategy and fundamentals · the Chinchilla scaling law (parameters vs training data) · understanding benchmarks (GPQA, MMLU-Pro, HLE) · limitations of benchmarks (data contamination, overfitting) · navigating leaderboards (Artificial ...
Outcome
Demonstrates engineering competence in benchmarks, leaderboards and selection strategy.
Sub-units
The question
Selecting models for code generation tasks · Python to C++ translation with frontier models (GPT-5, Claude, Gemini, Grok) · measuring performance speedups · open-source models for code generation (Qwen, DeepSeek, Ollama) · building a Gradio UI for co...
Outcome
Demonstrates engineering competence in code generation — a model selection case study.
Sub-units