Led by Claude Shannon Simulacrum
Model-free learning from experience — Monte Carlo policy evaluation, Monte Carlo control with and without exploring starts.
The question
Monte Carlo introduction (learning from complete episodes) · first-visit vs every-visit MC · Monte Carlo policy evaluation (estimating V(s) from returns) · MC policy evaluation in code · Monte Carlo control (using Q(s,a) to improve policies) · MC con...
Outcome
Demonstrates understanding and implementation of Monte Carlo prediction and control.
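As a rough illustration of the policy-evaluation topic in this unit, here is a minimal first-visit Monte Carlo sketch. The environment is an assumption made up for the example and not part of the course material: a five-state random walk in which the policy moves left or right with equal probability, and only stepping off the right end pays a reward of +1. The names `generate_episode` and `first_visit_mc` are likewise hypothetical.

```python
import random
from collections import defaultdict

def generate_episode(rng, n_states=5, start=2):
    """Hypothetical random walk on states 0..n_states-1.

    The walker moves left or right with equal probability. Stepping off the
    right end yields reward +1; stepping off the left end yields 0.
    Returns a list of (state, reward) pairs, one per time step.
    """
    episode = []
    s = start
    while 0 <= s < n_states:
        nxt = s + rng.choice((-1, 1))
        reward = 1.0 if nxt == n_states else 0.0
        episode.append((s, reward))
        s = nxt
    return episode

def first_visit_mc(num_episodes=20000, gamma=1.0, seed=0):
    """Estimate V(s) as the average return following the first visit to s."""
    rng = random.Random(seed)
    returns_sum = defaultdict(float)
    returns_count = defaultdict(int)
    for _ in range(num_episodes):
        episode = generate_episode(rng)
        # Record the time step at which each state is first visited.
        first_visit = {}
        for t, (s, _) in enumerate(episode):
            first_visit.setdefault(s, t)
        # Walk backwards so the return G can be accumulated incrementally.
        g = 0.0
        for t in range(len(episode) - 1, -1, -1):
            s, r = episode[t]
            g = gamma * g + r
            if first_visit[s] == t:  # count only the first visit
                returns_sum[s] += g
                returns_count[s] += 1
    return {s: returns_sum[s] / returns_count[s] for s in sorted(returns_sum)}

if __name__ == "__main__":
    for state, value in first_visit_mc().items():
        print(state, round(value, 3))
```

For this particular walk the true value of state i is (i + 1) / 6, so the estimates should land close to those fractions. An every-visit variant would drop the `first_visit` check and update on every occurrence of a state.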
Sub-units
The question
The exploring starts assumption and why it is impractical · Monte Carlo control without exploring starts (epsilon-soft policies) · on-policy vs off-policy methods · MC without exploring starts in code · Monte Carlo summary · connection to TD methods...
Outcome
Demonstrates understanding and implementation of Monte Carlo control without exploring starts.
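To make the epsilon-soft idea in this unit concrete, the following is a minimal sketch of on-policy first-visit Monte Carlo control with an epsilon-greedy policy. Everything about the environment is an assumption for illustration only: a hypothetical five-state corridor that always starts at state 0, costs -1 per move, and ends at state 4. The function names (`step`, `epsilon_greedy`, `mc_control`, `greedy_policy`) are also made up for this example.

```python
import random
from collections import defaultdict

N_STATES = 5      # states 0..4; state 4 is terminal (the goal)
ACTIONS = (0, 1)  # 0 = left, 1 = right

def step(state, action):
    """Hypothetical corridor: each move costs -1; moving left at state 0 stays put."""
    nxt = max(0, state - 1) if action == 0 else state + 1
    return nxt, -1.0, nxt == N_STATES - 1

def epsilon_greedy(q, state, epsilon, rng):
    """Epsilon-soft policy: every action keeps probability >= epsilon / |A|."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    best = max(q[(state, a)] for a in ACTIONS)
    # Break ties at random so early episodes do not get stuck on one action.
    return rng.choice([a for a in ACTIONS if q[(state, a)] == best])

def mc_control(num_episodes=5000, epsilon=0.1, gamma=1.0, seed=0):
    """On-policy first-visit MC control; no exploring starts (always start at 0)."""
    rng = random.Random(seed)
    q = defaultdict(float)     # action-value estimates Q(s, a)
    counts = defaultdict(int)  # number of first-visit returns averaged per (s, a)
    for _ in range(num_episodes):
        # Generate one complete episode with the current epsilon-greedy policy.
        episode, state, done = [], 0, False
        while not done:
            action = epsilon_greedy(q, state, epsilon, rng)
            nxt, reward, done = step(state, action)
            episode.append((state, action, reward))
            state = nxt
        # Record the first time step at which each (state, action) pair occurs.
        first_visit = {}
        for t, (s, a, _) in enumerate(episode):
            first_visit.setdefault((s, a), t)
        # Accumulate returns backwards and update Q with an incremental mean.
        g = 0.0
        for t in range(len(episode) - 1, -1, -1):
            s, a, r = episode[t]
            g = gamma * g + r
            if first_visit[(s, a)] == t:
                counts[(s, a)] += 1
                q[(s, a)] += (g - q[(s, a)]) / counts[(s, a)]
    return q

def greedy_policy(q):
    """Read off the greedy action for each non-terminal state."""
    return {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_STATES - 1)}

if __name__ == "__main__":
    print(greedy_policy(mc_control()))
```

Policy improvement is implicit here: because `epsilon_greedy` reads the latest `q` on every step, each episode is generated by a policy that is already greedy (up to epsilon) with respect to the current estimates. Exploration comes from the epsilon term rather than from randomized starting state-action pairs.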
Sub-units