
VAD-CFR Algorithm Mechanics

Volatility-Adaptive Discounted Counterfactual Regret Minimization — Discovered by AlphaEvolve

Panel 1: Information Set (Leduc Poker)
Current Information Set
Holding Jack • Board: Queen • After: Check
Hand: J | Board: Q
Current Strategy (Iteration 0)
Fold: 33.3%
Call: 33.3%
Raise: 33.3%
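
The uniform 33.3% split at iteration 0 is what standard regret matching yields before any regret has accumulated. The page does not say which strategy update the demo uses, so the sketch below assumes plain regret matching; the function name `regret_matching` is illustrative only.

```python
def regret_matching(cumulative_regret):
    """Return a strategy proportional to positive cumulative regret.

    Falls back to the uniform strategy when no action has positive regret,
    which is the situation at iteration 0.
    """
    positive = [max(r, 0.0) for r in cumulative_regret]
    total = sum(positive)
    n = len(cumulative_regret)
    if total <= 0.0:
        return [1.0 / n] * n
    return [p / total for p in positive]


actions = ["Fold", "Call", "Raise"]
strategy = regret_matching([0.0, 0.0, 0.0])  # no regret accumulated yet
for name, prob in zip(actions, strategy):
    print(f"{name}: {prob:.1%}")
```

With all-zero regrets, each of the three actions is printed with probability 33.3%, matching the panel.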
Panel 2: Volatility Monitor (Innovation 1)
Plotted series: Instantaneous |regret|, EWMA Volatility, Threshold
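
A minimal sketch of the volatility monitor, assuming volatility is an exponentially weighted moving average (EWMA) of the instantaneous regret magnitude. The `decay` and `threshold` values are hypothetical, and the page does not state what crossing the threshold triggers, so the sketch only flags it.

```python
def update_volatility(volatility, instantaneous_regret, decay=0.9):
    """One EWMA step over |instantaneous regret|.

    decay is a hypothetical smoothing constant, not a value from the page.
    """
    return decay * volatility + (1.0 - decay) * abs(instantaneous_regret)


volatility = 0.0
threshold = 0.5  # hypothetical value for illustration
for regret in [1.0, -2.0, 0.5, -0.1]:
    volatility = update_volatility(volatility, regret)
    above = volatility > threshold
    print(f"|regret|={abs(regret):.2f}  volatility={volatility:.3f}  above_threshold={above}")
```

Because the EWMA tracks the magnitude rather than the signed value, large swings in either direction raise the volatility estimate.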
Panel 3: Regret Accumulation Comparison (Innovation 2: Asymmetric Boosting)
Plotted series: Standard CFR, VAD-CFR (with boost & discount)
Positive regret boosted by 1.1x
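
The asymmetric boost can be sketched as follows. Only the 1.1x factor comes from the panel; the function names are illustrative, and the discount step is left out here so the effect of the boost is visible on its own.

```python
POSITIVE_BOOST = 1.1  # factor shown in the panel


def accumulate_standard(cumulative, instantaneous):
    """Standard CFR: add the instantaneous regret unchanged."""
    return cumulative + instantaneous


def accumulate_boosted(cumulative, instantaneous):
    """Asymmetric boosting: scale positive regret only, leave negative regret as is."""
    if instantaneous > 0:
        instantaneous *= POSITIVE_BOOST
    return cumulative + instantaneous


standard = boosted = 0.0
for regret in [1.0, -1.0, 2.0]:
    standard = accumulate_standard(standard, regret)
    boosted = accumulate_boosted(boosted, regret)
print(f"standard={standard:.2f}  boosted={boosted:.2f}")
```

Over the same regret sequence the boosted accumulator ends higher than the standard one, because gains are amplified while losses are not.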
Panel 4: Discount Factor (Innovation 1)
Scale: 0 to 1
Discount factor: 0.950
discount = base * (1 + vol_scale * volatility)
Higher volatility → Stronger discounting → Faster forgetting
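
The formula above translates directly into code. In this sketch `base=0.95` is inferred from the 0.950 value displayed at zero volatility, and `vol_scale=0.5` is a hypothetical value. The page does not say how the resulting value is applied to the accumulated regrets, so the sketch only evaluates the formula.

```python
def discount_factor(volatility, base=0.95, vol_scale=0.5):
    """Evaluate discount = base * (1 + vol_scale * volatility).

    base is assumed from the panel's displayed 0.950; vol_scale is hypothetical.
    """
    return base * (1.0 + vol_scale * volatility)


for vol in [0.0, 0.1, 0.2]:
    print(f"volatility={vol:.1f}  discount={discount_factor(vol):.4f}")
```

At zero volatility the function returns the base value, and the result grows linearly with the volatility estimate.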
Panel 5: Warm-Start Phase (Innovation 3)
Iteration: 0 / 1000
WARM-UP: Strategy accumulation blocked
Early iterations produce noisy strategies, so they are excluded from the average.
Timeline: iterations 0 to 1000, warm-start cutoff at iteration 500
Hard Warm-Start Rule:
if iteration < 500: skip strategy accumulation
if iteration ≥ 500: accumulate into average strategy
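
The hard warm-start rule can be sketched as below. The 500-iteration cutoff comes from the panel; the unweighted sum and the helper names are assumptions made for illustration.

```python
WARM_START_ITERATIONS = 500  # cutoff shown in the panel


def accumulate_strategy(strategy_sum, strategy, iteration):
    """Add the current strategy to the running sum only after the warm-up."""
    if iteration < WARM_START_ITERATIONS:
        return strategy_sum  # warm-up: accumulation blocked
    return [s + p for s, p in zip(strategy_sum, strategy)]


def average_strategy(strategy_sum):
    """Normalize the running sum; uniform if nothing has been accumulated yet."""
    total = sum(strategy_sum)
    n = len(strategy_sum)
    if total <= 0.0:
        return [1.0 / n] * n
    return [s / total for s in strategy_sum]


strategy_sum = [0.0, 0.0, 0.0]
for iteration in range(1000):
    # Stand-in strategies: noisy early play, then a settled strategy.
    current = [1.0, 0.0, 0.0] if iteration < 500 else [0.25, 0.25, 0.5]
    strategy_sum = accumulate_strategy(strategy_sum, current, iteration)
print(average_strategy(strategy_sum))
```

In this toy run the early all-Fold strategies never enter the sum, so the average reflects only the strategies played from iteration 500 onward.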