├── data/                # Data loading and preprocessing
├── src/                 # Core algorithm implementation
│   ├── bandit/          # Multi-armed bandit framework
│   ├── scoring/         # Relative difference scoring & Bayesian regularization
...
How does a gambler maximize winnings from a row of slot machines? This is the inspiration for the "multi-armed bandit problem," a common task in reinforcement learning in which "agents" make choices ...
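To make the slot-machine picture concrete, here is a minimal sketch of one textbook strategy, epsilon-greedy, on simulated machines that pay out 1 with a fixed probability. It is an illustration only and is not taken from this repository's `bandit/` code. The function name `run_epsilon_greedy`, its parameters, and the payout probabilities are all hypothetical.

```python
import random


def run_epsilon_greedy(true_means, steps=5000, epsilon=0.1, seed=0):
    """Simulate an epsilon-greedy agent on Bernoulli slot machines.

    true_means: hypothetical payout probability of each machine (arm).
    Returns (pull counts per arm, estimated mean per arm, total reward).
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms        # how often each arm was pulled
    estimates = [0.0] * n_arms   # running average reward per arm
    total_reward = 0.0

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: pick an arm uniformly at random.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm with the highest current estimate.
            arm = max(range(n_arms), key=lambda a: estimates[a])

        # The machine pays 1 with probability true_means[arm], else 0.
        reward = 1.0 if rng.random() < true_means[arm] else 0.0

        # Incremental update of the running average for the pulled arm.
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total_reward += reward

    return counts, estimates, total_reward


if __name__ == "__main__":
    counts, estimates, total = run_epsilon_greedy([0.2, 0.5, 0.8])
    print("pulls per arm:", counts)
    print("estimated means:", [round(e, 3) for e in estimates])
    print("total reward:", total)
```

The `epsilon` parameter controls the trade-off the gambler faces: a small value mostly replays the machine that currently looks best, while a larger value spends more pulls checking whether another machine is actually better.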