Multi-Armed Bandit

Optimize

Price: $0.01/call · Latency: <1 ms · Complexity: O(n)

Choose the best option among three alternatives with uncertain payoffs. UCB1 balances exploitation of options with good observed rewards against exploration of under-sampled, uncertain ones.


Input Schema

arms: Array of {id, name, pulls?, totalReward?}
algorithm: 'ucb1' | 'thompson' | 'epsilon-greedy'
config: {explorationConstant?, rewardDecay?}
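A hypothetical request matching the schema above; arm names and values are illustrative only:

```typescript
const request = {
  arms: [
    { id: "a", name: "Variant A", pulls: 12, totalReward: 9 },
    { id: "b", name: "Variant B", pulls: 4, totalReward: 2 },
    { id: "c", name: "Variant C" },              // never pulled yet
  ],
  algorithm: "ucb1" as const,
  config: { explorationConstant: Math.SQRT2 },   // rewardDecay omitted
};
```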

Output Fields

selected
score
algorithm
exploitation
exploration
regret
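One way the output fields above could be computed for UCB1. The field names come from the list; the cumulative-regret definition (gap between always pulling the best-mean arm and the rewards actually collected) is a standard one assumed here, not confirmed by this tool:

```typescript
function ucb1Output(
  arms: { id: string; pulls: number; totalReward: number }[],
  c = Math.SQRT2,
) {
  const N = arms.reduce((s, a) => s + a.pulls, 0);
  const score = (a: { pulls: number; totalReward: number }) =>
    a.totalReward / a.pulls + c * Math.sqrt(Math.log(N) / a.pulls);
  const best = arms.reduce((x, a) => (score(a) > score(x) ? a : x));
  const exploitation = best.totalReward / best.pulls;          // empirical mean
  const exploration = c * Math.sqrt(Math.log(N) / best.pulls); // uncertainty bonus
  // Regret: best arm's mean over all N pulls minus rewards actually received.
  const bestMean = Math.max(...arms.map(a => a.totalReward / a.pulls));
  const regret = bestMean * N - arms.reduce((s, a) => s + a.totalReward, 0);
  return {
    selected: best.id,
    score: exploitation + exploration,
    algorithm: "ucb1" as const,
    exploitation,
    exploration,
    regret,
  };
}
```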