Multi-Armed Bandit
Optimize · Price: $0.01/call · Latency: <1ms · Complexity: O(n)
Chooses the best of 3 alternatives with uncertain payoffs. UCB1 balances exploitation of known-good arms with exploration of uncertain ones.
Input Schema
arms: Array of {id, name, pulls?, totalReward?}
algorithm: 'ucb1' | 'thompson' | 'epsilon-greedy'
config: {explorationConstant?, rewardDecay?}
Output Fields
selected, score, algorithm, exploitation, exploration, regret
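The UCB1 selection step described above can be sketched as follows. This is an illustrative implementation written against the input and output schemas listed here, not the service's actual code; the function name `ucb1_select` and its default exploration constant are assumptions.

```python
import math

def ucb1_select(arms, exploration_constant=math.sqrt(2)):
    """Pick an arm by UCB1 score.

    `arms` follows the input schema above: a list of dicts with
    {id, name, pulls?, totalReward?}. Returns a dict with the
    selected arm id plus the score's exploitation and exploration
    components (the `regret` output field is not computed here).
    """
    total_pulls = sum(a.get("pulls", 0) for a in arms)
    best, best_score, best_parts = None, -math.inf, (0.0, 0.0)
    for arm in arms:
        pulls = arm.get("pulls", 0)
        if pulls == 0:
            # An unpulled arm has an infinite upper bound: try it first.
            return {"selected": arm["id"], "score": math.inf,
                    "exploitation": 0.0, "exploration": math.inf}
        # Exploitation: empirical mean reward of this arm.
        exploitation = arm.get("totalReward", 0.0) / pulls
        # Exploration: confidence bonus that shrinks as the arm is pulled more.
        exploration = exploration_constant * math.sqrt(math.log(total_pulls) / pulls)
        score = exploitation + exploration
        if score > best_score:
            best, best_score = arm, score
            best_parts = (exploitation, exploration)
    return {"selected": best["id"], "score": best_score,
            "exploitation": best_parts[0], "exploration": best_parts[1]}
```

With three arms at {10, 5, 5} pulls and {7, 2, 4} total reward, the arm with mean 0.8 wins despite having fewer pulls than the 0.7-mean arm, because its exploration bonus is larger.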