arxiv:2604.13740
Michal Valko
AI & ML interests
large language models, reasoning, fine-tuning, test-time computation, reinforcement learning with human feedback, world models
Recent Activity
authored a paper about 1 hour ago
Spectral Thompson sampling
authored a paper about 1 hour ago
Covariance-adapting algorithm for semi-bandits with application to sparse rewards
authored a paper about 1 hour ago
Online learning with noisy side observations