Decoupling Exploration and Policy Optimization: Uncertainty Guided Tree Search for Hard Exploration

Pre-Print 2026 (New!)

End-to-End Efficient RL for Linear Bellman Complete MDPs with Deterministic Transitions

Pre-Print 2026

Adaptive Matrix Online Learning through Smoothing with Guarantees for Nonsmooth Nonconvex Optimization

Pre-Print 2026

An Ellipsoid Algorithm for Online Convex Optimization

NeurIPS 2025

Oracle-Efficient Adversarial Reinforcement Learning via Max-Following

ICML 2025 Workshop

Is a Good Foundation Necessary for Efficient Reinforcement Learning? The Computational Role of the Base Model in Exploration

Pre-Print 2025

Beating Adversarial Low-Rank MDPs with Unknown Transition and Bandit Feedback

NeurIPS 2024

Fully Unconstrained Online Learning

NeurIPS 2024

On the Sample Complexity of Imitation Learning for Smoothed Model Predictive Control

CDC 2024

The Power of Resets in Online Reinforcement Learning

NeurIPS 2024 (Spotlight)