-
* denotes an equal contribution
-
denotes a paper receiving an award.
Preprints
- Deep Double Q-learning
Prabhat Nagarajan, Martha White, and Marlos C. Machado
Preprint. June 2025.
2026
-
Accelerating Q-learning through Efficient Value-Sharing across Actions
Prabhat Nagarajan, Brett Daley, Martha White, and Marlos C. Machado
International Conference on Machine Learning (ICML), Spotlight. July 2026.
Best Paper Runner-up at the Adaptive and Learning Agents Workshop at AAMAS 2026.
- When is Offline Policy Selection Sample Efficient for Reinforcement Learning?
Vincent Liu, Prabhat Nagarajan, Andrew Patterson, and Martha White
International Conference on Autonomous Agents and Multiagent Systems (AAMAS). May 2026.
2025
- An Analysis of Action-Value Temporal-Difference Methods That Learn State Values
Brett Daley*, Prabhat Nagarajan*, Martha White, and Marlos C. Machado
Reinforcement Learning Journal (RLJ). August 2025.
2021
-
Periodic Intra-Ensemble Knowledge Distillation for Reinforcement Learning
Zhang-Wei Hong, Prabhat Nagarajan, and Guilherme J. Maeda
European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD). September 2021.
-
Reconnaissance for Reinforcement Learning with Safety Constraints
Shin-ichi Maeda, Hayato Watahiki, Yi Ouyang, Shintaro Okada, Masanori Koyama, and Prabhat Nagarajan
European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD). September 2021.
-
ChainerRL: A Deep Reinforcement Learning Library
Yasuhiro Fujita, Prabhat Nagarajan, Toshiki Kataoka, and Takahiro Ishikawa
Journal of Machine Learning Research (JMLR). 22(77):1–14, April 2021.
2020
-
Distributed Reinforcement Learning of Targeted Grasping with Active Vision for Mobile Manipulators
Yasuhiro Fujita, Kota Uenishi, Avinash Ummadisingu, Prabhat Nagarajan, Shimpei Masuda, and Mario Ynocente Castro
International Conference on Intelligent Robots and Systems (IROS). October 2020.
2019
-
Learning Latent State Spaces for Planning through Reward Prediction
Aaron Havens, Yi Ouyang, Prabhat Nagarajan, and Yasuhiro Fujita
Workshop on Deep Reinforcement Learning at the 33rd Conference on Neural Information Processing Systems (NeurIPS 2019). December 2019.
-
Extrapolating Beyond Suboptimal Demonstrations via Inverse Reinforcement Learning from Observations
Daniel S. Brown, Wonjoon Goo, Prabhat Nagarajan, and Scott Niekum
International Conference on Machine Learning (ICML). June 2019.
-
Deterministic Implementations for Reproducibility in Deep Reinforcement Learning
Prabhat Nagarajan, Garrett Warnell, and Peter Stone
AAAI 2019 Workshop on Reproducible AI. January 2019.
2018
-
Nondeterminism as a Reproducibility Challenge for Deep Reinforcement Learning
Prabhat Nagarajan
Master's Thesis, The University of Texas at Austin, August 2018
Committee: Peter Stone (Supervisor), Scott Niekum
-
The Impact of Nondeterminism on Reproducibility in Deep Reinforcement Learning
Prabhat Nagarajan, Garrett Warnell, and Peter Stone
2nd Reproducibility in Machine Learning Workshop at ICML 2018, Stockholm, Sweden.