Prabhat Nagarajan

* denotes equal contribution

Deep Double Q-learning
Prabhat Nagarajan, Martha White, and Marlos C. Machado
Preprint. June 2025.
arXiv

Accelerating Q-learning through Efficient Value-Sharing across Actions
Prabhat Nagarajan, Brett Daley, Martha White, and Marlos C. Machado
International Conference on Machine Learning (ICML), Spotlight, July 2026.
Best Paper Runner-up at Adaptive and Learning Agents Workshop at AAMAS 2026
PDF

When is Offline Policy Selection Sample Efficient for Reinforcement Learning?
Vincent Liu, Prabhat Nagarajan, Andrew Patterson, and Martha White
International Conference on Autonomous Agents and Multiagent Systems (AAMAS). May 2026.
arXiv BibTeX

An Analysis of Action-Value Temporal-Difference Methods That Learn State Values
Brett Daley*, Prabhat Nagarajan*, Martha White, and Marlos C. Machado
Reinforcement Learning Journal (RLJ). August 2025.
PDF arXiv BibTeX Code

Periodic Intra-Ensemble Knowledge Distillation for Reinforcement Learning
Zhang-Wei Hong, Prabhat Nagarajan, and Guilherme J. Maeda
European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD). September 2021.
PDF BibTeX Code

Reconnaissance for Reinforcement Learning with Safety Constraints
Shin-ichi Maeda, Hayato Watahiki, Yi Ouyang, Shintaro Okada, Masanori Koyama, and Prabhat Nagarajan
European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD). September 2021.
PDF BibTeX

ChainerRL: A Deep Reinforcement Learning Library
Yasuhiro Fujita, Prabhat Nagarajan, Toshiki Kataoka, and Takahiro Ishikawa
Journal of Machine Learning Research (JMLR). 22(77):1–14, April 2021.
JMLR page PDF arXiv BibTeX

Distributed Reinforcement Learning of Targeted Grasping with Active Vision for Mobile Manipulators
Yasuhiro Fujita, Kota Uenishi, Avinash Ummadisingu, Prabhat Nagarajan, Shimpei Masuda, and Mario Ynocente Castro
International Conference on Intelligent Robots and Systems (IROS). October 2020.
PDF arXiv BibTeX Video

Learning Latent State Spaces for Planning through Reward Prediction
Aaron Havens, Yi Ouyang, Prabhat Nagarajan, and Yasuhiro Fujita
Workshop on Deep Reinforcement Learning at the 33rd Conference on Neural Information Processing Systems (NeurIPS 2019). December 2019.
PDF BibTeX arXiv

Extrapolating Beyond Suboptimal Demonstrations via Inverse Reinforcement Learning from Observations
Daniel S. Brown, Wonjoon Goo, Prabhat Nagarajan, and Scott Niekum
International Conference on Machine Learning (ICML). June 2019.
PDF BibTeX arXiv Code

Deterministic Implementations for Reproducibility in Deep Reinforcement Learning
Prabhat Nagarajan, Garrett Warnell, and Peter Stone
AAAI 2019 Workshop on Reproducible AI. January 2019.
PDF BibTeX arXiv Code

Nondeterminism as a Reproducibility Challenge for Deep Reinforcement Learning
Prabhat Nagarajan
Master's Thesis, The University of Texas at Austin, August 2018.
Committee: Peter Stone (Supervisor), Scott Niekum
PDF BibTeX

The Impact of Nondeterminism on Reproducibility in Deep Reinforcement Learning
Prabhat Nagarajan, Garrett Warnell, and Peter Stone
2nd Reproducibility in Machine Learning Workshop at ICML 2018, Stockholm, Sweden.
PDF BibTeX Code