Deep Reinforcement Learning
My Master's thesis focused on reproducibility in deep reinforcement learning. Specifically, I studied the impact of nondeterminism in algorithm implementations on our ability to reproduce results.
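As a rough illustration of the problem (a toy sketch, not code from the thesis), one common source of run-to-run variation is randomness that is not tied to a fixed seed. The function `toy_training_run` and its noisy update are hypothetical stand-ins for a real training loop; real implementations have further sources of nondeterminism that seeding alone does not remove.

```python
import random


def toy_training_run(seed, steps=100):
    """Hypothetical stand-in for an RL training run; returns a noisy 'final return'."""
    rng = random.Random(seed)  # all randomness flows through one seeded generator
    value = 0.0
    for _ in range(steps):
        # Assumed noisy learning signal: small average improvement plus noise.
        value += rng.gauss(0.1, 1.0)
    return value


same_a = toy_training_run(seed=0)
same_b = toy_training_run(seed=0)
other = toy_training_run(seed=1)

print("same seed reproduces:", same_a == same_b)
print("different seed differs:", same_a != other)
```

With the seed fixed, the two runs agree exactly; changing only the seed changes the reported result, which is why a single run says little about an algorithm.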
At Preferred Networks, I have worked on targeted grasping for robotics. We combine techniques from goal-conditioned reinforcement learning (hindsight experience replay), deep reinforcement learning (QT-Opt), and distributed training to achieve targeted grasping on a human support robot.
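The core idea of hindsight experience replay can be sketched in a few lines. This is a minimal illustration assuming the "final" relabeling strategy and a sparse 0/-1 reward; the dictionary keys (`achieved`, `goal`, `reward`) and the function name are hypothetical and do not reflect our actual grasping pipeline.

```python
def relabel_with_final_goal(episode):
    """Relabel an episode as if the last achieved state had been the goal.

    Each step is a dict with hypothetical keys 'achieved', 'goal', 'reward'.
    """
    new_goal = episode[-1]["achieved"]
    relabeled = []
    for step in episode:
        # Sparse reward: 0 when the substituted goal is reached, -1 otherwise.
        reward = 0.0 if step["achieved"] == new_goal else -1.0
        relabeled.append({**step, "goal": new_goal, "reward": reward})
    return relabeled


# A failed episode: the agent wanted to reach 5 but only got to 3.
failed_episode = [
    {"achieved": 1, "goal": 5, "reward": -1.0},
    {"achieved": 2, "goal": 5, "reward": -1.0},
    {"achieved": 3, "goal": 5, "reward": -1.0},
]
hindsight_episode = relabel_with_final_goal(failed_episode)
print([step["reward"] for step in hindsight_episode])
```

The relabeled copy turns a failed episode into a successful one for a different goal, so the replay buffer contains useful reward signal even when the original goal is rarely reached.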
Inverse Reinforcement Learning
We leverage ranked demonstrations to improve upon a suboptimal demonstrator in high-dimensional deep reinforcement learning tasks.
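One common way to learn from ranked demonstrations is a pairwise ranking loss over predicted trajectory returns. The sketch below assumes a Bradley-Terry style preference model, which is my choice for illustration rather than a detail stated above; `pairwise_ranking_loss` and its arguments are hypothetical names.

```python
import math


def pairwise_ranking_loss(return_worse, return_better):
    """Negative log-probability that the better-ranked trajectory is preferred.

    Equals log(1 + exp(return_worse - return_better)), computed in a
    numerically stable form.
    """
    diff = return_worse - return_better
    return max(diff, 0.0) + math.log1p(math.exp(-abs(diff)))


# Predicted returns that agree with the ranking give a small loss ...
loss_agree = pairwise_ranking_loss(return_worse=0.0, return_better=5.0)
# ... and predicted returns that contradict it give a large one.
loss_disagree = pairwise_ranking_loss(return_worse=5.0, return_better=0.0)
print(loss_agree < loss_disagree)
```

Because the loss only constrains the ordering of returns, a reward function trained this way can extrapolate beyond the best demonstration instead of merely imitating it.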