Dan Rosenbaum

I am a research scientist at DeepMind working on machine learning and computer vision.

Before joining DeepMind, I completed my PhD at the Hebrew University of Jerusalem in 2016, advised by Yair Weiss, studying generative models for low-level vision problems. [thesis]

I am interested in computational models of vision and 3D scene understanding, and in using generative approaches that model vision as an inverse problem.

I co-organised a workshop at NeurIPS 2019 titled “Perception as Generative Reasoning: Structure, Causality, Probability”.
See the workshop website for all papers, invited talks and videos.



Learning inverse graphics - In the inverse graphics approach, vision problems are formulated as inference using a forward model that captures the image generation process. I am currently exploring different setups for learning forward models that can be used for efficient inference.

Protein Structure - Understanding the 3D structure of proteins is a fundamental problem in biology, with the potential to unlock a better understanding of the various functions of proteins in biological mechanisms and to accelerate drug discovery. In ongoing work, I am studying models of protein structure that explicitly reason in 3D space, predicting structure using probabilistic inference methods.

3D scene understanding with Generative Query Network (GQN) - In this paper (Science) we show how implicit scene understanding can emerge from training a model to predict novel views of random 3D scenes (video). In a follow-up paper (arXiv) we extend the model to use attention over image patches, improving its capacity to model rich environments like Minecraft. We also study the camera pose estimation problem, comparing an inference method that uses a generative model with a direct discriminative approach (video, datasets).

Neural processes - We introduce conditional neural processes (arXiv) and neural processes (arXiv), which are trained to predict values of functions given a context of observed function evaluations. These models provide a general framework for dealing with uncertainty: they adapt quickly to new observations and allow a smooth transition between a prior model that is not conditioned on any data and flexible posterior models that can be conditioned on more and more data. In follow-up work we extend the model with an attention mechanism over context points (arXiv) and study different training objectives (pdf).


Dan Rosenbaum danro@google.com