I am a Ph.D. student at Mila and the University of Montreal, advised by Prof. Yoshua Bengio, Dr. Anirudh Goyal, and Prof. Michael Mozer. I am also a visiting researcher at Meta, where I work with Dr. Nicolas Ballas.
My research centers on building deep learning techniques inspired by cognitive science. Broadly, I am interested in designing models that learn and reason like humans. Most recently, I have been exploring the metacognitive abilities of large language models (LLMs) in the context of mathematical problem solving (1). In earlier work, I developed hybrid architectures that integrate recurrent networks with transformers to handle long-context modeling effectively (2). Another major thread of my Ph.D. has focused on object-centric learning, where I have worked on building general-purpose visual representations that capture compositional structure and enable downstream reasoning (3, 4, 5).
Going forward, I am particularly interested in:
Prior to my current role, I gained research experience across academia and industry through several internships. I was a research intern at Valence Labs, where I worked with Dr. Jason Hartford on experimental-design strategies for estimating the effects of gene knockouts in cells. Before that, I interned at Microsoft Research NYC with Dr. Alex Lamb, working on reinforcement learning. Before starting my Ph.D., I spent a year at Mila working with Dr. Anirudh Goyal and Prof. Yoshua Bengio on cognitive science-inspired deep learning projects, since published at NeurIPS 2021 and ICLR 2022. During my undergraduate studies, I was a Google Summer of Code student developer with Preferred Networks, where I contributed CUDA-optimized implementations of RNNs, GRUs, and LSTMs to the ChainerX deep learning library. I also collaborated with Prof. Rajiv Ratn Shah at IIIT Delhi on applied NLP projects and with Prof. Aditya Gopalan at IISc Bangalore on time-series forecasting models for urban pollution.
Publications

Aniket Didolkar, Anirudh Goyal, Nan Rosemary Ke, Siyuan Guo, Michal Valko, Timothy Lillicrap, Danilo Rezende, Yoshua Bengio, Michael Mozer, Sanjeev Arora
NeurIPS 2024
Paper
Probing the metacognitive capabilities of LLMs to improve mathematical problem solving.

Aniket Didolkar*, Andrii Zadaianchuk*, Rabiul Awal*, Maximilian Seitzer, Efstratios Gavves, Aishwarya Agrawal
CVPR 2025
Paper / Project Page / Code
User-controllable visual representation learning.

Aniket Didolkar*, Andrii Zadaianchuk, Anirudh Goyal, Michael Curtis Mozer, Yoshua Bengio, Georg Martius, Maximilian Seitzer*
ICLR 2024
Paper / Code / Project Page
Building object-centric models from a foundation-model perspective.

Aniket Didolkar, Anirudh Goyal, Yoshua Bengio
ICLR 2024
Paper
Unsupervised object discovery via two cycle-consistency objectives.

Aniket Didolkar, Kshitij Gupta, Anirudh Goyal, Alex Lamb, Nan Rosemary Ke, Yoshua Bengio
NeurIPS 2022
Paper / Slides
Merging transformers with recurrent networks to handle long-context tasks effectively.

Alex Lamb, Riashat Islam, Yonathan Efroni, Aniket Didolkar, Dipendra Misra, Dylan Foster, Lekan Molu, Rajan Chari, Akshay Krishnamurthy, John Langford
TMLR 2023
Paper
An algorithm for discovering the minimal controllable latent state: it retains all information needed to control the agent while learning to discard everything irrelevant.

Anirudh Goyal, Aniket Didolkar, Alex Lamb, Kartikeya Badola, Nan Rosemary Ke, Nasim Rahaman, Jonathan Binas, Charles Blundell, Michael Mozer, Yoshua Bengio
ICLR 2022 (Oral Presentation, top 5% of accepted papers)
Paper
Facilitating communication between modules through a limited-capacity bottleneck.

Aniket Didolkar*, Anirudh Goyal*, Nan Rosemary Ke, Charles Blundell, Philippe Beaudoin, Nicolas Heess, Michael Mozer, Yoshua Bengio
NeurIPS 2021
Paper
World models via sparsely communicating recurrent modules.

Nan Rosemary Ke*, Aniket Didolkar*, Sarthak Mittal, Anirudh Goyal, Guillaume Lajoie, Stefan Bauer, Danilo Rezende, Yoshua Bengio, Michael Mozer, Christopher Pal
NeurIPS 2021 (Datasets and Benchmarks Track)
Paper / Code
A new, highly flexible benchmark for evaluating causal discovery in model-based RL.

Amit Jindal, Aniket Didolkar, Arijit Ghosh Chowdhury, Ramit Sawhney, Rajiv Ratn Shah, Di Jin
COLING 2020
Paper
A new formulation of mixup for NLP.

Amit Jindal, Narayanan Elavathur Ranganatha, Aniket Didolkar, Arijit Ghosh Chowdhury, Ramit Sawhney, Rajiv Ratn Shah, Di Jin
Interspeech 2020
Paper
Data augmentation for speech using mixup.