Micah Carroll

I’m a fifth-year Artificial Intelligence PhD student at Berkeley, advised by Anca Dragan and Stuart Russell within BAIR and CHAI. I’m thankful to be supported by the NSF Fellowship.

I’m interested in how humans’ preferences, values, and beliefs change, and how those changes may be affected by interactions with AI systems. I’ve studied this both in general (using the language of DR-MDPs) and more specifically in the context of recommender systems, investigating how the choice of algorithm might affect us as users. I’m probably best known for my work on human-AI collaboration and for developing the Overcooked-AI benchmark.

Outside of research, I enjoy inline skating 🛹, watching movies 🎥, and finding new music 🎵. Before immigrating to the US, I grew up in the amazingly chaotic city of Livorno 🇮🇹 – visit if you get the chance!

Publications

On Targeted Manipulation and Deception when Optimizing LLMs for User Feedback

Marcus Williams*, Micah Carroll*, Adhyyan Narang, Constantin Weisser, Brendan Murphy, Anca Dragan

ICLR 2025

Paper Thread Code Talk

Beyond Preferences in AI Alignment

Tan Zhi-Xuan, Micah Carroll, Matija Franklin, Hal Ashton

Philosophical Studies 2024

Paper Thread

AI Alignment with Changing and Influenceable Reward Functions

Micah Carroll, Davis Foote, Anand Siththaranjan, Stuart Russell, Anca Dragan

ICML 2024

Paper Thread Talk

Characterizing Manipulation from AI Systems

Micah Carroll*, Alan Chan*, Henry Ashton, David Krueger

EAAMO 2023

Paper

Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback

Stephen Casper, Xander Davies, ..., Micah Carroll, ..., Erdem Bıyık, Anca Dragan, David Krueger, Dorsa Sadigh, Dylan Hadfield-Menell

TMLR 2023

Paper Thread

Engagement, User Satisfaction, and the Amplification of Divisive Content on Social Media

Smitha Milli, Micah Carroll, Sashrika Pandey, Yike Wang, Anca Dragan

PNAS Nexus 2025

Paper Thread

Harms from Increasingly Agentic Algorithmic Systems

Alan Chan, Rebecca Salganik, Alva Markelius, Chris Pang, Nitarshan Rajkumar, Dmitrii Krasheninnikov, Lauro Langosco, Zhonghao He, Yawen Duan, Micah Carroll, Michelle Lin, Alex Mayhew, Katherine Collins, Maryam Molamohammadi, John Burden, Wanru Zhao, Shalaleh Rismani, Konstantinos Voudouris, Umang Bhatt, Adrian Weller, David Krueger, Tegan Maharaj

FAccT 2023

Paper

Who Needs to Know? Minimal Knowledge for Optimal Coordination

Niklas Lauffer, Ameesh Shah, Micah Carroll, Michael Dennis, Stuart Russell

ICML 2023

Paper Thread

Uni[MASK]: Unified Inference in Sequential Decision Problems

Micah Carroll, Orr Paradise, Jessy Lin, Raluca Georgescu, Mingfei Sun, David Bignell, Stephanie Milani, Katja Hofmann, Matthew Hausknecht, Anca Dragan, Sam Devlin

NeurIPS 2022 (Oral)

Paper Thread Code

Estimating and Penalizing Induced Preference Shifts in Recommender Systems

Micah Carroll, Dylan Hadfield-Menell, Stuart Russell, Anca Dragan

ICML 2022 (Spotlight)

Paper Thread

Optimal Behavior Prior: Improving Human-AI Collaboration Through Generalizable Human Models

Mesut Yang, Micah Carroll, Anca Dragan

Human-in-the-loop Learning (HILL) Workshop, NeurIPS 2022

Paper Thread Code

Time-Efficient Reward Learning via Visually Assisted Cluster Ranking

David Zhang, Micah Carroll, Andreea Bobu, Anca Dragan