I am a research scientist at Meta, building next-generation language models, multimodal models, and generative AI. Previously, I completed a PhD in AI at Stanford, advised by Percy Liang, Jure Leskovec, and Chris Manning, and worked at Google DeepMind. I am interested in building multimodal foundation models that can assist humans in diverse tasks. In particular, I work on: Multimodal understanding