Senthil Purushwalkam

Senior Research Scientist

Salesforce AI Research
Palo Alto, CA

About Me:

Hello! I am a Senior AI Research Scientist at Salesforce, specializing in Computer Vision and NLP.

At Salesforce, my work includes:

  • Developing cutting-edge large multimodal models
  • Improving the fidelity and controllability of visual generation models
  • Enhancing NLP models for RAG systems
  • Minimizing hallucinations in LLMs and multimodal models
  • Creating tailored language and multimodal solutions for enterprise applications

Before joining Salesforce, I received a PhD from Carnegie Mellon University, where I worked with Prof. Abhinav Gupta. I have worked on a diverse range of topics in Computer Vision, Robotics and NLP. Check out my papers below and reach out to me if you would like to chat!

Education

  • PhD in Robotics, 2022

    Robotics Institute, Carnegie Mellon University

  • MS in Robotics, 2017

    Robotics Institute, Carnegie Mellon University

  • BTech in Electronics and Electrical Engineering, 2014

    Indian Institute of Technology (IIT), Guwahati

News

  • 22 August 2024

    Released an open-source video generation model - xGen-VideoSyn
  • 16 August 2024

    Released an open-source, state-of-the-art multimodal model - BLIP-3 (xGen-MM)
  • 25 January 2024

    Paper on personalized image generation - BootPIG
  • 21 November 2023

    CVPR paper on aligning diffusion models to human preferences.
  • 15 October 2023

    Paper on image-to-3D generation accepted at NeurIPS 2023. Stop by our poster to chat!

Publications


Comprehensive evaluation of VLMs using dense scene graphs

Trust but Verify: Programmatic VLM Evaluation in the Wild


Benchmark for evaluating context-faithfulness of LLMs

FaithEval: Can Your Language Model Stay Faithful to Context, Even If "The Moon is Made of Marshmallows"


State-of-the-art LLM for RAG, outperforms GPT-4o, Claude-3.5-sonnet, Command R+

SFR-RAG: Towards Contextually Faithful LLMs


Latest open-source video generation model

xGen-VideoSyn-1: High-fidelity Text-to-Video Synthesis with Compressed Representations


Open-source Multimodal LM, State-of-the-art under 5B parameters

xGen-MM (BLIP-3): A Family of Open Large Multimodal Models



Diffusion Model Alignment Using Direct Preference Optimization






Audio-Visual Floorplan Reconstruction







Stochastic Multiple Choice Learning for Training Diverse Deep Ensembles


Why M Heads are Better than One: Training a Diverse Ensemble of Deep Networks


Combining the Best of Graphical Models and ConvNets for Semantic Segmentation


Automatic Segmentation of Adipose Tissue from Thigh Magnetic Resonance Images


Research Experience

 
 
 
 
 

Senior Research Scientist

Salesforce AI Research

Aug 2024 – Present, Palo Alto
 
 
 
 
 

Research Scientist

Salesforce AI Research

Jun 2022 – Aug 2024, Palo Alto
 
 
 
 
 

Visiting Researcher

Facebook AI Research

Aug 2020 – Dec 2020, Pittsburgh
 
 
 
 
 

Research Intern

Facebook AI Research

May 2020 – Aug 2020, Austin (remotely from Pittsburgh)
 
 
 
 
 

Research Intern

Facebook AI Research

Sep 2018 – Dec 2018, New York
 
 
 
 
 

Research Intern

Adobe Research

May 2017 – Aug 2017, San Francisco
Advisor: Bryan Russell
 
 
 
 
 

Graduate Research Assistant

Robotics Institute, Carnegie Mellon University

Aug 2015 – May 2022, Pittsburgh
Advisor: Abhinav Gupta
 
 
 
 
 

Research Intern

Machine Learning & Perception Group, Virginia Tech

Oct 2014 – Jun 2015, Blacksburg
Advisor: Dhruv Batra
 
 
 
 
 

Research Intern

Machine Intelligence Laboratory, University of Tokyo

May 2014 – Aug 2014, Tokyo
 
 
 
 
 

Research Intern

Machine Learning & Perception Group, Virginia Tech

May 2013 – Aug 2013, Blacksburg