Chengshuai Shi

picture 

Postdoctoral Fellow

Princeton Language and Intelligence, Princeton University

Email: cs1083 at princeton dot edu

Phone: 434-218-9860

Google Scholar Profile

I am currently a Postdoctoral Fellow at the Princeton Language and Intelligence (PLI) initiative at Princeton University, where I work closely with Professor Chi Jin, Professor Karthik Narasimhan, and Professor Danqi Chen. Prior to joining PLI, I worked for a year as a Senior Machine Learning Research Engineer in the AI group at Bloomberg, New York City.

I received my Ph.D. in Electrical Engineering from the University of Virginia in 2024, where I was advised by Professor Cong Shen. During my Ph.D. (2021–2024), I was honored to be supported by the Bloomberg Data Science Ph.D. Fellowship.

My research interests lie in machine learning, with a focus on intelligent decision-making. Specifically, I work on:

  • Foundational principles in areas such as reinforcement learning, multi-armed bandits, game theory, and multi-agent systems;

  • Applications in emerging fields including wireless communication, recommender systems, and large language models.

News

  • 04/2026: Two papers accepted to ICML 2026!

    • “f-Divergence Regularized RLHF: Two Tales of Sampling and Unified Analyses”: This work provides the first unified theory and provably efficient online RLHF algorithms for general f-divergence regularization, covering alternatives beyond reverse KL such as forward KL and chi-squared divergence.

    • “SMILE: Extended Deep Submodular Function-Based Instruction and In-context Learning Demonstration Selection”: This work proposes SMILE, a framework that jointly optimizes instructions and ICL demonstrations by modeling their interactions with a submodular surrogate, yielding more robust prompt optimization than tuning each component separately.

  • 01/2026: One paper accepted to ICLR 2026!

    • “Efficient Multi-objective Prompt Optimization via Pure-exploration Bandits”: This work studies multi-objective prompt optimization under a budget constraint using tools from pure-exploration bandits. It extends the framework introduced in our previous work (NeurIPS 2024), which established a connection between bandit algorithms and prompt optimization.

  • 09/2025: One paper accepted to NeurIPS 2025!

    • “Greedy Sampling Is Provably Efficient For RLHF”: We show that, under both the Bradley–Terry model and a more general preference model, greedy sampling based on empirical estimates is provably efficient for RLHF with the KL-regularized objective.