Chengshuai Shi

picture 

Senior Researcher
Bloomberg AI, New York, NY

Ph.D. in Electrical Engineering (Graduated in 2024)
University of Virginia, Charlottesville, VA
Advisor: Prof. Cong Shen

Email (for Bloomberg-related contacts): cshi128 at bloomberg dot net
Email (for general contacts): cs7ync at virginia dot edu

Phone: 434-218-9860

About this picture: My mother drew this “Dr. Cat” wearing a graduation cap when I was in high school. It is one of my greatest joys to have my family witness me wearing one for real as I graduate with a doctoral degree.

Google Scholar Profile

Research Interest

I am broadly interested in machine learning research, with a primary focus on reinforcement learning (RL). My current work explores emerging topics such as foundation models, particularly in-context RL and RL from human feedback (RLHF). My previous research concentrated on multi-agent RL systems (including multi-armed bandits (MAB)), emphasizing both theoretical foundations and practical applications in fields like wireless communication and recommender systems. Additionally, I have a strong interest in distributed learning and federated learning.

News

  • 09/2024: Three papers accepted to NeurIPS 2024!

    • [Co-first-authored] “Efficient Prompt Optimization Through the Lens of Best Arm Identification”: This work establishes a systematic relationship between prompt optimization and best arm identification in MAB. Building on this relationship, we propose a general framework to efficiently perform prompt optimization. This is joint work with Kun Yang (UVA), Zihan Chen (UVA), Prof. Jundong Li (UVA), Prof. Jing Yang (PSU), and Prof. Cong Shen (UVA).

      • A preliminary version appears in ICLR 2024 Workshop on Mathematical and Empirical Understanding of Foundation Models, May 2024.

    • [First-authored] “Transformers as Game Players: Provable In-context Game-playing Capabilities of Pre-trained Models”: We provide a theoretical understanding of, and empirical evidence for, the in-context learning capabilities of pre-trained transformers in multiplayer competitive games. This is joint work with Kun Yang (UVA), Prof. Jing Yang (PSU), and Prof. Cong Shen (UVA).

      • A preliminary version appears in ICLR 2024 Workshop on Generative Models for Decision Making, May 2024.

    • “Mixture of Demonstrations for In-Context Learning”: We provide a novel framework to effectively select demonstrations for prompts used in in-context learning, leveraging an expert-wise training strategy. This is joint work with Song Wang (UVA), Zihan Chen (UVA), Prof. Cong Shen (UVA), and Prof. Jundong Li (UVA).

  • 09/2024: New preprint available!

    • Our work “Building Math Agents with Multi-Turn Iterative Preference Learning” is available on arXiv. In this work, we propose a multi-turn direct preference learning framework to enhance the mathematical reasoning capabilities of large language models (LLMs), with its superiority demonstrated by both theoretical guarantees and empirical results. This is joint work with many amazing researchers from UIUC, Princeton, Google Research, and Google DeepMind.

  • 06/2024: One paper accepted to Transactions on Machine Learning Research (TMLR)!

    • [First-authored] Our paper “Harnessing the Power of Federated Learning in Federated Contextual Bandits” has been accepted to Transactions on Machine Learning Research. It provides a novel federated contextual bandits design capable of flexibly incorporating federated learning protocols. This is joint work with Ruida Zhou (UCLA), Kun Yang (UVA), and Prof. Cong Shen (UVA).

      • A preliminary version appears in NeurIPS 2023 Workshop on Multi-Agent Security, Dec. 2023.

  • 11/2023: One paper accepted to IEEE Transactions on Signal Processing!

    • [First-authored] Our paper “Reward Teaching for Federated Multi-Armed Bandits” has been accepted to IEEE Transactions on Signal Processing. It leverages implicit reward adjustments to guide autonomous bandit agents in a federated multi-armed bandit system. This is joint work with Wei Xiong (UIUC), Prof. Cong Shen (UVA), and Prof. Jing Yang (PSU).