Tiansheng Wen


There is a crack in everything,

that's how the light gets in.✨

I am Tiansheng Wen, a first-year Ph.D. student at Georgia Tech, advised by Prof. Pan Li.

My research broadly lies in core machine learning, especially sparsity, with applications to LLMs and LLM-driven agentic learning. Recently, I have been mainly focusing on:

  • Token sparsity in on-policy distillation.
  • Sparse and structured representations for edge-friendly scalable retrieval.
  • Agentic learning for structured data.

Before joining Georgia Tech, I spent seven wonderful years at Xidian University for my B.S. and M.S. degrees, where I was fortunate to be supervised by Prof. Bo Chen.

I also spent an incredible year at Stony Brook University working with Prof. Chenyu You. I am deeply grateful to my close collaborators, Prof. Stefanie Jegelka and Yifei Wang, for our work on efficient and scalable sparse methods.

I am actively open to research collaboration. Feel free to reach out!

news

May, 2026 No more K-means! Our paper Single-Stage Sparse Multi-Vector Retrieval (SSR) was accepted to ICML 2026, showing how sparse codes can naturally organize token-level matches for faster, fine-grained retrieval without a separate clustering step. 🎉🎉
Mar, 2026 Thrilled to share that our paper A Non-negative VAE: the Generalized Gamma Belief Network has been accepted by IEEE TPAMI! 🎉🎉
Mar, 2026 A new chapter begins! Beyond excited to join Georgia Institute of Technology as a PhD student this Fall! Atlanta, I'm coming! Go Jackets! 🐝✨
Feb, 2026 One paper accepted at TCSVT! First benchmark for multimodal remote sensing detection under real-world cloud degradations. 🎉🎉
Jan, 2026 New members in the sparsity family! Two papers on ultra-sparse embeddings and sparse feature attention accepted at ICLR 2026. Catch us in Rio de Janeiro! 🌊🐚
Oct, 2025 Joined ByteDance Bandai as a Research Intern, targeting Agentic RL! ⚡⚡
Jul, 2025 The wait is over! CSR is now available in Sentence-Transformer v5.0! 🤗🤗
May, 2025 Our paper Contrastive Sparse Representations (CSR) was selected for an Oral Presentation (Top 1%) at ICML 2025! CSR compresses SOTA 4k-dim embeddings to just 32 active dimensions, achieving ~100× faster retrieval for RAG and vector databases with minimal accuracy loss. 🏆🏆
Mar, 2025 Our recent work, Contrastive Sparse Representation, has drawn considerable interest as a promising alternative for efficient embedding retrieval, and we have been invited to publish the model on Hugging Face and Sentence-Transformer! Code available at Link. 🤗🤗
Feb, 2025 Our paper LanCE has been accepted by CVPR 2025! 🎉🎉
Jul, 2024 Our paper HICE-Score has been accepted by CVPR 2025! 🎉🎉

selected publications

  1. Beyond Matryoshka: Revisiting Sparse Coding for Adaptive Representation
    Tiansheng Wen*, Yifei Wang*, Zequn Zeng, Zhong Peng, Yudi Su, Xinyang Liu, Bo Chen, Hongwei Liu, Stefanie Jegelka, and Chenyu You
    ICML, Oral Presentation (Top 1%), 2025
  2. Scaling Attention via Feature Sparsity
    Yan Xie*, Tiansheng Wen*, Tang Da Huang, Bo Chen, Chenyu You, Stefanie Jegelka, and Yifei Wang
    ICLR, 2026
  3. CSRV2: Unlocking Ultra-Sparse Embeddings
    Lixuan Guo*, Yifei Wang*, Tiansheng Wen*, Yifan Wang, Aosong Feng, Bo Chen, Stefanie Jegelka, and Chenyu You
    ICLR, 2026
  4. A Non-negative VAE: the Generalized Gamma Belief Network
    Zhibin Duan, Tiansheng Wen, Muyao Wang, Bo Chen, and Mingyuan Zhou
    TPAMI, 2026