Hi! My name is Mingsheng Li (Chinese name: 李铭晟). I am currently a final-year Master’s student in Artificial Intelligence at Fudan University, advised by Prof. Tao Chen. I am also fortunate to work closely with Dr. Hongyuan Zhu from A*STAR, Singapore; Dr. Gang Yu, Dr. Xin Chen, and Dr. Chi Zhang from Tencent; and Dr. Bo Zhang from Shanghai AI Lab. Before this, I received my bachelor’s degree in Electronic Engineering from Fudan University in 2022.

I work in the fields of deep learning and computer vision, with a particular focus on large models, multi-modal learning, and embodied AI. My research aims to develop robust and scalable general-purpose AI systems that solve complex problems.

📣 I am actively looking for researcher / Ph.D. opportunities. Please check out my resume here.

🔥 News

Sep. 2024. 🎉🎉 One paper (3DET-Mamba) is accepted to NeurIPS 2024! (Coming soon)
Jul. 2024. 🎉🎉 M3DBench is accepted to ECCV 2024.
Jun. 2024. 🚀🚀 We release WI3D, the first approach that can generalize well-trained 3D detectors to learn novel classes with the aid of foundation models.
Jun. 2024. 🎉🎉 Our LGD, a new method for lightweight model pre-training, is accepted to T-MM 2024. Code is released now!
Apr. 2024. 🎉🎉 Our state-of-the-art 3D dense captioning method, Vote2Cap-DETR++, is accepted to T-PAMI 2024.
Jul. 2024. 🚀🚀 We release M3DBench, a new 3D instruction-following dataset with interleaved multi-modal prompts and a new benchmark to assess large models across 3D vision-centric tasks.
Feb. 2024. 🎉🎉 Our Large Language 3D Assistant, LL3DA, is accepted to CVPR 2024.

📝 Recent Works

T-MM 2024

WI3D: Weakly Incremental 3D Detection via Vision Foundation Models
T-MM 2024
Mingsheng Li, Sijin Chen, Shengji Tang, Hongyuan Zhu, Yanyan Fang, Xin Chen, Zhuoyuan Li, Fukun Yin, Gang Yu, Tao Chen

paper

Introducing new categories to well-trained 3D detectors with 2D foundation models.

ECCV 2024

M3DBench: Let's Instruct Large Models with Multi-modal 3D Prompts
ECCV 2024
Mingsheng Li, Xin Chen, Chi Zhang, Sijin Chen, Hongyuan Zhu, Fukun Yin, Gang Yu, Tao Chen

project | arXiv | github

Propose a comprehensive 3D instruction-following dataset with support for interleaved multi-modal prompts.

T-MM 2024

Lightweight Model Pre-training via Language Guided Knowledge Distillation
T-MM 2024
Mingsheng Li, Lin Zhang, Mingzhen Zhu, Zilong Huang, Gang Yu, Jiayuan Fan, Tao Chen

paper | arXiv | github

Language-guided distillation enhances model pre-training.

T-PAMI 2024

Vote2Cap-DETR++: Decoupling Localization and Describing for End-to-End 3D Dense Captioning
T-PAMI 2024
Sijin Chen, Hongyuan Zhu, Mingsheng Li, Xin Chen, Peng Guo, Yinjie Lei, Gang Yu, Taihao Li, Tao Chen

paper | arXiv | github

Decoupled feature extraction and task decoding for 3D Dense Captioning.

CVPR 2024

LL3DA: Visual Interactive Instruction Tuning for Omni-3D Understanding, Reasoning, and Planning
CVPR 2024
Sijin Chen, Xin Chen, Chi Zhang, Mingsheng Li, Gang Yu, Hao Fei, Hongyuan Zhu, Jiayuan Fan, Tao Chen

project | arXiv | github | youtube

Propose a Large Language 3D Assistant that responds to both visual interactions and textual instructions in complex 3D environments.

🥇 Awards and Scholarships

2024. National Scholarship (rank 1/244).
2023. 2nd Prize of Graduate Academic Scholarship.
2022. Outstanding Graduate of Fudan University.
2020. 2nd Prize of China Undergraduate Mathematical Contest in Modeling.
2020. STEM (Science, Technology, Engineering, Mathematics) Scholarship.
2019. 1st Prize of Chinese Mathematics Competitions (Top 20).
2019. National Encouragement Scholarship.

📖 Education

Sep. 2022 - Jun. 2025 (expected). Master's student at Fudan University.
Sep. 2018 - Jun. 2022. Bachelor's student at Fudan University.

💻 Internships

2024.01 - Present. Intern Researcher, Shanghai AI Lab, Shanghai, China.