Opportunities
I'm taking a gap year and looking for PhD or internship positions. Please feel free to reach out if you believe I could be a suitable addition to your lab.
Research
I'm interested in large language models and embodied AI.
MiniCPM: Unveiling the Potential of Small Language Models with Scalable Training Strategies
Shengding Hu, Yuge Tu, Xu Han, Chaoqun He, Ganqu Cui, Xiang Long, Zhi Zheng, Yewei Fang, Yuxiang Huang, Weilin Zhao, Xinrong Zhang, Zheng Leng Thai, Kaihuo Zhang, Chongyi Wang, Yuan Yao, Chenyang Zhao, Jie Zhou, Jie Cai, Zhongwu Zhai, Ning Ding, Chao Jia, Guoyang Zeng, Dahai Li, Zhiyuan Liu, Maosong Sun
arXiv, 2024
github / blog / arXiv
Our small LLM with 2.4B non-embedding parameters surpasses Llama-13B / Mistral-7B! Made possible by extensive model wind tunnel experiments for optimal scaling and the Warmup-Stable-Decay (WSD) learning rate scheduler for continuous training.
LEGENT: Open Platform for Embodied Agents
Zhili Cheng, Zhitong Wang, Jinyi Hu, Shengding Hu, An Liu, Yuge Tu, Pengkai Li, Lei Shi, Zhiyuan Liu, Maosong Sun
arXiv, 2024
github / demo / video / arXiv
LEGENT is a wonderful 3D environment for developing communicative and manipulable agents using LLMs and LMMs. Have fun playing in it!
© 2024 Yuge Tu. All rights reserved. Adapted from Jon Barron's source code.