Chenxu Hu     胡晨旭

Hey, I am Chenxu Hu, currently a Ph.D. student in Computer Science at IIIS, Tsinghua University, advised by Prof. Hang Zhao. I am also a research assistant in the MARS Lab at Tsinghua University.

I am especially interested in multi-modal machine learning and audio & speech processing, including speech synthesis, audio-visual learning, and novel tasks that combine audio, vision, language, and other modalities. My research vision is to enable machines to learn from, reason about, and interact through multi-modal inputs, just like human beings.

Previously, I received my B.E. in Computer Science from Chu Kochen Honors College, Zhejiang University.

E-Mail | Google Scholar | GitHub


* indicates equal contribution


[NEW] ChatDB: Augmenting LLMs with Databases as Their Symbolic Memory
Chenxu Hu*, Jie Fu*, Chenzhuang Du, Simian Luo, Junbo Zhao, Hang Zhao
LLM @ IJCAI 2023
paper | project | code


DIFF-FOLEY: Synchronized Video-to-Audio Synthesis with Latent Diffusion Models
Simian Luo, Chuanhao Yan, Chenxu Hu, Hang Zhao
NeurIPS 2023
paper | project | code


DiCLET-TTS: Diffusion Model based Cross-lingual Emotion Transfer for Text-to-Speech — A Study between English and Mandarin
Tao Li, Chenxu Hu, Jian Cong, Xinfa Zhu, Jingbei Li, Qiao Tian, Yuping Wang, Lei Xie
Transactions on Audio, Speech, and Language Processing (TASLP)
paper | project


ViP3D: End-to-end Visual Trajectory Prediction via 3D Agent Queries
Junru Gu*, Chenxu Hu*, Tianyuan Zhang, Xuanyao Chen, Yilun Wang, Yue Wang, Hang Zhao
CVPR 2023
paper | project | code


Controllable and Lossless Non-Autoregressive End-to-End Text-to-Speech
Zhengxi Liu*, Qiao Tian*, Chenxu Hu*, Xudong Liu, Menglin Wu, Yuping Wang, Hang Zhao, Yuxuan Wang
paper | project


Neural Dubber: Dubbing for Videos According to Scripts
Chenxu Hu, Qiao Tian, Tingle Li, Yuping Wang, Yuxuan Wang, Hang Zhao
NeurIPS 2021
paper | project


CVC: Contrastive Learning for Non-parallel Voice Conversion
Tingle Li*, Yichen Liu*, Chenxu Hu*, Hang Zhao
Interspeech 2021
paper | project | code


FastSpeech 2: Fast and High-Quality End-to-End Text to Speech
Yi Ren*, Chenxu Hu*, Xu Tan, Tao Qin, Sheng Zhao, Zhou Zhao, Tie-Yan Liu
ICLR 2021
paper | project | code

Presentations & Talks
Aug. 2023 Invited talk at JiangMen TechBeat, "ChatDB: Augmenting LLMs with Databases as Their Symbolic Memory" (Talk)
Aug. 2023 Invited paper talk at Symposium on Large Language Models (LLM 2023), "ChatDB: Augmenting LLMs with Databases as Their Symbolic Memory"
Jan. 2022 Invited talk at JiangMen TechBeat, "Neural Dubber: Dubbing for Videos According to Scripts" (Talk)
June 2021 Invited paper talk at Sight and Sound Workshop, CVPR 2021, "Neural Dubber: Dubbing for Silent Videos According to Scripts" (Talk)
May 2021 Poster presentation at ICLR 2021, "FastSpeech 2: Fast and High-Quality End-to-End Text to Speech"

Media Coverage
Jiqizhixin (机器之心) Blog: a Chinese-language news article about ChatDB
Slator Neural Dubber: TikTok Parent Company ByteDance Explores Automated Dubbing
Jiqizhixin (机器之心) Blog: a Chinese-language news article about Neural Dubber
Microsoft Research Blog FastSpeech 2: Fast and High-Quality End-to-End Text to Speech
Microsoft Azure AI Blog Neural Text to Speech extends support to 15 more languages with state-of-the-art AI quality
