About me

I am interested in audio-visual multi-modal processing with machine learning methods. I received my B.Sc. degree from the Department of Automation at Tsinghua University in 2018, and I am now a master's student in the Department of Computer Science and Technology at Tsinghua University.

Education

2021-Present

Department of Computer Science and Technology

Tsinghua University

I am currently studying for my master's degree, supervised by Prof. Thomas Fang Zheng and Prof. Dong Wang, at the Center for Speech and Language Technologies (CSLT).

2014-2018

Department of Automation

Tsinghua University

Publications

[Published] L.T. Li, X.L. Li, H.Y. Jiang, C. Chen, R.H. Hou, D. Wang. CN-Celeb-AV: A Multi-Genre Audio-Visual Dataset for Person Recognition. In Interspeech 2023 (the 24th INTERSPEECH Conference).

ISCA-Archive; arXiv; Web;

[Published] H.R. Sun, D. Wang, L.T. Li, C. Chen, T.F. Zheng. Random Cycle Loss and Its Application to Voice Conversion. In TPAMI (IEEE Transactions on Pattern Analysis and Machine Intelligence).

IEEEXplore;

[Published] C. Chen, D. Wang, T.F. Zheng. CN-CVS: A Mandarin Audio-Visual Dataset for Large Vocabulary Continuous Visual to Speech Synthesis. In ICASSP 2023 (2023 IEEE International Conference on Acoustics, Speech, and Signal Processing).

IEEEXplore; Web;

[Published] H.R. Sun, C. Chen, L.T. Li, D. Wang. CycleFlow: Purify Information Factors by Cycle Loss. In Odyssey 2022 (The Speaker and Language Recognition Workshop).

ISCA-Archive; arXiv; Demo;

Projects

Video to Speech Synthesis

Python; Deep Learning; Audio-Visual; Lip Reading

This project aims to restore the speech signal from the visual information in lip movements alone. We have collected a large-scale Mandarin audio-visual dataset as the benchmark for this project.

Project Web; Dataset

Skills

Programming Languages: Python, C++, JavaScript

Markup Languages: LaTeX, HTML, Markdown

Deep Learning Frameworks: PyTorch, Lightning

Contact

Email me: chenc21@mails.tsinghua.edu.cn | c-c14@tsinghua.org | chenchen@cslt.org

My Google Scholar Profile

My Github Profile

CSLT