Dr. Xiangru Lian (Chinese: 廉相如) is currently a senior research scientist at Kwai Inc. He received his PhD in Computer Science from the University of Rochester in 2019 and his bachelor's degree in Physics from the University of Science and Technology of China in 2015.
His research interests cover many aspects of machine learning and optimization, with an emphasis on big-data scenarios. He has both strong programming skills and a strong theoretical background. He has published important results at top AI conferences such as ICML, NIPS, and AISTATS, including the theoretical justification of asynchronous SGD (NIPS 2015 spotlight paper) and the first decentralized SGD with linear speedup (NIPS 2017 oral paper). He has applied his research to many important real-world industrial problems and developed many software frameworks used in production during his previous internships at IBM T.J. Watson Research Center, Tencent AI Lab, and Kwai Y-Lab.
For more details such as the projects he has done, see his curriculum vitæ.
- View my public key or send me an email containing an encrypted message.
Note that I generally avoid running proprietary software or software with known security concerns, such as WeChat, Telegram, and Skype, on my machines, so I am usually not available on these platforms. When required, I do use them in isolated environments such as systemd-nspawn containers and an Android work profile.
- (In Chinese) How to reduce 95% of the communication cost in training deep neural networks.
- (In Chinese) Kwai Inc.: A new 640x faster GPU-based ad recommendation deep learning system [AI Front] [China Daily] [InfoQ] [CSDN] [AcFun] [Tencent Cloud News], which introduces the new training system for ad recommendation that I designed and implemented.
- IBM: High-Efficiency Distributed Learning for Speech Modeling, where our AD-PSGD algorithm is used to accelerate the training of state-of-the-art speech models.
- New IBM technique cuts AI speech recognition training time from a week to 11 hours, using our AD-PSGD algorithm for distributed training.
- (In Chinese) Tencent Cloud Computing News about our decentralized training research.
- IBM: Distributing Control of Deep Learning Training Delivers 10x Performance Improvement.
- Skymind: An Introduction to Distributed Training of Neural Networks, where our research on staleness-aware ASGD is used in Deeplearning4j.