Jian-Hui Duan

却顾所来径,苍苍横翠微 ("I look back at the path I came by: deep green, it lies across the hazy hillside." Li Bai)

I am an algorithm researcher on the ByteDance Seed LLM team, where I work mainly on training algorithms and data mining. I received a Master of Engineering degree in Computer Science from Nanjing University in 2022 and a Bachelor of Science degree in Computer Science from the same university in 2019.

Research Interests

My research focuses on machine learning and data mining methods with demonstrable theoretical foundations. I am also passionate about efficient training approaches for deep learning. My main research areas therefore cover the following topics:

  • Exploration of training optimization and training methods
  • High-quality training data mining
  • The impact of data distribution shift on deep learning
  • The integration of fundamental theories of machine learning with ultra-large-scale training
  • Reasoning abilities of LLMs/VLMs and their generalization

Selected Publications

  • Jian-Hui Duan, Wenzhong Li, Derun Zou, Ruichen Li, Sanglu Lu, Federated Learning with Data-Agnostic Distribution Fusion, The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2023), Vancouver, Canada, Jun 18-22, 2023.
  • Jian-Hui Duan, Wenzhong Li, Sanglu Lu, FedDNA: Federated Learning with Decoupled Normalization-Layer Aggregation for Non-IID Data, European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML-PKDD 2021), Bilbao, Spain, Sep 13-17, 2021.
  • Jian-Hui Duan, Wenzhong Li, Xiao Zhang, Sanglu Lu, Forecasting fine-grained city-scale cellular traffic with sparse crowdsourced measurements, Computer Networks, 39(2461-2475), Volume 214, pp 1-14, Sep 4 2022.
  • Wangxiang Ding, Wenzhong Li, Zhijie Zhang, Chen Wan, Jianhui Duan, Sanglu Lu, Time-varying Gaussian Markov Random Fields Learning for Multivariate Time Series Clustering, IEEE Transactions on Knowledge and Data Engineering (TKDE), vol. 35, no. 11, Nov 2023.
  • Derun Zou, Xusheng Liu, Lintan Sun, Jian-Hui Duan, Ruichen Li, Yeting Xu, Wenzhong Li, Sanglu Lu, FedMC: Federated Reinforcement Learning on the Edge with Meta-Critic Networks, IEEE International Performance, Computing, and Communications Conference (IPCCC’22), Austin, Texas, USA, November 11-13, 2022.
