Biography
Hi there! I am Hanmin Li, a PhD candidate in Computer Science at KAUST under the supervision of Prof. Peter Richtárik. My research lies at the intersection of optimization and large language models (LLMs), with a focus on training efficiency and scalability. I am also interested in distributed training and the theoretical foundations of learning from decentralized data.
More broadly, my interests extend to the theory of modern machine learning, including first-order methods, convex and non-convex optimization, and operator theory, as well as applied areas such as deep learning and language modeling.
Before starting my PhD, I earned my master's degree in Computer Science, also at KAUST, after completing a B.S. in Computer Science and Technology at the School of the Gifted Young at the University of Science and Technology of China (USTC).
Currently, I am working on:
- Distributed training of large language models (LLMs), including experience with large-scale GPU clusters and training with PyTorch Distributed Data Parallel (DDP); a minimal DDP sketch follows this list.
- Efficient optimizer design for large-scale training, with a focus on advancing the Muon optimizer and its variants to achieve faster convergence and improved scalability.
- Designing efficient algorithms for LLMs, combining theoretical analysis with empirical validation.
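For illustration, here is a minimal PyTorch DDP training sketch of the kind of setup referred to above. It is a generic example, assuming a `torchrun` launch, with a placeholder linear model and synthetic data rather than code from my actual projects.

```python
# Minimal PyTorch DDP training sketch (illustrative only).
# Assumes launch via: torchrun --nproc_per_node=<num_gpus> ddp_sketch.py
# The model and data below are placeholders, not from my actual projects.
import os

import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP
from torch.utils.data import DataLoader, DistributedSampler, TensorDataset


def main():
    # torchrun sets RANK, LOCAL_RANK, and WORLD_SIZE for each process.
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    # Placeholder model wrapped in DDP, plus synthetic data.
    model = torch.nn.Linear(128, 1).cuda(local_rank)
    model = DDP(model, device_ids=[local_rank])
    data = TensorDataset(torch.randn(1024, 128), torch.randn(1024, 1))
    sampler = DistributedSampler(data)  # shards the data across ranks
    loader = DataLoader(data, batch_size=32, sampler=sampler)

    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
    loss_fn = torch.nn.MSELoss()

    for epoch in range(2):
        sampler.set_epoch(epoch)  # reshuffle the shards each epoch
        for x, y in loader:
            x, y = x.cuda(local_rank), y.cuda(local_rank)
            optimizer.zero_grad()
            loss = loss_fn(model(x), y)
            loss.backward()  # gradients are all-reduced across ranks here
            optimizer.step()

    dist.destroy_process_group()


if __name__ == "__main__":
    main()
```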
For any inquiries, feel free to contact me at hanmin.li@kaust.edu.sa.
I am currently open to internship opportunities in related areas.
Recent News
- Attending NeurIPS 2024 — Dec 16, 2024: This year, I will be attending NeurIPS in Vancouver, Canada.
- Attending EUROPT 2024 — Jun 06, 2024: I have been invited to give a talk at the 21st Conference on Advances in Continuous Optimization (EUROPT 2024) in Lund, Sweden.
- Attending ICLR 2024 — May 07, 2024: This year, I will be attending ICLR in Vienna, Austria.
Papers
- The Ball-Proximal (=“Broximal”) Point Method: a New Algorithm, Convergence Theory, and Applications, arXiv preprint. • [paper] • [BibTex]
- The Power of Extrapolation in Federated Learning, NeurIPS 2024. • [paper] • [BibTex]
- On the Convergence of FedProx with Extrapolation and Inexact Prox, NeurIPS 2024 OPT-ML Workshop Poster. • [paper] • [BibTex]
- Det-CGD: Compressed Gradient Descent with Matrix Stepsizes for Non-Convex Optimization, ICLR 2024. • [paper] • [BibTex]
- Variance reduced distributed non-convex optimization using matrix stepsizes, NeurIPS 2023 FL@FM Workshop. • [paper] • [BibTex]
- SD2: spatially resolved transcriptomics deconvolution through integration of dropout and spatial information, Bioinformatics. • [paper] • [BibTex]
Talks
- Poster presentation of On the Convergence of FedProx with Extrapolation and Inexact Prox — Dec 15, 2024, NeurIPS 2024 OPT-ML Workshop, Vancouver, Canada
- Poster presentation of The Power of Extrapolation in Federated Learning — Dec 11, 2024, NeurIPS 2024, Vancouver, Canada
- Talk on Det-CGD: Compressed Gradient Descent with Matrix Stepsizes for Non-Convex Optimization — Jun 27, 2024, EUROPT 2024, Lund, Sweden
- Poster presentation of Det-CGD: Compressed Gradient Descent with Matrix Stepsizes for Non-Convex Optimization — May 07, 2024, ICLR 2024, Vienna, Austria
Reviewer Services
- NeurIPS '24, '25
- NeurIPS OPT-ML Workshop '24
- ICLR '25
- ICML '25
- JMLR
- IEEE TNNLS
- IEEE TSP
- Optimization Methods and Software