Hiroki11x

Organizations

@jphacks @rioyokotalab @crest-deep @TITAMAS @RotaPlusPlus @Agents-NY @ArtHackDay-Plus1 @MLHPC @montreal-academy


About

I am a Ph.D. candidate in Computer Science at Université de Montréal and Mila - Quebec AI Institute, advised by Ioannis Mitliagkas.

My research focuses on large-scale optimization and distributed training for deep learning, with an emphasis on high-performance computing and the efficient training of large language models. I am particularly interested in semi-synchronous and large-batch training regimes, including critical batch size analysis, as well as non-Euclidean optimization and the design of scalable optimization algorithms.
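
For readers outside the area: one standard way to make "critical batch size" concrete is the gradient noise scale of McCandlish et al. (2018), sketched below. This is a general formulation from the large-batch training literature, not the specific quantity analyzed in my own papers.

```latex
% Gradient noise scale (McCandlish et al., 2018), a common proxy for the
% critical batch size. \Sigma is the per-example gradient covariance and
% g the full-batch gradient. A textbook formulation, not notation from
% my own papers.
\[
  B_{\mathrm{noise}} = \frac{\operatorname{tr}(\Sigma)}{\lVert g \rVert^{2}}
\]
% Heuristically: for batch sizes well below B_noise, doubling the batch
% roughly halves the number of steps to reach a target loss; well above
% it, larger batches yield diminishing returns.
```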

Previously, I worked on out-of-distribution generalization and confidence calibration, and investigated optimization dynamics in generative models. More recently, my research has expanded to efficient fine-tuning and fairness in large language models, bridging theoretical insights with practical large-scale systems.

Most recently, I completed a research internship at Meta Superintelligence Lab - Infra (Menlo Park), working on large-batch training and optimization for foundation models. I have also been a Student Researcher at Google DeepMind (Mountain View) and a Research Intern at Microsoft Research (Redmond).

I am a recipient of the Masason Foundation Fellowship and the RBC Borealis Fellowship. I received my B.Sc. (2017) and M.Sc. (2019) from the Tokyo Institute of Technology, graduating as Valedictorian.

CV (Jan 2026) / Résumé (Jan 2026) / Biography

Pinned

  1. horovod/horovod

     Distributed training framework for TensorFlow, Keras, PyTorch, and Apache MXNet. (A minimal usage sketch follows this list.)

     Python · 14.7k stars · 2.3k forks

  2. LossLandscapeGeometry

     No Wrong Turns: The Simple Geometry Of Neural Networks Optimization Paths (ICML 2024)

     Shell · 8 stars

  3. Optimizer_Comparison_OOD

     Empirical Study on Optimizer Selection for Out-of-Distribution Generalization (TMLR 2023)

     Python · 5 stars

  4. ConjugateGradient_GAN

     Conjugate Gradient Method for Generative Adversarial Networks (AISTATS 2023)

     Python · 4 stars · 1 fork

  5. Timm_OOD_Calibration

     An Empirical Study of Pre-trained Model Selection for Out-of-Distribution Generalization and Calibration (TMLR 2025)

     Python · 3 stars

  6. Pseudo-Asynchronous-LocalSGD

     Pseudo-Asynchronous Local SGD: Robust and Efficient Data-Parallel Training (TMLR 2025)

     Python · 1 star
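
As noted in the Horovod entry above, here is a minimal sketch of data-parallel training with Horovod's PyTorch binding. It is illustrative only: the model, data, and hyperparameters are placeholders of my choosing, not code from the repository.

```python
# Minimal Horovod + PyTorch data-parallel sketch (illustrative only;
# the model and data are placeholders, not taken from the repo).
import torch
import torch.nn.functional as F
import horovod.torch as hvd

hvd.init()                                   # one process per worker
if torch.cuda.is_available():
    torch.cuda.set_device(hvd.local_rank())  # pin each process to a GPU

model = torch.nn.Linear(32, 1)               # placeholder model
if torch.cuda.is_available():
    model.cuda()

# Common convention: scale the learning rate by the number of workers.
opt = torch.optim.SGD(model.parameters(), lr=0.01 * hvd.size())

# Wrap the optimizer so gradients are averaged across workers via allreduce.
opt = hvd.DistributedOptimizer(opt, named_parameters=model.named_parameters())

# Start all workers from identical model and optimizer state.
hvd.broadcast_parameters(model.state_dict(), root_rank=0)
hvd.broadcast_optimizer_state(opt, root_rank=0)

for step in range(100):
    x, y = torch.randn(64, 32), torch.randn(64, 1)  # placeholder batch
    if torch.cuda.is_available():
        x, y = x.cuda(), y.cuda()
    opt.zero_grad()
    loss = F.mse_loss(model(x), y)
    loss.backward()                          # local gradients
    opt.step()                               # allreduce happens here
    if hvd.rank() == 0 and step % 20 == 0:
        print(f"step {step}: loss {loss.item():.4f}")
```

Run with, e.g., `horovodrun -np 4 python train.py`: each of the four workers computes gradients on its own minibatch, and Horovod averages them with ring-allreduce before the update.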