Yuekai Sun

Curriculum Vitae

Education

2010–2015
PhD Computational & Mathematical Engineering, Stanford University
Thesis: Regularization in high-dimensional statistics
2006–2010
BA Computational and Applied Mathematics (CAAM), Rice University

Academic appointments

2025–
Researcher, MBZUAI Institute of Foundation Models (IFM)
2025–
MS Data Science Program Director, University of Michigan
2016–
Assistant → Associate Professor of Statistics, University of Michigan
Promoted to Associate Professor (with tenure) in 2023
2017–2018
Visiting Scholar, Stanford University
2015–2016
Neyman Visiting Assistant Professor, UC Berkeley

Other experience

2025–
Academic Collaborator, Meta
2013
Research Intern, Technicolor Research & Innovation
2008–2009
Teaching Fellow, Breakthrough Houston

Honors and awards

2017
ASQ Jack Youden Prize
2015
Stanford ICME Super Hero Award
2010
Stanford School of Engineering Graduate Fellowship
2009
Rice CAAM Chevron Prize
2009
Rice Engineering Louis J Walsh Scholarship
2008
Rice Engineering WM Moody Jr Scholarship

Grants

Sep 2024–Aug 2027
CIF: Small: Learning in Strategic Environments with Applications in Algorithmic Fairness
NSF CCF 2414918. Award amount: $544k.
Jul 2024–Jun 2025
OpenAI Superalignment Fast Grant: A Mathematical Theory of Weak-to-strong Generalization
Award amount: $102k.
Aug 2022–Jul 2024
Confident Learning with Uncertainty Estimates
DARPA HR00112290111. PI: Elizabeth Hou. My/Total award amount: $184k/$1M.
Aug 2021–Jul 2024
A Transfer Learning Approach to Algorithmic Fairness
NSF DMS 2113373. Award amount: $150k.
Sep 2021–Aug 2024
ATD: Algorithmic Threat Detection and Mitigation with Robust Machine Learning
NSF DMS 2027737. Award amount: $330k.
Sep 2019–Aug 2022
Integrative Analysis of Heterogeneous Datasets with High-Dimensional and Non-Standard Models
NSF DMS 1916271. Award amount: $180k.
Aug 2018–Jul 2021
ATD: Collaborative Research: Statistically Principled Real-Time Detection of Anomalies in Temporal Network Data
NSF DMS 1830247. My/Total award amount: $75k/$200k.

Professional activities

2023–
Co-editor (with Samory Kpotufe and Richard Samworth), Statistical Science Special Issue on “Learning Across Distributions”
2023
Organizing committee member, 2023 ICSA Applied Statistics Symposium
2022–
Associate Editor, Statistical Science

Teaching

University of Michigan

Win 2020–23, 25
STATS 606: Computation and Optimization Methods in Statistics
This course combines STATS 608a and STATS 607b.
Fall 2024, 22, 18
STATS 415: Data Mining and Machine Learning
Win 2022
STATS 701: Topics in Algorithmic Fairness
Fall 2021, 17, 16; Win 17
STATS 413: Applied Regression Analysis
Fall 20
STATS 451: Bayesian Data Analysis
Fall 18, 17
STATS 608a: Optimization Methods in Statistics
Sum 18
VE 488: Data Mining and Machine Learning
This course was offered at the UM-SJTU Joint Institute; it is identical to STATS 415.
Win 19, 18
STATS 607b: Numerical Methods in Statistics

UC Berkeley

Spr 2016
STAT 153: Introduction to Time Series
Fall 2015
STAT 201B: Introduction to Statistics at an Advanced Level

PhD students

2024–
Jiwoo Han (co-advised with Mouli Banerjee)
2023–
Andrej Leban
2021–
Seamus Somerstep (co-advised with Ya’acov Ritov)
2021–
Daniele Bracale (co-advised with Mouli Banerjee)
2021–
Felipe Maia Polo (co-advised with Mouli Banerjee)
2020–2025
Pramit Das (co-advised with Mouli Banerjee)
Thesis: Generative Machine Learning, Granger Causality, and Optimal Intervention in Self-exciting Spatiotemporal Processes
Pramit is an operations research analyst at American Airlines.
2018–2024
Songkai Xue
Thesis: Advances in Machine Learning Safety
Songkai is a research scientist at Huawei.
2018–2024
Subha Maity (co-advised with Mouli Banerjee)
Thesis: An Exploration of the Statistical Challenges and Fairness Implications of Transfer Learning
Subha is a faculty member at the University of Waterloo.
2016–2022
Laura Niss (co-advised with Ambuj Tewari)
Thesis: Topics in Sequential Decision Making and Algorithmic Fairness
Laura is a member of the technical staff at MIT Lincoln Laboratory.
2014–2020
Roger Fan (co-advised with Shuheng Zhou)
Thesis: Covariance Estimation with Missing and Dependent Data
Roger is a research scientist at Amazon.
2014–2019
Ruofei Zhao
Thesis: Convergence and Consistency Results in Spectral Clustering and Gaussian Mixture Models
Ruofei is a quantitative researcher at Sunrise Futures LLC.

Papers

See Google Scholar for up-to-date citation metrics.

Conference papers

Bridging Human and LLM Judgments: Understanding and Narrowing the Gap
F Maia Polo, X Wang, M Yurochkin, G Xu, M Banerjee, Y Sun. NeurIPS 2025.

Sloth: scaling laws for LLM skills to predict multi-benchmark performance across families
F Maia Polo, S Somerstep, L Choshen, Y Sun, M Yurochkin. NeurIPS 2025.

Microfoundation Inference for Strategic Prediction
D Bracale, S Maity, F Maia Polo, S Somerstep, M Banerjee, Y Sun. AISTATS 2025.

Learning the Distribution Map in Reverse Causal Performative Prediction
D Bracale, S Maity, S Somerstep, M Banerjee, Y Sun. AISTATS 2025.

LiveXiv — A Multi-Modal Live Benchmark Based on Arxiv Papers Content
N Shabtay, F Maia Polo, S Doveh, W Lin, J Mirza, L Choshen, M Yurochkin, Y Sun, A Arbelle, L Karlinsky, R Giryes. ICLR 2025.

A transfer learning framework for weak-to-strong generalization
S Somerstep, F Maia Polo, M Banerjee, Y Ritov, M Yurochkin, Y Sun. ICLR 2025.

Distributionally robust performative prediction
S Xue, Y Sun. NeurIPS 2024.

Efficient multi-prompt evaluation of LLMs
F Maia Polo, R Xu, L Weber, M Silva, O Bhardwaj, L Choshen, A Oliveira, Y Sun, M Yurochkin. NeurIPS 2024.

Weak Supervision Performance Evaluation via Partial Identification
F Maia Polo, M Yurochkin, M Banerjee, S Maity, Y Sun. NeurIPS 2024.

Aligners: Decoupling LLMs and Alignment
L Ngweta, M Agarwal, S Maity, A Gittens, Y Sun, M Yurochkin. EMNLP Findings 2024.
A short version appeared as an ICLR 2024 TinyPaper.

Prompt Exploration with Prompt Regression
M Feffer, R Xu, Y Sun, M Yurochkin. COLM 2024.

Large Language Model Routing with Benchmark Datasets
T Shnitzer, A Ou, M Silva, K Soule, Y Sun, J Solomon, N Thompson, M Yurochkin. COLM 2024.

tinyBenchmarks: evaluating LLMs with few examples
F Maia Polo, L Weber, L Choshen, Y Sun, G Xu, M Yurochkin. ICML 2024.

Algorithmic Fairness in Performative Policy Learning: Escaping the Impossibility of Group Fairness
S Somerstep, Y Ritov, Y Sun. FAccT 2024.

Learning in reverse causal strategic environments with ramifications on two-sided markets
S Somerstep, Y Sun, Y Ritov. ICLR 2024.

Fusing Models with Complementary Expertise
H Wang, F Maia Polo, Y Sun, S Kundu, E Xing, M Yurochkin. ICLR 2024.

An Investigation of Representation and Allocation Harms in Contrastive Learning
S Maity, M Agarwal, M Yurochkin, Y Sun. ICLR 2024.

Conditional independence testing under misspecified inductive biases
F Maia Polo, Y Sun, M Banerjee. NeurIPS 2023.

Simple Disentanglement of Style and Content in Visual Representations
L Ngweta, S Maity, A Gittens, Y Sun, M Yurochkin. ICML 2023.

Understanding new tasks through the lens of training data via exponential tilting
S Maity, M Yurochkin, M Banerjee, Y Sun. ICLR 2023.

Predictor-corrector algorithms for stochastic optimization under gradual distribution shift
S Maity, D Mukherjee, M Banerjee, Y Sun. ICLR 2023.

ISAAC Newton: Input-based Approximate Curvature for Newton’s Method
F Petersen, T Sutter, C Borgelt, D Huh, H Kuehne, Y Sun, O Deussen. ICLR 2023.

Calibrated Data-Dependent Constraints with Exact Satisfaction Guarantees
S Xue, M Yurochkin, Y Sun. NeurIPS 2022.

Domain Adaptation meets Individual Fairness. And they get along.
D Mukherjee, F Petersen, M Yurochkin, Y Sun. NeurIPS 2022.

Post-processing for Individual Fairness
F Petersen, D Mukherjee, Y Sun, M Yurochkin. NeurIPS 2021.

On sensitivity of meta-learning to support data
M Agarwal, M Yurochkin, Y Sun. NeurIPS 2021.

Does enforcing fairness mitigate biases caused by subpopulation shift?
S Maity, D Mukherjee, M Yurochkin, Y Sun. NeurIPS 2021.

Outlier-Robust Optimal Transport
D Mukherjee, A Guha, J Solomon, Y Sun, M Yurochkin. ICML 2021.

Statistical Inference for Individual Fairness
S Maity, S Xue, M Yurochkin, Y Sun. ICLR 2021.

Individually Fair Rankings
A Bower, H Eftekhari, M Yurochkin, Y Sun. ICLR 2021.

Individually fair gradient boosting
A Vargo, F Zhang, M Yurochkin, Y Sun. ICLR 2021.

SenSeI: Sensitive Set Invariance for Enforcing Individual Fairness
M Yurochkin, Y Sun. ICLR 2021.

Two simple ways to learn individual fairness metrics from data
D Mukherjee, M Yurochkin, M Banerjee, Y Sun. ICML 2020.

Auditing ML models for individual bias and unfairness
S Xue, M Yurochkin, Y Sun. AISTATS 2020.

Federated Learning with Matched Averaging
H Wang, M Yurochkin, Y Sun, D Papailiopoulos, Y Khazaeni. ICLR 2020.

Training individually fair machine learning models with Sensitive Subspace Robustness
M Yurochkin, A Bower, Y Sun. ICLR 2020.

Dirichlet Simplex Nest and Geometric Inference
M Yurochkin, A Guha, Y Sun, XL Nguyen. ICML 2019.

Precision Matrix Estimation with Noisy and Missing Data
R Fan, B Jang, Y Sun, S Zhou. AISTATS 2019.

Debiasing representations by removing unwanted variation due to protected attributes
A Bower, L Niss, Y Sun, A Vargo. FAT/ML 2018.

Feature-distributed sparse regression: a screen-and-clean approach
J Yang, MW Mahoney, M Saunders, Y Sun. NIPS 2016.

Evaluating the statistical significance of biclusters
JD Lee, Y Sun, JE Taylor. NIPS 2015.

Learning Mixtures of Linear Classifiers
Y Sun, S Ioannidis, A Montanari. ICML 2014.

On model selection consistency of regularized M-estimators
JD Lee, Y Sun, JE Taylor. NIPS 2013.
A journal version appeared in the Electronic Journal of Statistics in 2015.

Proximal Newton-type methods for minimizing composite functions
JD Lee, Y Sun, MA Saunders. NIPS 2012.
A journal version appeared in the SIAM Journal on Optimization in 2014.

Journal papers

How does overparametrization affect performance on minority groups?
S Maity, S Roy, S Xue, M Yurochkin, Y Sun. Transactions on Machine Learning Research (2025).

Dynamic Pricing in the Linear Valuation Model using Shape Constraints
D Bracale, M Banerjee, Y Sun, K Stoll, S Turki. Transactions on Machine Learning Research (2025).

A linear adjustment based approach to posterior drift in transfer learning
S Maity, D Dutta, J Terhorst, Y Sun, M Banerjee. Biometrika (2024).

Minimax optimal approaches to the label shift problem
S Maity, Y Sun, M Banerjee. Journal of Machine Learning Research (2022).

Meta-analysis of heterogeneous data: integrative sparse regression in high-dimensions
S Maity, Y Sun, M Banerjee. Journal of Machine Learning Research (2022).

Matrix Completion Methods for the Total Electron Content Video Reconstruction
H Sun, Z Hua, J Ren, S Zou, Y Sun, Y Chen. Annals of Applied Statistics (2022+).

Uniform bounds for invariant subspace perturbations
A Damle, Y Sun. SIAM Journal on Matrix Analysis and Applications (2020).

Statistical convergence of the EM algorithm on Gaussian mixture models
R Zhao, Y Li, Y Sun. Electronic Journal of Statistics (2020).

Creation and analysis of biochemical constraint-based models using the COBRA Toolbox v. 3.0
L Heirendt et al. Nature Protocols (2019).

A geometric approach to archetypal analysis and nonnegative matrix factorization
A Damle, Y Sun. Technometrics (2017).

Communication-efficient sparse regression
JD Lee, Q Liu, Y Sun, JE Taylor. Journal of Machine Learning Research (2017).

Exact post-selection inference, with application to the lasso
JD Lee, DL Sun, Y Sun, JE Taylor. Annals of Statistics (2016).

Do genome‐scale models need exact solvers or clearer standards?
A Ebrahim et al. Molecular Systems Biology (2015).

Systems biology definition of the core proteome of metabolism and expression is consistent with high-throughput data
L Yang et al. Proceedings of the National Academy of Sciences (2015).

On model selection consistency of regularized M-estimators
JD Lee, Y Sun, JE Taylor. Electronic Journal of Statistics (2015).
A conference version appeared at NIPS 2013.

Proximal Newton-type methods for minimizing composite functions
JD Lee, Y Sun, MA Saunders. SIAM Journal on Optimization (2014).
A conference version appeared at NIPS 2012.

Humidity effects on anisotropic nanofriction behaviors of aligned carbon nanotube carpets
J Zhang, H Lu, Y Sun, L Ci, PM Ajayan, J Lou. ACS Applied Materials & Interfaces (2013).

Robust flux balance analysis of multiscale biochemical reaction networks
Y Sun, RMT Fleming, I Thiele, MA Saunders. BMC Bioinformatics (2013).

Nanostructure on taro leaves resists fouling by colloids and bacteria under submerged conditions
J Ma, Y Sun, K Gleichauf, J Lou, Q Li. Langmuir (2011).

Regular and reverse nanoscale stick-slip behavior: Modeling and experiments
F Landolsi, Y Sun, H Lu, FH Ghorbel, J Lou. Applied Surface Science (2010).

Nanoscale friction dynamic modeling
F Landolsi, FH Ghorbel, J Lou, H Lu, Y Sun. ASME Journal of Dynamic Systems, Measurement & Control (2009).

Friction and adhesion properties of vertically aligned multi-walled carbon nanotube arrays and fluoro-nanodiamond films
H Lu, J Goldman, F Ding, Y Sun, MX Pulikkathara, VN Khabashesku, BI Yakobson, J Lou. Carbon (2008).

Mesoscale reverse stick-slip nanofriction behavior of vertically aligned multiwalled carbon nanotube superlattices
J Lou, F Ding, H Lu, J Goldman, Y Sun, BI Yakobson. Applied Physics Letters (2008).

Patents

Learning Mahalanobis Distance Metrics from Data
M Yurochkin, D Mukherjee, M Banerjee, Y Sun, S Upadhyay. US 2022/0405529 A1.

Book chapters

Communication Efficient Model Fusion
M Yurochkin, Y Sun.
In Federated Learning: A Comprehensive Overview of Methods and Applications. H Ludwig, N Baracaldo (eds). Springer (2022).

Personalization in Federated Learning
M Agarwal, M Yurochkin, Y Sun.
In Federated Learning: A Comprehensive Overview of Methods and Applications. H Ludwig, N Baracaldo (eds). Springer (2022).

Technical reports

K2-V2: A 360-Open, Reasoning-Enhanced LLM
K2 Team: Institute of Foundation Models. IFM Technical Report: 2025-12-02.

On uniform consistency of spectral embeddings
R Zhao, S Xue, Y Sun.

Achieving Representative Data via Convex Hull Feasibility Sampling Algorithms
L Niss, Y Sun, A Tewari.

An inexact subsampled proximal Newton-type method for large-scale machine learning
X Liu, CJ Hsieh, JD Lee, Y Sun.

On conditional parity as a notion of non-discrimination in machine learning
Y Ritov, Y Sun, R Zhao.

Valid post-correction inference for censored regression problems
Y Sun, JE Taylor.

Talks