Papers

See the research page for representative papers and Google Scholar for citation metrics.

Preprints

Optimal Intervention for Self-triggering Spatial Networks with Application to Urban Crime Analytics
P Das, M Banerjee, Y Sun.

Learning to Choose or Choosing to Learn: Best-of-N vs. Supervised Fine-Tuning for Bit String Generation
S Somerstep, V Raman, U Subedi, Y Sun.

Optimal Nonlinear Online Learning under Sequential Price Competition via s-Concavity
D Bracale, M Banerjee, C Shi, Y Sun.
/ dbracale / Revenue-Maximization-Under-Sequential-Price-Competition

Likelihood-Free Estimation for Spatiotemporal Hawkes processes with missing data and application to predictive policing
P Das, M Banerjee, Y Sun.
/ pramit2020 / WGAN-for-estimation-in-ST-Hawkes-process-with-missingness

Sloth: scaling laws for LLM skills to predict multi-benchmark performance across families
F Maia Polo, S Somerstep, L Choshen, Y Sun, M Yurochkin.
/ felipemaiapolo / sloth

How does overparametrization affect performance on minority groups?
S Maity, S Roy, S Xue, M Yurochkin, Y Sun.
/ smaityumich / overparameterization

Conference papers

Microfoundation Inference for Strategic Prediction
D Bracale, S Maity, F Maia Polo, S Somerstep, M Banerjee, Y Sun. AISTATS 2025.
/ felipemaiapolo / microfoundation_inference

Learning the Distribution Map in Reverse Causal Performative Prediction
D Bracale, S Maity, S Somerstep, M Banerjee, Y Sun. AISTATS 2025.
/ dbracale / LearningDistributionPP-Code

LiveXiv — A Multi-Modal Live Benchmark Based on Arxiv Papers Content
N Shabtay, F Maia Polo, S Doveh, W Lin, J Mirza, L Choshen, M Yurochkin, Y Sun, A Arbelle, L Karlinsky, R Giryes. ICLR 2025.
/ NimrodShabtay / LiveXiv

A transfer learning framework for weak-to-strong generalization
S Somerstep, F Maia Polo, M Banerjee, Y Ritov, M Yurochkin, Y Sun. ICLR 2025.
/ UMich-FATML / w2s-refinement

Distributionally robust performative prediction
S Xue, Y Sun. NeurIPS 2024.

Efficient multi-prompt evaluation of LLMs
F Maia Polo, R Xu, L Weber, M Silva, O Bhardwaj, L Choshen, A Oliveira, Y Sun, M Yurochkin. NeurIPS 2024.
/ felipemaiapolo / prompteval

Weak Supervision Performance Evaluation via Partial Identification
F Maia Polo, M Yurochkin, M Banerjee, S Maity, Y Sun. NeurIPS 2024.
/ felipemaiapolo / wsbounds

Aligners: Decoupling LLMs and Alignment
L Ngweta, M Agarwal, S Maity, A Gittens, Y Sun, M Yurochkin. EMNLP Findings 2024.
/ lilianngweta / aligners
A short version appeared as an ICLR 2024 TinyPaper.

Prompt Exploration with Prompt Regression
M Feffer, R Xu, Y Sun, M Yurochkin. COLM 2024.
/ UMich-FATML / prompt-regression

Large Language Model Routing with Benchmark Datasets
T Shnitzer, A Ou, M Silva, K Soule, Y Sun, J Solomon, N Thompson, M Yurochkin. COLM 2024.
/ UMich-FATML / llm-routing

tinyBenchmarks: evaluating LLMs with few examples
F Maia Polo, L Weber, L Choshen, Y Sun, G Xu, M Yurochkin. ICML 2024.
/ felipemaiapolo / tinyBenchmarks

Algorithmic Fairness in Performative Policy Learning: Escaping the Impossibility of Group Fairness
S Somerstep, Y Ritov, Y Sun. FAccT 2024.
/ UMich-FATML / fair-performative-prediction

Learning in reverse causal strategic environments with ramifications on two-sided markets
S Somerstep, Y Sun, Y Ritov. ICLR 2024.
/ UMich-FATML / anti-causal-strategic-classification

Fusing Models with Complementary Expertise
H Wang, F Maia Polo, Y Sun, S Kundu, E Xing, M Yurochkin. ICLR 2024.
/ hwang595 / FoE-ICLR2024

An Investigation of Representation and Allocation Harms in Contrastive Learning
S Maity, M Agarwal, M Yurochkin, Y Sun. ICLR 2024.
/ smaityumich / CL-representation-harm

Conditional independence testing under misspecified inductive biases
F Maia Polo, Y Sun, M Banerjee. NeurIPS 2023.
/ felipemaiapolo / cit

Simple Disentanglement of Style and Content in Visual Representations
L Ngweta, S Maity, A Gittens, Y Sun, M Yurochkin. ICML 2023.
/ lilianngweta / PISCO

Understanding new tasks through the lens of training data via exponential tilting
S Maity, M Yurochkin, M Banerjee, Y Sun. ICLR 2023.
/ smaityumich / exponential-tilting

Predictor-corrector algorithms for stochastic optimization under gradual distribution shift
S Maity, D Mukherjee, M Banerjee, Y Sun. ICLR 2023.
/ smaityumich / concept-drift

ISAAC Newton: Input-based Approximate Curvature for Newton’s Method
F Petersen, T Sutter, C Borgelt, D Huh, H Kuehne, Y Sun, O Deussen. ICLR 2023.
/ Felix-Petersen / isaac

Calibrated Data-Dependent Constraints with Exact Satisfaction Guarantees
S Xue, M Yurochkin, Y Sun. NeurIPS 2022.

Domain Adaptation meets Individual Fairness. And they get along.
D Mukherjee, F Petersen, M Yurochkin, Y Sun. NeurIPS 2022.

Post-processing for Individual Fairness
F Petersen, D Mukherjee, Y Sun, M Yurochkin. NeurIPS 2021.
/ Felix-Petersen / fairness-post-processing

On sensitivity of meta-learning to support data
M Agarwal, M Yurochkin, Y Sun. NeurIPS 2021.
/ UMich-FATML / adversarial-supports

Does enforcing fairness mitigate biases caused by subpopulation shift?
S Maity, D Mukherjee, M Yurochkin, Y Sun. NeurIPS 2021.
/ UMich-FATML / no-tradeoff

Outlier-Robust Optimal Transport
D Mukherjee, A Guha, J Solomon, Y Sun, M Yurochkin. ICML 2021.
/ debarghya-mukherjee / Robust-Optimal-Transport

Statistical Inference for Individual Fairness
S Maity, S Xue, M Yurochkin, Y Sun. ICLR 2021.
/ smaityumich / individual-fairness-testing

Individually Fair Rankings
A Bower, H Eftekhari, M Yurochkin, Y Sun. ICLR 2021.
/ UMich-FATML / SenSTIR

Individually fair gradient boosting
A Vargo, F Zhang, M Yurochkin, Y Sun. ICLR 2021.
/ UMich-FATML / BuDRO

SenSeI: Sensitive Set Invariance for Enforcing Individual Fairness
M Yurochkin, Y Sun. ICLR 2021.
/ UMich-FATML / SenSeI

Two simple ways to learn individual fairness metrics from data
D Mukherjee, M Yurochkin, M Banerjee, Y Sun. ICML 2020.
/ debarghya-mukherjee / Fair_metric_learning

Auditing ML models for individual bias and unfairness
S Xue, M Yurochkin, Y Sun. AISTATS 2020.

Federated Learning with Matched Averaging
H Wang, M Yurochkin, Y Sun, D Papailiopoulos, Y Khazaeni. ICLR 2020.
/ IBM / FedMA

Training individually fair machine learning models with Sensitive Subspace Robustness
M Yurochkin, A Bower, Y Sun. ICLR 2020.
/ IBM / sensitive-subspace-robustness

Dirichlet Simplex Nest and Geometric Inference
M Yurochkin, A Guha, Y Sun, XL Nguyen. ICML 2019.
/ moonfolk / VLAD

Precision Matrix Estimation with Noisy and Missing Data
R Fan, B Jang, Y Sun, S Zhou. AISTATS 2019.

Debiasing representations by removing unwanted variation due to protected attributes
A Bower, L Niss, Y Sun, A Vargo. FAT/ML 2018.
/ Amandarg / debias

Feature-distributed sparse regression: a screen-and-clean approach
J Yang, MW Mahoney, M Saunders, Y Sun. NIPS 2016.

Evaluating the statistical significance of biclusters
JD Lee, Y Sun, JE Taylor. NIPS 2015.

Learning Mixtures of Linear Classifiers
Y Sun, S Ioannidis, A Montanari. ICML 2014.

Journal papers

Dynamic Pricing in the Linear Valuation Model using Shape Constraints
D Bracale, M Banerjee, Y Sun, K Stoll, S Turki. Transactions on Machine Learning Research (2025).
/ dbracale / DP_via_Antitonic_TMLR_2025

A linear adjustment based approach to posterior drift in transfer learning
S Maity, D Dutta, J Terhorst, Y Sun, M Banerjee. Biometrika (2024).
/ smaityumich / linearly-shifted-transfer

Minimax optimal approaches to the label shift problem
S Maity, Y Sun, M Banerjee. Journal of Machine Learning Research (2022).
/ smaityumich / label-shift

Meta-analysis of heterogeneous data: integrative sparse regression in high-dimensions
S Maity, Y Sun, M Banerjee. Journal of Machine Learning Research (2022).
/ smaityumich / MrLasso

Matrix Completion Methods for the Total Electron Content Video Reconstruction
H Sun, Z Hua, J Ren, S Zou, Y Sun, Y Chen. Annals of Applied Statistics (2022).
/ husun0822 / TEC_impute

Uniform bounds for invariant subspace perturbations
A Damle, Y Sun. SIAM Journal on Matrix Analysis and Applications (2020).
/ asdamle / rowwise-perturbation

Statistical convergence of the EM algorithm on Gaussian mixture models
R Zhao, Y Li, Y Sun. Electronic Journal of Statistics (2020).

Creation and analysis of biochemical constraint-based models using the COBRA Toolbox v. 3.0
L Heirendt et al. Nature Protocols (2019).
/ opencobra / cobratoolbox

A geometric approach to archetypal analysis and nonnegative matrix factorization
A Damle, Y Sun. Technometrics (2017).
/ yuekai / archetypes
ASQ Jack Youden Award

Communication-efficient sparse regression
JD Lee, Q Liu, Y Sun, JE Taylor. Journal of Machine Learning Research (2017).

Exact post-selection inference, with application to the lasso
JD Lee, DL Sun, Y Sun, JE Taylor. Annals of Statistics (2016).

Do genome‐scale models need exact solvers or clearer standards?
A Ebrahim et al. Molecular Systems Biology (2015).
/ opencobra / m_model_collection

Systems biology definition of the core proteome of metabolism and expression is consistent with high-throughput data
L Yang et al. Proceedings of the National Academy of Sciences (2015).

On model selection consistency of regularized M-estimators
JD Lee, Y Sun, JE Taylor. Electronic Journal of Statistics (2015).
A conference version appeared at NIPS 2013.

Proximal Newton-type methods for minimizing composite functions
JD Lee, Y Sun, MA Saunders. SIAM Journal on Optimization (2014).
/ yuekai / PNOPT
A conference version appeared at NIPS 2012.

Humidity effects on anisotropic nanofriction behaviors of aligned carbon nanotube carpets
J Zhang, H Lu, Y Sun, L Ci, PM Ajayan, J Lou. ACS Applied Materials & Interfaces (2013).

Robust flux balance analysis of multiscale biochemical reaction networks
Y Sun, RMT Fleming, I Thiele, MA Saunders. BMC Bioinformatics (2013).

Nanostructure on taro leaves resists fouling by colloids and bacteria under submerged conditions
J Ma, Y Sun, K Gleichauf, J Lou, Q Li. Langmuir (2011).

Regular and reverse nanoscale stick-slip behavior: Modeling and experiments
F Landolsi, Y Sun, H Lu, FH Ghorbel, J Lou. Applied Surface Science (2010).

Nanoscale friction dynamic modeling
F Landolsi, FH Ghorbel, J Lou, H Lu, Y Sun. ASME Journal of Dynamic Systems, Measurement & Control (2009).

Friction and adhesion properties of vertically aligned multi-walled carbon nanotube arrays and fluoro-nanodiamond films
H Lu, J Goldman, F Ding, Y Sun, MX Pulikkathara, VN Khabashesku, BI Yakobson, J Lou. Carbon (2008).

Mesoscale reverse stick-slip nanofriction behavior of vertically aligned multiwalled carbon nanotube superlattices
J Lou, F Ding, H Lu, J Goldman, Y Sun, BI Yakobson. Applied Physics Letters (2008).

Book chapters

Communication Efficient Model Fusion
In Federated Learning: A Comprehensive Overview of Methods and Applications. H Ludwig, N Baracaldo (eds). Springer (2022).

Personalization in Federated Learning
In Federated Learning: A Comprehensive Overview of Methods and Applications. H Ludwig, N Baracaldo (eds). Springer (2022).

Technical reports

On uniform consistency of spectral embeddings
R Zhao, S Xue, Y Sun.

Achieving Representative Data via Convex Hull Feasibility Sampling Algorithms
L Niss, Y Sun, A Tewari.

An inexact subsampled proximal Newton-type method for large-scale machine learning
X Liu, CJ Hsieh, JD Lee, Y Sun.

On conditional parity as a notion of non-discrimination in machine learning
Y Ritov, Y Sun, R Zhao.

Valid post-correction inference for censored regression problems
Y Sun, JE Taylor.