Yuekai Sun

Papers

See the research page for a list of representative papers and Google Scholar for citation metrics. Feel free to contact me if you wish to access a paper that is not freely accessible.

Preprints

Efficient multi-prompt evaluation of LLMs
F Maia Polo, R Xu, L Weber, M Silva, O Bhardwaj, L Choshen, A Oliveira, Y Sun, M Yurochkin.

A statistical framework for weak-to-strong generalization
S Somerstep, F Maia Polo, M Banerjee, Y Ritov, M Yurochkin, Y Sun.

Prompt Exploration with Prompt Regression
M Feffer, R Xu, Y Sun, M Yurochkin. To appear in COLM 2024.

Learning the Distribution Map in Reverse Causal Performative Prediction
D Bracale, S Maity, S Somerstep, M Banerjee, Y Sun.

Aligners: Decoupling LLMs and Alignment
L Ngweta, M Agarwal, S Maity, A Gittens, Y Sun, M Yurochkin.
A short version appeared as an ICLR 2024 TinyPaper.

Large Language Model Routing with Benchmark Datasets
T Shnitzer, A Ou, M Silva, K Soule, Y Sun, J Solomon, N Thompson, M Yurochkin. To appear in COLM 2024.

Journal papers

A linear adjustment based approach to posterior drift in transfer learning
S Maity, D Dutta, J Terhorst, Y Sun, M Banerjee. Biometrika (2024).
/smaityumich/linearly-shifted-transfer

Minimax optimal approaches to the label shift problem
S Maity, Y Sun, M Banerjee. Journal of Machine Learning Research (2022).
/smaityumich/label-shift

Meta-analysis of heterogeneous data: integrative sparse regression in high-dimensions
S Maity, Y Sun, M Banerjee. Journal of Machine Learning Research (2022).
/smaityumich/MrLasso

Matrix Completion Methods for the Total Electron Content Video Reconstruction
H Sun, Z Hua, J Ren, S Zou, Y Sun, Y Chen. Annals of Applied Statistics (2022).
/husun0822/TEC_impute

Uniform bounds for invariant subspace perturbations
A Damle, Y Sun. SIAM Journal on Matrix Analysis and Applications (2020).
/asdamle/rowwise-perturbation

Statistical convergence of the EM algorithm on Gaussian mixture models
R Zhao, Y Li, Y Sun. Electronic Journal of Statistics (2020).

Creation and analysis of biochemical constraint-based models using the COBRA Toolbox v. 3.0
L Heirendt et al. Nature Protocols (2019).
/opencobra/cobratoolbox

A geometric approach to archetypal analysis and nonnegative matrix factorization
A Damle, Y Sun. Technometrics (2017).
/yuekai/archetypes
ASQ Jack Youden Award

Communication-efficient sparse regression
JD Lee, Q Liu, Y Sun, JE Taylor. Journal of Machine Learning Research (2017).

Exact post-selection inference, with application to the lasso
JD Lee, DL Sun, Y Sun, JE Taylor. Annals of Statistics (2016).
/selective-inference

Do genome‐scale models need exact solvers or clearer standards?
A Ebrahim et al. Molecular Systems Biology (2015).

Systems biology definition of the core proteome of metabolism and expression is consistent with high-throughput data
L Yang et al. Proceedings of the National Academy of Sciences (2015).

On model selection consistency of regularized M-estimators
JD Lee, Y Sun, JE Taylor. Electronic Journal of Statistics (2015).
A conference version appeared at NIPS 2013.

Proximal Newton-type methods for minimizing composite functions
JD Lee, Y Sun, MA Saunders. SIAM Journal on Optimization (2014).
/yuekai/PNOPT
A conference version appeared at NIPS 2012.

Humidity effects on anisotropic nanofriction behaviors of aligned carbon nanotube carpets
J Zhang, H Lu, Y Sun, L Ci, PM Ajayan, J Lou. ACS Applied Materials & Interfaces (2013).

Robust flux balance analysis of multiscale biochemical reaction networks
Y Sun, RMT Fleming, I Thiele, MA Saunders. BMC Bioinformatics (2013).
/opencobra/cobratoolbox

Nanostructure on taro leaves resists fouling by colloids and bacteria under submerged conditions
J Ma, Y Sun, K Gleichauf, J Lou, Q Li. Langmuir (2011).

Regular and reverse nanoscale stick-slip behavior: Modeling and experiments
F Landolsi, Y Sun, H Lu, FH Ghorbel, J Lou. Applied Surface Science (2010).

Nanoscale friction dynamic modeling
F Landolsi, FH Ghorbel, J Lou, H Lu, Y Sun. ASME Journal of Dynamic Systems, Measurement & Control (2009).

Friction and adhesion properties of vertically aligned multi-walled carbon nanotube arrays and fluoro-nanodiamond films
H Lu, J Goldman, F Ding, Y Sun, MX Pulikkathara, VN Khabashesku, BI Yakobson, J Lou. Carbon (2008).

Mesoscale reverse stick-slip nanofriction behavior of vertically aligned multiwalled carbon nanotube superlattices
J Lou, F Ding, H Lu, J Goldman, Y Sun, BI Yakobson. Applied Physics Letters (2008).

Conference papers

tinyBenchmarks: evaluating LLMs with few examples
F Maia Polo, L Weber, L Choshen, Y Sun, G Xu, M Yurochkin. ICML 2024.

Algorithmic Fairness in Performative Policy Learning: Escaping the Impossibility of Group Fairness
S Somerstep, Y Ritov, Y Sun. FAccT 2024.

Learning in reverse causal strategic environments with ramifications on two-sided markets
S Somerstep, Y Sun, Y Ritov. ICLR 2024.

Fusing Models with Complementary Expertise
H Wang, F Maia Polo, Y Sun, S Kundu, E Xing, M Yurochkin. ICLR 2024.

An Investigation of Representation and Allocation Harms in Contrastive Learning
S Maity, M Agarwal, M Yurochkin, Y Sun. ICLR 2024.

Conditional independence testing under misspecified inductive biases
F Maia Polo, Y Sun, M Banerjee. NeurIPS 2023.

Simple Disentanglement of Style and Content in Visual Representations
L Ngweta, S Maity, A Gittens, Y Sun, M Yurochkin. ICML 2023.
/lilianngweta/PISCO

Understanding new tasks through the lens of training data via exponential tilting
S Maity, M Yurochkin, M Banerjee, Y Sun. ICLR 2023.

Predictor-corrector algorithms for stochastic optimization under gradual distribution shift
S Maity, D Mukherjee, M Banerjee, Y Sun. ICLR 2023.
/smaityumich/concept-drift

ISAAC Newton: Input-based Approximate Curvature for Newton’s Method
F Petersen, T Sutter, C Borgelt, D Huh, H Kuehne, Y Sun, O Deussen. ICLR 2023.

Calibrated Data-Dependent Constraints with Exact Satisfaction Guarantees
S Xue, M Yurochkin, Y Sun. NeurIPS 2022.

Domain Adaptation meets Individual Fairness. And they get along.
D Mukherjee, F Petersen, M Yurochkin, Y Sun. NeurIPS 2022.

Post-processing for Individual Fairness
F Petersen, D Mukherjee, Y Sun, M Yurochkin. NeurIPS 2021.
/Felix-Petersen/fairness-post-processing

On sensitivity of meta-learning to support data
M Agarwal, M Yurochkin, Y Sun. NeurIPS 2021.

Does enforcing fairness mitigate biases caused by subpopulation shift?
S Maity, D Mukherjee, M Yurochkin, Y Sun. NeurIPS 2021.

Outlier-Robust Optimal Transport
D Mukherjee, A Guha, J Solomon, Y Sun, M Yurochkin. ICML 2021.
/debarghya-mukherjee/Robust-Optimal-Transport

Statistical Inference for Individual Fairness
S Maity, S Xue, M Yurochkin, Y Sun. ICLR 2021.
/smaityumich/individual-fairness-testing

Individually Fair Rankings
A Bower, H Eftekhari, M Yurochkin, Y Sun. ICLR 2021.

Individually fair gradient boosting
A Vargo, F Zhang, M Yurochkin, Y Sun. ICLR 2021.

SenSeI: Sensitive Set Invariance for Enforcing Individual Fairness
M Yurochkin, Y Sun. ICLR 2021.

Two simple ways to learn individual fairness metrics from data
D Mukherjee, M Yurochkin, M Banerjee, Y Sun. ICML 2020.
/debarghya-mukherjee/Fair_metric_learning

Auditing ML models for individual bias and unfairness
S Xue, M Yurochkin, Y Sun. AISTATS 2020.

Federated Learning with Matched Averaging
H Wang, M Yurochkin, Y Sun, D Papailiopoulos, Y Khazaeni. ICLR 2020.
/IBM/FedMA

Training individually fair machine learning models with Sensitive Subspace Robustness
M Yurochkin, A Bower, Y Sun. ICLR 2020.
/IBM/sensitive-subspace-robustness

Dirichlet Simplex Nest and Geometric Inference
M Yurochkin, A Guha, Y Sun, XL Nguyen. ICML 2019.
/moonfolk/VLAD

Precision Matrix Estimation with Noisy and Missing Data
R Fan, B Jang, Y Sun, S Zhou. AISTATS 2019.

Debiasing representations by removing unwanted variation due to protected attributes
A Bower, L Niss, Y Sun, A Vargo. FAT/ML 2018.
/Amandarg/debias

Feature-distributed sparse regression: a screen-and-clean approach
J Yang, MW Mahoney, M Saunders, Y Sun. NIPS 2016.

Evaluating the statistical significance of biclusters
JD Lee, Y Sun, JE Taylor. NIPS 2015.

Learning Mixtures of Linear Classifiers
Y Sun, S Ioannidis, A Montanari. ICML 2014.

Book chapters

Communication Efficient Model Fusion
In Federated Learning: A Comprehensive Overview of Methods and Applications. H Ludwig, N Baracaldo (eds). Springer (2022).

Personalization in Federated Learning
In Federated Learning: A Comprehensive Overview of Methods and Applications. H Ludwig, N Baracaldo (eds). Springer (2022).

Technical reports

On uniform consistency of spectral embeddings
R Zhao, S Xue, Y Sun.

How does overparametrization affect performance on minority groups?
S Maity, S Roy, S Xue, M Yurochkin, Y Sun.
/smaityumich/overparameterization

Achieving Representative Data via Convex Hull Feasibility Sampling Algorithms
L Niss, Y Sun, A Tewari.

An inexact subsampled proximal Newton-type method for large-scale machine learning
X Liu, CJ Hsieh, JD Lee, Y Sun.

On conditional parity as a notion of non-discrimination in machine learning
Y Ritov, Y Sun, R Zhao.

Valid post-correction inference for censored regression problems
Y Sun, JE Taylor.