Yuekai Sun

Papers

See the research page for representative papers and Google Scholar for citation metrics and a comprehensive list of publications.

Preprints

tinyBenchmarks: evaluating LLMs with few examples
F Maia Polo, L Weber, L Choshen, Y Sun, G Xu, M Yurochkin.

Learning in reverse causal strategic environments with ramifications on two-sided markets
S Somerstep, Y Sun, Y Ritov. ICLR 2024.

Fusing Models with Complementary Expertise
H Wang, F Maia Polo, Y Sun, S Kundu, E Xing, M Yurochkin. ICLR 2024.

Large Language Model Routing with Benchmark Datasets
T Shnitzer, A Ou, M Silva, K Soule, Y Sun, J Solomon, N Thompson, M Yurochkin.

An Investigation of Representation and Allocation Harms in Contrastive Learning
S Maity, M Agarwal, M Yurochkin, Y Sun. ICLR 2024.
/smaityumich/CL-representation-harm

On uniform consistency of spectral embeddings
R Zhao, S Xue, Y Sun.

How does overparametrization affect performance on minority groups?
S Maity, S Roy, S Xue, M Yurochkin, Y Sun.
/smaityumich/overparameterization

Journal papers

A linear adjustment based approach to posterior drift in transfer learning
S Maity, D Dutta, J Terhorst, Y Sun, M Banerjee. Biometrika (2023).
/smaityumich/linearly-shifted-transfer

Minimax optimal approaches to the label shift problem
S Maity, Y Sun, M Banerjee. Journal of Machine Learning Research (2022).
/smaityumich/label-shift

Meta-analysis of heterogeneous data: integrative sparse regression in high-dimensions
S Maity, Y Sun, M Banerjee. Journal of Machine Learning Research (2022).
/smaityumich/MrLasso

Matrix Completion Methods for the Total Electron Content Video Reconstruction
H Sun, Z Hua, J Ren, S Zou, Y Sun, Y Chen. Annals of Applied Statistics (2022).
/husun0822/TEC_impute

Uniform bounds for invariant subspace perturbations
A Damle, Y Sun. SIAM Journal on Matrix Analysis and Applications (2020).
/asdamle/rowwise-perturbation

Statistical convergence of the EM algorithm on Gaussian mixture models
R Zhao, Y Li, Y Sun. Electronic Journal of Statistics (2020).

Creation and analysis of biochemical constraint-based models using the COBRA Toolbox v. 3.0
L Heirendt et al. Nature Protocols (2019).
/opencobra/cobratoolbox

A geometric approach to archetypal analysis and nonnegative matrix factorization
A Damle, Y Sun. Technometrics (2017).
/yuekai/archetypes
ASQ Jack Youden Award

Communication-efficient sparse regression
JD Lee, Q Liu, Y Sun, JE Taylor. Journal of Machine Learning Research (2017).

Exact post-selection inference, with application to the lasso
JD Lee, DL Sun, Y Sun, JE Taylor. Annals of Statistics (2016).
/selective-inference

Do genome‐scale models need exact solvers or clearer standards?
A Ebrahim et al. Molecular Systems Biology (2015).

Systems biology definition of the core proteome of metabolism and expression is consistent with high-throughput data
L Yang et al. Proceedings of the National Academy of Sciences (2015).

On model selection consistency of regularized M-estimators
JD Lee, Y Sun, JE Taylor. Electronic Journal of Statistics (2015).
A conference version appeared at NIPS 2013.

Proximal Newton-type methods for minimizing composite functions
JD Lee, Y Sun, MA Saunders. SIAM Journal on Optimization (2014).
/yuekai/PNOPT
A conference version appeared at NIPS 2012.

Humidity effects on anisotropic nanofriction behaviors of aligned carbon nanotube carpets
J Zhang, H Lu, Y Sun, L Ci, PM Ajayan, J Lou. ACS Applied Materials & Interfaces (2013).

Robust flux balance analysis of multiscale biochemical reaction networks
Y Sun, RMT Fleming, I Thiele, MA Saunders. BMC Bioinformatics (2013).
/opencobra/cobratoolbox

Nanostructure on taro leaves resists fouling by colloids and bacteria under submerged conditions
J Ma, Y Sun, K Gleichauf, J Lou, Q Li. Langmuir (2011).

Regular and reverse nanoscale stick-slip behavior: Modeling and experiments
F Landolsi, Y Sun, H Lu, FH Ghorbel, J Lou. Applied Surface Science (2010).

Nanoscale friction dynamic modeling
F Landolsi, FH Ghorbel, J Lou, H Lu, Y Sun. ASME Journal of Dynamic Systems, Measurement & Control (2009).

Friction and adhesion properties of vertically aligned multi-walled carbon nanotube arrays and fluoro-nanodiamond films
H Lu, J Goldman, F Ding, Y Sun, MX Pulikkathara, VN Khabashesku, BI Yakobson, J Lou. Carbon (2008).

Mesoscale reverse stick-slip nanofriction behavior of vertically aligned multiwalled carbon nanotube superlattices
J Lou, F Ding, H Lu, J Goldman, Y Sun, BI Yakobson. Applied Physics Letters (2008).

Conference papers

Code for papers at conferences hosted on OpenReview is available on OpenReview, so it is not linked here.

Conditional independence testing under misspecified inductive biases
F Maia Polo, Y Sun, M Banerjee. NeurIPS 2023.

Simple Disentanglement of Style and Content in Visual Representations
L Ngweta, S Maity, A Gittens, Y Sun, M Yurochkin. ICML 2023.
/lilianngweta/PISCO

Understanding new tasks through the lens of training data via exponential tilting
S Maity, M Yurochkin, M Banerjee, Y Sun. ICLR 2023.

Predictor-corrector algorithms for stochastic optimization under gradual distribution shift
S Maity, D Mukherjee, M Banerjee, Y Sun. ICLR 2023.
/smaityumich/concept-drift

ISAAC Newton: Input-based Approximate Curvature for Newton’s Method
F Petersen, T Sutter, C Borgelt, D Huh, H Kuehne, Y Sun, O Deussen. ICLR 2023.

Calibrated Data-Dependent Constraints with Exact Satisfaction Guarantees
S Xue, M Yurochkin, Y Sun. NeurIPS 2022.

Domain Adaptation meets Individual Fairness. And they get along.
D Mukherjee, F Petersen, M Yurochkin, Y Sun. NeurIPS 2022.

Post-processing for Individual Fairness
F Petersen, D Mukherjee, Y Sun, M Yurochkin. NeurIPS 2021.
/Felix-Petersen/fairness-post-processing

On sensitivity of meta-learning to support data
M Agarwal, M Yurochkin, Y Sun. NeurIPS 2021.

Does enforcing fairness mitigate biases caused by subpopulation shift?
S Maity, D Mukherjee, M Yurochkin, Y Sun. NeurIPS 2021.

Outlier-Robust Optimal Transport
D Mukherjee, A Guha, J Solomon, Y Sun, M Yurochkin. ICML 2021.
/debarghya-mukherjee/Robust-Optimal-Transport

Statistical Inference for Individual Fairness
S Maity, S Xue, M Yurochkin, Y Sun. ICLR 2021.
/smaityumich/individual-fairness-testing

Individually Fair Rankings
A Bower, H Eftekhari, M Yurochkin, Y Sun. ICLR 2021.

Individually fair gradient boosting
A Vargo, F Zhang, M Yurochkin, Y Sun. ICLR 2021.

SenSeI: Sensitive Set Invariance for Enforcing Individual Fairness
M Yurochkin, Y Sun. ICLR 2021.

Two simple ways to learn individual fairness metrics from data
D Mukherjee, M Yurochkin, M Banerjee, Y Sun. ICML 2020.
/debarghya-mukherjee/Fair_metric_learning

Auditing ML models for individual bias and unfairness
S Xue, M Yurochkin, Y Sun. AISTATS 2020.

Federated Learning with Matched Averaging
H Wang, M Yurochkin, Y Sun, D Papailiopoulos, Y Khazaeni. ICLR 2020.
/IBM/FedMA

Training individually fair machine learning models with Sensitive Subspace Robustness
M Yurochkin, A Bower, Y Sun. ICLR 2020.
/IBM/sensitive-subspace-robustness

Dirichlet Simplex Nest and Geometric Inference
M Yurochkin, A Guha, Y Sun, XL Nguyen. ICML 2019.
/moonfolk/VLAD

Precision Matrix Estimation with Noisy and Missing Data
R Fan, B Jang, Y Sun, S Zhou. AISTATS 2019.

Debiasing representations by removing unwanted variation due to protected attributes
A Bower, L Niss, Y Sun, A Vargo. FAT/ML 2018.
/Amandarg/debias
This conference is now called ACM FAccT.

Feature-distributed sparse regression: a screen-and-clean approach
J Yang, MW Mahoney, M Saunders, Y Sun. NIPS 2016.

Evaluating the statistical significance of biclusters
JD Lee, Y Sun, JE Taylor. NIPS 2015.

Learning Mixtures of Linear Classifiers
Y Sun, S Ioannidis, A Montanari. ICML 2014.

Book chapters

Communication Efficient Model Fusion
In Federated Learning: A Comprehensive Overview of Methods and Applications. H Ludwig, N Baracaldo (eds). Springer (2022).

Personalization in Federated Learning
In Federated Learning: A Comprehensive Overview of Methods and Applications. H Ludwig, N Baracaldo (eds). Springer (2022).

Technical reports

Achieving Representative Data via Convex Hull Feasibility Sampling Algorithms
L Niss, Y Sun, A Tewari.

An inexact subsampled proximal Newton-type method for large-scale machine learning
X Liu, CJ Hsieh, JD Lee, Y Sun.

On conditional parity as a notion of non-discrimination in machine learning
Y Ritov, Y Sun, R Zhao.

Valid post-correction inference for censored regression problems
Y Sun, JE Taylor.