Zijian Guo
Research Topics

My research develops statistical foundations for reliable and generalizable inference in modern data settings where classical assumptions often fail. Below is a directory of my main methodological areas and their applications.

Generalizable Multi-source Learning
Distributional Robustness · Minimax Optimization · Federated Learning
Making models reliable when data come from multiple heterogeneous studies, hospitals, or environments. Rather than assuming a single pooled distribution, I define robust targets using minimax objectives that guard against worst-case shifts, accompanied by efficient primal-dual algorithms and valid uncertainty quantification.

Reliable Causal Inference
Instrumental Variables (IV) · Robust Identification · Causal Invariance
Centering on instrumental variables (IV), I develop identification and inference tools that remain meaningful even when instruments are imperfect or invalid. My work also extends to under-identification, non-regular inference, and causal invariance learning to identify relationships that remain stable across different environments.

High-dimensional Uncertainty Quantification
High-dimensional Inference · Multiple Testing · Hidden Confounding
Turning high-dimensional modeling into trustworthy scientific conclusions. I develop inference-first toolkits (including confidence intervals, hypothesis tests, and multiple testing procedures) that remain valid in realistic regimes beyond simple linear models, addressing complications like hidden confounding and endogeneity.

Nonstandard Inference
Uncertainty Quantification · Post-selection Inference · Perturbation Methods
Providing reliable uncertainty quantification in modern problems where standard Wald intervals break down. I develop inference methods based on perturbation, resampling, and stability-driven constructions for scenarios involving optimization-defined targets, data-adaptive selection, or a lack of standard regularity conditions.

Applications to Genetics and Health Data
Mendelian Randomization · Multi-center EHRs · Biomarker Discovery
Making large, heterogeneous datasets (such as genetics, proteomics, and multi-center EHRs) actionable for science. Examples include MR-SPI, for robust biomarker discovery via the selection of valid genetic instruments, and SurvMaximin, for transporting survival risk prediction models across hospitals without sharing patient-level data.