Zijian Guo
AI4Science

Summary (LLM read my papers; human bias-correction applied)

In my AI4Science and biomedical data work, I develop statistical methods that make large, heterogeneous datasets, such as genetics/proteomics data and multi-center EHRs, more actionable for science. The goal is to draw conclusions that remain reliable when real-world complications arise (e.g., imperfect instruments in genetics, strong cross-hospital heterogeneity, and privacy constraints that limit data sharing).

Concretely, one line of work proposes MR-SPI, a robust Mendelian randomization approach that automatically selects valid genetic instruments and then performs post-selection inference, so that causal biomarker discovery is less sensitive to invalid SNPs. It is applied to UK Biobank proteomics data (912 proteins) to identify proteins associated with Alzheimer’s disease, with follow-up structural analysis via AlphaFold2.

Another line of work develops SurvMaximin, a robust federated/transfer learning method for survival risk prediction. It borrows strength across centers using one-time sharing of summary statistics (no patient-level data) and is designed to remain stable even when site-specific models are highly heterogeneous, improving performance especially for smaller target sites.
Underline indicates supervised students; # indicates equal contribution; * indicates alphabetical ordering; ✉ indicates corresponding authorship.