About me
I am a Research Associate at Harvard School of Public Health, exploring network models for multiomic data. Please feel free to email me (kshutta@hsph.harvard.edu) if you would like to learn about my research! I’d love to talk more and build collaborations.
Research news
You can see my full research profile on Google Scholar. Here are a few exciting recent updates:
21 December 2025: COPD/IPF EWAS preprint up
Network science meets the epigenome as we unravel COPD/IPF differential etiology in our new preprint: Switch-like methylation of functional pathways distinguishes COPD and idiopathic pulmonary fibrosis.
Citation: Shutta, K.H., Huang, Y., Carey, V.J., Yun, J.H., Hobbs, B., Elias, J.A., Lee, C.G., Brown, K.K., Criner, G., Flaherty, K., Limper, A., Sciurba, F., Wise, R.A., Martinez, F.J., Silverman, E.K., Quackenbush, J., and DeMeo, D.L., 2025. Switch-like methylation of functional pathways distinguishes COPD and idiopathic pulmonary fibrosis. medRxiv, pp.2025-12. https://doi.org/10.64898/2025.12.18.25342312
Fall 2025: NHLBI K25 Mentored Quantitative Research Career Development Award received
I’m thrilled to expand my research in network models of lung disease thanks to my recently funded K25 award from the National Heart, Lung, and Blood Institute of the NIH: Multi-omic network methods for mapping molecular trajectories of age-related lung diseases.
20 October 2025: tcga-data-nf and NetworkDataCompanion are published in GigaScience!
The Quackenbush lab has published a Nextflow workflow for reproducible processing and analysis of gene regulatory networks in TCGA. Check out the results in our GigaScience paper, where we use DRAGON and PANDA networks to investigate consensus molecular subtypes of colon cancer, identifying key epigenetic features related to subtype-specific regulatory differences.
I’m very grateful to have had the opportunity to work on this project, which is led by my talented colleague Viola Fanfani. I’m the lead developer and maintainer of the associated R package NetworkDataCompanion, which can be used with the Nextflow workflow or as a standalone tool to assist with analysis of TCGA data. We hope this tool will be useful for many and we welcome contributions from the community!
Citation: Fanfani, V., Shutta, K.H., Mandros, P., Fischer, J., Saha, E., Micheletti, S., Chen, C., Guebila, M.B., Lopes-Ramos, C.M. and Quackenbush, J., 2025. Reproducible processing of TCGA regulatory networks. GigaScience, giaf126, https://doi.org/10.1093/gigascience/giaf126
Key research contributions
DRAGON for multi-omic GGMs
DRAGON is our tool for multi-omic Gaussian graphical modeling. DRAGON lives in the Network Zoo and is available as part of netZooPy and netZooR.
Citation: Shutta, K.H.+, Weighill, D.+, Burkholz, R., Guebila, M.B., Zacharias, H.U., Quackenbush, J. and Altenbuchinger, M., 2023. DRAGON: determining regulatory associations using graphical models on multi-omic networks. Nucleic Acids Research, 51(3), e15-e15. https://doi.org/10.1093/nar/gkac1157 +: equal contribution
SpiderLearner
SpiderLearner is our ensemble method for estimating Gaussian graphical models. Our SpiderLearner Quickstart Guide will get you up and running with the corresponding R package ensembleGGM in just a few minutes. We are always looking for feedback via the ensembleGGM GitHub repository!
Citation: Shutta, K. H., Balzer, L. B., Scholtens, D. M., & Balasubramanian, R. (2023). SpiderLearner: An ensemble approach to Gaussian graphical model estimation. Statistics in Medicine, 42(13), 2116-2133. https://doi.org/10.1002/sim.9714
Learning about GGMs
Our tutorial on Gaussian graphical models is a great starting point for applied statisticians to get up and running with GGM analyses. You can find a stand-alone RMarkdown document with the tutorial code here.
Citation: Shutta, K.H., De Vito, R., Scholtens, D.M. and Balasubramanian, R., 2022. Gaussian graphical models with applications to omics analyses. Statistics in Medicine. https://doi.org/10.1002/sim.9546
Factor analysis for network models of multi-study data
Check out our preprint on graphical modeling of multi-study data! Our method builds on the multi-study factor analysis (MSFA) method of Roberta De Vito and her colleagues. We use latent variables to estimate shared and condition-specific Gaussian graphical models.
Citation: Shutta, K.H., Scholtens, D.M., Lowe Jr., W.L., Balasubramanian, R., and De Vito, R., 2022. arXiv. https://doi.org/10.48550/arXiv.2210.12837
Metabolomics of mental health and cardiometabolic health
I’m proud to have contributed to several publications that use metabolomics to understand the relationship between mental health and cardiometabolic health! This work was supported by the NIH NIA (award number R01-AG051600). Some of my primary contributions are cited below.
- Metabolites associated with chronic psychosocial distress: Shutta, K.H.+, Balasubramanian, R.+, Huang, T., Jha, S.C., Zeleznik, O.A., Kroenke, C.H., Tinker, L.F., Smoller, J.W., Casanova, R., Tworoger, S.S., Manson, J.E., Clish, C.B., Rexrode, K.M., Hankinson, S.E.++, and Kubzansky, L.D.++, 2021. Plasma metabolomic profiles associated with chronic distress in women. Psychoneuroendocrinology, 133, p.105420. https://doi.org/10.1016/j.psyneuen.2021.105420 +, ++: equal contribution
- Mental health and incident cardiovascular disease: Balasubramanian, R., Shutta, K. H., Guasch-Ferre, M., Huang, T., Jha, S. C., Zhu, Y., … Hankinson, S.E., & Kubzansky, L. D. (2023). Metabolomic profiles of chronic distress are associated with cardiovascular disease risk and inflammation-related risk factors. Brain, Behavior, and Immunity, 114, 262-274. https://doi.org/10.1016/j.bbi.2023.08.010
- Mental health and diabetes risk: Huang, T., Zhu, Y., Shutta, K. H., Balasubramanian, R., Zeleznik, O. A., Rexrode, K. M., … Kubzansky, L.D., & Hankinson, S. E. (2023). A Plasma Metabolite Score Related to Psychological Distress and Diabetes Risk: A Nested Case-control Study in US Women. The Journal of Clinical Endocrinology & Metabolism, dgad731. https://doi.org/10.1210/clinem/dgad731
Presentations
- A tutorial on the icensmis package, presented at the ASA Conference on Statistical Practice 2024
- DRAGON, presented at NetBioMed 2022, a satellite of the NetSci 2022 conference
Resources
- NetworkDataCompanion Quickstart: Demonstration of TCGA data wrangling tools, focused on handling methylation and gene expression data from the Genomic Data Commons.
- CSP 2024: Demonstration of icensmis for self-reported, error-prone time-to-event outcomes. The un-knit RMarkdown is also available here.
- SpiderLearner Quickstart Guide: Standalone tutorial for running SpiderLearner with the ensembleGGM R package.
CV
Please email me if you would like my CV.