I am a final-year PhD candidate in Biomedical Data Science at Stanford University. My research interests broadly span (1) AI-human collaboration and (2) the discovery of class-salient data elements within high-dimensional unstructured data (e.g. large images, time-series, graphs, or text corpora) to expand our understanding of science and medicine. I am fortunate to be advised and mentored by Parag Mallick (Radiology) and Christopher Ré (Computer Science), and I spend much of my time with the Hazy Research group at the Stanford AI Lab (SAIL). My graduate training and research have been generously funded by the NIH (BD2K, NLM), the Stanford Data Science Institute (Data Science Scholarship), the International Alliance for Cancer Early Detection (Canary-ACED Graduate Fellowship), and Stanford's Institute for Human-Centered Artificial Intelligence (HAI).
I am on the industrial job market for Research Scientist roles, particularly for Responsible AI and/or AI4Science teams. Please feel free to contact me about opportunities!
Life bits: On weekends, you can find me and my partner road-tripping to one of California's numerous regional, state, or national parks to spend time on the water and trails. Outside of hiking and camping, my hobbies include painting, gardening, and hosting regular themed dinner and cocktail parties. I also love city trekking with friends and popping into cafés and roasteries, museums, used bookstores (hunting for vintage maths sections), and outdoor pubs with live music sessions. Reach out if you'd like to chat; I am always open to a kaffeeklatsch!
As a side note: My name is pronounced much like Batman's city of residence, "Gotham" (i.e. GAW-thum). There are many intonation variants of my Sanskrit-originating name across India, but this pronunciation is the closest to the North Indian variant and is the one I go by.
Intuitively, my goal is to work with AI to understand emergent phenomena and better inform our decision-making. To do this, we are moving the needle on Explainable AI copilots that can parse unstructured scientific data and discover its class-salient regions. Methodologically, I currently work at the intersection of (A) deep learning architecture development, (B) relational reasoning over long-range contexts, and (C) interpretability and concept discovery. Motivated by scientific and biomedical applications, my thesis work focuses on improving our understanding of spatial systems and the key actors that distinguish such systems from one another. To this end, we seek to extract human-interpretable, differentially expressed regions captured in high-resolution, multiplexed imagery. My thesis aims include:
While my thesis research focuses on explainable AI in the context of high-dimensional computer vision, I've been privileged to work in several methods and application domains, including: graph representation learning (for small molecule function prediction), ML for large-scale deployment (clinical decision support via the EHR, mobile health, global health monitoring via satellite imagery), factorization methods (multi-omics & wearables data integration), applied inference (to identify cancer-associated gene enhancers), and mathematical and physics-based modeling (of tumor growth and protein shedding). A copy of my CV can be found at the bottom of this page. Stay tuned for future work on:
Grammar Matters: Exploring Grammatical Variation’s Role in Improving Fine-Tuned LLMs for Biomedical Relation Extraction
Varun Tandon, Gautam Machiraju, Parag Mallick
In review.
Spatial Statistics for Spatial Biology
Hunter Boyce, Gautam Machiraju, Parag Mallick
In review.
Prospectors: Leveraging Short Contexts to Mine Salient Objects in High-dimensional Imagery
Gautam Machiraju, Arjun Desai, James Zou, Christopher Ré, Parag Mallick
International Conference on Machine Learning (ICML) 3rd workshop on Interpretable Machine Learning for Healthcare (IMLH) 2023.
Development and Evaluation of an Image-based Deep Learning Algorithm to Classify Skin Lesions from Mpox Virus Infection
Alexander Henry Thieme, Yuanning Zheng, Gautam Machiraju, et al.
Nature Medicine (2023).
A Dataset Generation Framework for Evaluating Megapixel Image Classifiers & their Explanations
Gautam Machiraju, Sylvia Plevritis, Parag Mallick
European Conference on Computer Vision (ECCV), 2022.
Developing Machine Learning Models to Personalize Care Levels among Emergency Room Patients for Hospital Admission
Minh Nguyen, Conor Corbin, Tiffany Eulalio, Nicolai Ostberg, Gautam Machiraju, Ben Marafino, Michael Baiocchi, Christian Rose, Jonathan Chen
Journal of the American Medical Informatics Association (2021).
Multicompartment Modeling of Protein Shedding Kinetics During Vascularized Tumor Growth
Gautam Machiraju, Parag Mallick, Hermann Frieboes
Nature Scientific Reports (2020).
A Community-based Approach to Image Analysis of Cells, Tissues and Tumors
CSBC/PS-ON Image Analysis Working Group§‡, Juan Carlos Vizcarra, Erik A. Burlingame, Yury Goltsev, Brian S. White, Darren Tyson, Artem Sokolov
Computerized Medical Imaging and Graphics (2022).
Small Molecule Property Prediction via Proxy Labeling and Multi-scale Learning
Gautam Machiraju, Parag Mallick
Preprint (2021).
Multi-omics Factorization Illustrates the Added Value of Deep Learning Approaches
Gautam Machiraju, David Amar, Euan Ashley
Preprint (2019).
More details (projects, collaborators, talks, academic service, relevant coursework) can be found on my CV and LinkedIn page.
One of my favorite aspects of research is thinking about aesthetics and design when communicating technical ideas. This drive to understand ideas by visually communicating them (often to myself) was sparked when I was a dyslexic Maths undergraduate. Despite my many interests in Maths, I struggled to parse and conceptualize the blocks of textual abstraction typical of modern mathematical presentation and standard teaching materials. I thus relied heavily on intuition and visual proofs as mental anchors. Thanks in part to training as a CIR Scholar at Stanford's Hasso Plattner Institute of Design, I cartoon-ify almost everything I work on and often spend Friday afternoons reflecting on, mocking up, and refining the concepts discussed that week.