SHAP value
Definition
SHAP (SHapley Additive exPlanations) values are a feature-attribution measure derived from cooperative game theory that quantifies each feature's contribution to an individual prediction of a machine learning model. In biological research, SHAP values explain how individual variables (genes, proteins, clinical markers) influence model outputs by averaging each variable's marginal contribution over all possible feature combinations. Unlike traditional feature importance metrics, which typically give a single global score per feature, SHAP values provide both the magnitude and the direction of each feature's effect for every prediction, can be computed for any model type, and satisfy local accuracy (the contributions plus a baseline value sum to the model's prediction) and consistency. This makes them valuable for interpreting complex predictive models in genomics, drug response prediction, and disease classification, where understanding which biological features drive predictions is as important as the predictions themselves.
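The idea of a marginal contribution across all feature combinations can be made concrete with a small sketch. The code below computes exact Shapley values for one sample by enumerating every feature subset. The toy model, the feature names (gene_a, gene_b, gene_c), and the all-zero baseline are hypothetical, and replacing absent features with a baseline value is only one simple way to treat them as missing. Exact enumeration grows exponentially with the number of features, so practical tools rely on approximations or model-specific algorithms.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one sample by enumerating all feature subsets.

    Features outside a subset are replaced by their baseline value
    (an illustrative simplification of how "missing" features are handled).
    """
    names = list(x)
    n = len(names)

    def value(subset):
        # Model output when only the features in `subset` keep their real values.
        inputs = {f: (x[f] if f in subset else baseline[f]) for f in names}
        return predict(inputs)

    phi = {}
    for f in names:
        others = [g for g in names if g != f]
        total = 0.0
        for k in range(n):
            # Shapley weight for subsets of size k that exclude feature f.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = set(subset)
                total += weight * (value(s | {f}) - value(s))
        phi[f] = total
    return phi


# Hypothetical toy model: two linear terms plus one interaction term.
def toy_model(v):
    return 2.0 * v["gene_a"] - 1.0 * v["gene_b"] + 0.5 * v["gene_a"] * v["gene_c"]


sample = {"gene_a": 1.0, "gene_b": 2.0, "gene_c": 4.0}
baseline = {"gene_a": 0.0, "gene_b": 0.0, "gene_c": 0.0}

phi = shapley_values(toy_model, sample, baseline)
for name, contribution in phi.items():
    print(f"{name}: {contribution:+.2f}")
```

In this toy case the contributions sum to the difference between the model output for the sample and for the baseline, and the sign of each value shows whether the feature pushes the prediction up or down.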
Visualize SHAP values in Nodes Bio
Researchers can visualize SHAP values as node attributes in biological networks, where node size or color intensity represents feature importance for specific predictions. In gene regulatory networks, SHAP values can highlight which transcription factors most influence disease outcomes. Edge weights can represent interaction effects between features, revealing synergistic relationships that drive model predictions and identifying key regulatory hubs in complex biological systems.
Visualization Ideas:
- Gene regulatory network with SHAP values as node sizes showing predictive importance for disease classification
- Protein interaction network colored by SHAP contribution to drug response predictions
- Multi-omics network with edge weights representing SHAP interaction values between feature pairs
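As a rough sketch of how the ideas above might be prepared, the code below turns per-feature SHAP values into node attributes (size from magnitude, direction from sign) and SHAP interaction values into edge weights, then writes both as CSV text. All names, numbers, and column headers here are hypothetical. The import format that Nodes Bio expects is not described in this entry, so the generic node and edge tables are an assumption.

```python
import csv
import io

# Hypothetical per-feature SHAP values for one prediction (illustrative numbers only).
shap_values = {"TF_A": 0.42, "TF_B": -0.15, "GENE_C": 0.05}
# Hypothetical SHAP interaction values between feature pairs.
shap_interactions = {("TF_A", "TF_B"): 0.08, ("TF_A", "GENE_C"): -0.02}


def node_rows(values, min_size=10.0, max_size=50.0):
    """Scale |SHAP| to a node size and keep the sign as a direction label."""
    largest = max(abs(v) for v in values.values()) or 1.0  # avoid dividing by zero
    rows = []
    for name, v in values.items():
        size = min_size + (max_size - min_size) * abs(v) / largest
        rows.append({
            "id": name,
            "shap": v,
            "size": round(size, 2),
            "direction": "increases" if v >= 0 else "decreases",
        })
    return rows


def edge_rows(interactions):
    """Use the absolute interaction value as the edge weight; keep the signed value too."""
    return [
        {"source": a, "target": b, "weight": abs(w), "shap_interaction": w}
        for (a, b), w in interactions.items()
    ]


def to_csv(rows):
    """Serialize a list of dict rows to CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rows[0]))
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


nodes = node_rows(shap_values)
edges = edge_rows(shap_interactions)
nodes_csv = to_csv(nodes)
edges_csv = to_csv(edges)
print(nodes_csv)
print(edges_csv)
```

Keeping the signed SHAP value alongside the scaled size lets a color scale show direction while node size shows magnitude.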
Example Use Case
A pharmaceutical team develops a machine learning model to predict patient response to an immunotherapy drug from gene expression profiles and tumor genomic features. Using SHAP values, they identify that PD-L1 expression, tumor mutational burden, and three specific immune cell markers contribute most strongly to positive response predictions. By visualizing these SHAP values in a protein-protein interaction network, they discover that the top predictive genes cluster in the interferon-gamma signaling pathway, suggesting genes in this pathway as a candidate biomarker panel for patient stratification in clinical trials.
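One common way to find the strongest contributors in a scenario like this is to rank features by their mean absolute SHAP value across patients. The sketch below illustrates that ranking. The matrix values are made up, and the names marker_1 and marker_2 are placeholders, since the use case does not name the immune cell markers.

```python
from statistics import mean

# Made-up SHAP values (rows = patients, columns = features), for illustration only.
features = ["PD_L1", "TMB", "marker_1", "marker_2"]
shap_matrix = [
    [0.30, 0.20, 0.05, -0.01],
    [0.25, -0.10, 0.10, 0.02],
    [-0.35, 0.15, 0.00, 0.00],
]


def global_importance(feature_names, matrix):
    """Rank features by mean absolute SHAP value across samples."""
    scores = {
        name: mean(abs(row[j]) for row in matrix)
        for j, name in enumerate(feature_names)
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


ranking = global_importance(features, shap_matrix)
top_features = [name for name, _ in ranking[:2]]
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Averaging absolute values keeps positive and negative contributions from cancelling out, so a feature that strongly raises the prediction for some patients and lowers it for others still ranks as important.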