manifold learning
Definition
Manifold learning is a family of non-linear dimensionality reduction techniques built on the assumption that high-dimensional data lies on or near a lower-dimensional manifold (a curved surface) embedded in the high-dimensional space. Unlike linear methods such as PCA, manifold learning algorithms such as t-SNE, UMAP, and Isomap capture non-linear relationships in the data: t-SNE and UMAP emphasize preserving local neighborhood structure, while Isomap preserves geodesic distances measured along the manifold. This matters in biological research because biological systems often behave non-linearly. Applied to multi-omics datasets, single-cell sequencing data, and protein structures, manifold learning projects samples into 2D or 3D so that samples that are similar in the original space tend to stay close together, helping researchers identify cell types, disease states, or functional clusters that are difficult to see in the raw high-dimensional data.
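To make the contrast with linear methods concrete, here is a minimal NumPy sketch of the Isomap idea (neighborhood graph, shortest-path geodesic distances, then classical multidimensional scaling) applied to a synthetic 2D spiral. The function names, the spiral data, and the parameter values are illustrative assumptions, and a real analysis would normally rely on a tested library implementation rather than this toy version.

```python
import numpy as np


def make_spiral(n_points=150, turns=1.5):
    """Synthetic data: points on a 2D Archimedean spiral (a 1D manifold)."""
    t = np.linspace(0.0, turns * 2.0 * np.pi, n_points)
    X = np.column_stack([t * np.cos(t), t * np.sin(t)])
    # Approximate arc length along the curve: the "true" 1D coordinate.
    steps = np.linalg.norm(np.diff(X, axis=0), axis=1)
    arc_length = np.concatenate([[0.0], np.cumsum(steps)])
    return X, arc_length


def isomap_sketch(X, n_neighbors=5, n_components=1):
    """Toy Isomap: kNN graph -> geodesic distances -> classical MDS."""
    n = X.shape[0]
    # 1. Pairwise Euclidean distances (dense; fine for small n only).
    diff = X[:, None, :] - X[None, :, :]
    D = np.sqrt((diff ** 2).sum(axis=-1))

    # 2. k-nearest-neighbor graph: keep only short, local edges.
    #    Column 0 of the argsort is the point itself (assumes distinct points).
    G = np.full((n, n), np.inf)
    np.fill_diagonal(G, 0.0)
    nearest = np.argsort(D, axis=1)[:, 1:n_neighbors + 1]
    rows = np.repeat(np.arange(n), n_neighbors)
    cols = nearest.ravel()
    G[rows, cols] = D[rows, cols]
    G = np.minimum(G, G.T)  # make the graph undirected

    # 3. Geodesic distances: shortest paths through the graph (Floyd-Warshall).
    for k in range(n):
        G = np.minimum(G, G[:, [k]] + G[[k], :])
    if not np.isfinite(G).all():
        raise ValueError("neighbor graph is disconnected; increase n_neighbors")

    # 4. Classical MDS on the geodesic distance matrix.
    J = np.eye(n) - np.full((n, n), 1.0 / n)
    B = -0.5 * J @ (G ** 2) @ J
    eigvals, eigvecs = np.linalg.eigh(B)
    top = np.argsort(eigvals)[::-1][:n_components]
    return eigvecs[:, top] * np.sqrt(np.maximum(eigvals[top], 0.0))


def pca_first_component(X):
    """Linear baseline: projection onto the first principal component."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[0]


X, arc_length = make_spiral()
iso_coord = isomap_sketch(X)[:, 0]
pca_coord = pca_first_component(X)

# How well does each 1D coordinate track position along the spiral?
iso_corr = abs(np.corrcoef(iso_coord, arc_length)[0, 1])
pca_corr = abs(np.corrcoef(pca_coord, arc_length)[0, 1])
print(f"|corr| with arc length: Isomap sketch={iso_corr:.3f}, PCA={pca_corr:.3f}")
```

On this toy data the Isomap coordinate should track arc length along the spiral much more closely than the PCA coordinate, because a straight-line projection folds different turns of the spiral on top of each other. The dense distance matrix and the Floyd-Warshall loop scale poorly, so this sketch is only suitable for a few hundred points.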
Visualize manifold learning in Nodes Bio
Researchers can use manifold learning outputs to construct network graphs in which nodes represent samples (cells, patients, proteins) and edges connect samples that are neighbors in the learned manifold space. In such a graph, groups of similar samples appear as network communities. Nodes Bio enables visualization of these manifold-derived networks alongside molecular interaction data, helping identify which genes or pathways drive the clustering patterns observed in the reduced-dimensional space.
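The graph construction described above can be sketched as a k-nearest-neighbor edge list built from embedding coordinates. This is a minimal NumPy illustration: `knn_edges`, `connected_components`, and the six toy coordinates are hypothetical names and data, and the generic (source, target, distance) edge tuples are an assumption, not a documented Nodes Bio import format. Connected components are used here only as a crude stand-in for community detection.

```python
import numpy as np


def knn_edges(coords, k=2):
    """Return undirected k-nearest-neighbor edges as (i, j, distance), i < j."""
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)  # a sample is not its own neighbor
    neighbors = np.argsort(dist, axis=1)[:, :k]
    edges = {}
    for i, row in enumerate(neighbors):
        for j in row:
            j = int(j)
            a, b = (i, j) if i < j else (j, i)
            edges[(a, b)] = float(dist[a, b])  # dict removes duplicate pairs
    return sorted((a, b, w) for (a, b), w in edges.items())


def connected_components(n_nodes, edges):
    """Label nodes by connected component using a simple union-find."""
    parent = list(range(n_nodes))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b, _ in edges:
        parent[find(a)] = find(b)
    roots = [find(i) for i in range(n_nodes)]
    relabel = {r: idx for idx, r in enumerate(dict.fromkeys(roots))}
    return [relabel[r] for r in roots]


# Toy 2D "embedding": two well-separated groups of three samples each.
coords = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
edges = knn_edges(coords, k=2)
labels = connected_components(len(coords), edges)

for source, target, weight in edges:
    print(source, target, round(weight, 3))
print("component labels:", labels)
```

The dense distance matrix grows quadratically with the number of samples, so for datasets with tens of thousands of cells a tree-based or approximate nearest-neighbor search would be the more practical choice.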
Visualization Ideas:
- Single-cell RNA-seq manifold networks with cells as nodes colored by cell type or expression levels
- Patient similarity networks derived from manifold coordinates of multi-omics data
- Protein conformational space networks showing structural relationships learned from high-dimensional feature spaces
Example Use Case
A cancer researcher performs single-cell RNA sequencing on tumor samples, generating expression data for 20,000 genes across 50,000 cells. Using UMAP, they reduce the data to 2D coordinates that reveal distinct cell populations. They then import these coordinates into Nodes Bio as a network in which similar cells are connected, and overlay known immune checkpoint genes. The overlay shows that a previously uncharacterized subpopulation of tumor-infiltrating lymphocytes highly expresses potential novel immunotherapy targets, suggesting a possible new therapeutic strategy for resistant tumors.
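The overlay step in this scenario amounts to summarizing marker-gene expression per cell population. The sketch below illustrates that calculation on synthetic counts only: the count matrix is randomly generated, cluster 2 is deliberately constructed to over-express the checkpoint genes, and the gene list (PDCD1, CTLA4, LAG3, plus ACTB and GAPDH as housekeeping genes) and the helper `mean_expression_by_cluster` are illustrative choices, not part of the use case above or of any Nodes Bio API.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for a cells x genes count matrix (NOT real data).
genes = ["PDCD1", "CTLA4", "LAG3", "ACTB", "GAPDH"]
checkpoint_genes = ["PDCD1", "CTLA4", "LAG3"]
checkpoint_idx = [genes.index(g) for g in checkpoint_genes]

# Hypothetical cluster assignments, e.g. from clustering the embedding.
cells_per_cluster = 50
cluster_labels = np.repeat([0, 1, 2], cells_per_cluster)

# Baseline expression everywhere; cluster 2 is built to over-express
# the checkpoint genes so the example has a known answer.
rates = np.ones((cluster_labels.size, len(genes)))
rates[np.ix_(cluster_labels == 2, checkpoint_idx)] = 8.0
counts = rng.poisson(rates)


def mean_expression_by_cluster(counts, labels, gene_idx):
    """Mean log1p expression of the selected genes, averaged per cluster."""
    per_cell = np.log1p(counts[:, gene_idx]).mean(axis=1)
    return {int(c): float(per_cell[labels == c].mean()) for c in np.unique(labels)}


scores = mean_expression_by_cluster(counts, cluster_labels, checkpoint_idx)
top_cluster = max(scores, key=scores.get)

for cluster, score in scores.items():
    print(f"cluster {cluster}: mean log1p checkpoint expression = {score:.2f}")
print("cluster with highest checkpoint expression:", top_cluster)
```

Because the elevated cluster is built into the synthetic data, the result here is true by design. With real data, the per-cluster scores would typically be mapped to node colors in the network view and followed up with a proper differential expression test.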