The mission of the Cancer Cell Map Initiative (CCMI) is to enable a new era of cancer discovery and treatment based on the complete elucidation of the molecular networks underlying cancer. This information will be critical for developing computational models of cancer cells that will enable both basic research and clinical decision-making.

Before the first human genome was sequenced, there was an expectation that the genetic code would reveal the secrets of life and lead to new treatments for diseases. Completed early this century, the Human Genome Project created a catalog of our roughly 20,000 genes but did not tell us how these genes work together or what goes wrong when people get sick. Recent work comparing genomes from tumors with those of healthy tissues found that any given mutation is quite rare; hardly ever do two patients have mutations in the same genes, except for a few well-known cancer genes.

Genome analysis has long worked according to the laws of statistical association. To firmly link a mutation to disease, we need to observe that the mutation occurs more often than would be expected by chance. However, the heterogeneity described above means that recurrent patterns are not observed for most mutations. To make matters worse, patients whose tumors carry such rare mutations are now often labeled an “N-of-1”, to capture the idea that they cannot be grouped with other individuals to be analyzed and treated as a larger cohort. Patients enduring this fate stand alone, without a friend even in disease.

The CCMI is generating comprehensive maps of the key protein-protein and genetic interactions underlying cancer, and is developing computational methods that use these maps to identify new drug targets and groups of patients with shared outcomes. Protein-protein interaction maps – the complete set of proteins that bind to another protein – tell us about the physical structure of cancer cells. Genetic interaction maps – which show how deleting one gene changes how cells respond to the loss of another gene – tell us how groups of genes function as pathways and networks. New drug targets and patient subtypes will be identified using a variety of machine learning algorithms running on a state-of-the-art supercomputer cluster. Unsupervised methods such as clustering and network propagation will be used to identify patient subtypes and drug targets, respectively, and supervised methods like neural networks will be trained to predict outcomes based on genetic information.
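
To make the idea of network propagation concrete, here is a minimal sketch in Python. It is not the CCMI's own pipeline. It assumes one common formulation, a random walk with restart, in which a score placed on a mutated "seed" gene spreads across an interaction map so that nearby genes also receive signal. The gene names (G1 to G5), the toy network, the function name `propagate`, and the parameter values are all hypothetical and chosen only for illustration.

```python
def propagate(edges, seeds, alpha=0.5, n_iter=100, tol=1e-9):
    """Random walk with restart on an undirected network.

    edges: iterable of (gene, gene) pairs (a hypothetical interaction map)
    seeds: set of genes that start with signal (e.g. mutated genes);
           every seed is assumed to appear in the network
    alpha: fraction of score passed to neighbors at each step
    """
    # Build an adjacency list from the edge list.
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    nodes = sorted(neighbors)

    # Initial scores: the seed genes share one unit of signal.
    f0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    f = dict(f0)

    for _ in range(n_iter):
        new = {}
        for n in nodes:
            # Each neighbor passes on its score divided by its degree.
            spread = sum(f[m] / len(neighbors[m]) for m in neighbors[n])
            # Mix the propagated signal with a restart at the seeds.
            new[n] = alpha * spread + (1 - alpha) * f0[n]
        delta = max(abs(new[n] - f[n]) for n in nodes)
        f = new
        if delta < tol:  # stop once the scores have stabilized
            break
    return f


# Toy chain of five hypothetical genes; only G1 is mutated.
edges = [("G1", "G2"), ("G2", "G3"), ("G3", "G4"), ("G4", "G5")]
scores = propagate(edges, seeds={"G1"})

# Print genes from highest to lowest propagated score.
for gene, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(gene, round(score, 4))
```

In this toy chain the score falls off with distance from the mutated gene. In this kind of analysis, two patients with mutations in different genes can still end up with similar propagated profiles when their mutations fall in the same neighborhood of the map, which is what would allow "N-of-1" patients to be grouped with others.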