Pan-cancer analysis

Last updated January 13, 2024

Pan-cancer analysis aims to examine the similarities and differences among the genomic and cellular alterations found across diverse tumor types.^[1]^[2] International efforts have performed pan-cancer analysis on exomes and the whole genomes of cancers, the latter including their non-coding regions. In 2018, The Cancer Genome Atlas (TCGA) Research Network used exome, transcriptome, and DNA methylome data to develop an integrated picture of commonalities, differences, and emergent themes across tumor types.

Another project, pan-cancer analysis of RNA-binding proteins (RBPs) across human cancers,^[4] explored the expression, somatic copy number alteration, and mutation profiles of 1,542 RBPs in ~7,000 clinical specimens across 15 cancer types. This study characterized the oncogenic properties of six RBPs—NSUN6, ZC3H13, BYSL, ELAC1, RBMS3, and ZGPAT—in colorectal and liver cancer cell lines.

Several studies have found a causal, predictable connection between genomic alterations (single-nucleotide variants or large copy number variants) and gene expression across all tumor types. Because of this pan-cancer relationship between genomic status and quantitative transcriptomic data, a specific genomic alteration can be predicted from gene expression profiles alone;^[5] the relationship can also serve as the basis for machine learning approaches.
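As a rough illustration of predicting an alteration from expression alone, the sketch below trains a nearest-centroid classifier on synthetic data. This is not the method of the cited study: the cohort, the three "responsive" genes, the effect size, and all function names are assumptions made up for the example.

```python
import random


def make_synthetic_cohort(n_samples=200, n_genes=10, seed=0):
    """Simulate expression profiles; the first three genes shift upward in mutated samples (assumption)."""
    rng = random.Random(seed)
    profiles, labels = [], []
    for _ in range(n_samples):
        mutated = rng.randint(0, 1)  # 1 = sample carries the hypothetical alteration
        profile = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]
        for gene in range(3):
            profile[gene] += 2.0 * mutated
        profiles.append(profile)
        labels.append(mutated)
    return profiles, labels


def centroid(rows):
    """Mean expression per gene over a list of profiles."""
    return [sum(column) / len(rows) for column in zip(*rows)]


def fit_centroids(profiles, labels):
    """Compute one mean expression profile per alteration status."""
    return {
        label: centroid([p for p, l in zip(profiles, labels) if l == label])
        for label in set(labels)
    }


def predict(profile, centroids):
    """Assign the status whose centroid is closest in squared Euclidean distance."""
    def squared_distance(center):
        return sum((x - m) ** 2 for x, m in zip(profile, center))
    return min(centroids, key=lambda label: squared_distance(centroids[label]))


profiles, labels = make_synthetic_cohort()
train_x, train_y = profiles[:150], labels[:150]
test_x, test_y = profiles[150:], labels[150:]

centroids = fit_centroids(train_x, train_y)
accuracy = sum(predict(p, centroids) == y for p, y in zip(test_x, test_y)) / len(test_y)
print(f"held-out accuracy: {accuracy:.2f}")
```

A nearest-centroid rule is used here only because it is short and transparent; published approaches typically rely on more elaborate models and real cohorts.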

Pan-cancer studies

Pan-cancer studies aim to detect genes whose mutation is conducive to oncogenesis, as well as genomic events or aberrations that recur across different tumors. These studies require data from multiple platforms to be standardized, with shared criteria for how different researchers process the data and present the results. Omics data allow the rapid identification and quantification of thousands of molecules in a single experiment. Genomics addresses which genes have the potential to be expressed, proteomics addresses which genes are in fact being expressed, and metabolomics addresses what has happened in the tissue being studied. Combined, they give information about the biological system as a whole.
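One simple way to put measurements from different platforms on a common scale is to z-score each platform separately. The sketch below assumes this approach purely for illustration; the platform names and values are invented, and real harmonization pipelines usually involve more careful batch-effect handling.

```python
from statistics import mean, pstdev


def zscore_by_platform(measurements):
    """Standardize one gene's values within each platform to mean 0 and unit variance.

    measurements: dict mapping a platform name to a list of raw values.
    """
    standardized = {}
    for platform, values in measurements.items():
        mu = mean(values)
        sigma = pstdev(values)  # population standard deviation
        if sigma == 0:
            # A constant series carries no relative information; map it to zeros.
            standardized[platform] = [0.0 for _ in values]
        else:
            standardized[platform] = [(v - mu) / sigma for v in values]
    return standardized


# Hypothetical values for a single gene measured on two platforms with different units.
raw = {
    "microarray": [5.0, 7.0, 9.0],
    "rnaseq": [100.0, 200.0, 300.0],
}
scaled = zscore_by_platform(raw)
for platform, values in scaled.items():
    print(platform, [round(v, 3) for v in values])
```

After scaling, the two series become directly comparable even though their raw units differ by orders of magnitude.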

Comparison of primary and metastatic solid tumors

"Pan-cancer whole-genome comparison of primary and metastatic solid tumours" is a study published in Nature in 2023 that explored genomic differences between untreated early-stage primary tumors and treated late-stage metastatic tumors. Through a harmonized analysis of 7,108 whole-genome-sequenced tumors across 23 cancer types, the study aimed to understand how genomic changes affect disease progression and therapy resistance.^[6]

Overview

Metastatic tumors exhibited lower intratumor heterogeneity and conserved karyotypes, displaying modest increases in mutations but elevated frequencies of structural variants. The study highlighted the variable contributions of mutational footprints and identified specific genomic differences between primary and metastatic stages across various cancer types.

Methodology and findings

The study processed 7,108 tumor genomes, harmonizing data from two unpaired primary and metastatic cohorts.
Metastatic tumors generally displayed increased clonality, while the karyotype remained mostly conserved, except for certain cancer types like prostate, thyroid, and kidney renal clear cell carcinomas.
Tumor mutation burden (TMB) exhibited moderate increases in metastatic tumors, with notable exceptions in specific cancers like breast, cervical, thyroid, prostate carcinomas, and pancreatic neuroendocrine tumors.
Mutational signature analysis revealed significant enrichment of mutational processes linked to environmental exposures and endogenous mechanisms, notably platinum-based chemotherapies, APOBEC mutagenesis, and clock-like mutational processes.

Clinical implications and therapeutic resistance

The identification of treatment-associated driver alterations (TEDs) in metastatic tumors highlighted the potential implications for therapy resistance.
Several cancer types showed increased driver alterations in metastatic tumors, including genes associated with resistance to specific therapies, such as AR-activating mutations in prostate cancer and ESR1 mutations in breast cancer.

Conclusion

The study demonstrated substantial genomic differences between primary and metastatic tumors across multiple cancer types. However, these differences varied considerably among cancers, influencing the genomic landscape and potential therapeutic responses. Further research and larger datasets are needed to fully understand the complexities of tumor evolution, metastasis, and therapy resistance.

Significance

The findings offer valuable insights into tumor progression and therapy resistance mechanisms, laying the groundwork for potential personalized treatment strategies across various cancers.

Resources and databases

The nearly 800 terabytes of data from the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes project have been made available through various portals and repositories, including those at the Ontario Institute for Cancer Research, the European Molecular Biology Laboratory's European Bioinformatics Institute, and the National Center for Biotechnology Information. All data obtained from the TCGA efforts are available at the US National Cancer Institute's TARGET Data Matrix and the web portal ProteinPaint.^[7]

StarBase pan-cancer resources^[8] were created for the networks of long noncoding RNAs, microRNAs, competing endogenous RNAs and RBPs.

Related Research Articles

A tumor suppressor gene (TSG), or anti-oncogene, is a gene that regulates a cell during cell division and replication. If the cell grows uncontrollably, it will result in cancer. When a tumor suppressor gene is mutated, it results in a loss or reduction in its function. In combination with other genetic mutations, this could allow the cell to grow abnormally. The loss of function for these genes may be even more significant in the development of human cancers, compared to the activation of oncogenes.

Gene duplication is a major mechanism through which new genetic material is generated during molecular evolution. It can be defined as any duplication of a region of DNA that contains a gene. Gene duplications can arise as products of several types of errors in DNA replication and repair machinery as well as through fortuitous capture by selfish genetic elements. Common sources of gene duplications include ectopic recombination, retrotransposition events, aneuploidy, polyploidy, and replication slippage.

RNA-binding proteins (RBPs) are proteins that bind to double- or single-stranded RNA in cells and participate in forming ribonucleoprotein complexes. RBPs contain various structural motifs, such as the RNA recognition motif (RRM), dsRNA binding domain, zinc finger and others. They are cytoplasmic and nuclear proteins. However, since most mature RNA is exported from the nucleus relatively quickly, most RBPs in the nucleus exist as complexes of protein and pre-mRNA called heterogeneous ribonucleoprotein particles (hnRNPs). RBPs have crucial roles in various cellular processes such as cellular function, transport and localization. They play an especially important role in post-transcriptional control of RNAs, including splicing, polyadenylation, mRNA stabilization, mRNA localization and translation. Eukaryotic cells express diverse RBPs with unique RNA-binding activity and protein–protein interaction. According to the Eukaryotic RBP Database (EuRBPDB), there are 2961 genes encoding RBPs in humans. During evolution, the diversity of RBPs greatly increased with the increase in the number of introns. Diversity enabled eukaryotic cells to utilize RNA exons in various arrangements, giving rise to a unique RNP (ribonucleoprotein) for each RNA. Although RBPs have a crucial role in post-transcriptional regulation of gene expression, relatively few RBPs have been studied systematically. It has now become clear that RNA–RBP interactions play important roles in many biological processes among organisms.

Oncogenomics is a sub-field of genomics that characterizes cancer-associated genes. It focuses on genomic, epigenomic and transcript alterations in cancer.

KRAS is a gene that provides instructions for making a protein called K-Ras, a part of the RAS/MAPK pathway. The protein relays signals from outside the cell to the cell's nucleus. These signals instruct the cell to grow and divide (proliferate) or to mature and take on specialized functions (differentiate). It is called KRAS because it was first identified as a viral oncogene in the Kirsten rat sarcoma virus. The oncogene identified was derived from a cellular genome, so KRAS, when found in a cellular genome, is called a proto-oncogene.

The Cancer Genome Atlas (TCGA) is a project to catalogue the genetic mutations responsible for cancer using genome sequencing and bioinformatics. The overarching goal was to apply high-throughput genome analysis techniques to improve the ability to diagnose, treat, and prevent cancer through a better understanding of the genetic basis of the disease.

Post-transcriptional regulation is the control of gene expression at the RNA level. It occurs once the RNA polymerase has been attached to the gene's promoter and is synthesizing the nucleotide sequence. Therefore, as the name indicates, it occurs between the transcription phase and the translation phase of gene expression. These controls are critical for the regulation of many genes across human tissues. It also plays a big role in cell physiology, being implicated in pathologies such as cancer and neurodegenerative diseases.

The International Cancer Genome Consortium (ICGC) is a voluntary scientific organization that provides a forum for collaboration among the world's leading cancer and genomic researchers. The ICGC was launched in 2008 to coordinate large-scale cancer genome studies in tumours from 50 cancer types and/or subtypes that are of main importance across the globe.

Cancer genome sequencing is the whole genome sequencing of a single, homogeneous or heterogeneous group of cancer cells. It is a biochemical laboratory method for the characterization and identification of the DNA or RNA sequences of cancer cell(s).

Genome instability refers to a high frequency of mutations within the genome of a cellular lineage. These mutations can include changes in nucleic acid sequences, chromosomal rearrangements or aneuploidy. Genome instability does occur in bacteria. In multicellular organisms genome instability is central to carcinogenesis, and in humans it is also a factor in some neurodegenerative diseases such as amyotrophic lateral sclerosis or the neuromuscular disease myotonic dystrophy.

Cancer epigenetics is the study of epigenetic modifications to the DNA of cancer cells that do not involve a change in the nucleotide sequence, but instead involve a change in the way the genetic code is expressed. Epigenetic mechanisms are necessary to maintain normal sequences of tissue-specific gene expression and are crucial for normal development. They may be just as important as, if not more important than, genetic mutations in a cell's transformation to cancer. The disturbance of epigenetic processes in cancers can lead to a loss of expression of genes that occurs about 10 times more frequently by transcription silencing than by mutations. As Vogelstein et al. point out, in a colorectal cancer there are usually about 3 to 6 driver mutations and 33 to 66 hitchhiker or passenger mutations. However, in colon tumors compared to adjacent normal-appearing colonic mucosa, there are about 600 to 800 heavily methylated CpG islands in the promoters of genes in the tumors, while these CpG islands are not methylated in the adjacent mucosa. Manipulation of epigenetic alterations holds great promise for cancer prevention, detection, and therapy. In different types of cancer, a variety of epigenetic mechanisms can be perturbed, such as the silencing of tumor suppressor genes and activation of oncogenes by altered CpG island methylation patterns, histone modifications, and dysregulation of DNA binding proteins. Several medications with epigenetic impact are now used in a number of these diseases.

The Icahn Genomics Institute is a biomedical and genomics research institute within the Icahn School of Medicine at Mount Sinai in New York City. Its aim is to establish a new generation of medicines that can better treat diseases afflicting the world, including cancer, heart disease and infectious pathogens. To do this, the institute’s doctors and scientists are developing and employing new types of treatments that utilize DNA and RNA based therapies, such as CRISPR, siRNA, RNA vaccines, and CAR T cells, and searching for novel drug targets through the use of functional genomics and data science. The institute is led by Brian Brown, a leading expert in gene therapy, genetic engineering, and molecular immunology.

Squamous-cell carcinoma (SCC) of the lung is a histologic type of non-small-cell lung carcinoma (NSCLC). It is the second most prevalent type of lung cancer after lung adenocarcinoma and it originates in the bronchi. Its tumor cells are characterized by a squamous appearance, similar to the one observed in epidermal cells. Squamous-cell carcinoma of the lung is strongly associated with tobacco smoking, more than any other forms of NSCLC.

Circulating tumor DNA (ctDNA) is tumor-derived fragmented DNA in the bloodstream that is not associated with cells. ctDNA should not be confused with cell-free DNA (cfDNA), a broader term which describes DNA that is freely circulating in the bloodstream, but is not necessarily of tumor origin. Because ctDNA may reflect the entire tumor genome, it has gained traction for its potential clinical utility; "liquid biopsies" in the form of blood draws may be taken at various time points to monitor tumor progression throughout the treatment regimen.

Mutational signatures are characteristic combinations of mutation types arising from specific mutagenesis processes such as DNA replication infidelity, exogenous and endogenous genotoxin exposures, defective DNA repair pathways, and DNA enzymatic editing.
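As a small illustration of how "mutation types" are commonly tallied before signature analysis, the sketch below groups single-base substitutions into the six classes conventionally reported relative to the pyrimidine base (C>A, C>G, C>T, T>A, T>C, T>G). The function names and input list are assumptions for this example; full signature extraction also considers flanking bases and uses decomposition methods that are not shown here.

```python
from collections import Counter

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def substitution_class(ref, alt):
    """Return the pyrimidine-centred substitution class, e.g. G>A is reported as C>T."""
    ref, alt = ref.upper(), alt.upper()
    if ref not in COMPLEMENT or alt not in COMPLEMENT or ref == alt:
        raise ValueError(f"not a single-base substitution: {ref}>{alt}")
    if ref in ("A", "G"):
        # Purine reference: describe the change on the complementary strand instead.
        ref, alt = COMPLEMENT[ref], COMPLEMENT[alt]
    return f"{ref}>{alt}"


def mutation_spectrum(snvs):
    """Count substitution classes over an iterable of (ref, alt) pairs."""
    return Counter(substitution_class(ref, alt) for ref, alt in snvs)


# Hypothetical list of somatic single-nucleotide variants as (reference, alternate) bases.
spectrum = mutation_spectrum([("C", "T"), ("G", "A"), ("T", "G"), ("A", "C")])
print(dict(spectrum))
```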

Personalized onco-genomics (POG) is the field of oncology and genomics that is focused on using whole genome analysis to make personalized clinical treatment decisions. The program was devised at British Columbia's BC Cancer Agency and is currently being led by Marco Marra and Janessa Laskin. Genome instability has been identified as one of the underlying hallmarks of cancer. The genetic diversity of cancer cells promotes multiple other cancer hallmark functions that help them survive in their microenvironment and eventually metastasise. The pronounced genomic heterogeneity of tumours has led researchers to develop an approach that assesses each individual's cancer to identify targeted therapies that can halt cancer growth. Identification of these "drivers" and corresponding medications used to possibly halt these pathways are important in cancer treatment.

Cancer pharmacogenomics is the study of how variants in the genome influence an individual's response to different cancer drug treatments. It is a subset of the broader field of pharmacogenomics, which is the area of study aimed at understanding how genetic variants influence drug efficacy and toxicity.

Tumour mutational burden is a genetic characteristic of tumorous tissue that can be informative to cancer research and treatment. It is defined as the number of non-inherited mutations per million bases (Mb) of investigated genomic sequence, and its measurement has been enabled by next generation sequencing. High TMB and DNA damage repair mutations were discovered to be associated with superior clinical benefit from immune checkpoint blockade therapy by Timothy Chan and colleagues at the Memorial Sloan Kettering Cancer Center.
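The definition above reduces to a single division. The sketch below works through it with made-up numbers; the function name and the example counts are assumptions, and real pipelines differ in which mutations they count and how they measure the callable region.

```python
def tumor_mutational_burden(somatic_mutation_count, sequenced_bases):
    """Return non-inherited (somatic) mutations per megabase of investigated sequence."""
    if sequenced_bases <= 0:
        raise ValueError("sequenced_bases must be positive")
    megabases = sequenced_bases / 1_000_000
    return somatic_mutation_count / megabases


# Hypothetical sample: 450 somatic mutations found across 30 Mb of sequenced territory.
tmb = tumor_mutational_burden(450, 30_000_000)
print(f"TMB = {tmb:.1f} mutations/Mb")
```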

Personalized genomics is the human genetics-derived study of analyzing and interpreting individualized genetic information by genome sequencing to identify genetic variations compared to the library of known sequences. International genetics communities have long cooperated on research projects to determine the DNA sequence of the human genome using DNA sequencing techniques. The most commonly used methods are whole exome sequencing and whole genome sequencing. Both approaches are used to identify genetic variations. As genome sequencing became more cost-effective over time, it became applicable in the medical field, allowing scientists to understand which genes are associated with specific diseases.

Precision diagnostics is a branch of precision medicine that involves precisely managing a patient's healthcare model and diagnosing specific diseases based on customized omics data analytics.

References

[1] Cancer Genome Atlas Research, Network; Weinstein, JN; Collisson, EA; Mills, GB; Shaw, KR; Ozenberger, BA; Ellrott, K; Shmulevich, I; Sander, C; Stuart, JM (Oct 2013). "The Cancer Genome Atlas Pan-Cancer analysis project". Nature Genetics. 45 (10): 1113–20. doi:10.1038/ng.2764. PMC 3919969. PMID 24071849.
[2] Omberg, L; Ellrott, K; Yuan, Y; Kandoth, C; Wong, C; Kellen, MR; Friend, SH; Stuart, J; Liang, H; Margolin, AA (Oct 2013). "Enabling transparent and collaborative computational analysis of 12 tumor types within The Cancer Genome Atlas". Nature Genetics. 45 (10): 1121–6. doi:10.1038/ng.2761. PMC 3950337. PMID 24071850.
[3] The ICGC/TCGA Pan-Cancer Analysis of Whole Genomes Consortium (5 February 2020). "Pan-cancer analysis of Whole Genomes". Nature. 578 (7793): 82–93. Bibcode:2020Natur.578...82I. doi:10.1038/s41586-020-1969-6. PMC 7025898. PMID 32025007.
[4] Wang, ZL; Li, B; Luo, YX; Lin, Q; Liu, SR; Zhang, XQ; Zhou, H; Yang, JH; Qu, LH (2 January 2018). "Comprehensive Genomic Characterization of RNA-Binding Proteins across Human Cancers". Cell Reports. 22 (1): 286–298. doi:10.1016/j.celrep.2017.12.035. PMID 29298429.
[5] Mercatelli, Daniele; Ray, Forest; Giorgi, Federico M. (2019). "Pan-Cancer and Single-Cell Modeling of Genomic Alterations Through Gene Expression". Frontiers in Genetics. 10: 671. doi:10.3389/fgene.2019.00671. ISSN 1664-8021. PMC 6657420. PMID 31379928.
[6] Martínez-Jiménez, Francisco; Movasati, Ali; Brunner, Sascha Remy; Nguyen, Luan; Priestley, Peter; Cuppen, Edwin; Van Hoeck, Arne (June 2023). "Pan-cancer whole-genome comparison of primary and metastatic solid tumours". Nature. 618 (7964): 333–341. Bibcode:2023Natur.618..333M. doi:10.1038/s41586-023-06054-z. ISSN 1476-4687. PMC 10247378. PMID 37165194.
[7] "Exploring genomic alteration in pediatric cancer using ProteinPaint". Nature Genetics.
[8] Li, JH; Liu, S; Zhou, H; Qu, LH; Yang, JH (January 2014). "starBase v2.0: decoding miRNA-ceRNA, miRNA-ncRNA and protein-RNA interaction networks from large-scale CLIP-Seq data". Nucleic Acids Research. 42 (Database issue): D92-7. doi:10.1093/nar/gkt1248. PMC 3964941. PMID 24297251.

