Cancer is a category of disease characterized by uncontrolled cell growth and proliferation. For cancer to develop, genes regulating cell growth and differentiation must be altered; these mutations are then maintained through subsequent cell divisions and are thus present in all cancerous cells. Gene expression profiling is a technique used in molecular biology to query the expression of thousands of genes simultaneously. In the context of cancer, gene expression profiling has been used to more accurately classify tumors. The information derived from gene expression profiling often helps in predicting the patient's clinical outcome.
Oncogenesis is the process by which normal cells acquire the properties of cancer cells leading to the formation of a cancer or tumor (see: tumorigenesis). It is characterized by a molecular reprogramming of a cell to undergo uninhibited cell division, allowing the formation of a malignant mass. The cells forming this mass undergo natural selection: as cells acquire mutations that enhance their survivability or reproductive capacity, they dominate the growing tumor as other cells are out-competed (see: somatic evolution in cancer). Because of these selective properties, the majority of cells within a tumor will share a common profile of gene expression.
Gene expression profiling is a technique used in molecular biology to query the expression of thousands of genes simultaneously. While almost all cells in an organism contain the entire genome of the organism, only a small subset of those genes is expressed as messenger RNA (mRNA) at any given time, and their relative expression can be evaluated. Techniques include DNA microarray technology and sequence-based techniques such as serial analysis of gene expression (SAGE).
Current cancer research makes use primarily of DNA microarrays, in which an arrayed series of microscopic spots of pre-defined DNA oligonucleotides, known as probes, are covalently attached to a solid surface such as glass, forming what is known as a gene chip. DNA labeled with fluorophores (the target) is prepared from a sample such as a tumor biopsy and is hybridized to the complementary DNA (cDNA) sequences on the gene chip. The chip is then scanned for the presence and strength of the fluorescent labels at each spot representing probe-target hybrids. The level of fluorescence at a particular spot provides quantitative information about the expression of the particular gene corresponding to the spotted cDNA sequence. DNA microarrays evolved from Southern blotting, which allows for detection of a specific DNA sequence in a sample of DNA.
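As a rough illustration of how spot fluorescence becomes an expression measure, the sketch below converts intensities into log2 ratios of a tumor sample against a reference sample. The two-sample comparison, the gene names, the intensity values, and the `floor` parameter are all assumptions made for illustration; they are not taken from any study cited here.

```python
import math


def log2_ratios(tumor, reference, floor=1.0):
    """Return log2(tumor / reference) for each gene.

    tumor, reference: dicts mapping a gene name to its spot intensity.
    floor: hypothetical lower bound that avoids dividing by zero or
           taking the logarithm of zero for very dim spots.
    """
    return {
        gene: math.log2(max(tumor[gene], floor) / max(reference[gene], floor))
        for gene in tumor
    }


# Made-up intensities for three hypothetical genes.
tumor_spots = {"GENE_A": 800.0, "GENE_B": 100.0, "GENE_C": 400.0}
reference_spots = {"GENE_A": 200.0, "GENE_B": 400.0, "GENE_C": 400.0}

ratios = log2_ratios(tumor_spots, reference_spots)
```

A positive value indicates higher expression in the tumor sample than in the reference, a negative value lower expression, and zero no difference.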
As costs fall, RNA sequencing (RNA-Seq) is becoming more common as a method for cancer gene expression profiling. Its advantage over microarray techniques is that it avoids the bias inherent in probe selection.
Classification of cancers has been dominated by the fields of histology and histopathology, which aim to leverage morphological markers for accurate identification of a tumor type. Histological methods rely on chemical staining of tissues with pigments such as haematoxylin and eosin and microscopy-based visualization by a pathologist. The identification of tumor subtypes is based on established classification schemes such as the International Classification of Diseases published by the World Health Organization, which provides codes to classify diseases and a wide variety of signs, symptoms, abnormal findings, complaints, social circumstances, and external causes of injury or disease. For some types of cancer, these methods are unable to distinguish between subclasses; for example, attempts to define subgroups of diffuse large B-cell lymphoma (DLBCL) have largely failed because of poor inter- and intra-observer reproducibility. [1] Furthermore, the clinical outcomes of tumors classified as DLBCLs are highly variable, [1] suggesting that there are multiple subtypes of DLBCL that cannot be distinguished based on these histological markers. Breast tumor classification based on these predictors has also largely failed. [2] Development of effective therapies depends on accurate diagnosis; additionally, poor diagnosis can lead to patient suffering due to needless side-effects from non-targeted treatments and to increased health care expenditure. Most telling perhaps is that 70–80% of breast cancer patients receiving chemotherapy based on traditional predictors would have survived without it. [3] [4]
Of note, similar gene expression patterns associated with metastatic behaviour of breast cancer tumor cells have also been found in canine breast cancer, the most common tumor of the female dog. [5] [6]
Presented below are ways that gene expression profiling has been used to more precisely classify tumors into subgroups, often with clinical effect.
In a particular type of cell or tissue, only a small subset of an organism's genomic DNA will be expressed as mRNAs at any given time. The unique pattern of gene expression for a given cell or tissue is referred to as its molecular signature. For example, the genes expressed in skin cells would be very different from those expressed in blood cells. Microarray analysis can provide quantitative gene expression information, allowing for the generation of molecular signatures, each unique to a particular class of tumor. This idea was first shown experimentally in 2000 by researchers at Stanford University in a study published in Nature Genetics. [7] The authors measured the relative expression of 9,703 human cDNAs in sixty cancer cell lines previously studied and characterized by the National Cancer Institute's Developmental Therapeutics Program. A hierarchical clustering algorithm was used to group cell lines based on the similarity of their patterns of gene expression. In this study by Ross et al., the majority of cell lines with common organs of origin (based on information from the National Institutes of Health) clustered together at terminal branches, suggesting that cancer cells arising from the same tissue share many molecular characteristics. This allows for reliable identification of tumor type based on gene expression.
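The idea of grouping samples by the similarity of their expression patterns can be sketched with a small agglomerative clustering routine. This is a minimal illustration only: the choice of Pearson correlation distance with average linkage, the sample names, and the expression values are assumptions, and the sketch should not be read as the exact procedure used by Ross et al.

```python
from itertools import combinations
from statistics import mean


def pearson_distance(x, y):
    """Return 1 - Pearson correlation between two expression profiles."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return 1.0 - cov / (var_x * var_y) ** 0.5


def average_linkage_clusters(profiles, n_clusters):
    """Repeatedly merge the two closest clusters until n_clusters remain."""
    clusters = [[name] for name in profiles]  # every sample starts alone

    def cluster_distance(a, b):
        # Average pairwise distance between the members of two clusters.
        return mean(
            pearson_distance(profiles[i], profiles[j]) for i in a for j in b
        )

    while len(clusters) > n_clusters:
        a, b = min(
            combinations(clusters, 2),
            key=lambda pair: cluster_distance(*pair),
        )
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(a + b)
    return [sorted(c) for c in clusters]


# Made-up expression values for four hypothetical cell lines over five genes.
toy_profiles = {
    "colon_1": [2.1, 1.9, -1.0, -1.2, 0.1],
    "colon_2": [1.8, 2.2, -0.8, -1.1, 0.0],
    "leukemia_1": [-1.5, -1.0, 2.0, 1.7, 0.2],
    "leukemia_2": [-1.2, -1.4, 1.8, 2.1, -0.1],
}
groups = average_linkage_clusters(toy_profiles, n_clusters=2)
```

With these toy values the two colon lines end up in one cluster and the two leukemia lines in the other, mirroring the observation that cell lines sharing a tissue of origin tend to cluster together.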
A more powerful result of gene expression profiling is the ability to further classify tumors into subtypes that have distinct biological properties and different prognoses. For example, some diffuse large B-cell lymphomas (DLBCLs) are indistinguishable based on histological methods yet are clinically heterogeneous: 40% of patients respond well and exhibit prolonged survival while the remaining 60% do not. [8]
In 2000, Stanford researchers led by Ash Alizadeh published results in Nature [8] that used expression profiling techniques to stratify DLBCL into two subtypes: germinal center B-like DLBCL and activated B-like DLBCL. The authors developed custom microarrays termed "lymphochips" that were used to query expression of 17,856 genes preferentially expressed in lymphoid cells and those with roles in cancer or immunology for 96 lymphocyte samples. The hierarchical clustering algorithm identified a subset of tumors that would have been labeled DLBCLs by traditional histological methods; however, the expression profiles of these tumors were heterogeneous. When the tumors were re-clustered based on expression of germinal center B-cell genes, a second group of genes characteristic of activated B-cells emerged and was oppositely regulated compared to the first set of genes. Based on these expression patterns, the heterogeneous DLBCL cluster was subclassified into germinal center B-like DLBCL and activated B-like DLBCL. The distinction between these groups is significant in terms of overall patient survival: the probability of survival over 10 years for patients with germinal center B-like DLBCLs was about 80%, while that for patients with activated B-like DLBCLs fell to about 40% over a shorter eight-year period.
Breast cancers are also difficult to distinguish based on histological markers. In a 2000 study published in Nature, Stanford researchers led by C. M. Perou characterized gene expression patterns across 8,102 genes for 65 biopsies obtained from breast cancers. [9] The goal of the study was to identify patterns of gene expression that could be used to describe the phenotypic diversity of breast tumors by comparing the profiles of the biopsies to those of cultured cell lines and relating this information to clinical data. The tumors were clustered into two major groups that largely reflected the ER-positive and ER-negative clinical descriptions. The ER-positive tumors were characterized by high expression of genes normally expressed in breast luminal cells. The authors suggest that this higher-order distinction may encompass at least two biologically distinct types of cancer that may each require a unique course of treatment. Within the ER-negative group, additional clusters were identified based on expression of Erb-B2 and keratin 5- and 17-enriched basal epithelial-like genes. These groups reflect distinct molecular features as related to mammary epithelial biology, based on the outcome of disease.
In a 2001 study published in the Proceedings of the National Academy of Sciences, Sørlie et al. [10] further stratified the classifications described by Perou et al. [9] and explored the clinical value of these breast cancer subtypes. The authors separated the ER-positive tumors into two distinct groups and found that tumor classification based on gene expression was related to patient survival. The expression of 427 genes was measured for 78 cancers and seven non-malignant breast samples. Following hierarchical clustering, the samples formed two groups at the highest level of organization reflecting the ER-positive and ER-negative phenotypes; the ER-negative cluster further stratified into groups identical to those described by Perou et al. [9] In contrast to previous results, Sørlie et al. [10] found that the ER-positive group could also be separated into three distinct subgroups, termed luminal subtypes A, B, and C, based on patterns of luminal-specific gene expression with different outcomes. On performing survival analyses, the authors further found that tumors belonging to the various groups showed significantly different outcomes when treated uniformly. Survival analyses are often shown as Kaplan-Meier survival plots, an example of which is shown to the right.
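The Kaplan-Meier estimate that underlies such survival plots can be sketched in a few lines. At each time where an event is observed, the running survival probability is multiplied by the fraction of at-risk patients who did not experience the event; censored patients leave the at-risk pool without lowering the curve. The follow-up times and event flags below are made up for illustration and do not come from any cited study.

```python
def kaplan_meier(durations, events):
    """Return [(time, survival_probability)] for each time with an event.

    durations: follow-up time for each patient.
    events: True if the event (for example, death) was observed at that
            time, False if the patient was censored.
    """
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(set(durations)):
        observed = sum(1 for d, e in zip(durations, events) if d == t and e)
        leaving = sum(1 for d in durations if d == t)
        if observed:
            # Multiply by the fraction of at-risk patients surviving time t.
            survival *= 1.0 - observed / at_risk
            curve.append((t, survival))
        # Patients with an event or censored at t are no longer at risk.
        at_risk -= leaving
    return curve


# Hypothetical follow-up times (in years) and event indicators.
toy_durations = [1, 2, 2, 3, 4, 5]
toy_events = [True, True, False, True, False, True]
toy_curve = kaplan_meier(toy_durations, toy_events)
```

Computing one such curve per expression-defined subtype and plotting them together gives the kind of comparison described above; a formal comparison would additionally need a statistical test, which this sketch omits.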
In addition to identifying genes that correlate with survival, microarray analyses have been used to establish gene expression profiles associated with prognosis. It is generally agreed that patients with tumors exhibiting poor prognostic features would benefit the most from adjuvant therapy, as these treatments substantially improve overall survival for women with breast cancer. Traditional prognostic factors, however, are inexact, as mentioned above. Researchers at the Netherlands Cancer Institute were able to identify "good-prognosis" and "bad-prognosis" signatures, based on the expression of 70 genes, that were better able to predict the likelihood of metastasis development within five years for breast cancer patients. [11] [12] Metastasis involves the spread of cancer from one organ to others throughout the body and is the principal cause of death in cancer patients. While the study at the Netherlands Cancer Institute applied to breast cancer patients only, researchers at Massachusetts Institute of Technology identified a molecular signature of metastasis that applied to adenocarcinomas in general. [13]
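One simple way a prognosis signature could be applied to a new tumor is to correlate the tumor's expression values for the signature genes with a reference "good-prognosis" template and compare the result to a cutoff. The sketch below illustrates that idea only; the five-gene template, the expression values, and the threshold of 0.5 are hypothetical and are not the published 70-gene signature or its decision rule.

```python
from statistics import mean


def pearson(x, y):
    """Return the Pearson correlation between two equal-length profiles."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def classify_by_signature(profile, good_template, threshold):
    """Label a tumor by its correlation with a good-prognosis template."""
    if pearson(profile, good_template) >= threshold:
        return "good-prognosis"
    return "bad-prognosis"


# Hypothetical template and tumor profiles over five signature genes.
good_template = [1.0, 0.8, -0.5, -1.0, 0.3]
tumor_a = [0.9, 1.1, -0.4, -0.8, 0.2]
tumor_b = [-0.7, -0.9, 0.6, 1.2, 0.1]

label_a = classify_by_signature(tumor_a, good_template, threshold=0.5)
label_b = classify_by_signature(tumor_b, good_template, threshold=0.5)
```

In this toy case `tumor_a` tracks the template closely and is labeled "good-prognosis", while `tumor_b` is anti-correlated with it and is labeled "bad-prognosis". In practice the template and cutoff would have to be derived from, and validated on, patient cohorts with known outcomes.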
Anaplastic large-cell lymphoma (ALCL) refers to a group of non-Hodgkin lymphomas in which aberrant T cells proliferate uncontrollably. Considered as a single entity, ALCL is the most common type of peripheral lymphoma and represents ~10% of all peripheral lymphomas in children. The incidence of ALCL is estimated to be 0.25 cases per 100,000 people in the United States of America. There are four distinct types of anaplastic large-cell lymphomas that on microscopic examination share certain key histopathological features and tumor marker proteins. However, the four types have very different clinical presentations, gene abnormalities, prognoses, and/or treatments.
Invasive carcinoma of no special type, invasive breast carcinoma of no special type (IBC-NST), invasive ductal carcinoma (IDC), infiltrating ductal carcinoma (IDC) or invasive ductal carcinoma, not otherwise specified (NOS) is a disease. For international audiences this article will use "invasive carcinoma NST" because it is the preferred term of the World Health Organization (WHO).
MammaPrint is a prognostic and predictive diagnostic test for early-stage breast cancer patients that assesses the risk that a tumor will metastasize to other parts of the body. It gives a binary result, a high-risk or low-risk classification, and helps physicians determine whether or not a patient will benefit from chemotherapy. Women with a low-risk result can safely forgo chemotherapy without decreasing the likelihood of disease-free survival. MammaPrint is part of the personalized medicine portfolio marketed by Agendia.
Diffuse large B-cell lymphoma (DLBCL) is a cancer of B cells, a type of lymphocyte that is responsible for producing antibodies. It is the most common form of non-Hodgkin lymphoma among adults, with an annual incidence of 7–8 cases per 100,000 people in the US and UK. This cancer occurs primarily in older individuals, with a median age at diagnosis of ~70 years, although it can occur in young adults and, in rare cases, children. DLBCL can arise in virtually any part of the body and, depending on various factors, is often a very aggressive malignancy. The first sign of this illness is typically the observation of a rapidly growing mass or tissue infiltration that is sometimes associated with systemic B symptoms, e.g. fever, weight loss, and night sweats.
Triple-negative breast cancer (TNBC) is any breast cancer that either lacks or shows low levels of estrogen receptor (ER), progesterone receptor (PR) and human epidermal growth factor receptor 2 (HER2) overexpression and/or gene amplification. Triple-negative is sometimes used as a surrogate term for basal-like.
The basal-like carcinoma is a recently proposed subtype of breast cancer defined by its gene expression and protein expression profile.
Synthetic lethality is defined as a type of genetic interaction where the combination of two genetic events results in cell death or death of an organism. Although the foregoing explanation is wider than this, it is common when referring to synthetic lethality to mean the situation arising by virtue of a combination of deficiencies of two or more genes leading to cell death, whereas a deficiency of only one of these genes does not. In a synthetic lethal genetic screen, it is necessary to begin with a mutation that does not result in cell death, although the effect of that mutation could result in a differing phenotype, and then systematically test other mutations at additional loci to determine which, in combination with the first mutation, causes cell death arising by way of deficiency or abolition of expression.
Breast cancer classification divides breast cancer into categories according to different schemes, each based on different criteria and serving a different purpose. The major categories are the histopathological type, the grade of the tumor, the stage of the tumor, and the expression of proteins and genes. As knowledge of cancer cell biology develops, these classifications are updated.
Panobinostat, sold under the brand name Farydak, is a medication used for the treatment of multiple myeloma. It is a hydroxamic acid and acts as a non-selective histone deacetylase inhibitor.
Nodular lymphocyte predominant Hodgkin lymphoma (NLPHL) is a slow-growing CD20 positive form of Hodgkin lymphoma, a cancer of the immune system's B cells.
Virtual karyotype is the digital information reflecting a karyotype, resulting from the analysis of short sequences of DNA from specific loci all over the genome, which are isolated and enumerated. It detects genomic copy number variations at a higher resolution than conventional karyotyping or chromosome-based comparative genomic hybridization (CGH). The main methods used for creating virtual karyotypes are array-comparative genomic hybridization and SNP arrays.
A gene signature or gene expression signature is a single or combined group of genes in a cell with a uniquely characteristic pattern of gene expression that occurs as a result of an altered or unaltered biological process or pathogenic medical condition. This is not to be confused with the concept of gene expression profiling. Activating pathways in a regular physiological process or a physiological response to a stimulus results in a cascade of signal transduction and interactions that elicit altered levels of gene expression, which is classified as the gene signature of that physiological process or response. The clinical applications of gene signatures break down into prognostic, diagnostic and predictive signatures. The phenotypes that may theoretically be defined by a gene expression signature range from those that predict the survival or prognosis of an individual with a disease, through those that are used to differentiate between different subtypes of a disease, to those that predict activation of a particular pathway. Ideally, gene signatures can be used to select a group of patients for whom a particular treatment will be effective.
Embryonal rhabdomyosarcoma (EMRS) is a rare histological form of cancer in the connective tissue wherein the mesenchymally-derived malignant cells resemble the primitive developing skeletal muscle of the embryo. It is the most common soft tissue sarcoma occurring in children. Embryonal rhabdomyosarcoma is also known as PAX-fusion negative or fusion-negative rhabdomyosarcoma, as tumors of this subtype are unified by their lack of a PAX3-FOXO1 fusion oncogene. Fusion status refers to the presence or absence of a fusion gene, which is a gene formed from joining two different genes together through DNA rearrangements. These types of tumors are classified as embryonal rhabdomyosarcoma "because of their remarkable resemblance to developing embryonic and fetal skeletal muscle."
Gene expression profiling has revealed that diffuse large B-cell lymphoma (DLBCL) is composed of at least 3 different sub-groups, each having distinct oncogenic mechanisms that respond to therapies in different ways. Germinal Center B-Cell like (GCB) DLBCLs appear to arise from normal germinal center B cells, while Activated B-cell like (ABC) DLBCLs are thought to arise from postgerminal center B cells that are arrested during plasmacytic differentiation. The differences in gene expression between GCB DLBCL and ABC DLBCL are as vast as the differences between distinct types of leukemia, but these conditions have historically been grouped together and treated as the same disease.
A cancer biomarker refers to a substance or process that is indicative of the presence of cancer in the body. A biomarker may be a molecule secreted by a tumor or a specific response of the body to the presence of cancer. Genetic, epigenetic, proteomic, glycomic, and imaging biomarkers can be used for cancer diagnosis, prognosis, and epidemiology. Ideally, such biomarkers can be assayed in non-invasively collected biofluids like blood or serum.
Prognostic markers are biomarkers used to measure the progress of a disease in the patient sample. They are useful for stratifying patients into groups, guiding precision medicine discovery. The widely used prognostic markers in cancers include stage, size, grade, node and metastasis. In addition to these common markers, there are prognostic markers specific to different cancer types. For example, estrogen level, progesterone and HER2 are markers specific to breast cancer patients. There is evidence showing that genes behaving as tumor suppressors or carcinogens could act as prognostic markers due to altered gene expression or mutation. Besides genetic biomarkers, there are also biomarkers detected in plasma or body fluid, which can be metabolic or protein biomarkers.
Epstein–Barr virus–associated lymphoproliferative diseases are a group of disorders in which one or more types of lymphoid cells, i.e. B cells, T cells, NK cells, and histiocytic-dendritic cells, are infected with the Epstein–Barr virus (EBV). This causes the infected cells to divide excessively, and is associated with the development of various non-cancerous, pre-cancerous, and cancerous lymphoproliferative disorders (LPDs). These LPDs include the well-known disorder occurring during the initial infection with the EBV, infectious mononucleosis, and the large number of subsequent disorders that may occur thereafter. The virus is usually involved in the development and/or progression of these LPDs although in some cases it may be an "innocent" bystander, i.e. present in, but not contributing to, the disease.
T-cell/histiocyte-rich large B-cell lymphoma (THRLBCL) is a malignancy of B cells. B-cells are lymphocytes that normally function in the humoral immunity component of the adaptive immune system by secreting antibodies that, for example, bind to and neutralize invasive pathogens. Among the various forms of B-cell lymphomas, THRLBCL is a rarely occurring subtype of the diffuse large B-cell lymphomas (DLBCL). DLBCL are a large group of lymphomas that account for ~25% of all non-Hodgkin lymphomas worldwide. THRLBCL is distinguished from the other DLBCL subtypes by the predominance of non-malignant T-cell lymphocytes and histiocytes over malignant B-cells in its tumors and tissue infiltrates.
Primary testicular diffuse large B-cell lymphoma (PT-DLBCL), also termed testicular diffuse large B-cell lymphoma and diffuse large B-cell lymphoma of the testes, is a variant of the diffuse large B-cell lymphomas (DLBCL). DLBCL are a large and diverse group of B-cell malignancies with the great majority (~85%) being typed as diffuse large B-cell lymphoma, not otherwise specified (DLBCL, NOS). PT-DLBCL is a variant of DLBCL, NOS that involves one or, in uncommon cases, both testicles. Other variants and subtypes of DLBCL may involve the testes by spreading to them from their primary sites of origin in other tissues. PT-DLBCL differs from these other DLBCL in that it begins in the testes and then may spread to other sites.
Diffuse large B-cell lymphoma associated with chronic inflammation (DLBCL-CI) is a subtype of the diffuse large B-cell lymphomas and a rare form of the Epstein–Barr virus–associated lymphoproliferative diseases, i.e. conditions in which lymphocytes infected with the Epstein–Barr virus (EBV) proliferate excessively in one or more tissues. EBV infects ~95% of the world's population to cause no symptoms, minor non-specific symptoms, or infectious mononucleosis. The virus then enters a latency phase in which the infected individual becomes a lifetime asymptomatic carrier of the virus. Some weeks, months, years, or decades thereafter, a very small fraction of these carriers, particularly those with an immunodeficiency, develop any one of various EBV-associated benign or malignant diseases.