Single-cell variability

In cell biology, single-cell variability occurs when individual cells in an otherwise similar population differ in shape, size, position in the cell cycle, or molecular-level characteristics. Such differences can be detected using modern single-cell analysis techniques. [1] Investigation of variability within a population of cells contributes to the understanding of developmental and pathological processes.

An example of single-cell analysis. Here, imaging software is used to track individual cells as they migrate over time.

Single-cell analysis

A sample of cells may appear similar, but the cells can vary in their individual characteristics, such as shape and size, mRNA expression levels, genome, or individual counts of metabolites. In the past, the only methods available for investigating such properties required a population of cells and provided an estimate of the characteristic of interest, averaged over the population, which could obscure important differences among the cells. Single-cell analysis allows scientists to study the properties of a single cell of interest with high accuracy, revealing differences among individual cells within a population and offering new insights in molecular biology. These individual differences are important in fields such as developmental biology, where individual cells can take on different "fates" (becoming specialized cells such as neurons or organ tissue) during the growth of an embryo; in cancer research, where individual malignant cells can vary in their response to therapy; or in infectious disease, where only a subset of cells in a population become infected by a pathogen.

Population-level views of cells can offer a distorted view of the data by averaging out the properties of distinct subsets of cells. [3] For example, if half the cells of a particular group are expressing high levels of a given gene, and the rest are expressing low levels, results from a population-wide analysis may appear as if all cells are expressing a medium level of the given gene. Thus, single-cell analysis allows researchers to study biological processes in finer detail and answer questions that could not have been addressed otherwise.
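The averaging effect described above can be illustrated with a small numerical sketch. The expression values, the population size, and the threshold below are invented for illustration and do not come from any measurement.

```python
from statistics import mean

# Hypothetical expression levels (arbitrary units): half of the cells express
# the gene at a high level and the other half at a low level.
high_cells = [100.0] * 50
low_cells = [2.0] * 50
population = high_cells + low_cells

# A population-wide (bulk) measurement reports roughly the average.
bulk_estimate = mean(population)

# Single-cell measurements keep the two subsets distinguishable.
HIGH_THRESHOLD = 50.0  # assumed cut-off, chosen only for this illustration
n_high = sum(1 for x in population if x >= HIGH_THRESHOLD)
n_low = len(population) - n_high

# Count the cells whose own level is actually close to the bulk estimate.
n_near_average = sum(1 for x in population if abs(x - bulk_estimate) < 10.0)

print(f"bulk estimate: {bulk_estimate}")
print(f"high cells: {n_high}, low cells: {n_low}")
print(f"cells near the bulk estimate: {n_near_average}")
```

In this toy population the bulk estimate falls at a medium level, although no individual cell expresses the gene at that level.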

Types of variation

Variation in gene expression

Cells with identical genomes may vary in the expression of their genes due to differences in their specialized function in the body, their timepoint in the cell cycle, their environment, and also noise and stochastic factors. Thus, accurate measurement of gene expression in individual cells allows researchers to better understand these critical aspects of cellular biology. For example, early study of gene expression in individual cells in fruit fly embryos allowed scientists to discover regular patterns or gradients of specific gene transcription during different stages of growth, allowing for a more detailed understanding of development at the level of location and time. Another phenomenon in gene expression which could only be identified at the single-cell level is oscillatory gene expression, in which a gene is switched on and off periodically.
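One way to see why oscillatory expression can be hidden in population measurements is sketched below. It assumes, purely for illustration, that every cell oscillates with the same period but that the cells are out of phase with one another; the period, baseline, amplitude, and cell count are hypothetical values.

```python
import math

N_CELLS = 8       # hypothetical number of cells
PERIOD = 24.0     # hypothetical oscillation period (arbitrary time units)
BASELINE = 10.0   # hypothetical mean expression level
AMPLITUDE = 5.0   # hypothetical oscillation amplitude


def expression(t, phase):
    """Expression level of one cell at time t under a sinusoidal toy model."""
    return BASELINE + AMPLITUDE * math.sin(2 * math.pi * t / PERIOD + phase)


# Assumption: the cells are desynchronized, with evenly spread phases.
phases = [2 * math.pi * i / N_CELLS for i in range(N_CELLS)]
times = range(0, 24, 3)

# Trace of a single cell versus the population average at each time point.
single_cell = [expression(t, phases[0]) for t in times]
population_mean = [
    sum(expression(t, p) for p in phases) / N_CELLS for t in times
]

single_cell_range = max(single_cell) - min(single_cell)
population_range = max(population_mean) - min(population_mean)

print(f"single-cell swing: {single_cell_range:.3f}")
print(f"population-average swing: {population_range:.3f}")
```

Under these assumptions the single-cell trace swings across the full amplitude while the population average stays essentially flat, so the oscillation is only visible cell by cell.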

Single-cell gene expression is typically assayed using RNA-seq. After the cell has been isolated, the RNA-seq protocol typically consists of three steps: [4] the RNA is reverse transcribed into cDNA, the cDNA is amplified to make more material available for the sequencer, and the cDNA is sequenced.

Variation in DNA sequence

The members of a population of single-celled organisms such as bacteria typically vary slightly in their DNA sequence due to mutations acquired during reproduction. Within a single human, individual cells typically have identical genomes, though there are interesting exceptions, such as B-cells, which have variation in their DNA enabling them to generate different antibodies to bind to the variety of pathogens that can attack the body. Measuring the differences and the rate of change in DNA content at the single-cell level can help scientists better understand how pathogens develop antibiotic resistance, why the immune system often cannot produce antibodies for rapidly mutating viruses like HIV, and other important phenomena.

Many technologies exist for sequencing genomes, but they are designed to use DNA from a population of cells rather than a single cell. The primary challenge for single-cell genome sequencing is to make multiple copies of (amplify) the DNA so that there is enough material available for the sequencer, a process called whole genome amplification (WGA). Typical methods for WGA consist of: [5] (1) Multiple Displacement Amplification (MDA) in which multiple primers anneal to the DNA, polymerases copy the DNA, and knock off other polymerases, freeing strands that can be processed by the sequencer, (2) PCR-based methods, or (3) some combination of both.

Variation in metabolomic properties

Cells vary in the metabolites they contain, which are the intermediary compounds and end products of complex biochemical reactions that sustain the cell. Genetically identical cells in different conditions and environments can use different metabolic pathways to sustain themselves. By measuring the metabolites present, scientists can infer the metabolic pathways used and, from them, useful information about the state of the cell. An example of this is found in the immune system, where CD4+ cells can differentiate into Th17 or TReg cells (among other possibilities), which direct the immune system's response in different ways. Th17 cells stimulate a strong inflammatory response, whereas TReg cells have the opposite, suppressive effect. The former tend to rely much more on glycolysis, [6] due to their increased energy demands.

To profile the metabolic content of a cell, researchers must identify the cell of interest in the larger population, isolate it for analysis, quickly inhibit enzymes and halt the metabolic processes in the cell, and then analyze the contents of the cell using techniques such as NMR, mass spectrometry, and microfluidics. [7]

Variation in proteome

Similar to variation in the metabolome, the proteins present in a cell and their abundances can vary from cell to cell in an otherwise similar population. While transcription and translation determine the amount and variety of proteins produced, these processes are imprecise, and cells have a number of mechanisms which can change or degrade proteins, allowing for variance in the proteome that may not be accounted for by variance in gene expression. Also, proteins have many other important features besides simply being present or absent, such as whether they have undergone posttranslational modifications such as phosphorylation, or are bound to molecules of interest. The variation in abundance and characteristics of proteins has implications for fields such as cancer research and cancer therapy, where a drug targeting a particular protein may vary in its impact due to variability in the proteome, [8] or vary in efficacy due to the broader biological phenomenon of tumor heterogeneity.

Cytometry, surface methods, and microfluidics technologies are the three classes of tools commonly used to profile the proteomes of individual cells. [9] Cytometry allows researchers to isolate cells of interest, and stain 15–30 proteins to measure their location and/or relative abundance. [9] Image cycling techniques have been developed to measure multi-target abundance and distribution in biopsy samples and tissues. In these methods, 3–4 targets are stained with fluorescently labeled antibodies, imaged, and then stripped of their fluorophores by a variety of means, including oxidation-based chemistries [10] or more recently antibody-DNA conjugation methods, [11] allowing additional targets to be stained in follow-on cycles; in some methods up to 60 individual targets have been visualized. [12] For surface methods, researchers place a single cell on a surface coated with antibodies, which then bind to proteins secreted by the cell and allow them to be measured. [9] Microfluidics methods for proteome analysis immobilize single cells on a microchip and use staining to measure the proteins of interest, or antibodies to bind to the proteins of interest.

Variation in cell size and morphology

Cells in an otherwise similar population can vary in their size and morphology due to differences in function, changes in metabolism, being in different phases of the cell cycle, or other factors. For example, stem cells can divide asymmetrically, [13] which means the two resultant daughter cells may have different fates (specialized functions), and can differ from each other in size or shape. Researchers who study development may be interested in tracking the physical characteristics of the individual progeny in a growing population in order to understand how stem cells differentiate into a complex tissue or organism over time.

Microscopy can be used to analyze cell size and morphology by obtaining high-quality images over time. These pictures will typically contain a population of cells, but algorithms can be applied to identify and track individual cells across multiple images. The algorithms must be able to process gigabytes of data to remove noise and summarize the relevant characteristics for the given research question. [14]
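As a minimal sketch of the tracking step, the example below links cell centroids between two consecutive frames by greedy nearest-neighbour matching. This is only one simple approach and is not taken from the cited work; the function name `link_cells`, the `max_distance` parameter, and the coordinates are hypothetical, and real pipelines must also handle segmentation, noise, and cell division.

```python
import math


def link_cells(prev_centroids, next_centroids, max_distance):
    """Greedily link centroids in one frame to centroids in the next frame.

    Returns a dict mapping an index in prev_centroids to an index in
    next_centroids. Cells with no partner within max_distance stay unlinked.
    """
    # Collect every candidate pair that is close enough to be the same cell.
    candidates = []
    for i, (x0, y0) in enumerate(prev_centroids):
        for j, (x1, y1) in enumerate(next_centroids):
            distance = math.hypot(x1 - x0, y1 - y0)
            if distance <= max_distance:
                candidates.append((distance, i, j))

    # Accept the closest pairs first, using each cell at most once.
    candidates.sort()
    links = {}
    used_next = set()
    for _, i, j in candidates:
        if i not in links and j not in used_next:
            links[i] = j
            used_next.add(j)
    return links


# Hypothetical centroids (in pixels) detected in two consecutive images.
frame_a = [(10.0, 10.0), (50.0, 40.0), (80.0, 15.0)]
frame_b = [(52.0, 43.0), (12.0, 11.0), (200.0, 200.0)]

links = link_cells(frame_a, frame_b, max_distance=10.0)
print(links)
```

Here the first two cells are matched to their slightly displaced positions in the second frame, while the third cell has no nearby partner and is left unlinked, as would happen if it moved out of view.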

Variation in cell cycle

Individual cells in a population will often be at different points in the cell cycle. Scientists who wish to understand characteristics of the cell at a particular point in the cycle would have difficulty using population-level estimates, since they would average measurements from cells at different stages. Understanding the cell cycle in individual diseased cells, like those in a tumor, is also important, since they will often have a very different cycle than healthy cells. Single-cell analysis of characteristics of the cell cycle allows scientists to understand these properties in greater detail.

Variability in cell cycle can be studied using several of the methods previously described. For example, cells in G2 will be quite large in size (as they are just at the point where they are about to divide in two), and can be identified using protocols for cell size and shape. Cells in S phase copy their genomes, and could be identified using protocols for staining DNA and measuring its content by flow cytometry or quantitative fluorescence microscopy, or by using probes for genes expressed highly at specific phases of the cell cycle.
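The DNA-content approach can be sketched as a simple gating rule: a cell before replication carries one complement of DNA (2N), a cell that has finished replication carries twice that amount (4N), and a cell in S phase lies in between. The function name `assign_phase`, the tolerance, and the intensity values below are assumptions made for illustration; DNA content alone also cannot separate G2 from mitosis, so those cells are labelled "G2/M".

```python
from collections import Counter


def assign_phase(dna_content, g1_peak, tolerance=0.15):
    """Assign a coarse cell-cycle phase from a per-cell DNA stain intensity.

    g1_peak is the intensity of the 2N (G1) peak; the 4N peak is assumed to
    lie at twice that value. The tolerance is an illustrative choice.
    """
    ratio = dna_content / g1_peak
    if ratio <= 1.0 + tolerance:
        return "G1"
    if ratio >= 2.0 - tolerance:
        return "G2/M"
    return "S"


# Hypothetical per-cell DNA stain intensities (arbitrary units).
intensities = [100, 104, 150, 198, 96, 130, 205]
phases = [assign_phase(value, g1_peak=100) for value in intensities]
phase_counts = Counter(phases)

print(phases)
print(dict(phase_counts))
```

In practice the position of the G1 peak and the gate boundaries would be estimated from the measured intensity distribution rather than fixed in advance.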

References

  1. Habibi, Iman; Cheong, Raymond; Lipniacki, Tomasz; Levchenko, Andre; Emamian, Effat S.; Abdi, Ali (5 April 2017). "Computation and measurement of cell decision making errors using single cell data". PLOS Computational Biology. 13 (4): e1005436. Bibcode:2017PLSCB..13E5436H. doi:10.1371/journal.pcbi.1005436. ISSN 1553-7358. PMC 5397092. PMID 28379950.
  2. Giedt, Randy J.; Koch, Peter D.; Weissleder, Ralph; Muñoz-Barrutia, Arrate (10 April 2013). "Single Cell Analysis of Drug Distribution by Intravital Imaging". PLOS ONE. 8 (4): e60988. Bibcode:2013PLoSO...860988G. doi:10.1371/journal.pone.0060988. PMC 3622689. PMID 23593370.
  3. Sandberg, Rickard (30 December 2013). "Entering the era of single-cell transcriptomics in biology and medicine". Nature Methods. 11 (1): 22–24. doi:10.1038/nmeth.2764. PMID 24524133. S2CID 27632439.
  4. Saliba, A.-E.; Westermann, A. J.; Gorski, S. A.; Vogel, J. (22 July 2014). "Single-cell RNA-seq: advances and future challenges". Nucleic Acids Research. 42 (14): 8845–8860. doi:10.1093/nar/gku555. PMC 4132710. PMID 25053837.
  5. Macaulay, Iain C.; Voet, Thierry; Maizels, Nancy (30 January 2014). "Single Cell Genomics: Advances and Future Perspectives". PLOS Genetics. 10 (1): e1004126. doi:10.1371/journal.pgen.1004126. PMC 3907301. PMID 24497842.
  6. Barbi, Joseph; Pardoll, Drew; Pan, Fan (March 2013). "Metabolic control of the Treg/Th17 axis". Immunological Reviews. 252 (1): 52–77. doi:10.1111/imr.12029. PMC 3576873. PMID 23405895.
  7. Rubakhin, Stanislav S; Romanova, Elena V; Nemes, Peter; Sweedler, Jonathan V (30 March 2011). "Profiling metabolites and peptides in single cells". Nature Methods. 8 (4s): S20–S29. doi:10.1038/nmeth.1549. PMC 3312877. PMID 21451513.
  8. Cohen, A. A.; Geva-Zatorsky, N.; Eden, E.; Frenkel-Morgenstern, M.; Issaeva, I.; Sigal, A.; Milo, R.; Cohen-Saidon, C.; Liron, Y.; Kam, Z.; Cohen, L.; Danon, T.; Perzov, N.; Alon, U. (5 December 2008). "Dynamic Proteomics of Individual Cancer Cells in Response to a Drug". Science. 322 (5907): 1511–1516. Bibcode:2008Sci...322.1511C. doi:10.1126/science.1160165. PMID 19023046. S2CID 9553016.
  9. Wei, Wei; Shin, Young; Ma, Chao; Wang, Jun; Elitas, Meltem; Fan, Rong; Heath, James R (2013). "Microchip platforms for multiplex single-cell functional proteomics with applications to immunology and cancer research". Genome Medicine. 5 (8): 75. doi:10.1186/gm479. PMC 3978720. PMID 23998271.
  10. Sorger, Peter K.; Fallahi-Sichani, Mohammad; Lin, Jia-Ren (24 September 2015). "Highly multiplexed imaging of single cells using a high-throughput cyclic immunofluorescence method". Nature Communications. 6: 8390. Bibcode:2015NatCo...6.8390L. doi:10.1038/ncomms9390. ISSN 2041-1723. PMC 4587398. PMID 26399630.
  11. Weissleder, Ralph; Juric, Dejan; Castillo, Andres Fernandez del; McFarland, Philip J.; Carlson, Jonathan C. T.; Pathania, Divya; Giedt, Randy J. (31 October 2018). "Single-cell barcode analysis provides a rapid readout of cellular signaling pathways in clinical specimens". Nature Communications. 9 (1): 4550. Bibcode:2018NatCo...9.4550G. doi:10.1038/s41467-018-07002-6. ISSN 2041-1723. PMC 6208406. PMID 30382095.
  12. Lin, Jia-Ren; Izar, Benjamin; Wang, Shu; Yapp, Clarence; Mei, Shaolin; Shah, Parin M; Santagata, Sandro; Sorger, Peter K (11 July 2018). Chakraborty, Arup K; Raj, Arjun; Marr, Carsten; Horváth, Péter (eds.). "Highly multiplexed immunofluorescence imaging of human tissues and tumors using t-CyCIF and conventional optical microscopes". eLife. 7: e31657. doi:10.7554/eLife.31657. ISSN 2050-084X. PMC 6075866. PMID 29993362.
  13. Knoblich, Juergen A. (February 2008). "Mechanisms of Asymmetric Stem Cell Division". Cell. 132 (4): 583–597. doi:10.1016/j.cell.2008.02.007. PMID 18295577.
  14. Cohen, A. R. (3 November 2014). "Extracting meaning from biological imaging data". Molecular Biology of the Cell. 25 (22): 3470–3473. doi:10.1091/mbc.E14-04-0946. PMC 4230605. PMID 25368423.