EPIC-Seq

Fig 1. This flowchart delineates the workflow for EPIC-Seq, a comprehensive method for analyzing cell-free DNA (cfDNA). The initial step involves extracting cfDNA from patients' plasma through liquid biopsy. Subsequently, the process advances to capturing reads within the Transcription Start Site (TSS) region of targeted genes utilizing specialized probes, ensuring precise genomic coverage. Once the cfDNA within the TSS region of targeted genes is captured, the next phase involves computing Fragmentomic Features such as promoter fragmentation entropy (PFE) and the Depth at nucleosome-depleted regions (NDR), providing insights into the structural characteristics of the DNA fragments. Finally, leveraging the NDR and PFE of all targets as features, a machine learning model is constructed to infer gene expression profiles in patients, facilitating the interpretation of molecular signatures and aiding in diagnostic and prognostic assessments.

EPIC-seq (short for Epigenetic Expression Inference from Cell-free DNA Sequencing) is a high-throughput method that specifically targets gene promoters using cell-free DNA (cfDNA) sequencing. Using non-invasive sampling such as a blood draw, it infers the expression levels of the targeted genes. It consists of both wet-lab and dry-lab stages. [1]


EPIC-seq involves deep sequencing of transcription start sites (TSSs). Its underlying hypothesis is that deep sequencing of these TSSs, combined with fragmentomic features (chromatin fragmentation patterns or properties), allows higher-resolution analyses than alternative approaches. [1]

The method has been shown to be effective for gene-level expression inference, molecular subtyping of diffuse large B cell lymphoma (DLBCL), histological classification of non-small-cell lung cancer (NSCLC), evaluation of the results of immunotherapy agents, and assessment of the prognostic importance of genes. EPIC-seq uses machine learning to deduce the RNA expression of the genes and proposes two new metrics: promoter fragmentation entropy (PFE), an adjusted Shannon Index for entropy, [2] and nucleosome-depleted region (NDR) score, the depth of sequencing in nucleosome-depleted regions. PFE showed superior performance compared to earlier metrics for fragmentomic features. [1]

Additionally, EPIC-seq has been mentioned as a possible solution for detecting tissue damage and esophageal cancer using methylation profiles of cfDNAs, [3] [4] profiling of donor liver molecular networks, [5] and inflammatory bowel disease (IBD) detection. [6]

Background

Historical Usage of cfDNA and fragmentomic features

cfDNA consists of DNA molecules in blood plasma that are released during cell death and fragmented along with chromatin. In previous research it has been used for detecting transplant tissue rejection, prenatal testing for fetal aneuploidy, tumour profiling, and early cancer detection. [7] Nevertheless, prevalent liquid biopsy methods for cfDNA profiling depend on detecting germline or somatic genetic variations, which may be absent even in patients with a high disease burden and in cancers with high tumour mutation rates. [1]

Fragmentomic features of cfDNA samples have been shown to offer another way to approach these problems. [8] [9] They can indicate the tissue of origin of cfDNA molecules, which can help segregate tumour-related somatic mutations. [10] [11] However, current methods that use fragmentomic features, such as shallow whole genome sequencing (WGS) of cfDNA, do not fully cover the contributions of all tissues, and their sequencing depth and breadth are too low to infer fine-grained properties, for example at the gene level. Hence, these methods require patients to have a high tumour burden. [1]

Circulating Tumor DNA profiling

Circulating tumour DNA (ctDNA) molecules are tumour-derived cell-free DNA (cfDNA) circulating in the bloodstream and are not associated with cells. ctDNA primarily arises from chromatin fragmentation accompanying tumour cell death [12] and can be extracted by liquid biopsy. [13] ctDNA analysis has been implemented for noninvasive identification of tumour genetic characteristics and early recognition of various cancer forms. [14] [15] [16] The majority of current ctDNA analysis depends on genetic differences in germline or somatic cells to diagnose diseases and detect tumour cells at an early stage. [11] [2] [12] While looking at genetic variations of ctDNA can be beneficial, not all ctDNA molecules contain genetic mutations. EPIC-seq utilizes epigenetic features of ctDNA to identify the tissue of origin of these unmutated molecules, [1] which is helpful for cancer classification.

Fragmentomic Features for Tissue-of-origin classification

The majority of circulating cfDNA molecules are fragments linked to nucleosomes, so they represent unique chromatin arrangements found in the nuclear genomes of the cells they originate from. [14] [15] [16] In particular, open chromatin areas are more exposed to endonuclease cleavage, whereas genomic regions linked to nucleosomal complexes are often shielded from endonuclease activity. [17] Several studies have identified specific chromatin fragmentomic characteristics that aid in informing tissue origins through cfDNA profiling. These features include:

  1. Reduced sequencing coverage depth [18] [19] [20]
  2. Disruption of nucleosome positioning near transcription start sites (TSSs) [17]
  3. Length of cfDNA fragments [21] [22] [23]

Principles of EPIC-seq

Fig 2. Diagram explaining Promoter Fragmentation Entropy (PFE) and Nucleosome Depleted Region (NDR)

Currently, the majority of circulating tumour DNA (ctDNA) fragmentomic techniques lack the ability to achieve gene-level resolution and are effective mainly in inferring expression at elevated ctDNA levels. Consequently, they are primarily applicable to patients with notably advanced tumour burdens typically seen in late-stage cancer. [1]

To address this limitation, EPIC-seq employs hybrid capture-based targeted deep sequencing of regions flanking transcription start sites (TSS) in cfDNA. This approach allows for the acquisition of ctDNA fragmentation features crucial for predicting gene expression, such as Promoter Fragmentation Entropy (PFE) and Nucleosome Depleted Region (NDR). [1] These key fragmentomic features can capture gene-level associations with expression levels throughout the genome, enabling the construction of a predictive model for transcriptional output. This allows for high-resolution monitoring of cfDNA fragmentation and gene-level analysis. [1]

Promoter Fragmentation entropy

EPIC-seq hypothesizes that cfDNA fragments originating from active promoters, which are less shielded by nucleosomes and thus more susceptible to endonuclease cleavage, will display more erratic cleavage patterns compared to fragments from inactive promoters, which are better protected by nucleosomes. PFE is a variation of the Shannon Index, which is a quantitative measure for estimating diversity. In the context of EPIC-seq, PFE calculates the diversity of cfDNA fragment lengths where both ends of the fragment are situated within the 2 kb flanking region of each gene's TSS. The higher the PFE of a gene's TSS, the more likely the gene is highly expressed. [1]

Nucleosome Depleted region

Actively expressed genes have open chromatin at their TSS region, so they are less shielded by nucleosomes and, therefore, more susceptible to endonuclease cleavage. Consequently, the depth of cfDNA originating from the TSS of active genes tends to be shallower compared to that of inactive genes. NDR quantifies the normalized depth within each 2-kilobase window surrounding each TSS. The lower the NDR at a gene's TSS, the more likely the gene is highly expressed. [1]

Methods

Wet Lab workflow

1. Collection and Processing of plasma

Peripheral blood samples were obtained and processed to isolate plasma following standard protocols. After centrifugation, plasma specimens were preserved at −80 °C until cfDNA extraction. The extraction of cfDNA from plasma volumes ranging from 2 to 16 ml was carried out using established laboratory procedures. Following isolation, the concentration of cfDNA was determined using fluorescence-based quantification methods. [1]

2. Sequencing Library preparation

A typical amount of 32 ng of cfDNA was utilized for library preparation. DNA input was adjusted to mitigate the effects of high molecular-weight DNA contamination. The library preparation process encompassed end repair, A-tailing, and adapter ligation, which also incorporated molecular barcodes into each read. These procedures were conducted according to standardized ligation-based library preparation protocols, with overnight ligation performed at 4 °C. Following this, shotgun cfDNA libraries underwent hybrid capture targeting specific genomic regions, as detailed below. [1]

3. Custom Capture Panels sequencing

Custom capture panels tailored to specific cancer types or personalized selectors were utilized in EPIC-seq. The capture panels targeted transcription start site regions of genes of interest. Enrichment for EPIC-seq was performed following established laboratory protocols. Subsequently, hybridization captures were pooled, and the pooled samples underwent sequencing using short read sequencing. [1]

Dry Lab workflow

EPIC-seq includes computational steps that follow the wet-lab portion. The steps below summarize those described by the developers in the original paper. [1]

4. Demultiplexing and Error correction

If multiplexed paired-end sequencing is used, demultiplexing is needed to sort the reads of different samples into different files. After demultiplexing, error correction and elimination of read pairs based on matching the unique identifiers and barcodes of the pairs can be done. The developers adapted the demultiplexing and error correction steps from the CAPP-seq [24] demultiplexing pipeline. [1]

5. Outer Sequence Removal and trimming

To preserve shorter fragment reads, barcode removal and adapter trimming need to be done. After read preprocessing, the reads should be aligned to the human genome reference. The original EPIC-seq used hg19, [25] but an updated version of the human genome reference can be used for better results. Aligner options should be chosen carefully, since some aligners can interfere with the inclusion of shorter reads paired with longer ones. [1] For deduplication, the attached custom molecular barcodes should be exploited. These barcodes include endogenous and exogenous unique molecular identifiers (UMIs) and are useful for distinguishing Polymerase Chain Reaction (PCR) duplicates from real duplicates, and hence for removing PCR duplicates. This step is especially important for oncologic applications, since the low mutation abundance can be suppressed by PCR duplicates. [1]
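The deduplication idea can be illustrated with a minimal Python sketch. It assumes, hypothetically, that each aligned fragment is already summarized as a chromosome, start, end, and UMI string; the `Fragment` tuple and `deduplicate` function are illustrative names and not part of the EPIC-seq pipeline. Fragments that share both their alignment coordinates (the endogenous identifier) and their UMI (the exogenous identifier) are treated as PCR duplicates, while fragments with the same coordinates but different UMIs are kept as real duplicates.

```python
from collections import namedtuple

# Hypothetical record for one aligned cfDNA fragment (not an EPIC-seq data type).
Fragment = namedtuple("Fragment", ["chrom", "start", "end", "umi"])


def deduplicate(fragments):
    """Keep the first fragment seen for each (chrom, start, end, umi) key.

    Coordinates act as the endogenous identifier and the UMI as the
    exogenous one; an identical key is assumed to mark a PCR duplicate.
    """
    seen = set()
    unique = []
    for frag in fragments:
        key = (frag.chrom, frag.start, frag.end, frag.umi)
        if key not in seen:
            seen.add(key)
            unique.append(frag)
    return unique


reads = [
    Fragment("chr1", 1000, 1167, "ACGT"),
    Fragment("chr1", 1000, 1167, "ACGT"),  # same coordinates and UMI: PCR duplicate
    Fragment("chr1", 1000, 1167, "TTAG"),  # same coordinates, new UMI: kept
]
kept = deduplicate(reads)
```

Real pipelines usually also tolerate sequencing errors inside the UMI; this sketch requires an exact match.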

6. Read Normalization and quality control

If data from different samples are to be compared with each other, the reads can be downsampled to make them comparable. The sequencing depth reported as necessary for reasonable analysis results is greater than 500×, so any sample whose mean sequencing depth does not exceed this value can be dropped for more accurate outcomes. EPIC-seq also uses an expected cfDNA fragment length density of 140–185, based on chromatosomal length; samples with an outlier fragment length density can be dropped to improve correlation. As the last quality control step, mapping quality should be considered. A looser threshold can be applied to EPIC-seq reads than to WGS reads, because the TSS selection criteria imposed during panel design make the targeted reads more unique. [1]
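A sample-level filter along these lines might look like the following sketch. The 500× depth cutoff and the 140–185 range come from the text above; summarizing the fragment length density by its mode (most frequent length) is an assumption made here for illustration, and `passes_qc` is a hypothetical helper name.

```python
from collections import Counter


def passes_qc(mean_depth, fragment_lengths, min_depth=500, mode_range=(140, 185)):
    """Return True if a sample passes the depth and fragment-length checks.

    Assumption: the fragment length density is summarized by its mode,
    which must fall inside mode_range (inclusive).
    """
    if mean_depth <= min_depth:
        return False  # depth must exceed the cutoff
    if not fragment_lengths:
        return False  # no fragments to assess
    mode_length = Counter(fragment_lengths).most_common(1)[0][0]
    return mode_range[0] <= mode_length <= mode_range[1]


good_sample = passes_qc(800, [167, 167, 150, 320])
shallow_sample = passes_qc(400, [167, 167, 150])
short_sample = passes_qc(800, [120, 120, 167])
```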

Fragmentomic Feature Analysis

7. Shannon's entropy

To measure the diversity of fragment lengths, the PFE metric was developed from Shannon's index of entropy. [2] By default, 201 bins covering fragment lengths 100 to 300 are used for density estimation by the maximum likelihood method. The probability of observing a fragment of size $k$, $P(k)$, is computed by dividing the number of fragments of size $k$ by the total number of fragments. Shannon's entropy [2] is calculated with the formula $E = -\sum_{k=100}^{300} P(k) \log_2 P(k)$. [1]
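As a minimal sketch of this step, the function below computes the Shannon entropy of a list of fragment lengths restricted to the 100 to 300 range, using observed frequencies as the maximum likelihood estimate of each length's probability. The function name and the use of a base-2 logarithm (entropy in bits) are choices made for this illustration.

```python
import math
from collections import Counter


def fragment_length_entropy(lengths, min_len=100, max_len=300):
    """Shannon entropy (in bits) of fragment lengths within [min_len, max_len]."""
    kept = [n for n in lengths if min_len <= n <= max_len]
    if not kept:
        return 0.0
    total = len(kept)
    counts = Counter(kept)
    # Empty bins contribute nothing, so only observed lengths are summed.
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


single_length = fragment_length_entropy([167] * 50)
two_lengths = fragment_length_entropy([150, 160])
all_lengths = fragment_length_entropy(list(range(100, 301)))
```

A sample in which every fragment has the same length has zero entropy, while fragments spread evenly over all 201 lengths reach the maximum of log2(201) bits.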

8. Dirichlet-Multinomial model

Next, to guard against differences in sequencing depth between runs and other factors that can distort the fragment length distribution, Bayesian normalization via the Dirichlet-multinomial model should be done. For every sample, a fragment length distribution is estimated by multinomial maximum likelihood from the fragment lengths observed in that sample. Two intervals of 250 base pairs are used, located from −1000 to −750 base pairs and from +750 to +1000 base pairs relative to the centre of the TSS. Because these intervals are relatively far from the TSS, this limits the impact of gene expression on the estimated distribution. The fragment length densities from that distribution are then taken for each of the 201 fragment sizes and used as parameters for generating the Dirichlet distribution. [1]

The initial parameter for the Dirichlet distribution is set to 20. From the resulting Dirichlet distribution, 2000 fragments are sampled, and Shannon's entropy [2] is calculated for them. This entropy is subsequently compared with the Shannon entropy [2] values of five randomly selected background sets. [1]

9. PFE calculation

PFE is calculated from the probability that the gene-specific entropy is higher than a multiplier times the entropy of each background set, considered individually. The multiplier is sampled from the Gamma distribution with shape 1 and rate 0.5. As the last step, the expected value of the sum of these gene-specific entropy probabilities over the background sets is reported as PFE. The probabilities are based on the Dirichlet distribution generated in the previous step. [1]
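The following Monte Carlo sketch shows one possible reading of steps 8 and 9; it is not the developers' implementation. Several details are assumptions made for illustration: the Dirichlet parameters are formed as 20 times the flanking-region density plus the observed counts, the comparison is averaged (rather than summed) over the background sets so that the score lies between 0 and 1, and all function names are hypothetical. Only the Python standard library is used, so the Dirichlet draw is built from normalized Gamma variates.

```python
import math
import random


def shannon_entropy(counts):
    """Shannon entropy (bits) of a histogram of counts."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def sample_dirichlet(alphas, rng):
    """Draw from a Dirichlet distribution via normalized Gamma variates."""
    draws = [rng.gammavariate(a, 1.0) if a > 0 else 0.0 for a in alphas]
    total = sum(draws)
    return [d / total for d in draws]


def sampled_entropy(alphas, rng, n_fragments=2000):
    """Entropy of n_fragments lengths sampled from one Dirichlet draw."""
    probs = sample_dirichlet(alphas, rng)
    picks = rng.choices(range(len(probs)), weights=probs, k=n_fragments)
    counts = [0] * len(probs)
    for i in picks:
        counts[i] += 1
    return shannon_entropy(counts)


def pfe(gene_counts, flank_density, background_counts, rng,
        concentration=20.0, n_draws=200):
    """Estimate P(gene entropy > multiplier * background entropy).

    Assumption: prior = concentration * flank_density, added to the counts.
    """
    prior = [concentration * d for d in flank_density]
    gene_alphas = [p + c for p, c in zip(prior, gene_counts)]
    bg_alphas = [[p + c for p, c in zip(prior, bg)] for bg in background_counts]
    wins = 0
    for _ in range(n_draws):
        gene_entropy = sampled_entropy(gene_alphas, rng)
        for alphas in bg_alphas:
            # Gamma with shape 1 and rate 0.5 corresponds to scale 2.0.
            multiplier = rng.gammavariate(1.0, 2.0)
            if gene_entropy > multiplier * sampled_entropy(alphas, rng):
                wins += 1
    return wins / (n_draws * len(bg_alphas))


rng = random.Random(0)
n_bins = 201  # fragment lengths 100..300
flank = [1.0 / n_bins] * n_bins
backgrounds = [[5] * n_bins for _ in range(5)]
broad_gene = [5] * n_bins  # diverse fragment lengths
narrow_gene = [0] * n_bins
narrow_gene[67] = 1000  # nearly all fragments at 167 bp
pfe_broad = pfe(broad_gene, flank, backgrounds, rng, n_draws=40)
pfe_narrow = pfe(narrow_gene, flank, backgrounds, rng, n_draws=40)
```

In this toy input, the gene with diverse fragment lengths receives a higher score than the gene whose fragments are concentrated at a single length, which matches the intended direction of the metric.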

10. NDR calculation

NDR is the normalized sequencing depth in the 2000 base pair window around each TSS, computed after the reads have been downsampled to 2000× by default during the read preprocessing and quality control steps. [1]
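A minimal sketch of a depth-based score of this kind is shown below. Fragments are represented as half-open `(start, end)` intervals on one chromosome, and the depth of each 2000 base pair window is divided by the mean window depth across all targeted genes; that choice of normalizer, like the function names, is an assumption for illustration rather than the published definition.

```python
def window_depth(fragments, tss, half_window=1000):
    """Mean per-base depth in the window [tss - half_window, tss + half_window)."""
    lo, hi = tss - half_window, tss + half_window
    covered_bases = 0
    for start, end in fragments:  # half-open intervals [start, end)
        covered_bases += max(0, min(end, hi) - max(start, lo))
    return covered_bases / (hi - lo)


def ndr_scores(fragments_by_gene, tss_by_gene):
    """Window depth per gene divided by the mean depth over all genes (assumed normalizer)."""
    depths = {
        gene: window_depth(fragments_by_gene[gene], tss)
        for gene, tss in tss_by_gene.items()
    }
    mean_depth = sum(depths.values()) / len(depths)
    if mean_depth == 0:
        return {gene: 0.0 for gene in depths}
    return {gene: depth / mean_depth for gene, depth in depths.items()}


tss_by_gene = {"GENE_A": 5000, "GENE_B": 5000}
fragments_by_gene = {
    "GENE_A": [(4900, 5100)] * 10,  # shallower coverage: more likely active
    "GENE_B": [(4900, 5100)] * 30,
}
scores = ndr_scores(fragments_by_gene, tss_by_gene)
```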

11. Machine Learning for Expression prediction

The developers trained a machine learning model with bootstrapping on deep WGS data of cfDNA from a patient with carcinoma of unknown primary and a very low quantified ctDNA concentration. RNA sequencing was run on peripheral blood mononuclear cells (PBMCs) from 5 different patients, and the average expression level of 3 of these individuals is used as the reference for gene expression. The genes are grouped into 10 clusters based on reference gene expression to increase the resolution at the core promoters. Genes used as background for the PFE calculation are then removed. Next, all the fragments in the extended TSS regions (2000 base pair regions centred on the TSS) are pooled, and the PFE and NDR scores are calculated for the pooled fragments. These scores are further normalized based on their 95th percentile. [1]

Using these two features, the developers bootstrapped 600 expression prediction models on the WGS data and combined them in a weighted fashion. Among those models, 200 are univariable standalone NDR models, 200 are univariable standalone PFE models, and 200 are integrated NDR-PFE models. [1]
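The bootstrapping idea can be sketched for a single feature as follows. The model family is not specified in the text above, so this example assumes simple univariable linear regression and combines the bootstrapped models with equal weights; the integrated NDR-PFE models and the actual weighting scheme are not reproduced, and all names are hypothetical.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        return mean_y, 0.0  # degenerate resample: fall back to a constant
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def bootstrap_models(xs, ys, n_models, rng):
    """Fit one line per bootstrap resample (sampling with replacement)."""
    n = len(xs)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_line([xs[i] for i in idx], [ys[i] for i in idx]))
    return models


def ensemble_predict(models, x):
    """Equal-weight average of the individual model predictions (assumed weighting)."""
    return sum(a + b * x for a, b in models) / len(models)


# Toy data: a feature value per gene cluster and its reference expression.
feature = [float(i) for i in range(10)]
expression = [2.0 * x + 1.0 for x in feature]
models = bootstrap_models(feature, expression, 200, random.Random(1))
predicted = ensemble_predict(models, 5.0)
```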

Advantages

High throughput

EPIC-seq inherits the advantages of high-throughput sequencing: fast sequencing times, high scalability, higher sequencing depths, lower costs, and low error rates. [26] Another advantage of EPIC-seq is that it is non-invasive. This eliminates the risks of invasive procedures on vulnerable tissues and allows scientists to study tissues that are too dangerous or difficult to sample directly. [1]

Independence from High Tumour Burden requirement

As mentioned in the introduction, EPIC-seq does not inherit two major limitations of its predecessors. First, common liquid biopsy methods depend on germline or somatic variants, which are not certain to be found even in patients with a high disease burden. Second, methods such as shallow WGS consider an insufficient range of cfDNA source tissues and have insufficient genomic breadth and depth, which limits the resolution and level of gene expression inference and, again, requires a high tumour burden for higher resolution. [1] EPIC-seq uses fragmentomic features instead of variant calling, so it does not depend on the presence of variants. Because it performs targeted rather than whole-genome sequencing, it allows the sequencing depth to be increased and hence provides better resolution. It also provides more sensitive and comprehensive tissue-of-origin information. [1]

Different Prediction sensitivities

Furthermore, the method showed consistent performance in cancer identification, classification, and treatment-response tasks such as NSCLC and DLBCL identification, histological classification of NSCLC subtypes, molecular classification of DLBCL subtypes, DLBCL cell-of-origin (COO) detection, prediction of response to programmed death-ligand 1 immune-checkpoint inhibition in advanced NSCLC cases, and detection of the prognostic value of individual genes. [1]

Generalizability

Whole-exome sequencing (WES) was performed with EPIC-seq, and it detected a correlation between the biological signal and the exonic regions of active genes; [1] this suggests that EPIC-seq can be generalized to the expression of any genes of interest rather than only cancer genes.

Robustness on cfDNA levels

In general, EPIC-seq analysis results showed a significant correlation between the inspected biological effect and the developed score. For the classification tasks, area under the receiver operating characteristic curve (AUC) scores were over 90% at a sufficient level of significance. For these tasks, performance also did not deteriorate even when cfDNA levels were below 1%, [1] so the method shows good robustness to cfDNA levels as well. Finally, EPIC-seq did not show any significant changes under different pre-analytical factors, [1] which suggests that the method is robust to variation introduced by the instruments and tools used before the analysis.

Limitations

While EPIC-seq offers significant potential in various biomedical applications, it also has limitations that warrant consideration in its implementation and interpretation.

Dependency on Known Cancer-Associated genes

One limitation of EPIC-seq is its reliance on prior knowledge of genes associated with specific cancers. The effectiveness of the EPIC-seq model hinges on the availability of comprehensive gene expression profiles for the targeted cancer types. This dependency may restrict its applicability to cancers with well-characterized gene expression patterns, limiting its utility in cancers with less understood molecular signatures. [1] [27]

Limited applicability to specific cancer types

EPIC-seq may be more effective in cancers with prominent genes or well-defined molecular subtypes. Consequently, its utility may be limited in cancers with less distinct genetic profiles or those characterized by significant interpatient variability. This restricts its generalizability across different cancer types and necessitates cautious interpretation of results in diverse oncological contexts. [1] [27]

Limited Performance in Early-stage cancer

EPIC-seq may exhibit enhanced performance in detecting late-stage cancer due to higher levels of ctDNA and more pronounced genetic alterations. For example, EPIC-seq's sensitivity for detecting NSCLC diminishes significantly in patients with low tumor-DNA burden (below 1%), resulting in decreased detection rates by approximately 34%. [1] [27]

Applications

Noninvasive cancer detection

EPIC-seq has demonstrated remarkable potential in noninvasive cancer detection, notably in the diagnosis of lung cancer, the leading cause of cancer-related mortality. Using EPIC-seq, researchers have achieved high accuracy in distinguishing between NSCLC patients, DLBCL patients and healthy individuals. [1]

Noninvasive Classification of Cancer subtypes

EPIC-seq enables the subclassification of NSCLC into histological subtypes such as lung adenocarcinoma (LUAD) and lung squamous cell carcinoma (LUSC). EPIC-seq can also aid with the classification of cell-of-origin (COO) subtypes in DLBCL. By analyzing epigenetic and transcriptional signatures, EPIC-seq-derived classifiers provide insights into tumor heterogeneity and molecular subtyping that can inform tailored treatment strategies. [1]

Therapeutic Response prediction

In addition to diagnosis and classification, EPIC-seq holds promise in predicting patient response to various cancer therapies, including immune-checkpoint inhibition (ICI). By analyzing changes in gene expression patterns captured through EPIC-seq, researchers can forecast patient response to PD-(L)1 blockade therapy, which can provide great help in personalized cancer treatment. EPIC-seq-derived indices have shown significant correlation with treatment response, offering potential prognostic markers for therapy outcome prediction. [1]

Immunotranscriptomic profiling of Classical Hodgkin Lymphoma

EPIC-seq has been shown to be effective for inferring the epigenetic expression of classical Hodgkin lymphoma (cHL) subtypes. The expression of Hodgkin and Reed–Sternberg cells and of their corresponding T cells was inferred with EPIC-seq. Bulk single-cell RNA sequencing results show a significant correlation with the EPIC-seq profiles of these cell types. [28]

Possible use cases

Research in different areas mentions possible use cases of EPIC-seq. The integrated analysis toolkit for whole-genome-wide features of cfDNA (INAC) [29] compiles different tools, including EPIC-seq's PFE and NDR scores, to provide comprehensive in silico analysis of cfDNA, for example disease state and clinical outcome inference, transcriptome modeling, and copy number profiling. EPIC-seq is also mentioned as a potential application in clinical IBD cases. It can be used for surveillance of IBD in high-risk groups and of precancerous development caused by IBD. It is also named as a possibly superior method for clinical detection of IBD gut damage, compared to current methods. [6]

Alternatives

As EPIC-seq studies epigenetic markers to infer gene expression, one can study epigenetic sequencing methods like ChIP-seq, [30] ATAC-seq, [31] MeDIP-seq, [32] and Bisulfite-Free DNA Methylation sequencing [33] in combination with methods for profiling RNA expression such as RNA-seq [34] and scRNA-seq. [35]

Considering the method is mainly developed for early cancer detection or subgrouping, liquid biopsy methods, such as Twist cfDNA Pan-Cancer Reference Standard, can be used as an alternative. Different liquid biopsy methods focus on cell-free tumour markers, tumour methylation markers, exomes, proteins, lipids, carbohydrates, electrolytes, metabolites, RNA, extracellular vesicles, circulating tumour cells, and tumour-educated platelets for early identification of cancer non-invasively. [36] Some of the proposed liquid biopsy methods provide a comprehensive detection of cancer types, such as ATR-FTIR spectroscopy [37] and CancerSEEK, [38] while others, like Dxcover [39] and SelectMdx [40] operate on more specific (even single) cancer targets. [36]

EPIC-seq utilizes fragmentomic features to infer expression levels of genes. Several studies also employ fragmentomic features to infer cancer existence, infer cell death, and detect other clinical conditions such as transplant failure. [41]

ctDNA by Fragment Size analysis

This method uses in vitro and in silico ctDNA fragment length selection to enrich the variant proportion in the plasma. Its size selection criteria are based on the fragment length properties of ctDNA in blood, so it may not generalize well to other non-invasive sampling methods. Furthermore, it employs supervised machine learning methods such as Random Forest and Logistic Regression on shallow WGS to distinguish cancer patients from healthy individuals. The method can be used for different cancer types. [42]

Plasma DNA End-Motif profiling

This method identifies 4-bp end motifs from each strand's 5' end on bisulfite sequencing reads of plasma cfDNAs. Hierarchical clustering of the motifs is done to detect any under- or overrepresentation of these motifs due to the presence of cancer. The method incorporates Support Vector Machines and Logistic Regression to distinguish cancer patients from healthy individuals. The method has also been applied to transplant patients with clustering and multidimensional scaling (MDS) analysis and shows applicability there. The same types of analysis also showed that the method applies to prenatal testing. The method is also informative about cell types of origin. [10]
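The motif-counting step can be illustrated with a short sketch. It assumes each input string is a read written 5' to 3', so that its first four bases form the 5' end motif; reads that are too short or contain ambiguous bases are skipped. The function name is hypothetical, and the clustering and classification stages are not shown.

```python
from collections import Counter


def end_motif_frequencies(read_sequences, k=4):
    """Relative frequency of each k-base 5' end motif across the reads."""
    counts = Counter()
    for seq in read_sequences:
        motif = seq[:k].upper()
        # Skip reads shorter than k or with bases other than A, C, G, T.
        if len(motif) == k and set(motif) <= set("ACGT"):
            counts[motif] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {motif: count / total for motif, count in counts.items()}


reads = ["CCCAGTTA", "cccaTTGA", "ACGTAAGG", "NNNNAAGG", "AC"]
frequencies = end_motif_frequencies(reads)
```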

Orientation-aware Plasma cell-free DNA Fragmentation analysis

Using sequencing depth inconsistencies in open chromatin regions and signals derived from upstream- and downstream-oriented read end densities, this method infers the tissue of origin of the cfDNA fragments obtained from bisulfite sequencing. The method uses a mathematical formulation to generate signals for orientation-aware cfDNA fragmentation based on the empirical peak periods and positions of the upstream and downstream ends of the reads. The method has been shown to be useful for inferring the tissue of origin, pregnancy identification, cancer detection, and transplant monitoring. It also provides information on how much each tissue of origin contributes to the cfDNA reads. [11]

DNA Evaluation of Fragments for early interception

The method analyzes shallow WGS reads in windows while considering cfDNA fragment length and coverage. The genome-wide pattern of cfDNA fragmentation features is then fed to a gradient tree-boosting machine learning model to predict the patient's cancer status. Machine learning classifiers were also used to predict the tissue of origin. Overall, the method can be used to identify whether a patient has cancer. Even though the method does not specifically classify cancer types during prediction, it is used for the detection of different cancers. [43]

In vivo Nucleosome footprinting

The method produces genome-wide maps of in vivo nucleosome occupancy to detect the tissue of origin of cfDNA molecules. It uses the aligned endpoint positions of reads, which are expected to be close to nucleosome core particle (NCP) sites. The Windowed Protection Score (WPS) is proposed to quantify the cfDNA density close to NCPs as the number of cfDNA fragments that fully cover a 120 base pair window centred at a given location minus the number of fragments with an endpoint within the same window. Peaks in the WPS are then called heuristically to identify footprints. The cells contributing to cfDNA are then predicted from the footprints. These footprints can be used for identifying non-malignant epigenetic or genetic sites such as transcription factor binding sites, and for detecting malignancy-related biomarkers based on the extent of tissue damage and cell death. [17]
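The WPS definition above can be written as a short sketch. Fragments are given as half-open `(start, end)` intervals; treating a fragment whose start lies exactly on the window edge as spanning rather than ending is a boundary convention chosen here for illustration, and the function name is hypothetical.

```python
def windowed_protection_score(fragments, position, window=120):
    """Fragments spanning the window minus fragments ending inside it."""
    half = window // 2
    lo, hi = position - half, position + half  # window is [lo, hi)
    spanning = 0
    ending = 0
    for start, end in fragments:  # half-open intervals [start, end)
        if start <= lo and end >= hi:
            spanning += 1  # fragment fully covers the window
        elif lo <= start < hi or lo < end <= hi:
            ending += 1  # at least one endpoint falls inside the window
    return spanning - ending


fragments = [(0, 200), (0, 200), (100, 300), (500, 600)]
score_at_100 = windowed_protection_score(fragments, 100)
```

Positions protected by a nucleosome tend to be spanned by intact fragments and receive positive scores, while positions where many fragments end receive negative scores.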

ctDNA Nucleosome Pattern Employment for Transcriptional Regulation profiling

The method has mainly been developed for detecting the various phenotypes of metastatic castration-resistant prostate cancer. It requires patient-derived xenografts to enrich ctDNA in blood for further analysis. After WGS, the method utilizes the tool Griffin [44] to inspect local promoter coverage, nucleosome positioning, fragment sizes, and composite transcription factor binding sites plus open chromatin sites of ctDNA reads. It also checks histone modifications and applies dimensionality reduction to the sites found in order to identify putative promoter, enhancer, and gene-repressive heterochromatic marks. To interrogate chromatin phasing, the distance between open chromatin regions, the method uses TritonNP, a newly developed software tool that uses Fourier transforms and band-pass filters. XGBoost [45] is used to classify the cancer subtype from the features detected in the previous steps. [46]

cfDNA Methylation, Copy Number, and Fragmentation Analysis for early detection of multiple cancer types

The method is proposed as an assay that employs both cfDNA whole genome methylation sequencing and fragmentomic feature information for multicancer classification. Copy number ratios calculated for healthy and cancerous tissues are used to identify the presence and type of cancer. Like EPIC-seq, the method also utilizes fragment lengths: the ratio of short to long fragments is used as an identifier score. Using the single-base or region-level methylation percentages at detected cancer methylation markers for each cancer type, copy number ratios, and short/long fragment ratios, the method employs a custom Support Vector Machines algorithm to classify the cancer type if one is present. This method reports cancer detection and tissue of origin for 4 cancer types. However, it requires detection of specific methylation sites or regions of interest for each cancer type. [47]

Related Research Articles

Chromatin is a complex of DNA and protein found in eukaryotic cells. The primary function is to package long DNA molecules into more compact, denser structures. This prevents the strands from becoming tangled and also plays important roles in reinforcing the DNA during cell division, preventing DNA damage, and regulating gene expression and DNA replication. During mitosis and meiosis, chromatin facilitates proper segregation of the chromosomes in anaphase; the characteristic shapes of chromosomes visible during this stage are the result of DNA being coiled into highly condensed chromatin.

ChIP-sequencing, also known as ChIP-seq, is a method used to analyze protein interactions with DNA. ChIP-seq combines chromatin immunoprecipitation (ChIP) with massively parallel DNA sequencing to identify the binding sites of DNA-associated proteins. It can be used to map global binding sites precisely for any protein of interest. Previously, ChIP-on-chip was the most common technique utilized to study these protein–DNA relations.

Epigenomics is the study of the complete set of epigenetic modifications on the genetic material of a cell, known as the epigenome. The field is analogous to genomics and proteomics, which are the study of the genome and proteome of a cell. Epigenetic modifications are reversible modifications on a cell's DNA or histones that affect gene expression without altering the DNA sequence. Epigenomic maintenance is a continuous process and plays an important role in stability of eukaryotic genomes by taking part in crucial biological mechanisms like DNA repair. Plant flavones are said to be inhibiting epigenomic marks that cause cancers. Two of the most characterized epigenetic modifications are DNA methylation and histone modification. Epigenetic modifications play an important role in gene expression and regulation, and are involved in numerous cellular processes such as in differentiation/development and tumorigenesis. The study of epigenetics on a global level has been made possible only recently through the adaptation of genomic high-throughput assays.

RNA-Seq: Lab technique in cellular biology

RNA-Seq is a technique that uses next-generation sequencing to reveal the presence and quantity of RNA molecules in a biological sample, providing a snapshot of gene expression in the sample, also known as the transcriptome.

Chromatin immunoprecipitation: Genomic technique

Chromatin immunoprecipitation (ChIP) is a type of immunoprecipitation experimental technique used to investigate the interaction between proteins and DNA in the cell. It aims to determine whether specific proteins are associated with specific genomic regions, such as transcription factors on promoters or other DNA binding sites, and possibly define cistromes. ChIP also aims to determine the specific location in the genome that various histone modifications are associated with, indicating the target of the histone modifiers. ChIP is crucial for the advancements in the field of epigenomics and learning more about epigenetic phenomena.

STARR-seq

STARR-seq is a method to assay enhancer activity for millions of candidates from arbitrary sources of DNA. It is used to identify the sequences that act as transcriptional enhancers in a direct, quantitative, and genome-wide manner.

Circulating tumor DNA: Tumor-derived fragmented DNA in the bloodstream

Circulating tumor DNA (ctDNA) is tumor-derived fragmented DNA in the bloodstream that is not associated with cells. ctDNA should not be confused with cell-free DNA (cfDNA), a broader term which describes DNA that is freely circulating in the bloodstream, but is not necessarily of tumor origin. Because ctDNA may reflect the entire tumor genome, it has gained traction for its potential clinical utility; "liquid biopsies" in the form of blood draws may be taken at various time points to monitor tumor progression throughout the treatment regimen.

ATAC-seq is a technique used in molecular biology to assess genome-wide chromatin accessibility. It was first described in 2013 as an alternative to MNase-seq, FAIRE-Seq and DNase-Seq. ATAC-seq analyzes the epigenome faster than DNase-seq or MNase-seq.

Circulating free DNA (cfDNA), also known as cell-free DNA, consists of degraded DNA fragments released into body fluids such as blood plasma, urine, and cerebrospinal fluid. Typical sizes of cfDNA fragments reflect chromatosome particles (~165 bp), as well as multiples of nucleosomes, which protect DNA from digestion by apoptotic nucleases. The term cfDNA can be used to describe various forms of DNA freely circulating in body fluids, including circulating tumor DNA (ctDNA), cell-free mitochondrial DNA (ccf mtDNA), cell-free fetal DNA (cffDNA) and donor-derived cell-free DNA (dd-cfDNA). Elevated levels of cfDNA are observed in cancer, especially in advanced disease. There is evidence that cfDNA levels in circulation increase with age. cfDNA has been shown to be a useful biomarker for a multitude of ailments other than cancer and fetal medicine, including but not limited to trauma, sepsis, aseptic inflammation, myocardial infarction, stroke, transplantation, diabetes, and sickle cell disease. cfDNA is mostly double-stranded extracellular DNA, consisting of small fragments (50 to 200 bp) and larger fragments (21 kb), and has been recognized as an accurate marker for the diagnosis of prostate cancer and breast cancer.
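The link between fragment size and nucleosome multiples can be illustrated with a minimal Python sketch. It bins fragment lengths by the nearest whole multiple of the ~165 bp chromatosome size quoted above. The function names and the rounding rule are illustrative assumptions, not part of any published cfDNA pipeline.

```python
from collections import Counter
from typing import Iterable

# Approximate chromatosome size quoted in the text (an approximation, not a constant of nature).
CHROMATOSOME_BP = 165


def nucleosome_multiple(length_bp: int) -> int:
    """Return the nearest whole number of ~165 bp units, with a minimum of 1.

    Hypothetical helper: rounding to the nearest multiple is an assumed,
    deliberately crude way to label mono-, di-, and tri-nucleosomal fragments.
    """
    return max(1, round(length_bp / CHROMATOSOME_BP))


def size_profile(lengths_bp: Iterable[int]) -> Counter:
    """Count fragments per estimated nucleosome multiple."""
    return Counter(nucleosome_multiple(n) for n in lengths_bp)


if __name__ == "__main__":
    # Made-up fragment lengths, for illustration only.
    print(size_profile([160, 167, 170, 330, 340, 500]))
```

In practice, fragment-size analyses usually work with the full length histogram rather than hard bins; the binning here only makes the "multiples of nucleosomes" idea concrete.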

H3K27me3 is an epigenetic modification to the DNA packaging protein histone H3. It is a mark that indicates the tri-methylation of lysine 27 on histone H3 protein.

H3K9me3 is an epigenetic modification to the DNA packaging protein Histone H3. It is a mark that indicates the tri-methylation at the 9th lysine residue of the histone H3 protein and is often associated with heterochromatin.

H4K20me is an epigenetic modification to the DNA packaging protein Histone H4. It is a mark that indicates the mono-methylation at the 20th lysine residue of the histone H4 protein. This mark can be di- and tri-methylated. It is critical for genome integrity including DNA damage repair, DNA replication and chromatin compaction.

H4K16ac is an epigenetic modification to the DNA packaging protein Histone H4. It is a mark that indicates the acetylation at the 16th lysine residue of the histone H4 protein.

H4K8ac is an epigenetic modification to the DNA packaging protein histone H4. It is a mark that indicates the acetylation at the 8th lysine residue of the histone H4 protein. It has been implicated in the prevalence of malaria.

H3K36ac is an epigenetic modification to the DNA packaging protein Histone H3. It is a mark that indicates the acetylation at the 36th lysine residue of the histone H3 protein.

MNase-seq: Method used to analyse protein interactions with DNA

MNase-seq, short for micrococcal nuclease digestion with deep sequencing, is a molecular biological technique that was first used in 2006 to measure nucleosome occupancy in the C. elegans genome, and was subsequently applied to the human genome in 2008. The term "MNase-seq" was not coined until a year later, in 2009. Briefly, this technique relies on the non-specific endo-exonuclease micrococcal nuclease, an enzyme derived from the bacterium Staphylococcus aureus, to bind and cleave protein-unbound regions of DNA on chromatin. DNA bound to histones or other chromatin-bound proteins may remain undigested. The uncut DNA is then purified from the proteins and sequenced through one or more of the various next-generation sequencing methods.

H4R3me2 is an epigenetic modification to the DNA packaging protein histone H4. It is a mark that indicates the di-methylation at the 3rd arginine residue of the histone H4 protein. In epigenetics, arginine methylation of histones H3 and H4 is associated with a more accessible chromatin structure and thus higher levels of transcription. The existence of arginine demethylases that could reverse arginine methylation is controversial.

Single-cell genome and epigenome by transposases sequencing (scGET-seq) is a DNA sequencing method for profiling open and closed chromatin. In contrast to single-cell assay for transposase-accessible chromatin with sequencing (scATAC-seq), which only targets active euchromatin, scGET-seq is also capable of probing inactive heterochromatin.

H3T3P is an epigenetic modification to the DNA packaging protein histone H3. It is a mark that indicates the phosphorylation of the 3rd threonine residue of the histone H3 protein.

NOMe-seq: A nucleosome occupancy and methylome technique

Nucleosome Occupancy and Methylome Sequencing (NOMe-seq) is a genomics technique used to simultaneously detect nucleosome positioning and DNA methylation. This method is an extension of bisulfite sequencing, which is the gold standard for determining DNA methylation. NOMe-seq relies on the methyltransferase M.CviPI, which methylates cytosines in GpC dinucleotides unbound by nucleosomes or other proteins, creating a nucleosome footprint. The mammalian genome naturally contains DNA methylation, but only at CpG sites, so GpC methylation can be differentiated from genomic methylation after bisulfite sequencing. This allows simultaneous analysis of the nucleosome footprint and endogenous methylation on the same DNA molecules. In addition to nucleosome footprinting, NOMe-seq can determine locations bound by transcription factors. A nucleosome protects about 147 base pairs of DNA, whereas transcription factors or other proteins bind a region of only approximately 10-80 base pairs. Following treatment with M.CviPI, nucleosome and transcription factor sites can be differentiated based on the size of the unmethylated GpC region.
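The size-based distinction described above can be sketched in Python. This is a minimal illustration, not the published NOMe-seq analysis: it assumes GpC calls on one DNA molecule are already available as sorted `(position, is_methylated)` pairs, and the function names and the "ambiguous" label for runs between 80 and 147 bp are hypothetical choices made for the example.

```python
from typing import List, Tuple

GpcSite = Tuple[int, bool]  # (position in bp, True if the GpC cytosine is methylated)


def protected_runs(gpc_sites: List[GpcSite]) -> List[Tuple[int, int]]:
    """Return (start, end) for each maximal run of consecutive unmethylated GpC sites.

    Input must be sorted by position. The span between the first and last
    unmethylated GpC is a lower bound on the true protected region.
    """
    runs = []
    start = prev = None
    for pos, methylated in gpc_sites:
        if not methylated:
            if start is None:
                start = pos  # open a new protected run
            prev = pos
        elif start is not None:
            runs.append((start, prev))  # a methylated site closes the run
            start = prev = None
    if start is not None:
        runs.append((start, prev))  # run extends to the last site
    return runs


def classify_footprint(length_bp: int) -> str:
    """Label a protected run using the sizes quoted in the text (147 bp; 10-80 bp)."""
    if length_bp >= 147:
        return "nucleosome"
    if 10 <= length_bp <= 80:
        return "transcription factor or other protein"
    return "ambiguous"  # assumed catch-all for sizes the text does not cover


def call_footprints(gpc_sites: List[GpcSite]) -> List[Tuple[int, int, str]]:
    """Find protected runs and classify each by its length in base pairs."""
    return [(s, e, classify_footprint(e - s + 1)) for s, e in protected_runs(gpc_sites)]


if __name__ == "__main__":
    # Made-up GpC calls, for illustration only.
    sites = [(100, True), (110, False), (150, False), (260, False),
             (270, True), (300, False), (330, False), (340, True)]
    for start, end, label in call_footprints(sites):
        print(start, end, label)
```

Real data would also need to handle GpC sites that overlap CpG contexts and incomplete enzyme or bisulfite conversion, which this sketch ignores.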

References

  1. Esfahani, Mohammad Shahrokh; Hamilton, Emily G.; Mehrmohamadi, Mahya; Nabet, Barzin Y.; Alig, Stefan K.; King, Daniel A.; Steen, Chloé B.; Macaulay, Charles W.; Schultz, Andre; Nesselbush, Monica C.; Soo, Joanne; Schroers-Martin, Joseph G.; Chen, Binbin; Binkley, Michael S.; Stehr, Henning (April 2022). "Inferring gene expression from cell-free DNA fragmentation profiles". Nature Biotechnology. 40 (4): 585–597. doi:10.1038/s41587-022-01222-4. ISSN   1087-0156. PMC   9337986 . PMID   35361996.
  2. Saraiva, Paulo (July 2023). "On Shannon entropy and its applications". Kuwait Journal of Science. 50 (3): 194–199. Bibcode:2023KwJS...50..194S. doi: 10.1016/j.kjs.2023.05.004 .
  3. Zhang, Lijing; Li, Jinming (2023-10-19). "Unlocking the secrets: the power of methylation-based cfDNA detection of tissue damage in organ systems". Clinical Epigenetics. 15 (1): 168. doi: 10.1186/s13148-023-01585-8 . ISSN   1868-7083. PMC   10588141 . PMID   37858233.
  4. Wang, Rui; Yang, Yue; Lu, Tianyu; Cui, Youbin; Li, Bo; Liu, Xin (2024-01-31). "Circulating cell-free DNA-based methylation pattern in plasma for early diagnosis of esophagus cancer". PeerJ. 12: e16802. doi: 10.7717/peerj.16802 . ISSN   2167-8359. PMC   10838104 . PMID   38313016.
  5. Scarpa, Joseph (December 2023). "Improving liver transplant outcomes with transplant-omics and network biology". Current Opinion in Organ Transplantation. 28 (6): 412–418. doi:10.1097/MOT.0000000000001100. ISSN   1087-2418. PMID   37706301. S2CID   261742668.
  6. Chuah, Cher Shiong; Fischer, Lena; Ho, Gwo-Tzer (2023-06-14). "Circulating Cell-Free DNA in Inflammatory Bowel Disease: Liquid Biopsies with Mechanistic and Translational Implications". Faculty Reviews. 12: 14. doi: 10.12703/r/12-14 . ISSN   2732-432X. PMC   10281509 . PMID   37346090.
  7. Ranucci, Rossella (2019). "Cell-Free DNA: Applications in Different Diseases". Cell-free DNA as Diagnostic Markers: Methods and Protocols. Methods in Molecular Biology. Vol. 1909. New York, NY: Springer. pp. 3–12. doi:10.1007/978-1-4939-8973-7_1. ISBN   978-1-4939-8973-7. PMID   30580419. S2CID   58550698 . Retrieved 2024-02-23.
  8. Ivanov, Maxim; Baranova, Ancha; Butler, Timothy; Spellman, Paul; Mileyko, Vladislav (December 2015). "Non-random fragmentation patterns in circulating cell-free DNA reflect epigenetic regulation". BMC Genomics. 16 (S13): S1. doi: 10.1186/1471-2164-16-S13-S1 . ISSN   1471-2164. PMC   4686799 . PMID   26693644.
  9. Ulz, Peter; Thallinger, Gerhard G; Auer, Martina; Graf, Ricarda; Kashofer, Karl; Jahn, Stephan W; Abete, Luca; Pristauz, Gunda; Petru, Edgar; Geigl, Jochen B; Heitzer, Ellen; Speicher, Michael R (October 2016). "Inferring expressed genes by whole-genome sequencing of plasma DNA". Nature Genetics. 48 (10): 1273–1278. doi:10.1038/ng.3648. ISSN   1061-4036. PMID   27571261.
  10. Jiang, Peiyong; Sun, Kun; Peng, Wenlei; Cheng, Suk Hang; Ni, Meng; Yeung, Philip C.; Heung, Macy M.S.; Xie, Tingting; Shang, Huimin; Zhou, Ze; Chan, Rebecca W.Y.; Wong, John; Wong, Vincent W.S.; Poon, Liona C.; Leung, Tak Yeung (2020-05-01). "Plasma DNA End-Motif Profiling as a Fragmentomic Marker in Cancer, Pregnancy, and Transplantation". Cancer Discovery. 10 (5): 664–673. doi:10.1158/2159-8290.CD-19-0622. ISSN   2159-8274. PMID   32111602. S2CID   211565376.
  11. Sun, Kun; Jiang, Peiyong; Cheng, Suk Hang; Cheng, Timothy H.T.; Wong, John; Wong, Vincent W.S.; Ng, Simon S.M.; Ma, Brigette B.Y.; Leung, Tak Y.; Chan, Stephen L.; Mok, Tony S.K.; Lai, Paul B.S.; Chan, Henry L.Y.; Sun, Hao; Chan, K.C. Allen (March 2019). "Orientation-aware plasma cell-free DNA fragmentation analysis in open chromatin regions informs tissue of origin". Genome Research. 29 (3): 418–427. doi:10.1101/gr.242719.118. ISSN   1088-9051. PMC   6396422 . PMID   30808726.
  12. Heitzer, Ellen; Auinger, Lisa; Speicher, Michael R. (May 2020). "Cell-Free DNA and Apoptosis: How Dead Cells Inform About the Living". Trends in Molecular Medicine. 26 (5): 519–528. doi:10.1016/j.molmed.2020.01.012. PMID   32359482. S2CID   213458341.
  13. Heitzer, Ellen; Haque, Imran S.; Roberts, Charles E. S.; Speicher, Michael R. (February 2019). "Current and future perspectives of liquid biopsies in genomics-driven oncology". Nature Reviews Genetics. 20 (2): 71–88. doi:10.1038/s41576-018-0071-5. ISSN   1471-0056. PMID   30410101. S2CID   53211888.
  14. Phallen, Jillian; Sausen, Mark; Adleff, Vilmos; Leal, Alessandro; Hruban, Carolyn; White, James; Anagnostou, Valsamo; Fiksel, Jacob; Cristiano, Stephen; Papp, Eniko; Speir, Savannah; Reinert, Thomas; Orntoft, Mai-Britt Worm; Woodward, Brian D.; Murphy, Derek (2017-08-16). "Direct detection of early-stage cancers using circulating tumor DNA". Science Translational Medicine. 9 (403). doi:10.1126/scitranslmed.aan2415. ISSN   1946-6234. PMC   6714979 . PMID   28814544.
  15. Cristiano, Stephen; Leal, Alessandro; Phallen, Jillian; Fiksel, Jacob; Adleff, Vilmos; Bruhm, Daniel C.; Jensen, Sarah Østrup; Medina, Jamie E.; Hruban, Carolyn; White, James R.; Palsgrove, Doreen N.; Niknafs, Noushin; Anagnostou, Valsamo; Forde, Patrick; Naidoo, Jarushka (June 2019). "Genome-wide cell-free DNA fragmentation in patients with cancer". Nature. 570 (7761): 385–389. Bibcode:2019Natur.570..385C. doi:10.1038/s41586-019-1272-6. ISSN   0028-0836. PMC   6774252 . PMID   31142840.
  16. Newman, Aaron M; Bratman, Scott V; To, Jacqueline; Wynne, Jacob F; Eclov, Neville C W; Modlin, Leslie A; Liu, Chih Long; Neal, Joel W; Wakelee, Heather A; Merritt, Robert E; Shrager, Joseph B; Loo, Billy W; Alizadeh, Ash A; Diehn, Maximilian (May 2014). "An ultrasensitive method for quantitating circulating tumor DNA with broad patient coverage". Nature Medicine. 20 (5): 548–554. doi:10.1038/nm.3519. ISSN   1078-8956. PMC   4016134 . PMID   24705333.
  17. Snyder, Matthew W.; Kircher, Martin; Hill, Andrew J.; Daza, Riza M.; Shendure, Jay (January 2016). "Cell-free DNA Comprises an In Vivo Nucleosome Footprint that Informs Its Tissues-Of-Origin". Cell. 164 (1–2): 57–68. doi:10.1016/j.cell.2015.11.050. PMC   4715266 . PMID   26771485.
  18. Ivanov, Maxim; Baranova, Ancha; Butler, Timothy; Spellman, Paul; Mileyko, Vladislav (December 2015). "Non-random fragmentation patterns in circulating cell-free DNA reflect epigenetic regulation". BMC Genomics. 16 (S13): S1. doi: 10.1186/1471-2164-16-s13-s1 . ISSN   1471-2164. PMC   4686799 . PMID   26693644.
  19. Ulz, Peter; Thallinger, Gerhard G; Auer, Martina; Graf, Ricarda; Kashofer, Karl; Jahn, Stephan W; Abete, Luca; Pristauz, Gunda; Petru, Edgar; Geigl, Jochen B; Heitzer, Ellen; Speicher, Michael R (2016-08-29). "Inferring expressed genes by whole-genome sequencing of plasma DNA". Nature Genetics. 48 (10): 1273–1278. doi:10.1038/ng.3648. ISSN   1061-4036. PMID   27571261.
  20. "Correction for Chen et al., Plasma butyrylcholinesterase regulates ghrelin to control aggression". Proceedings of the National Academy of Sciences. 112 (12): E1510. 2015-03-03. Bibcode:2015PNAS..112E1510.. doi: 10.1073/pnas.1503913112 . ISSN   0027-8424. PMC   4378400 . PMID   25737546.
  21. Mouliere, Florent; Chandrananda, Dineika; Piskorz, Anna M.; Moore, Elizabeth K.; Morris, James; Ahlborn, Lise Barlebo; Mair, Richard; Goranova, Teodora; Marass, Francesco; Heider, Katrin; Wan, Jonathan C. M.; Supernat, Anna; Hudecova, Irena; Gounaris, Ioannis; Ros, Susana (2018-11-07). "Enhanced detection of circulating tumor DNA by fragment size analysis". Science Translational Medicine. 10 (466). doi:10.1126/scitranslmed.aat4921. ISSN   1946-6234. PMC   6483061 . PMID   30404863.
  22. Underhill, Hunter R.; Kitzman, Jacob O.; Hellwig, Sabine; Welker, Noah C.; Daza, Riza; Baker, Daniel N.; Gligorich, Keith M.; Rostomily, Robert C.; Bronner, Mary P.; Shendure, Jay (2016-07-18). "Fragment Length of Circulating Tumor DNA". PLOS Genetics. 12 (7): e1006162. doi: 10.1371/journal.pgen.1006162 . ISSN   1553-7404. PMC   4948782 . PMID   27428049.
  23. Ulz, Peter; Perakis, Samantha; Zhou, Qing; Moser, Tina; Belic, Jelena; Lazzeri, Isaac; Wölfler, Albert; Zebisch, Armin; Gerger, Armin; Pristauz, Gunda; Petru, Edgar; White, Brandon; Roberts, Charles E. S.; John, John St.; Schimek, Michael G. (2019-10-11). "Inference of transcription factor binding from cell-free DNA enables tumor subtype prediction and early detection". Nature Communications. 10 (1): 4666. Bibcode:2019NatCo..10.4666U. doi:10.1038/s41467-019-12714-4. ISSN   2041-1723. PMC   6789008 . PMID   31604930.
  24. Newman, Aaron M; Bratman, Scott V; To, Jacqueline; Wynne, Jacob F; Eclov, Neville C W; Modlin, Leslie A; Liu, Chih Long; Neal, Joel W; Wakelee, Heather A; Merritt, Robert E; Shrager, Joseph B; Loo, Billy W; Alizadeh, Ash A; Diehn, Maximilian (May 2014). "An ultrasensitive method for quantitating circulating tumor DNA with broad patient coverage". Nature Medicine. 20 (5): 548–554. doi:10.1038/nm.3519. ISSN   1078-8956. PMC   4016134 . PMID   24705333.
  25. "Homo sapiens genome assembly GRCh37". NCBI. Retrieved 2024-02-23.
  26. "High Throughput Sequencing - an overview | ScienceDirect Topics". www.sciencedirect.com. Retrieved 2024-02-23.
  27. Jiang, Peiyong; Lo, Y. M. Dennis (April 2022). "Enhanced cancer detection from cell-free DNA". Nature Biotechnology. 40 (4): 473–474. doi:10.1038/s41587-021-01207-9. ISSN   1546-1696. PMID   35361997. S2CID   247854113.
  28. Alig, Stefan K.; Shahrokh Esfahani, Mohammad; Garofalo, Andrea; Li, Michael Yu; Rossi, Cédric; Flerlage, Tim; Flerlage, Jamie E.; Adams, Ragini; Binkley, Michael S.; Shukla, Navika; Jin, Michael C.; Olsen, Mari; Telenius, Adèle; Mutter, Jurik A.; Schroers-Martin, Joseph G. (January 2024). "Distinct Hodgkin lymphoma subtypes defined by noninvasive genomic profiling". Nature. 625 (7996): 778–787. Bibcode:2024Natur.625..778A. doi:10.1038/s41586-023-06903-x. ISSN   1476-4687. PMC   11293530 . PMID   38081297. S2CID   266224498.
  29. Li, Chennan; Baj, Anna; Sowalsky, Adam G. (April 2023). "One toolkit to bring them all, and in silico analyze them". Clinical and Translational Discovery. 3 (2). doi:10.1002/ctd2.194. ISSN   2768-0622. PMC   10201993 . PMID   37220531.
  30. Park, Peter J. (October 2009). "ChIP–seq: advantages and challenges of a maturing technology". Nature Reviews Genetics. 10 (10): 669–680. doi:10.1038/nrg2641. ISSN   1471-0064. PMC   3191340 . PMID   19736561.
  31. Grandi, Fiorella C.; Modi, Hailey; Kampman, Lucas; Corces, M. Ryan (June 2022). "Chromatin accessibility profiling by ATAC-seq". Nature Protocols. 17 (6): 1518–1552. doi:10.1038/s41596-022-00692-9. ISSN   1750-2799. PMC   9189070 . PMID   35478247.
  32. Staunstrup, Nicklas H.; Starnawska, Anna; Nyegaard, Mette; Christiansen, Lene; Nielsen, Anders L.; Børglum, Anders; Mors, Ole (December 2016). "Genome-wide DNA methylation profiling with MeDIP-seq using archived dried blood spots". Clinical Epigenetics. 8 (1): 81. doi: 10.1186/s13148-016-0242-1 . ISSN   1868-7075. PMC   4960904 . PMID   27462375.
  33. Niemöller, Christoph; Wehrle, Julius; Riba, Julian; Claus, Rainer; Renz, Nathalie; Rhein, Janika; Bleul, Sabine; Stosch, Juliane M.; Duyster, Justus; Plass, Christoph; Lutsik, Pavlo; Lipka, Daniel B.; Lübbert, Michael; Becker, Heiko (2021-02-01). "Bisulfite-free epigenomics and genomics of single cells through methylation-sensitive restriction". Communications Biology. 4 (1): 153. doi:10.1038/s42003-021-01661-w. ISSN   2399-3642. PMC   7851132 . PMID   33526904.
  34. Wang, Zhong; Gerstein, Mark; Snyder, Michael (January 2009). "RNA-Seq: a revolutionary tool for transcriptomics". Nature Reviews Genetics. 10 (1): 57–63. doi:10.1038/nrg2484. ISSN   1471-0056. PMC   2949280 . PMID   19015660.
  35. Jovic, Dragomirka; Liang, Xue; Zeng, Hua; Lin, Lin; Xu, Fengping; Luo, Yonglun (March 2022). "Single-cell RNA sequencing technologies and applications: A brief overview". Clinical and Translational Medicine. 12 (3): e694. doi:10.1002/ctm2.694. ISSN   2001-1326. PMC   8964935 . PMID   35352511.
  36. Connal, Siobhan; Cameron, James M.; Sala, Alexandra; Brennan, Paul M.; Palmer, David S.; Palmer, Joshua D.; Perlow, Haley; Baker, Matthew J. (2023-02-11). "Liquid biopsies: the future of cancer early detection". Journal of Translational Medicine. 21 (1): 118. doi: 10.1186/s12967-023-03960-8 . ISSN   1479-5876. PMC   9922467 . PMID   36774504.
  37. Barbora, Ayan; Karri, Sirish; Firer, Michael A.; Minnes, Refael (2023-11-02). "Multifractal analysis of cellular ATR-FTIR spectrum as a method for identifying and quantifying cancer cell metastatic levels". Scientific Reports. 13 (1): 18935. Bibcode:2023NatSR..1318935B. doi:10.1038/s41598-023-46014-1. ISSN   2045-2322. PMC   10622493 . PMID   37919384.
  38. Cohen, Joshua D.; Li, Lu; Wang, Yuxuan; Thoburn, Christopher; Afsari, Bahman; Danilova, Ludmila; Douville, Christopher; Javed, Ammar A.; Wong, Fay; Mattox, Austin; Hruban, Ralph H.; Wolfgang, Christopher L.; Goggins, Michael G.; Dal Molin, Marco; Wang, Tian-Li (2018-02-23). "Detection and localization of surgically resectable cancers with a multi-analyte blood test". Science. 359 (6378): 926–930. Bibcode:2018Sci...359..926C. doi:10.1126/science.aar3247. ISSN   0036-8075. PMC   6080308 . PMID   29348365.
  39. "The Future of Cancer Diagnostics". Dxcover. Retrieved 2024-02-23.
  40. Visser, Wieke C. H.; de Jong, Hans; Steyaert, Sandra; Melchers, Willem J. G.; Mulders, Peter F. A.; Schalken, Jack A. (September 2022). "Clinical use of the mRNA urinary biomarker SelectMDx test for prostate cancer". Prostate Cancer and Prostatic Diseases. 25 (3): 583–589. doi:10.1038/s41391-022-00562-1. ISSN   1365-7852. PMC   9385481 . PMID   35810263.
  41. Qi, Ting; Pan, Min; Shi, Huajuan; Wang, Liangying; Bai, Yunfei; Ge, Qinyu (2023-01-12). "Cell-Free DNA Fragmentomics: The Novel Promising Biomarker". International Journal of Molecular Sciences. 24 (2): 1503. doi: 10.3390/ijms24021503 . ISSN   1422-0067. PMC   9866579 . PMID   36675018.
  42. Mouliere, Florent; Chandrananda, Dineika; Piskorz, Anna M.; Moore, Elizabeth K.; Morris, James; Ahlborn, Lise Barlebo; Mair, Richard; Goranova, Teodora; Marass, Francesco; Heider, Katrin; Wan, Jonathan C. M.; Supernat, Anna; Hudecova, Irena; Gounaris, Ioannis; Ros, Susana (2018-11-07). "Enhanced detection of circulating tumor DNA by fragment size analysis". Science Translational Medicine. 10 (466). doi:10.1126/scitranslmed.aat4921. ISSN   1946-6234. PMC   6483061 . PMID   30404863.
  43. Cristiano, Stephen; Leal, Alessandro; Phallen, Jillian; Fiksel, Jacob; Adleff, Vilmos; Bruhm, Daniel C.; Jensen, Sarah Østrup; Medina, Jamie E.; Hruban, Carolyn; White, James R.; Palsgrove, Doreen N.; Niknafs, Noushin; Anagnostou, Valsamo; Forde, Patrick; Naidoo, Jarushka (June 2019). "Genome-wide cell-free DNA fragmentation in patients with cancer". Nature. 570 (7761): 385–389. Bibcode:2019Natur.570..385C. doi:10.1038/s41586-019-1272-6. ISSN   0028-0836. PMC   6774252 . PMID   31142840.
  44. Doebley, Anna-Lisa; Ko, Minjeong; Liao, Hanna; Cruikshank, A. Eden; Kikawa, Caroline; Santos, Katheryn; Hiatt, Joseph; Patton, Robert D.; De Sarkar, Navonil (2021-09-03). "Griffin: Framework for clinical cancer subtyping from nucleosome profiling of cell-free DNA". doi:10.1101/2021.08.31.21262867. S2CID   237397242 . Retrieved 2024-03-09.
  45. Chen, Tianqi; Guestrin, Carlos (2016-08-13). "XGBoost: A Scalable Tree Boosting System". Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. pp. 785–794. arXiv: 1603.02754 . doi:10.1145/2939672.2939785. ISBN   978-1-4503-4232-2.
  46. De Sarkar, Navonil; Patton, Robert D.; Doebley, Anna-Lisa; Hanratty, Brian; Adil, Mohamed; Kreitzman, Adam J.; Sarthy, Jay F.; Ko, Minjeong; Brahma, Sandipan; Meers, Michael P.; Janssens, Derek H.; Ang, Lisa S.; Coleman, Ilsa M.; Bose, Arnab; Dumpit, Ruth F.; Lucas, Jared M.; Nunez, Talina A.; Nguyen, Holly M.; McClure, Heather M.; Pritchard, Colin C.; Schweizer, Michael T.; Morrissey, Colm; Choudhury, Atish D.; Baca, Sylvan C.; Berchuck, Jacob E.; Freedman, Matthew L.; Ahmad, Kami; Haffner, Michael C.; Montgomery, R. Bruce; Corey, Eva; Henikoff, Steven; Nelson, Peter S.; Ha, Gavin (1 March 2023). "Nucleosome Patterns in Circulating Tumor DNA Reveal Transcriptional Regulation of Advanced Prostate Cancer Phenotypes". Cancer Discovery. 13 (3): 632–653. doi:10.1158/2159-8290.CD-22-0692. PMC   9976992 . PMID   36399432.
  47. Kim, Su Yeon; Jeong, Seongmun; Lee, Wookjae; Jeon, Yujin; Kim, Yong-Jin; Park, Seowoo; Lee, Dongin; Go, Dayoung; Song, Sang-Hyun; Lee, Sanghoo; Woo, Hyun Goo; Yoon, Jung-Ki; Park, Young Sik; Kim, Young Tae; Lee, Se-Hoon (November 2023). "Cancer signature ensemble integrating cfDNA methylation, copy number, and fragmentation facilitates multi-cancer early detection". Experimental & Molecular Medicine. 55 (11): 2445–2460. doi:10.1038/s12276-023-01119-5. ISSN   2092-6413. PMC   10689759 . PMID   37907748.