Tohoku Medical Megabank Project

Commercial: No
Type of project: Prospective cohort study
Location: Sendai, Miyagi, Japan (ToMMo) and Shiwa-gun, Iwate, Japan (IMM)
Key people: Masayuki Yamamoto (Executive Director, ToMMo); Makoto Sasaki (Executive Director, IMM)
Established: 2012

The Tohoku Medical Megabank Project is a national project in Japan that started in 2012. [1] The mission of the Tohoku Medical Megabank (TMM) project is to carry out a long-term health survey in Miyagi and Iwate prefectures, which were affected by the Great East Japan Earthquake, and to provide research infrastructure for the development of personalized medicine by establishing a biobank and conducting cohort studies. [2] [3]

The TMM project is conducted by Tohoku University's Tohoku Medical Megabank Organization (ToMMo) and Iwate Medical University's Iwate Tohoku Medical Megabank Organization (IMM).

Projects

The following cohort studies and research projects are in progress or completed.

Prospective cohort studies

The TMM Community-Based Cohort Study (TMM CommCohort Study) recruited over 80,000 participants, with 50,000 from Miyagi Prefecture and 30,000 from Iwate Prefecture.

The TMM Birth and Three-Generation Cohort Study (TMM BirThree Cohort Study) recruited over 70,000 participants, including over 23,000 newborns and their parents, siblings, grandparents, and extended family members. [3]

The project has been collecting various specimens, including peripheral and cord blood mononuclear cells, buffy coat, plasma, serum, urine, breast milk, and saliva. [2]

Research projects

Through the jMorp (Japanese Multi Omics Reference Panel) database, the project provides allele frequencies of single nucleotide variants (SNVs) and short insertions/deletions (INDELs) detected by short-read whole genome sequencing (WGS) of 54,000 Japanese individuals, as well as allele frequencies of structural variants detected by long-read WGS of 222 individuals. [7]
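As a rough illustration of what an allele frequency in such a panel represents, the sketch below computes the alternate-allele frequency at a single biallelic autosomal site from diploid genotype counts. The function name and the counts are hypothetical; they are not taken from jMorp or from the project's own analysis pipeline.

```python
from fractions import Fraction


def alt_allele_frequency(n_hom_ref: int, n_het: int, n_hom_alt: int) -> Fraction:
    """Alternate-allele frequency at one biallelic autosomal site.

    Each diploid individual carries two alleles, so the total allele
    number is twice the number of genotyped individuals.
    """
    n_individuals = n_hom_ref + n_het + n_hom_alt
    if n_individuals == 0:
        raise ValueError("no genotyped individuals at this site")
    allele_number = 2 * n_individuals
    # Heterozygotes contribute one alternate allele, homozygotes two.
    allele_count = n_het + 2 * n_hom_alt
    return Fraction(allele_count, allele_number)


# Made-up genotype counts for 100 individuals (illustration only).
freq = alt_allele_frequency(n_hom_ref=90, n_het=9, n_hom_alt=1)
print(float(freq))  # 0.055
```

Using `Fraction` keeps the allele count and allele number exact, so the result can be converted to a float only when it is displayed.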

The project designed population-specific SNP arrays optimized for the Japanese population (the Japonica Array series) and performed a series of genome-wide association studies (GWAS) with the cohort samples. [8]

The project constructed a reference genome for the Japanese population by integrating de novo assemblies of three Japanese individuals.

Timeline

2013

2014

2015

2016

2017

2018

2019

2020

2021

2022

2023

References

  1. Kuriyama, Shinichi; Yaegashi, Nobuo; Nagami, Fuji; Arai, Tomohiko; Kawaguchi, Yoshio; Osumi, Noriko; Sakaida, Masaki; Suzuki, Yoichi; Nakayama, Keiko; Hashizume, Hiroaki; Tamiya, Gen (2016). "The Tohoku Medical Megabank Project: Design and Mission". Journal of Epidemiology. 26 (9): 493–511. doi:10.2188/jea.JE20150268. PMC 5008970. PMID 27374138.
  2. Minegishi, Naoko; Nishijima, Ichiko; Nobukuni, Takahiro; Kudo, Hisaaki; Ishida, Noriko; Terakawa, Takahiro; Kumada, Kazuki; Yamashita, Riu; Katsuoka, Fumiki; Ogishima, Soichi; Suzuki, Kichiya (2019). "Biobank Establishment and Sample Management in the Tohoku Medical Megabank Project". The Tohoku Journal of Experimental Medicine. 248 (1): 45–55. doi:10.1620/tjem.248.45. PMID 31130587. S2CID 167210746.
  3. Kuriyama, Shinichi; Metoki, Hirohito; Kikuya, Masahiro; Obara, Taku; Ishikuro, Mami; Yamanaka, Chizuru; Nagai, Masato; Matsubara, Hiroko; Kobayashi, Tomoko; Sugawara, Junichi; Tamiya, Gen (2020-02-01). "Cohort Profile: Tohoku Medical Megabank Project Birth and Three-Generation Cohort Study (TMM BirThree Cohort Study): rationale, progress and perspective". International Journal of Epidemiology. 49 (1): 18–19m. doi:10.1093/ije/dyz169. ISSN 0300-5771. PMC 7124511. PMID 31504573.
  4. Ogishima, Soichi; Nagaie, Satoshi; Mizuno, Satoshi; Ishiwata, Ryosuke; Iida, Keita; Shimokawa, Kazuro; Takai-Igarashi, Takako; Nakamura, Naoki; Nagase, Sachiko; Nakamura, Tomohiro; Tsuchiya, Naho (2021-12-10). "dbTMM: an integrated database of large-scale cohort, genome and clinical data for the Tohoku Medical Megabank Project". Human Genome Variation. 8 (1): 44. doi:10.1038/s41439-021-00175-5. ISSN 2054-345X. PMC 8660797. PMID 34887386.
  5. Kawame, Hiroshi; Fukushima, Akimune; Fuse, Nobuo; Nagami, Fuji; Suzuki, Yoichi; Sakurai-Yageta, Mika; Yasuda, Jun; Yamaguchi-Kabata, Yumi; Kinoshita, Kengo; Ogishima, Soichi; Takai, Takako (January 2022). "The return of individual genomic results to research participants: design and pilot study of Tohoku Medical Megabank Project". Journal of Human Genetics. 67 (1): 9–17. doi:10.1038/s10038-021-00952-8. ISSN 1435-232X. PMID 34234266. S2CID 235768398.
  6. Yasuda, Jun; Kinoshita, Kengo; Katsuoka, Fumiki; Danjoh, Inaho; Sakurai-Yageta, Mika; Motoike, Ikuko N.; Kuroki, Yoko; Saito, Sakae; Kojima, Kaname; Shirota, Matsuyuki; Saigusa, Daisuke (2019-02-01). "Genome analyses for the Tohoku Medical Megabank Project towards establishment of personalized healthcare". Journal of Biochemistry. 165 (2): 139–158. doi:10.1093/jb/mvy096. ISSN 1756-2651. PMID 30452759.
  7. "jMorp". jmorp.megabank.tohoku.ac.jp. Retrieved 2024-03-12.
  8. Sakurai-Yageta, Mika; Kumada, Kazuki; Gocho, Chinatsu; Makino, Satoshi; Uruno, Akira; Tadaka, Shu; Motoike, Ikuko N; Kimura, Masae; Ito, Shin; Otsuki, Akihito; Narita, Akira; Kudo, Hisaaki; Aoki, Yuichi; Danjoh, Inaho; Yasuda, Jun (2021-10-12). "Japonica Array NEO with increased genome-wide coverage and abundant disease risk SNPs". The Journal of Biochemistry. 170 (3): 399–410. doi:10.1093/jb/mvab060. ISSN 0021-924X. PMC 8510329. PMID 34131746.
  9. Takayama, Jun; Tadaka, Shu; Yano, Kenji; Katsuoka, Fumiki; Gocho, Chinatsu; Funayama, Takamitsu; Makino, Satoshi; Okamura, Yasunobu; Kikuchi, Atsuo; Sugimoto, Sachiyo; Kawashima, Junko (2021-01-11). "Construction and integration of three de novo Japanese human genome assemblies toward a population-specific reference". Nature Communications. 12 (1): 226. Bibcode:2021NatCo..12..226T. doi:10.1038/s41467-020-20146-8. ISSN 2041-1723. PMC 7801658. PMID 33431880.
  10. Koshiba, Seizo; Motoike, Ikuko; Saigusa, Daisuke; Inoue, Jin; Shirota, Matsuyuki; Katoh, Yasutake; Katsuoka, Fumiki; Danjoh, Inaho; Hozawa, Atsushi; Kuriyama, Shinichi; Minegishi, Naoko (June 2018). "Omics research project on prospective cohort studies from the Tohoku Medical Megabank Project". Genes to Cells: Devoted to Molecular & Cellular Mechanisms. 23 (6): 406–417. doi:10.1111/gtc.12588. ISSN 1365-2443. PMID 29701317. S2CID 21700516.
  11. Furukawa, Ryohei; Hachiya, Tsuyoshi; Ohmomo, Hideki; Shiwa, Yuh; Ono, Kanako; Suzuki, Sadafumi; Satoh, Mamoru; Hitomi, Jiro; Sobue, Kenji; Shimizu, Atsushi (2016-05-19). "Intraindividual dynamics of transcriptome and genome-wide stability of DNA methylation". Scientific Reports. 6: 26424. Bibcode:2016NatSR...626424F. doi:10.1038/srep26424. ISSN 2045-2322. PMC 4872231. PMID 27192970.
  12. Saito, Sakae; Aoki, Yuichi; Tamahara, Toru; Goto, Maki; Matsui, Hiroyuki; Kawashima, Junko; Danjoh, Inaho; Hozawa, Atsushi; Kuriyama, Shinichi; Suzuki, Yoichi; Fuse, Nobuo (2020). "Oral Microbiome Analysis in Prospective Genome Cohort Studies of the Tohoku Medical Megabank Project". Frontiers in Cellular and Infection Microbiology. 10: 604596. doi:10.3389/fcimb.2020.604596. ISSN 2235-2988. PMC 7878372. PMID 33585276.
  13. Nagasaki, Masao; Yasuda, Jun; Katsuoka, Fumiki; Nariai, Naoki; Kojima, Kaname; Kawai, Yosuke; Yamaguchi-Kabata, Yumi; Yokozawa, Junji; Danjoh, Inaho; Saito, Sakae; Sato, Yukuto (2015-08-21). "Rare variant discovery by deep whole-genome sequencing of 1,070 Japanese individuals". Nature Communications. 6: 8018. Bibcode:2015NatCo...6.8018.. doi:10.1038/ncomms9018. ISSN 2041-1723. PMC 4560751. PMID 26292667.
  14. Kawai, Yosuke; Mimori, Takahiro; Kojima, Kaname; Nariai, Naoki; Danjoh, Inaho; Saito, Rumiko; Yasuda, Jun; Yamamoto, Masayuki; Nagasaki, Masao (October 2015). "Japonica array: improved genotype imputation by designing a population-specific SNP array with 1070 Japanese individuals". Journal of Human Genetics. 60 (10): 581–587. doi:10.1038/jhg.2015.68. ISSN 1434-5161. PMC 4635170. PMID 26108142.
  15. Tadaka, Shu; Saigusa, Daisuke; Motoike, Ikuko N.; Inoue, Jin; Aoki, Yuichi; Shirota, Matsuyuki; Koshiba, Seizo; Yamamoto, Masayuki; Kinoshita, Kengo (2018-01-04). "jMorp: Japanese Multi Omics Reference Panel". Nucleic Acids Research. 46 (D1): D551–D557. doi:10.1093/nar/gkx978. ISSN 1362-4962. PMC 5753289. PMID 29069501.
  16. Tadaka, Shu; Katsuoka, Fumiki; Ueki, Masao; Kojima, Kaname; Makino, Satoshi; Saito, Sakae; Otsuki, Akihito; Gocho, Chinatsu; Sakurai-Yageta, Mika; Danjoh, Inaho; Motoike, Ikuko N. (2019). "3.5KJPNv2: an allele frequency panel of 3552 Japanese individuals including the X chromosome". Human Genome Variation. 6: 28. doi:10.1038/s41439-019-0059-5. ISSN 2054-345X. PMC 6581902. PMID 31240104.
  17. Sakurai-Yageta, Mika; Kumada, Kazuki; Gocho, Chinatsu; Makino, Satoshi; Uruno, Akira; Tadaka, Shu; Motoike, Ikuko N.; Kimura, Masae; Ito, Shin; Otsuki, Akihito; Narita, Akira (2021-10-12). "Japonica Array NEO with increased genome-wide coverage and abundant disease risk SNPs". Journal of Biochemistry. 170 (3): 399–410. doi:10.1093/jb/mvab060. ISSN 1756-2651. PMC 8510329. PMID 34131746.
  18. "Genome information for 14,000 people is now available -jMorp 2021 major update-|ToMMo". ToMMo | Tohoku Medical Megabank Organization (in Japanese). 2021-12-27. Retrieved 2022-07-01.
  19. Otsuki, Akihito; Okamura, Yasunobu; Ishida, Noriko; Tadaka, Shu; Takayama, Jun; Kumada, Kazuki; Kawashima, Junko; Taguchi, Keiko; Minegishi, Naoko; Kuriyama, Shinichi; Tamiya, Gen; Kinoshita, Kengo; Katsuoka, Fumiki; Yamamoto, Masayuki (2022-09-20). "Construction of a trio-based structural variation panel utilizing activated T lymphocytes and long-read sequencing technology". Communications Biology. 5 (1): 991. doi:10.1038/s42003-022-03953-1. ISSN 2399-3642. PMC 9489684. PMID 36127505.
  20. "Completion of 50,000 Japanese whole genome sequence analysis and release of whole genome reference panel of 38,000 Japanese individuals (38KJPN)|ToMMo". ToMMo | Tohoku Medical Megabank Organization (in Japanese). 2022-07-19. Retrieved 2022-10-13.
  21. "jMorp". jmorp.megabank.tohoku.ac.jp. Retrieved 2023-07-11.