Lexical Variant Generation (software)

Lexical Variant Generation
Developer(s) The Lexical Systems Group
Initial release 2002
Stable release lvg2014 / December 13, 2013
Written in Java
Platform Java SE
Type Lexical semantics
License NLM copyright and terms of use

Lexical Variant Generation (lvg) is a suite of command-line utilities that perform lexical transformations on text. Its goal is to generate lexical variants of terms for use in natural language processing of patient clinical documents. [1]
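
As a rough illustration of what generating lexical variants can mean, the following Python sketch produces a few surface variants of a term: a lowercased form, a form with punctuation stripped, and a naive singular or plural form. It is a toy example under simplifying assumptions, not lvg's actual algorithm or interface; the function name `toy_variants` and its rules are hypothetical and chosen only for illustration.

```python
from __future__ import annotations

import string


def toy_variants(term: str) -> set[str]:
    """Return a small set of illustrative lexical variants of `term`.

    Hypothetical helper: the rules below are deliberately naive and are
    not taken from lvg itself.
    """
    variants = {term}

    # Case variant: lowercase the whole term.
    lowered = term.lower()
    variants.add(lowered)

    # Punctuation variant: drop ASCII punctuation characters.
    stripped = lowered.translate(str.maketrans("", "", string.punctuation))
    variants.add(stripped)

    # Naive inflection variant: toggle a trailing "s".
    if stripped.endswith("s") and len(stripped) > 3:
        variants.add(stripped[:-1])
    else:
        variants.add(stripped + "s")

    return variants


if __name__ == "__main__":
    print(sorted(toy_variants("X-Rays")))
```

A real variant generator would rely on a lexicon and linguistic rules rather than string heuristics like these; the sketch only shows the general shape of mapping one input term to several related forms.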

References

  1. "Lexical Tools". lexsrv3.nlm.nih.gov. Retrieved 2018-07-05.