GemIdent is an interactive image recognition program that identifies regions of interest in images and photographs. It is specifically designed for images with few colors, where the objects of interest look alike with small variation. Examples include color image segmentation of oranges in a photograph and of stained cells in microscopic images.
GemIdent also packages data analysis tools to investigate spatial relationships among the objects identified.
GemIdent was developed at Stanford University by Adam Kapelner from June 2006 until January 2007 in the lab of Dr. Peter Lee under the tutelage of Professor Susan Holmes. [1] The concept was inspired by data from Kohrt et al., [2] who analyzed immune profiles of lymph nodes in breast cancer patients. Hence, GemIdent works well when identifying cells in immunohistochemistry (IHC)-stained tissue imaged via automated light microscopy when the nuclear background stain and membrane/cytoplasmic stain are well-defined. In 2008, it was adapted to support multispectral imaging techniques. [3]
GemIdent uses supervised learning to perform automated identification of regions of interest in the images. The user must therefore do a substantial amount of work up front: first supplying the relevant colors, then pointing out examples of the objects or regions themselves, as well as negative examples (training set creation).
When a user clicks on a pixel, many scores are generated from the surrounding color information via Mahalanobis Ring Score attribute generation (see the Journal of Statistical Software (JSS) paper for a detailed exposition). These scores are then used to build a random forest machine-learning classifier, which then classifies the pixels in any given image.
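The general idea of ring-based color scores can be illustrated with a minimal sketch. It is not GemIdent's actual implementation: the function names, the ring definition, the choice to average distances over each ring, and the toy image are all assumptions made for illustration. The sketch estimates a color's mean and covariance from user-supplied samples, then reports, for each ring radius around a clicked pixel, the mean Mahalanobis distance of the ring's pixels to that color. In a full pipeline, vectors like these would be the attributes passed to a classifier such as a random forest, which is not shown here.

```python
import math

def mean_and_cov(samples):
    """Mean vector and 3x3 sample covariance of a list of RGB tuples."""
    n = len(samples)
    mean = [sum(s[i] for s in samples) / n for i in range(3)]
    cov = [[sum((s[i] - mean[i]) * (s[j] - mean[j]) for s in samples) / (n - 1)
            for j in range(3)] for i in range(3)]
    return mean, cov

def invert3(m):
    """Inverse of a 3x3 matrix via the adjugate (assumes it is non-singular)."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    adj = [[e * i - f * h, c * h - b * i, b * f - c * e],
           [f * g - d * i, a * i - c * g, c * d - a * f],
           [d * h - e * g, b * g - a * h, a * e - b * d]]
    return [[x / det for x in row] for row in adj]

def mahalanobis(pixel, mean, inv_cov):
    """Mahalanobis distance from an RGB pixel to a color distribution."""
    diff = [pixel[i] - mean[i] for i in range(3)]
    q = sum(diff[i] * inv_cov[i][j] * diff[j]
            for i in range(3) for j in range(3))
    return math.sqrt(max(q, 0.0))

def ring_offsets(radius):
    """Integer (dx, dy) offsets whose rounded distance from the origin is radius."""
    return [(dx, dy)
            for dx in range(-radius, radius + 1)
            for dy in range(-radius, radius + 1)
            if round(math.hypot(dx, dy)) == radius]

def ring_scores(image, x, y, mean, inv_cov, max_radius):
    """One score per ring: mean distance to the color over in-bounds ring pixels."""
    h, w = len(image), len(image[0])
    scores = []
    for r in range(max_radius + 1):
        dists = [mahalanobis(image[y + dy][x + dx], mean, inv_cov)
                 for dx, dy in ring_offsets(r)
                 if 0 <= x + dx < w and 0 <= y + dy < h]
        scores.append(sum(dists) / len(dists) if dists else 0.0)
    return scores

# Hypothetical "orange" color samples supplied by a user.
orange_samples = [(250, 140, 10), (240, 150, 20), (255, 130, 5),
                  (245, 145, 30), (235, 135, 15)]
mean, cov = mean_and_cov(orange_samples)
inv_cov = invert3(cov)

# Toy 5x5 image: a 3x3 orange patch in the middle of a green background.
ORANGE, GREEN = (245, 140, 16), (30, 160, 40)
image = [[ORANGE if max(abs(col - 2), abs(row - 2)) <= 1 else GREEN
          for col in range(5)] for row in range(5)]

scores = ring_scores(image, 2, 2, mean, inv_cov, max_radius=2)
print(scores)  # rings 0 and 1 lie on the orange patch, ring 2 on the background
```

Low scores on the inner rings and a high score on the outer ring are what would distinguish the center of a small orange object from a background pixel in this toy setting.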
After classification, there may be mistakes. The user can return to training and point out the specific mistakes and then reclassify. These training-classifying-retraining-reclassifying iterations (considered interactive boosting) can result in a highly accurate segmentation.
In 2010, Setiadi et al. [4] analyzed histological sections of lymph nodes, looking at spatial densities of B and T cells, and noted that "cell numbers do not capture the full range of information encoded within tissues".
The Java source code is now open source under GPL2. [5]
The raw photograph (left), a superimposed mask showing the pixel classification results (center), and the photograph marked with the centroids of the objects of interest, the oranges (right).
The raw microscopic image of a stained lymph node (left) from the Kohrt study, [2] a superimposed mask showing the pixel classification results (center), and the image marked with the centroids of the objects of interest, the cancer nuclei (right).
This example illustrates GemIdent's ability to find multiple phenotypes in the same image: the raw microscopic image of a stained lymph node (top left) from the Kohrt study, [2] a superimposed mask showing the pixel classification results (top right), and finally the image marked with the centroids of the objects of interest - the cancer nuclei (in green stars), the T-cells (in yellow stars), and non-specific background nuclei (in cyan stars).
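The step from a pixel classification mask to object centroids, as shown in the figures above, can be sketched as follows. This is one common approach, connected-component labeling followed by averaging pixel coordinates, and is offered as an assumption rather than a description of how GemIdent locates centroids. The function name `blob_centroids` and the toy mask are hypothetical.

```python
from collections import deque

def blob_centroids(mask):
    """Return the (row, col) centroid of each 4-connected blob of True pixels."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    centroids = []
    for r in range(h):
        for c in range(w):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill collects every pixel of this blob.
            queue = deque([(r, c)])
            seen[r][c] = True
            pixels = []
            while queue:
                pr, pc = queue.popleft()
                pixels.append((pr, pc))
                for nr, nc in ((pr - 1, pc), (pr + 1, pc),
                               (pr, pc - 1), (pr, pc + 1)):
                    if (0 <= nr < h and 0 <= nc < w
                            and mask[nr][nc] and not seen[nr][nc]):
                        seen[nr][nc] = True
                        queue.append((nr, nc))
            centroids.append((sum(p[0] for p in pixels) / len(pixels),
                              sum(p[1] for p in pixels) / len(pixels)))
    return centroids

# Toy mask: '#' marks pixels classified as belonging to an object.
rows = ["##....",
        "##....",
        "....#.",
        "...###",
        "....#."]
mask = [[ch == "#" for ch in row] for row in rows]

centroids = blob_centroids(mask)
print(centroids)  # [(0.5, 0.5), (3.0, 4.0)]
```

Real classification masks are noisier than this toy example, so a practical version would likely need to discard very small blobs and split touching objects.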
The command-line data analysis and visualization interface in action analyzing results of a classification of a lymph node from the Kohrt study. [2] The histogram displays the distribution of distances from T-cells to neighboring cancer cells. The binary image of cancer membrane is the result of a pixel-only classification. The open PDF document is the autogenerated report of the analysis which includes a thumbnail view of the entire lymph node, counts and Type I error rates for all phenotypes, as well as a transcript of the analyses performed.
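A distance analysis of the kind described in this caption can be approximated with a short sketch. It assumes that centroids are available as (x, y) coordinates and bins the distance from each T-cell to its nearest cancer cell. The function names, coordinates, and bin width are hypothetical and do not reflect GemIdent's own analysis commands.

```python
import math
from collections import Counter

def nearest_distances(sources, targets):
    """Distance from each source point to its closest target point."""
    return [min(math.dist(s, t) for t in targets) for s in sources]

def histogram(values, bin_width):
    """Count values per bin, keyed by (bin_start, bin_end); empty bins are omitted."""
    counts = Counter(int(v // bin_width) for v in values)
    return {(b * bin_width, (b + 1) * bin_width): counts[b]
            for b in sorted(counts)}

# Hypothetical centroids, in pixels.
t_cells = [(0, 0), (10, 0), (3, 4)]
cancer_cells = [(0, 3), (10, 12)]

distances = nearest_distances(t_cells, cancer_cells)
bins = histogram(distances, bin_width=5)
print(bins)  # {(0, 5): 2, (10, 15): 1}
```

The brute-force search compares every T-cell with every cancer cell, which is acceptable for a sketch but would likely need a spatial index for a whole lymph node.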