Geographical cluster

A geographical cluster is a localized anomaly, usually an excess of something given the distribution or variation of something else. [1] It is often expressed as an incidence rate that is unusually high, meaning that there is more of some variable than would be expected. Examples include a local excess disease rate, a crime hot spot, areas of high unemployment, accident blackspots, unusually high positive residuals from a model, high concentrations of flora or fauna, areas with high levels of creative activity, [2] and physical features or events such as earthquake epicenters. [citation needed]

Identifying these extreme regions can be useful because they may reveal implicit geographical associations with other variables of interest. Pattern detection through the identification of such clusters is a simple and generic form of geographical analysis with applications in many contexts. The emphasis is on localized clustering or patterning because this is where the most useful information is likely to be found. [citation needed]

A geographical cluster differs from a high concentration in that it is generally second order: it factors in the distribution of something else, for example the population at risk. [citation needed]
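The distinction can be illustrated with a minimal sketch. It assumes, purely for illustration, that the "something else" is a population count per area and that case counts follow a Poisson distribution under a uniform overall rate; the function names and the figures are hypothetical and do not come from the cited sources.

```python
import math


def poisson_upper_tail(observed, expected):
    """Return P(X >= observed) for X ~ Poisson(expected)."""
    if observed <= 0:
        return 1.0
    cdf = 0.0
    term = math.exp(-expected)  # P(X = 0)
    for s in range(observed):
        cdf += term
        term *= expected / (s + 1)  # P(X = s + 1) from P(X = s)
    return max(0.0, 1.0 - cdf)


def local_excess(cases, population):
    """Compare each area's count with the count expected from its population.

    Assumes every area has a positive population.
    """
    overall_rate = sum(cases) / sum(population)
    results = []
    for observed, people in zip(cases, population):
        expected = overall_rate * people
        results.append({
            "observed": observed,
            "expected": expected,
            "ratio": observed / expected,
            "p_value": poisson_upper_tail(observed, expected),
        })
    return results


# Hypothetical data: three areas with very different populations.
cases = [30, 12, 5]
population = [10000, 2000, 1000]
results = local_excess(cases, population)

for area, r in enumerate(results):
    print(area, r["observed"], round(r["expected"], 2), round(r["ratio"], 2))
```

In this made-up example the first area has the highest concentration of cases, yet its observed-to-expected ratio is below 1, while the second area has fewer cases but a clear excess relative to its population. Only the second would be a candidate cluster in the second-order sense.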

Geographical cluster detection

Identifying geographical clusters can be an important stage in a geographical analysis. Mapping the locations of unusual concentrations may help identify their causes. Techniques include the Geographical Analysis Machine and Besag and Newell's cluster detection method. [3]
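The article only names these techniques. As a rough illustration of the general idea, the following is a simplified sketch in the style of Besag and Newell's approach, not the published procedure. For each area it accumulates nearest neighbouring areas until a chosen number of cases `k` is reached, then asks how surprising it is to see `k` cases in so small a population. The tuple layout, the Poisson assumption, the function names and the data are all assumptions made for this example.

```python
import math


def poisson_upper_tail(k, lam):
    """Return P(X >= k) for X ~ Poisson(lam)."""
    if k <= 0:
        return 1.0
    cdf = 0.0
    term = math.exp(-lam)  # P(X = 0)
    for s in range(k):
        cdf += term
        term *= lam / (s + 1)
    return max(0.0, 1.0 - cdf)


def besag_newell_style(areas, k):
    """areas: list of (x, y, cases, population) tuples (hypothetical layout).

    Returns one p-value per area, or None if fewer than k cases exist overall.
    """
    total_cases = sum(a[2] for a in areas)
    total_pop = sum(a[3] for a in areas)
    rate = total_cases / total_pop
    p_values = []
    for xi, yi, _, _ in areas:
        # Order all areas by distance from the current one (itself first).
        order = sorted(areas, key=lambda a: math.hypot(a[0] - xi, a[1] - yi))
        cum_cases = 0
        cum_pop = 0
        p = None
        for _, _, c, n in order:
            cum_cases += c
            cum_pop += n
            if cum_cases >= k:
                # Chance of at least k cases in this much population
                # if cases were spread at the overall rate.
                p = poisson_upper_tail(k, rate * cum_pop)
                break
        p_values.append(p)
    return p_values


# Hypothetical areas along a line; most cases sit in the two leftmost areas.
areas = [
    (0, 0, 6, 1000),
    (1, 0, 5, 1000),
    (5, 0, 1, 1000),
    (6, 0, 0, 1000),
    (10, 0, 1, 1000),
]
p_values = besag_newell_style(areas, k=8)
for area, p in zip(areas, p_values):
    print(area[:2], round(p, 3))
```

Small p-values mark areas around which cases accumulate faster than the overall rate would suggest. In practice the choice of `k` and the handling of multiple testing across areas both matter, and this sketch addresses neither.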

References

  1. Turton, Ian; Openshaw, Stan (February 25, 1998). "What is a cluster?". Centre for Computational Geography. Archived from the original on October 7, 1999. Retrieved April 19, 2011.
  2. Borowiecki, Karol Jan; Dahl, Christian Møller (January 2021). "What makes an artist? The evolution and clustering of creative activity in the US since 1850" (PDF). Regional Science and Urban Economics. 86: 103614. doi:10.1016/j.regsciurbeco.2020.103614. S2CID 228879785.
  3. Fotheringham, A. Stewart; Zhan, F. Benjamin (September 3, 2010). "A Comparison Of Three Exploratory Methods for Cluster Detection in Spatial Point Patterns". Geographical Analysis. 28 (3): 200–218. doi:10.1111/j.1538-4632.1996.tb00931.x.