Ground truth

Ground truth is information that is known to be real or true, provided by direct observation and measurement (i.e. empirical evidence) as opposed to information provided by inference.

Etymology

The Oxford English Dictionary (s.v. ground truth) records the use of the word Groundtruth in the sense of 'fundamental truth' from Henry Ellison's poem "The Siberian Exile's Tale", published in 1833. [1]

Statistics and machine learning

"Ground truth" may be seen as a conceptual term relative to the knowledge of the truth concerning a specific question. It is the ideal expected result. [2] This is used in statistical models to prove or disprove research hypotheses. The term "ground truthing" refers to the process of gathering the proper objective (provable) data for this test. Compare with gold standard. For example, suppose we are testing a stereo vision system to see how well it can estimate 3D positions. The "ground truth" might be the positions given by a laser rangefinder which is known to be much more accurate than the camera system.

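The stereo vision example can be made concrete with a minimal sketch. The point coordinates below are hypothetical, and root-mean-square error is only one of several reasonable ways to summarize how far the camera estimates fall from the rangefinder reference.

```python
import math

def position_errors(estimates, ground_truth):
    """Euclidean distance between each estimated 3D point and its reference point."""
    if len(estimates) != len(ground_truth):
        raise ValueError("estimates and ground_truth must have the same length")
    return [math.dist(e, g) for e, g in zip(estimates, ground_truth)]

def rmse(errors):
    """Root-mean-square of a list of error magnitudes."""
    return math.sqrt(sum(err * err for err in errors) / len(errors))

# Hypothetical positions in metres: the rangefinder readings play the role
# of ground truth, the stereo readings are the estimates under test.
rangefinder = [(0.0, 0.0, 2.0), (1.0, 0.0, 4.0), (0.0, 1.0, 6.0)]
stereo = [(0.0, 0.0, 2.1), (1.0, 0.0, 4.2), (0.0, 1.0, 6.2)]

errors = position_errors(stereo, rangefinder)
print("per-point error (m):", errors)
print("RMSE (m):", rmse(errors))
```

The rangefinder is itself only an approximation of the true positions; it serves as ground truth here because it is assumed to be much more accurate than the system being tested.
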
Bayesian spam filtering is a common example of supervised learning. In this system, the algorithm is manually taught the differences between spam and non-spam. This depends on the ground truth of the messages used to train the algorithm: inaccuracies in the ground truth carry over into inaccuracies in the resulting spam/non-spam verdicts.
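
The dependence on label quality can be illustrated with a toy naive Bayes classifier. This is a sketch under stated assumptions, not a production filter: the messages are invented, the tokenizer is a plain whitespace split, and the "noisy" set simply swaps every label to make the effect obvious.

```python
import math
from collections import Counter

def train(messages):
    """messages: list of (text, label) pairs with label in {"spam", "ham"}.
    Assumes both labels occur at least once in the training set."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    doc_counts = Counter()
    for text, label in messages:
        doc_counts[label] += 1
        word_counts[label].update(text.lower().split())
    vocab = set(word_counts["spam"]) | set(word_counts["ham"])
    return word_counts, doc_counts, vocab

def classify(model, text):
    """Return the label with the highest log posterior score."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    scores = {}
    for label in ("spam", "ham"):
        # Log prior plus log likelihoods with add-one (Laplace) smoothing.
        score = math.log(doc_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word in vocab:  # words never seen in training are ignored
                count = word_counts[label][word]
                score += math.log((count + 1) / (total_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)

# Hypothetical training messages with correct labels (the ground truth).
clean = [
    ("win free prize now", "spam"),
    ("free prize claim now", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch on monday", "ham"),
]
# The same messages with every label swapped, simulating bad ground truth.
noisy = [(text, "ham" if label == "spam" else "spam") for text, label in clean]

print(classify(train(clean), "free prize"))
print(classify(train(noisy), "free prize"))
```

The classifier has no way to detect the mislabelling; it reproduces whatever the training labels say.
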

Remote sensing

In remote sensing, "ground truth" refers to information collected at the imaged location. Ground truth allows image data to be related to real features and materials on the ground. The collection of ground truth data enables calibration of remote-sensing data, and aids in the interpretation and analysis of what is being sensed. Examples include cartography, meteorology, analysis of aerial photographs, satellite imagery and other techniques in which data are gathered at a distance.

More specifically, ground truth may refer to a process in which "pixels" [3] on a satellite image are compared to what is imaged (at the time of capture) in order to verify the contents of the "pixels" in the image (noting that the concept of "pixel" is imaging-system-dependent). In the case of a classified image, this comparison helps determine the accuracy of the classification performed by the remote sensing system, which in turn helps minimize classification error.

Ground truthing is usually done on site, correlating what is known with surface observations and measurements of various properties of the features of the ground resolution cells under study in the remotely sensed digital image. The process also involves taking geographic coordinates of the ground resolution cell with GPS technology and comparing those with the coordinates of the "pixel" being studied, as provided by the remote sensing software, to understand and analyze the location errors and how they may affect a particular study.

Ground truth is important in the initial supervised classification of an image. When the identity and location of land cover types are known through a combination of field work, maps, and personal experience, these areas are known as training sites. The spectral characteristics of these areas are used to train the remote sensing software using decision rules for classifying the rest of the image. These decision rules, such as Maximum Likelihood Classification, Parallelepiped Classification, and Minimum Distance Classification, offer different techniques to classify an image. Additional ground truth sites allow the remote sensor to establish an error matrix that validates the accuracy of the classification method used. Different classification methods may have different percentages of error for a given classification project. It is important that the remote sensor chooses a classification method that works best with the number of classifications used while providing the least amount of error.
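
As a rough sketch of this workflow, the following uses the Minimum Distance rule: each pixel is assigned to the class whose mean training spectrum is closest. The two-band reflectance values, class names, and function names are all hypothetical, and real remote sensing software typically works on full raster arrays rather than short lists.

```python
import math

def class_means(training_sites):
    """training_sites: dict mapping class name -> list of spectral vectors."""
    means = {}
    for name, pixels in training_sites.items():
        n_bands = len(pixels[0])
        means[name] = tuple(
            sum(p[b] for p in pixels) / len(pixels) for b in range(n_bands)
        )
    return means

def classify_pixel(pixel, means):
    """Minimum distance rule: pick the class with the nearest mean spectrum."""
    return min(means, key=lambda name: math.dist(pixel, means[name]))

def error_matrix(validation, means):
    """validation: list of (spectral vector, ground-truth class) pairs.
    Returns counts indexed as matrix[mapped_class][reference_class]."""
    classes = sorted(means)
    matrix = {m: {r: 0 for r in classes} for m in classes}
    for pixel, reference in validation:
        matrix[classify_pixel(pixel, means)][reference] += 1
    return matrix

def overall_accuracy(matrix):
    """Share of validation pixels on the diagonal of the error matrix."""
    correct = sum(matrix[c][c] for c in matrix)
    total = sum(sum(row.values()) for row in matrix.values())
    return correct / total

# Hypothetical (red, near-infrared) reflectances at training sites.
training = {
    "water": [(0.05, 0.02), (0.06, 0.03)],
    "forest": [(0.04, 0.40), (0.05, 0.45)],
    "bare soil": [(0.25, 0.30), (0.30, 0.35)],
}
# Additional ground truth sites, held back for validation only.
validation = [
    ((0.05, 0.03), "water"),
    ((0.05, 0.42), "forest"),
    ((0.28, 0.31), "bare soil"),
    ((0.15, 0.36), "bare soil"),  # spectrally closer to the forest mean
]

means = class_means(training)
matrix = error_matrix(validation, means)
print(matrix)
print("overall accuracy:", overall_accuracy(matrix))
```

Keeping the validation sites separate from the training sites matters: an error matrix built from the same pixels used for training would tend to overstate accuracy.
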

Ground truth also helps with atmospheric correction. Because the radiation recorded by a satellite sensor has to pass through the atmosphere, the resulting images can be distorted by atmospheric absorption. Ground truth can therefore help identify objects in satellite images more reliably.

Errors of commission

An example of an error of commission is when a pixel reports the presence of a feature (such as a tree) that, in reality, is absent (no tree is actually present). Ground truthing ensures that the error matrices have a higher accuracy percentage than would be the case if no pixels were ground-truthed. The commission error is the complement of the user's accuracy, i.e. Commission Error = 1 - user's accuracy.

Errors of omission

An example of an error of omission is when pixels of a certain type, for example, maple trees, are not classified as maple trees. The process of ground-truthing helps to ensure that the pixel is classified correctly and the error matrices are more accurate. The omission error is the complement of the producer's accuracy, i.e. Omission Error = 1 - producer's accuracy.
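
Both error types can be read off an error matrix. The sketch below assumes rows hold the mapped (classified) class and columns hold the reference (ground truth) class, which is one common layout; the counts are invented for illustration.

```python
def commission_error(matrix, cls):
    """1 - user's accuracy: share of pixels mapped as cls that are not cls."""
    row_total = sum(matrix[cls].values())
    return 1 - matrix[cls][cls] / row_total

def omission_error(matrix, cls):
    """1 - producer's accuracy: share of true cls pixels mapped as something else."""
    column_total = sum(row[cls] for row in matrix.values())
    return 1 - matrix[cls][cls] / column_total

# Hypothetical counts, indexed as matrix[mapped_class][reference_class].
matrix = {
    "maple": {"maple": 40, "other": 10},
    "other": {"maple": 20, "other": 30},
}

# 50 pixels were mapped as maple, 40 of them correctly.
print("commission error (maple):", commission_error(matrix, "maple"))
# 60 pixels are maple on the ground, 40 of them were mapped as maple.
print("omission error (maple):", omission_error(matrix, "maple"))
```

The two measures answer different questions: commission error concerns how far a map user can trust a "maple" pixel, while omission error concerns how much of the real maple cover the map misses.
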

Geographical information systems

The ground truth representations are the GIS elements (fields or objects), and each element is representing (by a cartographic process) a real world object.

In GIS, spatial data is modeled as fields (as in remote sensing raster images) or as objects (as in vector map representations). [4] Both are modeled from the real world (also called geographical reality), typically by a cartographic process (illustrated).

Geographic information and positioning technologies such as GIS, GPS, and GNSS have become so widespread that the term "ground truth" has taken on special meaning in that context. If the location coordinates returned by a location method such as GPS are an estimate of a location, then the "ground truth" is the actual location on Earth. A smartphone might return a set of estimated location coordinates such as 43.87870,-103.45901. The ground truth being estimated by those coordinates is the tip of George Washington's nose on Mount Rushmore. The accuracy of the estimate is the maximum distance between the location coordinates and the ground truth. We could say in this case that the estimate accuracy is 10 meters, meaning that the point on Earth represented by the location coordinates is thought to be within 10 meters of George's nose, the ground truth. Informally, the coordinates indicate where we think George Washington's nose is located, and the ground truth is where it really is. In practice a smartphone or hand-held GPS unit is routinely able to estimate the ground truth within 6–10 meters. Specialized instruments can reduce GPS measurement error to under a centimeter. [5]
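
To make the accuracy check concrete, the distance between an estimate and a surveyed reference point can be computed with the haversine formula. In this sketch the estimate is the coordinate pair quoted above, while the ground truth coordinates are a hypothetical placeholder, not a surveyed position; the formula treats the Earth as a sphere, which is adequate at this scale but not for centimeter-level work.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius; spherical approximation

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

estimate = (43.87870, -103.45901)      # coordinates returned by the device
ground_truth = (43.87875, -103.45905)  # hypothetical surveyed position

error_m = haversine_m(*estimate, *ground_truth)
claimed_accuracy_m = 10.0

print(f"distance to ground truth: {error_m:.2f} m")
print("within claimed accuracy:", error_m <= claimed_accuracy_m)
```
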

Military usage

US military slang uses "ground truth" to refer to the facts comprising a tactical situation, as opposed to intelligence reports, mission plans, and other descriptions reflecting the conative or policy-based projections of the military-industrial complex. The term appears in the title of the Iraq War documentary film The Ground Truth (2006), and also in military publications, for example Stars and Stripes saying: "Stripes decided to figure out what the ground truth was in Iraq." [citation needed]

References

  1. Ellison, Henry (1833). Mad moments, or first verse attempts by a born natural. p. 362. Retrieved 2014-10-24. As the Groundtruth of her own Existence it must be regarded, thro' Him in its highest, purest Aspect shown!
  2. Lemoigne, Yves; Caner, Alessandra (2006). Molecular Imaging: Computer Reconstruction and Practice.
  3. Fisher, P (1997). "The Pixel: A Snare and a Delusion". International Journal of Remote Sensing. 18 (15): 679–685. Bibcode:1997IJRS...18..679F. doi:10.1080/014311697219015.
  4. Goodchild, M. (1992). "Geographical data modeling". Computers & Geosciences. 18 (4): 401–408.
  5. Pickles, John (1995). Ground Truth: The Social Implications of Geographical Information Systems. p. 179.