Remote sensing for statistics

Satellite images provide useful information for producing statistics on topics closely tied to the territory, such as agriculture, forestry or land cover in general. The first large project to apply Landsat 1 images for statistics was LACIE (Large Area Crop Inventory Experiment), run by NASA, NOAA and the USDA in 1974–77. [1] [2] Many other application projects on crop area estimation have followed, including the Italian AGRIT project and the MARS project of the Joint Research Centre (JRC) of the European Commission. [3] Forest area and deforestation estimation have also been a frequent target of remote sensing projects, [4] [5] as have land cover and land use. [6]

Accuracy assessment and area estimation

Before producing statistical estimates, satellite images undergo several operations. Some of them can be considered pre-treatment phases, including radiometric calibration and orthorectification, so that the images can be displayed in a Geographic Information System (GIS). Further steps produce layers that are conceptually closer to the statistical variables we want to estimate. If the target is the area of a single crop, such as wheat, we need classified images in which wheat is one of the classes in the legend. The most traditional image classification algorithms work pixel by pixel. Image segmentation for object-based image analysis (OBIA) is essential in other image analysis fields, but is less critical for agricultural and environmental monitoring.
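As an illustration of what "pixel by pixel" means, the following is a minimal sketch of a minimum-distance-to-means classifier, one of the simplest per-pixel rules. The function names, the band values and the class codes (1 for wheat, 2 for other) are assumptions made for the example; operational projects normally use more elaborate classifiers.

```python
import numpy as np

def fit_class_means(train_pixels, train_labels):
    """Mean spectral signature of each class from labelled training pixels."""
    train_pixels = np.asarray(train_pixels, dtype=float)   # shape (n, bands)
    train_labels = np.asarray(train_labels)
    labels = np.unique(train_labels)
    means = np.stack([train_pixels[train_labels == k].mean(axis=0)
                      for k in labels])
    return labels, means

def classify_pixels(image, labels, means):
    """Assign every pixel to the class with the nearest mean signature."""
    image = np.asarray(image, dtype=float)                  # (rows, cols, bands)
    flat = image.reshape(-1, image.shape[-1])
    # Euclidean distance from each pixel to each class mean
    dist = np.linalg.norm(flat[:, None, :] - means[None, :, :], axis=2)
    return labels[dist.argmin(axis=1)].reshape(image.shape[:-1])

# Hypothetical two-band training data (red, near-infrared reflectance)
train_x = [[0.10, 0.50], [0.12, 0.52], [0.30, 0.20], [0.32, 0.22]]
train_y = [1, 1, 2, 2]            # 1 = wheat, 2 = other (invented codes)
labels, means = fit_class_means(train_x, train_y)

image = [[[0.11, 0.49], [0.31, 0.21]]]   # a 1 x 2 pixel image
class_map = classify_pixels(image, labels, means)
```

Each pixel is labelled independently of its neighbours, which is the property that distinguishes this family of methods from object-based approaches.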

Ground truth or reference data to train and validate image classification require a field survey if we are targeting annual crops or individual forest species, but may be substituted by photo-interpretation if we look at wider classes that can be reliably identified on aerial photos or satellite images. Probabilistic sampling is not critical for selecting the training pixels for image classification, but it is necessary for accuracy assessment of the classified images and for area estimation. [7] [8] [9] Additional care is recommended to ensure that training and validation datasets are not spatially correlated. [10]
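One common probabilistic design for validation is a stratified random sample of pixels, with the mapped classes as strata. The sketch below assumes the classified map is available as a 2-D array of integer class codes; the function name and the toy map are hypothetical. It does not address spatial separation from the training data, which would need an additional rule such as a minimum distance to training polygons.

```python
import numpy as np

def stratified_validation_sample(class_map, n_per_class, seed=0):
    """Draw up to n_per_class pixels at random, without replacement,
    from each mapped class. Returns {class_code: [(row, col), ...]}."""
    rng = np.random.default_rng(seed)
    class_map = np.asarray(class_map)
    sample = {}
    for label in np.unique(class_map):
        rows, cols = np.nonzero(class_map == label)
        n = min(n_per_class, rows.size)
        picked = rng.choice(rows.size, size=n, replace=False)
        sample[int(label)] = list(zip(rows[picked].tolist(),
                                      cols[picked].tolist()))
    return sample

# Toy 3 x 3 classified map: 1 = wheat, 0 = other (invented codes)
toy_map = np.array([[0, 0, 1],
                    [1, 1, 0],
                    [0, 1, 1]])
validation_pixels = stratified_validation_sample(toy_map, n_per_class=2)
```

Because every pixel in a stratum has the same known inclusion probability (the sample size divided by the stratum size), the reference labels collected at these locations can support design-based accuracy and area estimates.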

Suppose now that we have classified images or a land cover map produced by visual photo-interpretation, with a legend of mapped classes that suits our purpose, taking again the example of wheat. The straightforward approach is counting the number of pixels classified as wheat and multiplying by the area of each pixel. Many authors have noted that this estimator is generally biased because commission and omission errors in a confusion matrix do not compensate each other. [11] [12] [13]
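The difference between pixel counting and a sample-based estimate can be sketched numerically. The example below assumes a stratified random validation sample with the mapped classes as strata, and a confusion matrix whose rows are map classes and whose columns are reference classes; the estimated proportion of reference class j is then the sum over map classes i of W_i · n_ij / n_i, where W_i is the mapped share of class i. All figures are invented for illustration.

```python
import numpy as np

def pixel_count_area(map_pixels, pixel_area_ha):
    """Naive estimate: mapped pixels per class times the pixel area."""
    return np.asarray(map_pixels, dtype=float) * pixel_area_ha

def stratified_area(map_pixels, confusion, pixel_area_ha):
    """Area per reference class from a confusion matrix obtained on a
    stratified random sample (rows = map class, columns = reference class)."""
    map_pixels = np.asarray(map_pixels, dtype=float)
    confusion = np.asarray(confusion, dtype=float)
    weights = map_pixels / map_pixels.sum()              # W_i
    row_props = confusion / confusion.sum(axis=1, keepdims=True)  # n_ij / n_i
    ref_props = weights @ row_props                      # estimated p_j
    return ref_props * map_pixels.sum() * pixel_area_ha

# Invented figures: classes are (wheat, other); 30 m pixels = 0.09 ha
map_pixels = [200_000, 800_000]
confusion = [[80, 20],     # sampled in mapped wheat: 80 wheat, 20 other
             [40, 360]]    # sampled in mapped other: 40 wheat, 360 other

naive = pixel_count_area(map_pixels, 0.09)
adjusted = stratified_area(map_pixels, confusion, 0.09)
```

With these invented figures, pixel counting gives 18,000 ha of wheat, while the sample-based estimate is 21,600 ha: the wheat omitted in the large "other" class outweighs the wheat committed in error, so the two kinds of error do not cancel.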

The main strength of classified satellite images, or of other indicators computed on satellite images, is that they provide cheap information on the whole target area or most of it. This information usually correlates well with the target variable (ground truth), which is usually expensive to observe in an unbiased and accurate way. The target variable can therefore be observed on a probabilistic sample selected from an area sampling frame. Traditional survey methodology provides different methods to combine accurate information on a sample with less accurate, but exhaustive, data for a covariable or proxy that is cheaper to collect. For agricultural statistics, field surveys are usually required, while photo-interpretation may be better for land cover classes that can be reliably identified on aerial photographs or high-resolution satellite images. Additional uncertainty can appear because of imperfect reference data (ground truth or similar). [14] [15]

Some options are the ratio estimator, the regression estimator, [16] calibration estimators [17] and small area estimators. [6]
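The first two options can be sketched in a few lines. The example assumes a simple random sample of area segments in which y is the crop proportion observed on the ground and x is the proportion of pixels classified as that crop, with the mean of x known for the whole region from the classified image. The ratio estimate is ȳ · X̄ / x̄ and the regression estimate is ȳ + b (X̄ − x̄), where b is the sample regression slope. The function names and the data are hypothetical, and variance estimation is omitted.

```python
import numpy as np

def ratio_estimate(y, x, x_pop_mean):
    """Ratio estimator of the population mean of y."""
    y, x = np.asarray(y, dtype=float), np.asarray(x, dtype=float)
    return y.mean() * x_pop_mean / x.mean()

def regression_estimate(y, x, x_pop_mean):
    """Regression estimator of the population mean of y."""
    y, x = np.asarray(y, dtype=float), np.asarray(x, dtype=float)
    slope = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)
    return y.mean() + slope * (x_pop_mean - x.mean())

# Invented data for six sampled segments
ground = [0.22, 0.35, 0.10, 0.41, 0.28, 0.15]      # y: field survey
classified = [0.18, 0.30, 0.12, 0.36, 0.25, 0.14]  # x: classified image
x_region_mean = 0.24    # mean of x over all segments in the region

wheat_share_ratio = ratio_estimate(ground, classified, x_region_mean)
wheat_share_reg = regression_estimate(ground, classified, x_region_mean)
```

The stronger the correlation between the classified image and the ground observations, the more such estimators gain in precision over the sample mean of y alone.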

If we target other variables, such as crop yield or leaf area, we may need different indicators computed from the images, such as the NDVI, a good proxy for chlorophyll activity. [18]
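The NDVI is computed per pixel from the red and near-infrared bands as (NIR − Red) / (NIR + Red). A minimal sketch, assuming the two bands are already available as reflectance arrays of the same shape (the function name and sample values are illustrative):

```python
import numpy as np

def ndvi(nir, red):
    """Normalized difference vegetation index, (NIR - Red) / (NIR + Red).
    Pixels where NIR + Red is zero are returned as NaN."""
    nir = np.asarray(nir, dtype=float)
    red = np.asarray(red, dtype=float)
    total = nir + red
    out = np.full(total.shape, np.nan)
    np.divide(nir - red, total, out=out, where=total != 0)
    return out

# Invented reflectances: dense vegetation, bare soil, no-data pixel
nir_band = np.array([0.50, 0.30, 0.0])
red_band = np.array([0.05, 0.25, 0.0])
index = ndvi(nir_band, red_band)
```

Values lie between −1 and 1, with dense green vegetation giving high values because it reflects strongly in the near-infrared and absorbs red light.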

References

  1. Houston, A.H. "Use of satellite data in agricultural surveys". Communications in Statistics. Theory and Methods (23): 2857–2880.
  2. Allen, J.D. "A Look at the Remote Sensing Applications Program of the National Agricultural Statistics Service". Journal of Official Statistics. 6 (4): 393–409.
  3. Taylor, J (1997). Regional Crop Inventories in Europe Assisted by Remote Sensing: 1988-1993. Synthesis Report. Luxembourg: Office for Publications of the EC.
  4. Foody, G.M. (1994). "Estimation of tropical forest extent and regenerative stage using remotely sensed data". Journal of Biogeography. 21 (3): 223–244. Bibcode:1994JBiog..21..223F. doi:10.2307/2845527. JSTOR 2845527.
  5. Achard, F (2002). "Determination of deforestation rates of the world's humid tropical forests". Science. 297 (5583): 999–1002. Bibcode:2002Sci...297..999A. doi:10.1126/science.1070656. PMID 12169731.
  6. Ambrosio Flores, L (2000). "Land cover estimation in small areas using ground survey and remote sensing". Remote Sensing of Environment. 74 (2): 240–248. Bibcode:2000RSEnv..74..240F. doi:10.1016/S0034-4257(00)00114-0.
  7. Congalton, Russell G.; Green, Kass (2019-01-25). Assessing the Accuracy of Remotely Sensed Data: Principles and Practices, Third Edition (3 ed.). Boca Raton: CRC Press. doi:10.1201/9780429052729. ISBN 978-0-429-05272-9.
  8. Stehman, S. (2013). "Estimating Area from an Accuracy Assessment Error Matrix". Remote Sensing of Environment (132): 202–211.
  9. Stehman, S. (2019). "Key issues in rigorous accuracy assessment of land cover products". Remote Sensing of Environment (231).
  10. Zhen, Z (2013). "Impact of training and validation sample selection on classification accuracy and accuracy assessment when using reference polygons in object-based classification". International Journal of Remote Sensing. 34 (19): 6914–6930.
  11. Czaplewski, R.L. "Misclassification bias in areal estimates". Photogrammetric Engineering and Remote Sensing (39): 189–192.
  12. Bauer, M.E. (1978). "Area estimation of crops by digital analysis of Landsat data". Photogrammetric Engineering and Remote Sensing (44): 1033–1043.
  13. Olofsson, P. (2014). "Good practices for estimating area and assessing accuracy of land change". Remote Sensing of Environment. 148: 42–57. Bibcode:2014RSEnv.148...42O. doi:10.1016/j.rse.2014.02.015.
  14. McRoberts, R (2018). "The effects of imperfect reference data on remote sensing-assisted estimators of land cover class proportions". ISPRS Journal of Photogrammetry and Remote Sensing (142): 292–300.
  15. Foody, G.M. (2010). "Assessing the accuracy of land cover change with imperfect ground reference data". Remote Sensing of Environment (114): 2271–2285.
  16. Sannier, C (2014). "Using the regression estimator with Landsat data to estimate proportion forest cover and net proportion deforestation in Gabon". Remote Sensing of Environment (151): 138–148.
  17. Gallego, F.J. (2004). "Remote sensing and land cover area estimation". International Journal of Remote Sensing. 25 (5): 3019–3047. Bibcode:2004IJRS...25.3019G. doi:10.1080/01431160310001619607.
  18. Carfagna, E. (2005). "Using remote sensing for agricultural statistics". International Statistical Review. 73 (3): 389–404. doi:10.1111/j.1751-5823.2005.tb00155.x.