Soil inference system

Inference is the process of deriving logical conclusions from empirical evidence and prior knowledge rather than from direct observation. Soil inference system (SINFERS) is the term proposed by McBratney et al. (2002) for a knowledge base that infers soil properties and populates digital soil databases. [1] SINFERS takes measurements with a given level of certainty and infers unknown data with minimal uncertainty by means of logically linked predictive functions. In a non-spatial context, these predictive functions are referred to as pedotransfer functions (PTFs). Pedotransfer functions relate basic soil properties to other soil properties that are more difficult or expensive to measure, by means of regression and various data mining tools. The basic assumption underlying SINFERS is that if one knows or is able to predict the fundamental properties of a soil, one should be able to infer all other physical and chemical properties using PTFs. Crucial to the operation of SINFERS are reliable inputs, the ability to link basic soil information, and the quantification of uncertainty.
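As a rough illustration of how a single pedotransfer function can carry uncertainty from its input to its output, the following Python sketch uses Monte Carlo sampling. The function `hypothetical_ptf`, its coefficients, and the residual error term are invented for illustration only; they are not taken from McBratney et al. (2002) or from any published PTF.

```python
import math
import random
import statistics


def hypothetical_ptf(organic_carbon: float) -> float:
    """Illustrative PTF: bulk density (g/cm^3) from organic carbon (%).

    The functional form and coefficients are assumptions made for this sketch.
    """
    return 1.0 + 0.6 * math.exp(-0.3 * organic_carbon)


def propagate(mean: float, sd: float, residual_sd: float,
              n: int = 10_000, seed: int = 0) -> tuple[float, float]:
    """Estimate the mean and standard deviation of the PTF output.

    The input is treated as normally distributed (an assumption), and the
    PTF's own prediction error is added as independent normal noise.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    draws = [
        hypothetical_ptf(rng.gauss(mean, sd)) + rng.gauss(0.0, residual_sd)
        for _ in range(n)
    ]
    return statistics.fmean(draws), statistics.stdev(draws)


# Organic carbon measured as 2.0 % with a standard deviation of 0.2 %.
bd_mean, bd_sd = propagate(mean=2.0, sd=0.2, residual_sd=0.05)
print(f"bulk density: {bd_mean:.3f} +/- {bd_sd:.3f} g/cm^3")
```

The output uncertainty combines two sources: the uncertainty of the measured input and the prediction error of the function itself. Both have to be tracked if inferred values are later fed into further PTFs.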

Current Status

During 2007–2009, Grant Tranter of the University of Sydney, Australia, in collaboration with Jason Morris of Morris Technical Solutions, USA, completed a working prototype of SINFERS. This implementation of the SINFERS concept uses the Jess rule engine to pattern-match object representations of subsets of soil properties in working memory against the argument lists of known pedotransfer functions. The SINFERS knowledge base knows which PTF rules to apply and how to choose the most certain computed values. SINFERS computes new properties not only from the original input set, but also from all newly inferred properties. Some of the design aspects of this application were presented at October Rules Fest 2009.
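The inference loop described above can be sketched in Python as a simple forward-chaining pass. This is not the Jess-based prototype; it is a minimal sketch under stated assumptions: uncertainty is represented as a standard deviation, "most certain" means the smallest standard deviation, and the three PTFs, their names, and their coefficients are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Estimate:
    value: float
    sd: float  # standard deviation as the uncertainty measure (assumption)


@dataclass(frozen=True)
class PTF:
    inputs: Tuple[str, ...]  # names of the properties the function needs
    output: str              # name of the property it predicts
    func: Callable[..., Estimate]


def infer(known: Dict[str, Estimate], ptfs: List[PTF],
          max_passes: int = 100) -> Dict[str, Estimate]:
    """Apply every PTF whose inputs are available until nothing improves."""
    facts = dict(known)
    measured = frozenset(known)  # measured values are never overwritten
    for _ in range(max_passes):
        changed = False
        for ptf in ptfs:
            if ptf.output in measured or not all(n in facts for n in ptf.inputs):
                continue
            candidate = ptf.func(*(facts[n] for n in ptf.inputs))
            current = facts.get(ptf.output)
            # Keep the candidate only if it is more certain than what we have.
            if current is None or candidate.sd < current.sd:
                facts[ptf.output] = candidate
                changed = True
        if not changed:
            break
    return facts


# Hypothetical PTFs; coefficients and error terms are invented.
def cec_from_clay(clay: Estimate) -> Estimate:
    return Estimate(0.5 * clay.value, math.hypot(0.5 * clay.sd, 2.0))


def cec_from_clay_and_oc(clay: Estimate, oc: Estimate) -> Estimate:
    sd = math.sqrt((0.4 * clay.sd) ** 2 + (3.0 * oc.sd) ** 2 + 1.0 ** 2)
    return Estimate(0.4 * clay.value + 3.0 * oc.value, sd)


def buffer_from_cec(cec: Estimate) -> Estimate:
    return Estimate(2.0 * cec.value, 2.0 * cec.sd)


ptfs = [
    PTF(("cec",), "buffer_index", buffer_from_cec),
    PTF(("clay",), "cec", cec_from_clay),
    PTF(("clay", "organic_carbon"), "cec", cec_from_clay_and_oc),
]
measurements = {"clay": Estimate(30.0, 2.0), "organic_carbon": Estimate(1.5, 0.1)}
result = infer(measurements, ptfs)
for name, est in sorted(result.items()):
    print(f"{name}: {est.value:.2f} (sd {est.sd:.2f})")
```

In this sketch two functions compete to predict `cec`, and the one with the smaller output standard deviation wins. The winning value is then used to derive `buffer_index`, which shows a newly inferred property feeding a further inference.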

In November 2009, formal work on the project was suspended for more than two years. In March 2012, the research was resumed by Jason Morris, now at the University of Sydney. As of January 2013, the latest progress had been presented in Sydney (April 2012), [2] and the general concepts had been discussed in Italy (July 2012) [3] and in Tasmania (December 2012). [4]

Current project scope includes deployment as an interactive web application and exposure as a web service. Public release is scheduled for late 2013.

Related Research Articles

In logic, fuzzy logic is a form of many-valued logic in which the truth value of variables may be any real number between 0 and 1. It is employed to handle the concept of partial truth, where the truth value may range between completely true and completely false. By contrast, in Boolean logic, the truth values of variables may only be the integer values 0 or 1.

Machine learning (ML) is the study of computer algorithms that can improve automatically through experience and by the use of data. It is seen as a part of artificial intelligence. Machine learning algorithms build a model based on sample data, known as "training data", in order to make predictions or decisions without being explicitly programmed to do so. Machine learning algorithms are used in a wide variety of applications, such as in medicine, email filtering, speech recognition, and computer vision, where it is difficult or unfeasible to develop conventional algorithms to perform the needed tasks.

Inferences are steps in reasoning, moving from premises to logical consequences; etymologically, the word infer means to "carry forward". Inference is traditionally divided into deduction and induction, a distinction that in Europe dates at least to Aristotle. Deduction is inference deriving logical conclusions from premises known or assumed to be true, with the laws of valid inference being studied in logic. Induction is inference from particular premises to a universal conclusion. A third type of inference is sometimes distinguished, notably by Charles Sanders Peirce, who contradistinguished abduction from induction.

Sensitivity analysis is the study of how the uncertainty in the output of a mathematical model or system can be divided and allocated to different sources of uncertainty in its inputs. A related practice is uncertainty analysis, which has a greater focus on uncertainty quantification and propagation of uncertainty; ideally, uncertainty and sensitivity analysis should be run in tandem.

In logic, statistical inference, and supervised learning, transduction or transductive inference is reasoning from observed, specific (training) cases to specific (test) cases. In contrast, induction is reasoning from observed training cases to general rules, which are then applied to the test cases. The distinction is most interesting in cases where the predictions of the transductive model are not achievable by any inductive model. Note that this is caused by transductive inference on different test sets producing mutually inconsistent predictions.

Neurophilosophy or philosophy of neuroscience is the interdisciplinary study of neuroscience and philosophy that explores the relevance of neuroscientific studies to the arguments traditionally categorized as philosophy of mind. The philosophy of neuroscience attempts to clarify neuroscientific methods and results using the conceptual rigor and methods of philosophy of science.

In soil science, pedotransfer functions (PTF) are predictive functions of certain soil properties using data from soil surveys.

Digital Soil Mapping (DSM) in soil science, also referred to as predictive soil mapping or pedometric mapping, is the computer-assisted production of digital maps of soil types and soil properties. Soil mapping, in general, involves the creation and population of spatial soil information by the use of field and laboratory observational methods coupled with spatial and non-spatial soil inference systems.

A soil map is a geographical representation showing the diversity of soil types and/or soil properties in an area of interest. It is typically the end result of a soil survey. Soil maps are most commonly used for land evaluation, spatial planning, agricultural extension, environmental protection and similar projects. Traditional soil maps typically show only the general distribution of soils, accompanied by the soil survey report. Many new soil maps are derived using digital soil mapping techniques. Such maps are typically richer in context and show higher spatial detail than traditional soil maps. Soil maps produced using (geo)statistical techniques also include an estimate of the model uncertainty.

Soil functions are general capabilities of soils that are important for various agricultural, environmental, nature protection, landscape architecture and urban applications. Soil performs many functions, including those related to natural ecosystems, agricultural productivity, and environmental quality, and it serves as a source of raw materials and as a base for buildings. Six key soil functions are:

  1. Food and other biomass production
  2. Environmental interaction
  3. Biological habitat and gene pool
  4. Source of raw materials
  5. Physical and cultural heritage
  6. Platform for man-made structures

Gianni Bellocchi is a researcher in agricultural and related sciences. He is credited with the development of approaches and tools for the validation of estimates and measurements. His introduction of fuzzy logic in the context of validation is often considered his most significant contribution to the field of model and method validation.

Information, in a general sense, is processed, organised and structured data. It provides context for data and enables decision making. For example, a single customer’s sale at a restaurant is data – this becomes information when the business is able to identify the most popular or least popular dish.

In data mining, cluster-weighted modeling (CWM) is an algorithm-based approach to non-linear prediction of outputs from inputs based on density estimation using a set of models (clusters) that are each notionally appropriate in a sub-region of the input space. The overall approach works in the joint input-output space, and an initial version was proposed by Neil Gershenfeld.

Uncertain inference was first described by C. J. van Rijsbergen as a way to formally define the relationship between a query and a document in information retrieval. This formalization is a logical implication with an attached measure of uncertainty.

A probability box is a characterization of an uncertain number consisting of both aleatoric and epistemic uncertainties that is often used in risk analysis or quantitative uncertainty modeling where numerical calculations must be performed. Probability bounds analysis is used to make arithmetic and logical calculations with p-boxes.

Probability bounds analysis (PBA) is a collection of methods of uncertainty propagation for making qualitative and quantitative calculations in the face of uncertainties of various kinds. It is used to project partial information about random variables and other quantities through mathematical expressions. For instance, it computes sure bounds on the distribution of a sum, product, or more complex function, given only sure bounds on the distributions of the inputs. Such bounds are called probability boxes, and constrain cumulative probability distributions.

Pedometric mapping, or statistical soil mapping, is the data-driven generation of soil property and class maps based on statistical methods. The main objective of pedometric mapping is to predict values of some soil variable at unobserved locations and to assess the uncertainty of that estimate using statistical inference, i.e. statistically optimal approaches. From the application point of view, the main objective of soil mapping is to accurately predict the response of a soil-plant ecosystem to various soil management strategies. In other words, the main objective of pedometric mapping is to generate maps of soil properties and soil classes that can be used to feed other environmental models or for decision making. Pedometric mapping is largely based on applying geostatistics in soil science and other statistical methods used in pedometrics.

In applied statistics and geostatistics, regression-kriging (RK) is a spatial prediction technique that combines a regression of the dependent variable on auxiliary variables with interpolation (kriging) of the regression residuals. It is mathematically equivalent to the interpolation method variously called universal kriging and kriging with external drift, where auxiliary predictors are used directly to solve the kriging weights.

Semantic queries allow for queries and analytics of associative and contextual nature. Semantic queries enable the retrieval of both explicitly and implicitly derived information based on syntactic, semantic and structural information contained in data. They are designed to deliver precise results or to answer more fuzzy and wide open questions through pattern matching and digital reasoning.

This glossary of artificial intelligence is a list of definitions of terms and concepts relevant to the study of artificial intelligence, its sub-disciplines, and related fields. Related glossaries include Glossary of computer science, Glossary of robotics, and Glossary of machine vision.

References

  1. McBratney, A.B., Minasny, B., Cattle, S.R., Vervoort, R.W., 2002. From pedotransfer functions to soil inference systems. Geoderma 109, 41–73.
  2. The Role of Soil Inference Systems in Digital Soil Assessments. Digital Soil Assessments and Beyond: Proceedings of the 5th Global Workshop on Digital Soil Mapping 2012, Sydney, Australia.
  3. http://www.scienzadelsuolo.org/_docs/Atti_Eurosoil_2012.pdf
  4. Joint SSA and NZSSS Soil Science Conference, Hobart, Tasmania, 2–7 December 2012.