Proactive learning

Last updated

Proactive learning [1] is a generalization of active learning designed to relax unrealistic assumptions and thereby reach practical applications.

Active learning is a form of learning in which teaching strives to involve students in the learning process more directly than in other methods.

"Active learning seeks to select the most informative unlabeled instances and ask an omniscient oracle for their labels, so as to retrain a learning algorithm maximizing accuracy. However, the oracle is assumed to be infallible (never wrong), indefatigable (always answers), individual (only one oracle), and insensitive to costs (always free or always charges the same)." [1]

Machine learning Scientific study of algorithms and statistical models that computer systems use to perform tasks without explicit instructions

Machine learning (ML) is the scientific study of algorithms and statistical models that computer systems use to perform a specific task without using explicit instructions, relying on patterns and inference instead. It is seen as a subset of artificial intelligence. Machine learning algorithms build a mathematical model based on sample data, known as "training data", in order to make predictions or decisions without being explicitly programmed to perform the task. Machine learning algorithms are used in a wide variety of applications, such as email filtering and computer vision, where it is difficult or infeasible to develop a conventional algorithm for effectively performing the task.

"In real life, it is possible and more general to have multiple sources of information with differing reliabilities or areas of expertise. Active learning also assumes that the single oracle is perfect, always providing a correct answer when requested. In reality, though, an "oracle" (if we generalize the term to mean any source of expert information) may be incorrect (fallible) with a probability that should be a function of the difficulty of the question. Moreover, an oracle may be reluctant – it may refuse to answer if it is too uncertain or too busy. Finally, active learning presumes the oracle is either free or charges uniform cost in label elicitation. Such an assumption is naive since cost is likely to be regulated by difficulty (amount of work required to formulate an answer) or other factors." [1]

Proactive learning relaxes all four of these assumptions, relying on a decision-theoretic approach to jointly select the optimal oracle and instance, by casting the problem as a utility optimization problem subject to a budget constraint.

In mathematics and computer science, an optimization problem is the problem of finding the best solution from all feasible solutions. Optimization problems can be divided into two categories depending on whether the variables are continuous or discrete. An optimization problem with discrete variables is known as a discrete optimization. In a discrete optimization problem, we are looking for an object such as an integer, permutation or graph from a countable set. Problems with continuous variables include constrained problems and multimodal problems.

Budget constraint A budget constraint represents all the combinations of goods and services that a consumer may purchase given current prices within his or her given income.

In economics, a budget constraint represents all the combinations of goods and services that a consumer may purchase given current prices within his or her given income. Consumer theory uses the concepts of a budget constraint and a preference map to analyze consumer choices. Both concepts have a ready graphical representation in the two-good case.

Related Research Articles

In computational complexity theory, bounded-error probabilistic polynomial time (BPP) is the class of decision problems solvable by a probabilistic Turing machine in polynomial time with an error probability bounded away from 1/3 for all instances. BPP is one of the largest practical classes of problems, meaning most problems of interest in BPP have efficient probabilistic algorithms that can be run quickly on real modern machines. BPP also contains P, the class of problems solvable in polynomial time with a deterministic machine, since a deterministic machine is a special case of a probabilistic machine.

Java (programming language) Object-oriented programming language

Java is a general-purpose programming language that is class-based, object-oriented, and designed to have as few implementation dependencies as possible. It is intended to let application developers write once, run anywhere (WORA), meaning that compiled Java code can run on all platforms that support Java without the need for recompilation. Java applications are typically compiled to bytecode that can run on any Java virtual machine (JVM) regardless of the underlying computer architecture. The syntax of Java is similar to C and C++, but it has fewer low-level facilities than either of them. As of 2019, Java was one of the most popular programming languages in use according to GitHub, particularly for client-server web applications, with a reported 9 million developers.

In computational complexity theory, the complexity class #P is the set of the counting problems associated with the decision problems in the set NP. More formally, #P is the class of function problems of the form "compute f(x)", where f is the number of accepting paths of a nondeterministic Turing machine running in polynomial time. Unlike most well-known complexity classes, it is not a class of decision problems but a class of function problems.

Defensive programming is a form of defensive design intended to ensure the continuing function of a piece of software under unforeseen circumstances. Defensive programming practices are often used where high availability, safety or security is needed.

Pattern recognition branch of machine learning

Pattern recognition is the automated recognition of patterns and regularities in data. Pattern recognition is closely related to artificial intelligence and machine learning, together with applications such as data mining and knowledge discovery in databases (KDD), and is often used interchangeably with these terms. However, these are distinguished: machine learning is one approach to pattern recognition, while other approaches include hand-crafted rules or heuristics; and pattern recognition is one approach to artificial intelligence, while other approaches include symbolic artificial intelligence. A modern definition of pattern recognition is:

The field of pattern recognition is concerned with the automatic discovery of regularities in data through the use of computer algorithms and with the use of these regularities to take actions such as classifying the data into different categories.

Oracle Database is a proprietary multi-model database management system produced and marketed by Oracle Corporation.

In physics or engineering education, a Fermi problem, Fermi quiz, Fermi question, Fermi estimate, or order estimation is an estimation problem designed to teach dimensional analysis or approximation, and such a problem is usually a back-of-the-envelope calculation. The estimation technique is named after physicist Enrico Fermi as he was known for his ability to make good approximate calculations with little or no actual data. Fermi problems typically involve making justified guesses about quantities and their variance or lower and upper bounds.

A screen reader is a form of assistive technology (AT) which is essential to people who are blind, as well as useful to people who are visually impaired, illiterate, or have a learning disability. Screen readers are software applications that attempt to convey what people with normal eyesight see on a display to their users via non-visual means, like text-to-speech, sound icons, or a Braille device. They do this by applying a wide variety of techniques that include for example interacting with dedicated accessibility APIs, using various operating system features and employing hooking techniques.

A complex question, trick question, multiple question or plurium interrogationum is a question that has a presupposition that is complex. The presupposition is a proposition that is presumed to be acceptable to the respondent when the question is asked. The respondent becomes committed to this proposition when he gives any direct answer. The presupposition is called "complex" because it is a conjunctive proposition, a disjunctive proposition, or a conditional proposition. It could also be another type of proposition that contains some logical connective in a way that makes it have several parts that are component propositions.

Statistical classification in supervised learning

In machine learning and statistics, classification is the problem of identifying to which of a set of categories (sub-populations) a new observation belongs, on the basis of a training set of data containing observations whose category membership is known. Examples are assigning a given email to the "spam" or "non-spam" class, and assigning a diagnosis to a given patient based on observed characteristics of the patient. Classification is an example of pattern recognition.

Semi-supervised learning class of machine learning techniques

Semi-supervised learning is a class of machine learning tasks and techniques that also make use of unlabeled data for training – typically a small amount of labeled data with a large amount of unlabeled data. Semi-supervised learning falls between unsupervised learning and supervised learning. Many machine-learning researchers have found that unlabeled data, when used in conjunction with a small amount of labeled data, can produce considerable improvement in learning accuracy. The acquisition of labeled data for a learning problem often requires a skilled human agent or a physical experiment. The cost associated with the labeling process thus may render a fully labeled training set infeasible, whereas acquisition of unlabeled data is relatively inexpensive. In such situations, semi-supervised learning can be of great practical value. Semi-supervised learning is also of theoretical interest in machine learning and as a model for human learning.

Branch and cut is a method of combinatorial optimization for solving integer linear programs (ILPs), that is, linear programming (LP) problems where some or all the unknowns are restricted to integer values. Branch and cut involves running a branch and bound algorithm and using cutting planes to tighten the linear programming relaxations. Note that if cuts are only used to tighten the initial LP relaxation, the algorithm is called cut and branch.

Query optimization is a function of many relational database management systems. The query optimizer attempts to determine the most efficient way to execute a given query by considering the possible query plans.

In computability theory, the halting problem is the problem of determining, from a description of an arbitrary computer program and an input, whether the program will finish running, or continue to run forever.

Active learning (machine learning) topic in machine learning

Active learning is a special case of machine learning in which a learning algorithm is able to interactively query the user to obtain the desired outputs at new data points. In statistics literature it is sometimes also called optimal experimental design.

Dragomir R. Radev is a Yale University professor of computer science working on natural language processing and information retrieval. He previously served as a University of Michigan computer science professor and Columbia University computer science adjunct professor. Radev serves as Member of the Advisory Board of Lawyaw.

Adversarial machine learning is a technique employed in the field of machine learning which attempts to fool models through malicious input. This technique can be applied for a variety of reasons, the most common being to attack or cause a malfunction in standard machine learning models.

Multiple instance learning

In machine learning, multiple-instance learning (MIL) is a type of supervised learning. Instead of receiving a set of instances which are individually labeled, the learner receives a set of labeled bags, each containing many instances. In the simple case of multiple-instance binary classification, a bag may be labeled negative if all the instances in it are negative. On the other hand, a bag is labeled positive if there is at least one instance in it which is positive. From a collection of labeled bags, the learner tries to either (i) induce a concept that will label individual instances correctly or (ii) learn how to label bags without inducing the concept.

Algorithm selection is a meta-algorithmic technique to choose an algorithm from a portfolio on an instance-by-instance basis. It is motivated by the observation that on many practical problems, algorithms have different performances. That is, while one algorithm performs well on some instances, it performs poorly on others and vice versa for another algorithm. If we can identify when to use which algorithm, we can get the best of both worlds and improve overall performance. This is what algorithm selection aims to do. The only prerequisite for applying algorithm selection techniques is that there exists a set of complementary algorithms.

References

  1. 1 2 3 Donmez, P., Carbonell, J.G.: Proactive Learning: Cost-Sensitive Active Learning with Multiple Imperfect Oracles, in Proceedings of the 17th ACM Conference on Information and Knowledge Management (CIKM '08), Napa Valley 2008. https://www.cs.cmu.edu/~pinard/Papers/cikm0613-donmez.pdf