Offline learning


Offline learning is a machine learning training approach in which a model is trained on a fixed dataset that is not updated during the learning process. [1] This dataset is collected beforehand, and the learning typically occurs in batch mode. Once the model is trained, it can make predictions on new, unseen data.
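
As an illustration, the following is a minimal sketch of offline (batch) training in Python. The fixed dataset, the least-squares model, and the variable names are illustrative assumptions, not taken from the cited source:

    import numpy as np

    # Fixed dataset collected beforehand; it is not updated during training.
    X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]])
    y = np.array([1.0, 2.0, 3.0, 4.0])

    # Offline (batch) training: fit the model once on the whole dataset.
    weights, *_ = np.linalg.lstsq(X, y, rcond=None)

    # After training, the fixed model makes predictions on new, unseen data.
    X_new = np.array([[1.5, 0.5]])
    print(X_new @ weights)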


In online learning, the learner knows only the set of possible elements, whereas in offline learning it also knows in advance the order in which they will be presented. [2]

Applications for robotics control

The ability of a robot to learn can be modeled as filling a table with values. One option for doing so is programming by demonstration, in which a human teacher supplies the values in the table. The demonstration is provided either directly, as a numerical control policy equivalent to a trajectory, or indirectly, as an objective function that is given in advance. [3]
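
A minimal sketch of these two demonstration forms in Python; the one-dimensional state, the quadratic objective, and all names are illustrative assumptions rather than the formulation of the cited work:

    # Direct demonstration: a numerical control policy given as a trajectory,
    # i.e. a list of (time, state) samples recorded from the human teacher.
    trajectory = [(0.0, 0.10), (0.5, 0.22), (1.0, 0.35), (1.5, 0.50)]

    # Indirect demonstration: an objective function given in advance;
    # the robot later optimizes its behavior against this function.
    def objective(state, target=0.5):
        """Cost of deviating from the desired state."""
        return (state - target) ** 2

    # Either source can be used to fill the robot's internal table,
    # here keyed by time step for the direct case.
    table = {round(t, 1): state for t, state in trajectory}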

Offline learning works in batch mode: in step 1 the task is demonstrated and stored in the table, and in step 2 the task is reproduced by the robot. [4] The pipeline is slow and inefficient because there is a delay between the behavior demonstration and the skill replay. [5] [6]

A short example helps to illustrate the idea. Suppose the robot should learn a wall-following task and its internal table is empty. Before the robot is activated in replay mode, the human demonstrator has to teach the behavior: the demonstrator controls the robot by teleoperation, and the skill table is generated during this learning step. The process is called offline because the robot's control software does nothing during the demonstration; the device is used by the human operator only as a pointing device for driving along the wall. [6]
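
A minimal sketch of this two-step pipeline for the wall-following example, assuming a single wall-distance sensor and a steering command per time step; the discretization, the nearest-bin lookup, and the function names are illustrative assumptions, not the method of the cited papers:

    def discretize(distance, bin_size=0.05):
        """Map a wall-distance reading to a coarse table key."""
        return round(distance / bin_size)

    # Step 1 (demonstration): the human teleoperates the robot along the wall;
    # each (sensor reading, steering command) pair is stored in the skill table.
    def record_demonstration(teleop_log):
        table = {}
        for distance, steering in teleop_log:
            table[discretize(distance)] = steering
        return table

    # Step 2 (replay): the robot reproduces the skill by looking up the
    # steering command stored for the nearest demonstrated state.
    def replay_step(table, distance):
        key = discretize(distance)
        if key in table:
            return table[key]
        nearest = min(table, key=lambda k: abs(k - key))
        return table[nearest]

    # Example: a short teleoperation log, then one autonomous replay step.
    demo = [(0.30, 0.0), (0.25, 0.1), (0.40, -0.1)]
    skill_table = record_demonstration(demo)
    print(replay_step(skill_table, 0.27))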


Related Research Articles

Computerized batch processing is a method of running software programs called jobs in batches automatically. While users are required to submit the jobs, no other interaction by the user is required to process the batch. Batches may automatically be run at scheduled times as well as being run contingent on the availability of computer resources.

Reinforcement learning (RL) is an interdisciplinary area of machine learning and optimal control concerned with how an intelligent agent should take actions in a dynamic environment in order to maximize a reward signal. Reinforcement learning is one of the three basic machine learning paradigms, alongside supervised learning and unsupervised learning.

<span class="mw-page-title-main">Learning classifier system</span> Paradigm of rule-based machine learning methods

Learning classifier systems, or LCS, are a paradigm of rule-based machine learning methods that combine a discovery component with a learning component. Learning classifier systems seek to identify a set of context-dependent rules that collectively store and apply knowledge in a piecewise manner in order to make predictions. This approach allows complex solution spaces to be broken up into smaller, simpler parts for the reinforcement learning component studied in artificial intelligence research.

A blackboard system is an artificial intelligence approach based on the blackboard architectural model, where a common knowledge base, the "blackboard", is iteratively updated by a diverse group of specialist knowledge sources, starting with a problem specification and ending with a solution. Each knowledge source updates the blackboard with a partial solution when its internal constraints match the blackboard state. In this way, the specialists work together to solve the problem. The blackboard model was originally designed as a way to handle complex, ill-defined problems, where the solution is the sum of its parts.

Automated planning and scheduling, sometimes denoted as simply AI planning, is a branch of artificial intelligence that concerns the realization of strategies or action sequences, typically for execution by intelligent agents, autonomous robots and unmanned vehicles. Unlike classical control and classification problems, the solutions are complex and must be discovered and optimized in multidimensional space. Planning is also related to decision theory.

iCub – Open source robotics humanoid robot testbed

iCub is a one-meter-tall open-source humanoid robot testbed for research into human cognition and artificial intelligence.

In computer science, programming by demonstration (PbD) is an end-user development technique for teaching a computer or a robot new behaviors by demonstrating the task to transfer directly instead of programming it through machine commands.

Adaptable robotics refers to a field of robotics with a focus on creating robotic systems capable of adjusting their hardware and software components to perform a wide range of tasks while adapting to varying environments. Robotics entered industrial use in the 1960s. Since then, the need for robots with new forms of actuation, adaptability, sensing and perception, and even the ability to learn gave rise to the field of adaptable robotics. Significant developments such as the PUMA robot, manipulation research, soft robotics, swarm robotics, AI, cobots, bio-inspired approaches, and other ongoing research have advanced the field tremendously. Adaptable robots are usually associated with a development kit, typically used to create autonomous mobile robots. In some cases, an adaptable kit will still be functional even when certain components break.

Activity recognition aims to recognize the actions and goals of one or more agents from a series of observations on the agents' actions and the environmental conditions. Since the 1980s, this research field has captured the attention of several computer science communities due to its strength in providing personalized support for many different applications and its connection to many different fields of study such as medicine, human-computer interaction, or sociology.

In artificial intelligence, apprenticeship learning is the process of learning by observing an expert. It can be viewed as a form of supervised learning, where the training dataset consists of task executions by a demonstration teacher.

<span class="mw-page-title-main">Real-time Control System</span> Reference model architecture

Real-time Control System (RCS) is a reference model architecture, suitable for many software-intensive, real-time computing control problem domains. It defines the types of functions needed in a real-time intelligent control system, and how these functions relate to each other.

Adaptive collaborative control is the decision-making approach used in hybrid models consisting of finite-state machines with functional models as subcomponents to simulate the behavior of systems formed through the partnerships of multiple agents for the execution of tasks and the development of work products. The term “collaborative control” originated from work developed in the late 1990s and early 2000s by Fong, Thorpe, and Baur (1999). According to Fong et al., in order for robots to function in collaborative control, they must be self-reliant, aware, and adaptive. In the literature, the adjective “adaptive” is not always shown but is noted in the official sense, as it is an important element of collaborative control. The adaptation of traditional applications of control theory in teleoperations sought initially to reduce the sovereignty of “humans as controllers/robots as tools” and had humans and robots working as peers, collaborating to perform tasks and to achieve common goals. Early implementations of adaptive collaborative control centered on vehicle teleoperation. Recent uses of adaptive collaborative control cover training, analysis, and engineering applications in teleoperations between humans and multiple robots, multiple robots collaborating among themselves, unmanned vehicle control, and fault-tolerant controller design.

<span class="mw-page-title-main">Dlib</span> Cross-platform software library

Dlib is a general purpose cross-platform software library written in the programming language C++. Its design is heavily influenced by ideas from design by contract and component-based software engineering. Thus it is, first and foremost, a set of independent software components. It is open-source software released under a Boost Software License.

<span class="mw-page-title-main">Behavior tree (artificial intelligence, robotics and control)</span> Mathematical model of plan execution

A behavior tree is a mathematical model of plan execution used in computer science, robotics, control systems and video games. Behavior trees describe switching between a finite set of tasks in a modular fashion. Their strength comes from their ability to create very complex tasks composed of simple tasks, without worrying how the simple tasks are implemented. Behavior trees present some similarities to hierarchical state machines, with the key difference that the main building block of a behavior tree is a task rather than a state. Their ease of human understanding makes behavior trees less error-prone and very popular in the game developer community. Behavior trees have been shown to generalize several other control architectures.

Cloud robotics is a field of robotics that attempts to invoke cloud technologies such as cloud computing, cloud storage, and other Internet technologies centered on the benefits of converged infrastructure and shared services for robotics. When connected to the cloud, robots can benefit from the powerful computation, storage, and communication resources of modern data centers, which can process and share information from various robots or agents. Humans can also delegate tasks to robots remotely through networks. Cloud computing technologies enable robot systems to be endowed with powerful capabilities whilst reducing costs. Thus, it is possible to build lightweight, low-cost, smarter robots with an intelligent "brain" in the cloud. The "brain" consists of data centers, a knowledge base, task planners, deep learning, information processing, environment models, communication support, etc.

Deep reinforcement learning is a subfield of machine learning that combines reinforcement learning (RL) and deep learning. RL considers the problem of a computational agent learning to make decisions by trial and error. Deep RL incorporates deep learning into the solution, allowing agents to make decisions from unstructured input data without manual engineering of the state space. Deep RL algorithms are able to take in very large inputs and decide what actions to perform to optimize an objective. Deep reinforcement learning has been used for a diverse set of applications including but not limited to robotics, video games, natural language processing, computer vision, education, transportation, finance and healthcare.

<span class="mw-page-title-main">Gregory D. Hager</span> American computer scientist

Gregory D. Hager is the Mandell Bellmore Professor of Computer Science and founding director of the Johns Hopkins Malone Center for Engineering in Healthcare at Johns Hopkins University.

<span class="mw-page-title-main">Silvia Ferrari</span> American aerospace engineer

Silvia Ferrari is an Italian-American aerospace engineer. She is John Brancaccio Professor at the Sibley School of Mechanical and Aerospace Engineering at Cornell University and also the director of the Laboratory for Intelligent Systems and Control (LISC) at the same university.

<span class="mw-page-title-main">Aude Billard</span> Swiss physicist

Aude G. Billard is a Swiss physicist in the fields of machine learning and human-robot interactions. As a full professor at the School of Engineering at Swiss Federal Institute of Technology in Lausanne (EPFL), Billard’s research focuses on applying machine learning to support robot learning through human guidance. Billard’s work on human-robot interactions has been recognized numerous times by the Institute of Electrical and Electronics Engineers (IEEE) and she currently holds a leadership position on the executive committee of the IEEE Robotics and Automation Society (RAS) as the vice president of publication activities.

Imitation learning is a paradigm in reinforcement learning, where an agent learns to perform a task by supervised learning from expert demonstrations. It is also called learning from demonstration and apprenticeship learning.

References

  1. Bishop, Christopher M. (2006). Pattern Recognition and Machine Learning. New York: Springer. ISBN 978-0-387-31073-2.
  2. Ben-David, Shai; Kushilevitz, Eyal; Mansour, Yishay (1997). "Online Learning versus Offline Learning". Machine Learning. 29 (1): 45–63. doi:10.1023/A:1007465907571. ISSN 0885-6125.
  3. Bajcsy, Andrea; Losey, Dylan P.; O'Malley, Marcia K.; Dragan, Anca D. (2017). "Learning robot objectives from physical human interaction". Proceedings of Machine Learning Research. 78. PMLR: 217–226.
  4. Meyer-Delius, Daniel; Beinhofer, Maximilian; Burgard, Wolfram (2012). "Occupancy grid models for robot mapping in changing environments". Twenty-Sixth AAAI Conference on Artificial Intelligence.
  5. Peternel, Luka; Oztop, Erhan; Babic, Jan (2016). "A shared control method for online human-in-the-loop robot learning based on Locally Weighted Regression". 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE. doi:10.1109/iros.2016.7759574.
  6. Li, Jun; Duckett, Tom (2003). "Robot behavior learning with a dynamically adaptive RBF network: Experiments in offline and online learning". Proc. 2nd Intern. Conf. on Comput. Intelligence, Robotics and Autonomous Systems (CIRAS).