Offline learning

In machine learning, systems that employ offline learning do not change their approximation of the target function once the initial training phase has been completed. [1]

While in online learning only the set of possible elements is known, in offline learning the learner also knows the identity of the elements and the order in which they will be presented. [2]
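
The distinction can be made concrete in code. The following minimal sketch is illustrative, not drawn from the cited sources: it fits the same running-mean model both ways, so the offline learner sees the whole dataset once and is frozen afterwards, while the online learner revises its estimate after every sample.

    # Minimal sketch (illustrative): the same running-mean model trained
    # offline (batch) versus online (incremental).

    def offline_fit(samples):
        # Offline learning: the full dataset is known up front; the model
        # is fit once and its approximation never changes afterwards.
        return sum(samples) / len(samples)

    class OnlineEstimator:
        # Online learning: samples arrive one at a time, in an order the
        # learner does not choose; the estimate is updated after each one.
        def __init__(self):
            self.n = 0
            self.mean = 0.0

        def update(self, x):
            self.n += 1
            self.mean += (x - self.mean) / self.n  # incremental mean update
            return self.mean

    batch = [2.0, 4.0, 6.0]
    frozen = offline_fit(batch)      # fixed after the training phase

    online = OnlineEstimator()
    for x in batch:
        online.update(x)             # approximation changes per sample

    assert frozen == online.mean == 4.0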

Applications for robotics control

A robot's ability to learn can be framed as filling a table with values. One way to fill it is programming by demonstration, in which a human teacher supplies the values. The demonstration is provided either directly, as a numerical control policy equivalent to a trajectory, or indirectly, as an objective function given in advance. [3]
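
To make the two demonstration forms concrete, here is an illustrative sketch; the states, actions, and function names are hypothetical, not taken from the cited work. A direct demonstration is a recorded trajectory of state-action pairs, while an indirect one is an objective function the robot must later optimize.

    # Illustrative sketch: two ways a human teacher's demonstration can
    # fill the robot's table (names and values are hypothetical).

    # Direct: a numerical control policy, i.e. a recorded trajectory of
    # (state, action) pairs supplied by the teacher.
    direct_demonstration = [
        ((0.0, 0.0), "forward"),
        ((0.5, 0.1), "forward"),
        ((1.0, 0.4), "turn_left"),
    ]
    skill_table = dict(direct_demonstration)  # table filled by demonstration

    # Indirect: an objective function given in advance; the robot must
    # later choose actions that score well under it, not copy a trajectory.
    def objective(state):
        x, y = state
        return -abs(y - 0.2)  # e.g. reward staying near a wall at y = 0.2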

Offline learning works in batch mode: in step 1 the task is demonstrated and stored in the table, and in step 2 the robot reproduces it. [4] This pipeline is slow and inefficient because there is a delay between the behavior demonstration and the skill replay. [5] [6]

A short example illustrates the idea. Suppose the robot is to learn a wall-following task and its internal table is empty. Before the robot is activated in replay mode, a human demonstrator must teach the behavior: the demonstrator controls the robot by teleoperation, and the skill table is generated during this learning step. The process is called offline because the robot's control software does nothing; the human operator merely uses the device as a pointing device for driving along the wall. [6]
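
A minimal sketch of this two-step pipeline might look as follows; the teleoperation log, sensor granularity, and nearest-neighbor lookup are hypothetical stand-ins, not details from the cited experiments.

    # Hedged sketch of the wall-following pipeline; the teleoperation log
    # of (wall distance, steering command) pairs is a hypothetical stand-in.

    skill_table = {}  # empty before teaching

    def learning_step(teleop_log):
        # Step 1 (demonstration): the human drives the robot along the wall
        # by teleoperation; the control software only records the mapping
        # from sensed wall distance to commanded steering.
        for wall_distance, steering in teleop_log:
            skill_table[round(wall_distance, 1)] = steering

    def replay_step(wall_distance):
        # Step 2 (replay): look up the nearest recorded distance and
        # reproduce the demonstrated steering command.
        nearest = min(skill_table, key=lambda d: abs(d - wall_distance))
        return skill_table[nearest]

    # Teaching phase: the operator holds roughly 0.5 m from the wall.
    learning_step([(0.5, 0.0), (0.3, 0.2), (0.7, -0.2)])

    # Replay phase, only after the whole demonstration has been recorded:
    print(replay_step(0.35))  # -> 0.2 (steer away from the wall)

The batch character, and the delay criticized above, are visible in the structure: replay_step cannot run until learning_step has stored the entire demonstration.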

Related Research Articles

Multi-agent system

A multi-agent system is a computerized system composed of multiple interacting intelligent agents. Multi-agent systems can solve problems that are difficult or impossible for an individual agent or a monolithic system to solve. Intelligence may include methodic, functional, procedural approaches, algorithmic search or reinforcement learning.

Task analysis is a fundamental tool of human factors engineering. It entails analyzing how a task is accomplished, including a detailed description of both manual and mental activities, task and element durations, task frequency, task allocation, task complexity, environmental conditions, necessary clothing and equipment, and any other unique factors involved in or required for one or more people to perform a given task.

A blackboard system is an artificial intelligence approach based on the blackboard architectural model, where a common knowledge base, the "blackboard", is iteratively updated by a diverse group of specialist knowledge sources, starting with a problem specification and ending with a solution. Each knowledge source updates the blackboard with a partial solution when its internal constraints match the blackboard state. In this way, the specialists work together to solve the problem. The blackboard model was originally designed as a way to handle complex, ill-defined problems, where the solution is the sum of its parts.

Automated planning and scheduling

Automated planning and scheduling, sometimes denoted as simply AI planning, is a branch of artificial intelligence that concerns the realization of strategies or action sequences, typically for execution by intelligent agents, autonomous robots and unmanned vehicles. Unlike classical control and classification problems, the solutions are complex and must be discovered and optimized in multidimensional space. Planning is also related to decision theory.

Robot learning is a research field at the intersection of machine learning and robotics. It studies techniques allowing a robot to acquire novel skills or adapt to its environment through learning algorithms. The embodiment of the robot, situated in a physical embedding, provides at the same time specific difficulties and opportunities for guiding the learning process.

iCub

iCub is a one metre tall open source robotics humanoid robot testbed for research into human cognition and artificial intelligence.

The following outline is provided as an overview of and topical guide to automation.

In computer science, programming by demonstration (PbD) is an end-user development technique for teaching a computer or a robot new behaviors by demonstrating the task to transfer directly instead of programming it through machine commands.

Adaptable robotics refers to a field of robotics focused on creating robotic systems capable of adjusting their hardware and software components to perform a wide range of tasks while adapting to varying environments. The 1960s introduced robotics into the industrial field. Since then, the need for robots with new forms of actuation, adaptability, sensing and perception, and even the ability to learn gave rise to the field of adaptable robotics. Significant developments such as the PUMA robot, manipulation research, soft robotics, swarm robotics, AI, cobots, bio-inspired approaches, and other ongoing research have advanced the adaptable robotics field tremendously. Adaptable robots are usually associated with their development kit, typically used to create autonomous mobile robots. In some cases, an adaptable kit will still be functional even when certain components break.

In artificial intelligence, apprenticeship learning is the process of learning by observing an expert. It can be viewed as a form of supervised learning, where the training dataset consists of task executions by a demonstration teacher.

Real-time Control System

Real-time Control System (RCS) is a reference model architecture, suitable for many software-intensive, real-time computing control problem domains. It defines the types of functions needed in a real-time intelligent control system, and how these functions relate to each other.

Adaptive collaborative control is the decision-making approach used in hybrid models consisting of finite-state machines with functional models as subcomponents to simulate the behavior of systems formed through the partnerships of multiple agents for the execution of tasks and the development of work products. The term “collaborative control” originated from work developed in the late 1990s and early 2000s by Fong, Thorpe, and Baur (1999). It is important to note that, according to Fong et al., in order for robots to function in collaborative control, they must be self-reliant, aware, and adaptive. In the literature, the adjective “adaptive” is not always shown but is noted in the official sense, as it is an important element of collaborative control. The adaptation of traditional applications of control theory in teleoperations initially sought to reduce the sovereignty of “humans as controllers/robots as tools” and had humans and robots working as peers, collaborating to perform tasks and to achieve common goals. Early implementations of adaptive collaborative control centered on vehicle teleoperation. Recent uses of adaptive collaborative control cover training, analysis, and engineering applications in teleoperations between humans and multiple robots, multiple robots collaborating among themselves, unmanned vehicle control, and fault-tolerant controller design.

Dlib

Dlib is a general purpose cross-platform software library written in the programming language C++. Its design is heavily influenced by ideas from design by contract and component-based software engineering. Thus it is, first and foremost, a set of independent software components. It is open-source software released under a Boost Software License.

Google Brain was a deep learning artificial intelligence research team under the umbrella of Google AI, a research division at Google dedicated to artificial intelligence. Formed in 2011, Google Brain combined open-ended machine learning research with information systems and large-scale computing resources. The team has created tools such as TensorFlow, which allow for neural networks to be used by the public, with multiple internal AI research projects. The team aims to create research opportunities in machine learning and natural language processing. The team was merged into former Google sister company DeepMind to form Google DeepMind in April 2023.

Behavior tree (artificial intelligence, robotics and control)

A behavior tree is a mathematical model of plan execution used in computer science, robotics, control systems and video games. Behavior trees describe switching between a finite set of tasks in a modular fashion. Their strength comes from their ability to create very complex tasks composed of simple tasks, without worrying how the simple tasks are implemented. Behavior trees present some similarities to hierarchical state machines, with the key difference that the main building block of a behavior is a task rather than a state. Their ease of human understanding makes behavior trees less error-prone and very popular in the game developer community. Behavior trees have been shown to generalize several other control architectures.

Cloud robotics is a field of robotics that attempts to invoke cloud technologies such as cloud computing, cloud storage, and other Internet technologies centered on the benefits of converged infrastructure and shared services for robotics. When connected to the cloud, robots can benefit from the powerful computation, storage, and communication resources of modern data centers, which can process and share information from various robots or agents. Humans can also delegate tasks to robots remotely through networks. Cloud computing technologies enable robot systems to be endowed with powerful capabilities while reducing costs. Thus, it is possible to build lightweight, low-cost, smarter robots with an intelligent "brain" in the cloud. The "brain" consists of data centers, knowledge bases, task planners, deep learning, information processing, environment models, communication support, etc.

Deep reinforcement learning

Deep reinforcement learning is a subfield of machine learning that combines reinforcement learning (RL) and deep learning. RL considers the problem of a computational agent learning to make decisions by trial and error. Deep RL incorporates deep learning into the solution, allowing agents to make decisions from unstructured input data without manual engineering of the state space. Deep RL algorithms are able to take in very large inputs and decide what actions to perform to optimize an objective. Deep reinforcement learning has been used for a diverse set of applications including but not limited to robotics, video games, natural language processing, computer vision, education, transportation, finance and healthcare.

Gregory D. Hager

Gregory D. Hager is the Mandell Bellmore Professor of Computer Science and founding director of the Johns Hopkins Malone Center for Engineering in Healthcare at Johns Hopkins University.

Silvia Ferrari

Silvia Ferrari is an Italian-American aerospace engineer. She is John Brancaccio Professor at the Sibley School of Mechanical and Aerospace Engineering at Cornell University and also the director of the Laboratory for Intelligent Systems and Control (LISC) at the same university.

Aude Billard

Aude G. Billard is a Swiss physicist in the fields of machine learning and human-robot interactions. As a full professor at the School of Engineering at Swiss Federal Institute of Technology in Lausanne (EPFL), Billard’s research focuses on applying machine learning to support robot learning through human guidance. Billard’s work on human-robot interactions has been recognized numerous times by the Institute of Electrical and Electronics Engineers (IEEE) and she currently holds a leadership position on the executive committee of the IEEE Robotics and Automation Society (RAS) as the vice president of publication activities.

References

  1. Bishop, Christopher M. (2006). Pattern Recognition and Machine Learning. New York: Springer. ISBN 978-0-387-31073-2.
  2. Ben-David, Shai; Kushilevitz, Eyal; Mansour, Yishay (1997). "Online Learning versus Offline Learning". Machine Learning. 29 (1): 45–63. doi:10.1023/A:1007465907571. ISSN 0885-6125.
  3. Bajcsy, Andrea; Losey, Dylan P.; O'Malley, Marcia K.; Dragan, Anca D. (2017). "Learning robot objectives from physical human interaction". Proceedings of Machine Learning Research. PMLR. 78: 217–226.
  4. Meyer-Delius, Daniel; Beinhofer, Maximilian; Burgard, Wolfram (2012). "Occupancy grid models for robot mapping in changing environments". Twenty-Sixth AAAI Conference on Artificial Intelligence.
  5. Peternel, Luka; Oztop, Erhan; Babic, Jan (2016). "A shared control method for online human-in-the-loop robot learning based on Locally Weighted Regression". 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE. doi:10.1109/IROS.2016.7759574.
  6. Li, Jun; Duckett, Tom (2003). "Robot behavior learning with a dynamically adaptive RBF network: Experiments in offline and online learning". Proc. 2nd Intern. Conf. on Computational Intelligence, Robotics and Autonomous Systems (CIRAS).