Programming by demonstration

In computer science, programming by demonstration (PbD) is an end-user development technique for teaching a computer or a robot new behaviors by directly demonstrating the task to be transferred, instead of programming it through machine commands.

The terms programming by example (PbE) and programming by demonstration (PbD) appeared in software development research as early as the mid-1980s [1] to describe a way of defining a sequence of operations without having to learn a programming language. The usual distinction in the literature between these terms is that in PbE the user gives a prototypical product of the computer execution, such as a row in the desired results of a query, while in PbD the user performs a sequence of actions that the computer must repeat, generalizing it for use on different data sets.

These two terms were at first undifferentiated, but PbE then tended to be adopted mostly by software development researchers, while PbD tended to be adopted by robotics researchers. Today, PbE refers to an entirely different concept, supported by new programming languages that are similar to simulators. This framework can be contrasted with Bayesian program synthesis.

Robot programming by demonstration

The PbD paradigm first attracted the robotics industry because of the costs involved in developing and maintaining robot programs. In this field, the operator often has implicit knowledge of the task to achieve (he or she knows how to do it), but does not usually have the programming skills (or the time) required to reconfigure the robot. Demonstrating how to achieve the task through examples thus allows the skill to be learned without explicitly programming each detail.

The first PbD strategies proposed in robotics were based on teach-in, guiding, or play-back methods that consisted basically of moving the robot (through a dedicated interface or manually) through a set of relevant configurations that it should adopt sequentially (position, orientation, state of the gripper). The method was then progressively improved, principally by focusing on teleoperation control and by using different interfaces such as vision.
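
The following is a minimal sketch of the teach-in/play-back idea under simplifying assumptions; the function names and data layout are illustrative, not a real robot interface. Configurations recorded during the demonstration are simply replayed in order.

    waypoints = []  # configurations recorded during the demonstration

    def record(joint_angles, gripper_open):
        """Teach-in phase: store one demonstrated configuration."""
        waypoints.append({"joints": list(joint_angles), "gripper": gripper_open})

    def play_back(move_to, set_gripper):
        """Play-back phase: replay the stored configurations in sequence.

        move_to and set_gripper stand in for the robot's low-level commands.
        """
        for wp in waypoints:
            move_to(wp["joints"])
            set_gripper(wp["gripper"])

    # Record two poses, then replay them with stub commands.
    record([0.0, -0.5, 1.2], gripper_open=True)
    record([0.3, -0.2, 0.9], gripper_open=False)
    play_back(move_to=print, set_gripper=print)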

However, these PbD methods still used direct repetition, which was useful in industry only when an assembly line used exactly the same product components. To apply the concept to products with different variants, or to transfer the programs to new robots, generalization became a crucial issue. To address it, the first attempts at generalizing the skill relied mainly on the user's help, through queries about the user's intentions. Different levels of abstraction were then proposed to resolve the generalization issue, broadly dichotomized into learning methods at the symbolic level or at the trajectory level.

The development of humanoid robots naturally brought a growing interest in robot programming by demonstration. As a humanoid robot is by its nature supposed to adapt to new environments, not only is a human-like appearance important, but the algorithms used for its control also require flexibility and versatility. Due to continuously changing environments and to the huge variety of tasks that a robot is expected to perform, it requires the ability to continuously learn new skills and to adapt existing skills to new contexts.

Research in PbD also progressively departed from its original, purely engineering perspective to adopt an interdisciplinary approach, taking insights from neuroscience and the social sciences to emulate the process of imitation in humans and animals. With the increasing consideration of this body of work in robotics, the notion of robot programming by demonstration (also known as RPD or RbD) was progressively replaced by the more biological label of learning by imitation.

Neurally-imprinted Stable Vector Fields (NiVF)

Neurally-imprinted Stable Vector Fields (NiVF) [2] were introduced as a novel learning scheme at ESANN 2013, where the paper won the best student paper award. The scheme shows how to imprint vector fields into neural networks such as extreme learning machines (ELMs) in a provably stable manner. The networks represent movements, with asymptotic stability incorporated through constraints derived from Lyapunov stability theory. The approach was shown to successfully perform stable and smooth point-to-point movements learned from human handwriting movements.
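
As a rough illustration of the underlying idea (a sketch, not the published implementation), the snippet below fits an ELM, i.e. a random hidden layer with a least-squares readout, to the velocities of a two-dimensional handwriting-like demonstration, and then checks the Lyapunov condition for the simple quadratic candidate V(x) = x^T x on random sample points; NiVF instead enforces such constraints during learning rather than merely checking them afterwards.

    import numpy as np

    rng = np.random.default_rng(0)

    # Demonstration: a 2-D point-to-point trajectory spiralling into the origin.
    t = np.linspace(0.0, 1.0, 200)
    X = np.stack([np.cos(3 * t) * (1 - t), np.sin(3 * t) * (1 - t)], axis=1)
    Xd = np.gradient(X, t, axis=0)         # demonstrated velocities x_dot

    # ELM: fixed random hidden layer, least-squares readout.
    W_in = rng.normal(size=(2, 100))
    b = rng.normal(size=100)
    Phi = np.tanh(X @ W_in + b)            # hidden-layer activations
    W_out, *_ = np.linalg.lstsq(Phi, Xd, rcond=None)

    def f(x):
        """Learned vector field x_dot = f(x)."""
        return np.tanh(x @ W_in + b) @ W_out

    # Lyapunov check for V(x) = x^T x: stability requires x . f(x) < 0
    # away from the attractor.
    samples = rng.uniform(-1.0, 1.0, size=(500, 2))
    violations = int(np.sum(np.einsum("ij,ij->i", samples, f(samples)) >= 0))
    print("Lyapunov violations on samples:", violations)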

It is also possible to learn the Lyapunov candidate that is used to stabilize the dynamical system. [3] To this end, a neural learning scheme estimates stable dynamical systems from demonstrations in a two-stage process: first, a data-driven Lyapunov function candidate is estimated; second, stability is incorporated by means of a novel method that respects local constraints during the neural learning. This allows stable dynamics to be learned while sustaining the accuracy of the dynamical system and robustly generating complex movements.
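
A crude sketch of the first stage, under strong simplifying assumptions: the random search below looks for a quadratic candidate V(x) = x^T P x that decreases along the demonstration, whereas the cited work uses a richer candidate class and a proper optimizer; the second stage (constraining the neural learning with the found candidate) is not shown.

    import numpy as np

    rng = np.random.default_rng(1)
    t = np.linspace(0.0, 1.0, 200)
    X = np.stack([np.cos(3 * t) * (1 - t), np.sin(3 * t) * (1 - t)], axis=1)
    Xd = np.gradient(X, t, axis=0)         # demonstrated velocities

    def violations(P):
        """Count demo samples where V_dot(x) = 2 x^T P x_dot is not negative."""
        vdot = 2.0 * np.einsum("ij,jk,ik->i", X, P, Xd)
        return int(np.sum(vdot >= 0))

    # Stage 1: random search over positive-definite candidates.
    best_P, best_score = np.eye(2), violations(np.eye(2))
    for _ in range(200):
        L = rng.normal(size=(2, 2))
        P = L @ L.T + 1e-3 * np.eye(2)     # positive definite by construction
        if violations(P) < best_score:
            best_P, best_score = P, violations(P)

    print("violations along the demonstration:", best_score)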

Diffeomorphic Transformations

Diffeomorphic transformations turn out to be particularly suitable for substantially increasing the learnability of dynamical systems for robotic motions. The stable estimator of dynamical systems (SEDS) is an interesting approach for learning time-invariant systems to control robotic motions, but it is restricted to dynamical systems with quadratic Lyapunov functions. The newer approach Tau-SEDS [4] overcomes this limitation in a mathematically elegant manner.
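
The following sketch illustrates the general mechanism, not SEDS or Tau-SEDS themselves (whose constructions are more involved): trivially stable base dynamics y_dot = -y are pushed through a hand-picked invertible map phi, and the transformed system inherits the stability of the base system while tracing a more complex flow.

    import numpy as np

    a = 0.8  # shear strength of the hand-picked diffeomorphism

    def phi(y):                      # diffeomorphism x = phi(y)
        return np.array([y[0] + a * np.tanh(y[1]), y[1]])

    def phi_inv(x):                  # its closed-form inverse
        return np.array([x[0] - a * np.tanh(x[1]), x[1]])

    def jac_phi(y):                  # Jacobian of phi
        return np.array([[1.0, a * (1.0 - np.tanh(y[1]) ** 2)],
                         [0.0, 1.0]])

    def f(x):
        """Transformed dynamics x_dot = Dphi(y) * (-y) with y = phi_inv(x)."""
        y = phi_inv(x)
        return jac_phi(y) @ (-y)

    # Explicit Euler integration: trajectories converge to phi(0) = 0.
    x = np.array([1.5, -1.0])
    for _ in range(2000):
        x = x + 0.01 * f(x)
    print("final state (close to 0):", x)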

Parameterized skills

After a task has been demonstrated by a human operator, the trajectory is stored in a database. Parameterized skills [5] provide easier access to this raw data: a skill queries the database and generates a trajectory. For example, the skill "opengripper(slow)" is first sent to the motion database, and in response the stored movement of the robot arm is provided. The parameters of a skill allow the policy to be modified to fulfill external constraints.
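
A minimal sketch of this mechanism, using an invented in-memory database and the "opengripper" example from the text; a real motion database would store full arm trajectories rather than scalar gripper openings.

    # Invented in-memory motion database; keys and values are illustrative.
    motion_db = {
        "opengripper": [0.0, 0.4, 0.8, 1.0],   # stored gripper opening over time
    }

    def skill(name, speed="normal"):
        """Query the motion database and return a trajectory for the skill."""
        trajectory = motion_db[name]
        if speed == "slow":
            # The parameter modifies the stored policy, here by duplicating
            # samples so playback takes twice as long.
            trajectory = [p for p in trajectory for _ in range(2)]
        return trajectory

    print(skill("opengripper", speed="slow"))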

A skill is an interface between task names, given in natural language, and the underlying spatiotemporal movement in 3D space, which consists of points. Single skills can be combined into a task to define longer motion sequences from a high-level perspective. For practical applications, different actions are stored in a skill library. To increase the abstraction level further, skills can be converted into dynamic movement primitives (DMPs), which generate a robot trajectory on the fly that was unknown at the time of the demonstration. This helps to increase the flexibility of the solver. [6]
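
The sketch below implements a standard one-dimensional point-attractor DMP to show how a trajectory is generated on the fly for goals unseen at demonstration time; the forcing weights are set by hand here, whereas in practice they are fitted to a demonstration.

    import numpy as np

    alpha_z, beta_z, alpha_s = 25.0, 25.0 / 4.0, 4.0   # standard gains
    c = np.exp(-alpha_s * np.linspace(0.0, 1.0, 10))   # basis centers along s
    h = 50.0 * np.ones(10)                             # basis widths
    w = 100.0 * np.sin(np.linspace(0.0, np.pi, 10))    # hand-set weights

    def rollout(y0, g, dt=0.001, T=1.0):
        """Integrate the DMP from start y0 to goal g, generating the
        trajectory on the fly."""
        y, z, s, ys = y0, 0.0, 1.0, []
        for _ in range(int(T / dt)):
            psi = np.exp(-h * (s - c) ** 2)
            forcing = s * (g - y0) * (psi @ w) / (psi.sum() + 1e-10)
            z += dt * (alpha_z * (beta_z * (g - y) - z) + forcing)
            y += dt * z
            s += dt * (-alpha_s * s)                   # canonical system
            ys.append(y)
        return np.array(ys)

    # The same primitive adapts to two goals unseen at demonstration time.
    print(rollout(0.0, 1.0)[-1], rollout(0.0, 2.0)[-1])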

Non-robotic use

For end users, the simplest case of PbD for automating a workflow in a complex tool (e.g. Photoshop) is the macro recorder.
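
A minimal sketch of a macro recorder; the command names are invented for illustration and do not correspond to any real application's API.

    class MacroRecorder:
        """Logs user actions while recording and replays them verbatim."""

        def __init__(self):
            self.actions = []
            self.recording = False

        def perform(self, command, *args):
            """Execute a command and, if recording, remember it."""
            if self.recording:
                self.actions.append((command, args))
            print("executing:", command, *args)

        def replay(self):
            """Repeat the demonstrated workflow."""
            for command, args in self.actions:
                print("executing:", command, *args)

    recorder = MacroRecorder()
    recorder.recording = True
    recorder.perform("select_layer", "background")
    recorder.perform("apply_filter", "gaussian_blur", 2.5)
    recorder.recording = False
    recorder.replay()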

Related Research Articles

Reinforcement learning

Reinforcement learning (RL) is an interdisciplinary area of machine learning and optimal control concerned with how an intelligent agent ought to take actions in a dynamic environment in order to maximize the cumulative reward. Reinforcement learning is one of three basic machine learning paradigms, alongside supervised learning and unsupervised learning.

Developmental robotics (DevRob), sometimes called epigenetic robotics, is a scientific field which aims at studying the developmental mechanisms, architectures and constraints that allow lifelong and open-ended learning of new skills and new knowledge in embodied machines. As in human children, learning is expected to be cumulative and of progressively increasing complexity, and to result from self-exploration of the world in combination with social interaction. The typical methodological approach consists in starting from theories of human and animal development elaborated in fields such as developmental psychology, neuroscience, developmental and evolutionary biology, and linguistics, then to formalize and implement them in robots, sometimes exploring extensions or variants of them. The experimentation of those models in robots allows researchers to confront them with reality, and as a consequence, developmental robotics also provides feedback and novel hypotheses on theories of human and animal development.

Human–robot interaction (HRI) is the study of interactions between humans and robots. Human–robot interaction is a multidisciplinary field with contributions from human–computer interaction, artificial intelligence, robotics, natural language processing, design, and psychology. A subfield known as physical human–robot interaction (pHRI) has tended to focus on device design to enable people to safely interact with robotic systems.

Robot learning is a research field at the intersection of machine learning and robotics. It studies techniques allowing a robot to acquire novel skills or adapt to its environment through learning algorithms. The embodiment of the robot, situated in a physical embedding, provides at the same time specific difficulties and opportunities for guiding the learning process.

In computer science, programming by example (PbE), also termed programming by demonstration or more generally as demonstrational programming, is an end-user development technique for teaching a computer new behavior by demonstrating actions on concrete examples. The system records user actions and infers a generalized program that can be used on new examples.

Adaptable robotics refers to a field of robotics focused on creating robotic systems capable of adjusting their hardware and software components to perform a wide range of tasks while adapting to varying environments. The 1960s introduced robotics into the industrial field. Since then, the need for robots with new forms of actuation, adaptability, sensing and perception, and even the ability to learn gave rise to the field of adaptable robotics. Significant developments such as the PUMA robot, manipulation research, soft robotics, swarm robotics, AI, cobots, bio-inspired approaches, and other ongoing research have advanced the field tremendously. Adaptable robots are usually associated with their development kit, typically used to create autonomous mobile robots. In some cases, an adaptable kit will still be functional even when certain components break.

Legged robot

Legged robots are a type of mobile robot which use articulated limbs, such as leg mechanisms, to provide locomotion. They are more versatile than wheeled robots and can traverse many different terrains, though these advantages require increased complexity and power consumption. Legged robots often imitate legged animals, such as humans or insects, in an example of biomimicry.

In machine learning, systems which employ offline learning do not change their approximation of the target function when the initial training phase has been completed. These systems are also typically examples of eager learning.

In artificial intelligence, apprenticeship learning is the process of learning by observing an expert. It can be viewed as a form of supervised learning, where the training dataset consists of task executions by a demonstration teacher.

Robotics

Robotics is an interdisciplinary field that involves the design, construction, operation, and use of robots.

Google Brain was a deep learning artificial intelligence research team under the umbrella of Google AI, a research division at Google dedicated to artificial intelligence. Formed in 2011, Google Brain combined open-ended machine learning research with information systems and large-scale computing resources. The team has created tools such as TensorFlow, which allow for neural networks to be used by the public, with multiple internal AI research projects. The team aims to create research opportunities in machine learning and natural language processing. The team was merged into former Google sister company DeepMind to form Google DeepMind in April 2023.

A juggling robot is a robot designed to be able to successfully carry out bounce or toss juggling. Robots capable of juggling are designed and built both to increase and test understanding and theories of human movement, juggling, and robotics. Juggling robots may include sensors to guide arm/hand movement or may rely on physical methods such as tracks or funnels to guide prop movement. Since true juggling requires more props than hands, many robots described as capable of juggling are not.

Cloud robotics is a field of robotics that attempts to invoke cloud technologies such as cloud computing, cloud storage, and other Internet technologies centered on the benefits of converged infrastructure and shared services for robotics. When connected to the cloud, robots can benefit from the powerful computation, storage, and communication resources of modern data centers, which can process and share information from various robots or agents. Humans can also delegate tasks to robots remotely through networks. Cloud computing technologies enable robot systems to be endowed with powerful capabilities whilst reducing costs. Thus, it is possible to build lightweight, low-cost, smarter robots with an intelligent "brain" in the cloud. The "brain" consists of data centers, knowledge bases, task planners, deep learning, information processing, environment models, communication support, etc.

Robotic prosthesis control is a method for controlling a prosthesis in such a way that the controlled robotic prosthesis restores a biologically accurate gait to a person with a loss of limb. This is a special branch of control that has an emphasis on the interaction between humans and robotics.

In computer science, incremental learning is a method of machine learning in which input data is continuously used to extend the existing model's knowledge, i.e. to further train the model. It represents a dynamic technique of supervised or unsupervised learning that can be applied when training data becomes available gradually over time or when its size exceeds system memory limits. Algorithms that facilitate incremental learning are known as incremental machine learning algorithms.

Jan Peters (computer scientist)

Jan Peters is a German computer scientist. He is Professor of Intelligent Autonomous Systems at the Department of Computer Science of the Technische Universität Darmstadt.

Deep reinforcement learning

Deep reinforcement learning is a subfield of machine learning that combines reinforcement learning (RL) and deep learning. RL considers the problem of a computational agent learning to make decisions by trial and error. Deep RL incorporates deep learning into the solution, allowing agents to make decisions from unstructured input data without manual engineering of the state space. Deep RL algorithms are able to take in very large inputs and decide what actions to perform to optimize an objective. Deep reinforcement learning has been used for a diverse set of applications including but not limited to robotics, video games, natural language processing, computer vision, education, transportation, finance and healthcare.

Aude Billard

Aude G. Billard is a Swiss physicist in the fields of machine learning and human-robot interactions. As a full professor at the School of Engineering at Swiss Federal Institute of Technology in Lausanne (EPFL), Billard’s research focuses on applying machine learning to support robot learning through human guidance. Billard’s work on human-robot interactions has been recognized numerous times by the Institute of Electrical and Electronics Engineers (IEEE) and she currently holds a leadership position on the executive committee of the IEEE Robotics and Automation Society (RAS) as the vice president of publication activities.

Auke Ijspeert

Auke Jan Ijspeert is a Swiss-Dutch roboticist and neuroscientist. He is a professor of biorobotics in the Institute of Bioengineering at EPFL, École Polytechnique Fédérale de Lausanne, and the head of the Biorobotics Laboratory at the School of Engineering.

Andrea L. Thomaz is a senior research scientist in the Department of Electrical and Computer Engineering at The University of Texas at Austin and Director of Socially Intelligent Machines Lab. She specializes in Human-Robot Interaction, Artificial Intelligence and Interactive Machine Learning.

References

  1. Halbert, Dan (November 1984). "Programming by Example" (PDF). PhD dissertation, U.C. Berkeley. Retrieved 2012-07-28.
  2. Lemme, A.; Neumann, K.; Reinhart, R. F.; Steil, J. J. (2013). "Neurally Imprinted Stable Vector Fields" (PDF). Proc. Europ. Symp. on Artificial Neural Networks: 327–332.
  3. Lemme, A.; Neumann, K.; Steil, J. J. (2013). "Neural learning of stable dynamical systems based on data-driven Lyapunov candidates" (PDF). 2013 IEEE/RSJ International Conference on Intelligent Robots and Systems. pp. 1216–1222. doi:10.1109/IROS.2013.6696505. ISBN 978-1-4673-6358-7.
  4. Neumann, K.; Steil, J. J. (2015). "Learning Robot Motions with Stable Dynamical Systems under Diffeomorphic Transformations" (PDF). Robotics and Autonomous Systems. 70 (C): 1–15. doi:10.1016/j.robot.2015.04.006.
  5. Pervez, Affan; Lee, Dongheui (2018). "Learning task-parameterized dynamic movement primitives using mixture of GMMs" (PDF). Intelligent Service Robotics. 11 (1): 61–78. doi:10.1007/s11370-017-0235-8.
  6. Alizadeh, Tohid; Saduanov, Batyrkhan (2017). "Robot programming by demonstration of multiple tasks within a common environment". 2017 IEEE International Conference on Multisensor Fusion and Integration for Intelligent Systems (MFI). IEEE. pp. 608–613. doi:10.1109/mfi.2017.8170389. ISBN 978-1-5090-6064-1.

Videos

A robot that learns to cook an omelet.

A robot that learns to unscrew a bottle of coke.