Multi-agent pathfinding

Last updated
Example of Multi-Agent Path Finding in a grid environment. Example of Multi-Agent Path Finding in a grid environment.png
Example of Multi-Agent Path Finding in a grid environment.

The problem of Multi-Agent Pathfinding (MAPF) is an instance of multi-agent planning and consists in the computation of collision-free paths for a group of agents from their location to an assigned target. It is an optimization problem, since the aim is to find those paths that optimize a given objective function, usually defined as the number of time steps until all agents reach their goal cells. MAPF is the multi-agent generalization of the pathfinding problem, and it is closely related to the shortest path problem in the context of graph theory.

Contents

Several algorithms have been proposed to solve the MAPF problem. Due to its complexity, it happens that optimal approaches are infeasible on big environments and with a high number of agents. However, given the applications in which MAPF is involved such as automated warehouses and airport management, it is important to reach a trade-off between the efficiency of the solution and its effectiveness.

Problem Formalization

The elements of a classical MAPF [1] problem are the following:

It is assumed that time is discrete, and that each agent can perform one action at each time step. There are two possible types of actions: the wait action, in which the agent remains in its node, and the move action, that allows the agent to move to an adjacent node. An action is formalized as a function , meaning that represents the action of moving from to if is adjacent to and different than , or to stay in node if .

The agents perform sequences of actions to go from their starting point to their target location. A sequence of action performed by agent is denoted by and is called a plan. If agent starts from its location and arrives to its target location performing plan , then is called single-agent plan for agent . [1] A valid solution for the MAPF problem is a set of single-agent plans (one for each agent), such that the plans do not collide one another. Once an agent has reached its target, it can either remain in the target location or disappear. [1]

Types of Collisions

In order to have a valid solution for a MAPF problem, it is necessary that the single-agent plans of the agents do not collide one another. Given plan , the expression denotes the position of agent after having performed steps of plan . It is possible to distinguish five different types of collisions between two plans and . [1]

Types of conflicts: (a) is an edge conflict, (b) a vertex conflict, (c) a following conflict, (d) a cycle conflict, and (e) a swapping conflict. Types of agents' conflicts in a grid environment.png
Types of conflicts: (a) is an edge conflict, (b) a vertex conflict, (c) a following conflict, (d) a cycle conflict, and (e) a swapping conflict.

When formalizing a MAPF problem, it is possible to decide which conflicts are allowed and which are forbidden. A unified standard about permitted and denied conflicts does not exist, however vertex and edge conflicts are usually not allowed. [1]

Objective Functions

When computing single-agent plans, the aim is to maximize a user-defined objective function. There is not a standard objective function to adopt, however the most common are: [1]

Algorithms

Several algorithms have been proposed to solve the MAPF problem. The issue is that it is NP-hard to find optimal makespan or flow-time solutions, [3] also when considering planar graphs, [4] or graphs similar to grids. [5] For what concerns bounded suboptimal solutions, it is shown that it is NP-hard to find a makespan-optimal solution with a factor of suboptimality smaller than . [6] Optimal MAPF solvers return high quality solutions, but their efficiency is low. Instead, bounded-suboptimal and suboptimal solvers are more efficient, but their solutions are less effective. Also machine learning approaches have been proposed to solve the MAPF problem. [7]

Prioritized Planning

One possible approach to face the computational complexity is prioritized planning. [8] It consists in decoupling the MAPF problem into single-agent pathfinding problems.

The first step is to assign to each agent a unique number that corresponds to the priority given to the agent. Then, following the priority order, for each agent a plan is computed to reach the target location. When planning, agents have to avoid collisions with paths of other agents with a higher priority that have already computed their plans.

Finding a solution for the MAPF problem in such setting corresponds to the shortest path problem in a time-expansion graph. [9] A time-expansion graph is a graph that takes into account the passing of time. Each node is composed by two entries , where is the node name and is the time step. Each node is linked to the nodes such that is adjacent to and is not occupied at time step .

The drawback of prioritized planning is that, even if it is a sound approach (it returns valid solutions), it is neither optimal nor complete. [10] This means that it is not assured that the algorithm will return a solution and, even in that case, the solution may not be optimal.

Optimal MAPF Solvers

It is possible to distinguish four different categories of optimal MAPF solvers: [10]

Bounded Suboptimal MAPF Solvers

Bounded suboptimal algorithms offer a trade-off between the optimality and the cost of the solution. They are said to be bounded by a certain factor because they return solutions with a cost at most equal to the optimal solution cost times the factor. MAPF bounded suboptimal solvers can be divided following the same categorization presented for optimal MAPF solvers. [10]

Variations

The way in which MAPF problems are defined allows to change various aspects, for example the fact of being in a grid environment or the assumption that time is discrete. This section reports some variations of the classical MAPF problem.

Anonymous MAPF

It is a version of MAPF in which there is a set of target locations but agents are not assigned a specific target. [14] It does not matter whether the agent that reaches the target, the important thing is that targets are completed. A slight modification of this version is the one in which agents are divided into groups and each group has to perform a set of targets. [15]

Multi-Agent Pick-up and Delivery hww

MAPF problem is not able to capture some aspects relative to real world applications. For example, in automated warehouses it happens that robots have to complete several tasks one after the other. For this reason, an extended MAPF version is proposed, called Multi-Agent Pick-up and Delivery (MAPD). [16] In a MAPD setting, agents have to complete a stream of tasks, where each task is composed by a pick-up a location and a delivery location. When planning for the completion of a task, the path has to start from the current position of the robot and to end in the delivery position of the task, passing through the pick-up point. MAPD is considered a "lifelong" version of MAPF in which tasks arrive incrementally. [16]

Beyond Classical MAPF

The assumptions that the agents are in a grid environment, that their speed is constant and that time is discrete are simplifying hypotheses. Many works take into account the kinematic constraints of agents, [17] such as velocity and orientation, or go past the assumption that the weights of the arcs are all equal to 1. [18] Other works focus on eliminating the discrete time assumptions and that the duration of actions is exactly equal to one time step. [19] Another assumption that does not reflect reality is that agents occupy exactly one cell of the environment in which they are: some studies have been conducted to overcome this hypothesis. [20] It is interesting to note that the shape and geometry of agents may introduce new types of conflicts, since agents may crash with one another even if they are not completely overlapped.

Applications

MAPF can be applied in several real case scenarios:

See also

Related Research Articles

<span class="mw-page-title-main">Shortest path problem</span> Computational problem of graph theory

In graph theory, the shortest path problem is the problem of finding a path between two vertices in a graph such that the sum of the weights of its constituent edges is minimized.

Reinforcement learning (RL) is an interdisciplinary area of machine learning and optimal control concerned with how an intelligent agent ought to take actions in a dynamic environment in order to maximize the cumulative reward. Reinforcement learning is one of three basic machine learning paradigms, alongside supervised learning and unsupervised learning.

A* is a graph traversal and pathfinding algorithm, which is used in many fields of computer science due to its completeness, optimality, and optimal efficiency. Given a weighted graph, a source node and a goal node, the algorithm finds the shortest path from source to goal.

<span class="mw-page-title-main">Assignment problem</span> Combinatorial optimization problem

The assignment problem is a fundamental combinatorial optimization problem. In its most general form, the problem is as follows:

<span class="mw-page-title-main">Ant colony optimization algorithms</span> Optimization algorithm

In computer science and operations research, the ant colony optimization algorithm (ACO) is a probabilistic technique for solving computational problems which can be reduced to finding good paths through graphs. Artificial ants stand for multi-agent methods inspired by the behavior of real ants. The pheromone-based communication of biological ants is often the predominant paradigm used. Combinations of artificial ants and local search algorithms have become a method of choice for numerous optimization tasks involving some sort of graph, e.g., vehicle routing and internet routing.

In mathematics, a Markov decision process (MDP) is a discrete-time stochastic control process. It provides a mathematical framework for modeling decision making in situations where outcomes are partly random and partly under the control of a decision maker. MDPs are useful for studying optimization problems solved via dynamic programming. MDPs were known at least as early as the 1950s; a core body of research on Markov decision processes resulted from Ronald Howard's 1960 book, Dynamic Programming and Markov Processes. They are used in many disciplines, including robotics, automatic control, economics and manufacturing. The name of MDPs comes from the Russian mathematician Andrey Markov as they are an extension of Markov chains.

<span class="mw-page-title-main">Bellman equation</span> Necessary condition for optimality associated with dynamic programming

A Bellman equation, named after Richard E. Bellman, is a necessary condition for optimality associated with the mathematical optimization method known as dynamic programming. It writes the "value" of a decision problem at a certain point in time in terms of the payoff from some initial choices and the "value" of the remaining decision problem that results from those initial choices. This breaks a dynamic optimization problem into a sequence of simpler subproblems, as Bellman's “principle of optimality" prescribes. The equation applies to algebraic structures with a total ordering; for algebraic structures with a partial ordering, the generic Bellman's equation can be used.

<span class="mw-page-title-main">Pathfinding</span> Plotting by a computer application

Pathfinding or pathing is the plotting, by a computer application, of the shortest route between two points. It is a more practical variant on solving mazes. This field of research is based heavily on Dijkstra's algorithm for finding the shortest path on a weighted graph.

A partially observable Markov decision process (POMDP) is a generalization of a Markov decision process (MDP). A POMDP models an agent decision process in which it is assumed that the system dynamics are determined by an MDP, but the agent cannot directly observe the underlying state. Instead, it must maintain a sensor model and the underlying MDP. Unlike the policy function in MDP which maps the underlying states to the actions, POMDP's policy is a mapping from the history of observations to the actions.

Distributed constraint optimization is the distributed analogue to constraint optimization. A DCOP is a problem in which a group of agents must distributedly choose values for a set of variables such that the cost of a set of constraints over the variables is minimized.

Motion planning, also path planning is a computational problem to find a sequence of valid configurations that moves the object from the source to destination. The term is used in computational geometry, computer animation, robotics and computer games.

In computer science and graph theory, the Canadian traveller problem (CTP) is a generalization of the shortest path problem to graphs that are partially observable. In other words, a "traveller" on a given point on the graph cannot see the full graph, rather only adjacent nodes or a certain "realization restriction."

Clustering is the problem of partitioning data points into groups based on their similarity. Correlation clustering provides a method for clustering a set of objects into the optimum number of clusters without specifying that number in advance.

<span class="mw-page-title-main">Any-angle path planning</span> Algorithm to find Euclidean shortest paths

Any-angle path planning algorithms are pathfinding algorithms that search for a Euclidean shortest path between two points on a grid map while allowing the turns in the path to have any angle. The result is a path that cuts directly through open areas and has relatively few turns. More traditional pathfinding algorithms such as A* either lack in performance or produce jagged, indirect paths.

Incremental heuristic search algorithms combine both incremental and heuristic search to speed up searches of sequences of similar search problems, which is important in domains that are only incompletely known or change dynamically. Incremental search has been studied at least since the late 1960s. Incremental search algorithms reuse information from previous searches to speed up the current search and solve search problems potentially much faster than solving them repeatedly from scratch. Similarly, heuristic search has also been studied at least since the late 1960s.

<span class="mw-page-title-main">Sven Koenig (computer scientist)</span> German computer scientist

Sven Koenig is a full professor in computer science at the University of Southern California. He received an M.S. degree in computer science from the University of California at Berkeley in 1991 and a Ph.D. in computer science from Carnegie Mellon University in 1997, advised by Reid Simmons.

The pebble motion problems, or pebble motion on graphs, are a set of related problems in graph theory dealing with the movement of multiple objects ("pebbles") from vertex to vertex in a graph with a constraint on the number of pebbles that can occupy a vertex at any time. Pebble motion problems occur in domains such as multi-robot motion planning and network routing. The best-known example of a pebble motion problem is the famous 15 puzzle where a disordered group of fifteen tiles must be rearranged within a 4x4 grid by sliding one tile at a time.

A Carathéodory-π solution is a generalized solution to an ordinary differential equation. The concept is due to I. Michael Ross and named in honor of Constantin Carathéodory. Its practicality was demonstrated in 2008 by Ross et al. in a laboratory implementation of the concept. The concept is most useful for implementing feedback controls, particularly those generated by an application of Ross' pseudospectral optimal control theory.

Approximate max-flow min-cut theorems are mathematical propositions in network flow theory. Approximate max-flow min-cut theorems deal with the relationship between maximum flow rate ("max-flow") and minimum cut ("min-cut") in a multi-commodity flow problem. The theorems have enabled the development of approximation algorithms for use in graph partition and related problems.

In cooperative game theory, a hedonic game is a game that models the formation of coalitions (groups) of players when players have preferences over which group they belong to. A hedonic game is specified by giving a finite set of players, and, for each player, a preference ranking over all coalitions (subsets) of players that the player belongs to. The outcome of a hedonic game consists of a partition of the players into disjoint coalitions, that is, each player is assigned a unique group. Such partitions are often referred to as coalition structures.

References

  1. 1 2 3 4 5 6 Stern, Roni; Sturtevant, Nathan R.; Felner, Ariel; Koenig, Sven; Ma, Hang; Walker, Thayne; Li, Jiaoyang; Atzmon, Dor; Cohen, Liron; Kumar, T. K. Satish; Boyarski, Eli; Barták, Roman (2019). "Multi-Agent Pathfinding: Definitions, Variants, and Benchmarks" (PDF). arXiv: 1906.08291 .{{cite journal}}: Cite journal requires |journal= (help)
  2. Ma, Hang; Wagner, Glenn; Felner, Ariel; Li, Jiaoyang; Kumar, T. K. Satish; Koenig, Sven (2018). "Multi-Agent Path Finding with Deadlines". arXiv: 1806.04216 [cs.AI].
  3. Yu, Jingjin; LaValle, Steven M. (2013). "Structure and Intractability of Optimal Multi-Robot Path Planning on Graphs". Proceedings of the AAAI Conference on Artificial Intelligence. Vol. 27. pp. 1443–1449. doi:10.1609/aaai.v27i1.8541. S2CID   11655439.
  4. Yu, Jingjin (January 2016). "Intractability of Optimal Multirobot Path Planning on Planar Graphs". IEEE Robotics and Automation Letters. 1 (1): 33–40. arXiv: 1504.02072 . doi:10.1109/LRA.2015.2503143. S2CID   10275469.
  5. Banfi, Jacopo; Basilico, Nicola; Amigoni, Francesco (October 2017). "Intractability of Time-Optimal Multirobot Path Planning on 2D Grid Graphs with Holes". IEEE Robotics and Automation Letters. 2 (4): 1941–1947. doi:10.1109/LRA.2017.2715406. S2CID   36801258.
  6. Ma, Hang; Tovey, Craig; Sharon, Guni; Kumar, T. K.; Koenig, Sven (5 March 2016). "Multi-Agent Path Finding with Payload Transfers and the Package-Exchange Robot-Routing Problem". Proceedings of the AAAI Conference on Artificial Intelligence. Vol. 30. doi:10.1609/aaai.v30i1.10409. S2CID   1585005.
  7. Sartoretti, Guillaume; Kerr, Justin; Shi, Yunfei; Wagner, Glenn; Kumar, T. K. Satish; Koenig, Sven; Choset, Howie (July 2019). "PRIMAL: Pathfinding via Reinforcement and Imitation Multi-Agent Learning". IEEE Robotics and Automation Letters. 4 (3): 2378–2385. arXiv: 1809.03531 . doi: 10.1109/LRA.2019.2903261 . S2CID   52191178.
  8. Cap, Michal; Novak, Peter; Kleiner, Alexander; Selecky, Martin (July 2015). "Prioritized Planning Algorithms for Trajectory Coordination of Multiple Mobile Robots". IEEE Transactions on Automation Science and Engineering. 12 (3): 835–849. arXiv: 1409.2399 . doi:10.1109/TASE.2015.2445780. S2CID   347488.
  9. Silver, David (2021). "Cooperative Pathfinding". Proceedings of the AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment. Vol. 1. pp. 117–122. doi:10.1609/aiide.v1i1.18726. S2CID   17714238.
  10. 1 2 3 Stern, Roni (2019). "Multi-Agent Path Finding – an Overview". Artificial Intelligence. Lecture Notes in Computer Science. Vol. 11866. pp. 96–115. doi:10.1007/978-3-030-33274-7_6. ISBN   978-3-030-33273-0. S2CID   204832267.
  11. Sharon, Guni; Stern, Roni; Goldenberg, Meir; Felner, Ariel (February 2013). "The increasing cost tree search for optimal multi-agent pathfinding". Artificial Intelligence. 195: 470–495. doi: 10.1016/j.artint.2012.11.006 .
  12. Sharon, Guni; Stern, Roni; Felner, Ariel; Sturtevant, Nathan R. (February 2015). "Conflict-based search for optimal multi-agent pathfinding". Artificial Intelligence. 219: 40–66. doi:10.1016/j.artint.2014.11.006. S2CID   3809045.
  13. Bartak, Roman; Zhou, Neng-Fa; Stern, Roni; Boyarski, Eli; Surynek, Pavel (November 2017). "Modeling and Solving the Multi-agent Pathfinding Problem in Picat". 2017 IEEE 29th International Conference on Tools with Artificial Intelligence (ICTAI). pp. 959–966. doi:10.1109/ICTAI.2017.00147. ISBN   978-1-5386-3876-7. S2CID   7819470.
  14. Kloder, S.; Hutchinson, S. (August 2006). "Path planning for permutation-invariant multirobot formations". IEEE Transactions on Robotics. 22 (4): 650–665. doi:10.1109/TRO.2006.878952. S2CID   62805494.
  15. Ma, Hang; Koenig, Sven (2016). "Optimal Target Assignment and Path Finding for Teams of Agents". arXiv: 1612.05693 [cs.AI].
  16. 1 2 Ma, Hang; Li, Jiaoyang; Kumar, T. K. Satish; Koenig, Sven (30 May 2017). "Lifelong Multi-Agent Path Finding for Online Pickup and Delivery Tasks". arXiv: 1705.10868 [cs.AI].
  17. Hoenig, Wolfgang; Kumar, T. K. Satish; Cohen, Liron; Ma, Hang; Xu, Hong; Ayanian, Nora; Koenig, Sven (2016). "Multi-Agent Path Finding with Kinematic Constraints". Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence (IJCAI-17).
  18. Barták, Roman; Švancara, Jiří; Vlk, Marek (2018). "A scheduling-based approach to multi-agent path finding with weighted and capacitated arcs" (PDF). Proceedings of the 17th International Conference on Autonomous Agents and MultiAgent Systems. pp. 748–756.
  19. Andreychuk, Anton; Yakovlev, Konstantin; Surynek, Pavel; Atzmon, Dor; Stern, Roni (April 2022). "Multi-agent pathfinding with continuous time". Artificial Intelligence. 305: 103662. arXiv: 1901.05506 . doi:10.1016/j.artint.2022.103662. S2CID   207791641.
  20. Li, Jiaoyang; Surynek, Pavel; Felner, Ariel; Ma, Hang; Kumar, T. K. Satish; Koenig, Sven (17 July 2019). "Multi-Agent Path Finding for Large Agents". Proceedings of the AAAI Conference on Artificial Intelligence. Vol. 33. pp. 7627–7634. doi:10.1609/aaai.v33i01.33017627. S2CID   53687939.
  21. Wurman, Peter R.; D'Andrea, Raffaello; Mountz, Mick (20 March 2008). "Coordinating Hundreds of Cooperative, Autonomous Vehicles in Warehouses". AI Magazine. 29 (1): 9. doi:10.1609/aimag.v29i1.2082. S2CID   10475273.
  22. Morris, Robert; Pasareanu, Corina S.; Luckow, Kasper; Malik, Waqar; Ma, Hang; Kumar, T. K. Satish; Koenig, Sven (2016). "Planning, Scheduling and Monitoring for Airport Surface Operations". AAAI Workshop: Planning for Hybrid Systems.
  23. Veloso, Manuela; Biswas, Joydeep; Coltin, Brian; Rosenthal, Stephanie (2015). "CoBots: Robust Symbiotic Autonomous Mobile Service Robots". IJCAI'15: Proceedings of the 24th International Conference on Artificial Intelligence.
  24. Ma, Hang; Yang, Jingxing; Cohen, Liron; Kumar, T. K. Satish; Koenig, Sven (2017). "Feasibility Study: Moving Non-Homogeneous Teams in Congested Video Game Environments". Proceedings, The Thirteenth AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment (AIIDE-17). arXiv: 1710.01447 .