NetworkX

Last updated
NetworkX
Original author(s) Aric Hagberg
Pieter Swart
Dan Schult
Developer(s) Many others
Initial release11 April 2005;19 years ago (2005-04-11) [1] [2]
Stable release
3.3 [3]   OOjs UI icon edit-ltr-progressive.svg / 6 April 2024;43 days ago (6 April 2024)
Repository
Written in Python
Operating system Cross-platform
Type Software library
License BSD-new license
Website networkx.github.io

NetworkX is a Python library for studying graphs and networks. NetworkX is free software released under the BSD-new license.

Contents

History

NetworkX began development in 2002 by Aric A. Hagberg, Daniel A. Schult, and Pieter J. Swart. [4] It is supported by the National Nuclear Security Administration of the U.S. Department of Energy at Los Alamos National Laboratory.

The package was crafted with the aim of creating tools to analyze data and intervention strategies for controlling the epidemic spread of disease, while also exploring the structure and dynamics of more general social, biological, and infrastructural systems. [4]

Inspired by Guido van Rossum's 1998 essay on Python graph representation, [5] NetworkX made its public debut at the 2004 SciPy annual conference. In April of 2005, NetworkX was made available as open source software. [1]

Several Python packages focusing on graph theory, including igraph, graph-tool, and numerous others, are available. As of April 2024, NetworkX had over 50 million downloads, [6] surpassing the download count of the second most popular package, igraph, by more than 50-fold. [7] This substantial adoption rate could potentially be attributed to NetworkX's early release and its continued evolution within the SciPy ecosystem.

In 2008, SageMath, an open source mathematics system, incorporated NetworkX into its package and added support for more graphing algorithms and functions. [4]

Major NetworkX Releases
VersionRelease DateMajor Changes
0.22
17 June 2005
Topological sorting for testing directed acyclic graphs (DAGs).

Integration of Dijkstra's algorithm for finding shortest paths in weighted graphs. [8]

0.99
18 November 2008
Default graph type transitioned to a weighted graph.

MultiGraph, MultiDiGraph, LabeledGraph, and LabeledDiGraph introduced. [8]

1.0
8 January 2010
Addition of difference and intersection operators.

Implementation of the A* algorithm for optimized pathfinding.

Incorporation of PageRank, HITS, and [eigenvector] centrality algorithms for network analysis.

Integration of Kruskal’s algorithm for constructing minimum spanning trees. [8]

2.0
20 September 2017
Significant revisions to the methods within the MultiGraph and DiGraph classes.

Overhaul of documentation system. Various user quality of life changes. [8]

3.0
7 January 2023
Improved integrations to SciPy ecosystem packages.

Added new plugin feature to allow users to use different backends (GraphBLAS, CuGraph) for computation. [9]


Features

Supported Graph Types

Overview

Graphs, in this context, represent collections of vertices (nodes) and edges (connections) between them. NetworkX provides support for several types of graphs, each suited for different applications and scenarios.

Directed Graphs (DiGraph)

Directed graphs, or DiGraphs, consist of nodes connected by directed edges. In a directed graph, edges have a direction indicating the flow or relationship between nodes. [10]

Directed graph made using NetworkX Directed NetworkX Graph.png
Directed graph made using NetworkX

Undirected Graphs (Graph)

Undirected graphs, simply referred to as graphs in NetworkX, are graphs where edges have no inherent direction. The connections between nodes are symmetrical, meaning if node A is connected to node B, then node B is also connected to node A. [11]

Undirected graph made using NetworkX Undirected NetworkX Graph.png
Undirected graph made using NetworkX

MultiGraphs

MultiGraphs allow multiple edges between the same pair of nodes. In other words, MultiGraphs permit parallel edges, where more than one edge can exist between two nodes. [12]

MultiGraph made using NetworkX Multigraph NetworkX.png
MultiGraph made using NetworkX

MultiDiGraphs

MultiDiGraphs are directed graphs that allow multiple directed edges between the same pair of nodes. Similar to MultiGraphs, MultiDiGraphs enable the modeling of scenarios where multiple directed relationships exist between nodes. [13]

MultiDiGraph made using NetworkX Multidigraph NetworkX.png
MultiDiGraph made using NetworkX

Challenges in Visualization

While NetworkX provides powerful tools for graph creation and analysis, producing visualizations of complex graphs can be challenging. Visualizing large or densely connected graphs may require specialized techniques and external libraries beyond the capabilities of NetworkX alone.

Graph Layouts

NetworkX provides various layout algorithms for visualizing graphs in two-dimensional space. These layout algorithms determine the positions of nodes and edges in a graph visualization, aiming to reveal its structure and relationships effectively.

Spring Layout

The Spring layout is a force-directed layout algorithm inspired by physical systems. It simulates the attraction and repulsion forces between nodes, treating edges as springs and nodes as charged particles. This results in a layout where nodes with strong connections are placed closer together, while nodes with weaker connections are pushed farther apart. [14]

Spectral Layout

The Spectral layout is based on the spectral properties of the graph's adjacency matrix. It uses the eigenvalues and eigenvectors of the adjacency matrix to position nodes in a low-dimensional space. Spectral layout tends to emphasize the global structure of the graph, making it useful for identifying clusters and communities. [15]

Circular Layout

The Circular layout arranges nodes evenly around a circle, with edges drawn as straight lines connecting them. This layout is particularly suitable for visualizing cyclic or symmetric graphs, where the arrangement of nodes along the circle reflects the underlying topology of the graph. [16]

Shell Layout

The Shell layout organizes nodes into concentric circles or shells based on their distance from a specified center. Nodes within the same shell have the same distance from the center, while edges are drawn radially between nodes in adjacent shells. Shell layout is often used for visualizing hierarchical or tree structures. [17]

Kamada-Kawai Layout

The Kamada-Kawai layout algorithm positions nodes based on their pairwise distances, aiming to minimize the total energy of the system. It takes into account both the graph's topology and edge lengths, resulting in a layout that emphasizes geometric accuracy and readability. [18]

Usage

NetworkX provides functions for applying different layout algorithms to graphs and visualizing the results using Matplotlib or other plotting libraries. Users can specify the desired layout algorithm when calling the drawing functions, allowing for flexible and customizable graph visualizations.

Suitability

NetworkX is suitable for operation on large real-world graphs: e.g., graphs in excess of 10 million nodes and 100 million edges.[ clarification needed ] [19] Due to its dependence on a pure-Python "dictionary of dictionary" data structure, NetworkX is a reasonably efficient, very scalable, highly portable framework for network and social network analysis. [4]

Applications

NetworkX was designed to be easy to use and learn, as well as a powerful and sophisticated tool for network analysis. It is used widely on many levels, ranging from computer science and data analysis education to large-scale scientific studies. [4]

NetworkX has applications in any field that studies data as graphs or networks, such as mathematics, physics, biology, computer science and social science. [20] The nodes in a NetworkX graph can be specialized to hold any data, and the data stored in edges is arbitrary, further making it widely applicable to different fields. It is able to read in networks from data and randomly generate networks with specified qualities. This allows it to be used to explore changes across wide amounts of networks. [4] The figure below demonstrates a simple example of the software's ability to create and modify variations across large amounts of networks.

Graph representations of several spanning tree networks in Karger's algorithm Spanning tree interpretation of Karger's algorithm.svg
Graph representations of several spanning tree networks in Karger's algorithm

NetworkX has many network and graph analysis algorithms, aiding in a wide array of data analysis purposes. One important example of this is its various options for shortest path algorithms. The following algorithms are included in NetworkX, with time complexities given the number of vertices (V) and edges (E) in the graph: [21]

An example of the use of NetworkX graph algorithms can be seen in a 2018 study, in which it was used to analyze the resilience of livestock production networks to the spread of epidemics. The study used a computer model to predict and study trends in epidemics throughout American hog production networks, taking into account all livestock industry roles. In the study, NetworkX was used to find information on degree, shortest paths, clustering, and k-cores as the model introduced infections and simulated their spread. This was then used to determine which networks are most susceptible to epidemics. [22]

In addition to network creation and analysis, NetworkX also has many visualization capabilities. It provides hooks into Matplotlib and GraphViz for 2D visuals, and VTK and UbiGraph for 3D visuals. [4] This makes the package useful in easily demonstrating and reporting network analysis and data, and allows for the simplification of networks for visual processing.

Integration

NetworkX is integrated into SageMath. [23]

See also

Related Research Articles

<span class="mw-page-title-main">Graph theory</span> Area of discrete mathematics

In mathematics, graph theory is the study of graphs, which are mathematical structures used to model pairwise relations between objects. A graph in this context is made up of vertices which are connected by edges. A distinction is made between undirected graphs, where edges link two vertices symmetrically, and directed graphs, where edges link two vertices asymmetrically. Graphs are one of the principal objects of study in discrete mathematics.

<span class="mw-page-title-main">Graph drawing</span> Visualization of node-link graphs

Graph drawing is an area of mathematics and computer science combining methods from geometric graph theory and information visualization to derive two-dimensional depictions of graphs arising from applications such as social network analysis, cartography, linguistics, and bioinformatics.

<span class="mw-page-title-main">Graph (abstract data type)</span> Abstract data type in computer science

In computer science, a graph is an abstract data type that is meant to implement the undirected graph and directed graph concepts from the field of graph theory within mathematics.

<span class="mw-page-title-main">Graphviz</span> Software package for graph visualization

Graphviz is a package of open-source tools initiated by AT&T Labs Research for drawing graphs specified in DOT language scripts having the file name extension "gv". It also provides libraries for software applications to use the tools. Graphviz is free software licensed under the Eclipse Public License.

DOT is a graph description language, developed as a part of the Graphviz project. DOT graphs are typically stored as files with the .gv or .dot filename extension — .gv is preferred, to avoid confusion with the .dot extension used by versions of Microsoft Word before 2007. dot is also the name of the main program to process DOT files in the Graphviz package.

<span class="mw-page-title-main">Force-directed graph drawing</span> Physical simulation to visualize graphs

Force-directed graph drawing algorithms are a class of algorithms for drawing graphs in an aesthetically-pleasing way. Their purpose is to position the nodes of a graph in two-dimensional or three-dimensional space so that all the edges are of more or less equal length and there are as few crossing edges as possible, by assigning forces among the set of edges and the set of nodes, based on their relative positions, and then using these forces either to simulate the motion of the edges and nodes or to minimize their energy.

<span class="mw-page-title-main">Orange (software)</span> Open-source data analysis software

Orange is an open-source data visualization, machine learning and data mining toolkit. It features a visual programming front-end for explorative qualitative data analysis and interactive data visualization.

Graph Modeling Language (GML) is a hierarchical ASCII-based file format for describing graphs. It has been also named Graph Meta Language.

GraphML is an XML-based file format for graphs. The GraphML file format results from the joint effort of the graph drawing community to define a common format for exchanging graph structure data. It uses an XML-based syntax and supports the entire range of possible graph structure constellations including directed, undirected, mixed graphs, hypergraphs, and application-specific attributes.

<span class="mw-page-title-main">Call graph</span> Structure in computing

A call graph is a control-flow graph, which represents calling relationships between subroutines in a computer program. Each node represents a procedure and each edge (f, g) indicates that procedure f calls procedure g. Thus, a cycle in the graph indicates recursive procedure calls.

<span class="mw-page-title-main">IPython</span> Advanced interactive shell for Python

IPython is a command shell for interactive computing in multiple programming languages, originally developed for the Python programming language, that offers introspection, rich media, shell syntax, tab completion, and history. IPython provides the following features:

<span class="mw-page-title-main">JUNG</span>

JUNG is an open-source graph modeling and visualization framework written in Java, under the BSD license. The framework comes with a number of layout algorithms built in, as well as analysis algorithms such as graph clustering and metrics for node centrality.

Enthought, Inc. is a software company based in Austin, Texas, United States that develops scientific and analytic computing solutions using primarily the Python programming language. It is best known for the early development and maintenance of the SciPy library of mathematics, science, and engineering algorithms and for its Python for scientific computing distribution Enthought Canopy.

graph-tool is a Python module for manipulation and statistical analysis of graphs. The core data structures and algorithms of graph-tool are implemented in C++, making extensive use of metaprogramming, based heavily on the Boost Graph Library. Many algorithms are implemented in parallel using OpenMP, which provides increased performance on multi-core architectures.

Meurs Challenger is an online graph visualization program, with data analysis and browsing.

<span class="mw-page-title-main">NodeXL</span> Network analysis and visualization package for Microsoft Excel

NodeXL is a network analysis and visualization software package for Microsoft Excel 2007/2010/2013/2016. The package is similar to other network visualization tools such as Pajek, UCINet, and Gephi. It is widely applied in ring, mapping of vertex and edge, and customizable visual attributes and tags. NodeXL enables researchers to undertake social network analysis work metrics such as centrality, degree, and clustering, as well as monitor relational data and describe the overall relational network structure. When applied to Twitter data analysis, it showed the total network of all users participating in public discussion and its internal structure through data mining. It allows social Network analysis (SNA) to emphasize the relationships rather than the isolated individuals or organizations, allowing interested parties to investigate the two-way dialogue between organizations and the public. SNA also provides a flexible measurement system and parameter selection to confirm the influential nodes in the network, such as in-degree and out-degree centrality. The software contains network visualization, social network analysis features, access to social media network data importers, advanced network metrics, and automation.

BioFabric is an open-source software application for graph drawing. It presents graphs as a node-link diagram, but unlike other graph drawing tools that depict the nodes using discrete symbols, it represents nodes using horizontal lines.

igraph is a library collection for creating and manipulating graphs and analyzing networks. It is written in C and also exists as Python and R packages. There exists moreover an interface for Mathematica. The software is widely used in academic research in network science and related fields. The publication that introduces the software has 5623 citations as of June 5, 2015 according to Google Scholar.

References

  1. 1 2 NetworkX first public release (NX-0.2), From: Aric Hagberg, Date: 12 April 2005, Python-announce-list mailing list
  2. NetworkX initial release, NX-0.2, hagberg – 2005-04-11, Project Info – NetworkX, Registered: 2004-10-21, SourceForge.net
  3. "Release 3.3". 6 April 2024. Retrieved 8 April 2024.
  4. 1 2 3 4 5 6 7 Aric A. Hagberg, Daniel A. Schult, Pieter J. Swart, Exploring Network Structure, Dynamics, and Function using NetworkX, Proceedings of the 7th Python in Science conference (SciPy 2008), G. Varoquaux, T. Vaught, J. Millman (Eds.), pp. 11–15.
  5. van Rossum, Guido (February 1998). "Python Patterns - Implementing Graphs". Python.
  6. "networkx". PyPi Statistics. April 2024.
  7. "igraph". PyPi Statistics. April 2024.
  8. 1 2 3 4 "Old Release Log". NetworkX. 22 August 2020. Retrieved 24 April 2024.
  9. "NetworkX 3.0". NetworkX. 7 January 2023. Retrieved 24 April 2024.
  10. "DiGraph—Directed graphs with self loops — NetworkX 3.3 documentation". networkx.org. Retrieved 2024-04-24.
  11. "Graph—Undirected graphs with self loops — NetworkX 3.3 documentation". networkx.org. Retrieved 2024-04-24.
  12. "MultiGraph—Undirected graphs with self loops and parallel edges — NetworkX 3.3 documentation". networkx.org. Retrieved 2024-04-24.
  13. "MultiDiGraph—Directed graphs with self loops and parallel edges — NetworkX 3.3 documentation". networkx.org. Retrieved 2024-04-24.
  14. "spring_layout — NetworkX 3.3 documentation". networkx.org. Retrieved 2024-05-02.
  15. "spectral_layout — NetworkX 3.3 documentation". networkx.org. Retrieved 2024-05-02.
  16. "circular_layout — NetworkX 3.3 documentation". networkx.org. Retrieved 2024-05-02.
  17. "shell_layout — NetworkX 3.3 documentation". networkx.org. Retrieved 2024-05-02.
  18. "kamada_kawai_layout — NetworkX 3.3 documentation". networkx.org. Retrieved 2024-05-02.
  19. Aric Hagberg, Drew Conway, "Hacking social networks using the Python programming language (Module II – Why do SNA in NetworkX)", Sunbelt 2010: International Network for Social Network Analysis.
  20. Hadaj, P.; Strzałka, D.; Nowak, M. (19 October 2022). "The use of PLANS and NetworkX in modeling power grid system failures". Sci Rep. 12.
  21. "Shortest Paths — NetworkX 3.3 documentation". networkx.org. Retrieved 2024-04-29.
  22. Wiltshire, Serge W. (March 9, 2018). "Using an agent-based model to evaluate the effect of producer specialization on the epidemiological resilience of livestock production networks". PLOS ONE.
  23. "SageMath Mathematical Software System - Sage".