HPX

Developer(s) The STEllAR Group, LSU Center for Computation and Technology
Initial release 2008
Stable release 1.9.0 / May 3, 2023
Repository github.com/STEllAR-GROUP/hpx
Written in C++
Operating system Microsoft Windows, Linux, Mac OS X
Type Partitioned global address space, parallel programming, runtime system
License Boost Software License [1]
Website stellar-group.github.io/hpx/docs/sphinx/latest/html/index.html

HPX, short for High Performance ParalleX, is a runtime system for high-performance computing. It is under active development by the STE||AR group [2] at Louisiana State University. Focused on scientific computing, it provides an alternative execution model to conventional approaches such as MPI. HPX aims to overcome the challenges MPI faces on increasingly large supercomputers by using asynchronous communication between nodes and lightweight control objects instead of global barriers, which allows application developers to exploit fine-grained parallelism. [3] [4] [5]
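
The idea of replacing a global barrier with lightweight control objects can be illustrated with futures. The following is a minimal sketch in standard C++ only, using `std::async` and `std::future` as stand-ins; it does not use the HPX API, although HPX offers counterparts such as `hpx::async` and `hpx::future`. The function names `sum_range` and `parallel_sum` are hypothetical and chosen for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <future>
#include <numeric>
#include <vector>

// Sum one chunk of the input; stands in for any fine-grained task.
long long sum_range(const std::vector<int>& data, std::size_t begin, std::size_t end) {
    auto first = data.begin() + static_cast<std::ptrdiff_t>(begin);
    auto last = data.begin() + static_cast<std::ptrdiff_t>(end);
    return std::accumulate(first, last, 0LL);
}

// Launch one asynchronous task per chunk and combine the partial results.
// Each future is an independent control object: the caller waits only on
// the value it needs next, and no barrier across all tasks is involved.
long long parallel_sum(const std::vector<int>& data, std::size_t chunks) {
    assert(chunks > 0);  // precondition assumed by this sketch
    const std::size_t step = (data.size() + chunks - 1) / chunks;  // ceiling division

    std::vector<std::future<long long>> parts;
    for (std::size_t begin = 0; begin < data.size(); begin += step) {
        const std::size_t end = std::min(begin + step, data.size());
        parts.push_back(
            std::async(std::launch::async, sum_range, std::cref(data), begin, end));
    }

    long long total = 0;
    for (auto& part : parts) {
        total += part.get();  // blocks only until this one task has finished
    }
    return total;
}
```

Calling `parallel_sum` on a vector holding the integers 1 through 1000 with four chunks returns 500500. A runtime such as HPX applies the same pattern with much lighter-weight tasks and across nodes, which this single-process sketch does not attempt to model.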

HPX is developed in idiomatic C++ and released as open source under the Boost Software License, which allows usage in commercial applications.

Applications

Though designed as a general-purpose environment for high-performance computing, HPX has primarily been used in

References

  1. "License", Boost Software License – Version 1.0, boost.org, retrieved 2012-07-30
  2. "About the STE||AR Group". Retrieved 17 April 2019.
  3. Kaiser, Hartmut; Brodowicz, Maciek; Sterling, Thomas (2009). "ParalleX an Advanced Parallel Execution Model for Scaling-Impaired Applications". 2009 International Conference on Parallel Processing Workshops. pp. 394–401. doi:10.1109/icppw.2009.14. ISBN 978-1-4244-4923-1. S2CID 898158.
  4. Wagle, Bibek; Kellar, Samuel; Serio, Adrian; Kaiser, Hartmut (2018). "Methodology for Adaptive Active Message Coalescing in Task Based Runtime Systems". 2018 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW). pp. 1133–1140. doi:10.1109/IPDPSW.2018.00173. ISBN 978-1-5386-5555-9. S2CID 51921994.
  5. Wagle, Bibek; Monil, Mohammad Alaul Haque; Huck, Kevin; Malony, Allen D.; Serio, Adrian; Kaiser, Hartmut (2019). "Runtime Adaptive Task Inlining on Asynchronous Multitasking Runtime Systems". Proceedings of the 48th International Conference on Parallel Processing. pp. 1–10. doi:10.1145/3337821.3337915. ISBN 9781450362955. S2CID 198963569.
  6. C. Dekate, M. Anderson, M. Brodowicz, H. Kaiser, B. Adelstein-Lelbach and T. Sterling (2012). "Improving the Scalability of Parallel N-body Applications with an Event-driven Constraint-based Execution Model". International Journal of High Performance Computing Applications. 26 (3): 319–332. arXiv:1109.5190. doi:10.1177/1094342012440585. S2CID 9556798.
  7. M. Anderson, T. Sterling, H. Kaiser and D. Neilsen (2011). "Neutron Star Evolutions using Tabulated Equations of State with a New Execution Model" (PDF). American Physical Society April 2012 Meeting.
  8. D. Pfander, G. Daiß, D. Marcello, H. Kaiser, D. Pflüger (2018). "Accelerating Octo-Tiger: Stellar Mergers on Intel Knights Landing with HPX". DHPCC++ Conference 2018 Hosted by IWOCL. doi:10.1145/3204919.3204938. S2CID 21126354.
  9. Marcello, Dominic; Daiß, Gregor; Parsa Amini; Kaiser, Hartmut; Diehl, Patrick; Wash, Bryce Adelstein Lelbach Aka; Heller, Thomas; Shibersag; Huck, Kevin; Biddiscombe, John; Schäfer, Andreas (2019-04-17), STEllAR-GROUP/octotiger Repository on GitHub, The STE||AR Group, doi:10.5281/zenodo.5093174, retrieved 2019-04-17
  10. Heller, Thomas; Lelbach, Bryce Adelstein; Huck, Kevin A; Biddiscombe, John; Grubel, Patricia; Koniges, Alice E; Kretz, Matthias; Marcello, Dominic; Pfander, David (2019-02-14). "Harnessing billions of tasks for a scalable portable hydrodynamic simulation of the merger of two stars". The International Journal of High Performance Computing Applications. 33 (4): 699–715. doi:10.1177/1094342018819744. ISSN 1094-3420. OSTI 1524389.
  11. "LibGeoDecomp – Petascale Computer Simulations". www.libgeodecomp.org. Retrieved 2019-04-17.
  12. A library for C++/Fortran computer simulations (e.g. stencil codes, mesh-free, unstructured grids, n-body & particle methods). Scales from smartphones to petascale supercomputers (e.g. Titan, T.., The STE||AR Group, 2019-04-06, retrieved 2019-04-17
  13. A. Schäfer, D. Fey (2008). "LibGeoDecomp: A Grid-Enabled Library for Geometric Decomposition Codes". Recent Advances in Parallel Virtual Machine and Message Passing Interface. Lecture Notes in Computer Science. Vol. 5205. pp. 285–294. doi:10.1007/978-3-540-87475-1_39. ISBN 978-3-540-87474-4.
  14. Diehl, Patrick; Jha, Prashant K.; Kaiser, Hartmut; Lipton, Robert; Levesque, Martin (2020). "An asynchronous and task-based implementation of peridynamics utilizing HPX—the C++ standard library for parallelism and concurrency". SN Applied Sciences. 2 (12). arXiv: 1806.06917 . doi: 10.1007/s42452-020-03784-x . S2CID   227240479.
  15. "Phylanx – A Distributed Array Toolkit". Retrieved 2019-04-17.
  16. An Asynchronous Distributed C++ Array Processing Toolkit: STEllAR-GROUP/phylanx, The STE||AR Group, 2019-04-16, retrieved 2019-04-17
  17. Tohid, R.; Wagle, Bibek; Shirzad, Shahrzad; Diehl, Patrick; Serio, Adrian; Kheirkhahan, Alireza; Amini, Parsa; Williams, Katy; Isaacs, Kate; Huck, Kevin; Brandt, Steven; Kaiser, Hartmut (2018). "Asynchronous Execution of Python Code on Task-Based Runtime Systems". 2018 IEEE/ACM 4th International Workshop on Extreme Scale Programming Models and Middleware (ESPM2). pp. 37–45. arXiv:1810.07591. doi:10.1109/ESPM2.2018.00009. ISBN 978-1-72810-178-1. S2CID 52988499.