NESL

Paradigm: parallel, functional, array
Developer: SCandAL project
First appeared: 1993
Stable release: NESL 3.1 / November 1995
License: permissive license similar to the ISC and X11 licenses

NESL is a parallel programming language developed at Carnegie Mellon by the SCandAL project and released in 1993. It integrates various ideas from parallel algorithms, functional programming, and array programming languages.

The most important new ideas behind NESL are:

Nested data parallelism: this feature offers the benefits of data parallelism, namely concise code that is easy to understand and debug, while being well suited for irregular algorithms, such as algorithms on trees, graphs, or sparse matrices.

A language-based performance model: this gives a formal way to calculate the work and depth of a program. These measures can be related to running time on parallel machines.

The main design guideline for NESL was to make parallel programming easy and portable. Algorithms are typically significantly more concise in NESL than in most other parallel programming languages, and the code closely resembles high-level pseudocode.
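
As an illustration of this conciseness, here is a parallel quicksort adapted from Blelloch's NESL paper [1]; the three filtering steps each run in parallel over the sequence, and the two recursive calls run in parallel with each other:

    function quicksort(a) =
      if (#a < 2) then a
      else
        let pivot   = a[#a/2];
            lesser  = {e in a | e < pivot};
            equal   = {e in a | e == pivot};
            greater = {e in a | e > pivot};
            result  = {quicksort(v) : v in [lesser, greater]};
        in result[0] ++ equal ++ result[1];

The apply-to-each over [lesser, greater] is an instance of nested data parallelism: a parallel call whose body is itself a parallel computation.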

NESL handles nested data parallelism by using the flattening transformation to convert nested data parallelism to flat data parallelism. This works by storing a nested vector as two separate parts: a flat vector of the data itself and a segment descriptor recording the lengths of the subvectors.[1] This flattening transform, however, can increase the asymptotic work and space complexity of the original program, leading to a much less efficient result.[2]
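
For example, the nested sequence [[1, 2], [3], [4, 5, 6]] is represented as the flat data vector [1, 2, 3, 4, 5, 6] together with the segment descriptor [2, 1, 3] giving the subvector lengths. A nested apply-to-each such as the following (a minimal sketch) is then compiled into a single segmented sum over the flat data vector, so that all elements are processed in one flat parallel step:

    {sum(row) : row in [[1, 2], [3], [4, 5, 6]]};   % evaluates to [3, 3, 15] %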

Influences

NESL heavily influenced Data Parallel Haskell.[3]

Related Research Articles

In computer science, functional programming is a programming paradigm where programs are constructed by applying and composing functions. It is a declarative programming paradigm in which function definitions are trees of expressions that map values to other values, rather than a sequence of imperative statements which update the running state of the program.
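
NESL itself follows this style: a program is built by defining and composing pure functions. A minimal sketch, where square and fourth are hypothetical names used only for illustration:

    function square(x) = x * x;
    function fourth(x) = square(square(x));

    fourth(3);   % evaluates to 81 %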

Single instruction, multiple data (SIMD) is a type of parallel processing in Flynn's taxonomy. It describes computers with multiple processing elements that perform the same operation on multiple data points simultaneously. SIMD can be internal to a processor and can be exposed through an instruction set architecture (ISA), but it should not be confused with an ISA itself.

Generic programming is a style of computer programming in which algorithms are written in terms of types to-be-specified-later that are then instantiated when needed for specific types provided as parameters. This approach, pioneered by the ML programming language in 1973, permits writing common functions or types that differ only in the set of types on which they operate when used, thus reducing duplication. Such software entities are known as generics in Ada, C#, Delphi, Eiffel, F#, Java, Nim, Python, Go, Rust, Swift, TypeScript and Visual Basic .NET. They are known as parametric polymorphism in ML, Scala, Julia, and Haskell; templates in C++ and D; and parameterized types in the influential 1994 book Design Patterns.
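
In NESL this takes the form of parametric polymorphism: a function such as the hypothetical head below is written once and works on sequences of any element type (a minimal sketch):

    function head(s) = s[0];

    head([10, 20, 30]);    % evaluates to 10 %
    head([[1, 2], [3]]);   % evaluates to [1, 2] %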

Parallel computing is a type of computation in which many calculations or processes are carried out simultaneously. Large problems can often be divided into smaller ones, which can then be solved at the same time. There are several different forms of parallel computing: bit-level, instruction-level, data, and task parallelism. Parallelism has long been employed in high-performance computing, but has gained broader interest due to the physical constraints preventing frequency scaling. As power consumption by computers has become a concern in recent years, parallel computing has become the dominant paradigm in computer architecture, mainly in the form of multi-core processors.

In computer science, a deterministic algorithm is an algorithm that, given a particular input, will always produce the same output, with the underlying machine always passing through the same sequence of states. Deterministic algorithms are by far the most studied and familiar kind of algorithm, as well as one of the most practical, since they can be run on real machines efficiently.

In computer science, array programming refers to solutions which allow the application of operations to an entire set of values at once. Such solutions are commonly used in scientific and engineering settings.
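
NESL expresses this with the apply-to-each construct, which applies an operation across whole sequences at once; a minimal sketch:

    {a + b : a in [1, 2, 3]; b in [10, 20, 30]};   % evaluates to [11, 22, 33] %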

X10 is a programming language being developed by IBM at the Thomas J. Watson Research Center as part of the Productive, Easy-to-use, Reliable Computing System (PERCS) project funded by DARPA's High Productivity Computing Systems (HPCS) program.

In computing, a parallel programming model is an abstraction of parallel computer architecture, with which it is convenient to express algorithms and their composition in programs. The value of a programming model can be judged on its generality: how well a range of different problems can be expressed for a variety of different architectures, and its performance: how efficiently the compiled programs can execute. The implementation of a parallel programming model can take the form of a library invoked from a sequential language, as an extension to an existing language, or as an entirely new language.

Automatic parallelization, also auto parallelization, or autoparallelization refers to converting sequential code into multi-threaded and/or vectorized code in order to use multiple processors simultaneously in a shared-memory multiprocessor (SMP) machine. Fully automatic parallelization of sequential programs is a challenge because it requires complex program analysis and the best approach may depend upon parameter values that are not known at compilation time.

Concurrent computing is a form of computing in which several computations are executed concurrently—during overlapping time periods—instead of sequentially—with one completing before the next starts.

In computer science, implicit parallelism is a characteristic of a programming language that allows a compiler or interpreter to automatically exploit the parallelism inherent to the computations expressed by some of the language's constructs. A pure implicitly parallel language does not need special directives, operators or functions to enable parallel execution, as opposed to explicit parallelism.
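
NESL is itself implicitly parallel: an apply-to-each like the following carries no directives or annotations, and the compiler and runtime are free to execute the elementwise calls in parallel:

    {sqrt(x) : x in [1.0, 4.0, 9.0]};   % evaluates to [1.0, 2.0, 3.0] %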

Data parallelism is parallelization across multiple processors in parallel computing environments. It focuses on distributing the data across different nodes, which operate on the data in parallel. It can be applied on regular data structures like arrays and matrices by working on each element in parallel. It contrasts with task parallelism, another form of parallelism.
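
In NESL, for instance, scaling every element of a matrix represented as a nested sequence of rows is data parallel across both the rows and their elements (a minimal sketch):

    {{2 * x : x in row} : row in [[1, 2], [3, 4]]};   % evaluates to [[2, 4], [6, 8]] %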

In computing, algorithmic skeletons, or parallelism patterns, are a high-level parallel programming model for parallel and distributed computing.

Guy Edward Blelloch is a professor of computer science at Carnegie Mellon University. He is known for his work in parallel programming and parallel algorithms. He teaches the 15-853: Algorithms in the Real World course, the 15-492: Parallel Algorithms course, and the 15-210: Parallel and Sequential Data Structures and Algorithms course at Carnegie Mellon University. In 2011 he was inducted as a Fellow of the Association for Computing Machinery.

For several years parallel hardware was available only for distributed computing, but recently it has become available for low-end computers as well. It has therefore become inevitable for software programmers to write parallel applications. Programmers naturally think sequentially, however, and are less acquainted with writing multi-threaded or parallel processing applications. Parallel programming requires handling issues such as synchronization and deadlock avoidance, so writing such applications demands expertise beyond the application domain itself. Programmers therefore prefer to write sequential code, which most popular programming languages support and which lets them concentrate on the application. This creates a need for automated tools that convert sequential applications into parallel ones, a need that is far from trivial given the large amount of legacy code written over the past few decades that must be reused and parallelized.

SequenceL is a general purpose functional programming language and auto-parallelizing compiler and tool set, whose primary design objectives are performance on multi-core processor hardware, ease of programming, platform portability/optimization, and code clarity and readability. Its main advantage is that it can be used to write straightforward code that automatically takes full advantage of all the processing power available, without programmers needing to be concerned with identifying parallelisms, specifying vectorization, avoiding race conditions, and other challenges of manual directive-based programming approaches such as OpenMP.

In parallel computing, work stealing is a scheduling strategy for multithreaded computer programs. It solves the problem of executing a dynamically multithreaded computation, one that can "spawn" new threads of execution, on a statically multithreaded computer, with a fixed number of processors. It does so efficiently in terms of execution time, memory usage, and inter-processor communication.

In computing, a memory access pattern or IO access pattern is the pattern with which a system or program reads and writes memory on secondary storage. These patterns differ in the level of locality of reference and drastically affect cache performance, and also have implications for the approach to parallelism and distribution of workload in shared memory systems. Further, cache coherency issues can affect multiprocessor performance, which means that certain memory access patterns place a ceiling on parallelism.

Futhark is a functional data parallel array programming language originally developed at DIKU as part of the HIPERFIT project. It focuses on enabling data parallel programs written in a functional style to be executed with high performance on massively parallel hardware, in particular on GPUs. Futhark is strongly inspired by NESL, but imposes constraints on how parallelism can be expressed in order to enable more aggressive compiler optimisations. In particular, irregular nested data parallelism is not supported.

The flattening transformation is an algorithm that transforms nested data parallelism into flat data parallelism. It was pioneered by Guy Blelloch as part of the NESL programming language. The flattening transformation is also sometimes called vectorization, but is completely unrelated to automatic vectorization. The original flattening algorithm was concerned solely with first-order multidimensional arrays containing primitive types, but was extended to handle higher-order and recursive data types in the work on Data Parallel Haskell.

References

  1. Blelloch, Guy (1995). "NESL: A Nested Data-Parallel Language".
  2. Spoonhower, Daniel; Harper; Blelloch; Gibbons (2008). "Space profiling for parallel functional programs".
  3. Peyton Jones, Simon. "Data Parallel Haskell". YouTube. Retrieved 6 September 2011.