Association list

Last updated
Association list
Type associative array
Time complexity in big O notation
OperationAverageWorst case
Search O(n) O(n)
Insert O(1) O(1)
Delete O(n) O(n)
Space complexity
Space O(n) O(n)

In computer programming and particularly in Lisp, an association list, often referred to as an alist, is a linked list in which each list element (or node) comprises a key and a value. The association list is said to associate the value with the key. In order to find the value associated with a given key, a sequential search is used: each element of the list is searched in turn, starting at the head, until the key is found. Associative lists provide a simple way of implementing an associative array, but are efficient only when the number of keys is very small.

Contents

Operation

An associative array is an abstract data type that can be used to maintain a collection of key–value pairs and look up the value associated with a given key. The association list provides a simple way of implementing this data type.

To test whether a key is associated with a value in a given association list, search the list starting at its first node and continuing either until a node containing the key has been found or until the search reaches the end of the list (in which case the key is not present). To add a new key–value pair to an association list, create a new node for that key-value pair, set the node's link to be the previous first element of the association list, and replace the first element of the association list with the new node. [1] Although some implementations of association lists disallow having multiple nodes with the same keys as each other, such duplications are not problematic for this search algorithm: duplicate keys that appear later in the list are ignored. [2]

It is also possible to delete a key from an association list, by scanning the list to find each occurrence of the key and splicing the nodes containing the key out of the list. [1] The scan should continue to the end of the list, even when the key is found, in case the same key may have been inserted multiple times.

Performance

The disadvantage of association lists is that the time to search is O(n) , where n is the length of the list. [3] For large lists, this may be much slower than the times that can be obtained by representing an associative array as a binary search tree or as a hash table. Additionally, unless the list is regularly pruned to remove elements with duplicate keys, multiple values associated with the same key will increase the size of the list, and thus the time to search, without providing any compensatory advantage.

One advantage of association lists is that a new element can be added in constant time. Additionally, when the number of keys is very small, searching an association list may be more efficient than searching a binary search tree or hash table, because of the greater simplicity of their implementation. [4]

Applications and software libraries

In the early development of Lisp, association lists were used to resolve references to free variables in procedures. [5] [6] In this application, it is convenient to augment association lists with an additional operation, that reverses the addition of a key–value pair without scanning the list for other copies of the same key. In this way, the association list can function as a stack, allowing local variables to temporarily shadow other variables with the same names, without destroying the values of those other variables. [7]

Many programming languages, including Lisp, [5] Scheme, [8] OCaml, [9] and Haskell [10] have functions for handling association lists in their standard libraries.

See also

Related Research Articles

In computer science, an array is a data structure consisting of a collection of elements, of same memory size, each identified by at least one array index or key. An array is stored such that the position of each element can be computed from its index tuple by a mathematical formula. The simplest type of data structure is a linear array, also called one-dimensional array.

<span class="mw-page-title-main">Binary search algorithm</span> Search algorithm finding the position of a target value within a sorted array

In computer science, binary search, also known as half-interval search, logarithmic search, or binary chop, is a search algorithm that finds the position of a target value within a sorted array. Binary search compares the target value to the middle element of the array. If they are not equal, the half in which the target cannot lie is eliminated and the search continues on the remaining half, again taking the middle element to compare to the target value, and repeating this until the target value is found. If the search ends with the remaining half being empty, the target is not in the array.

<span class="mw-page-title-main">Data structure</span> Particular way of storing and organizing data in a computer

In computer science, a data structure is a data organization, management, and storage format that is usually chosen for efficient access to data. More precisely, a data structure is a collection of data values, the relationships among them, and the functions or operations that can be applied to the data, i.e., it is an algebraic structure about data.

<span class="mw-page-title-main">Hash table</span> Associative array for storing key-value pairs

In computing, a hash table, also known as a hash map or a hash set, is a data structure that implements an associative array, also called a dictionary, which is an abstract data type that maps keys to values. A hash table uses a hash function to compute an index, also called a hash code, into an array of buckets or slots, from which the desired value can be found. During lookup, the key is hashed and the resulting hash indicates where the corresponding value is stored.

In computer science, a linked list is a linear collection of data elements whose order is not given by their physical placement in memory. Instead, each element points to the next. It is a data structure consisting of a collection of nodes which together represent a sequence. In its most basic form, each node contains data, and a reference to the next node in the sequence. This structure allows for efficient insertion or removal of elements from any position in the sequence during iteration. More complex variants add additional links, allowing more efficient insertion or removal of nodes at arbitrary positions. A drawback of linked lists is that data access time is linear in respect to the number of nodes in the list. Because nodes are serially linked, accessing any node requires that the prior node be accessed beforehand. Faster access, such as random access, is not feasible. Arrays have better cache locality compared to linked lists.

<span class="mw-page-title-main">Trie</span> K-ary search tree data structure

In computer science, a trie, also called digital tree or prefix tree, is a type of k-ary search tree, a tree data structure used for locating specific keys from within a set. These keys are most often strings, with links between nodes defined not by the entire key, but by individual characters. In order to access a key, the trie is traversed depth-first, following the links between nodes, which represent each character in the key.

In computer science, an associative array, map, symbol table, or dictionary is an abstract data type that stores a collection of pairs, such that each possible key appears at most once in the collection. In mathematical terms, an associative array is a function with finite domain. It supports 'lookup', 'remove', and 'insert' operations.

In computer science, a set is an abstract data type that can store unique values, without any particular order. It is a computer implementation of the mathematical concept of a finite set. Unlike most other collection types, rather than retrieving a specific element from a set, one typically tests a value for membership in a set.

In computer science, a list or sequence is an abstract data type that represents a finite number of ordered values, where the same value may occur more than once. An instance of a list is a computer representation of the mathematical concept of a tuple or finite sequence; the (potentially) infinite analog of a list is a stream. Lists are a basic example of containers, as they contain other values. If the same value occurs multiple times, each occurrence is considered a distinct item.

A list comprehension is a syntactic construct available in some programming languages for creating a list based on existing lists. It follows the form of the mathematical set-builder notation as distinct from the use of map and filter functions.

<span class="mw-page-title-main">Self-balancing binary search tree</span> Any node-based binary search tree that automatically keeps its height the same

In computer science, a self-balancing binary search tree (BST) is any node-based binary search tree that automatically keeps its height small in the face of arbitrary item insertions and deletions. These operations when designed for a self-balancing binary search tree, contain precautionary measures against boundlessly increasing tree height, so that these abstract data structures receive the attribute "self-balancing".

<span class="mw-page-title-main">Adjacency list</span> Data structure representing a graph

In graph theory and computer science, an adjacency list is a collection of unordered lists used to represent a finite graph. Each unordered list within an adjacency list describes the set of neighbors of a particular vertex in the graph. This is one of several commonly used representations of graphs for use in computer programs.

A Bloom filter is a space-efficient probabilistic data structure, conceived by Burton Howard Bloom in 1970, that is used to test whether an element is a member of a set. False positive matches are possible, but false negatives are not – in other words, a query returns either "possibly in set" or "definitely not in set". Elements can be added to the set, but not removed ; the more items added, the larger the probability of false positives.

In computing, a persistent data structure or not ephemeral data structure is a data structure that always preserves the previous version of itself when it is modified. Such data structures are effectively immutable, as their operations do not (visibly) update the structure in-place, but instead always yield a new updated structure. The term was introduced in Driscoll, Sarnak, Sleator, and Tarjan's 1986 article.

<span class="mw-page-title-main">Radix tree</span> Data structure

In computer science, a radix tree is a data structure that represents a space-optimized trie in which each node that is the only child is merged with its parent. The result is that the number of children of every internal node is at most the radix r of the radix tree, where r is a positive integer and a power x of 2, having x ≥ 1. Unlike regular trees, edges can be labeled with sequences of elements as well as single elements. This makes radix trees much more efficient for small sets and for sets of strings that share long prefixes.

In computer science, a multimap is a generalization of a map or associative array abstract data type in which more than one value may be associated with and returned for a given key. Both map and multimap are particular cases of containers. Often the multimap is implemented as a map with lists or sets as the map values.

In computer programming, a collection is a grouping of some variable number of data items that have some shared significance to the problem being solved and need to be operated upon together in some controlled fashion. Generally, the data items will be of the same type or, in languages supporting inheritance, derived from some common ancestor type. A collection is a concept applicable to abstract data types, and does not prescribe a specific implementation as a concrete data structure, though often there is a conventional choice.

This Comparison of programming languages (associative arrays) compares the features of associative array data structures or array-lookup processing for over 40 computer programming languages.

A concurrent hash-trie or Ctrie is a concurrent thread-safe lock-free implementation of a hash array mapped trie. It is used to implement the concurrent map abstraction. It has particularly scalable concurrent insert and remove operations and is memory-efficient. It is the first known concurrent data-structure that supports O(1), atomic, lock-free snapshots.

References

  1. 1 2 Marriott, Kim; Stuckey, Peter J. (1998). Programming with Constraints: An Introduction. MIT Press. pp. 193–195. ISBN   9780262133418.
  2. Frické, Martin (2012). "2.8.3 Association Lists". Logic and the Organization of Information. Springer. pp. 44–45. ISBN   9781461430872.
  3. Knuth, Donald. "6.1 Sequential Searching". The Art of Computer Programming, Vol. 3: Sorting and Searching (2nd ed.). Addison Wesley. pp. 396–405. ISBN   0-201-89685-0.
  4. Janes, Calvin (2011). "Using Association Lists for Associative Arrays". Developer's Guide to Collections in Microsoft .NET. Pearson Education. p. 191. ISBN   9780735665279.
  5. 1 2 McCarthy, John; Abrahams, Paul W.; Edwards, Daniel J.; Hart, Timothy P.; Levin, Michael I. (1985). LISP 1.5 Programmer's Manual . MIT Press. ISBN   0-262-13011-4. See in particular p. 12 for functions that search an association list and use it to substitute symbols in another expression, and p. 103 for the application of association lists in maintaining variable bindings.
  6. van de Snepscheut, Jan L. A. (1993). What Computing Is All About. Monographs in Computer Science. Springer. p. 201. ISBN   9781461227106.
  7. Scott, Michael Lee (2000). "3.3.4 Association Lists and Central Reference Tables". Programming Language Pragmatics. Morgan Kaufmann. p. 137. ISBN   9781558604421.
  8. Pearce, Jon (2012). Programming and Meta-Programming in Scheme. Undergraduate Texts in Computer Science. Springer. p. 214. ISBN   9781461216827.
  9. Minsky, Yaron; Madhavapeddy, Anil; Hickey, Jason (2013). Real World OCaml: Functional Programming for the Masses. O'Reilly Media. p. 253. ISBN   9781449324766.
  10. O'Sullivan, Bryan; Goerzen, John; Stewart, Donald Bruce (2008). Real World Haskell: Code You Can Believe In. O'Reilly Media. p. 299. ISBN   9780596554309.
  11. "10.1. The Property List". Cs.cmu.edu. Retrieved 29 September 2017.