Peek (data type operation)

Last updated

In computer science, peek is an operation on certain abstract data types, specifically sequential collections such as stacks and queues, which returns the value of the top ("front") of the collection without removing the element from the collection. It thus returns the same value as operations such as "pop" or "dequeue", but does not modify the data.

Computer science Study of the theoretical foundations of information and computation

Computer science is the study of processes that interact with data and that can be represented as data in the form of programs. It enables the use of algorithms to manipulate, store, and communicate digital information. A computer scientist studies the theory of computation and the practice of designing software systems.

In computer science, an abstract data type (ADT) is a mathematical model for data types, where a data type is defined by its behavior (semantics) from the point of view of a user of the data, specifically in terms of possible values, possible operations on data of this type, and the behavior of these operations. This contrasts with data structures, which are concrete representations of data, and are the point of view of an implementer, not a user.

In computer science, a collection or container is a grouping of some variable number of data items that have some shared significance to the problem being solved and need to be operated upon together in some controlled fashion. Generally, the data items will be of the same type or, in languages supporting inheritance, derived from some common ancestor type. A collection is a concept applicable to abstract data types, and does not prescribe a specific implementation as a concrete data structure, though often there is a conventional choice.

Contents

The name "peek" is similar to the basic "push" and "pop" operations on a stack, but the name for this operation varies depending on data type and language. Peek is generally considered an inessential operation, compared with the more basic operations of adding and removing data, and as such is not included in the basic definition of these data types. However, since it is a useful operation and generally easily implemented, it is frequently included in practices, and in some definitions peek is included as basic, with pop (or analog) defined in terms of peek; see abstract definition.

Data types

Sequential types for which peek is often implemented include:

Stack (abstract data type) abstract data type

In computer science, a stack is an abstract data type that serves as a collection of elements, with two principal operations:

Queue (abstract data type) abstract data type

In computer science, a queue is a collection in which the entities in the collection are kept in order and the principal operations on the collection are the addition of entities to the rear terminal position, known as enqueue, and removal of entities from the front terminal position, known as dequeue. This makes the queue a First-In-First-Out (FIFO) data structure. In a FIFO data structure, the first element added to the queue will be the first one to be removed. This is equivalent to the requirement that once a new element is added, all elements that were added before have to be removed before the new element can be removed. Often a peek or front operation is also entered, returning the value of the front element without dequeuing it. A queue is an example of a linear data structure, or more abstractly a sequential collection.

In computer science, a priority queue is an abstract data type which is like a regular queue or stack data structure, but where additionally each element has a "priority" associated with it. In a priority queue, an element with high priority is served before an element with low priority. In some implementations, if two elements have the same priority, they are served according to the order in which they were enqueued, while in other implementations, ordering of elements with the same priority is undefined.

Single-ended types, such as stack, generally only admit a single peek, at the end that is modified. Double-ended types, such as deques, admit two peeks, one at each end.

Names for peek vary. "Peek" or "top" are common for stacks, while for queues "front" is common. Operations on deques have varied names, often "front" and "back" or "first" and "last". The name "peak" is also occasionally found, in the sense of "top, summit", though this also occurs as a spelling error for the verb "peek".

Abstract definition

Intuitively, peek returns the same value as pop, but does not change the data. Behavior when the collection is empty varies – most often this yields an underflow error, identically to a pop on an empty collection, but some implementations provide a function which instead simply returns (without error), essentially implementing if isempty then return, else peek.

This behavior can be axiomatized in various ways. For example, a common VDM ( Vienna Development Method ) description of a stack defines top (peek) and remove as atomic, where top returns the top value (without modifying the stack), and remove modifies the stack (without returning a value). [1] In this case pop is defined in terms of top and remove.

The Vienna Development Method (VDM) is one of the longest-established formal methods for the development of computer-based systems. Originating in work done at the IBM Laboratory Vienna in the 1970s, it has grown to include a group of techniques and tools based on a formal specification language—the VDM Specification Language (VDM-SL). It has an extended form, VDM++, which supports the modeling of object-oriented and concurrent systems. Support for VDM includes commercial and academic tools for analyzing models, including support for testing and proving properties of models and generating program code from validated VDM models. There is a history of industrial usage of VDM and its tools and a growing body of research in the formalism has led to notable contributions to the engineering of critical systems, compilers, concurrent systems and in logic for computer science.

Alternatively, given pop, the operation peek can be axiomatized as:

peek(D) = pop(D)
peek(D), D = D

meaning "returns the same value as pop", and "does not change the underlying data" (value of data after peek same as before peek).

Implementation

Peek can generally be implemented very easily in simple routine taking O(1) time and no added space, by a simple variant of the pop operation. Most sequential data types are implemented by a data structure containing a reference to the end, and thus peek is simply implemented by dereferencing this. In some cases it is more complicated, however.

In computer science, a reference is a value that enables a program to indirectly access a particular datum, such as a variable's value or a record, in the computer's memory or in some other storage device. The reference is said to refer to the datum, and accessing the datum is called dereferencing the reference.

The dereference operator or indirection operator, sometimes denoted by "*", is a unary operator found in C-like languages that include pointer variables. It operates on a pointer variable, and returns an l-value equivalent to the value at the pointer address. This is called "dereferencing" the pointer. For example, the C code

For some data types, such as stacks, this can be replicated in terms of more basic operations, but for other data types, such as queues, it cannot. Even if peek can be replicated in terms of other operations, it is almost always more efficient to implement it separately, as this avoids modifying the data, and it is easy to implement, as this simply consists of returning the same value as the "pop" (or analogous operation), but then not modifying the data.

For the stack, priority queue, deque, and DEPQ types, peek can be implemented in terms of pop and push (if done at same end). For stacks and deques this is generally efficient, as these operations are O(1) in most implementations, and do not require memory allocation (as they decrease the size of the data) – the two ends of a deque each functioning as a stack. For priority queues and DEPQs, however, dequeuing and enqueuing often take O(log n) time (for example if implemented as a binary heap), while O(1) performance of "peek" (here generally called "find-min" or "find-max") is a key desired characteristic of priority queues, and thus peek is almost invariably implemented separately.

For queue, because enqueuing and dequeuing occur at opposite ends, peek cannot be implemented in terms of basic operations, and thus is often implemented separately.

One case in which peek is not trivial is in an ordered list type (i.e., elements accessible in order) implemented by a self-balancing binary search tree. In this case find-min or find-max take O(log n) time, as does access to any other element. Making find-min or find-max take O(1) time can be done by caching the min or max values, but this adds overhead to the data structure and to the operations of adding or removing elements.

Related Research Articles

In computer science, a double-ended queue is an abstract data type that generalizes a queue, for which elements can be added to or removed from either the front (head) or back (tail). It is also often called a head-tail linked list, though properly this refers to a specific data structure implementation of a deque.

FIFO (computing and electronics) scheduling algorithm

FIFO is an acronym for first in, first out, a method for organising and manipulating a data buffer, where the oldest (first) entry, or 'head' of the queue, is processed first. It is analogous to processing a queue with first-come, first-served (FCFS) behaviour: where the people leave the queue in the order in which they arrive.

Heap (data structure) tree-based data structure in computer science

In computer science, a heap is a specialized tree-based data structure which is essentially an almost complete tree that satisfies the heap property: in a max heap, for any given node C, if P is a parent node of C, then the key of P is greater than or equal to the key of C. In a min heap, the key of P is less than or equal to the key of C. The node at the "top" of the heap is called the root node.

Dijkstras algorithm graph search algorithm

Dijkstra's algorithm is an algorithm for finding the shortest paths between nodes in a graph, which may represent, for example, road networks. It was conceived by computer scientist Edsger W. Dijkstra in 1956 and published three years later.

Data type classification of data in computer science

In computer science and computer programming, a data type or simply type is an attribute of data which tells the compiler or interpreter how the programmer intends to use the data. Most programming languages support common data types of real, integer and boolean. A data type constrains the values that an expression, such as a variable or a function, might take. This data type defines the operations that can be done on the data, the meaning of the data, and the way values of that type can be stored. A type of value from which an expression may take its value.

In computer science, a semaphore is a variable or abstract data type used to control access to a common resource by multiple processes in a concurrent system such as a multitasking operating system. A semaphore is simply a variable. This variable is used to solve critical section problems and to achieve process synchronization in the multi processing environment. A trivial semaphore is a plain variable that is changed depending on programmer-defined conditions.

In computer science, amortized analysis is a method for analyzing a given algorithm's complexity, or how much of a resource, especially time or memory, it takes to execute. The motivation for amortized analysis is that looking at the worst-case run time per operation, rather than per algorithm, can be too pessimistic.

In computer science, tree traversal is a form of graph traversal and refers to the process of visiting each node in a tree data structure, exactly once. Such traversals are classified by the order in which the nodes are visited. The following algorithms are described for a binary tree, but they may be generalized to other trees as well.

In concurrent programming, a monitor is a synchronization construct that allows threads to have both mutual exclusion and the ability to wait (block) for a certain condition to become true. Monitors also have a mechanism for signaling other threads that their condition has been met. A monitor consists of a mutex (lock) object and condition variables. A condition variable is basically a container of threads that are waiting for a certain condition. Monitors provide a mechanism for threads to temporarily give up exclusive access in order to wait for some condition to be met, before regaining exclusive access and resuming their task.

Java collections framework Collections in Java

The Java collections framework is a set of classes and interfaces that implement commonly reusable collection data structures.

A calendar queue (CQ) is a priority queue. It is analogous to desk calendar, which is used by humans for ordering future events by date. Discrete event simulations require a future event list (FEL) structure that sorts pending events according to their time. Such simulators require a good and efficient data structure as time spent on queue management can be significant. The calendar queue can approach O(1) average performance.

In computer science, a double-ended priority queue (DEPQ) or double-ended heap is a data structure similar to a priority queue or heap, but allows for efficient removal of both the maximum and minimum, according to some ordering on the keys (items) stored in the structure. Every element in a DEPQ has a priority or value. In a DEPQ, it is possible to remove the elements in both ascending as well as descending order.

Queap priority queue data structure

In computer science, a queap is a priority queue data structure. The data structure allows insertions and deletions of arbitrary elements, as well as retrieval of the highest-priority element. Each deletion takes amortized time logarithmic in the number of items that have been in the structure for a longer time than the removed item. Insertions take constant amortized time.

In computing, sequence containers refer to a group of container class templates in the standard library of the C++ programming language that implement storage of data elements. Being templates, they can be used to store arbitrary elements, such as integers or custom classes. One common property of all sequential containers is that the elements can be accessed sequentially. Like all other standard library components, they reside in namespace std.

In computer science, a randomized meldable heap is a priority queue based data structure in which the underlying structure is also a heap-ordered binary tree. However, there are no restrictions on the shape of the underlying binary tree.

References

  1. Jones: "Systematic Software Development Using VDM"