Filter (higher-order function)

Last updated

In functional programming, filter is a higher-order function that processes a data structure (usually a list) in some order to produce a new data structure containing exactly those elements of the original data structure for which a given predicate returns the boolean value true.

Contents

Example

In Haskell, the code example

filtereven[1..10]

evaluates to the list 2, 4, …, 10 by applying the predicate even to every element of the list of integers 1, 2, …, 10 in that order and creating a new list of those elements for which the predicate returns the boolean value true, thereby giving a list containing only the even members of that list. Conversely, the code example

filter(not.even)[1..10]

evaluates to the list 1, 3, …, 9 by collecting those elements of the list of integers 1, 2, …, 10 for which the predicate even returns the boolean value false (with . being the function composition operator).

Visual example


Below, you can see a view of each step of the filter process for a list of integers X = [0, 5, 8, 3, 2, 1] according to the function :

This function express that if is even the return value is , otherwise it's . This is the predicate.

View of processing steps when applying filter function on a list Filter-steps-loillierbe.gif
View of processing steps when applying filter function on a list

Language comparison

Filter is a standard function for many programming languages, e.g., Haskell, [1] OCaml, [2] Standard ML, [3] or Erlang. [4] Common Lisp provides the functions remove-if and remove-if-not. [5] Scheme Requests for Implementation (SRFI) 1 provides an implementation of filter for the language Scheme. [6] C++ provides the algorithms remove_if (mutating) and remove_copy_if (non-mutating); C++11 additionally provides copy_if (non-mutating). [7] Smalltalk provides the select: method for collections. Filter can also be realized using list comprehensions in languages that support them.

In Haskell, filter can be implemented like this:

filter::(a->Bool)->[a]->[a]filter_[]=[]filterp(x:xs)=[x|px]++filterpxs

Here, [] denotes the empty list, ++ the list concatenation operation, and [x | p x] denotes a list conditionally holding a value, x, if the condition p x holds (evaluates to True).

Filter in various languages
LanguageFilterNotes
APL (predarray)/array
or
pred{/⍨⍺⍺}array
The second example is an APL dop.
C# 3.0ienum.Where(pred)
or
The where clause
Where is an extension method
ienum is an IEnumerable
Similarly in all .NET languages
CFML obj.filter(func)Where obj is an array or a structure. The func receives as an argument each element's value.
Clojure (filter predicatelist) [8] Or, via list comprehension: (for [x list :when (pred x)] x)
Common Lisp (remove-if inverted-predlist)
(remove-if (complement pred) list)
(remove-if-not predlist)
The function remove-if-not has been deprecated [5] in favor of the equivalent remove-if where the predicate is complemented. [9] Thus the filter (remove-if-not#'oddp'(0123)) should be written (remove-if(complement#'oddp)'(0123)) or more simply: (remove-if#'evenp'(0123)) where evenp returns the inverted value of oddp. [10]
C++ std::remove_copy_if(begin, end, result, prednot)
std::copy_if(begin, end, result, pred) (C++11)
in header <algorithm>
begin, end, result are iterators
predicate is reversed
D std.algorithm.filter!(pred)(list)
Erlang lists:filter(Fun, List)Or, via list comprehension: [ X || X <- List, Fun(X) ]
Groovy list.findAll(pred)
Haskell filter predlistOr, via list comprehension: [x | x <- list, pred x]
Haxe list.filter(pred)
Lambda.filter(list, pred)
Or, via list comprehension: [x | x <- list, pred x]
J (#~ pred) listAn example of a monadic hook. # is copy, ~ reverses arguments. (f g) y = y f (g y)
Julia filter(pred, array)The filter function also accepts dict datatype. Or, via list comprehension: [x for x in array if pred(x)]
Java 8+stream.filter(pred)
JavaScript 1.6array.filter(pred)
Kotlin array.filter(pred)
Mathematica Select[list, pred]
Objective-C (Cocoa in Mac OS X 10.4+)[array filteredArrayUsingPredicate:pred]pred is an NSPredicate object, which may be limited in expressiveness
F#, OCaml, Standard ML List.filter predlist
PARI/GP select(expr, list)The order of arguments is reversed in v. 2.4.2.
Perl grep blocklist
grep expr, list
PHP array_filter(array, pred)
Prolog filter(+Closure,+List,-List)Since ISO/IEC 13211-1:1995/Cor.2:2012 [11] the core standard contains closure application via call/N [12]
Python filter(func, list)Or, via list comprehension: [x for x in list if pred(x)]. In Python 3, filter was changed to return an iterator rather than a list. [13] The complementary functionality, returning an iterator over elements for which the predicate is false, is also available in the standard library as filterfalse in the itertools module.
Ruby enum.find_all {block}
enum.select {block}
enum is an Enumeration
Rust iterator.filter(pred)iterator is an Iterator and the filter method returns a new iterator; pred is a function (specifically FnMut ) that receives the iterator's item and returns a bool
S, R Filter(pred,array)
array[pred(array)]
In the second case, pred must be a vectorized function
Scala list.filter(pred)Or, via for-comprehension: for(x <- list; if pred) yield x
Scheme R6RS(filter predlist)
(remove inverted predlist)
(partition predlistlist)
Smalltalk aCollection select: aBlock
Swift array.filter(pred)
filter(sequence, pred)
XPath, XQuery list[block]
filter(list, func)
In block the context item . holds the current value

Variants

Filter creates its result without modifying the original list. Many programming languages also provide variants that destructively modify the list argument instead for faster performance. Other variants of filter (e.g., Haskell dropWhile [14] and partition [15] ) are also common. A common memory optimization for purely functional programming languages is to have the input list and filtered result share the longest common tail (tail-sharing).

See also

Related Research Articles

Common Lisp (CL) is a dialect of the Lisp programming language, published in ANSI standard document ANSI INCITS 226-1994 (S20018). The Common Lisp HyperSpec, a hyperlinked HTML version, has been derived from the ANSI Common Lisp standard.

In computer science, functional programming is a programming paradigm where programs are constructed by applying and composing functions. It is a declarative programming paradigm in which function definitions are trees of expressions that map values to other values, rather than a sequence of imperative statements which update the running state of the program.

Lisp (programming language) Programming language family

Lisp is a family of programming languages with a long history and a distinctive, fully parenthesized prefix notation. Originally specified in 1958, Lisp is the second-oldest high-level programming language still in common use. Only Fortran is older, by one year. Lisp has changed since its early days, and many dialects have existed over its history. Today, the best-known general-purpose Lisp dialects are Racket, Common Lisp, Scheme, and Clojure.

Scheme (programming language) Dialect of Lisp

Scheme is a minimalist dialect of the Lisp family of programming languages. Scheme consists of a small standard core with several tools for language extension.

This is a "genealogy" of programming languages. Languages are categorized under the ancestor language with the strongest influence. Those ancestor languages are listed in alphabetical order. Any such categorization has a large arbitrary element, since programming languages often incorporate major ideas from multiple sources.

Common Lisp Object System

The Common Lisp Object System (CLOS) is the facility for object-oriented programming which is part of ANSI Common Lisp. CLOS is a powerful dynamic object system which differs radically from the OOP facilities found in more static languages such as C++ or Java. CLOS was inspired by earlier Lisp object systems such as MIT Flavors and CommonLoops, although it is more general than either. Originally proposed as an add-on, CLOS was adopted as part of the ANSI standard for Common Lisp and has been adapted into other Lisp dialects such as EuLisp or Emacs Lisp.

In computer science, a set is an abstract data type that can store unique values, without any particular order. It is a computer implementation of the mathematical concept of a finite set. Unlike most other collection types, rather than retrieving a specific element from a set, one typically tests a value for membership in a set.

A list comprehension is a syntactic construct available in some programming languages for creating a list based on existing lists. It follows the form of the mathematical set-builder notation as distinct from the use of map and filter functions.

In computer programming, especially functional programming and type theory, an algebraic data type is a kind of composite type, i.e., a type formed by combining other types.

In computer science, primitive data types are a set of basic data types from which all other data types are constructed. Specifically it often refers to the limited set of data representations in use by a particular processor, which all compiled programs must use. Most processors support a similar set of primitive data types, although the specific representations vary. More generally "primitive data types" may refer to the standard data types built into a programming language. Data types which are not primitive are referred to as derived or composite data types.

Conditional (computer programming) Programming language construct that performs actions according to boolean conditions

In computer science, conditionals are programming language commands for handling decisions. Specifically, conditionals perform different computations or actions depending on whether a programmer-defined boolean condition evaluates to true or false. In terms of control flow, the decision is always achieved by selectively altering the control flow based on some condition.

System F is a typed lambda calculus that introduces, to simply typed lambda calculus, a mechanism of universal quantification over types. System F formalizes parametric polymorphism in programming languages, thus forming a theoretical basis for languages such as Haskell and ML. It was discovered independently by logician Jean-Yves Girard (1972) and computer scientist John C. Reynolds (1974).

In computer science, the Boolean is a data type that has one of two possible values which is intended to represent the two truth values of logic and Boolean algebra. It is named after George Boole, who first defined an algebraic system of logic in the mid 19th century. The Boolean data type is primarily associated with conditional statements, which allow different actions by changing control flow depending on whether a programmer-specified Boolean condition evaluates to true or false. It is a special case of a more general logical data type —logic doesn't always need to be Boolean.

In computer programming, a guard is a boolean expression that must evaluate to true if the program execution is to continue in the branch in question.

In computer science, a relational operator is a programming language construct or operator that tests or defines some kind of relation between two entities. These include numerical equality and inequalities.

In computing, the modulo operation returns the remainder or signed remainder of a division, after one number is divided by another.

In many programming languages, map is the name of a higher-order function that applies a given function to each element of a collection, e.g. a list or set, returning the results in a collection of the same type. It is often called apply-to-all when considered in functional form.

In computer programming, an anonymous function is a function definition that is not bound to an identifier. Anonymous functions are often arguments being passed to higher-order functions or used for constructing the result of a higher-order function that needs to return a function. If the function is only used once, or a limited number of times, an anonymous function may be syntactically lighter than using a named function. Anonymous functions are ubiquitous in functional programming languages and other languages with first-class functions, where they fulfil the same role for the function type as literals do for other data types.

Some programming languages provide a built-in (primitive) rational data type to represent rational numbers like 1/3 and -11/17 without rounding, and to do arithmetic on them. Examples are the ratio type of Common Lisp, and analogous types provided by most languages for algebraic computation, such as Mathematica and Maple. Many languages that do not have a built-in rational type still provide it as a library-defined type.

References

  1. filter in the Haskell Standard Prelude
  2. filter in the OCaml standard library module list
  3. "The List structure". The Standard ML Basis Library. Retrieved 2007-09-25.
  4. filter/2 in the Erlang STDLIB Reference Manual documentation of the module lists
  5. 1 2 FunctionREMOVE, REMOVE-IF, REMOVE-IF-NOT, DELETE, DELETE-IF, DELETE-IF-NOT in the Common Lisp HyperSpec
  6. filter in SRFI 1
  7. remove_if and remove_copy_if in the SGI Standard Template Library (STL) spec
  8. clojure.core/filter on ClojureDocs
  9. FunctionCOMPLEMENT in the Common Lisp HyperSpec
  10. FunctionEVENP, ODDP in the Common Lisp HyperSpec
  11. ISO/IEC 13211-1:1995/Cor 2:2012
  12. "Draft technical corrigendum 2".
  13. "Built-in Functions — Python 3.9.0 documentation". docs.python.org. Retrieved 2020-10-28.
  14. Haskell filter dropWhile
  15. Haskell filter partition