Filter (higher-order function)

Last updated November 14, 2024

In functional programming, filter is a higher-order function that processes a data structure (usually a list) in some order to produce a new data structure containing exactly those elements of the original data structure for which a given predicate returns the Boolean value true.

Example

In Haskell, the code example

filtereven[1..10]

evaluates to the list 2, 4, …, 10 by applying the predicate even to every element of the list of integers 1, 2, …, 10 in that order and creating a new list of those elements for which the predicate returns the Boolean value true, thereby giving a list containing only the even members of that list. Conversely, the code example

filter(not.even)[1..10]

evaluates to the list 1, 3, …, 9 by collecting those elements of the list of integers 1, 2, …, 10 for which the predicate even returns the Boolean value false (with . being the function composition operator).

Visual example

Below, you can see a view of each step of the filter process for a list of integers X = [0, 5, 8, 3, 2, 1] according to the function : $f(x)={\begin{cases}\mathrm {True} &{\text{ if }}x\equiv 0{\pmod {2}}\\\mathrm {False} &{\text{ if }}x\equiv 1{\pmod {2}}.\end{cases}}$

This function express that if $x$ is even the return value is $\mathrm {True}$ , otherwise it's $\mathrm {False}$ . This is the predicate.

View of processing steps when applying filter function on a list Filter-steps-loillierbe.gif — View of processing steps when applying filter function on a list

Language comparison

Filter is a standard function for many programming languages, e.g., Haskell,^[1] OCaml,^[2] Standard ML,^[3] or Erlang.^[4] Common Lisp provides the functions remove-if and remove-if-not.^[5] Scheme Requests for Implementation (SRFI) 1 provides an implementation of filter for the language Scheme.^[6] C++ provides the algorithms remove_if (mutating) and remove_copy_if (non-mutating); C++11 additionally provides copy_if (non-mutating).^[7] Smalltalk provides the select: method for collections. Filter can also be realized using list comprehensions in languages that support them.

In Haskell, filter can be implemented like this:

filter::(a->Bool)->[a]->[a]filter_[]=[]filterp(x:xs)=[x|px]++filterpxs

Here, [] denotes the empty list, ++ the list concatenation operation, and [x | p x] denotes a list conditionally holding a value, x, if the condition p x holds (evaluates to True).

Filter in various languages
Language	Filter	Notes
APL	`(predarray)/array` or `pred{⍵/⍨⍺⍺⍵}array`	The second example is an APL dop.
C# 3.0	`ienum.Where(pred)` or The `where` clause	Where is an extension method ienum is an IEnumerable Similarly in all .NET languages
CFML	`obj.filter(func)`	Where `obj` is an array or a structure. The `func` receives as an argument each element's value.
Clojure	`(filter predicatelist)^[8]`	Or, via list comprehension: `(for [x list :when (pred x)] x)`
Common Lisp	`(remove-if inverted-predlist) (remove-if (complement pred) list) (remove-if-not predlist)`	The function `remove-if-not` has been deprecated^[5] in favor of the equivalent `remove-if` where the predicate is complemented.^[9] Thus the filter `(remove-if-not#'oddp'(0123))` should be written `(remove-if(complement#'oddp)'(0123))` or more simply: `(remove-if#'evenp'(0123))` where `evenp` returns the inverted value of `oddp`.^[10]
C++	`std::remove_copy_if(begin, end, result, prednot) std::copy_if(begin, end, result, pred) (C++11)`	in header <algorithm> begin, end, result are iterators predicate is reversed
D	`std.algorithm.filter!(pred)(list)`
Erlang	`lists:filter(Fun, List)`	Or, via list comprehension: `[ X \|\| X <- List, Fun(X) ]`
Groovy	`list.findAll(pred)`
Haskell	`filter predlist`	Or, via list comprehension: `[x \| x <- list, pred x]`
Haxe	`list.filter(pred)` `Lambda.filter(list, pred)`	Or, via list comprehension: `[x \| x <- list, pred x]`
J	`(#~ pred) list`	An example of a monadic hook. # is copy, ~ reverses arguments. `(f g) y = y f (g y)`
Julia	`filter(pred, array)`	The filter function also accepts `dict` datatype. Or, via list comprehension: `[x for x in array if pred(x)]`
Java 8+	`stream.filter(pred)`
JavaScript 1.6	`array.filter(pred)`
Kotlin	`array.filter(pred)`
Mathematica	`Select[list, pred]`
Objective-C (Cocoa in Mac OS X 10.4+)	`[array filteredArrayUsingPredicate:pred]`	`pred` is an NSPredicate object, which may be limited in expressiveness
F#, OCaml, Standard ML	`List.filter predlist`
PARI/GP	`select(expr, list)`	The order of arguments is reversed in v. 2.4.2.
Perl	`grep blocklist grep expr, list`
PHP	`array_filter(array, pred)`
Prolog	`filter(+Closure,+List,-List)`	Since ISO/IEC 13211-1:1995/Cor.2:2012^[11] the core standard contains closure application via `call/N`^[12]
Python	`filter(func, list)`	Or, via list comprehension: `[x for x in list if pred(x)]`. In Python 3, `filter` was changed to return an iterator rather than a list.^[13] The complementary functionality, returning an iterator over elements for which the predicate is false, is also available in the standard library as `filterfalse` in the `itertools` module.
Ruby	`enum.find_all {block} enum.select {block}`	`enum` is an Enumeration
Rust	`iterator.filter(pred)`	`iterator` is an `Iterator` and the `filter` method returns a new iterator; `pred` is a function (specifically `FnMut` ) that receives the iterator's item and returns a `bool`
S, R	`Filter(pred,array) array[pred(array)]`	In the second case, pred must be a vectorized function
Scala	`list.filter(pred)`	Or, via for-comprehension: `for(x <- list; if pred) yield x`
Scheme R⁶RS	`(filter predlist)` `(remove inverted predlist)` `(partition predlistlist)`
Smalltalk	`aCollection select: aBlock`
Swift	`array.filter(pred) filter(sequence, pred)`
XPath, XQuery	`list[block]` `filter(list, func)`	In `block` the context item `.` holds the current value

Variants

Filter creates its result without modifying the original list. Many programming languages also provide variants that destructively modify the list argument instead for faster performance. Other variants of filter (e.g., Haskell dropWhile^[14] and partition^[15]) are also common. A common memory optimization for purely functional programming languages is to have the input list and filtered result share the longest common tail (tail-sharing).

Related Research Articles

In computer science, functional programming is a programming paradigm where programs are constructed by applying and composing functions. It is a declarative programming paradigm in which function definitions are trees of expressions that map values to other values, rather than a sequence of imperative statements which update the running state of the program.

Scheme is a dialect of the Lisp family of programming languages. Scheme was created during the 1970s at the MIT Computer Science and Artificial Intelligence Laboratory and released by its developers, Guy L. Steele and Gerald Jay Sussman, via a series of memos now known as the Lambda Papers. It was the first dialect of Lisp to choose lexical scope and the first to require implementations to perform tail-call optimization, giving stronger support for functional programming and associated techniques such as recursive algorithms. It was also one of the first programming languages to support first-class continuations. It had a significant influence on the effort that led to the development of Common Lisp.

A mathematical symbol is a figure or a combination of figures that is used to represent a mathematical object, an action on mathematical objects, a relation between mathematical objects, or for structuring the other symbols that occur in a formula. As formulas are entirely constituted with symbols of various types, many symbols are needed for expressing all mathematics.

In computer science and computer programming, a data type is a collection or grouping of data values, usually specified by a set of possible values, a set of allowed operations on these values, and/or a representation of these values as machine types. A data type specification in a program constrains the possible values that an expression, such as a variable or a function call, might take. On literal data, it tells the compiler or interpreter how the programmer intends to use the data. Most programming languages support basic data types of integer numbers, floating-point numbers, characters and Booleans.

This is a "genealogy" of programming languages. Languages are categorized under the ancestor language with the strongest influence. Those ancestor languages are listed in alphabetic order. Any such categorization has a large arbitrary element, since programming languages often incorporate major ideas from multiple sources.

In computer programming, a type system is a logical system comprising a set of rules that assigns a property called a type to every term. Usually the terms are various language constructs of a computer program, such as variables, expressions, functions, or modules. A type system dictates the operations that can be performed on a term. For variables, the type system determines the allowed values of that term.

In computer science, a set is an abstract data type that can store unique values, without any particular order. It is a computer implementation of the mathematical concept of a finite set. Unlike most other collection types, rather than retrieving a specific element from a set, one typically tests a value for membership in a set.

A list comprehension is a syntactic construct available in some programming languages for creating a list based on existing lists. It follows the form of the mathematical set-builder notation as distinct from the use of map and filter functions.

In computer programming, especially functional programming and type theory, an algebraic data type (ADT) is a kind of composite type, i.e., a type formed by combining other types.

In computer science, primitive data types are a set of basic data types from which all other data types are constructed. Specifically it often refers to the limited set of data representations in use by a particular processor, which all compiled programs must use. Most processors support a similar set of primitive data types, although the specific representations vary. More generally, primitive data types may refer to the standard data types built into a programming language. Data types which are not primitive are referred to as derived or composite.

<span class="mw-page-title-main">Conditional (computer programming)</span> Control flow statement that executes code according to some condition(s)

In computer science, conditionals are programming language constructs that perform different computations or actions or return different values depending on the value of a Boolean expression, called a condition.

System F is a typed lambda calculus that introduces, to simply typed lambda calculus, a mechanism of universal quantification over types. System F formalizes parametric polymorphism in programming languages, thus forming a theoretical basis for languages such as Haskell and ML. It was discovered independently by logician Jean-Yves Girard (1972) and computer scientist John C. Reynolds.

In computer science, the Boolean is a data type that has one of two possible values which is intended to represent the two truth values of logic and Boolean algebra. It is named after George Boole, who first defined an algebraic system of logic in the mid 19th century. The Boolean data type is primarily associated with conditional statements, which allow different actions by changing control flow depending on whether a programmer-specified Boolean condition evaluates to true or false. It is a special case of a more general logical data type—logic does not always need to be Boolean.

In computer science, a relational operator is a programming language construct or operator that tests or defines some kind of relation between two entities. These include numerical equality and inequalities.

In computing, the modulo operation returns the remainder or signed remainder of a division, after one number is divided by another, called the modulus of the operation.

String functions are used in computer programming languages to manipulate a string or query information about a string.

In many programming languages, map is a higher-order function that applies a given function to each element of a collection, e.g. a list or set, returning the results in a collection of the same type. It is often called apply-to-all when considered in functional form.

In computer programming, an anonymous function is a function definition that is not bound to an identifier. Anonymous functions are often arguments being passed to higher-order functions or used for constructing the result of a higher-order function that needs to return a function. If the function is only used once, or a limited number of times, an anonymous function may be syntactically lighter than using a named function. Anonymous functions are ubiquitous in functional programming languages and other languages with first-class functions, where they fulfil the same role for the function type as literals do for other data types.

Some programming languages provide a built-in (primitive) rational data type to represent rational numbers like 1/3 and −11/17 without rounding, and to do arithmetic on them. Examples are the ratio type of Common Lisp, and analogous types provided by most languages for algebraic computation, such as Mathematica and Maple. Many languages that do not have a built-in rational type still provide it as a library-defined type.

References

↑ filter in the Haskell Standard Prelude
↑ filter in the OCaml standard library module list
↑ "The List structure". The Standard ML Basis Library. Retrieved 2007-09-25.
↑ filter/2 in the Erlang STDLIB Reference Manual documentation of the module lists
1 2 FunctionREMOVE, REMOVE-IF, REMOVE-IF-NOT, DELETE, DELETE-IF, DELETE-IF-NOT in the Common Lisp HyperSpec
↑ filter in SRFI 1
↑ remove_if and remove_copy_if in the SGI Standard Template Library (STL) spec
↑ clojure.core/filter on ClojureDocs
↑ FunctionCOMPLEMENT in the Common Lisp HyperSpec
↑ FunctionEVENP, ODDP in the Common Lisp HyperSpec
↑ ISO/IEC 13211-1:1995/Cor 2:2012
↑ "Draft technical corrigendum 2".
↑ "Built-in Functions — Python 3.9.0 documentation". docs.python.org. Retrieved 2020-10-28.
↑ Haskell filter dropWhile
↑ Haskell filter partition

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.

[HaskellStandard-1] filter in the Haskell Standard Prelude

[OcamlManual-2] filter in the OCaml standard library module list

[3] "The List structure". The Standard ML Basis Library. Retrieved 2007-09-25.

[4] filter/2 in the Erlang STDLIB Reference Manual documentation of the module lists

[remove-if-5] 1 2 FunctionREMOVE, REMOVE-IF, REMOVE-IF-NOT, DELETE, DELETE-IF, DELETE-IF-NOT in the Common Lisp HyperSpec

[SRFI1-6] filter in SRFI 1

[SGISTLSpec-7] remove_if and remove_copy_if in the SGI Standard Template Library (STL) spec

[8] ure.core/filter on ClojureDocs

[9] FunctionCOMPLEMENT in the Common Lisp HyperSpec

[10] FunctionEVENP, ODDP in the Common Lisp HyperSpec

[Cor.2-11] ISO/IEC 13211-1:1995/Cor 2:2012

[12] "Draft technical corrigendum 2".

[13] "Built-in Functions — Python 3.9.0 documentation". docs.python.org. Retrieved 2020-10-28.

[14] Haskell filter dropWhile

[15] Haskell filter partition

[1]

[2]

[3]

[4]

[5]

[6]

[7]

[8]

[9]

[10]

[11]

[12]

[13]

[14]

[15]