Memory-bound function


Memory bound refers to a situation in which the time to complete a given computational problem is determined primarily by the amount of free memory required to hold the working data. This is in contrast to compute-bound algorithms, where the number of elementary computation steps is the deciding factor.


Memory and computation boundaries can sometimes be traded against each other, e.g. by saving and reusing preliminary results or using lookup tables.
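For illustration, the following Python sketch trades memory against computation by precomputing a lookup table of per-byte bit counts; the function names and the table are illustrative only and are not drawn from any particular source.

def popcount_compute(b):
    # Recompute the bit count on every call (compute-heavy, no extra memory).
    count = 0
    while b:
        count += b & 1
        b >>= 1
    return count

# Precompute once; afterwards every query is a single memory access (memory-heavy).
POPCOUNT_TABLE = [popcount_compute(b) for b in range(256)]

def popcount_lookup(b):
    # Trade memory (the 256-entry table) for computation (one indexed read).
    return POPCOUNT_TABLE[b]

assert popcount_compute(0b1011) == popcount_lookup(0b1011) == 3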

Memory-bound functions and memory functions

Memory-bound functions and memory functions are related in that both involve extensive memory access, but a distinction exists between the two.

Memory functions use a dynamic programming technique called memoization in order to relieve the inefficiency that naive recursion can incur. The idea is simple: calculate and store solutions to subproblems so that they can be reused later without being recalculated. The best-known example that takes advantage of memoization is an algorithm that computes the Fibonacci numbers. The following pseudocode uses recursion and memoization, and runs in linear CPU time:

Fibonacci(n)
{
    for i = 0 to n
        results[i] = -1                 // -1 means undefined
    return Fibonacci_Results(results, n)
}

Fibonacci_Results(results, n)
{
    if (results[n] != -1)               // If it has been solved before,
        return results[n]               // look it up.
    if (n == 0)
        val = 0
    else if (n == 1)
        val = 1
    else
        val = Fibonacci_Results(results, n-2) + Fibonacci_Results(results, n-1)
    results[n] = val                    // Save this result for re-use.
    return val
}

Compare the above to an algorithm that uses only recursion, and runs in exponential CPU time:

Recursive_Fibonacci(n)
{
    if (n == 0)
        return 0
    if (n == 1)
        return 1
    return Recursive_Fibonacci(n-1) + Recursive_Fibonacci(n-2)
}

While the recursive-only algorithm is simpler and more elegant than the algorithm that uses recursion and memoization, the latter has a significantly lower time complexity: linear rather than exponential in n.
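A runnable Python rendering of the two algorithms, instrumented to count recursive calls, makes the gap concrete; the function names and the call counters below are additions for this illustration.

from functools import lru_cache

calls_naive = 0

def recursive_fibonacci(n):
    # Plain recursion: the same subproblems are recomputed over and over.
    global calls_naive
    calls_naive += 1
    if n < 2:
        return n
    return recursive_fibonacci(n - 1) + recursive_fibonacci(n - 2)

calls_memo = 0

@lru_cache(maxsize=None)            # memoization: cache each subproblem's result
def memoized_fibonacci(n):
    # Recursion plus memoization: each subproblem is solved exactly once.
    global calls_memo
    calls_memo += 1
    if n < 2:
        return n
    return memoized_fibonacci(n - 1) + memoized_fibonacci(n - 2)

assert recursive_fibonacci(25) == memoized_fibonacci(25) == 75025
print(calls_naive, calls_memo)      # 242785 calls versus 26 calls for n = 25

For n = 25 the purely recursive version already makes hundreds of thousands of calls, while the memoized version executes its body once per distinct subproblem.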

The term "memory-bound function" has surfaced only recently and is used principally to describe a function that uses XOR and consists of a series of computations in which each computation depends on the previous computation. Whereas memory functions have long been an important actor in improving time complexity, memory-bound functions have seen far fewer applications. Recently[ when? ], however, scientists have proposed a method using memory-bound functions as a means to discourage spammers from abusing resources, which could be a major breakthrough in that area.

Using memory-bound functions to prevent spam

Memory-bound functions might be useful in a proof-of-work system that could deter spam, which has become a problem of epidemic proportions on the Internet.

IBM research scientists Cynthia Dwork and Moni Naor published a paper at CRYPTO 1992 titled "Pricing via Processing or Combatting Junk Mail",[1] suggesting the possibility of using CPU-bound functions to deter abusers from sending spam. The scheme was based on the idea that computer users are much more likely to abuse a resource when the cost of abusing it is negligible: the underlying reason spam has become so rampant is that sending an e-mail costs spammers next to nothing.

Dwork and Naor proposed that spamming might be reduced by injecting an additional cost in the form of an expensive CPU computation: CPU-bound functions would consume CPU resources at the sender's machine for each message, thus preventing huge amounts of spam from being sent in a short period.

The basic scheme that protects against abuses is as follows:
Let S be the sender, R the recipient, and M an e-mail message. If R has agreed beforehand to receive e-mail from S, then M is transmitted in the usual way. Otherwise, S computes some function G(M) and sends (M, G(M)) to R. R checks whether what it receives is of the form (M, G(M)): if so, R accepts M; otherwise, R rejects M.

The function G() is chosen so that verification by R is fast (taking on the order of a millisecond) while computation by S is slow (taking at least several seconds). S is therefore discouraged from sending M to many recipients without prior agreements: the time and computing resources needed to compute G() repeatedly become prohibitive for a spammer who intends to send many millions of e-mails.
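The shape of the scheme can be sketched in Python using a hashcash-style partial hash preimage as a stand-in for G; Dwork and Naor's actual pricing functions are different, so the code below only illustrates the intended asymmetry between slow computation and fast verification.

import hashlib
import itertools

DIFFICULTY = 20  # leading zero bits required; tuned so the search takes seconds

def compute_stamp(message):
    # Sender S: search for a nonce, which is the expensive step.
    for nonce in itertools.count():
        digest = hashlib.sha256(message + nonce.to_bytes(8, "big")).digest()
        if int.from_bytes(digest, "big") >> (256 - DIFFICULTY) == 0:
            return nonce

def verify_stamp(message, nonce):
    # Recipient R: a single hash suffices, so verification is cheap.
    digest = hashlib.sha256(message + nonce.to_bytes(8, "big")).digest()
    return int.from_bytes(digest, "big") >> (256 - DIFFICULTY) == 0

message = b"example e-mail body"
nonce = compute_stamp(message)       # slow for S
assert verify_stamp(message, nonce)  # fast for R

Each extra bit of DIFFICULTY roughly doubles the sender's expected work while leaving the recipient's check at a single hash.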

The major problem with the above scheme is that fast CPUs compute much faster than slow ones. Further, higher-end computer systems have sophisticated pipelines and other features that speed up computation. As a result, a spammer with a state-of-the-art system will hardly be affected by such deterrence, while a typical user with a mediocre system will be affected adversely. If a computation takes a few seconds on a new PC, it may take a minute on an old PC and several minutes on a PDA; this might be a nuisance for users of old PCs, but it is probably unacceptable for users of PDAs. The disparity in client CPU speed is one of the prominent roadblocks to widespread adoption of any scheme based on a CPU-bound function. Researchers are therefore concerned with finding functions that most computer systems evaluate at about the same speed, so that high-end systems evaluate them only somewhat faster than low-end systems (2 to 10 times faster, rather than the 10 to 100 times that CPU disparities might imply). Such ratios are "egalitarian" enough for the intended applications: the functions are effective in discouraging abuse and do not add a prohibitive delay to legitimate interactions across a wide range of systems.

The new egalitarian approach is to rely on memory-bound functions. As stated above, a memory-bound function is a function whose computation time is dominated by the time spent accessing memory. A memory-bound function accesses locations in a large region of memory in an unpredictable way, so that caches are ineffective. In recent years, CPU speed has grown dramatically, while progress in the speed of main memory has been comparatively small. Since the ratio of memory latencies between machines built in the last five years is typically no greater than two, and almost always less than four, memory-bound functions will remain roughly egalitarian across most systems for the foreseeable future.
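As a rough sketch of such an access pattern, the Python code below chains XORs of entries drawn from a table chosen to be larger than typical CPU caches, with each index depending on the previous result; the constants and the mixing step are illustrative only, and the memory-bound functions proposed in the literature are designed far more carefully.

from array import array
import random

TABLE_SIZE = 1 << 23                 # about 8 million entries, tens of megabytes
rng = random.Random(42)
table = array("L", (rng.getrandbits(32) for _ in range(TABLE_SIZE)))

def memory_bound_walk(seed, steps):
    # Each step reads a table entry whose index depends on the previous result,
    # so the accesses are unpredictable and caching or prefetching helps little.
    state = seed & 0xFFFFFFFF
    for _ in range(steps):
        index = state % TABLE_SIZE
        state = ((state ^ table[index]) * 2654435761) & 0xFFFFFFFF
    return state

result = memory_bound_walk(seed=123456789, steps=1_000_000)

In a compiled implementation the running time of such a walk is dominated by memory latency rather than CPU speed, which is what makes memory-bound functions roughly egalitarian across machines.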


Related Research Articles

In computability theory, the Church–Turing thesis is a thesis about the nature of computable functions. It states that a function on the natural numbers can be calculated by an effective method if and only if it is computable by a Turing machine. The thesis is named after the American mathematician Alonzo Church and the British mathematician Alan Turing. Before the precise definition of computable function, mathematicians often used the informal term effectively calculable to describe functions that are computable by paper-and-pencil methods. In the 1930s, several independent attempts were made to formalize the notion of computability.

In programming language theory, lazy evaluation, or call-by-need, is an evaluation strategy which delays the evaluation of an expression until its value is needed and which also avoids repeated evaluations.

<span class="mw-page-title-main">Recursion</span> Process of repeating items in a self-similar way

Recursion occurs when the definition of a concept or process depends on a simpler version of itself. Recursion is used in a variety of disciplines ranging from linguistics to logic. The most common application of recursion is in mathematics and computer science, where a function being defined is applied within its own definition. While this apparently defines an infinite number of instances, it is often done in such a way that no infinite loop or infinite chain of references can occur.

<span class="mw-page-title-main">Theory of computation</span> Academic subfield of computer science

In theoretical computer science and mathematics, the theory of computation is the branch that deals with which problems can be solved on a model of computation using an algorithm, and how efficiently or to what degree they can be solved. The field is divided into three major branches: automata theory and formal languages, computability theory, and computational complexity theory, which are linked by the question: "What are the fundamental capabilities and limitations of computers?".

In computer science, algorithmic efficiency is a property of an algorithm which relates to the amount of computational resources used by the algorithm. An algorithm must be analyzed to determine its resource usage, and the efficiency of an algorithm can be measured based on the usage of different resources. Algorithmic efficiency can be thought of as analogous to engineering productivity for a repeating or continuous process.

Computability theory, also known as recursion theory, is a branch of mathematical logic, computer science, and the theory of computation that originated in the 1930s with the study of computable functions and Turing degrees. The field has since expanded to include the study of generalized computability and definability. In these areas, computability theory overlaps with proof theory and effective descriptive set theory.

In computer science, divide and conquer is an algorithm design paradigm. A divide-and-conquer algorithm recursively breaks down a problem into two or more sub-problems of the same or related type, until these become simple enough to be solved directly. The solutions to the sub-problems are then combined to give a solution to the original problem.

In computer science, program optimization, code optimization, or software optimization is the process of modifying a software system to make some aspect of it work more efficiently or use fewer resources. In general, a computer program may be optimized so that it executes more rapidly, operates with less memory or other resources, or draws less power.

Hashcash is a proof-of-work system used to limit e-mail spam and denial-of-service attacks. Hashcash was proposed in 1997 by Adam Back and described more formally in Back's 2002 paper "Hashcash - A Denial of Service Counter-Measure".

The Penny Black Project is a Microsoft Research project that tries to find effective and practical ways of fighting spam. Because identifying spam consumes a recipient's time, the idea is to make the sender of e-mails "pay" a certain amount for sending them. The currency or the mode of payment could be CPU cycles, Turing tests, or memory cycles. Such a payment would limit spammers' ability to send out large quantities of e-mails quickly.

In computing, memoization or memoisation is an optimization technique used primarily to speed up computer programs by storing the results of expensive function calls and returning the cached result when the same inputs occur again. Memoization has also been used in other contexts, such as in simple mutually recursive descent parsing. It is a type of caching, distinct from other forms of caching such as buffering and page replacement. In the context of some logic programming languages, memoization is also known as tabling.

Computable functions are the basic objects of study in computability theory. Computable functions are the formalized analogue of the intuitive notion of algorithms, in the sense that a function is computable if there exists an algorithm that can do the job of the function, i.e. given an input of the function domain it can return the corresponding output. Computable functions are used to discuss computability without referring to any concrete model of computation such as Turing machines or register machines. Any definition, however, must make reference to some specific model of computation but all valid definitions yield the same class of functions. Particular models of computability that give rise to the set of computable functions are the Turing-computable functions and the general recursive functions.

In computer science, a problem is said to have overlapping subproblems if the problem can be broken down into subproblems which are reused several times, or if a recursive algorithm for the problem solves the same subproblem over and over rather than always generating new subproblems.

In computer science, corecursion is a type of operation that is dual to recursion. Whereas recursion works analytically, starting on data further from a base case and breaking it down into smaller data and repeating until one reaches a base case, corecursion works synthetically, starting from a base case and building it up, iteratively producing data further removed from a base case. Put simply, corecursive algorithms use the data that they themselves produce, bit by bit, as they become available, and needed, to produce further bits of data. A similar but distinct concept is generative recursion which may lack a definite "direction" inherent in corecursion and recursion.

Proof of work (PoW) is a form of cryptographic proof in which one party proves to others that a certain amount of a specific computational effort has been expended. Verifiers can subsequently confirm this expenditure with minimal effort on their part. The concept was invented by Moni Naor and Cynthia Dwork in 1993 as a way to deter denial-of-service attacks and other service abuses such as spam on a network by requiring some work from a service requester, usually meaning processing time by a computer. The term "proof of work" was first coined and formalized in a 1999 paper by Markus Jakobsson and Ari Juels.

<span class="mw-page-title-main">Recursion (computer science)</span> Use of functions that call themselves

In computer science, recursion is a method of solving a computational problem where the solution depends on solutions to smaller instances of the same problem. Recursion solves such recursive problems by using functions that call themselves from within their own code. The approach can be applied to many types of problems, and recursion is one of the central ideas of computer science.

The power of recursion evidently lies in the possibility of defining an infinite set of objects by a finite statement. In the same manner, an infinite number of computations can be described by a finite recursive program, even if this program contains no explicit repetitions.

Algorithm characterizations are attempts to formalize the word algorithm. Algorithm does not have a generally accepted formal definition. Researchers are actively working on this problem. This article will present some of the "characterizations" of the notion of "algorithm" in more detail.

In computability theory, super-recursive algorithms are a generalization of ordinary algorithms that are more powerful, that is, compute more than Turing machines. The term was introduced by Mark Burgin, whose book "Super-recursive algorithms" develops their theory and presents several mathematical models. Turing machines and other mathematical models of conventional algorithms allow researchers to find properties of recursive algorithms and their computations. In a similar way, mathematical models of super-recursive algorithms, such as inductive Turing machines, allow researchers to find properties of super-recursive algorithms and their computations.

<span class="mw-page-title-main">Cynthia Dwork</span> American computer scientist

Cynthia Dwork is an American computer scientist best known for her contributions to cryptography, distributed computing, and algorithmic fairness. She is one of the inventors of differential privacy and proof-of-work.

References

  1. Dwork, Cynthia; Naor, Moni (1992). "Pricing via Processing or Combatting Junk Mail". Advances in Cryptology – CRYPTO '92. Lecture Notes in Computer Science. Vol. 740. pp. 139–147. doi:10.1007/3-540-48071-4_10. ISBN 978-3-540-57340-1.