Concurrent Haskell

Last updated

Concurrent Haskell extends [1] Haskell 98 with explicit concurrency. Its two main underlying concepts are:

Contents

Built atop this is a collection of useful concurrency and synchronisation abstractions [2] such as unbounded channels, semaphores and sample variables.

Haskell threads have very low overhead: creation, context-switching and scheduling are all internal to the Haskell runtime. These Haskell-level threads are mapped onto a configurable number of OS-level threads, usually one per processor core.

Software Transactional Memory

The software transactional memory (STM) extension [3] to GHC reuses the process forking primitives of Concurrent Haskell. STM however:

STM monad

The STM monad [4] is an implementation of software transactional memory in Haskell. It is implemented in GHC, and allows for mutable variables to be modified in transactions.

Traditional approach

Consider a banking application as an example, and a transaction in it—the transfer function, which takes money from one account, and puts it into another account. In the IO monad, this might look like:

typeAccount=IORefIntegertransfer::Integer->Account->Account->IO()transferamountfromto=dofromVal<-readIOReffrom-- (A)toVal<-readIOReftowriteIOReffrom(fromVal-amount)writeIORefto(toVal+amount)

This causes problems in concurrent situations where multiple transfers might be taking place on the same account at the same time. If there were two transfers transferring money from account from, and both calls to transfer ran line (A) before either of them had written their new values, it is possible that money would be put into the other two accounts, with only one of the amounts being transferred being removed from account from, thus creating a race condition. This would leave the banking application in an inconsistent state.

A traditional solution to such a problem is locking. For instance, locks can be placed around modifications to an account to ensure that credits and debits occur atomically. In Haskell, locking is accomplished with MVars:

typeAccount=MVarIntegercredit::Integer->Account->IO()creditamountaccount=docurrent<-takeMVaraccountputMVaraccount(current+amount)debit::Integer->Account->IO()debitamountaccount=docurrent<-takeMVaraccountputMVaraccount(current-amount)

Using such procedures will ensure that money will never be lost or gained due to improper interleaving of reads and writes to any individual account. However, if one tries to compose them together to create a procedure like transfer:

transfer::Integer->Account->Account->IO()transferamountfromto=dodebitamountfromcreditamountto

a race condition still exists: the first account may be debited, then execution of the thread may be suspended, leaving the accounts as a whole in an inconsistent state. Thus, additional locks must be added to ensure correctness of composite operations, and in the worst case, one might need to simply lock all accounts regardless of how many are used in a given operation.

Atomic transactions

To avoid this, one can use the STM monad, which allows one to write atomic transactions. This means that all operations inside the transaction fully complete, without any other threads modifying the variables that our transaction is using, or it fails, and the state is rolled back to where it was before the transaction was begun. In short, atomic transactions either complete fully, or it is as if they were never run at all. The lock-based code above translates in a relatively straightforward way:

typeAccount=TVarIntegercredit::Integer->Account->STM()creditamountaccount=docurrent<-readTVaraccountwriteTVaraccount(current+amount)debit::Integer->Account->STM()debitamountaccount=docurrent<-readTVaraccountwriteTVaraccount(current-amount)transfer::Integer->Account->Account->STM()transferamountfromto=dodebitamountfromcreditamountto

The return types of STM () may be taken to indicate that we are composing scripts for transactions. When the time comes to actually execute such a transaction, a function atomically :: STM a -> IO a is used. The above implementation will make sure that no other transactions interfere with the variables it is using (from and to) while it is executing, allowing the developer to be sure that race conditions like that above are not encountered. More improvements can be made to make sure that some other "business logic" is maintained in the system, i.e. that the transaction should not try to take money from an account until there is enough money in it:

transfer::Integer->Account->Account->STM()transferamountfromto=dofromVal<-readTVarfromif(fromVal-amount)>=0thendodebitamountfromcreditamounttoelseretry

Here the retry function has been used, which will roll back a transaction, and try it again. Retrying in STM is smart in that it won't try to run the transaction again until one of the variables it references during the transaction has been modified by some other transactional code. This makes the STM monad quite efficient.

An example program using the transfer function might look like this:

moduleMainwhereimportControl.Concurrent(forkIO)importControl.Concurrent.STMimportControl.Monad(forever)importSystem.Exit(exitSuccess)typeAccount=TVarIntegermain=dobob<-newAccount10000jill<-newAccount4000repeatIO2000$forkIO$atomically$transfer1bobjillforever$dobobBalance<-atomically$readTVarbobjillBalance<-atomically$readTVarjillputStrLn("Bob's balance: "++showbobBalance++", Jill's balance: "++showjillBalance)ifbobBalance==8000thenexitSuccesselseputStrLn"Trying again."repeatIO::Integer->IOa->IOarepeatIO1m=mrepeatIOnm=m>>repeatIO(n-1)mnewAccount::Integer->IOAccountnewAccountamount=newTVarIOamounttransfer::Integer->Account->Account->STM()transferamountfromto=dofromVal<-readTVarfromif(fromVal-amount)>=0thendodebitamountfromcreditamounttoelseretrycredit::Integer->Account->STM()creditamountaccount=docurrent<-readTVaraccountwriteTVaraccount(current+amount)debit::Integer->Account->STM()debitamountaccount=docurrent<-readTVaraccountwriteTVaraccount(current-amount)

which should print out "Bob's balance: 8000, Jill's balance: 6000". Here the atomically function has been used to run STM actions in the IO monad.

Related Research Articles

<span class="mw-page-title-main">Bookkeeping</span> Recording financial transactions or events

Bookkeeping is the recording of financial transactions, and is part of the process of accounting in business and other organizations. It involves preparing source documents for all transactions, operations, and other events of a business. Transactions include purchases, sales, receipts and payments by an individual person or an organization/corporation. There are several standard methods of bookkeeping, including the single-entry and double-entry bookkeeping systems. While these may be viewed as "real" bookkeeping, any process for recording financial transactions is a bookkeeping process.

In computer science, ACID is a set of properties of database transactions intended to guarantee data validity despite errors, power failures, and other mishaps. In the context of databases, a sequence of database operations that satisfies the ACID properties is called a transaction. For example, a transfer of funds from one bank account to another, even involving multiple changes such as debiting one account and crediting another, is a single transaction.

Double-entry bookkeeping, also known as double-entry accounting, is a method of bookkeeping that relies on a two-sided accounting entry to maintain financial information. Every entry to an account requires a corresponding and opposite entry to a different account. The double-entry system has two equal and corresponding sides known as debit and credit. A transaction in double-entry bookkeeping always affects at least two accounts, always includes at least one debit and one credit, and always has total debits and total credits that are equal. The purpose of double-entry bookkeeping is to allow the detection of financial errors and fraud.

A database transaction symbolizes a unit of work, performed within a database management system against a database, that is treated in a coherent and reliable way independent of other transactions. A transaction generally represents any change in a database. Transactions in a database environment have two main purposes:

  1. To provide reliable units of work that allow correct recovery from failures and keep a database consistent even in cases of system failure. For example: when execution prematurely and unexpectedly stops in which case many operations upon a database remain uncompleted, with unclear status.
  2. To provide isolation between programs accessing a database concurrently. If this isolation is not provided, the programs' outcomes are possibly erroneous.

In computer science, a lock or mutex is a synchronization primitive that prevents state from being modified or accessed by multiple threads of execution at once. Locks enforce mutual exclusion concurrency control policies, and with a variety of possible methods there exists multiple unique implementations for different applications.

<span class="mw-page-title-main">Debits and credits</span> Sides of an account in double-entry bookeeping

Debits and credits in double-entry bookkeeping are entries made in account ledgers to record changes in value resulting from business transactions. A debit entry in an account represents a transfer of value to that account, and a credit entry represents a transfer from the account. Each transaction transfers value from credited accounts to debited accounts. For example, a tenant who writes a rent check to a landlord would enter a credit for the bank account on which the check is drawn, and a debit in a rent expense account. Similarly, the landlord would enter a credit in the rent income account associated with the tenant and a debit for the bank account where the check is deposited.

In database systems, atomicity is one of the ACID transaction properties. An atomic transaction is an indivisible and irreducible series of database operations such that either all occur, or none occur. A guarantee of atomicity prevents partial database updates from occurring, because they can cause greater problems than rejecting the whole series outright. As a consequence, the transaction cannot be observed to be in progress by another database client. At one moment in time, it has not yet happened, and at the next it has already occurred in whole.

<span class="mw-page-title-main">Transaction account</span> Bank holding that clients can access on demand

A transaction account, also called a checking account, chequing account, current account, demand deposit account, or share draft account at credit unions, is a deposit account or bank account held at a bank or other financial institution. It is available to the account owner "on demand" and is available for frequent and immediate access by the account owner or to others as the account owner may direct. Access may be in a variety of ways, such as cash withdrawals, use of debit cards, cheques and electronic transfer. In economic terms, the funds held in a transaction account are regarded as liquid funds. In accounting terms, they are considered as cash.

In computer science, an algorithm is called non-blocking if failure or suspension of any thread cannot cause failure or suspension of another thread; for some operations, these algorithms provide a useful alternative to traditional blocking implementations. A non-blocking algorithm is lock-free if there is guaranteed system-wide progress, and wait-free if there is also guaranteed per-thread progress. "Non-blocking" was used as a synonym for "lock-free" in the literature until the introduction of obstruction-freedom in 2003.

In functional programming, a monad is a structure that combines program fragments (functions) and wraps their return values in a type with additional computation. In addition to defining a wrapping monadic type, monads define two operators: one to wrap a value in the monad type, and another to compose together functions that output values of the monad type. General-purpose languages use monads to reduce boilerplate code needed for common operations. Functional languages use monads to turn complicated sequences of functions into succinct pipelines that abstract away control flow, and side-effects.

In computer science, compare-and-swap (CAS) is an atomic instruction used in multithreading to achieve synchronization. It compares the contents of a memory location with a given value and, only if they are the same, modifies the contents of that memory location to a new given value. This is done as a single atomic operation. The atomicity guarantees that the new value is calculated based on up-to-date information; if the value had been updated by another thread in the meantime, the write would fail. The result of the operation must indicate whether it performed the substitution; this can be done either with a simple boolean response, or by returning the value read from the memory location.

<span class="mw-page-title-main">Trial balance</span> List of all business accounts in a ledger

A trial balance is a list of all the general ledger accounts contained in the ledger of a business. This list will contain the name of each nominal ledger account in the order of liquidity and the value of that nominal ledger balance. Each nominal ledger account will hold either a debit balance or a credit balance. The debit balance values will be listed in the debit column of the trial balance and the credit value balance will be listed in the credit column. The trading profit and loss statement and balance sheet and other financial reports can then be produced using the ledger accounts listed on the same balance.

<span class="mw-page-title-main">Linearizability</span> Property of some operation(s) in concurrent programming

In concurrent programming, an operation is linearizable if it consists of an ordered list of invocation and response events, that may be extended by adding response events such that:

  1. The extended list can be re-expressed as a sequential history.
  2. That sequential history is a subset of the original unextended list.

In computer science, software transactional memory (STM) is a concurrency control mechanism analogous to database transactions for controlling access to shared memory in concurrent computing. It is an alternative to lock-based synchronization. STM is a strategy implemented in software, rather than as a hardware component. A transaction in this context occurs when a piece of code executes a series of reads and writes to shared memory. These reads and writes logically occur at a single instant in time; intermediate states are not visible to other (successful) transactions. The idea of providing hardware support for transactions originated in a 1986 paper by Tom Knight. The idea was popularized by Maurice Herlihy and J. Eliot B. Moss. In 1995 Nir Shavit and Dan Touitou extended this idea to software-only transactional memory (STM). Since 2005, STM has been the focus of intense research and support for practical implementations is growing.

Lamport's bakery algorithm is a computer algorithm devised by computer scientist Leslie Lamport, as part of his long study of the formal correctness of concurrent systems, which is intended to improve the safety in the usage of shared resources among multiple threads by means of mutual exclusion.

<span class="mw-page-title-main">Payment card</span> Card issued by a financial institution that can be used to make a payment

Payment cards are part of a payment system issued by financial institutions, such as a bank, to a customer that enables its owner to access the funds in the customer's designated bank accounts, or through a credit account and make payments by electronic transfer with a payment terminal and access automated teller machines (ATMs). Such cards are known by a variety of names, including bank cards, ATM cards, client cards, key cards or cash cards.

In computer science and engineering, transactional memory attempts to simplify concurrent programming by allowing a group of load and store instructions to execute in an atomic way. It is a concurrency control mechanism analogous to database transactions for controlling access to shared memory in concurrent computing. Transactional memory systems provide high-level abstraction as an alternative to low-level thread synchronization. This abstraction allows for coordination between concurrent reads and writes of shared data in parallel systems.

Authorization hold is a service offered by credit and debit card providers whereby the provider puts a hold of the amount approved by the cardholder, reducing the balance of available funds until the merchant clears the transaction, after the transaction is completed or aborted, or because the hold expires.

This article describes the features in the programming language Haskell.

A concurrent hash-trie or Ctrie is a concurrent thread-safe lock-free implementation of a hash array mapped trie. It is used to implement the concurrent map abstraction. It has particularly scalable concurrent insert and remove operations and is memory-efficient. It is the first known concurrent data-structure that supports O(1), atomic, lock-free snapshots.

References

  1. Simon Peyton Jones, Andrew D. Gordon, and Sigbjorn Finne. Concurrent Haskell. ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (PoPL). 1996. (Some sections are out of date with respect to the current implementation.)
  2. The Haskell Hierarchical Libraries, Control.Concurrent Archived 2012-08-02 at archive.today
  3. Tim Harris, Simon Marlow, Simon Peyton Jones, and Maurice Herlihy. Composable Memory Transactions. ACM Symposium on Principles and Practice of Parallel Programming 2005 (PPoPP'05). 2005.
  4. Control.Concurrent.STM