Hashcash

Last updated July 16, 2024

Hashcash is a proof-of-work system used to limit email spam and denial-of-service attacks. Hashcash was proposed in 1997 by Adam Back ^[1] and described more formally in Back's 2002 paper "Hashcash - A Denial of Service Counter-Measure".^[2] In Hashcash the client has to concatenate a random number with a string several times and hash this new string. It then has to do so over and over until a hash beginning with a certain amount of zeros is found.^[3]

Background

The idea "...to require a user to compute a moderately hard, but not intractable function..." was proposed by Cynthia Dwork and Moni Naor in their 1992 paper "Pricing via Processing or Combatting Junk Mail".^[4]

How it works

Hashcash is a cryptographic hash-based proof-of-work algorithm that requires a selectable amount of work to compute, but the proof can be verified efficiently. For email uses, a textual encoding of a hashcash stamp is added to the header of an email to prove the sender has expended a modest amount of CPU time calculating the stamp prior to sending the email. In other words, as the sender has taken a certain amount of time to generate the stamp and send the email, it is unlikely that they are a spammer. The receiver can, at negligible computational cost, verify that the stamp is valid. However, the only known way to find a header with the necessary properties is brute force, trying random values until the answer is found; though testing an individual string is easy, satisfactory answers are rare enough that it will require a substantial number of tries to find the answer.

The hypothesis is that spammers, whose business model relies on their ability to send large numbers of emails with very little cost per message, will cease to be profitable if there is even a small cost for each spam they send. Receivers can verify whether a sender made such an investment and use the results to help filter email.

Technical details

The header line looks something like this:^[5]

X-Hashcash: 1:20:1303030600:adam@cypherspace.org::McMybZIhxKXu57jd:ckvi

The header contains:

ver: Hashcash format version, 1 (which supersedes version 0).
bits: Number of "partial pre-image" (zero) bits in the hashed code.
date: The time that the message was sent, in the format YYMMDD[hhmm[ss]].
resource: Resource data string being transmitted, e.g., an IP address or email address.
ext: Extension (optional; ignored in version 1).
rand: String of random characters, encoded in base-64 format.
counter: Binary counter, encoded in base-64 format.

The header contains the recipient's email address, the date of the message, and information proving that the required computation has been performed. The presence of the recipient's email address requires that a different header be computed for each recipient. The date allows the recipient to record headers received recently and to ensure that the header is unique to the email message.

Sender's side

The sender prepares a header and appends a counter value initialized to a random number. It then computes the 160-bit SHA-1 hash of the header. If the first 20 bits (i.e. the 5 most significant hex digits) of the hash are all zeros, then this is an acceptable header. If not, then the sender increments the counter and tries the hash again. Out of 2¹⁶⁰ possible hash values, there are 2¹⁴⁰ hash values that satisfy this criterion. Thus the chance of randomly selecting a header that will have 20 zeros as the beginning of the hash is 1 in 2²⁰ (approx. 10⁶, or about one in a million). The number of times that the sender needs to try to get a valid hash value is modeled by geometric distribution. Hence the sender will on average have to try 2²⁰ values to find a valid header. Given reasonable estimates of the time needed to compute the hash, this would take about one second to find. No more efficient method than this brute force approach is known to find a valid header.

A normal user on a desktop PC would not be significantly inconvenienced by the processing time required to generate the Hashcash string. However, spammers would suffer significantly due to the large number of spam messages sent by them.

Recipient's side

Technically the system is implemented with the following steps:

The recipient's computer calculates the 160-bit SHA-1 hash of the entire string (e.g., "1:20:060408:adam@cypherspace.org::1QTjaYd7niiQA/sc:ePa"). This takes about two microseconds on a 1 GHz machine, far less time than the time it takes for the rest of the e-mail to be received. If the first 20 bits are not all zero, the hash is invalid. (Later versions may require more bits to be zero as machine processing speeds increase.)
The recipient's computer checks the date in the header (e.g., "060408", which represents the date 8 Apr 2006). If it is not within two days of the current date, it is invalid. (The two-day window compensates for clock skew and network routing time between different systems.)
The recipient's computer checks whether the e-mail address in the hash string matches any of the valid e-mail addresses registered by the recipient, or matches any of the mailing lists to which the recipient is subscribed. If a match is not found, the hash string is invalid.
The recipient's computer inserts the hash string into a database. If the string is already in the database (indicating that an attempt is being made to re-use the hash string), it is invalid.

If the hash string passes all of these tests, it is considered a valid hash string. All of these tests take far less time and disk space than receiving the body content of the e-mail.

Required effort

The time needed to compute such a hash collision is exponential with the number of zero bits. So additional zero bits can be added (doubling the amount of time needed to compute a hash with each additional zero bit) until it is too expensive for spammers to generate valid header lines.

Confirming that the header is valid is much faster and always takes the same amount of time, no matter how many zero bits are required for a valid header, since this requires only a single hashing operation.

Advantages and disadvantages

The Hashcash system has the advantage over micropayment proposals applying to legitimate e-mail that no real money is involved. Neither the sender nor recipient need to pay, thus the administrative issues involved with any micropayment system and moral issues related to charging for e-mail are entirely avoided.

On the other hand, as Hashcash requires potentially significant computational resources to be expended on each e-mail being sent, it is somewhat difficult to tune the ideal amount of average time one wishes clients to expend computing a valid header. This can mean sacrificing accessibility from low-end embedded systems or else running the risk of hostile hosts not being challenged enough to provide an effective filter from spam.

Hashcash is also fairly simple to implement in mail user agents and spam filters. No central server is needed. Hashcash can be incrementally deployed—the extra Hashcash header is ignored when it is received by mail clients that do not understand it.

One plausible analysis^[6] concluded that only one of the following cases is likely: either non-spam e-mail will get stuck due to lack of processing power of the sender, or spam e-mail is bound to still get through. Examples of each include, respectively, a centralized e-mail topology (like a mailing list), in which some server is to send an enormous amount of legitimate e-mails, and botnets or cluster farms with which spammers can increase their processing power enormously.

Most of these issues may be addressed. E.g., botnets may expire faster because users notice the high CPU load and take counter-measures, and mailing list servers can be registered in white lists on the subscribers' hosts and thus be relieved from the hashcash challenges.

Another projected problem is that computers continue to get faster according to Moore's law. So the difficulty of the calculations required must be increased over time. However, developing countries can be expected to use older hardware, which means that they will find it increasingly difficult to participate in the e-mail system. This also applies to lower-income individuals in developed countries who cannot afford the latest hardware.

Like hashcash, cryptocurrencies use a hash function as their proof-of-work system. The rise of cryptocurrency has created a demand for ASIC-based mining machines. Although most cryptocurrencies use the SHA-256 hash function, the same ASIC technology could be used to create hashcash solvers that are three orders of magnitude faster than a consumer CPU, reducing the computational hurdle for spammers.

Applications

Bitcoin mining

In contrast to hashcash in mail applications that relies on recipients to set manually an amount of work intended to deter malicious senders, the Bitcoin cryptocurrency network employs a different hash-based proof-of-work challenge to enable competitive Bitcoin mining. A Bitcoin miner runs a computer program that collects unconfirmed transactions from users on the network. Together, these can form a "block" and earn a payment to the miner, but a block is only accepted by the network if its hash meets the network's difficulty target. Thus, as in hashcash, miners must discover by brute force the "nonce" that, when included in the block, results in an acceptable hash.

Unlike hashcash, Bitcoin's difficulty target does not specify a minimum number of leading zeros in the hash. Instead, the hash is interpreted as a (very large) integer, and this integer must be less than the target integer. This is necessary because the Bitcoin network must periodically adjust its difficulty level to maintain an average time of 10 minutes between successive blocks. If only leading zeros were considered, then the difficulty could only be doubled or halved, causing the adjustment to greatly overshoot or undershoot in response to small changes in the average block time. Still, the number of leading zeros in the target serves as a good approximation of the current difficulty. In January 2020, block #614525 had 74 leading zeros.

Spam filters

Hashcash was used as a potential solution for false positives with automated spam filtering systems, as legitimate users will rarely be inconvenienced by the extra time it takes to mine a stamp.^[7] SpamAssassin was able to check for Hashcash stamps since version 2.70 until version 3.4.2, assigning a negative score (i.e. less likely to be spam) for valid, unspent Hashcash stamps. However, although the hashcash plugin is on by default, it still needs to be configured with a list of address patterns that must match against the Hashcash resource field before it will be used.^[8] Support was removed from SpamAssassin's trunk on 2019-06-26, affecting version 3.4.3 and beyond.^[9]

Email clients

The Penny Post software project^[10] on SourceForge implements Hashcash in the Mozilla Thunderbird email client.^[11] The project is named for the historical availability of conventional mailing services that cost the sender just one penny; see Penny Post for information about such mailing services in history.

Email Postmark

Microsoft also designed and implemented a now deprecated^[12] open specification called "Email Postmark". It is similar to Hashcash.^[13] This was part of Microsoft's Coordinated Spam Reduction Initiative (CSRI).^[14] The Microsoft email postmark variant of Hashcash is implemented in the Microsoft mail infrastructure components Exchange, Outlook, and Hotmail. The format differences between Hashcash and Microsoft's email postmark are that postmark hashes the body in addition to the recipient, uses a modified SHA-1 as the hash function, and uses multiple sub-puzzles to reduce proof of work variance.

Blogs

Like e-mail, blogs often fall victim to comment spam. Some blog owners have used hashcash scripts written in the JavaScript language to slow down comment spammers.^[15] Some scripts (such as wp-hashcash) claim to implement hashcash but instead depend on JavaScript obfuscation to force the client to generate a matching key; while this does require some processing power, it does not use the hashcash algorithm or hashcash stamps.

Reputation

In a digital marketplace, service providers can use hashcash to build reputation to attract clients. To build reputation, a service provider first selects a public key as its ID, and then discovers by brute force a nonce that, when concatenated to the ID, results in a hash digest with several leading zeros. The more zeros, the higher the reputation.^[16]

Intellectual property

Hashcash is not patented, and the reference implementation^[17] and most of the other implementations are free software. Hashcash is included or available for many Linux distributions.

RSA has made IPR statements to the IETF about client-puzzles^[18] in the context of an RFC^[19] that described client-puzzles (not hashcash). The RFC included hashcash in the title and referenced hashcash, but the mechanism described in it is a known-solution interactive challenge which is more akin to Client-Puzzles; hashcash is non-interactive and therefore does not have a known solution. In any case RSA's IPR statement can not apply to hashcash because hashcash predates^[1] (March 1997) the client-puzzles publication^[20] (February 1999) and the client-puzzles patent filing US7197639^[21] (February 2000).

Notes

1 2 "A partial hash collision based postage scheme" (Txt). Hashcash.org. Retrieved 13 October 2014.
↑ "Hashcash - A Denial of Service Counter-Measure" (PDF). hashcash.org. 1 August 2002. Retrieved 2 January 2019.
↑ https://www.csc.kth.se/utbildning/kth/kurser/DD143X/dkand12/Group5Mikael/final/Jonatan_Landsberg_and_Anton_Lundqvist.pdf
↑ Dwork, Cynthia; Naor, Moni (18 May 2001). "Pricing via Processing or Combatting Junk Mail". Advances in Cryptology — CRYPTO' 92. Lecture Notes in Computer Science. Vol. 740. Springer. pp. 139–147. doi: 10.1007/3-540-48071-4_10 . ISBN 978-3-540-57340-1.
↑ "hashcash - hashcash anti-spam / denial of service counter-measure tool" (Txt). Hashcash.org. Retrieved 13 October 2014.
↑ "Hashcash proof-of-work paper" (PDF). Hashcash.org. Retrieved 13 October 2014.
↑ "Hashcash FAQ". Hashcash.org. 26 June 2003. Retrieved 11 February 2014.
↑ "Mail::SpamAssassin::Plugin::Hashcash - perform hashcash verification tests". spamassassin.apache.org. Retrieved 11 November 2021.
↑ "Bug 7728 - Remove HashCash support from trunk" . Retrieved 22 September 2023.
↑ "Penny Post software project on SourceForge". Pennypost.sourceforge.net. Retrieved 13 October 2014.
↑ "Penny Post: What do you mean by Postage Stamp?". Pennypost.sourceforge.net. 16 June 2008. Archived from the original on 19 February 2014. Retrieved 11 February 2014.
↑ "Discontinued features and modified functionality in Outlook 2010". Office.microsoft.com. Retrieved 13 October 2014.
↑ "Email Postmark Validation Algorithm" (PDF). download.microsoft.com. Retrieved 13 October 2014.
↑ "The Coordinated Spam Reduction Initiative: A Technology and Policy Proposal" (PDF). Archived from the original (PDF) on 21 October 2013. Retrieved 11 February 2014.
↑ WP-Hashcash, a plugin for Wordpress blog software Archived 2005-10-27 at the Wayback Machine that implements a Hashcash-like facility, written in JavaScript, by Elliott Back
↑ Rahimpour, Sonbol; Khabbazian, Majid (3 May 2021). "Hashcashed Reputation with Application in Designing Watchtowers". 2021 IEEE International Conference on Blockchain and Cryptocurrency (ICBC). IEEE. pp. 1–9. arXiv: 2012.10825 . doi:10.1109/icbc51069.2021.9461123. ISBN 978-1-6654-3578-9. S2CID 229340600.
↑ "C reference implementation". hashcash.org. Retrieved 13 October 2014.
↑ "RSA Security Inc. has submitted a patent application (US Serial No. 09/496,824)" (Txt). Ietf.org. Retrieved 13 October 2014.
↑ "SIP Computational Puzzles". Tools.ietf.org. Retrieved 13 October 2014.
↑ "Client Puzzles" (PDF). Retrieved 13 October 2014.
↑ "Client-puzzle patent filing" . Retrieved 13 October 2014.

Related Research Articles

<span class="mw-page-title-main">Checksum</span> Data used to detect errors in other data

A checksum is a small-sized block of data derived from another block of digital data for the purpose of detecting errors that may have been introduced during its transmission or storage. By themselves, checksums are often used to verify data integrity but are not relied upon to verify data authenticity.

Electronic mail is a method of transmitting and receiving messages using electronic devices. It was conceived in the late–20th century as the digital version of, or counterpart to, mail. Email is a ubiquitous and very widely used communication medium; in current use, an email address is often treated as a basic and necessary part of many processes in business, commerce, government, education, entertainment, and other spheres of daily life in most countries.

The Simple Mail Transfer Protocol (SMTP) is an Internet standard communication protocol for electronic mail transmission. Mail servers and other message transfer agents use SMTP to send and receive mail messages. User-level email clients typically use SMTP only for sending messages to a mail server for relaying, and typically submit outgoing email to the mail server on port 587 or 465 per RFC 8314. For retrieving messages, IMAP is standard, but proprietary servers also often implement proprietary protocols, e.g., Exchange ActiveSync.

Apache SpamAssassin is a computer program used for e-mail spam filtering. It uses a variety of spam-detection techniques, including DNS and fuzzy checksum techniques, Bayesian filtering, external programs, blacklists and online databases. It is released under the Apache License 2.0 and is a part of the Apache Foundation since 2004.

Various anti-spam techniques are used to prevent email spam.

Sender Policy Framework (SPF) is an email authentication method which ensures the sending mail server is authorized to originate mail from the email sender's domain. This authentication only applies to the email sender listed in the "envelope from" field during the initial SMTP connection. If the email is bounced, a message is sent to this address, and for downstream transmission it typically appears in the "Return-Path" header. To authenticate the email address which is actually visible to recipients on the "From:" line, other technologies such as DMARC must be used. Forgery of this address is known as email spoofing, and is often used in phishing and email spam.

Greylisting is a method of defending e-mail users against spam. A mail transfer agent (MTA) using greylisting will "temporarily reject" any email from a sender it does not recognize. If the mail is legitimate, the originating server will try again after a delay, and if sufficient time has elapsed, the email will be accepted.

The Penny Black Project is a Microsoft Research project that tries to find effective and practical ways of fighting spam. Because identifying spams consumes a recipient's time, the idea is to make the sender of emails "pay" a certain amount for sending them. The currency or the mode of payment could be CPU cycles, Turing tests or memory cycles. Such a payment would limit spammers' ability to send out large quantities of emails quickly.

Proof of work (PoW) is a form of cryptographic proof in which one party proves to others that a certain amount of a specific computational effort has been expended. Verifiers can subsequently confirm this expenditure with minimal effort on their part. The concept was invented by Moni Naor and Cynthia Dwork in 1993 as a way to deter denial-of-service attacks and other service abuses such as spam on a network by requiring some work from a service requester, usually meaning processing time by a computer. The term "proof of work" was first coined and formalized in a 1999 paper by Markus Jakobsson and Ari Juels. The concept was adapted to digital tokens by Hal Finney in 2004 through the idea of "reusable proof of work" using the 160-bit secure hash algorithm 1 (SHA-1).

The Sender Rewriting Scheme (SRS) is a scheme for bypassing the Sender Policy Framework's (SPF) methods of preventing forged sender addresses. Forging a sender address is also known as email spoofing.

Emailtracking is a method for monitoring whether the email message is read by the intended recipient. Most tracking technologies use some form of digitally time-stamped record to reveal the exact time and date when an email is received or opened, as well as the IP address of the recipient.

In computing, Bounce Address Tag Validation (BATV) is a method, defined in an Internet Draft, for determining whether the bounce address specified in an E-mail message is valid. It is designed to reject backscatter, that is, bounce messages to forged return addresses.

A challenge–response system is a type of that automatically sends a reply with a challenge to the (alleged) sender of an incoming e-mail. It was originally designed in 1997 by Stan Weatherby, and was called Email Verification. In this reply, the purported sender is asked to perform some action to assure delivery of the original message, which would otherwise not be delivered. The action to perform typically takes relatively little effort to do once, but great effort to perform in large numbers. This effectively filters out spammers. Challenge–response systems only need to send challenges to unknown senders. Senders that have previously performed the challenging action, or who have previously been sent e-mail(s) to, would be automatically receive a challenge.

DomainKeys Identified Mail (DKIM) is an email authentication method designed to detect forged sender addresses in email, a technique often used in phishing and email spam.

<span class="mw-page-title-main">Cryptographic nonce</span> Concept in cryptography

In cryptography, a nonce is an arbitrary number that can be used just once in a cryptographic communication. It is often a random or pseudo-random number issued in an authentication protocol to ensure that old communications cannot be reused in replay attacks. They can also be useful as initialization vectors and in cryptographic hash functions.

A Digital Postmark (DPM) is a technology that applies a trusted time stamp issued by a postal operator to an electronic document, validates electronic signatures, and stores and archives all non-repudiation data needed to support a potential court challenge. It guarantees the certainty of date and time of the postmarking. This global standard was renamed the Electronic Postal Certification Mark (EPCM) in 2007 shortly after a new iteration of the technology was developed by Microsoft and Poste Italiane. The key addition to the traditional postmarking technology was integrity of the electronically postmarked item, meaning any kind of falsification and tampering will be easily and definitely detected.

Trusted timestamping is the process of securely keeping track of the creation and modification time of a document. Security here means that no one—not even the owner of the document—should be able to change it once it has been recorded provided that the timestamper's integrity is never compromised.

Since spam occurs primarily because it is so cheap to send, a proposed set of solutions require that senders pay some cost in order to send spam, making it prohibitively expensive for spammers.

The Bitcoin protocol is the set of rules that govern the functioning of Bitcoin. Its key components and principles are: a peer-to-peer decentralized network with no central oversight; the blockchain technology, a public ledger that records all Bitcoin transactions; mining and proof of work, the process to create new bitcoins and verify transactions; and cryptographic security.

Proof of space (PoS) is a type of consensus algorithm achieved by demonstrating one's legitimate interest in a service by allocating a non-trivial amount of memory or disk space to solve a challenge presented by the service provider. The concept was formulated in 2013 by Dziembowski et al. and by Ateniese et al.. Proofs of space are very similar to proofs of work (PoW), except that instead of computation, storage is used to earn cryptocurrency. Proof-of-space is different from memory-hard functions in that the bottleneck is not in the number of memory access events, but in the amount of memory required.

References

Adam Back, "Hashcash - A Denial of Service Counter-Measure", technical report, August 2002 (PDF).
Ben Laurie and Richard Clayton, "'Proof-of-Work' Proves Not to Work", WEIS 04. (PDF).
Dwork, C. and Naor, M. (1992) "Pricing via Processing or Combating Junk Mail", Crypto '92, pp. 139–147. (PDF)

External links

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.

[Announce-1] 1 2 "A partial hash collision based postage scheme" (Txt). Hashcash.org. Retrieved 13 October 2014.

[Hashcash-2] "Hashcash - A Denial of Service Counter-Measure" (PDF). hashcash.org. 1 August 2002. Retrieved 2 January 2019.

[3] ttps://www.csc.kth.se/utbildning/kth/kurser/DD143X/dkand12/Group5Mikael/final/Jonatan_Landsberg_and_Anton_Lundqvist.pdf

[POW_Paper-4] Dwork, Cynthia; Naor, Moni (18 May 2001). "Pricing via Processing or Combatting Junk Mail". Advances in Cryptology — CRYPTO' 92. Lecture Notes in Computer Science. Vol. 740. Springer. pp. 139–147. doi: 10.1007/3-540-48071-4_10 . ISBN 978-3-540-57340-1.

[5] "hashcash - hashcash anti-spam / denial of service counter-measure tool" (Txt). Hashcash.org. Retrieved 13 October 2014.

[6] "Hashcash proof-of-work paper" (PDF). Hashcash.org. Retrieved 13 October 2014.

[7] "Hashcash FAQ". Hashcash.org. 26 June 2003. Retrieved 11 February 2014.

[8] "Mail::SpamAssassin::Plugin::Hashcash - perform hashcash verification tests". spamassassin.apache.org. Retrieved 11 November 2021.

[9] "Bug 7728 - Remove HashCash support from trunk" . Retrieved 22 September 2023.

[10] "Penny Post software project on SourceForge". Pennypost.sourceforge.net. Retrieved 13 October 2014.

[11] "Penny Post: What do you mean by Postage Stamp?". Pennypost.sourceforge.net. 16 June 2008. Archived from the original on 19 February 2014. Retrieved 11 February 2014.

[12] "Discontinued features and modified functionality in Outlook 2010". Office.microsoft.com. Retrieved 13 October 2014.

[13] "Email Postmark Validation Algorithm" (PDF). download.microsoft.com. Retrieved 13 October 2014.

[14] "The Coordinated Spam Reduction Initiative: A Technology and Policy Proposal" (PDF). Archived from the original (PDF) on 21 October 2013. Retrieved 11 February 2014.

[15] WP-Hashcash, a plugin for Wordpress blog software Archived 2005-10-27 at the Wayback Machine that implements a Hashcash-like facility, written in JavaScript, by Elliott Back

[16] Rahimpour, Sonbol; Khabbazian, Majid (3 May 2021). "Hashcashed Reputation with Application in Designing Watchtowers". 2021 IEEE International Conference on Blockchain and Cryptocurrency (ICBC). IEEE. pp. 1–9. arXiv: 2012.10825 . doi:10.1109/icbc51069.2021.9461123. ISBN 978-1-6654-3578-9. S2CID 229340600.

[17] "C reference implementation". hashcash.org. Retrieved 13 October 2014.

[18] "RSA Security Inc. has submitted a patent application (US Serial No. 09/496,824)" (Txt). Ietf.org. Retrieved 13 October 2014.

[19] "SIP Computational Puzzles". Tools.ietf.org. Retrieved 13 October 2014.

[20] "Client Puzzles" (PDF). Retrieved 13 October 2014.

[21] "Client-puzzle patent filing" . Retrieved 13 October 2014.

[1]

[2]

[3]

[4]

[5]

[6]

[7]

[8]

[9]

[10]

[11]

[12]

[13]

[14]

[15]

[16]

[17]

[18]

[19]

[20]

[21]