A tarpit is a service on a computer system (usually a server) that purposely delays incoming connections. The technique was developed as a defense against a computer worm. The idea is that network abuses such as spamming or broad scanning are less effective, and therefore less attractive, if they take too long. The concept is analogous to a tar pit, in which animals can get bogged down and slowly sink under the surface, as in a swamp.
Tom Liston developed the original tarpitting program LaBrea. [1] It can protect an entire network with a tarpit run on a single machine.
The machine listens for Address Resolution Protocol requests that go unanswered (indicating unused addresses), then replies to those requests, receives the initial SYN packet of the scanner and sends a SYN/ACK in response. It does not open a socket or prepare a connection; in fact, it can forget all about the connection after sending the SYN/ACK. However, the remote site sends its ACK (which gets ignored) and believes the three-way handshake to be complete. It then starts to send data, which never reaches a destination. The connection will eventually time out, but because the remote system believes it is dealing with a live (established) connection, it is conservative in timing it out: it retransmits, backs off, retransmits again, and so on for quite a while.
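To get a feel for how long a scanner can be held this way, the retransmission schedule can be approximated with a simple exponential back-off model. The sketch below is illustrative only: the 1-second initial timeout, the 120-second cap and the 15 retries are assumed values, not figures taken from LaBrea or from any particular TCP stack.

```python
def stalled_seconds(initial_rto=1.0, max_rto=120.0, retries=15):
    """Total time a sender waits across `retries` retransmissions.

    Assumes the timeout doubles after every unanswered retransmission
    and is capped at `max_rto`. All three parameters are assumptions
    chosen for illustration.
    """
    total = 0.0
    rto = initial_rto
    for _ in range(retries):
        total += rto                 # wait out the current timeout
        rto = min(rto * 2, max_rto)  # back off, up to the cap
    return total
```

Under these assumed values, `stalled_seconds()` returns 1087.0, so a single tarpitted connection would occupy the sender for roughly 18 minutes.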
Later versions of LaBrea also added functionality to reply to the incoming data with bogus packets that request that the sending site "slow down", again using raw IP packets and no sockets or other resources of the tarpit server. This keeps the connection established and wastes even more of the scanner's time.
One avenue once considered for battling bulk spam was to mandate a small fee for every submitted mail. Such an artificial cost would have negligible impact on legitimate use as long as the fee is small enough, but it would make automated mass-scale spam instantly unattractive. Tarpitting can be seen as a similar (but technically much less complex) approach, in which the cost to the spammer is measured in time and efficiency rather than money.
Authentication procedures increase response times as users attempt invalid passwords, and SMTP authentication is no exception. However, server-to-server SMTP transfers, which are where spam is injected, require no authentication. Various methods have been discussed and implemented for SMTP tarpits, systems that plug into the Mail Transfer Agent (MTA, i.e. the mail server software) or sit in front of it as a proxy.
One method increases transfer time for all mails by a few seconds by delaying the initial greeting message ("greet delay"). The idea is that it will not matter if a legitimate mail takes a little longer to deliver, but due to the high volume, it will make a difference for spammers. The downside of this is that mailing lists and other legitimate mass-mailings will have to be explicitly whitelisted or they will suffer, too.
Some email systems, such as sendmail 8.13+, implement a stronger form of greet delay. This form pauses when the connection is first established and listens for traffic. If it detects any traffic prior to its own greeting (in violation of RFC 2821) it closes the connection. Since many spammers do not write their SMTP implementations to the specification, this can reduce the number of incoming spam messages.
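The two forms of greet delay described above can be sketched as a single connection handler. This is a minimal illustration using Python's `asyncio` streams, not sendmail's implementation; the function name `greet_with_pause`, the 5-second default and the banner host name `mail.example.test` are all assumptions.

```python
import asyncio

async def greet_with_pause(reader, writer, pause=5.0,
                           banner=b"220 mail.example.test ESMTP\r\n"):
    """Delay the SMTP greeting by `pause` seconds and drop early talkers.

    Returns True if the greeting was sent, False if the client sent data
    (or disconnected) before the greeting and the connection was closed.
    """
    try:
        # Listen for traffic during the pause; a compliant client is silent.
        early = await asyncio.wait_for(reader.read(1), timeout=pause)
    except asyncio.TimeoutError:
        early = None  # nothing arrived: the client waited for the greeting
    if early is not None:
        # Data before the greeting, or EOF (b""): close without greeting.
        writer.close()
        await writer.wait_closed()
        return False
    writer.write(banner)
    await writer.drain()
    return True
```

A handler of this shape could be passed (wrapped so that it continues with the SMTP dialogue on success) to `asyncio.start_server`. Every legitimate delivery pays the `pause` once per connection, which is the cost the plain greet delay imposes on all senders.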
Another method is to delay only known spammers, e.g. by using a blacklist (see Spamming, DNSBL). OpenBSD has integrated this method into their core system since OpenBSD 3.3, [2] with a special-purpose daemon (spamd) and functionality in the firewall (pf) to redirect known spammers to this tarpit.
MS Exchange can tarpit senders who send to an invalid address. Exchange can do this because the SMTP connector is connected to the authentication system.
A more subtle idea is greylisting, which, in simple terms, rejects the first connection attempt from any previously unseen IP address. The assumption is that most spammers make only one connection attempt (or a few attempts over a short period of time) to send each message, whereas legitimate mail delivery systems will keep retrying over a longer period. After they retry, they will eventually be allowed in without any further impediments.
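The greylisting rule can be sketched as a small lookup keyed by IP address. This is a simplified illustration: the class name `Greylist`, the 300-second minimum retry delay and the in-memory dictionary are assumptions, and real implementations typically track more than the IP address and expire old entries.

```python
import time

class Greylist:
    """Reject the first attempt from an unseen IP; accept later retries."""

    def __init__(self, min_delay=300.0, clock=time.monotonic):
        self.min_delay = min_delay  # assumed minimum wait before a retry counts
        self.clock = clock          # injectable so the logic can be tested
        self.first_seen = {}        # ip -> time of first connection attempt

    def allow(self, ip):
        """Return True to accept the connection, False to defer it."""
        now = self.clock()
        first = self.first_seen.setdefault(ip, now)
        # A brand-new IP has waited 0 seconds, so it is always deferred.
        return now - first >= self.min_delay
```

A sender that retries only within a short window keeps being deferred, while one that retries after `min_delay` has passed is let through on that attempt and on every later one.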
Finally, a more elaborate method tries to glue tarpits and filtering software together by filtering e-mail in real time, while it is being transmitted, and adding delays to the communication in response to the filter's "spam likeliness" indicator. For example, the spam filter would make a "guess" after each line, or after every x bytes received, as to how likely the message is to be spam. The more likely this is, the more the MTA delays the transmission.
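One way to picture this coupling is a function that turns the filter's running estimate into a pause after each received line. The sketch below is hypothetical throughout: the linear mapping, the 30-second maximum and the `score_so_far` callback stand in for whatever a real filter and MTA would use.

```python
def delay_for_score(spam_score, max_delay=30.0):
    """Map a spam-likeliness score in [0, 1] to a pause in seconds."""
    clamped = min(max(spam_score, 0.0), 1.0)  # tolerate out-of-range scores
    return clamped * max_delay

def plan_delays(lines, score_so_far, max_delay=30.0):
    """Return the pause to insert after each received line.

    `score_so_far` is a hypothetical callable that takes the list of
    lines seen so far and returns the filter's current estimate in [0, 1].
    """
    seen = []
    delays = []
    for line in lines:
        seen.append(line)
        delays.append(delay_for_score(score_so_far(seen), max_delay))
    return delays
```

With a linear mapping, a message the filter considers harmless passes through almost undelayed, while one scored near 1 is slowed by close to `max_delay` per line.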
SMTP consists of requests, which are mostly four-letter words such as MAIL, and replies, which are (minimally) three-digit numbers. In the last line of a reply, the number is followed by a space; in the preceding lines it is followed by a hyphen. Thus, on determining that a message being submitted is spam, a mail server can reply:
451-Ophiomyia prima is an agromyzid fly
451-Ophiomyia secunda is an agromyzid fly
451-Ophiomyia tertia is an agromyzid fly
451-Ophiomyia quarta is an agromyzid fly
451-Ophiomyia quinta is an agromyzid fly
451-Ophiomyia sexta is an agromyzid fly
451-Ophiomyia septima is an agromyzid fly
451 Your IP address is listed in the DNSBL. Please try again later.
The tarpit waits fifteen or more seconds between lines (long delays are allowed in SMTP, as humans sometimes send mail manually to test mail servers). This ties up the SMTP sending process on the spammer's computer so as to limit the amount of spam it can send.
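The multi-line reply format and the pause between lines can be sketched as follows. This is a minimal illustration rather than any particular server's code; the function names `format_reply` and `drip_reply` are assumptions, and `writer` is expected to behave like an `asyncio.StreamWriter`.

```python
import asyncio

def format_reply(code, lines):
    """Build SMTP reply lines: 'code-text' for all but the last line,
    'code text' for the last one."""
    if not lines:
        raise ValueError("a reply needs at least one line")
    out = [f"{code}-{text}\r\n" for text in lines[:-1]]
    out.append(f"{code} {lines[-1]}\r\n")
    return out

async def drip_reply(writer, code, lines, pause=15.0):
    """Send the reply one line at a time, sleeping `pause` seconds
    before each line so the sending process stays tied up."""
    for wire_line in format_reply(code, lines):
        await asyncio.sleep(pause)
        writer.write(wire_line.encode("ascii"))
        await writer.drain()
```

Because the handler only sleeps, the tarpit itself spends almost no CPU per connection, although each open connection still occupies memory on the server.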
The Linux kernel can now be patched to allow tarpitting of incoming connections instead of the more usual dropping of packets. This is implemented in iptables by the addition of a TARPIT target. [3] The same packet inspection and matching features can be applied to tarpit targets as are applied to other targets.
A server can determine that a given mail message is spam, e.g. because it was addressed to a spam trap, or on the basis of trusted users' reports. The server may then decide that the IP address responsible for submitting the message deserves tarpitting. Cross-checking against available DNSBLs can help to avoid including innocent forwarders in the tarpit database. A daemon using Linux libipq can then check the remote address of incoming SMTP connections against that database. SpamCannibal is GPL-licensed software designed around this idea; [4] Stockade is a similar project implemented using FreeBSD ipfirewall.
One advantage of tarpitting at the IP level is that it avoids the per-connection state an MTA must keep. Regular TCP connections handled by an MTA are stateful: although the MTA does not use much CPU while it sleeps, it still uses the memory required to hold the state of each connection. By contrast, LaBrea-style tarpitting is stateless, so it costs the defender far less than it costs the spammer's machine. However, spammers who use botnets can externalize most of their computing-resource costs.
It is known that a tarpitted connection may generate a significant amount of traffic towards the receiver, because the sender considers the connection established and tries to send (and then retransmit) actual data. In practice, given the current average size of botnets, a more reasonable solution is to drop suspicious traffic completely, without tarpitting. This way, only TCP SYN segments are retransmitted, not whole HTTP or HTTPS requests. [5]
As well as MS Exchange, there have been two other successful commercial implementations of the tarpit idea. The first was developed by TurnTide, a Philadelphia-based startup company, which was acquired by Symantec in 2004 for $28 million in cash. [6] The TurnTide Anti Spam Router contains a modified Linux kernel that allows it to play various tricks with TCP traffic, such as varying the TCP window size. By grouping email senders into different traffic classes and limiting the bandwidth for each class, it reduces the amount of abusive traffic, particularly when that traffic comes from single sources that are easily identified by their high traffic volume.
After the Symantec acquisition, a Canadian startup company called MailChannels released their "Traffic Control" software, which uses a slightly different approach to achieve similar results. Traffic Control is a semi-realtime SMTP proxy. Unlike the TurnTide appliance, which applies traffic shaping at the network layer, Traffic Control applies traffic shaping to individual senders at the application layer. This approach results in a somewhat more effective handling of spam traffic originating from botnets because it allows the software to slow traffic from individual spam zombies, rather than requiring zombie traffic to be aggregated into a class.
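Per-sender shaping at the application layer can be illustrated with a token bucket kept for each sender IP. This is a generic rate-limiting technique chosen for illustration, not a description of how Traffic Control or the TurnTide appliance is implemented; the class name `SenderThrottle` and the default rate and burst values are assumptions.

```python
import time

class SenderThrottle:
    """One token bucket per sender IP (a generic technique, assumed here)."""

    def __init__(self, rate=1.0, burst=5.0, clock=time.monotonic):
        self.rate = rate      # tokens (messages) replenished per second
        self.burst = burst    # bucket capacity
        self.clock = clock    # injectable so the logic can be tested
        self.buckets = {}     # ip -> (tokens, time of last update)

    def delay_for(self, ip):
        """Consume one token for `ip` and return how many seconds the
        caller should stall this sender before proceeding."""
        now = self.clock()
        tokens, last = self.buckets.get(ip, (self.burst, now))
        # Refill according to elapsed time, never above the capacity.
        tokens = min(self.burst, tokens + (now - last) * self.rate)
        tokens -= 1.0
        self.buckets[ip] = (tokens, now)
        # A negative balance is the debt the sender must wait out.
        return 0.0 if tokens >= 0 else -tokens / self.rate
```

Because each IP address has its own bucket, a single busy zombie is slowed without affecting other senders, which is the property the paragraph above attributes to shaping individual senders rather than aggregated classes.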