Directory harvest attack

Last updated

A directory harvest attack (DHA) is a technique used by spammers in an attempt to find valid/existent e-mail addresses at a domain by using brute force. The attack is usually carried out by way of a standard dictionary attack, where valid e-mail addresses are found by brute force guessing valid e-mail addresses at a domain using different permutations of common usernames. These attacks are more effective for finding e-mail addresses of companies since they are likely to have a standard format for official e-mail aliases (i.e. jdoe@example.domain, johnd@example.domain, or johndoe@example.domain).

There are two main techniques for generating the addresses that a DHA targets. In the first, the spammer creates a list of all possible combinations of letters and numbers up to a maximum length and then appends the domain name. This would be described as a standard brute force attack. This technique would be impractical for usernames longer than 5-7 characters. For example, one would have to try 368 (nearly 3 trillion) e-mail addresses to exhaust all 8-character sequences.

The other, more targeted technique, is to create a list that combines common first name and surnames and initials (as in the example above). This would be considered a standard dictionary attack when guessing usernames for e-mail addresses. The success of a directory harvest attack relies on the recipient e-mail server rejecting e-mail sent to invalid recipient e-mail addresses during the Simple Mail Transfer Protocol (SMTP) session. Any addresses to which email is accepted are considered valid and are added to the spammer's list (which is commonly sold between spammers). Although the attack could also rely on Delivery Status Notifications (DSNs) to be sent to the sender address to notify of delivery failures, directory harvest attacks likely don't use a valid sender e-mail address.

The actual e-mail message generated to the recipient addresses will usually be a short random phrase such as "hello", so as not to trigger a spam filter. The actual content that is to be advertised will be sent in a later campaign to just the valid email addresses.

One theory is that spammers also use DHAs to disseminate spam, and not just to collect email addresses for a later spam campaign. Using the method in this way, similar to a paper-based leaflet drop, the sender achieves the goal based on sheer volume, and not on accuracy of delivery. Using this method, the message would likely contain the content that the spammer is advertising, and not a short random phrase.

Related Research Articles

Email Mail sent using electronic means

Electronic mail is a method of exchanging messages ("mail") between people using electronic devices. Email was thus conceived as the electronic (digital) version of, or counterpart to, mail, at a time when "mail" meant only physical mail. Email later became a ubiquitous communication medium, to the point that in current use, an e-mail address is often treated as a basic and necessary part of many processes in business, commerce, government, education, entertainment, and other spheres of daily life in most countries. Email is the medium, and each message sent therewith is called an email.

A mail exchanger record specifies the mail server responsible for accepting email messages on behalf of a domain name. It is a resource record in the Domain Name System (DNS). It is possible to configure several MX records, typically pointing to an array of mail servers for load balancing and redundancy.

An email address identifies an email box to which messages are delivered. While early messaging systems used a variety of formats for addressing, today, email addresses follow a set of specific rules originally standardized by the Internet Engineering Task Force (IETF) in the 1980s, and updated by RFC 5322 and 6854. The term email address in this article refers to addr-spec in RFC 5322, not to address or mailbox; i.e., a raw address without a display-name.

Various anti-spam techniques are used to prevent email spam.

Email spam Unsolicited electronic advertising by e-mail

Email spam, also referred to as junk email or simply spam, is unsolicited messages sent in bulk by email (spamming).

Sender Policy Framework (SPF) is an email authentication method designed to detect forging sender addresses during the delivery of the email. SPF alone, though, is limited to detecting a forged sender claim in the envelope of the email, which is used when the mail gets bounced. Only in combination with DMARC can it be used to detect the forging of the visible sender in emails, a technique often used in phishing and email spam.

Hashcash is a proof-of-work system used to limit email spam and denial-of-service attacks, and more recently has become known for its use in bitcoin as part of the mining algorithm. Hashcash was proposed in 1997 by Adam Back and described more formally in Back's 2002 paper "Hashcash - A Denial of Service Counter-Measure".

A joe job is a spamming technique that sends out unsolicited e-mails using spoofed sender data. Early joe jobs aimed at tarnishing the reputation of the apparent sender or inducing the recipients to take action against them, but they are now typically used by commercial spammers to conceal the true origin of their messages and to trick recipients into opening emails apparently coming from a trusted source.

A bounce message or just "bounce" is an automated message from an email system, informing the sender of a previous message that the message has not been delivered. The original message is said to have "bounced".

Email authentication, or validation, is a collection of techniques aimed at providing verifiable information about the origin of email messages by validating the domain ownership of any message transfer agents (MTA) who participated in transferring and possibly modifying a message.

Message submission agent

A message submission agent (MSA), or mail submission agent, is a computer program or software agent that receives electronic mail messages from a mail user agent (MUA) and cooperates with a mail transfer agent (MTA) for delivery of the mail. It uses ESMTP, a variant of the Simple Mail Transfer Protocol (SMTP), as specified in RFC 6409.

Disposable email addressing, also known as DEA or dark mail, refers to an approach which involves a unique email address being used for every contact, entity, or for a limited number of times or uses. The benefit is that if anyone compromises the address or utilizes it in connection with email abuse, the address owner can easily cancel it without affecting any of their other contacts.

Email harvesting or scraping is the process of obtaining lists of email addresses using various methods. Typically these are then used for bulk email or spam.

Email spoofing Creating email spam or phishing messages with a forged sender identity or address

Email spoofing is the creation of email messages with a forged sender address.

A challenge–response system is a type of spam filter that automatically sends a reply with a challenge to the (alleged) sender of an incoming e-mail. It was originally designed in 1997 by Stan Weatherby, and was called Email Verification. In this reply, the purported sender is asked to perform some action to assure delivery of the original message, which would otherwise not be delivered. The action to perform typically takes relatively little effort to do once, but great effort to perform in large numbers. This effectively filters out spammers. Challenge–response systems only need to send challenges to unknown senders. Senders that have previously performed the challenging action, or who have previously been sent e-mail(s) to, would be automatically whitelisted.

DomainKeys Identified Mail (DKIM) is an email authentication method designed to detect forged sender addresses in email, a technique often used in phishing and email spam.

Callback verification Technique used with SMTP to validate e-mail addresses

Callback verification, also known as callout verification or Sender Address Verification, is a technique used by SMTP software in order to validate e-mail addresses. The most common target of verification is the sender address from the message envelope. It is mostly used as an anti-spam measure.

Backscatter is incorrectly automated bounce messages sent by mail servers, typically as a side effect of incoming spam.

People tend to be much less bothered by spam slipping through filters into their mail box, than having desired e-mail ("ham") blocked. Trying to balance false negatives vs false positives is critical for a successful anti-spam system. As servers are not able to block all spam there are some tools for individual users to help control over this balance.

A cold email is an unsolicited e-mail that is sent to a receiver without prior contact. It could also be defined as the email equivalent of cold calling. Cold emailing is a subset of email marketing and differs from transactional and warm emailing.

References