Email address

An email address identifies an email box to which messages are delivered. While early messaging systems used a variety of addressing formats, email addresses today follow a set of specific rules originally standardized by the Internet Engineering Task Force (IETF) in the 1980s and updated by RFC 5322 and RFC 6854. The term email address in this article refers to just the addr-spec in Section 3.4 of RFC 5322. The RFC defines address more broadly as either a mailbox or group. A mailbox value can be either a name-addr, which contains a display-name and addr-spec, or the more common addr-spec alone.

An email address, such as john.smith@example.com, is made up of a local-part, the symbol @, and a domain, which may be a domain name or an IP address enclosed in brackets. Although the standard requires the local-part to be case-sensitive, [1] it also urges that receiving hosts deliver messages in a case-independent manner, [2] e.g., that the mail system in the domain example.com treat John.Smith as equivalent to john.smith; some mail systems even treat them as equivalent to johnsmith. [3] Mail systems often limit the users' choice of name to a subset of the technically permitted characters.

With the introduction of internationalized domain names, efforts are progressing to permit non-ASCII characters in email addresses.

Message transport

An email address consists of two parts, a local-part (sometimes a user name, but not always) and a domain; if the domain is a domain name rather than an IP address then the SMTP client uses the domain name to look up the mail exchange IP address. The general format of an email address is local-part@domain, e.g. jsmith@[192.168.1.2], jsmith@example.com. The SMTP client transmits the message to the mail exchange, which may forward it to another mail exchange until it eventually arrives at the host of the recipient's mail system.
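
The two-part structure can be illustrated with a minimal Python sketch. The function names split_address and is_bracketed_literal are hypothetical, and the sketch only separates the parts; it does not validate them. Splitting at the last @ is a design choice that tolerates quoted local-parts containing an @, since the domain itself never contains one.

```python
def split_address(address: str) -> tuple[str, str]:
    """Split 'local-part@domain' at the last '@' (hypothetical helper)."""
    local_part, at_sign, domain = address.rpartition("@")
    if not at_sign or not local_part or not domain:
        raise ValueError(f"not of the form local-part@domain: {address!r}")
    return local_part, domain


def is_bracketed_literal(domain: str) -> bool:
    """True if the domain is written as a bracketed IP address literal."""
    return domain.startswith("[") and domain.endswith("]")


for example in ("jsmith@example.com", "jsmith@[192.168.1.2]"):
    local_part, domain = split_address(example)
    kind = "IP address literal" if is_bracketed_literal(domain) else "domain name"
    print(f"{local_part!r} at {domain!r} ({kind})")
```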

The transmission of electronic mail from the author's computer and between mail hosts in the Internet uses the Simple Mail Transfer Protocol (SMTP), defined in RFC 5321 and RFC 5322, and extensions such as RFC 6531. The mailboxes may be accessed and managed by applications on personal computers, mobile devices or webmail sites, using SMTP and either the Post Office Protocol (POP) or the Internet Message Access Protocol (IMAP).

When transmitting email messages, mail user agents (MUAs) and mail transfer agents (MTAs) use the Domain Name System (DNS) to look up a resource record (RR) for the recipient's domain. A mail exchanger resource record (MX record) contains the name of the recipient's mail server. In the absence of an MX record, an address record (A or AAAA) directly specifies the mail host.
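
The selection logic can be sketched as follows. This is an illustration only: it performs no DNS queries, the function name delivery_targets is hypothetical, and the MX records are assumed to have been fetched already as (preference, hostname) pairs. Trying the lowest preference value first follows the ordering described in RFC 5321.

```python
def delivery_targets(domain: str, mx_records: list[tuple[int, str]]) -> list[str]:
    """Return the hosts to try, in order, for mail addressed to `domain`.

    `mx_records` is a list of (preference, hostname) pairs that the caller
    has already obtained from DNS; an empty list means no MX record exists.
    """
    if not mx_records:
        # No MX record: fall back to the domain's own A/AAAA record.
        return [domain]
    # Lower preference values are tried first.
    return [host for _, host in sorted(mx_records)]


print(delivery_targets("example.com", [(20, "backup.example.com"), (10, "mail.example.com")]))
print(delivery_targets("example.com", []))
```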

The local-part of an email address has no significance for intermediate mail relay systems other than the final mailbox host. Email senders and intermediate relay systems must not assume it to be case-insensitive, since the final mailbox host may or may not treat it as such. A single mailbox may receive mail for multiple email addresses, if configured by the administrator. Conversely, a single email address may be the alias to a distribution list to many mailboxes. Email aliases, electronic mailing lists, sub-addressing, and catch-all addresses, the latter being mailboxes that receive messages regardless of the local-part, are common patterns for achieving a variety of delivery goals.

The addresses found in the header fields of an email message are not directly used by mail exchanges to deliver the message. An email message also contains a message envelope that contains the information for mail routing. While envelope and header addresses may be equal, forged email addresses (also called spoofed email addresses) are often seen in spam, phishing, and many other Internet-based scams. This has led to several initiatives which aim to make such forgeries easier to spot.
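
The distinction between header and envelope addresses can be made concrete with Python's standard email package. The sketch below is illustrative: the addresses and the helper build_message are made up for the example, and nothing is actually sent.

```python
from email.message import EmailMessage


def build_message(header_from: str, header_to: str, body: str) -> EmailMessage:
    """Build a message whose From and To are header fields only."""
    msg = EmailMessage()
    msg["From"] = header_from
    msg["To"] = header_to
    msg["Subject"] = "Envelope versus header"
    msg.set_content(body)
    return msg


message = build_message("John Smith <john.smith@example.org>", "jsmith@example.com", "Hello")

# Envelope addresses travel in the SMTP dialogue, not in the message text,
# and nothing forces them to match the header fields above.
envelope_sender = "bounces@example.org"
envelope_recipients = ["jsmith@example.com"]

# With a reachable server, smtplib carries the two sets separately, e.g.:
#   smtplib.SMTP(host).send_message(
#       message, from_addr=envelope_sender, to_addrs=envelope_recipients)
print(message["From"], "| envelope sender:", envelope_sender)
```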

Syntax

The format of an email address is local-part@domain, where the local-part may be up to 64 octets long and the domain may have a maximum of 255 octets. [4] The formal definitions are in RFC 5322 (sections 3.2.3 and 3.4.1) and RFC 5321—with a more readable form given in the informational RFC 3696 (written by J. Klensin, the author of RFC 5321) and the associated errata.
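
Because the limits are expressed in octets rather than characters, a length check has to measure the encoded form. The following sketch assumes UTF-8 encoding and uses a hypothetical function name; it checks lengths only, not syntax.

```python
MAX_LOCAL_PART_OCTETS = 64
MAX_DOMAIN_OCTETS = 255


def within_length_limits(address: str) -> bool:
    """Check the octet limits on each side of the last '@'."""
    local_part, _, domain = address.rpartition("@")
    return (
        len(local_part.encode("utf-8")) <= MAX_LOCAL_PART_OCTETS
        and len(domain.encode("utf-8")) <= MAX_DOMAIN_OCTETS
    )


print(within_length_limits("simple@example.com"))
# 33 characters, but 66 octets once encoded as UTF-8:
print(within_length_limits("é" * 33 + "@example.com"))
```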

An email address also may have an associated "display-name" (Display Name) for the recipient, which precedes the address specification, now surrounded by angled brackets, for example: John Smith <john.smith@example.org>. [5] Email spammers and phishers will often use "Display Name spoofing" to trick their victims, by using a false Display Name, or by using a different email address as the Display Name. [6]
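
Python's standard library can separate and recombine the two pieces. The sketch below uses email.utils.parseaddr and email.utils.formataddr; the helper display_name_looks_like_address is a hypothetical, deliberately crude heuristic for one form of Display Name spoofing, not a complete defence.

```python
from email.utils import formataddr, parseaddr

display_name, addr_spec = parseaddr("John Smith <john.smith@example.org>")
print(display_name)  # John Smith
print(addr_spec)     # john.smith@example.org

# formataddr performs the reverse operation.
rebuilt = formataddr((display_name, addr_spec))


def display_name_looks_like_address(header_value: str) -> bool:
    """Flag a display name that itself contains an '@' (crude heuristic)."""
    name, _ = parseaddr(header_value)
    return "@" in name
```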

Earlier forms of email addresses for other networks than the Internet included other notations, such as that required by X.400, and the UUCP bang path notation, in which the address was given in the form of a sequence of computers through which the message should be relayed. This was widely used for several years, but was superseded by the Internet standards promulgated by the Internet Engineering Task Force (IETF).

Local-part

The local-part of the email address may be unquoted or may be enclosed in quotation marks.

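If unquoted, it may use any of these ASCII characters:

  • uppercase and lowercase Latin letters A to Z and a to z
  • digits 0 to 9
  • printable characters !#$%&'*+-/=?^_`{|}~
  • dot ., provided that it is not the first or last character and does not appear consecutively (e.g., John..Doe@example.com is not allowed).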

If quoted, it may contain Space, Horizontal Tab (HT), any ASCII graphic except Backslash and Quote and a quoted-pair consisting of a Backslash followed by HT, Space or any ASCII graphic; it may also be split between lines anywhere that HT or Space appears. In contrast to unquoted local-parts, the addresses ".John.Doe"@example.com, "John.Doe."@example.com and "John..Doe"@example.com are allowed.

The maximum total length of the local-part of an email address is 64 octets. [8]

In addition to the above ASCII characters, international characters above U+007F, encoded as UTF-8, are permitted by RFC 6531 when the EHLO specifies SMTPUTF8, though even mail systems that support SMTPUTF8 and 8BITMIME may restrict which characters to use when assigning local-parts.

A local-part is either a Dot-string or a Quoted-string; it cannot be a combination. Quoted strings and characters, however, are not commonly used. RFC 5321 also warns that "a host that expects to receive mail SHOULD avoid defining mailboxes where the Local-part requires (or uses) the Quoted-string form".
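
A check for the unquoted (Dot-string) form can be sketched with a regular expression. The character class below is assumed to be the atext set of RFC 5322 (letters, digits and !#$%&'*+-/=?^_`{|}~), and the function name is hypothetical. Quoted-string local-parts are deliberately out of scope and are reported as not matching.

```python
import re

# One run of permitted characters; dots may only separate such runs.
ATEXT = r"[A-Za-z0-9!#$%&'*+/=?^_`{|}~-]"
DOT_STRING = re.compile(rf"{ATEXT}+(?:\.{ATEXT}+)*")


def is_dot_string(local_part: str) -> bool:
    """True if `local_part` is a valid unquoted local-part of at most 64 octets."""
    # Anything that matches is pure ASCII, so characters equal octets here.
    return len(local_part) <= 64 and DOT_STRING.fullmatch(local_part) is not None


for candidate in ("john.smith", "user.name+tag+sorting", ".John.Doe", "John..Doe"):
    print(candidate, is_dot_string(candidate))
```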

The local-part postmaster is treated specially—it is case-insensitive, and should be forwarded to the domain email administrator. Technically all other local-parts are case-sensitive, therefore johns@example.com and JohnS@example.com specify different mailboxes; however, many organizations treat uppercase and lowercase letters as equivalent. Indeed, RFC 5321 warns that "a host that expects to receive mail SHOULD avoid defining mailboxes where ... the Local-part is case-sensitive".

Despite the wide range of special characters which are technically valid, organisations, mail services, mail servers and mail clients in practice often do not accept all of them. For example, Windows Live Hotmail only allows creation of email addresses using alphanumerics, dot (.), underscore (_) and hyphen (-). [9] Common advice is to avoid using some special characters to avoid the risk of rejected emails. [10]

According to RFC 5321 2.3.11 Mailbox and Address, "the local-part MUST be interpreted and assigned semantics only by the host specified in the domain of the address". This means that no assumptions can be made about the meaning of the local-part of another mail server. It is entirely up to the configuration of the mail server.

Interpretation of the local-part depends on the conventions and policies implemented in the mail server. For example, case sensitivity may distinguish mailboxes differing only in capitalization of characters of the local-part, although this is not very common. [11] As another example, Gmail ignores all dots in the local-part of a @gmail.com address for the purposes of determining account identity. [12]

Sub-addressing

Some mail services support a tag included in the local-part, such that the address is an alias to a prefix of the local-part. Typically the tag consists of the characters following a plus, and less often the characters following a minus, so fred+bah@domain and fred+foo@domain might end up in the same inbox as fred+@domain or even as fred@domain. For example, the address joeuser+tag@example.com denotes the same delivery address as joeuser@example.com. RFC 5233 [13] refers to this convention as subaddressing, but it is also known as plus addressing, tagged addressing or mail extensions. This can be useful for tagging emails for sorting, and for spam control. [14]
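
A receiving system might separate the base name from the tag along the following lines. This is a sketch with a hypothetical function name; the separator is a parameter because, as noted, services differ in which character they use, and whether the tag is honoured at all is up to the receiving host.

```python
from typing import Optional


def split_subaddress(local_part: str, separator: str = "+") -> tuple[str, Optional[str]]:
    """Split a local-part into (base, tag) at the first separator.

    The tag is None when no separator is present, and an empty string
    when the separator is the last character (as in 'fred+').
    """
    base, found, tag = local_part.partition(separator)
    return (base, tag) if found else (local_part, None)


print(split_subaddress("joeuser+tag"))
print(split_subaddress("fred-foo", separator="-"))
```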

Addresses of this form, using various separators between the base name and the tag, are supported by several email services, including Andrew Project (plus), [15] Runbox (plus), [16] Gmail (plus), [14] Rackspace (plus), Yahoo! Mail Plus (hyphen), [17] Apple's iCloud (plus), Outlook.com (plus), [18] Proton Mail (plus), [19] Fastmail (plus and Subdomain Addressing), [20] postale.io (plus), [21] Pobox (plus), [22] MeMail (plus), [23] MMDF (equals), Qmail and Courier Mail Server (hyphen). [24] [25] Postfix and Exim allow configuring an arbitrary separator from the legal character set. [26] [27]

The text of the tag may be used to apply filtering, [24] or to create single-use, or disposable email addresses. [28]

Domain

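The domain name part of an email address has to conform to strict guidelines: it must match the requirements for a hostname, a list of dot-separated DNS labels, each label being limited to a length of 63 characters and consisting of: [7]:§2

  • uppercase and lowercase Latin letters A to Z and a to z;
  • digits 0 to 9, provided that top-level domain names are not all-numeric;
  • hyphen -, provided that it is not the first or last character.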

This rule is known as the LDH rule (letters, digits, hyphen). In addition, the domain may be an IP address literal, surrounded by square brackets [], such as jsmith@[192.168.2.1] or jsmith@[IPv6:2001:db8::1], although this is rarely seen except in email spam. Internationalized domain names (which are encoded to comply with the requirements for a hostname) allow for presentation of non-ASCII domains. In mail systems compliant with RFC 6531 and RFC 6532, both the local-part and the domain name of an email address may be encoded as UTF-8.
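
Both forms of domain can be checked with short Python helpers. The sketch below is a simplified illustration with hypothetical function names: the hostname check applies the LDH rule, the 63-character label limit and the 255-octet overall limit, but does not test the all-numeric top-level domain restriction, and the literal check relies on the standard ipaddress module.

```python
import ipaddress
import re

# A label: 1 to 63 letters, digits or hyphens, not starting or ending with a hyphen.
LABEL = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?")


def is_ldh_hostname(domain: str) -> bool:
    """True if every dot-separated label obeys the LDH rule."""
    if not domain or len(domain) > 255:
        return False
    return all(LABEL.fullmatch(label) for label in domain.split("."))


def is_ip_literal(domain: str) -> bool:
    """True for bracketed literals such as [192.168.2.1] or [IPv6:2001:db8::1]."""
    if not (domain.startswith("[") and domain.endswith("]")):
        return False
    inner = domain[1:-1]
    try:
        if inner[:5].lower() == "ipv6:":
            ipaddress.IPv6Address(inner[5:])
        else:
            ipaddress.IPv4Address(inner)
    except ValueError:
        return False
    return True
```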

Comments are allowed in the domain as well as in the local-part; for example, john.smith@(comment)example.com and john.smith@example.com(comment) are equivalent to john.smith@example.com.

RFC 2606 specifies that certain domains, for example those intended for documentation and testing, should not be resolvable and that as a result mail addressed to mailboxes in them and their subdomains should be non-deliverable. Of note for e-mail are example, invalid, example.com, example.net, and example.org.
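
Software that sends mail may wish to recognize these names before attempting delivery. The following sketch, with a hypothetical function name, covers only the five names listed above and their subdomains.

```python
RESERVED = ("example", "invalid", "example.com", "example.net", "example.org")


def is_reserved_domain(domain: str) -> bool:
    """True if `domain` is one of the reserved names or a subdomain of one."""
    name = domain.rstrip(".").lower()
    return any(name == r or name.endswith("." + r) for r in RESERVED)


print(is_reserved_domain("strange.example.com"))
print(is_reserved_domain("notexample.com"))
```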

Examples

Valid email addresses

  • simple@example.com
  • very.common@example.com
  • FirstName.LastName@EasierReading.org (case is always ignored after the @ and usually before)
  • x@example.com (one-letter local-part)
  • long.email-address-with-hyphens@and.subdomains.example.com
  • user.name+tag+sorting@example.com (may be routed to user.name@example.com inbox depending on mail server)
  • name/surname@example.com (slashes are a printable character, and allowed)
  • admin@example (local domain name with no TLD, although ICANN highly discourages dotless email addresses [29] )
  • example@s.example (see the List of Internet top-level domains)
  • " "@example.org (space between the quotes)
  • "john..doe"@example.org (quoted double dot)
  • mailhost!username@example.org (bangified host route used for uucp mailers)
  • "very.(),:;<>[]\".VERY.\"very@\\ \"very\".unusual"@strange.example.com (includes non-letter characters and multiple at signs, the first of which is inside the quoted string)
  • user%example.com@example.org (% escaped mail route to user@example.com via example.org)
  • user-@example.org (local-part ending with non-alphanumeric character from the list of allowed printable characters)
  • postmaster@[123.123.123.123] (IP addresses are allowed instead of domains when in square brackets, but strongly discouraged)
  • postmaster@[IPv6:2001:0db8:85a3:0000:0000:8a2e:0370:7334] (IPv6 uses a different syntax)
  • _test@[IPv6:2001:0db8:85a3:0000:0000:8a2e:0370:7334] (local-part beginning with an underscore; IPv6 uses a different syntax)

Valid email addresses with SMTPUTF8

  • I❤️CHOCOLATE🍫@example.com (emoji are only allowed with SMTPUTF8)

Invalid email addresses

  • abc.example.com (no @ character)
  • a@b@c@example.com (only one @ is allowed outside quotation marks)
  • a"b(c)d,e:f;g<h>i[j\k]l@example.com (none of the special characters in this local-part are allowed outside quotation marks)
  • just"not"right@example.com (quoted strings must be dot separated or be the only element making up the local-part)
  • this is"not\allowed@example.com (spaces, quotes, and backslashes may only exist when within quoted strings and preceded by a backslash)
  • this\ still\"not\\allowed@example.com (even if escaped (preceded by a backslash), spaces, quotes, and backslashes must still be contained by quotes)
  • 1234567890123456789012345678901234567890123456789012345678901234+x@example.com (local-part is longer than 64 characters)
  • i.like.underscores@but_they_are_not_allowed_in_this_part (underscore is not allowed in domain part)

Validation and verification

Email addresses are often requested as input to a website as a means of validating that a user exists. Other validation methods are available, such as cell phone number validation, postal mail validation, and fax validation.

An email address is generally recognized as having two parts joined with an at-sign (@), although the technical specifications detailed in RFC 822 and subsequent RFCs are more extensive. [30]

Syntactically correct, verified email addresses do not guarantee that an email box exists. Thus many mail servers use other techniques and check the mailbox existence against relevant systems such as the Domain Name System for the domain or using callback verification to check if the mailbox exists. Callback verification is an imperfect solution, as it may be disabled to avoid a directory harvest attack, or callbacks may be reported as spam and lead to listing on a DNSBL.

Several techniques may be used to validate a user's email address. [31]

Some companies offer services to validate an email address, often using an application programming interface, but there is no guarantee that it will provide accurate results.

Internationalization

The IETF conducts a technical and standards working group devoted to internationalization issues of email addresses, entitled Email Address Internationalization (EAI, also known as IMA, Internationalized Mail Address). [34] This group produced RFC 6530, RFC 6531, RFC 6532 and RFC 6533, and continues to work on additional EAI-related RFCs.

The IETF's EAI working group published RFC 6530 "Overview and Framework for Internationalized Email", which enabled non-ASCII characters to be used in both the local-part and the domain of an email address. RFC 6530 provides for email based on the UTF-8 encoding, which permits the full repertoire of Unicode. RFC 6531 provides a mechanism for SMTP servers to negotiate transmission of the SMTPUTF8 content.

The basic EAI concepts involve exchanging mail in UTF-8. Though the original proposal included a downgrading mechanism for legacy systems, this has now been dropped. [35] The local servers are responsible for the local-part of the address, whereas the domain would be restricted by the rules of internationalized domain names, though still transmitted in UTF-8. The mail server is also responsible for any mapping mechanism between the IMA form and any ASCII alias.

EAI enables users to have a localized address in a native language script or character set, as well as an ASCII form for communicating with legacy systems or for script-independent use. Applications that recognize internationalized domain names and mail addresses must have facilities to convert these representations.
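
Conversion of the domain between its Unicode and ASCII representations can be sketched with Python's built-in idna codec. Two assumptions apply: that codec implements the older IDNA 2003 rules, so its results may differ from newer IDNA processing for some names, and the function names are hypothetical. The local-part is left untouched, because any mapping between an internationalized local-part and an ASCII alias is the responsibility of the mail server.

```python
def to_ascii_domain(address: str) -> str:
    """Convert only the domain of `address` to its ASCII (Punycode) form."""
    local_part, _, domain = address.rpartition("@")
    ascii_domain = domain.encode("idna").decode("ascii")
    return f"{local_part}@{ascii_domain}"


def to_unicode_domain(address: str) -> str:
    """Convert an ASCII-encoded domain back to its Unicode presentation."""
    local_part, _, domain = address.rpartition("@")
    return f"{local_part}@{domain.encode('ascii').decode('idna')}"


ascii_form = to_ascii_domain("user@bücher.example")
print(ascii_form)
print(to_unicode_domain(ascii_form))
```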

Significant demand for such addresses is expected in China, Japan, Russia, and other markets that have large user bases in a non-Latin-based writing system.

For example, in addition to the .in top-level domain, the government of India in 2011 [36] got approval for ".bharat" (from Bhārat Gaṇarājya), written in seven different scripts [37] [38] for use by Gujarati, Marathi, Bengali, Tamil, Telugu, Punjabi and Urdu speakers. Indian company XgenPlus.com claims to be the world's first EAI mailbox provider, [39] and the Government of Rajasthan now supplies a free email account on domain राजस्थान.भारत for every citizen of the state. [40] A leading media house, Rajasthan Patrika, launched their IDN domain पत्रिका.भारत with contactable email.

The example addresses below would not be handled by RFC 5322 based servers, but are permitted by RFC 6530. Servers compliant with this will be able to handle these:

References

  1. J. Klensin (October 2008). "General Syntax Principles and Transaction Model". Simple Mail Transfer Protocol. p. 15. sec. 2.4. doi: 10.17487/RFC5321 . RFC 5321. The local-part of a mailbox MUST BE treated as case sensitive.
  2. J. Klensin (October 2008). "General Syntax Principles and Transaction Model". Simple Mail Transfer Protocol. p. 15. sec. 2.4. doi: 10.17487/RFC5321 . RFC 5321. However, exploiting the case sensitivity of mailbox local-parts impedes interoperability and is discouraged.
  3. "...you can add or remove the dots from a mail address without changing the actual destination address; and they'll all go to your inbox...", Google.com
  4. Klensin, J. (October 2008). "Size Limits and Minimums". Simple Mail Transfer Protocol. IETF. sec. 4.5.3.1. doi: 10.17487/RFC5321 . RFC 5321.
  5. "Address Specification". Internet Message Format. sec. 3.4. doi: 10.17487/RFC5322 . RFC 5322 . Retrieved March 14, 2023.
  6. "Spotting a Spoofing". cyber.nj.gov. November 19, 2020. Retrieved 17 April 2023.
  7. Klensin, J. (February 2004). RFC 3696. IETF. doi: 10.17487/RFC3696 . Retrieved 2017-08-01.
  8. Klensin, J. (October 2008). RFC 5321. IETF. sec. 4.5.3.1.1. doi: 10.17487/RFC5321 . Retrieved 2019-08-01.
  9. "Sign up for Windows Live" . Retrieved 2008-07-26. However, the phrase is hidden, thus one has to either check the availability of an invalid ID, e.g., me#1, or resort to alternative displaying, e.g., no-style or source view, in order to read it.
  10. "Characters in the local part of an email address" . Retrieved 2016-03-30.
  11. Are Email Addresses Case Sensitive? Archived 2016-06-03 at the Wayback Machine by Heinz Tschabitscher
  12. "Receiving someone else's mail". google.com.
  13. Murchison, K. (2008). Sieve Email Filtering: Subaddress Extension. IETF. doi: 10.17487/RFC5233 . RFC 5233 . Retrieved February 9, 2019.
  14. "Send emails from a different address or alias". Gmail Help. Retrieved 13 December 2023.
  15. "An Overview of the Andrew Message System" (PDF). Retrieved 17 April 2023.
  16. "Subaddressing/Plus Addressing" . Retrieved 1 January 2024.
  17. "Disposable addresses in Yahoo Mail". Yahoo Help.
  18. Rivera, Rafael (2013-09-17). "Outlook.com supports simpler "+" email aliases too". Within Windows. Archived from the original on 2014-02-20. Retrieved 2023-12-04.
  19. "Addresses and Aliases". proton.me.
  20. "Plus addressing and subdomain addressing". www.fastmail.com. Archived from the original on 2020-10-06. Retrieved 2020-10-06.
  21. "postale.io's FAQ on sub-addressing". postale.io. Archived from the original on 2020-10-06. Retrieved 2020-10-06.
  22. "Can I use myaddress+extension@pobox.com with my Pobox account?". helpspot.pobox.com. n.d. Archived from the original on 2020-10-03. Retrieved 2020-10-03. Pobox supports the use of "+anystring" (plus extensions) with any address.
  23. "MeMail". www.memail.com. Retrieved 2020-10-06.
  24. "Dot-Qmail, Control the delivery of mail messages". Archived from the original on 26 January 2012. Retrieved 27 January 2012.
  25. Sill, Dave. "4.1.5. extension addresses". Life with qmail. Retrieved 27 January 2012.
  26. "Postfix Configuration Parameters". postfix.org.
  27. "Exim Configuration Parameters, "local_part_suffix"". exim.org.
  28. Gina Trapani (2005) "Instant disposable Gmail addresses"
  29. "New gTLD Dotless Domain Names Prohibited". www.icann.org. ICANN. Retrieved 23 March 2020.
  30. "How Domino formats the sender's Internet address in outbound messages". IBM Knowledge Center. Retrieved 23 July 2019.
  31. "M3AAWG Sender Best Common Practices, Version 3" (PDF). Messaging, Malware and Mobile Anti-Abuse Working Group. February 2015. Retrieved 23 July 2019.
  32. Verification & Validation Techniques for Email Address Quality Assurance by Jan Hornych 2011, University of Oxford
  33. "4.10 Forms — HTML5". w3.org.
  34. "Eai Status Pages". Email Address Internationalization (Active WG). IETF. March 17, 2006 – March 18, 2013. Retrieved July 26, 2008.
  35. "Email Address Internationalization (eai)". IETF. Retrieved November 30, 2010.
  36. "2011-01-25 - Approval of Delegation of the seven top-level domains representing India in various languages". features.icann.org.
  37. "Internationalized Domain Names (IDNs) | Registry.In". registry.in. Retrieved 2016-10-17.
  38. "Now, get your email address in Hindi". The Economic Times. Retrieved 2016-10-17.
  39. "Universal Acceptance in India". 15 February 2017.
  40. "देश में पहला, प्रदेश के हर नागरिक के लिए मुफ्त ई-वॉल्ट और ई-मेल की सुविधा शुरू - वसुन्धरा राजे". वसुन्धरा राजे (in Hindi). 2017-08-18. Retrieved 2017-08-20.
