ThaiURL

ThaiURL (Thai Uniform Resource Locator) is a technology enabling the use of Thai domain names in applications that have been modified to support this technology. It is one of several such systems that were marketed before the advent of IDNA.

Traditionally, the Domain Name System (DNS) does not allow domain names with Thai characters. The only characters allowed in DNS names, as specified in RFC 1034 “Domain names - concepts and facilities” and RFC 1035 “Domain names - implementation and specification”, are:

  1. Letter: “a” through “z” (case insensitive)
  2. Digit: “0” through “9”
  3. Hyphen (-)
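
The character rule above can be expressed as a short check. The following Python sketch is illustrative only: the function name `is_ldh_label` is an assumption, and it tests just the character set listed here, not the additional length and placement rules in RFC 1035.

```python
import re

# Letters, digits, and hyphen only (the set listed above).
_LDH_PATTERN = re.compile(r"[A-Za-z0-9-]+")


def is_ldh_label(label: str) -> bool:
    """Return True if every character of the label is a letter, digit, or hyphen."""
    return _LDH_PATTERN.fullmatch(label) is not None


print(is_ldh_label("byfdosbniqlse"))       # an ASCII-encoded label
print(is_ldh_label("\u0e04\u0e2d\u0e21"))  # the Thai label from the example below
```

A plain ASCII label passes the check, while a label written in Thai script does not, which is why an ASCII-compatible encoding is needed.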

The ThaiURL domain naming standard is based on Thai characters and symbols as specified in TIS 620-2533: Standard for Thai Character Codes for Computers. Since these are non-ASCII characters, Row-based ASCII Compatible Encoding (RACE) is used. [1] The encoding process is as follows:

  1. Begin with a Thai domain name as input:
    ชื่อไทย.คอม
  2. Convert the Thai characters into their Unicode code points in hexadecimal:
    0e0a 0e37 0e48 0e2d 0e44 0e17 0e22 . 0e04 0e2d 0e21 (spaces are added here to show individual code points)
    0e0a0e370e480e2d0e440e170e22.0e040e2d0e21 (actual hex string)
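  3. Write the high byte shared by all code points in the label (0e) once, followed by the low byte of each code point, and convert the result to binary:
    0000 1110 0000 1010 0011 0111 0100 1000 0010 1101 0100 0100 0001 0111 0010 0010 . 0000 1110 0000 0100 0010 1101 0010 0001 (spaces added to show individual hex characters)
  4. Perform a Base32 conversion: regroup the bits into groups of five, pad the last group with zeros, and map each group to a character (“a” through “z” for 0 to 25, “2” through “7” for 26 to 31):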
    00001 11000 00101 00011 01110 10010 00001 01101 01000 10000 01011 10010 00100 . 00001 11000 00010 00010 11010 01000 01000 (binary representation)
    byfdosbniqlse.bycc2ii (ASCII representation)
  5. Append TLD:
    byfdosbniqlse.bycc2ii.net
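
The steps above can be sketched in Python as follows. This is a minimal illustration reconstructed from the worked example, not ThaiURL's actual implementation: the function names are hypothetical, the default TLD simply follows the example, and the code assumes every character in a label shares the same high byte, as is the case for the Thai block.

```python
# Base32 alphabet used in the worked example: a-z for 0-25, 2-7 for 26-31.
ALPHABET = "abcdefghijklmnopqrstuvwxyz234567"


def encode_label(label: str) -> str:
    """Encode one label: shared high byte, then low bytes, then Base32."""
    code_points = [ord(ch) for ch in label]
    high_bytes = {cp >> 8 for cp in code_points}
    # Assumption: a single 16-bit row per label (true for Thai, U+0E00-U+0E7F).
    if len(high_bytes) != 1 or max(code_points) > 0xFFFF:
        raise ValueError("label must be non-empty and use a single 16-bit row")
    octets = [high_bytes.pop()] + [cp & 0xFF for cp in code_points]
    bits = "".join(f"{octet:08b}" for octet in octets)
    bits += "0" * (-len(bits) % 5)  # zero-pad to a multiple of 5 bits
    return "".join(ALPHABET[int(bits[i:i + 5], 2)] for i in range(0, len(bits), 5))


def decode_label(encoded: str) -> str:
    """Reverse encode_label: Base32 to octets, then rebuild the code points."""
    bits = "".join(f"{ALPHABET.index(ch):05b}" for ch in encoded.lower())
    usable = len(bits) - len(bits) % 8  # drop the padding bits
    octets = [int(bits[i:i + 8], 2) for i in range(0, usable, 8)]
    high, lows = octets[0], octets[1:]
    return "".join(chr((high << 8) | low) for low in lows)


def encode_name(name: str, tld: str = "net") -> str:
    """Encode each dot-separated label and append the TLD (step 5)."""
    return ".".join(encode_label(part) for part in name.split(".")) + "." + tld


# The Thai name from step 1, written with escapes to make the code points explicit.
thai_name = "\u0e0a\u0e37\u0e48\u0e2d\u0e44\u0e17\u0e22.\u0e04\u0e2d\u0e21"
print(encode_name(thai_name))  # byfdosbniqlse.bycc2ii.net
```

Writing the shared high byte only once is what keeps the encoded label short: each Thai character then costs eight bits rather than sixteen before the Base32 step.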

This encoding is not a national standard but a system used by the domain name registrar ThaiURL.com. It is one of many localized naming schemes that predate the standardisation of internationalized domain names (IDNA); at present the two systems appear to coexist. The ccTLD name registrar for .th, thnic.net (archived 2012-06-29 at the Wayback Machine), supports IDNA, whereas ThaiURL registers .com names.

However, because this is not an ICANN-sanctioned IDN encoding method, support is limited. Most browsers still default to Punycode when encoding Thai domain names, so the only way to reach a ThaiURL-registered domain is to type in or link to its ASCII-encoded name.
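
For contrast, the sketch below shows the form an IDNA-aware application would derive from the same Thai label. It uses Python's built-in `idna` codec, which implements the older IDNA 2003 rules; the variable names are illustrative assumptions.

```python
# The Thai label from the encoding example, written with explicit escapes.
thai_label = "\u0e0a\u0e37\u0e48\u0e2d\u0e44\u0e17\u0e22"

# Python's standard "idna" codec applies nameprep and Punycode, then adds "xn--".
ace_label = thai_label.encode("idna").decode("ascii")

print(ace_label)                                   # Punycode form with the xn-- prefix
print(ace_label.encode("ascii").decode("idna"))    # decodes back to the Thai label
```

The resulting `xn--` label differs from the ThaiURL form `byfdosbniqlse`, so a browser that converts the Thai name automatically looks up a different DNS name from the one ThaiURL registered.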

References

  1. "ThaiURL - How does ThaiURL Work?".