Snowflake ID

Other names: Twitter Snowflake, X Snowflake
Figure: Components of a snowflake identifier in binary

Snowflake IDs, or snowflakes, are a form of unique identifier used in distributed computing. The format was created by Twitter (now X) and is used for the IDs of tweets. [1] The name comes from the popular belief that every snowflake has a unique structure. The format has been adopted by other companies, including Discord and Instagram. The Mastodon social network uses a modified version.

Format

Snowflakes are 64-bit integers. (Only 63 bits are used, so that the value fits in a signed integer.) The first 41 bits are a timestamp, representing milliseconds since the chosen epoch. The next 10 bits represent a machine ID, which prevents clashes between IDs generated on different machines in the same millisecond. The final 12 bits represent a per-machine sequence number, which allows multiple snowflakes to be created on one machine in the same millisecond. The resulting number is generally serialized in decimal. [2]
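The layout described above can be sketched in Python as follows. This is a minimal, single-threaded illustration and not Twitter's implementation: the names `pack_snowflake` and `SnowflakeGenerator`, the injectable `clock` parameter, and the policy of waiting for the next millisecond when the sequence number is exhausted are all assumptions made for the example.

```python
import time

TIMESTAMP_BITS = 41
MACHINE_ID_BITS = 10
SEQUENCE_BITS = 12

MAX_MACHINE_ID = (1 << MACHINE_ID_BITS) - 1  # 1023
MAX_SEQUENCE = (1 << SEQUENCE_BITS) - 1      # 4095


def pack_snowflake(timestamp_ms: int, machine_id: int, sequence: int) -> int:
    """Combine the three fields into one integer, timestamp in the high bits."""
    if not 0 <= timestamp_ms < (1 << TIMESTAMP_BITS):
        raise ValueError("timestamp does not fit in 41 bits")
    if not 0 <= machine_id <= MAX_MACHINE_ID:
        raise ValueError("machine ID does not fit in 10 bits")
    if not 0 <= sequence <= MAX_SEQUENCE:
        raise ValueError("sequence number does not fit in 12 bits")
    return (
        (timestamp_ms << (MACHINE_ID_BITS + SEQUENCE_BITS))
        | (machine_id << SEQUENCE_BITS)
        | sequence
    )


class SnowflakeGenerator:
    """Hypothetical single-threaded generator for one machine."""

    def __init__(self, machine_id: int, epoch_ms: int, clock=None):
        self.machine_id = machine_id
        self.epoch_ms = epoch_ms
        # `clock` returns the current Unix time in milliseconds; it can be
        # replaced with a fake clock in tests.
        self._clock = clock or (lambda: time.time_ns() // 1_000_000)
        self._last_ms = -1
        self._sequence = 0

    def next_id(self) -> int:
        now = self._clock()
        if now < self._last_ms:
            raise RuntimeError("clock moved backwards")
        if now == self._last_ms:
            # Same millisecond as the previous ID: bump the sequence number.
            self._sequence = (self._sequence + 1) & MAX_SEQUENCE
            if self._sequence == 0:
                # All 4096 values used; wait until the clock advances.
                while now <= self._last_ms:
                    now = self._clock()
        else:
            self._sequence = 0
        self._last_ms = now
        return pack_snowflake(now - self.epoch_ms, self.machine_id, self._sequence)
```

Two IDs requested from the same generator within one millisecond differ only in the sequence field, so the second is exactly one greater than the first.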

Snowflakes are sortable by time, because they are based on the time they were created. [2] Additionally, the time a snowflake was created can be calculated from the snowflake. This can be used to get snowflakes (and their associated objects) that were created before or after a particular date. [3]
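As a rough sketch of both properties, the creation time can be recovered by shifting away the low 22 bits, and a date can be turned into the smallest snowflake that could have been created at that moment. The helper names `created_at` and `lower_bound` and the value of `EXAMPLE_EPOCH_MS` are hypothetical; each service chooses its own epoch.

```python
from datetime import datetime, timedelta, timezone

SHIFT = 22  # 10 machine-ID bits + 12 sequence bits sit below the timestamp
UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
EXAMPLE_EPOCH_MS = 1_500_000_000_000  # hypothetical service epoch, Unix ms


def created_at(snowflake: int, epoch_ms: int) -> datetime:
    """Return the creation time encoded in a snowflake."""
    unix_ms = (snowflake >> SHIFT) + epoch_ms
    return UNIX_EPOCH + timedelta(milliseconds=unix_ms)


def lower_bound(moment: datetime, epoch_ms: int) -> int:
    """Smallest snowflake that could have been created at `moment`.

    `moment` must be timezone-aware. The machine ID and sequence bits
    are left at zero.
    """
    unix_ms = (moment - UNIX_EPOCH) // timedelta(milliseconds=1)
    return (unix_ms - epoch_ms) << SHIFT


cutoff = lower_bound(datetime(2021, 1, 1, tzinfo=timezone.utc), EXAMPLE_EPOCH_MS)
# Any snowflake >= cutoff was created at or after 2021-01-01 00:00 UTC.
```

Because the timestamp occupies the most significant bits, comparing two snowflakes as plain integers orders them by creation time, with ties in the same millisecond broken by machine ID and sequence number.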

Fixed header format

Octet offset   Bit offset   Fields (bits 0 to 31, left to right)
0              0            Unused sign bit, always 0 (1 bit) | Timestamp, first 31 bits
4              32           Timestamp, last 10 bits | Machine ID (10 bits) | Machine sequence number (12 bits)

Example

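A tweet produced by @Wikipedia in June 2022 [4] has the snowflake ID 1541815603606036480. The number may be converted to 64-bit binary as 0001 0101 0110 0101 1010 0001 0001 1111 0110 0010 00|01 0111 1010|0000 0000 0000, with pipe symbols denoting the three parts of the ID: the unused sign bit followed by the 41-bit timestamp, the 10-bit machine ID, and the 12-bit sequence number. The timestamp field decodes to 367597485448 milliseconds after the epoch, the machine ID to 378, and the sequence number to 0.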
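The decoding of this example can be checked with a few shifts and masks. The snippet below is an illustrative sketch; the constant `TWITTER_EPOCH_MS` (Twitter's epoch expressed in Unix milliseconds) is not stated in the text above and is supplied here as an assumption.

```python
from datetime import datetime, timedelta, timezone

# Assumption: Twitter's snowflake epoch in Unix milliseconds.
TWITTER_EPOCH_MS = 1288834974657

snowflake = 1541815603606036480

sequence = snowflake & 0xFFF             # low 12 bits
machine_id = (snowflake >> 12) & 0x3FF   # next 10 bits
timestamp = snowflake >> 22              # remaining high bits

created = datetime(1970, 1, 1, tzinfo=timezone.utc) + timedelta(
    milliseconds=timestamp + TWITTER_EPOCH_MS
)

bits = f"{snowflake:064b}"
print(bits[:42], bits[42:52], bits[52:], sep="|")
print(timestamp, machine_id, sequence)   # 367597485448 378 0
print(created.isoformat())               # 2022-06-28T16:07:40.105000+00:00
```

A sequence number of 0 indicates that this was the first ID generated by that machine in that millisecond.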

Usage

The format was first announced by Twitter in June 2010. [1] Due to implementation challenges, Twitter waited until later that year to roll out the update. [6]

References

  1. King, Ryan (June 1, 2010). "Announcing Snowflake". blog.twitter.com. Twitter. Retrieved January 18, 2021.
  2. "twitter-archive/snowflake at b3f6a3c6ca". GitHub. October 1, 2012. Retrieved January 18, 2021.
  3. "API Reference". Discord Developer Portal. Discord. Retrieved January 18, 2021.
  4. @Wikipedia (June 28, 2022). "53 years ago today, members of the LGBTQI+ community began protesting in New York City in response to a police raid of the Stonewall Inn, a popular gay bar. The riots were a transformative event in the 20th century fight for LGBTQI+ rights in the US. (1/2)" (Tweet) via Twitter.
  5. "2019-08-03: TweetedAt: Finding Tweet Timestamps for Pre and Post Snowflake Tweet IDs". 2019-08-03.
  6. Siegler, MG (October 12, 2010). "Tweet IDs About To Get Jumbled In A Blizzard As Snowflake Is Set To Roll Live". TechCrunch. Retrieved January 18, 2021.
  7. "Twitter IDs". Twitter Developer. Twitter. Retrieved January 20, 2021.
  8. "Sharding & IDs at Instagram". Instagram Engineering. May 2, 2016. Retrieved January 18, 2021.
  9. "mastodon/mastodon source code". Mastodon. November 11, 2022. Retrieved November 11, 2022.