Identifier

Last updated
Registration plates are used to display identifiers for motor vehicles. 4373 Russian license plate Toyota car.jpg
Registration plates are used to display identifiers for motor vehicles.

An identifier is a name that identifies (that is, labels the identity of) either a unique object or a unique class of objects, where the "object" or class may be an idea, person, physical countable object (or class thereof), or physical noncountable substance (or class thereof). The abbreviation ID often refers to identity, identification (the process of identifying), or an identifier (that is, an instance of identification). An identifier may be a word, number, letter, symbol, or any combination of those.

Contents

The words, numbers, letters, or symbols may follow an encoding system (wherein letters, digits, words, or symbols stand for [represent] ideas or longer names) or they may simply be arbitrary. When an identifier follows an encoding system, it is often referred to as a code or id code. For instance the ISO/IEC 11179 metadata registry standard defines a code as system of valid symbols that substitute for longer values in contrast to identifiers without symbolic meaning. Identifiers that do not follow any encoding scheme are often said to be arbitrary Ids; they are arbitrarily assigned and have no greater meaning. (Sometimes identifiers are called "codes" even when they are actually arbitrary, whether because the speaker believes that they have deeper meaning or simply because they are speaking casually and imprecisely.)

The unique identifier (UID) is an identifier that refers to only one instance—only one particular object in the universe. A part number is an identifier, but it is not a unique identifier—for that, a serial number is needed, to identify each instance of the part design. Thus the identifier "Model T" identifies the class(model) of automobiles that Ford's Model T comprises; whereas the unique identifier "Model T Serial Number 159,862" identifies one specific member of that class—that is, one particular Model T car, owned by one specific person.

The concepts of name and identifier are denotatively equal, and the terms are thus denotatively synonymous; but they are not always connotatively synonymous, because code names and Id numbers are often connotatively distinguished from names in the sense of traditional natural language naming. For example, both "Jamie Zawinski" and "Netscape employee number 20" are identifiers for the same specific human being; but normal English-language connotation may consider "Jamie Zawinski" a "name" and not an "identifier", whereas it considers "Netscape employee number 20" an "identifier" but not a "name." This is an emic indistinction rather than an etic one.

Metadata

In metadata, an identifier is a language-independent label, sign or token that uniquely identifies an object within an identification scheme. The suffix "identifier" is also used as a representation term when naming a data element.

ID codes may inherently carry metadata along with them. For example, when you know that the food package in front of you has the identifier "2011-09-25T15:42Z-MFR5-P02-243-45", you not only have that data, you also have the metadata that tells you that it was packaged on September 25, 2011, at 3:42pm UTC, manufactured by Licensed Vendor Number 5, at the Peoria, IL, USA plant, in Building 2, and was the 243rd package off the line in that shift, and was inspected by Inspector Number 45.

Arbitrary identifiers might lack metadata. For example, if a food package just says 100054678214, its ID may not tell anything except identity—no date, manufacturer name, production sequence rank, or inspector number. In some cases, arbitrary identifiers such as sequential serial numbers leak information (i.e. the German tank problem). Opaque identifiers—identifiers designed to avoid leaking even that small amount of information—include "really opaque pointers" and Version 4 UUIDs.

In computer science

In computer science, identifiers (IDs) are lexical tokens that name entities. Identifiers are used extensively in virtually all information processing systems. Identifying entities makes it possible to refer to them, which is essential for any kind of symbolic processing.

In computer languages

In computer languages, identifiers are tokens (also called symbols) which name language entities. Some of the kinds of entities an identifier might denote include variables, types, labels, subroutines, and packages.

Ambiguity

Identifiers (IDs) versus Unique identifiers (UIDs)

A resource may carry multiple identifiers. Typical examples are:

The inverse is also possible, where multiple resources are represented with the same identifier (discussed below).

Implicit context and namespace conflicts

Many codes and nomenclatural systems originate within a small namespace. Over the years, some of them bleed into larger namespaces (as people interact in ways they formerly had not, e.g., cross-border trade, scientific collaboration, military alliance, and general cultural interconnection or assimilation). When such dissemination happens, the limitations of the original naming convention, which had formerly been latent and moot, become painfully apparent, often necessitating retronymy, synonymity, translation/transcoding, and so on. Such limitations generally accompany the shift away from the original context to the broader one. Typically the system shows implicit context (context was formerly assumed, and narrow), lack of capacity (e.g., low number of possible IDs, reflecting the outmoded narrow context), lack of extensibility (no features defined and reserved against future needs), and lack of specificity and disambiguating capability (related to the context shift, where longstanding uniqueness encounters novel nonuniqueness). Within computer science, this problem is called naming collision. The story of the origination and expansion of the CODEN system provides a good case example in a recent-decades, technical-nomenclature context. The capitalization variations seen with specific designators reveals an instance of this problem occurring in natural languages, where the proper noun/common noun distinction (and its complications) must be dealt with. A universe in which every object had a UID would not need any namespaces, which is to say that it would constitute one gigantic namespace; but human minds could never keep track of, or semantically interrelate, so many UIDs.

Identifiers in various disciplines

IdentifierScope
atomic number, corresponding one-to-one with element name international (via ISV)
Australian Business Number Australian
CAGE code U.S. and NATO
CAS registry number originated in U.S.; today international (via ISV)
CODEN originated in U.S.; today international
Digital object identifier (DOI, doi) Handle System Namespace, international scope
DIN standard numberoriginated in Germany; today international
E number originated in E.U.; may be seen internationally
EC number
Employer Identification Number (EIN)U.S.
Electronic Identifier Serial Publicaction (EISP)international
Global Trade Item Number international
Group identifier many scopes, e.g., specific computer systems
International Chemical Identifier international
International Standard Book Number (ISBN)ISBN is part of the EAN Namespace; international scope
International eBook Identifier Number (IEIN)international
International Standard Serial Number (ISSN)international
ISO standard number, e.g., ISO 8601 international
Library of Congress Control Number U.S., with some international bibliographic usefulness
Personal identification number (Denmark) Denmark
Pharmaceutical code Many different systems
Product batch number
Serial Item and Contribution Identifier U.S., with some international bibliographic usefulness
Serial number many scopes, e.g., company-specific, government-specific
Service batch number
Social Security Number U.S.
Tax file number Australian
Unique Article Identifier (UAI)international

See also

Related Research Articles

<span class="mw-page-title-main">Dublin Core</span> Standardized set of metadata elements

The Dublin Core vocabulary, also known as the Dublin Core Metadata Terms (DCMT), is a general purpose metadata vocabulary for describing resources of any type. It was first developed for describing web content in the early days of the World Wide Web. The Dublin Core Metadata Initiative (DCMI) is responsible for maintaining the Dublin Core vocabulary.

In computing, a namespace is a set of signs (names) that are used to identify and refer to objects of various kinds. A namespace ensures that all of a given set of objects have unique names so that they can be easily identified.

<span class="mw-page-title-main">ISSN</span> Serial number used to identify a periodical publication

An International Standard Serial Number (ISSN) is an eight-digit serial number used to uniquely identify a serial publication (periodical), such as a magazine. The ISSN is especially helpful in distinguishing between serials with the same title. ISSNs are used in ordering, cataloging, interlibrary loans, and other practices in connection with serial literature.

<span class="mw-page-title-main">Digital object identifier</span> ISO standard unique string identifier for a digital object

A digital object identifier (DOI) is a persistent identifier or handle used to uniquely identify various objects, standardized by the International Organization for Standardization (ISO). DOIs are an implementation of the Handle System; they also fit within the URI system. They are widely used to identify academic, professional, and government information, such as journal articles, research reports, data sets, and official publications.

<span class="mw-page-title-main">Electronic Product Code</span> Universal identifier for physical object

The Electronic Product Code (EPC) is designed as a universal identifier that provides a unique identity for every physical object anywhere in the world, for all time. The EPC structure is defined in the EPCglobal Tag Data Standard, which is a freely available standard. The canonical representation of an EPC is a URI, namely the 'pure-identity URI' representation that is intended for use when referring to a specific physical object in communications about EPCs among information systems and business application software.

In compiler construction, name mangling is a technique used to solve various problems caused by the need to resolve unique names for programming entities in many modern programming languages.

Unix-like operating systems identify a user by a value called a user identifier, often abbreviated to user ID or UID. The UID, along with the group identifier (GID) and other access control criteria, is used to determine which system resources a user can access. The password file maps textual user names to UIDs. UIDs are stored in the inodes of the Unix file system, running processes, tar archives, and the now-obsolete Network Information Service. In POSIX-compliant environments, the shell command id gives the current user's UID, as well as more information such as the user name, primary user group and group identifier (GID).

NIEMOpen, frequently referred to as NIEM, originated as an XML-based information exchange framework from the United States, but has transitioned to an OASISOpen Project. This initiative formalizes NIEM's designation as an official standard in national and international policy and procurement. NIEMOpen's Project Governing Board recently approved the first standard under this new project; the Conformance Targets Attribute Specification (CTAS) Version 3.0. A full collection of NIEMOpen standards are anticipated by end of year 2024.

A unique identifier (UID) is an identifier that is guaranteed to be unique among all identifiers used for those objects and for a specific purpose. The concept was formalized early in the development of computer science and information systems. In general, it was associated with an atomic data type.

Digital Item is the basic unit of transaction in the MPEG-21 framework. It is a structured digital object, including a standard representation, identification and metadata.

<span class="mw-page-title-main">Object Manager</span>

Object Manager is a subsystem implemented as part of the Windows Executive which manages Windows resources. Resources, which are surfaced as logical objects, each reside in a namespace for categorization. Resources can be physical devices, files or folders on volumes, Registry entries or even running processes. All objects representing resources have an Object Type property and other metadata about the resource. Object Manager is a shared resource, and all subsystems that deal with the resources have to pass through the Object Manager.

The Binary Universal Form for the Representation of meteorological data (BUFR) is a binary data format maintained by the World Meteorological Organization (WMO). The latest version is BUFR Edition 4. BUFR Edition 3 is also considered current for operational use. BUFR was created in 1988 with the goal of replacing the WMO's dozens of character-based, position-driven meteorological codes, such as SYNOP, TEMP and CLIMAT. BUFR was designed to be portable, compact, and universal. Any kind of data can be represented, along with its specific spatial/temporal context and any other associated metadata. In the WMO terminology, BUFR belongs to the category of table-driven code forms, where the meaning of data elements is determined by referring to a set of tables that are kept and maintained separately from the message itself.

A file format is a standard way that information is encoded for storage in a computer file. It specifies how bits are used to encode information in a digital storage medium. File formats may be either proprietary or free.

MIL-STD-130, "Identification Marking of U.S. Military Property," is a specification that describes markings required on items sold to the Department of Defense (DoD), including the addition, in about 2005, of UII Data Matrix machine-readable information (MRI) requirements. MIL-STD-130 describes the materials allowed, minimum text size and fonts, format, syntax and rules for identifying marks on a part, where to locate this marking plus exceptions and unique situations, such as vehicle identification numbers, cell phone IDs, etc. Other non-identifying markings—such as "this end up"—are covered under MIL-STD-129.

Unique Identification Marking, UID marking, Item Unique Identification or IUID, is a part of the compliance process mandated by the United States Department of Defense. It is a permanent marking method used to give equipment a unique ID. Marking is essential for all equipment with an acquisition cost of over $5,000, equipment which is mission essential, controlled inventory, or serially-controlled. UID-marking is a set of data for assets that is globally unique and unambiguous. The technology used to mark an item is 2D Data Matrix ECC 200 Symbol. UID marking can be used to ensure data integrity and data quality throughout an item's lifecycle; it also supports multi-faceted business applications.

Identity correlation is, in information systems, a process that reconciles and validates the proper ownership of disparate user account login IDs that reside on systems and applications throughout an organization and can permanently link ownership of those user account login IDs to particular individuals by assigning a unique identifier to all validated account login IDs.

The Handle System is a proprietary registry assigning persistent identifiers, or handles, to information resources, and for resolving "those handles into the information necessary to locate, access, and otherwise make use of the resources". As with handles used elsewhere in computing, Handle System handles are opaque, and encode no information about the underlying resource, being bound only to metadata regarding the resource. Consequently, the handles are not rendered invalid by changes to the metadata.

The Entertainment Identifier Registry, or EIDR, is a global unique identifier system for a broad array of audiovisual objects, including motion pictures, television, and radio programs. The identification system resolves an identifier to a metadata record that is associated with top-level titles, edits, DVDs, encodings, clips, and mashups. EIDR also provides identifiers for video service providers, such as broadcast and cable networks.

Lex is a URN namespace, a type of Uniform Resource Name (URN), that allows accurate identification of laws and other legal norms.

Namespaces are a feature of the Linux kernel that partition kernel resources such that one set of processes sees one set of resources, while another set of processes sees a different set of resources. The feature works by having the same namespace for a set of resources and processes, but those namespaces refer to distinct resources. Resources may exist in multiple namespaces. Examples of such resources are process IDs, host-names, user IDs, file names, some names associated with network access, and Inter-process communication.

References

  1. University of Glasgow. "Procedure for Applying Identifiers to Documents". Archived from the original on 5 June 2011. Retrieved 28 April 2009.
  2. University of Pennsylvania. "Information on Chemical Nomenclature". Archived from the original on 4 January 2009. Retrieved 28 April 2009.