Developer(s) | ISO/IEC, Ecma International |
---|---|
Variants | ISO 13490 |
Introduced | 1988 |
Limits | |
Max volume size | 8 TB |
Other | |
Supported operating systems | Cross platform |
ISO 9660 (also known as ECMA-119) is a file system for optical disc media. The file system is an international standard available from the International Organization for Standardization (ISO). Since the specification is available for anybody to purchase, [1] implementations have been written for many operating systems.
ISO 9660 traces its roots to the High Sierra Format, [2] which arranged file information in a dense, sequential layout to minimize nonsequential access, using a hierarchical tree file system (up to eight directory levels deep) similar to UNIX and FAT. To facilitate cross-platform compatibility, it defined a minimal set of common file attributes (directory or ordinary file, and time of recording) and name attributes (name, extension, and version), and used a separate system use area where future optional extensions for each file may be specified. High Sierra was adopted in December 1986 (with changes) as an international standard by Ecma International as ECMA-119 [3] and submitted for fast tracking to the ISO, where it was eventually accepted as ISO 9660:1988. [4] Subsequent amendments to the standard were published in 2013 and 2020.
The first 16 sectors of the file system are empty and reserved for other uses. The rest begins with a volume descriptor set (a header block which describes the subsequent layout) and then the path tables, directories and files on the disc. An ISO 9660 compliant disc must contain at least one primary volume descriptor describing the file system and a volume descriptor set terminator which is a volume descriptor that marks the end of the descriptor set. The primary volume descriptor provides information about the volume, characteristics and metadata, including a root directory record that indicates in which sector the root directory is located. Other fields contain metadata such as the volume's name and creator, along with the size and number of logical blocks used by the file system. Path tables summarize the directory structure of the relevant directory hierarchy. For each directory in the image, the path table provides the directory identifier, the location of the extent in which the directory is recorded, the length of any extended attributes associated with the directory, and the index of its parent directory path table entry.
There are several extensions to ISO 9660 that relax some of its limitations. Notable examples include Rock Ridge (Unix-style permissions and longer names), Joliet (Unicode, allowing non-Latin scripts to be used), El Torito (enables CDs to be bootable) and the Apple ISO 9660 Extensions (file characteristics specific to the classic Mac OS and macOS, such as resource forks, file backup date and more).
Compact discs were originally developed for recording musical data, but soon were used for storing additional digital data types because they were equally effective for archival mass data storage. The lowest-level format for this type of compact disc, called CD-ROM, was defined in the Yellow Book specification in 1983. However, this book did not define any format for organizing data on CD-ROMs into logical units such as files, which led to every CD-ROM maker creating its own format. In order to develop a CD-ROM file system standard (Z39.60 - Volume and File Structure of CDROM for Information Interchange), the National Information Standards Organization (NISO) set up Standards Committee SC EE (Compact Disc Data Format) in July 1985. [5] In September/October 1985, [6] several companies invited experts to participate in the development of a working paper for such a standard.
In November 1985, representatives of computer hardware manufacturers gathered at the High Sierra Hotel and Casino (currently called the Golden Nugget Lake Tahoe) in Stateline, Nevada. [7] This group became known as the High Sierra Group (HSG). Present at the meeting were representatives from Apple Computer, AT&T, [citation needed] Digital Equipment Corporation (DEC), Hitachi, LaserData, Microware, [citation needed] Microsoft, 3M, Philips, Reference Technology Inc., Sony Corporation, TMS Inc., VideoTools (later Meridian [8]), Xebec, and Yelick. [citation needed] The meeting report evolved from the Yellow Book CD-ROM standard, which was so open-ended that it was leading to diversification and the creation of many incompatible data storage methods. The High Sierra Group Proposal (HSGP) was released in May 1986, defining a file system for CD-ROMs commonly known as the High Sierra Format.
A draft version of this proposal was submitted to the European Computer Manufacturers Association (ECMA) for standardization. With some changes, this led to the issue of the initial edition of the ECMA-119 standard in December 1986. [9] The ECMA submitted its standard to the International Organization for Standardization (ISO) for fast tracking, where it was further refined into the ISO 9660 standard. For compatibility, the second edition of ECMA-119 was revised to be equivalent to ISO 9660 in December 1987. [10] [11] [12] ISO 9660:1988 was published in 1988. The main changes from the High Sierra Format in the ECMA-119 and ISO 9660 standards were international extensions to allow the format to work better in non-US markets.
In order not to create incompatibilities, NISO suspended further work on Z39.60, which had been adopted by NISO members on 28 May 1987. It was withdrawn before final approval, in favour of ISO 9660. [5]
JIS X 0606:1998 was passed in Japan in 1998 with much-relaxed file name rules using a new "enhanced volume descriptor" data structure. The standard was submitted for ISO 9660:1999 and supposedly fast-tracked, but nothing came out of it. [13] Nevertheless, several operating systems and disc authoring tools (such as Nero Burning ROM, mkisofs and ImgBurn) now support the addition, under such names as "ISO 9660:1999", "ISO 9660 v2", or "ISO 9660 Level 4". In 2013, the proposal was finally formalized in the form of ISO 9660/Amendment 1, intended to "bring harmonization between ISO 9660 and widely used 'Joliet Specification'." [14] In December 2017, a 3rd Edition of ECMA-119 was published that is technically identical with ISO 9660, Amendment 1. [15]
In 2019, ECMA published a 4th edition of ECMA-119, integrating the Joliet text as "Annex C".
In 2020, ISO published Amendment 2, which adds some minor clarifying matter, but does not add or correct any technical information of the standard. [16]
The following is the rough overall structure of the ISO 9660 file system.
Multi-byte values can be stored in three different formats: little-endian, big-endian, and in a concatenation of both types in what the specification calls "both-byte" order. Both-byte order is required in several fields in the volume descriptors and directory records, while path tables can be either little-endian or big-endian. [17]
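As a minimal sketch of the both-byte layout described above, the following Python helpers store a 32-bit value as its little-endian bytes followed by its big-endian bytes, and check that the two halves agree when reading it back. The function names are hypothetical and chosen only for this illustration.

```python
import struct

def encode_both_endian_32(value: int) -> bytes:
    """Encode an unsigned 32-bit integer as little-endian followed by big-endian (8 bytes)."""
    return struct.pack("<I", value) + struct.pack(">I", value)

def decode_both_endian_32(raw: bytes) -> int:
    """Decode an 8-byte both-byte field, verifying that both halves hold the same value."""
    little, = struct.unpack("<I", raw[:4])
    big, = struct.unpack(">I", raw[4:8])
    if little != big:
        raise ValueError("little-endian and big-endian halves disagree")
    return little
```

A reader on either kind of machine can pick the half that matches its native byte order; the consistency check here is optional and mainly useful for detecting corrupt images.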
Area | Contents |
---|---|
System area (32,768 B) | Unused by ISO 9660 |
Data area | Volume descriptor set |
Data area | Path tables, directories and files |
The system area, the first 32,768 data bytes of the disc (16 sectors of 2,048 bytes each), is unused by ISO 9660 and therefore available for other uses. [17] While it is suggested that they are reserved for use by bootable media, [18] a CD-ROM may contain an alternative file system descriptor in this area, and it is often used by hybrid CDs to offer classic Mac OS-specific and macOS-specific content.[ citation needed ]
The data area begins with the volume descriptor set, a set of one or more volume descriptors terminated with a volume descriptor set terminator. These collectively act as a header for the data area, describing its content (similar to the BIOS parameter block used by FAT, HPFS and NTFS formatted disks).
Volume descriptor set |
---|
Volume descriptor #1 |
... |
Volume descriptor #N |
Volume descriptor set terminator |
Each volume descriptor is 2048 bytes in size, fitting perfectly into a single Mode 1 or Mode 2 Form 1 sector. They have the following structure:
Part | Type | Identifier | Version | Data |
---|---|---|---|---|
Size | 1 byte | 5 bytes (always 'CD001') | 1 byte (always 0x01) | 2,041 bytes |
The data field of a volume descriptor may be subdivided into several fields, with the exact content depending on the type. Redundant copies of each volume descriptor can also be included in case the first copy of the descriptor becomes corrupt.
Standard volume descriptor types are the following:
Value | Type |
---|---|
0 | Boot record volume descriptor |
1 | Primary volume descriptor |
2 | Supplementary volume descriptor, or enhanced volume descriptor |
3 | Volume partition descriptor |
255 | Volume descriptor set terminator |
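The layout above can be sketched as a short Python routine that skips the 16-sector system area and reads one 2,048-byte descriptor per sector until it meets the terminator (type 255). This is an illustrative sketch that assumes the whole image is held in memory as `bytes` with 2,048-byte sectors; the names `iter_volume_descriptors`, `SECTOR_SIZE` and the other constants are hypothetical.

```python
SECTOR_SIZE = 2048          # bytes per sector, as assumed throughout this sketch
SYSTEM_AREA_SECTORS = 16    # sectors reserved before the volume descriptor set
TERMINATOR = 255            # type code of the volume descriptor set terminator

def iter_volume_descriptors(image: bytes):
    """Yield (type_code, version, data) for each descriptor, ending with the terminator."""
    offset = SYSTEM_AREA_SECTORS * SECTOR_SIZE
    while offset + SECTOR_SIZE <= len(image):
        sector = image[offset:offset + SECTOR_SIZE]
        type_code = sector[0]       # 1 byte: descriptor type
        identifier = sector[1:6]    # 5 bytes: always b"CD001"
        version = sector[6]         # 1 byte: descriptor version
        if identifier != b"CD001":
            raise ValueError(f"unexpected identifier at byte offset {offset}")
        yield type_code, version, sector[7:]   # remaining 2,041 bytes of data
        if type_code == TERMINATOR:
            return
        offset += SECTOR_SIZE
    raise ValueError("volume descriptor set terminator not found")
```

Collecting the type codes, for example with `[t for t, _, _ in iter_volume_descriptors(image)]`, gives a quick view of which descriptors an image carries.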
An ISO 9660 compliant disc must contain at least one primary volume descriptor describing the file system and a volume descriptor set terminator for indicating the end of the descriptor sequence. The volume descriptor set terminator is simply a particular type of volume descriptor with the purpose of marking the end of this set of structures. The primary volume descriptor provides information about the volume, characteristics and metadata, including a root directory record that indicates in which sector the root directory is located. Other fields contain the description or name of the volume, and information about who created it and with which application. The size of the logical blocks which the file system uses to segment the volume is also stored in a field inside the primary volume descriptor, as well as the amount of space occupied by the volume (measured in number of logical blocks).
In addition to the primary volume descriptor(s), supplementary volume descriptors or enhanced volume descriptors may be present.
Path tables summarize the directory structure of the relevant directory hierarchy. For each directory in the image, the path table provides the directory identifier, the location of the extent in which the directory is recorded, the length of any extended attributes associated with the directory, and the index of its parent directory path table entry. The parent directory number is a 16-bit number, limiting its range from 1 to 65,535. [19]
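The per-directory fields listed above can be decoded with a sketch like the following. It assumes the commonly documented record layout (one byte for the identifier length, one byte for the extended attribute length, a 4-byte extent location, a 2-byte parent number, the identifier, and a padding byte when the identifier length is odd); that layout and the function name `parse_path_table` are assumptions of this example and should be checked against the standard.

```python
import struct

def parse_path_table(table: bytes, little_endian: bool = True):
    """Return a list of (identifier, extent_location, ext_attr_length, parent_number)."""
    fmt = "<IH" if little_endian else ">IH"   # 4-byte extent location, 2-byte parent number
    entries = []
    pos = 0
    while pos < len(table):
        name_len = table[pos]
        if name_len == 0:          # zero length: treat the rest as unused padding
            break
        ext_attr_len = table[pos + 1]
        extent, parent = struct.unpack_from(fmt, table, pos + 2)
        name = table[pos + 8:pos + 8 + name_len]
        entries.append((name, extent, ext_attr_len, parent))
        # Records are padded to an even length when the identifier length is odd.
        pos += 8 + name_len + (name_len % 2)
    return entries
```

Because parent numbers are 1-based indexes into the same list, the full path of entry `i` can be rebuilt by following `entries[parent - 1]` until the root entry, which names itself as parent, is reached.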
Directory entries are stored following the location of the root directory entry, where evaluation of filenames is begun. Both directories and files are stored as extents, which are sequential series of sectors. Files and directories are differentiated only by a file attribute that indicates its nature (similar to Unix). The attributes of a file are stored in the directory entry that describes the file, and optionally in the extended attribute record. To locate a file, the directory names in the file's path can be checked sequentially, going to the location of each directory to obtain the location of the subsequent subdirectory. However, a file can also be located through the path table provided by the file system. This path table stores information about each directory, its parent, and its location on disc. Since the path table is stored in a contiguous region, it can be searched much faster than jumping to the particular locations of each directory in the file's path, thus reducing seek time.
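To illustrate how a directory entry carries both the extent location and the attribute that distinguishes files from directories, here is a hedged sketch that decodes the fixed part of one entry. The byte offsets follow the commonly documented directory record layout and are an assumption of this example, as are the helper name `parse_directory_record` and the constant `FLAG_DIRECTORY`.

```python
import struct

FLAG_DIRECTORY = 0x02  # assumed: bit 1 of the file flags byte marks a directory

def parse_directory_record(raw: bytes) -> dict:
    """Decode the fixed part of one directory record (illustrative helper)."""
    record_length = raw[0]
    # Extent location and data length are both-byte fields; read the little-endian halves.
    extent_location, = struct.unpack_from("<I", raw, 2)
    data_length, = struct.unpack_from("<I", raw, 10)
    flags = raw[25]
    name_length = raw[32]
    name = raw[33:33 + name_length]
    return {
        "record_length": record_length,
        "extent_location": extent_location,   # first logical block of the extent
        "data_length": data_length,           # size of the extent in bytes
        "is_directory": bool(flags & FLAG_DIRECTORY),
        "name": name,
    }
```

Walking a path then amounts to reading the extent of each directory in turn and scanning its records for the next name, which is the sequential lookup the paragraph above contrasts with the path table.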
The standard specifies three nested levels of interchange (paraphrased from section 10):
Additional restrictions in the body of the standard: The depth of the directory hierarchy must not exceed 8 (root directory being at level 1), and the path length of any file must not exceed 255. (section 6.8.2.1).
The standard also specifies the following name restrictions (sections 7.5 and 7.6): [4]
A CD-ROM producer may choose one of the lower Levels of Interchange specified in chapter 10 of the standard, and further restrict file name length from 30 characters to only 8+3 in file identifiers, and 8 in directory identifiers in order to promote interchangeability with implementations that do not implement the full standard.[ citation needed ]
All numbers in ISO 9660 file systems except the single-byte value used for the GMT offset are unsigned numbers. As the length of a file's extent on disc is stored in a 32-bit value, [20] it allows for a maximum length of just under 4.3 GB (more precisely, one byte less than 4 GiB). It is possible to circumvent this limitation by using the multi-extent (fragmentation) feature of ISO 9660 Level 3 to create ISO 9660 file systems and single files up to 8 TB. With this, files larger than 4 GiB can be split up into multiple extents (sequential series of sectors), each not exceeding the 4 GiB limit. For example, free software such as InfraRecorder, ImgBurn and mkisofs, as well as Roxio Toast, can create ISO 9660 file systems that use multi-extent files to store files larger than 4 GiB on appropriate media such as recordable DVDs. [citation needed] Linux supports multiple extents. [21]
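The size limits above follow from simple arithmetic, sketched here in Python. The 8 TB figure is reproduced under the assumption of 2,048-byte logical blocks addressed by a 32-bit block count; the constant names and the helper `min_extents_needed` are hypothetical.

```python
MAX_EXTENT_BYTES = 2**32 - 1   # largest length a 32-bit unsigned field can hold
BLOCK_SIZE = 2048              # assumed logical block size in bytes
MAX_BLOCKS = 2**32             # assumed number of addressable logical blocks

MAX_VOLUME_BYTES = MAX_BLOCKS * BLOCK_SIZE   # 2**43 bytes, i.e. 8 TiB

def min_extents_needed(file_size: int) -> int:
    """Lower bound on the number of extents needed for a file of the given size."""
    return max(1, -(-file_size // MAX_EXTENT_BYTES))   # ceiling division
```

The helper gives only a lower bound: mastering tools typically use block-aligned extents somewhat smaller than the 32-bit maximum, so a real image may use more extents than this figure.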
Since amendment 1 (or ECMA-119 3rd edition, or "JIS X 0606:1998 / ISO 9660:1999"), a much wider variety of file trees can be expressed by the EVD system. There is no longer any character limit (even 8-bit characters are allowed), nor any depth limit or path length limit. There still is a limit on name length, at 207. The character set is no longer enforced, so both sides of the disc interchange need to agree via a different channel. [15]
System Use Sharing Protocol (SUSP, IEEE P1281) provides a generic way of including additional properties for any directory entry reachable from the primary volume descriptor (PVD). In an ISO 9660 volume, every directory entry has an optional system use area whose contents are undefined and left to be interpreted by the system. SUSP defines a method to subdivide that area into multiple system use fields, each identified by a two-character signature tag. The idea behind SUSP was that it would enable any number of independent extensions to ISO 9660 to be created and included on a volume without conflicting. It also allows for the inclusion of property data that would otherwise be too large to fit within the limits of the system use area.
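The subdivision of the system use area into tagged fields can be sketched as follows. The sketch assumes each field starts with a two-character signature, followed by a one-byte total length (header included) and a one-byte version; that header layout and the name `iter_system_use_fields` are assumptions of this example rather than text taken from the specification.

```python
def iter_system_use_fields(area: bytes):
    """Yield (signature, version, payload) for each field in a system use area."""
    pos = 0
    while pos + 4 <= len(area):
        signature = area[pos:pos + 2].decode("ascii", errors="replace")
        length = area[pos + 2]     # total field length, including the 4-byte header
        version = area[pos + 3]
        if length < 4 or pos + length > len(area):
            break                  # malformed field or trailing padding: stop scanning
        yield signature, version, area[pos + 4:pos + length]
        if signature == "ST":      # terminator field ends the list
            break
        pos += length
```

An extension-aware reader would dispatch on the signature and skip any tag it does not recognize, which is what lets independent extensions share the same area.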
SUSP defines several common tags and system use fields:
CE
: Continuation area
PD
: Padding field
SP
: System use sharing protocol indicator
ST
: System use sharing protocol terminator
ER
: Extensions reference
ES
: Extension selector
Other known SUSP fields include:
AA
: Apple extension, preferred
BA
: Apple extension, old (length attribute is missing)
AS
: Amiga file properties
ZF
: zisofs compressed file, usually produced by the program mkzftree or by libisofs. Transparently decompressed by the Linux kernel if built with CONFIG_ZISOFS. [22]
AL
: records Extended File Attributes, including ACLs. Proposed by libburnia, supported by libisofs. [23]
The Apple extensions do not technically follow the SUSP standard; however, the basic structure of the AA and AB fields defined by Apple is forward compatible with SUSP, so that, with care, a volume can use both Apple extensions and RRIP extensions.
The Rock Ridge Interchange Protocol (RRIP, IEEE P1282) is an extension which adds POSIX file system semantics. The availability of these extension properties allows for better integration with Unix and Unix-like operating systems. [24] The standard takes its name from the fictional town Rock Ridge in Mel Brooks' film Blazing Saddles. [25] The RRIP extensions are, briefly:
The RRIP extensions are built upon SUSP, defining additional tags for support of POSIX semantics, along with the format and meaning of the corresponding system use fields:
RR
: Rock Ridge extensions in-use indicator (note: dropped from standard after version 1.09)
PX
: POSIX file attributes
PN
: POSIX device numbers
SL
: symbolic link
NM
: alternate name
CL
: child link
PL
: parent link
RE
: relocated directory
TF
: time stamp
SF
: sparse file data
Amiga Rock Ridge is similar to RRIP, except it provides additional properties used by AmigaOS. It too is built on the SUSP standard by defining an "AS"-tagged system use field. Thus both Amiga Rock Ridge and the POSIX RRIP may be used simultaneously on the same volume. The properties supported by this extension include the additional Amiga bits for files: attribute "P" stands for the "pure" bit (indicating a re-entrant command) and attribute "S" for the script bit (indicating a batch file). The extension also covers the protection flags plus an optional comment field. These extensions were introduced by Angela Schmidt with the help of Andrew Young, the primary author of the Rock Ridge Interchange Protocol and System Use Sharing Protocol. The first publicly available software to master a CD-ROM with Amiga extensions was MakeCD, an Amiga program that Angela Schmidt developed together with Patrick Ohly. [26]
El Torito is an extension designed to allow booting a computer from a CD-ROM. It was announced in November 1994 [27] and first issued in January 1995 as a joint proposal by IBM and BIOS manufacturer Phoenix Technologies. According to legend, the El Torito CD/DVD extension to ISO 9660 got its name because its design originated in an El Torito restaurant in Irvine, California (33°41′05″N 117°51′09″W / 33.684722°N 117.852547°W). [28] The initial two authors were Curtis Stevens, of Phoenix Technologies, and Stan Merkin, of IBM. [28]
A 32-bit PC BIOS will search for boot code on an ISO 9660 CD-ROM. The standard allows for booting in two different modes: either in hard disk emulation, where the boot information can be accessed directly from the CD media, or in floppy emulation mode, where the boot information is stored in an image file of a floppy disk, which is loaded from the CD and then behaves as a virtual floppy disk. This is useful for computers that were designed to boot only from a floppy drive. For modern computers the "no emulation" mode is generally the more reliable method. The BIOS will assign a BIOS drive number to the CD drive. The drive number (for INT 13H) assigned is 0x80 (hard disk emulation), 0x00 (floppy disk emulation) or an arbitrary number if the BIOS should not provide emulation. Emulation is useful for booting older operating systems from a CD, by making it appear to them as if they were booted from a hard or floppy disk.
El Torito can also be used to produce CDs which can boot up Linux operating systems, by including the GRUB bootloader on the CD and following the Multiboot Specification. [29] While the El Torito spec alludes to a "Mac" platform ID, PowerPC-based Apple Macintosh computers don't use it. [30]
Joliet is an extension specified and endorsed by Microsoft and has been supported by all versions of its Windows operating system since Windows 95 [31] and Windows NT 4.0. [32] Its primary focus is the relaxation of the filename restrictions inherent with full ISO 9660 compliance. Joliet accomplishes this by supplying an additional set of filenames that are encoded in UCS-2BE (UTF-16BE in practice since Windows 2000). These filenames are stored in a special supplementary volume descriptor, that is safely ignored by ISO 9660-compliant software, thus preserving backward compatibility. [31] The specification only allows filenames to be up to 64 Unicode characters in length. However, the documentation for mkisofs states filenames up to 103 characters in length do not appear to cause problems. [33] Microsoft has documented it "can use up to 110 characters." [34] The difference lies in whether CDXA extension space is used. [33]
Joliet allows Unicode characters to be used for all text fields, which includes file names and the volume name. A "Secondary" volume descriptor with type 2 contains the same information as the Primary one (sector 16 offset 40 bytes), but in UCS-2BE in sector 17, offset 40 bytes. As a result of this, the volume name is limited to 16 characters.
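The offsets mentioned above can be made concrete with a small sketch that reads the 32-byte volume identifier at offset 40 from the primary descriptor in sector 16 and, if sector 17 holds a type 2 descriptor, from that one as UCS-2BE. It assumes 2,048-byte sectors, space padding of the field, and that the Joliet descriptor really sits in sector 17 as the text describes (real images may place it elsewhere in the descriptor set). The name `read_volume_names` is hypothetical.

```python
SECTOR_SIZE = 2048  # assumed sector size in bytes

def read_volume_names(image: bytes):
    """Return (primary_name, joliet_name); joliet_name is None if sector 17 is not type 2."""
    pvd = image[16 * SECTOR_SIZE:17 * SECTOR_SIZE]
    svd = image[17 * SECTOR_SIZE:18 * SECTOR_SIZE]
    # The primary volume identifier is 32 single-byte characters, padded with spaces.
    primary = pvd[40:72].decode("ascii", errors="replace").rstrip(" ")
    joliet = None
    if len(svd) == SECTOR_SIZE and svd[0] == 2 and svd[1:6] == b"CD001":
        # The same 32 bytes hold 16 two-byte big-endian characters in Joliet.
        joliet = svd[40:72].decode("utf-16-be", errors="replace").rstrip(" ")
    return primary, joliet
```

Since the field stays 32 bytes long while each character takes two bytes, the sketch also shows where the 16-character limit on the Joliet volume name comes from.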
Many current PC operating systems are able to read Joliet-formatted media, thus allowing exchange of files between those operating systems even if non-Roman characters are involved (such as Arabic, Japanese or Cyrillic), which was formerly not possible with plain ISO 9660-formatted media. Operating systems which can read Joliet media include:
Romeo was developed by Adaptec and allows the use of long filenames up to 128 characters, written directly into the primary volume descriptor using the current code page. This format is built around the workings of Windows 9x and Windows NT "CDFS" drivers. [41] When a Windows installation of a different language opens a Romeo disc, the lack of code page indication will cause non-ASCII characters in file names to become mojibake. For example, "ü" may become "³". A different OS may encounter a similar problem or refuse to recognize these noncompliant names outright.
The same code page problem technically exists in standard ISO 9660, which allows open interpretation of the supplemental and enhanced volume descriptors to any character encoding subject to agreement. However, the primary volume descriptor is guaranteed to be a small subset of ASCII.
Apple Computer authored a set of extensions that add ProDOS or HFS/HFS+ (the primary contemporary file systems for the classic Mac OS) properties to the filesystem. Some of the additional metadata properties include: [42]
In order to allow non-Macintosh systems to access Macintosh files on CD-ROMs, Apple chose to use an extension of the standard ISO 9660 format. Most of the data, other than the Apple specific metadata, remains visible to operating systems that are able to read ISO 9660.
For operating systems which do not support any extensions, a name translation file TRANS.TBL must be used. The TRANS.TBL file is a plain ASCII text file. Each line contains three fields, separated by an arbitrary amount of whitespace: the file type, the ISO 9660 filename, and the extended filename.
Most implementations that create TRANS.TBL files put a single space between the file type and ISO 9660 name and some arbitrary number of tabs between the ISO 9660 filename and the extended filename.
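A reader for this format can be sketched in a few lines of Python. The sketch splits each line on the first two runs of whitespace, so that an extended filename containing spaces stays intact; the function name `parse_trans_tbl` and the single-letter type codes used in the usage example (`F` for a file, `D` for a directory) are assumptions of this illustration.

```python
def parse_trans_tbl(text: str):
    """Return a list of (file_type, iso_name, extended_name) tuples from TRANS.TBL text."""
    entries = []
    for line in text.splitlines():
        if not line.strip():
            continue                      # skip blank lines
        parts = line.split(None, 2)       # split on the first two whitespace runs only
        if len(parts) != 3:
            continue                      # ignore lines without all three fields
        file_type, iso_name, extended_name = parts
        entries.append((file_type, iso_name, extended_name.rstrip()))
    return entries
```

For example, `parse_trans_tbl("F README.TXT;1\t\tReadme for users.txt\n")` yields one entry whose third field keeps its embedded spaces.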
Native support for using TRANS.TBL still exists in many ISO 9660 implementations, particularly those related to Unix. However, it has long since been superseded by other extensions, and modern utilities that create ISO 9660 images either cannot create TRANS.TBL files at all, or no longer create them unless explicitly requested by the user. Since a TRANS.TBL file has no special identification other than its name, it can also be created separately and included in the directory before filesystem creation.
The ISO 13490 standard is an extension to the ISO 9660 format that adds support for multiple sessions on a disc. Since ISO 9660 is by design a read-only, pre-mastered file system, all the data has to be written in one go or "session" to the medium. Once written, there is no provision for altering the stored content. ISO 13490 was created to allow adding more files to a writeable disc such as CD-R in multiple sessions.
The ISO 13346/ECMA-167 standard was designed in conjunction with the ISO 13490 standard. This new format addresses most of the shortcomings of ISO 9660, and a subset of it evolved into the Universal Disk Format (UDF), which was adopted for DVDs. The volume descriptor table retains the ISO 9660 layout, but the identifier has been updated. [43] [44]
Optical disc images are a common way to electronically transfer the contents of CD-ROMs. They often have the filename extension .iso (.iso9660 is less common, but also in use) and are commonly referred to as "ISOs". [45]
Most operating systems support reading of ISO 9660 formatted discs, and most new versions support the extensions such as Rock Ridge and Joliet. Operating systems that do not support the extensions usually show the basic (non-extended) features of a plain ISO 9660 disc.
Operating systems that support ISO 9660 and its extensions include the following: