CFS (file format)

Compact File Set
Filename extension: .cfs
Internet media type: application/x-cfs-compressed
Developed by: Joe Lowe (Pismo Technic Inc.)
Type of format: Archive format
Open format: Yes (see cfs.h file)

Compact File Set (CFS) is an open archive file format and software distribution container file format.


Overview

The Compact File Set (CFS) is an open archive file format and software distribution container file format. [1] Basic CFS files are compatible with ISO files. The format is intended to be similar enough to ISO-9660 that many systems and applications can read CFS as-is, while other applications require only minor modifications. It is based on:

It is available for use in free or commercial applications without charge. No part of the format is believed to be covered by patents. [2]

The primary application is expected to be container files for various archiving and distribution applications, but CFS may also be useful when written directly to CD/DVD media.

Goals

Main differences of CFS from ISO-9660

Technical Information

Media header

The first 20 blocks (40 KiB) of the logical image are the media header. The layout of the media header is compatible with the various descriptor and directory structures of ISO-9660. The first block of file data is stored in block 20, immediately following the media header.

The media header has the following layout:

block 0-11: all zero
block 12: compatibility readme file text
block 13: compatibility root folder
block 14: compatibility little-endian path table
block 15: compatibility big-endian path table
block 16: ISO-9660 compatibility primary volume descriptor
block 17: ISO-9660 supplementary volume descriptor
block 18: ISO-9660 terminating descriptor
block 19: all zero
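The layout above can be expressed as a small lookup table. The following is a minimal sketch, not taken from cfs.h: the 2048-byte block size is inferred from the statement that 20 blocks occupy 40 KiB, and the names `MEDIA_HEADER_LAYOUT`, `block_offset`, and `describe_block` are hypothetical.

```python
# Assumption: 2048-byte blocks, inferred from "20 blocks = 40 KiB".
BLOCK_SIZE = 2048

# (first_block, last_block, description) for each region of the media header.
MEDIA_HEADER_LAYOUT = [
    (0, 11, "all zero"),
    (12, 12, "compatibility readme file text"),
    (13, 13, "compatibility root folder"),
    (14, 14, "compatibility little-endian path table"),
    (15, 15, "compatibility big-endian path table"),
    (16, 16, "ISO-9660 compatibility primary volume descriptor"),
    (17, 17, "ISO-9660 supplementary volume descriptor"),
    (18, 18, "ISO-9660 terminating descriptor"),
    (19, 19, "all zero"),
]

# The first block of file data immediately follows the media header.
FIRST_FILE_DATA_BLOCK = 20


def block_offset(block):
    """Return the byte offset of a block within the logical image."""
    return block * BLOCK_SIZE


def describe_block(block):
    """Return the media header role of a block, or a generic label past the header."""
    for first, last, description in MEDIA_HEADER_LAYOUT:
        if first <= block <= last:
            return description
    return "file or directory data"
```

For example, `block_offset(17)` gives the byte position at which a reader would look for the supplementary volume descriptor.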

The primary volume descriptor in the media header references the fixed compatibility root folder and readme, which helps users identify applications and systems that do not use the supplementary volume descriptor. The supplementary volume descriptor indicates the UCS-2 character set and references the real directory structure. The media header should be initialized exactly as done by the logic in the cfs.h header file. No additional application data, system data, comments, dates, or other text should be added to the media header.
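A reader might sanity-check the three descriptor blocks before trusting an image. The sketch below assumes that CFS descriptors follow the usual ISO 9660 conventions: a type byte of 1 (primary), 2 (supplementary), or 255 (terminator), followed by the standard identifier `CD001`. It also assumes 2048-byte blocks. The function name `check_descriptors` is hypothetical, and cfs.h remains the authority on the exact contents of these blocks.

```python
import io

# Assumption: 2048-byte blocks, inferred from "20 blocks = 40 KiB".
BLOCK_SIZE = 2048

# Block number -> ISO 9660 volume descriptor type code expected there.
# 1 = primary, 2 = supplementary, 255 = set terminator.
EXPECTED_TYPES = {16: 1, 17: 2, 18: 255}


def check_descriptors(image):
    """Return a list of problems found in blocks 16-18 of a seekable binary stream."""
    problems = []
    for block, expected_type in EXPECTED_TYPES.items():
        image.seek(block * BLOCK_SIZE)
        head = image.read(7)  # type byte, 5-byte identifier, version byte
        if len(head) < 7:
            problems.append(f"block {block}: image too short")
        elif head[1:6] != b"CD001":
            problems.append(f"block {block}: missing CD001 identifier")
        elif head[0] != expected_type:
            problems.append(
                f"block {block}: descriptor type {head[0]}, expected {expected_type}"
            )
    return problems
```

An empty list means the three descriptors are at least of the expected kinds. It does not show that the rest of the media header matches the initialization logic in cfs.h.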

Unicode file names

All file names and the system ID and volume ID fields of the supplementary volume descriptor are encoded as UCS-2, big-endian.

File name lengths are limited by the 8-bit file record size to 110 16-bit characters. No arbitrary limits are imposed on directory hierarchy depth or on the combined length of a file name and its enclosing folder name components. Readers will need to choose an appropriate limit for their environment and perform checks as necessary. As in ISO-9660-1999, version numbers are not added to file names, and the special meaning of the '.' and ';' characters during file name sorting is eliminated.
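A writer could validate and encode names along these lines. This is an illustrative sketch with a hypothetical function name, `encode_cfs_name`. It relies on the fact that for characters in the Basic Multilingual Plane (excluding surrogates), Python's `utf-16-be` codec produces the same bytes as big-endian UCS-2. Characters outside that range are rejected because UCS-2 cannot represent them.

```python
# Limit stated in the text: 110 16-bit characters per file name.
MAX_NAME_CHARS = 110


def encode_cfs_name(name):
    """Encode a file name as big-endian UCS-2, enforcing the 110-character limit."""
    if len(name) > MAX_NAME_CHARS:
        raise ValueError(f"name has {len(name)} characters, limit is {MAX_NAME_CHARS}")
    for ch in name:
        code_point = ord(ch)
        # UCS-2 has no surrogate pairs, so only non-surrogate BMP characters fit.
        if code_point > 0xFFFF or 0xD800 <= code_point <= 0xDFFF:
            raise ValueError(f"character U+{code_point:04X} is not representable in UCS-2")
    return name.encode("utf-16-be")
```

Each accepted character occupies exactly two bytes, so a name at the limit encodes to 220 bytes.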

Optional path tables

Path tables consume media space with redundant information, and restrict media to a maximum of 64K folders. Readers should not reference path tables. Writers may choose to generate path tables to increase compatibility with ISO-9660 readers. Path tables must be written with the directory data (folder extents), beyond the last block of file data. Note that correct path tables cannot be generated for media containing more than 64K folders. Writers that modify existing media may choose to remove existing path tables. If path tables are not present, the three related volume descriptor fields in the supplementary volume descriptor must be set to zero.

Extended attributes

Extended attributes are reserved for future extensions to CFS. Writers must not create extended attributes. Readers must gracefully handle extended attributes if they exist.

File data must be contiguous, and restricted use of duplicate file records for multi-extent files

All data for each file must exist in one contiguous extent. This is true even when the files are represented using multiple file records. Interleaved files must not be created. Associated files must not be created.

Duplicate file records are to be used only to represent files with data extents larger than 4GiB-2048 bytes. They are not to be used to represent files with fragmented data. When duplicate file records are used, the multi-extent flag must also be used as indicated in the ISO-9660-1999 specification. Duplicate file records should not be created unless the total data size of the file is greater than 4GiB-2048 bytes. When duplicate file records exist for a file, all but the last file record must have a data extent that is exactly 4GiB-2048 bytes in size.
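The rules above amount to a simple splitting scheme, sketched below under stated assumptions. The 2048-byte block size is inferred from the media header description. The multi-extent flag is modelled as a boolean that is set on every record except the last, following the usual ISO 9660 meaning of that flag. `plan_file_records` is a hypothetical name.

```python
# Assumption: 2048-byte blocks, inferred from "20 blocks = 40 KiB".
BLOCK_SIZE = 2048

# Largest data extent a single file record may describe: 4GiB-2048 bytes.
MAX_EXTENT_BYTES = 4 * 1024**3 - BLOCK_SIZE

# A full extent is a whole number of blocks, so consecutive extents stay contiguous.
BLOCKS_PER_FULL_EXTENT = MAX_EXTENT_BYTES // BLOCK_SIZE


def plan_file_records(start_block, file_size):
    """Return (start_block, size_bytes, multi_extent_flag) for each file record."""
    records = []
    remaining = file_size
    block = start_block
    while True:
        size = min(remaining, MAX_EXTENT_BYTES)
        remaining -= size
        # The flag marks "more records follow"; the final record leaves it clear.
        records.append((block, size, remaining > 0))
        if remaining == 0:
            break
        block += BLOCKS_PER_FULL_EXTENT
    return records
```

A file of exactly 4GiB-2048 bytes or smaller yields a single record with the flag clear, so duplicate records only appear once the size exceeds that threshold.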

Location of directory data on media

All file data must precede all folder extents and path tables on media. The intent is that an image modifying application can read the entire directory into memory, add new file data to the image, and rewrite an updated directory after the new file data. Writers will need to determine the last block of file data after reading the entire directory.
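One way a modifying application might locate the end of the file data, once the directory is in memory, is to take the highest block touched by any file extent. The sketch below assumes 2048-byte blocks and assumes the caller has already collected `(start_block, size_bytes)` pairs for file records only (not folder extents). `next_free_block` is a hypothetical helper, not part of cfs.h.

```python
# Assumption: 2048-byte blocks, inferred from "20 blocks = 40 KiB".
BLOCK_SIZE = 2048

# File data begins immediately after the 20-block media header.
FIRST_FILE_DATA_BLOCK = 20


def next_free_block(file_extents):
    """Return the first block after all file data.

    file_extents is an iterable of (start_block, size_bytes) pairs taken from
    the file records of the directory.
    """
    end = FIRST_FILE_DATA_BLOCK
    for start_block, size_bytes in file_extents:
        # Round the extent size up to a whole number of blocks.
        blocks = (size_bytes + BLOCK_SIZE - 1) // BLOCK_SIZE
        end = max(end, start_block + blocks)
    return end
```

New file data would be appended starting at the returned block, with the updated folder extents (and any path tables) written after it.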

Media header patch area

When the media header is modified, either at the end of image creation or as part of later modifications to an existing image, only certain fields are to be updated. These fields exist entirely within the media header patch area, and only that area should be rewritten. This allows more options when dealing with image container file formats or transports with limited seeking or overwrite capability (compressed formats, pipes, sockets).

Format extensions and compound file systems

All files and folders written in the image must be accessible through the single directory structure referenced from the supplementary volume descriptor. Compound file systems, such as those that include UDF or HFS structures, are not allowed. Rock Ridge and other ISO-9660 extensions are not allowed.

Extensions for archiving system specific attributes

Future versions of CFS may include extensions to allow storing system-specific attributes such as time fields, security descriptors, access control lists, resource forks, and symbolic links. Developers with a need for these extensions should contact Pismo Technic with requirements or suggestions.

Media formats

CFS images are either written to CD/DVD media, or are stored in a media container file. The media container file can be a raw dump of the CFS image, referred to here as DD, but more commonly known as ISO files. Also, the media container file can be a more structured container format that provides additional features such as compression and spanning. CFS images are only compliant with this specification when they are stored in DD or CISO (Compact ISO) format media files. When burned to CD/DVD media or when stored in other media container file formats such as NRG or DAA, the combination is not CFS compliant and should not be referred to as a CFS file.

Note: Compact ISO is not the same format as the compressed ISO format common in PlayStation Portable homebrew development. The PSP compressed ISO format is also referred to as CISO, but the file extension is CSO.

CFS writing applications should default to writing DD format media container files unless the user has specified container file options that require CISO (spanning, compression, ...). This provides more intuitive interchange with systems and applications that support DD CD/DVD images but do not support CFS.

See also

References

  1. Joe Lowe. "cfs.h". CFS source code header file. Retrieved 2008-04-06.
  2. "Pismo Technic Web Page - CFS information".