CFS (file format)

Compact File Set
Filename extension: .cfs
Internet media type: application/x-cfs-compressed
Developed by: Joe Lowe (Pismo Technic Inc.)
Type of format: Archive format
Open format?: Yes (see cfs.h file)

Compact File Set (CFS) is an open archive file format and software distribution container file format.

Overview

The Compact File Set (CFS) is an open archive file format and software distribution container file format. [1] Basic CFS files are compatible with ISO files. The format is intended to be similar enough to ISO-9660 that many systems and applications can read CFS files as they are, while other applications require only minor modifications. It is based on:

It is available for use in free or commercial applications without charge. No part of the format is believed to be covered by patents. [2]

The primary application is expected to be container files for various archiving and distribution applications, but CFS may be useful when written directly to CD/DVD media.

Goals

Main differences of CFS from ISO-9660

Technical Information

Media header

The first 20 blocks (40 KiB) of the logical image are the media header. The layout of the media header is compatible with the various descriptor and directory structures for ISO-9660. The first block of file data is stored in block 20, immediately following the media header.

The media header has the following layout:

block 0-11: all zero
block 12: compatibility readme file text
block 13: compatibility root folder
block 14: compatibility little-endian path table
block 15: compatibility big-endian path table
block 16: ISO-9660 compatibility primary volume descriptor
block 17: ISO-9660 supplementary volume descriptor
block 18: ISO-9660 terminating descriptor
block 19: all zero
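
The layout can be expressed as a small lookup table. The following Python sketch is illustrative only: the 2048-byte block size is inferred from the statement that 20 blocks occupy 40 KiB, and the names BLOCK_SIZE, MEDIA_HEADER_LAYOUT, block_offset and describe_block are hypothetical, not taken from cfs.h.

```python
# Sketch of the media header layout described above.
# Assumption: 2048-byte blocks (20 blocks = 40 KiB). All names are hypothetical.
BLOCK_SIZE = 2048
MEDIA_HEADER_BLOCKS = 20

# (first block, last block, role), both ends inclusive
MEDIA_HEADER_LAYOUT = [
    (0, 11, "all zero"),
    (12, 12, "compatibility readme file text"),
    (13, 13, "compatibility root folder"),
    (14, 14, "compatibility little-endian path table"),
    (15, 15, "compatibility big-endian path table"),
    (16, 16, "ISO-9660 compatibility primary volume descriptor"),
    (17, 17, "ISO-9660 supplementary volume descriptor"),
    (18, 18, "ISO-9660 terminating descriptor"),
    (19, 19, "all zero"),
]


def block_offset(block):
    """Return the byte offset of a block within the logical image."""
    return block * BLOCK_SIZE


def describe_block(block):
    """Return the role of a block, or a marker if it lies past the header."""
    for first, last, role in MEDIA_HEADER_LAYOUT:
        if first <= block <= last:
            return role
    return "outside the media header"
```

Under these assumptions, block_offset(20) gives the byte offset at which the first block of file data begins.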

The primary volume descriptor in the media header references the fixed compatibility root folder and readme, to help users identify applications and systems that do not use the supplementary volume descriptor. The supplementary volume descriptor indicates the UCS-2 character set and references the real directory structure. The media header should be initialized exactly as done by the logic in the cfs.h header file. No additional application data, system data, comments, dates, or other text should be added to the media header.

Unicode file names

All file names and the system ID and volume ID fields of the supplementary volume descriptor are encoded as UCS-2, big-endian.

File name lengths are limited by the 8-bit file record size to 110 16-bit characters. No arbitrary limits are imposed on directory hierarchy depth or on the combined length of a file name and its enclosing folder name components. Readers will need to choose an appropriate limit for their environment and perform checks as necessary. As in ISO-9660-1999, version numbers are not added to file names, and the special meaning of the '.' and ';' characters during file name sorting is eliminated.
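
As an illustration of the encoding and length rules above, a writer might validate and encode a name as follows. This is a minimal sketch, not the cfs.h logic: the function name encode_cfs_name and the choice to reject empty names are assumptions. Python's "utf-16-be" codec yields the same bytes as big-endian UCS-2 once characters outside the Basic Multilingual Plane have been rejected.

```python
# Sketch: encode a file name as big-endian UCS-2 with the 110-character limit.
# Hypothetical helper; rejecting empty names is an assumption of this sketch.
MAX_NAME_CHARS = 110


def encode_cfs_name(name):
    """Return the UCS-2 big-endian bytes for a file name, or raise ValueError."""
    if not name:
        raise ValueError("empty file name")
    for ch in name:
        code = ord(ch)
        # UCS-2 cannot represent code points above U+FFFF, and surrogate
        # code points are not valid characters on their own.
        if code > 0xFFFF or 0xD800 <= code <= 0xDFFF:
            raise ValueError("character not representable in UCS-2: %r" % ch)
    if len(name) > MAX_NAME_CHARS:
        raise ValueError("file name longer than %d characters" % MAX_NAME_CHARS)
    return name.encode("utf-16-be")
```

Each character occupies two bytes, so a name of the maximum length encodes to 220 bytes.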

Optional path tables

Path tables consume media space with redundant information, and restrict media to a maximum of 64K folders. Readers should not reference path tables. Writers may choose to generate path tables to increase compatibility with ISO-9660 readers. Path tables must be written with the directory data (folder extents), beyond the last block of file data. Note that correct path tables cannot be generated for media containing more than 64K folders. Writers that are modifying existing media may choose to remove existing path tables. If path tables are not present, then the three related volume descriptor fields in the supplementary volume descriptor must be set to zero.

Extended attributes

Extended attributes are reserved for future extensions to CFS. Writers must not create extended attributes. Readers must gracefully handle extended attributes if they exist.

Contiguous file data and restricted use of duplicate file records for multi-extent files

All data for each file must exist in one contiguous extent. This is true even when the file is represented using multiple file records. Interleaved files must not be created. Associated files must not be created.

Duplicate file records are to be used only to represent files with data extents larger than 4 GiB - 2048 bytes. They are not to be used to represent files with fragmented data. When duplicate file records are used, the multi-extent flag must also be used as indicated in the ISO-9660-1999 specification. Duplicate file records should not be created unless the total data size of the file is greater than 4 GiB - 2048 bytes. When duplicate file records exist for a file, all but the last file record must have a data extent that is exactly 4 GiB - 2048 bytes in size.
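
The rules for duplicate file records can be sketched as a function that splits one contiguous file into per-record extents. This is an illustrative sketch under stated assumptions: 2048-byte blocks, a file that starts on a block boundary, and a boolean standing in for the multi-extent flag (set on every record except the last, following ISO-9660 practice). The names MAX_EXTENT and file_records are hypothetical.

```python
# Sketch: describe the file records needed for one contiguous file.
# Assumptions: 2048-byte blocks; the flag is modelled as a plain boolean.
BLOCK_SIZE = 2048
MAX_EXTENT = 4 * 1024**3 - 2048  # 4 GiB - 2048 bytes, a multiple of 2048


def file_records(start_block, total_size):
    """Return a list of (first block, size in bytes, more_records_follow)."""
    if total_size < 0:
        raise ValueError("negative file size")
    if total_size <= MAX_EXTENT:
        # A single record suffices, so no duplicate records are created.
        return [(start_block, total_size, False)]
    records = []
    block = start_block
    remaining = total_size
    while remaining > MAX_EXTENT:
        # Every record but the last covers exactly MAX_EXTENT bytes.
        records.append((block, MAX_EXTENT, True))
        block += MAX_EXTENT // BLOCK_SIZE
        remaining -= MAX_EXTENT
    records.append((block, remaining, False))
    return records
```

Because the maximum extent size is a whole number of blocks, consecutive records describe adjacent block ranges, and the file data remains one contiguous extent on the media.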

Location of directory data on media

All file data must precede all folder extents and path tables on media. The intent is that an image modifying application can read the entire directory into memory, add new file data to the image, and rewrite an updated directory after the new file data. Writers will need to determine the last block of file data after reading the entire directory.
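
A sketch of how an image-modifying application might plan such an update, once the whole directory is in memory, is given below. It assumes 2048-byte blocks, block-aligned file extents, and file data beginning at block 20. The function names and the (first block, size) tuple form are hypothetical.

```python
# Sketch: find the end of existing file data and place new files after it.
# Assumptions: 2048-byte blocks, block-aligned extents, data starts at block 20.
BLOCK_SIZE = 2048
FIRST_DATA_BLOCK = 20


def blocks_for(size):
    """Number of whole blocks needed to hold size bytes."""
    return (size + BLOCK_SIZE - 1) // BLOCK_SIZE


def next_free_data_block(extents):
    """Return the block just past the last block of existing file data.

    extents is an iterable of (first block, size in bytes) pairs gathered
    by reading the entire directory.
    """
    end = FIRST_DATA_BLOCK
    for start, size in extents:
        end = max(end, start + blocks_for(size))
    return end


def plan_append(extents, new_sizes):
    """Place new files after existing data.

    Returns (placements, directory_block): where each new file goes and
    where the rewritten directory data may begin.
    """
    block = next_free_data_block(extents)
    placements = []
    for size in new_sizes:
        placements.append((block, size))
        block += blocks_for(size)
    return placements, block
```

In this arrangement the old directory data is simply overwritten by the new file data and a fresh directory is written at the returned block, so all file data still precedes all folder extents.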

Media header patch area

When the media header is modified, either at the end of image creation or as part of later modifications to an existing image, only some specific fields are to be updated. These fields exist entirely within the media header patch area. Only the media header patch area should be re-written. This allows more options when dealing with image container file formats or transports with limited seeking or overwrite capability (compressed formats, pipes, sockets).

Format extensions and compound file systems

All files and folders written in the image must be accessible through the single directory structure referenced from the supplementary volume descriptor. Compound file systems, such as those that include UDF or HFS structures, are not allowed. Rock Ridge and other ISO-9660 extensions are not allowed.

Extensions for archiving system specific attributes

Future versions of CFS may include extensions to allow storing system-specific attributes such as time fields, security descriptors, access control lists, resource forks, and symbolic links. Developers with a need for these extensions should contact Pismo Technic with requirements or suggestions.

Media formats

CFS images are either written to CD/DVD media, or are stored in a media container file. The media container file can be a raw dump of the CFS image, referred to here as DD, but more commonly known as ISO files. Also, the media container file can be a more structured container format that provides additional features such as compression and spanning. CFS images are only compliant with this specification when they are stored in DD or CISO (Compact ISO) format media files. When burned to CD/DVD media or when stored in other media container file formats such as NRG or DAA, the combination is not CFS compliant and should not be referred to as a CFS file.

Note: Compact ISO is not the same format as the compressed ISO format common in PlayStation Portable homebrew development. The PSP compressed ISO format is also referred to as CISO, but its file extension is CSO.

CFS writing applications should default to writing DD format media container files unless the user has specified container file options that require CISO (spanning, compression, ...). This provides more intuitive interchange with systems and applications that support DD CD/DVD images but do not support CFS.

References

  1. Joe Lowe. "cfs.h". CFS source code header file. Retrieved 2008-04-06.
  2. "Pismo Technic Web Page - CFS information".