Compound File Binary Format (CFBF), also called Compound File, Compound Document format, [1] or Composite Document File V2 [2] (CDF), is a compound document file format for storing numerous files and streams within a single file on a disk. CFBF was developed by Microsoft and is an implementation of Microsoft COM Structured Storage. [3] [4] [5] The file format is used for storing storage objects and stream objects in a hierarchical structure within a single file. [6]
Microsoft has opened the format for use by others, and it is now used in a variety of programs, from Microsoft Word and Microsoft Access to Business Objects. It also forms the basis of the Advanced Authoring Format. [7]
At its simplest, the Compound File Binary Format is a container, with little restriction on what can be stored within it.
A CFBF file structure loosely resembles a FAT filesystem. The file is partitioned into Sectors, which are chained together by a File Allocation Table (not to be confused with the file system of the same name) that records the chain of sectors belonging to each file. A Directory holds information about the contained files, including the Sector ID (SID) of the starting sector of each file's chain.
The CFBF file consists of a 512-byte header record followed by a number of Sectors whose size is defined in the header. The literature defines Sectors to be either 512 or 4096 bytes in length, although the format is potentially capable of supporting sectors ranging in size from 128 bytes upwards, in powers of two (128, 256, 512, 1024, etc.). The lower limit of 128 is the minimum required to fit a single directory entry in a Directory Sector.
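The relationship between the sector-size exponent stored in the header and the position of a sector in the file can be sketched in C as follows. This is a minimal illustration, not a reference implementation: the function names are hypothetical, and the offset formula assumes the two sector sizes defined in the literature (512 and 4096 bytes), where the header occupies, or is padded to, exactly one sector so that sector 0 begins one sector size into the file.

```c
#include <stdint.h>

/* Size in bytes of a sector (or mini-sector), given the power-of-two
 * exponent stored in the header (e.g. 9 -> 512, 12 -> 4096). */
static uint32_t cfb_sector_size(uint16_t shift)
{
    return (uint32_t)1 << shift;
}

/* Byte offset of sector number `sect` from the start of the file.
 * Assumption: the header fills exactly one sector, so sector 0 starts
 * at offset (1 << sector_shift). This does not hold for hypothetical
 * sector sizes smaller than the 512-byte header. */
static uint64_t cfb_sector_offset(uint32_t sect, uint16_t sector_shift)
{
    return ((uint64_t)sect + 1) << sector_shift;
}
```

With 512-byte sectors, for example, sector 0 starts immediately after the header at offset 512 and sector 2 starts at offset 1536.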
There are several types of sector that may be present in a CFBF file:
More detail is given below for the header and each sector type.
The CFBF Header occupies the first 512 bytes of the file and contains the information required to interpret the rest of the file. The C-style structure declaration below (extracted from the AAFA's Low-Level Container Specification) shows the members of the CFBF header and their purpose:
typedef unsigned long ULONG;    // 4 Bytes
typedef unsigned short USHORT;  // 2 Bytes
typedef short OFFSET;           // 2 Bytes
typedef ULONG SECT;             // 4 Bytes
typedef ULONG FSINDEX;          // 4 Bytes
typedef USHORT FSOFFSET;        // 2 Bytes
typedef USHORT WCHAR;           // 2 Bytes
typedef ULONG DFSIGNATURE;      // 4 Bytes
typedef unsigned char BYTE;     // 1 Byte
typedef unsigned short WORD;    // 2 Bytes
typedef unsigned long DWORD;    // 4 Bytes
typedef ULONG SID;              // 4 Bytes
typedef GUID CLSID;             // 16 Bytes

struct StructuredStorageHeader { // [offset from start (bytes), length (bytes)]
    BYTE _abSig[8];              // [00H,08] {0xd0, 0xcf, 0x11, 0xe0, 0xa1, 0xb1,
                                 //          0x1a, 0xe1} for current version
    CLSID _clsid;                // [08H,16] reserved must be zero (WriteClassStg/
                                 //          GetClassFile uses root directory class id)
    USHORT _uMinorVersion;       // [18H,02] minor version of the format: 33 is
                                 //          written by reference implementation
    USHORT _uDllVersion;         // [1AH,02] major version of the dll/format: 3 for
                                 //          512-byte sectors, 4 for 4 KB sectors
    USHORT _uByteOrder;          // [1CH,02] 0xFFFE: indicates Intel byte-ordering
    USHORT _uSectorShift;        // [1EH,02] size of sectors in power-of-two;
                                 //          typically 9 indicating 512-byte sectors
    USHORT _uMiniSectorShift;    // [20H,02] size of mini-sectors in power-of-two;
                                 //          typically 6 indicating 64-byte mini-sectors
    USHORT _usReserved;          // [22H,02] reserved, must be zero
    ULONG _ulReserved1;          // [24H,04] reserved, must be zero
    FSINDEX _csectDir;           // [28H,04] must be zero for 512-byte sectors,
                                 //          number of SECTs in directory chain for 4 KB
                                 //          sectors
    FSINDEX _csectFat;           // [2CH,04] number of SECTs in the FAT chain
    SECT _sectDirStart;          // [30H,04] first SECT in the directory chain
    DFSIGNATURE _signature;      // [34H,04] signature used for transactions; must
                                 //          be zero. The reference implementation
                                 //          does not support transactions
    ULONG _ulMiniSectorCutoff;   // [38H,04] maximum size for a mini stream;
                                 //          typically 4096 bytes
    SECT _sectMiniFatStart;      // [3CH,04] first SECT in the MiniFAT chain
    FSINDEX _csectMiniFat;       // [40H,04] number of SECTs in the MiniFAT chain
    SECT _sectDifStart;          // [44H,04] first SECT in the DIFAT chain
    FSINDEX _csectDif;           // [48H,04] number of SECTs in the DIFAT chain
    SECT _sectFat[109];          // [4CH,436] the SECTs of first 109 FAT sectors
};
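As an illustration of how these offsets might be used, the following sketch validates the signature and byte-order mark and decodes a few header fields from a raw 512-byte buffer. The names `cfb_header_info`, `rd16`, `rd32` and `cfb_parse_header` are hypothetical and not part of any specification. The sketch reads fields byte by byte at the documented offsets instead of casting the buffer to the structure above, because `unsigned long` is 8 bytes on many 64-bit platforms and the in-memory layout would then not match the file.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical subset of header fields, decoded into host byte order. */
struct cfb_header_info {
    uint16_t minor_version;     /* [18H] */
    uint16_t major_version;     /* [1AH] 3 or 4 */
    uint16_t sector_shift;      /* [1EH] typically 9 */
    uint16_t mini_sector_shift; /* [20H] typically 6 */
    uint32_t fat_sector_count;  /* [2CH] */
    uint32_t dir_start;         /* [30H] */
    uint32_t mini_cutoff;       /* [38H] typically 4096 */
};

/* Little-endian readers, independent of host byte order. */
static uint16_t rd16(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 0 on success, -1 if the signature or byte-order mark is wrong. */
static int cfb_parse_header(const unsigned char buf[512],
                            struct cfb_header_info *out)
{
    static const unsigned char sig[8] =
        {0xd0, 0xcf, 0x11, 0xe0, 0xa1, 0xb1, 0x1a, 0xe1};

    if (memcmp(buf, sig, sizeof sig) != 0)
        return -1;
    if (rd16(buf + 0x1C) != 0xFFFE)
        return -1;

    out->minor_version     = rd16(buf + 0x18);
    out->major_version     = rd16(buf + 0x1A);
    out->sector_shift      = rd16(buf + 0x1E);
    out->mini_sector_shift = rd16(buf + 0x20);
    out->fat_sector_count  = rd32(buf + 0x2C);
    out->dir_start         = rd32(buf + 0x30);
    out->mini_cutoff       = rd32(buf + 0x38);
    return 0;
}
```

A caller would typically read the first 512 bytes of the file into a buffer, pass it to `cfb_parse_header`, and reject the file if the function returns -1.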
When taken together as a single stream, the collection of FAT sectors defines the status and linkage of every sector in the file. Each entry in the FAT is 4 bytes in length and contains the sector number of the next sector in a FAT chain or one of the following special values:
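Following a chain through the FAT can be sketched as below. The constant names and values (`FREESECT`, `ENDOFCHAIN`, `FATSECT`, `DIFSECT`, here prefixed with `CFB_`) are taken from Microsoft's published MS-CFB specification rather than from this text; the function `cfb_follow_chain` is hypothetical, and it assumes the FAT sectors have already been read and concatenated into one in-memory array of host-order 32-bit entries.

```c
#include <stddef.h>
#include <stdint.h>

/* Special FAT entry values (per the MS-CFB specification). */
#define CFB_DIFSECT    0xFFFFFFFCu  /* sector holds part of the DIFAT */
#define CFB_FATSECT    0xFFFFFFFDu  /* sector holds part of the FAT   */
#define CFB_ENDOFCHAIN 0xFFFFFFFEu  /* last sector of a chain         */
#define CFB_FREESECT   0xFFFFFFFFu  /* unallocated sector             */

/* Walks the chain that begins at sector `start`, storing up to `max`
 * sector numbers in `out`. Returns the number of sectors in the chain,
 * or -1 if the chain is malformed: an entry points outside the FAT
 * (which includes hitting a special value other than ENDOFCHAIN), or
 * the chain is longer than the FAT itself, which implies a loop. */
static long cfb_follow_chain(const uint32_t *fat, size_t fat_len,
                             uint32_t start, uint32_t *out, size_t max)
{
    size_t n = 0;
    uint32_t s = start;

    while (s != CFB_ENDOFCHAIN) {
        if (s >= fat_len || n >= fat_len)
            return -1;
        if (n < max)
            out[n] = s;
        n++;
        s = fat[s];  /* each entry names the next sector in the chain */
    }
    return (long)n;
}
```

The starting sector passed to such a function would normally come from the header (for the directory or MiniFAT chains) or from a directory entry (for a stream).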
The Range Lock Sector must exist in files greater than 2 GB in size, and must not exist in files smaller than 2 GB. The Range Lock Sector must contain the byte range 0x7FFFFF00 to 0x7FFFFFFF in the file. This area is reserved by Microsoft's COM implementation for storing byte-range locking information for concurrent access.
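The sector number that covers this byte range can be derived from the sector size, as in the sketch below. The function names are hypothetical, and the arithmetic assumes that sector 0 begins one sector size into the file and that sectors are at least 256 bytes long, so that the whole reserved range falls inside a single sector.

```c
#include <stdint.h>

#define CFB_RANGE_LOCK_FIRST 0x7FFFFF00u  /* first reserved byte offset */
#define CFB_RANGE_LOCK_LAST  0x7FFFFFFFu  /* last reserved byte offset  */

/* Sector number whose bytes include the reserved range, assuming sector n
 * starts at file offset (n + 1) << sector_shift. */
static uint32_t cfb_range_lock_sector(uint16_t sector_shift)
{
    return (CFB_RANGE_LOCK_FIRST >> sector_shift) - 1;
}

/* Nonzero if sector `sect` spans the entire reserved range. */
static int cfb_sector_covers_range_lock(uint32_t sect, uint16_t sector_shift)
{
    uint64_t first = ((uint64_t)sect + 1) << sector_shift;
    uint64_t last  = first + ((uint64_t)1 << sector_shift) - 1;

    return first <= CFB_RANGE_LOCK_FIRST && last >= CFB_RANGE_LOCK_LAST;
}
```

Under these assumptions, a file with 4096-byte sectors would place the range in the sector that starts at file offset 0x7FFFF000.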