Open Packaging Conventions

Open Packaging Conventions (OPC)
Abbreviation: OPC
Native name: Office Open XML File Formats - Open Packaging Conventions
Status: Published
First published: December 7, 2006
Latest version: ISO/IEC 29500-2:2021 (August 2021)
Organization: Microsoft, Ecma, ISO/IEC
Base standards: ECMA-376, ISO/IEC 29500-2
Related standards: XML, ZIP
Domain: Electronic documents
Website: ECMA-376, ISO/IEC 29500-2:2012

The Open Packaging Conventions (OPC) is a container-file technology initially created by Microsoft to store a combination of XML and non-XML files that together form a single entity, such as an Open XML Paper Specification (OpenXPS) document. OPC-based file formats keep the independent file entities embedded in the document intact while, because the container is ZIP-compressed, typically producing much smaller files than the same XML stored uncompressed.

Specifications

The OPC is specified in Part 2 of the Office Open XML standards ISO/IEC 29500:2008 and ECMA-376. [1] [2]

The ISO/IEC 29500-2:2008 specification and the second edition of ECMA-376 make a normative reference to PKWARE, Inc.'s ".ZIP File Format Specification" version 6.2.0 (2004) and supplement it with a normative set of clarifications. The older first edition of ECMA-376 makes an informative (i.e., non-normative) reference to the newer version 6.2.1 (2005) of the same PKWARE specification. [1] The ZIP format is not specified by any international standard but has widespread community and developer acceptance.

Microsoft submitted a draft to the Internet Engineering Task Force in 2006 for a "pack" URI scheme (pack://) to be used for URI references to OPC-based packages. The draft expired in 2009; the syntax it specified is incompatible with the Internet Standard for URI schemes (STD 66, RFC 3986). [3] The scheme is now listed as historical. [4]

ISO 19165-1:2018 recommends the use of the Open Packaging Conventions to implement the Geospatial Package defined in the Open Archival Information System.

Usage

Both the XML Paper Specification (XPS) [5] and Office Open XML (OOXML) use Open Packaging Conventions (OPC), which provide a profile of the common ZIP format. In addition to data and document content in XML markup, files in the ZIP package can include other text and binary files in formats such as PNG, BMP, AVI, PDF, RTF, or even an already packaged ODF file. OPC also defines some naming conventions and an indirection method to allow position independence of binary and XML files in the ZIP archive.

OPC files can be opened using common ZIP utilities. OPC allows indirection, chunking and relative indirection. [6]

File formats using the OPC

The OPC is the foundation technology for many new file formats: [7]

| File format | Filename extension | Content | Standard |
| 3MF Consortium 3D Manufacturing Format (3MF) file format [8] | .3mf | CAD design data for additive manufacturing (3D printing) | |
| Autodesk AutoCAD Design Web Format (DWFX) file format [9] | .dwfx | CAD design data (2D/3D computer graphics and technical drawings) | |
| AutomationML container format | .amlx | Plant engineering information | |
| Circuit Diagram Document [10] | .cddx | Circuit diagram containing layout, connections and embedded components | |
| Family.Show file format [11] | .familyx | Genealogical family data, stories, and photos | |
| Field Device Integration FDI Packages [12] [13] | .fdix | Field Device Integration information | IEC 62769-4:2015 |
| Microsoft Application Virtualization file format | .appv | Portable application | |
| Microsoft Power BI report file format | .pbix | Data and information visualization report file | |
| Microsoft Power BI template file format | .pbit | Data and information visualization template file | |
| Microsoft Semblio file format | .semblio | Interactive learning material, such as e-books containing images, audio, and video | |
| Microsoft Visual Studio 2010+ Extensions file format | .vsix | Integrated development environment extension | |
| Microsoft Visio 2013 drawing file format | .vsdx | Replaces .vsd (Visio binary file) and .vdx (Visio XML Drawing) formats used in earlier versions [14] | |
| Microsoft Windows 8, Windows 8.1 and Windows Phone 8.1 App Package [15] | .appx | Software package for applications listed on Microsoft's Windows Store and Windows Phone Store [16] | |
| Microsoft Windows 8.1 and Windows Phone 8.1 App Bundle [17] | .appxbundle | Software package that bundles hardware platforms, languages, and resources for an application listed on Microsoft's Windows Store and Windows Phone Store | |
| Microsoft Windows Azure C# Package | .cspkg | Cloud platform data | |
| Microsoft XML Paper Specification | .xps | Fixed document for document exchange | |
| MiraMon open compressed map | .mmzx | Geographic information (geospatial raster graphics, vector graphics and tabular data, symbolization and metadata in files, links to geoservices, etc.) | ISO 19165-1:2018 |
| NuGet Package | .nupkg | Software package for a package management system | |
| Office Open XML Document | .docx | Word processing document | ECMA-376, ISO/IEC 29500:2008 |
| Office Open XML Presentation | .pptx | Presentation file | ECMA-376, ISO/IEC 29500:2008 |
| Office Open XML Workbook | .xlsx | Spreadsheet workbook | ECMA-376, ISO/IEC 29500:2008 |
| Open XML Paper Specification | .oxps | Fixed document for document exchange | ECMA-388 |
| Platform Industrie 4.0 - Asset Administration Shell [18] | .aasx | Package file format for Asset Administration Shells (AAS) | |
| Siemens Digital Industries Software file format | .jtx | | |
| MathWorks Simulink model file | .slx | Dynamic system specification for model-based design | |
| SMPTE Media Package | .smpk | Storage format for distribution and playback of multimedia video and audio files | SMPTE ST 2053-2011 |
| SpaceClaim 3D solid model file [19] | .scdoc | Embedded 3D CAD data files include Standard ACIS Binary (SAB) solid model files | |
| Microsoft XAML Package | | Not a specification. Function supported by .NET Framework only for saving WPF FlowDocument with images [20] | |

Programming

OPC has been natively supported since Microsoft .NET Framework 3.0 through the System.IO.Packaging namespace. Open-source libraries exist for other languages.

Since Windows 7, OPC is also natively supported in the Windows API through a set of COM interfaces, collectively referred to as the Packaging API.

Alternatively, ZIP libraries can be used to create and open OPC files, as long as the correct files are included in the ZIP and the conventions followed.
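As an illustration of the ZIP-library approach, the following minimal sketch uses Python's standard zipfile module to write a small package in memory and read its entries back. The part name "content/hello.txt", the relationship type URI under example.com, and both function names are hypothetical and chosen only for this example; a real format would define its own parts and relationship types, and the sketch does not check every rule of the specification.

```python
import io
import zipfile

# Content types stream: maps file extensions to MIME media types.
CONTENT_TYPES = """<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">
  <Default Extension="rels" ContentType="application/vnd.openxmlformats-package.relationships+xml"/>
  <Default Extension="txt" ContentType="text/plain"/>
</Types>
"""

# Package-level relationships; the Type URI is a made-up placeholder.
ROOT_RELS = """<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">
  <Relationship Id="R0" Type="http://example.com/relationships/main" Target="/content/hello.txt"/>
</Relationships>
"""


def build_minimal_package() -> bytes:
    """Write the three entries of a tiny OPC-style package into an in-memory ZIP."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as package:
        package.writestr("[Content_Types].xml", CONTENT_TYPES)
        package.writestr("_rels/.rels", ROOT_RELS)
        package.writestr("content/hello.txt", "Hello, OPC")
    return buffer.getvalue()


def list_zip_entries(data: bytes) -> list:
    """Return the ZIP entry names of a package, in the order they were stored."""
    with zipfile.ZipFile(io.BytesIO(data)) as package:
        return package.namelist()


if __name__ == "__main__":
    for name in list_zip_entries(build_minimal_package()):
        print(name)
```

Because the result is an ordinary ZIP archive, the same bytes can also be inspected with any common ZIP utility.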

Package, parts, and relationships

[Figure: Container structure of Part 2 of the Ecma Office Open XML standard, ECMA-376]

In OPC terminology, the term package corresponds to a ZIP archive and the term part corresponds to a file stored within the ZIP. Every part in a package has a unique URI-compliant part name along with a specified content-type expressed in the form of a MIME media type. A part's content-type explicitly defines the type of data stored in the part and reduces duplication and ambiguity issues inherent with file extensions.

OPC packages can also include relationships that define associations between the package, parts, and external resources. In addition to a hierarchy of directories and parts, OPC packages commonly use relationships to access content through a directed graph of relationship associations. Relationships are composed of four elements:

  • an identifier (ID)
  • an optional source (the package or a part within the package)
  • a relationship type (a URI-style expression that defines the type of the relationship)
  • a target (a URI to another part within the package or to an external resource)

OPC packages can store parts that contain any type of data (text, images, XML, or arbitrary binary content). The extension ".rels", however, is reserved for storing relationship metadata within "_rels" subfolders. The subfolder name "_rels", the file extension ".rels" within such a folder, and the filename "[Content_Types].xml" in any folder are the only three reserved names for files stored in an OPC package.

/[Content_Types].xml file
This file defines the MIME media types for all the parts stored in the package. The "/[Content_Types].xml" file defines default mappings based on file extensions, along with overrides for specific parts with content-types that are different from the file extension defaults. For example, one of these defined MIME types is:
<Default Extension="rels" ContentType="application/vnd.openxmlformats-package.relationships+xml"/>
/_rels
The root-level "/_rels" folder stores the relationships for the package as a whole. It normally contains a file named ".rels"; "/_rels/.rels" is an XML file that stores the starting package-level relationships. When opening an OPC-based file, applications normally start by reading "/_rels/.rels".
[partname].rels
Each part may have its own relationships, which are stored in the "_rels" folder that is a sibling of that part. If the part has relationships, this folder contains a file named after the part with ".rels" appended. For example, a hypothetical part named "/Documents/page.xml" that has relationships would have them stored in "/Documents/_rels/page.xml.rels".
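The lookup described for "/[Content_Types].xml" (a part-specific override first, then the default for the file extension) can be sketched as follows. This is an illustrative reading of the text above, not a complete implementation: the function name, the sample override for "/docs/main.xml" and its media type are hypothetical, and the case-insensitive comparison is an assumption made here for robustness.

```python
import xml.etree.ElementTree as ET
from typing import Optional

CT_NS = "{http://schemas.openxmlformats.org/package/2006/content-types}"

# Sample content types stream; the Override entry is a made-up example.
SAMPLE_TYPES = """<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">
  <Default Extension="rels" ContentType="application/vnd.openxmlformats-package.relationships+xml"/>
  <Default Extension="xml" ContentType="application/xml"/>
  <Override PartName="/docs/main.xml" ContentType="application/vnd.example.main+xml"/>
</Types>"""


def content_type_for(part_name: str, content_types_xml: str) -> Optional[str]:
    """Return the media type of a part, or None if no mapping applies."""
    root = ET.fromstring(content_types_xml)

    # 1. An Override for this exact part name takes precedence.
    #    (Comparison is case-insensitive here by assumption.)
    for override in root.findall(f"{CT_NS}Override"):
        if override.get("PartName", "").lower() == part_name.lower():
            return override.get("ContentType")

    # 2. Otherwise fall back to the Default for the file extension.
    #    The extension is the text after the last dot of the last segment,
    #    so "/_rels/.rels" yields "rels".
    last_segment = part_name.rsplit("/", 1)[-1]
    extension = last_segment.rpartition(".")[2].lower() if "." in last_segment else ""
    for default in root.findall(f"{CT_NS}Default"):
        if default.get("Extension", "").lower() == extension:
            return default.get("ContentType")
    return None


if __name__ == "__main__":
    print(content_type_for("/docs/main.xml", SAMPLE_TYPES))
    print(content_type_for("/_rels/.rels", SAMPLE_TYPES))
```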

All relationships (including the relationships associated with the package itself) are represented as XML files. Opening a ".rels" file in a text editor shows the XML markup that defines all the relationships whose source is that part. A typical relationships file contains XML code like this:

<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">
  <Relationship Id="R0" Type="http://schemas.microsoft.com/xps/2005/06/fixedrepresentation" Target="/FixedDocumentSequence.fdseq"/>
  <Relationship Id="R1" Type="http://schemas.openxmlformats.org/package/2006/relationships/metadata/thumbnail" Target="/Documents/1/Metadata/Page1_Thumbnail.JPG"/>
</Relationships>

which defines two relationships for the package as a whole. The first identifies the root part of the document (here the fixed-document sequence of an early Microsoft XPS document, before the format was standardized as Open XML Paper Specification within the openxmlformats collection). The second references an alternate form of the document (here a rendered thumbnail image of its first page).
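A relationships file of this kind can be read with any XML parser. The sketch below, which assumes Python's standard xml.etree.ElementTree module, turns the example above into a list of records. The Relationship class and the parse_relationships function are names invented for this illustration. The TargetMode attribute does not appear in the example; the sketch treats it as an optional marker for external targets and defaults it to "Internal".

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import List

RELS_NS = "{http://schemas.openxmlformats.org/package/2006/relationships}"

# The relationships file shown in the text above.
SAMPLE_RELS = """<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">
  <Relationship Id="R0" Type="http://schemas.microsoft.com/xps/2005/06/fixedrepresentation" Target="/FixedDocumentSequence.fdseq"/>
  <Relationship Id="R1" Type="http://schemas.openxmlformats.org/package/2006/relationships/metadata/thumbnail" Target="/Documents/1/Metadata/Page1_Thumbnail.JPG"/>
</Relationships>"""


@dataclass(frozen=True)
class Relationship:
    """One entry of a .rels file; the source is implied by the file's location."""
    id: str
    type: str
    target: str
    external: bool


def parse_relationships(rels_xml: str) -> List[Relationship]:
    """Parse the Relationship elements of a .rels document, in document order."""
    root = ET.fromstring(rels_xml)
    return [
        Relationship(
            id=element.get("Id", ""),
            type=element.get("Type", ""),
            target=element.get("Target", ""),
            # A missing TargetMode is treated as an internal target.
            external=element.get("TargetMode", "Internal") == "External",
        )
        for element in root.findall(f"{RELS_NS}Relationship")
    ]


if __name__ == "__main__":
    for relationship in parse_relationships(SAMPLE_RELS):
        print(relationship.id, relationship.target)
```

An application would typically select the relationship it needs by its Type value rather than by its Id or by a fixed path.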

The main parts of the embedded documents are often stored in a folder named "/Document" (which may itself contain subdirectories if the file holds several related documents, each with various parts), and optional metadata parts that are not needed to process the main parts are often stored in a folder named "/Metadata". These two folder names are not required, however: the actual locations are specified in the XML data of the "[partname].rels" relationship files, and the OPC specification allows any folder organisation that is convenient for the application.

Chunking

OPC encourages documents to be split into small chunks. This limits the effect of file corruption [21] and improves data access: for example, all the style information can be kept in one XML part, and each worksheet or table in its own part. Clients can then access data faster and create fewer objects, and it becomes easier for multiple processes to work on the same document.

Relative indirection

In the Open Packaging Conventions, each part that contains references has its own ".rels" file listing its indirections. In some cases this makes it easier to cut and paste information together with all its associated resources, and it provides name scoping that removes the chance of name clashes between files.
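The naming convention for per-part relationship files, and the way a relative target is resolved against the part that references it, can be sketched with a few lines of path arithmetic. Both function names and all part names used here are hypothetical, "/" is used by assumption to denote the package itself, and the sketch ignores details such as percent-encoding and external targets.

```python
import posixpath


def rels_part_name(part_name: str) -> str:
    """Return where the relationships of a part are stored.

    "/" stands for the package itself, whose relationships live in "/_rels/.rels".
    """
    if part_name == "/":
        return "/_rels/.rels"
    folder, filename = posixpath.split(part_name)
    # Sibling "_rels" folder, part file name with ".rels" appended.
    return posixpath.join(folder, "_rels", filename + ".rels")


def resolve_target(source_part: str, target: str) -> str:
    """Resolve an internal relationship target to an absolute part name."""
    if target.startswith("/"):
        # Already absolute within the package.
        return posixpath.normpath(target)
    # Relative targets are resolved against the folder of the source part.
    base = "/" if source_part == "/" else posixpath.dirname(source_part)
    return posixpath.normpath(posixpath.join(base, target))


if __name__ == "__main__":
    print(rels_part_name("/Documents/1/page.xml"))
    print(resolve_target("/Documents/1/page.xml", "Resources/image.png"))
    print(resolve_target("/Documents/1/page.xml", "../shared/font.ttf"))
```

Because targets are relative to the referencing part, a folder containing a part, its "_rels" folder and its resources can be moved within the package without rewriting those targets, which is the name scoping benefit described above.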

References

  1. ISO/IEC 29500-2:2008 - Information technology -- Document description and processing languages -- Office Open XML File Formats -- Part 2: Open Packaging Conventions, ISO
  2. Ecma International TC45 (December 2006). "Standard ECMA-376 Office Open XML File Formats". Ecma International. Retrieved 2007-04-04.
  3. "pack Status: historical". IANA. 2011-10-04. Retrieved 2013-05-12.
  4. "Uniform Resource Identifier (URI) Schemes". Protocol Registries. IANA. Retrieved 2013-05-12.
  5. XPS team (2006-09-01). "Open Packaging Conventions & Open XML Markup Compatibility". XPS team blog. Retrieved 2007-04-04.
  6. Rick Jeliffe (2007-07-29). "Comment on Can a file be ODF and Open XML at the same time?". O'Reilly net XML blogs.
  7. Adventures in Packaging - Episode 1, May 18, 2009, by Jack Davis, Microsoft Packaging Team Blog: Open Packaging Conventions
  8. "Archived copy" (PDF). Archived from the original (PDF) on 2016-08-07. Retrieved 2016-05-26.
  9. "What's AutoCAD DWF file | DWG to DGN". Archived from the original on 2014-09-03. Retrieved 2014-08-30.
  10. "CDDX File Format - Circuit Diagram". www.circuit-diagram.org.
  11. "CodePlex Archive". CodePlex Archive.
  12. "Technology - FDI-Cooperation". www.fdi-cooperation.com. Archived from the original on 2014-09-19.
  13. "IEC 62769-4:2015 | IEC Webstore". webstore.iec.ch.
  14. "Developer tools, technical documentation and coding examples".
  15. "App packages and deployment (Windows Runtime apps) - Windows app development". 6 October 2015.
  16. Warren, Tom (February 11, 2014). "Windows Phone 8.1 includes universal apps and lots of feature updates". The Verge.
  17. "Content Moved (Windows)". Archived from the original on 2016-10-18. Retrieved 2015-01-26.
  18. Details of the Asset Administration Shell - Part 1
  19. "SpaceClaim file format". 2013-05-06. Archived from the original on 2013-09-15. Retrieved 2017-08-14.
  20. "DataFormats.XamlPackage Field (System.Windows)".
  21. "Using OPC to Store Your Own Data: Page 3". www.devx.com.