Developer(s) | Apache Software Foundation |
---|---|
Stable release | 5.2.5 / November 25, 2023 [1] |
Repository | POI Repository |
Written in | Java |
Operating system | Cross-platform |
Type | API to access Microsoft Office formats |
License | Apache License 2.0 |
Website | poi |
Apache POI, a project run by the Apache Software Foundation and previously a sub-project of the Jakarta Project, provides pure Java libraries for reading and writing files in Microsoft Office formats, such as Word, PowerPoint and Excel.
The name was originally an acronym for "Poor Obfuscation Implementation", [2] referring humorously to the fact that the file formats seemed to be deliberately obfuscated, but poorly, since they were successfully reverse-engineered. This explanation, along with those of the similarly styled names of the various sub-projects, was removed from the official web pages in order to better market the tools to businesses that would not consider such humor appropriate. The original authors (Andrew C. Oliver and Marc Johnson) also noted the existence of the Hawaiian poi dish, made of mashed taro root, which had similarly derogatory connotations. [3]
POI has supported the ISO/IEC 29500:2008 Office Open XML file formats since version 3.5. A significant contribution to OOXML support came from Sourcesense, [4] an open source company that was commissioned by Microsoft to develop this contribution. [5] This link spurred controversy, with some POI contributors questioning whether Microsoft's Open Specification Promise patent license adequately protected POI's OOXML code. [6]
The Apache POI project contains the following subcomponents (meaning of acronyms is taken from old documentation):
The HSSF component is the most advanced feature of the library. [11] Other components (HPSF, HWPF, and HSLF) are usable, but less full-featured. [12] [13]
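As an illustration of the HSSF component, the following is a minimal sketch of writing a binary Excel (.xls) workbook to memory and reading a cell back. It assumes the `poi` artifact is on the classpath; the class name `HssfRoundTrip`, the method names, the sheet name and the cell values are hypothetical choices made for this example only, not part of the library.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;

import org.apache.poi.hssf.usermodel.HSSFWorkbook;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.ss.usermodel.Sheet;
import org.apache.poi.ss.usermodel.Workbook;

// Hypothetical example class; requires Apache POI on the classpath.
public class HssfRoundTrip {

    // Builds a one-row workbook in the binary .xls format and returns its bytes.
    static byte[] writeWorkbook() throws IOException {
        try (Workbook workbook = new HSSFWorkbook();
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            Sheet sheet = workbook.createSheet("Releases"); // sheet name is arbitrary
            Row row = sheet.createRow(0);
            row.createCell(0).setCellValue("5.2.5");        // string cell
            row.createCell(1).setCellValue(2023.0);         // numeric cell
            workbook.write(out);
            return out.toByteArray();
        }
    }

    // Parses .xls bytes and returns the string in the first cell of the first sheet.
    static String readFirstCell(byte[] xlsBytes) throws IOException {
        try (Workbook workbook = new HSSFWorkbook(new ByteArrayInputStream(xlsBytes))) {
            return workbook.getSheetAt(0).getRow(0).getCell(0).getStringCellValue();
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] xls = writeWorkbook();
        System.out.println(readFirstCell(xls));
    }
}
```

The sketch uses the format-neutral `Workbook`, `Sheet` and `Row` interfaces from `org.apache.poi.ss.usermodel` and only names `HSSFWorkbook` when constructing the workbook, so the same code could in principle target another spreadsheet implementation by changing the constructor calls.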
The POI library is also provided as a Ruby [14] or ColdFusion extension.
There are modules for Big Data platforms (e.g. Apache Hive, Apache Flink and Apache Spark) that provide certain functionality of Apache POI, such as the processing of Excel files. [15] [16]
Version number | Date of release |
---|---|
5.2.5 | 25 November 2023 |
5.2.4 | 29 September 2023 |
5.2.3 | 16 September 2022 |
5.2.2 | 19 March 2022 |
5.2.1 | 3 March 2022 |
5.2.0 | 14 January 2022 |
5.1.0 | 1 November 2021 |
5.0.0 | 20 January 2021 |
4.1.2 | 14 February 2020 |
4.1.1 | 20 October 2019 |
4.1.0 | 9 April 2019 |
4.0.0 | 7 September 2018 |
3.17 | 15 September 2017 |
3.16 | 19 April 2017 |
3.15 | 21 September 2016 |
3.14 | 2 March 2016 |
3.13 | 29 September 2015 |
3.12 | 11 May 2015 |
3.11 | 21 December 2014 |
3.10.1 | 18 August 2014 |
3.10 | 8 February 2014 |
3.9 | 3 December 2012 |
3.8 | 26 March 2012 |
3.7 | 29 October 2010 |
3.6 | 14 December 2009 |
3.5 | 28 September 2009 |
3.2 | 19 October 2008 |
3.1 | 29 June 2008 |
3.0.2 | 4 February 2008 |
3.0.1 | 5 July 2007 |
3.0 | 18 May 2007 |
2.5.1 | 29 February 2004 |
2.5 | 29 February 2004 |
2.0 | 26 January 2004 |
1.5.1 | 16 June 2002 |
1.5 | 6 May 2002 |
1.2.0 | 19 January 2002 |
1.1.0 | 4 January 2002 |
1.0.2 | 11 January 2002 |
1.0.1 | 4 January 2002 |
1.0.0 | 30 December 2001 |