PPML

PPML (Personalized Print Markup Language) is an XML-based industry-standard printer language for variable data printing, defined by PODi. PODi is an industry-wide consortium that was initially formed by 13 companies to create PPML and now has more than 400 member companies.

Overview

PPML is an open, interoperable, device-independent standard, first released in 2000 to enable the widespread use of personalized print applications. It is designed for efficient production printing of variable data: rather than sending 300 copies of the same data with only a name changed, PPML allows all the data to be sent to the printer at once. This makes printing much faster, because the data does not need to be transferred to the printer again for each copy.

High-volume print jobs are getting more complex because of growing demands on the layout, content and personalization of documents. This is particularly true in the case of "image-swapping", where different images are selected and replaced on a record-by-record basis. At the same time, pressure on the machine operators is increasing. A third development is the rise of XML as a neutral basis for multi-channel communication of documents to fax, internet, e-mail, electronic archive and printer.

Personalized Print Markup Language (PPML) is the print industry's answer to these developments. PPML strongly reduces the complexity of the print job, especially when color, images and personalized elements are used. As a result, raster image processing (RIP), the step that converts the description of a page into a rasterized image, is a lot faster.

The Printing On Demand Initiative (PODi) is responsible for the concept and development of the PPML standard. This platform brings together all major suppliers in this market, with the initial development completed by Adobe Systems, EFI, CreoScitex, Hewlett-Packard, Kodak Nexpress, Xerox, IBM, Lexmark, Océ, XMPie, PageFlex, Printable, QuarkXPress, Kodak GCG Inkjet Printing Systems, and Xeikon working together as members of PODi.

Reusable Content

Traditional printer languages retrieve a page, examine what is on it, and create rasterized images that tell the printer what goes where on the paper. This is repeated for every single page. High-volume print jobs easily contain tens of thousands of pages that all have to be RIPped. RIPping becomes a problem when a single page with a color photo and a logo can reach a size of as much as 20 MB in PostScript. This takes an exceptional amount of processing power and memory and is the most important cause of print processes stalling. It is why rated engine speeds are often not met and why machines may be RIPping all night to be able to produce at a reasonable speed during the day.

This bottleneck in printing can be solved by specifying reusable content. Reusable content items are things that are used on many of the pages. Reusable content can be fonts (letter types), logos (in all sorts of formats), signatures (for policies), diagrams (research results), images (advertising) and the like. An object that is reusable is often called a resource. PPML was designed to make this reuse of resources explicit and allows the printer to know which resources are needed at a particular point in the job. This allows a resource to be rasterized once and used many times instead of being rasterized on every page on which it is used.
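The rasterize-once idea can be illustrated with a minimal Python sketch. It is not PPML syntax or any real RIP API: the class `ReusableRasterCache`, the stand-in `fake_rasterize` function, and the resource names are all hypothetical, chosen only to show how a named resource is processed a single time and then reused across pages.

```python
from typing import Callable, Dict, List


class ReusableRasterCache:
    """Rasterizes each named resource at most once and reuses the result."""

    def __init__(self, rasterize: Callable[[str], bytes]) -> None:
        self._rasterize = rasterize
        self._cache: Dict[str, bytes] = {}
        self.rasterize_calls = 0  # counts how often the expensive step runs

    def get(self, resource_name: str) -> bytes:
        if resource_name not in self._cache:
            self.rasterize_calls += 1
            self._cache[resource_name] = self._rasterize(resource_name)
        return self._cache[resource_name]


def fake_rasterize(resource_name: str) -> bytes:
    # Hypothetical stand-in for an expensive RIP step.
    return f"raster:{resource_name}".encode("utf-8")


def render_job(pages: List[List[str]], cache: ReusableRasterCache) -> List[List[bytes]]:
    # Each page is a list of resource names; shared names hit the cache.
    return [[cache.get(name) for name in page] for page in pages]


cache = ReusableRasterCache(fake_rasterize)
# 300 pages share a logo and a signature; only the name differs per page.
pages = [["logo", "signature", f"name-{i}"] for i in range(300)]
rendered = render_job(pages, cache)
print(cache.rasterize_calls)
```

In this sketch the job places 900 objects on 300 pages, but the expensive step runs only 302 times: once for the logo, once for the signature, and once for each of the 300 names.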

Resource Management

Reuse of resources solves only part of the problem. Ensuring that all the required resources are available on the printer is another big problem. PPML solves this by allowing references to resources via URLs (uniform resource locators). The printer can retrieve a resource via its URL if it does not have that particular resource yet. This eliminates the need to send all the needed resources along with the print job: the printer simply retrieves the resources it needs on the fly. If it already has a resource in its cache, it does not need to retrieve it again. This works in the same way as a browser that gains speed by loading (parts of) a webpage from its cache.

Not including resources in a print job leads to the potential problem of version control. PPML solves this problem by allowing the producer of the print job to specify a checksum for each resource that is referenced. A checksum is a large number that is calculated from the contents of a resource. By comparing a given checksum against the checksum of the resource in the cache the printer can check that it has the correct version of the resource.
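A minimal Python sketch of a checksum-validated resource cache follows. It makes several assumptions that do not come from the PPML specification: SHA-256 is used as the checksum (the algorithm is not named here), the class `VerifiedResourceCache` and its `get` method are hypothetical, and an in-memory dictionary stands in for the remote server so that the example runs without network access.

```python
import hashlib
from typing import Callable, Dict


def checksum(data: bytes) -> str:
    """Return a hex digest computed from the resource contents (SHA-256 assumed)."""
    return hashlib.sha256(data).hexdigest()


class VerifiedResourceCache:
    """Caches resources by URL and re-fetches when the checksum does not match."""

    def __init__(self, fetch: Callable[[str], bytes]) -> None:
        self._fetch = fetch
        self._store: Dict[str, bytes] = {}
        self.fetch_count = 0

    def get(self, url: str, expected: str) -> bytes:
        cached = self._store.get(url)
        if cached is not None and checksum(cached) == expected:
            return cached  # correct version is already in the cache
        # Cache miss or stale version: retrieve the resource via its URL.
        self.fetch_count += 1
        data = self._fetch(url)
        if checksum(data) != expected:
            raise ValueError(f"checksum mismatch for {url}")
        self._store[url] = data
        return data


# Hypothetical URL and contents; a dict stands in for the remote server.
LOGO_URL = "https://example.com/assets/logo.tif"
remote = {LOGO_URL: b"logo version 1"}
printer_cache = VerifiedResourceCache(lambda url: remote[url])

v1 = checksum(b"logo version 1")
printer_cache.get(LOGO_URL, v1)  # fetched from the "server"
printer_cache.get(LOGO_URL, v1)  # served from the cache
fetches_after_v1 = printer_cache.fetch_count

# The logo changes; a later job references the new checksum.
remote[LOGO_URL] = b"logo version 2"
v2 = checksum(b"logo version 2")
latest = printer_cache.get(LOGO_URL, v2)  # stale copy detected, fetched again
```

The second request for the first version is answered from the cache, while the job that carries the new checksum forces a fresh retrieval, so an outdated logo is never printed silently.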

Multiple format resources

The print industry already has many formats to describe images, fonts and pages. Instead of defining new PPML-specific formats for resources, the choice was made to allow any existing format to be used directly. Therefore, PPML only describes how existing resources are combined to create pages, documents and jobs. This description uses XML to avoid inventing yet another format.

Although this approach makes PPML very easy to generate, it complicates the task of the PPML RIP (also known as the consumer). Not every consumer will implement every possible resource format. To ensure compatibility, the Graphic Arts Conformance subset was defined.

Graphic Arts Conformance

The Graphic Arts Conformance level (PPML/GA) defines a level of PPML for increased interoperability. It requires a conformant PPML consumer to support PostScript, PDF, TIFF and JPEG resources, and to process these files in a standardized manner. A PPML dataset that conforms to PPML/GA can then be printed on any PPML/GA-conformant consumer device, regardless of which producer generated it. Conformance of a PPML/GA dataset can be validated with the CheckPPML tool (which also acts as a viewer).

Archiving

An electronic archive can store PPML documents efficiently. Each individual data element only needs to be stored once. The rest of the PPML-based archive consists mainly of structure descriptions. This is very different from an electronic archive based on TIFF or PDF, in which every document contains all the page elements and the company logo may have been stored a million times. The same applies to the standard closing of a letter, the standard terms of payment, or the standard policy conditions: millions of copies may be stored. Each resource is probably no larger than a few KB, but with multiple copies the total size increases quickly, especially now that color images have entered electronic company communication.
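The storage effect can be sketched in Python with a small deduplicating archive. This is only an illustration of the store-once principle, not a description of how any real PPML archive is built: the `DedupArchive` class, the use of SHA-256 digests as element keys, and the sample sizes are all assumptions.

```python
import hashlib
from typing import Dict, List


class DedupArchive:
    """Stores each distinct data element once; documents hold references."""

    def __init__(self) -> None:
        self._elements: Dict[str, bytes] = {}        # digest -> element data
        self._documents: Dict[str, List[str]] = {}   # document id -> digests

    def add_document(self, doc_id: str, elements: List[bytes]) -> None:
        refs = []
        for data in elements:
            digest = hashlib.sha256(data).hexdigest()
            self._elements.setdefault(digest, data)  # stored only if new
            refs.append(digest)
        self._documents[doc_id] = refs

    def get_document(self, doc_id: str) -> List[bytes]:
        return [self._elements[d] for d in self._documents[doc_id]]

    def stored_bytes(self) -> int:
        return sum(len(data) for data in self._elements.values())


logo = b"L" * 4096  # hypothetical 4 KB logo shared by every letter
archive = DedupArchive()
naive_bytes = 0     # what a store-everything-per-document archive would hold
for i in range(1000):
    body = f"Dear customer {i}".encode("utf-8")
    archive.add_document(f"doc-{i}", [logo, body])
    naive_bytes += len(logo) + len(body)

print(archive.stored_bytes(), naive_bytes)
```

The logo contributes 4096 bytes to the deduplicating archive in total, whereas the per-document approach stores it 1000 times. The reference lists that make up each document are the "structure descriptions" in this simplified model, and their size is not counted by `stored_bytes`.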

Viewer

Viewing PPML documents requires special software. For instance, if someone wants to retrieve a document from a PPML archive, the document has to be converted to an image by a PPML RIP (just as a PPML printer would do). This "as printed" image is then shown on screen by the PPML viewer software.

Several such viewers exist, including ones from EFI, Hewlett-Packard, Xeikon, and Edmond R&D. PODi also provides a viewer, which is widely accepted as the reference implementation for testing PPML output. "CheckPPML" (the PODi viewer) is a virtual PPML consumer that provides error-checking and PDF output in addition to viewing. A version of CheckPPML that checks and verifies conformance for up to 100 pages is freely available. [1] (The paid version supports unlimited pages.)

Printers

Xeikon was the first hardware supplier whose printers could print with PPML. IBM (now InfoPrint Solutions Company) then included PPML support in the controller software for its printers (InfoPrint Manager), allowing an enormous installed base of IPDS printers to process PPML data streams.

Today, production printers from many manufacturers support printing of PPML documents.


References