ABBYY FineReader

FineReader PDF
Developer(s): ABBYY
Initial release: July 1993
Stable release: 16.0.13.4766 [1] / 10 November 2022
Operating system: Windows, macOS, Linux
Type: OCR
License: Commercial proprietary software (retail or volume licensing)
Website: pdf.abbyy.com

ABBYY FineReader PDF is an optical character recognition (OCR) application developed by ABBYY, [2] [3] with support for PDF file editing since v15. The program runs under Microsoft Windows 7 or later, and (without PDF editing) Apple macOS 10.12 Sierra or later. The first version was released in 1993. [2]

The program converts image documents (photos, scans, PDF files) and screen captures into editable file formats, including Microsoft Word, Microsoft Excel, Microsoft PowerPoint, Rich Text Format, HTML, PDF/A, searchable PDF, CSV and plain text (txt) files. [4] From version 11 onward, files can also be saved in the DjVu format. Version 15 supports recognition of text in 192 languages and has a built-in spell check for 48 of them.

FineReader's recognition can be extended in three ways: by training characters so that they are added to the recognition alphabet; by selecting additional characters from a list and adding them to the alphabet of a selected language (for example, adding certain Icelandic characters to the German alphabet for a German text describing Iceland); and by adding domain-specific vocabulary to FineReader's built-in lexicon. [5] The program also allows users to compare documents, add annotations and comments, and schedule batch processing. [6] [7]
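
The second of these mechanisms, adding extra characters to a language's alphabet, can be pictured as a set union. The sketch below is a toy model only: the `extend_alphabet` and `unrecognizable` helpers and both character sets are hypothetical illustrations, not FineReader's API or its actual alphabets.

```python
# Toy model of extending a recognition alphabet; this is NOT FineReader's API.
# Both character sets are illustrative assumptions, not the program's real data.

GERMAN_ALPHABET = set("abcdefghijklmnopqrstuvwxyzäöüß")
ICELANDIC_EXTRAS = set("ðþæáéíóúý")


def extend_alphabet(base: set[str], extras: set[str]) -> set[str]:
    """Return a new alphabet containing the base characters plus the extras."""
    return base | extras


def unrecognizable(text: str, alphabet: set[str]) -> set[str]:
    """Return the letters in `text` that fall outside the given alphabet."""
    return {ch for ch in text.lower() if ch.isalpha() and ch not in alphabet}


# A German sentence that mentions Icelandic place names.
sample = "Die Stadt Hafnarfjörður liegt nahe Þingvellir"

missing_before = unrecognizable(sample, GERMAN_ALPHABET)
extended = extend_alphabet(GERMAN_ALPHABET, ICELANDIC_EXTRAS)
missing_after = unrecognizable(sample, extended)

print(sorted(missing_before))
print(sorted(missing_after))
```

With the plain German set, the letters ð and þ fall outside the alphabet; after the union, every letter in the sample is covered.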

As of 2015, there were more than 20 million users of ABBYY FineReader worldwide. [8] [2] [9] ABBYY also licenses the optical character recognition technology behind FineReader to companies including Fujitsu, Panasonic, Xerox, Plustek and Samsung. [10] [11]

In February 2020, version 15 of the software was rated "Highest-quality OCR on the market" by PC Magazine. [12]

Related Research Articles

PDF: Portable Document Format, a digital file format

Portable Document Format (PDF), standardized as ISO 32000, is a file format developed by Adobe in 1992 to present documents, including text formatting and images, in a manner independent of application software, hardware, and operating systems. Based on the PostScript language, each PDF file encapsulates a complete description of a fixed-layout flat document, including the text, fonts, vector graphics, raster images and other information needed to display it. PDF has its roots in "The Camelot Project" initiated by Adobe co-founder John Warnock in 1991. PDF was standardized as ISO 32000 in 2008. The last edition as ISO 32000-2:2020 was published in December 2020.

Optical character recognition: Computer recognition of visual text

Optical character recognition or optical character reader (OCR) is the electronic or mechanical conversion of images of typed, handwritten or printed text into machine-encoded text, whether from a scanned document, a photo of a document, a scene photo or from subtitle text superimposed on an image.

Adobe Acrobat: Set of application software to view, edit and manage files in Portable Document Format (PDF)

Adobe Acrobat is a family of application software and Web services developed by Adobe Inc. to view, create, manipulate, print and manage Portable Document Format (PDF) files.

Microsoft OneNote: Free-form note-taking app for personal computers and smartphones

Microsoft OneNote is a note-taking application developed by Microsoft. It is available as part of the Microsoft 365 suite and since 2014 has been free on all platforms outside the suite. OneNote is designed for free-form information gathering and multi-user collaboration. It gathers users' notes, drawings, screen clippings, and audio commentaries. Notes can be shared with other OneNote users over the Internet or a network.

capella is a musical notation program or scorewriter developed by the German company Capella Software AG. It runs on Microsoft Windows, or under compatibility layers and emulators on other operating systems, such as Wine on Linux and similar tools on Apple Macintosh. capella must be activated after a 30-day trial period. The publisher writes the name in lower-case letters only. The program was initially created by Hartmut Ring, and is now maintained and developed by Bernd Jungmann.

Evernote is a note-taking and task-management application developed by the Evernote Corporation. It is intended for archiving and creating notes with embedded photos, audio, and saved web content. Notes are stored in virtual "notebooks" and can be tagged, annotated, edited, searched, and exported.

PaperPort is commercial document management software published by Kofax, used for working with scanned documents. It uses a built-in optical character recognition to create files in searchable Portable Document Format (PDF); text in these files is indexed and can be searched for with appropriate software, such as Microsoft's Windows Search. Earlier versions of PaperPort used OmniPage to provide this function. It provides image editing tools for these files.

Tesseract (software): Free optical character recognition engine

Tesseract is an optical character recognition engine for various operating systems. It is free software, released under the Apache License. Originally developed by Hewlett-Packard as proprietary software in the 1980s, it was released as open source in 2005 and development has been sponsored by Google since 2006.

CuneiForm Cognitive OpenOCR is a freely distributed open-source OCR system developed by Russian software company Cognitive Technologies.

OCR-A: Typeface designed for early computer OCR

OCR-A is a font issued in 1966 and first implemented in 1968. A special font was needed in the early days of computer optical character recognition, when there was a need for a font that could be recognized not only by the computers of that day, but also by humans. OCR-A uses simple, thick strokes to form recognizable characters. The font is monospaced (fixed-width), with the printer required to place glyphs 0.254 cm apart, and the reader required to accept any spacing between 0.2286 cm and 0.4572 cm.
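
The metric spacing figures above look arbitrary until they are converted to inches. The following is a small worked conversion, assuming only the standard definition of 2.54 cm per inch; the helper name `cm_to_inches` is introduced here for illustration.

```python
from decimal import Decimal

# Exact by definition: one inch is 2.54 cm.
CM_PER_INCH = Decimal("2.54")


def cm_to_inches(cm: str) -> Decimal:
    """Convert a centimetre value (given as a string) to inches using exact decimals."""
    return Decimal(cm) / CM_PER_INCH


nominal = cm_to_inches("0.254")     # spacing the printer must produce
narrowest = cm_to_inches("0.2286")  # smallest spacing the reader must accept
widest = cm_to_inches("0.4572")     # largest spacing the reader must accept

# Characters per inch at the nominal spacing.
chars_per_inch = int(1 / nominal)

print(nominal, narrowest, widest, chars_per_inch)
```

In other words, the nominal pitch is one tenth of an inch (10 characters per inch), and the reader tolerance runs from 0.09 to 0.18 inch.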

hOCR is an open standard of data representation for formatted text obtained from optical character recognition (OCR). The definition encodes text, style, layout information, recognition confidence metrics and other information using Extensible Markup Language (XML) in the form of Hypertext Markup Language (HTML) or XHTML.
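
As an illustration of that encoding, the sketch below reads word text, bounding boxes and confidence values out of a small hOCR fragment using only the Python standard library. The sample markup is hand-written for this example, and the `HocrWordParser` class is a hypothetical helper, not part of any OCR product; it assumes word elements are `span` tags with no nested markup.

```python
from html.parser import HTMLParser

# Hand-written hOCR fragment (an assumption for illustration, not real OCR output).
SAMPLE_HOCR = """
<div class='ocr_page' title='bbox 0 0 800 600'>
  <span class='ocr_line' title='bbox 10 20 300 50'>
    <span class='ocrx_word' title='bbox 10 20 120 50; x_wconf 96'>Fine</span>
    <span class='ocrx_word' title='bbox 130 20 300 50; x_wconf 88'>Reader</span>
  </span>
</div>
"""


class HocrWordParser(HTMLParser):
    """Collect text, bounding box and confidence for each ocrx_word element."""

    def __init__(self):
        super().__init__()
        self.words = []
        self._current = None  # word being read, or None when outside a word

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "ocrx_word" in (attrs.get("class") or "").split():
            self._current = self._parse_title(attrs.get("title") or "")
            self._current["text"] = ""

    def handle_data(self, data):
        if self._current is not None:
            self._current["text"] += data

    def handle_endtag(self, tag):
        # Assumes a word span contains no nested elements.
        if self._current is not None and tag == "span":
            self.words.append(self._current)
            self._current = None

    @staticmethod
    def _parse_title(title):
        """Parse 'bbox x0 y0 x1 y1; x_wconf N' style properties from a title attribute."""
        props = {}
        for part in title.split(";"):
            fields = part.split()
            if not fields:
                continue
            if fields[0] == "bbox":
                props["bbox"] = tuple(int(v) for v in fields[1:5])
            elif fields[0] == "x_wconf":
                props["confidence"] = float(fields[1])
        return props


parser = HocrWordParser()
parser.feed(SAMPLE_HOCR)
for word in parser.words:
    print(word["text"], word["bbox"], word["confidence"])
```

The layout and confidence data travel in the `title` attribute of ordinary HTML elements, so a generic HTML parser is enough to recover them.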

This comparison of optical character recognition software includes:

Document Capture Software refers to applications that provide the ability and feature set to automate the process of scanning paper documents or importing electronic documents, often for the purposes of feeding advanced document classification and data collection processes. Most scanning hardware, both scanners and copiers, provides the basic ability to scan to any number of image file formats, including: PDF, TIFF, JPG, BMP, etc. This basic functionality is augmented by document capture software, which can add efficiency and standardization to the process.

OCRFeeder

OCRFeeder is an optical character recognition suite for GNOME, which also supports virtually any command-line OCR engine, such as CuneiForm, GOCR, Ocrad and Tesseract. It converts paper documents to digital document files and can serve to make them accessible to visually impaired users.

Microsoft Office shared tools are software components that are included in all Microsoft Office products.

Solid PDF Tools

Solid PDF Tools is a document reconstruction software product that allows users to convert PDFs into editable documents and create PDFs from a variety of file sources. The same technology used in the software's Solid Framework SDK is licensed by Adobe for Acrobat X.

Asprise OCR is a commercial optical character recognition and barcode recognition SDK library that provides an API to recognize text as well as barcodes from images and output them in formats such as plain text, XML and searchable PDF.

Project Naptha

Project Naptha is a browser extension software for Google Chrome that allows users to highlight, copy, edit and translate text from within images. It was created by developer Kevin Kwok, and released in April 2014 as a Chrome add-on. This software was first made available only on Google Chrome, downloadable from the Chrome Web Store. It was then made available on Mozilla Firefox, downloadable from the Mozilla Firefox add-ons repository but was soon removed. The reason behind the removal remains unknown.

ABBYY: American digital intelligence company

ABBYY is a US-based company that develops solutions in the fields of intelligent document processing, data capture, process intelligence and optical character recognition (OCR). The company serves clients worldwide. One of ABBYY's best-known products is ABBYY FineReader, an OCR application.

References

  1. "Release 2 build 16.0.13.4766". ABBYY. New versions added as released.
  2. Вектор модернизации: обзор обновленного ABBYY FineReader 12 [Modernization vector: a review of the updated ABBYY FineReader 12], 2014
  3. "ABBYY FineReader Pro is an unparalleled OCR solution". Engadget. 16 June 2014. Retrieved 30 December 2021.
  4. "ABBYY FineReader Pro is an unparalleled OCR solution". Engadget. 16 June 2014. Retrieved 30 December 2021.
  5. Sporleder, Caroline; Bosch, Antal van den; Zervanou, Kalliopi (7 July 2011). Language Technology for Cultural Heritage: Selected Papers from the LaTeCH Workshop Series. Springer Science & Business Media. ISBN 978-3-642-20227-8.
  6. Nield, David; DeMuro, Jonas P.; Turner, Brian (11 October 2021). "Best OCR software of 2021: free and paid options". TechRadar. Retrieved 22 December 2021.
  7. Dalton, Will; DeMuro, Jonas P.; Turner, Brian (6 December 2021). "Best scanning software of 2022". TechRadar. Retrieved 22 December 2021.
  8. "ABBYY выпустила 12 версию своего флагманского продукта FineReader" [ABBYY has released version 12 of its flagship product FineReader]. 2015. Archived from the original on 18 May 2015.
  9. Группа компаний ABBYY [ABBYY Group of Companies], 2014
  10. Radyuhin, Vladimir (19 January 2008). "IT opportunities and challenges in Russia". The Hindu. Archived from the original on 16 July 2014.
  11. "С технологией ABBYY смартфон Samsung Galaxy S4 распознает текст с фотографий" [With ABBYY technology, the Samsung Galaxy S4 smartphone recognizes text from photos]. Archived from the original on 14 May 2015.
  12. Mendelson, Edward (6 February 2020). "ABBYY FineReader Review". PC Magazine. Review updated from time to time.