Asprise OCR

Asprise OCR SDK for Java, C#, VB.NET, Python, C/C++ and Delphi
Developer(s) Asprise
Initial release 1998
Stable release 15
Written in Java, C#, VB.NET, C, C++, Objective-C, Delphi, Python
Operating system Windows XP, 7, 8, 10; Linux; Mac OS X; Solaris; AIX
Type OCR
License proprietary, commercial
Website asprise.com

Asprise OCR is a commercial optical character recognition (OCR) and barcode recognition SDK. Its API recognizes text and barcodes in image files (JPEG, PNG, TIFF and PDF, among other formats) and outputs the results as plain text, XML or searchable PDF.

Asprise OCR has been in active development since 1997. Version 2.1 of the software was reviewed by PC World. [1]

Many researchers have used Asprise OCR along with ABBYY FineReader to benchmark OCR performance. Paweł Łupkowski and Mariusz Urbanski from Adam Mickiewicz University in Poznań used Asprise OCR version 4 and ABBYY FineReader to perform CAPTCHA recognition. [2] Shuai Yuan from Cornell University implemented an image-based room schedule retrieval system using Asprise OCR. [3] Seongwook Youn from the University of Southern California reported: "By running a sample of 200 image e-mails, we determined that Asprise OCR was performing with an accuracy of 95%. It had the best detection rate among the approaches we analyzed; hence we decided to go with Asprise OCR for our research." [4] Adil Farooq from the University of Engineering and Technology in Taxila implemented a speech-based interface system for visually impaired persons. [5]

Hsieh analyzed the workflow of the Asprise OCR engine and applied it to scoreboard detection in baseball videos. [6] Chaisri discussed how Asprise OCR can be used for image analysis of fax documents in IT Convergence and Services: ITCS & IRoA 2011. [7] Petra demonstrated how to perform feature selection for anti-spam filtering using Asprise OCR. [8]

Asprise OCR version 5 supports the following languages: [9] Croatian, Czech, Danish, Dutch, English, Finnish, French, German, Greek, Hungarian, Icelandic, Indonesian, Italian, Malay, Maltese, Norwegian, Polish, Portuguese, Romanian, Russian, Spanish, Swedish and Turkish. Machine-readable zone (MRZ) and magnetic ink character recognition (MICR) text are also supported.

The latest version of Asprise OCR SDK is v15. [10]

As of 2020, Asprise OCR offers a cloud-based, real-time receipt OCR API.

References

  1. Asprise OCR SDK for Java, PC World. Archived April 3, 2015, at the Wayback Machine.
  2. SemCAPTCHA—user-friendly alternative for OCR-based CAPTCHA systems (Proceedings of the International Multiconference on Computer Science and Information Technology pp. 325–329)
  3. Image Based Room Schedule Retrieval System
  4. Improved Spam Filtering by Extraction of Information from Text Embedded Image E-mail, SAC '09 Proceedings of the 2009 ACM Symposium on Applied Computing (PDF). Archived from the original (PDF) on 2014-08-09. Retrieved 2015-03-24.
  5. Implementation of a Speech Based Interface System for Visually Impaired Persons, Life Sci J 2013;10(9s):398-400
  6. Advanced Intelligent Computing Theories and Applications, Springer, ISBN 3540874402, pp. 337–346
  7. IT Convergence and Services: ITCS & IRoA 2011, Springer, ISBN 9400725981, p. 616
  8. Advances in Data Mining: Applications in Medicine, Web Mining, ISBN 3540360360
  9. Asprise OCR Library SDK API for Java, C#, VB.NET
  10. Asprise OCR API Library v15 for Java, C# VB.NET