Smart data capture

Smart data capture (SDC), also known as 'intelligent data capture' or 'automated data capture', describes the branch of technology concerned with using computer vision techniques such as optical character recognition (OCR), barcode scanning, object recognition and similar technologies to extract and process information from semi-structured and unstructured data sources. IDC characterizes smart data capture as an integrated hardware, software, and connectivity strategy that helps organizations capture data in an efficient, repeatable, scalable, and future-proof way. [1] Data is captured visually from barcodes, text, IDs and other objects, often from many sources simultaneously, before being converted and prepared for digital use, typically by artificial intelligence-powered software. [2] An important feature of SDC is that it focuses not just on capturing data more efficiently but also on presenting easy-to-access, actionable insights at the instant of data collection to both frontline and desk-based workers, which aids decision-making and makes capture a two-way process.

Smart data capture automates and accelerates data capture, applies insights in real time, and automates processes based on the extracted input. It is designed to be repeatable and scalable in order to reduce low-level manual tasks and eliminate human error. To this end, smart data capture solutions are often delivered as specialist software installed on commodity hardware such as smartphones. [3] However, some solutions rely on specialized hardware such as dedicated scanning devices, wearables [4] or shop floor robots. [5]

Differences from OCR

Optical character recognition applications are typically concerned with the actual data capture process; they are intended to faithfully reproduce text, words, letters and symbols from a printed document. Smart data capture is multimodal, [6] capable of extracting data from a wider range of semi-structured and unstructured sources, going beyond basic text recognition to offer a wider scope of applications. By extending functionality to provide actionable insights at the point of capture, SDC is also a two-way process (capture-display), while OCR is more commonly one-way (capture only), primarily used for data input. [7]
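The one-way versus two-way distinction can be illustrated with a minimal Python sketch. Everything in it is hypothetical and chosen only for illustration: the `INVENTORY` records, the `CaptureResult` type, and the functions `capture_only` and `capture_and_display` do not correspond to any vendor's API, and the decoding step itself (camera, barcode or text recognition) is assumed to have already happened.

```python
from dataclasses import dataclass

# Hypothetical back-end records keyed by barcode value (illustrative data only).
INVENTORY = {
    "4006381333931": {"name": "Marker pen", "stock": 2, "reorder_level": 5},
    "5012345678900": {"name": "Notebook", "stock": 40, "reorder_level": 10},
}


@dataclass
class CaptureResult:
    source: str  # kind of source, e.g. "barcode", "text" or "id"
    value: str   # payload already decoded by the capture step


def capture_only(result: CaptureResult) -> str:
    """One-way capture: hand the decoded value on for data input."""
    return result.value


def capture_and_display(result: CaptureResult) -> str:
    """Two-way capture: look the value up and return a message for the worker."""
    record = INVENTORY.get(result.value)
    if record is None:
        return f"Unknown item {result.value}"
    if record["stock"] < record["reorder_level"]:
        return f"{record['name']}: low stock ({record['stock']} left), reorder"
    return f"{record['name']}: in stock ({record['stock']})"


scan = CaptureResult(source="barcode", value="4006381333931")
print(capture_only(scan))         # the raw decoded value, as in capture-only use
print(capture_and_display(scan))  # a message shown back at the point of capture
```

In the capture-only case the decoded value is simply passed along, whereas in the capture-display case the same value is enriched with back-end data and returned to the person scanning.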

Smart data capture solutions typically have two parts:

Applications

Smart data capture can be applied to almost any industry and application that requires visual information capture and interpretation. This may include:

Notable smart data capture vendors

Notes

Historically, PricewaterhouseCoopers described smart data capture as a combination of robotic process automation and intelligent character recognition. [13] This description is no longer sufficient because it focuses purely on text-based capture systems (automated OCR).

See also

References

  1. Arcaro, Matt (January 2023). Smart Data Capture: A Technology Strategy to Scale Data Intelligence (PDF). IDC (Report).
  2. Mueller, Samuel (17 November 2022). "What Companies Should Know About Smart Data Capture And Last-Mile Delivery". Forbes Technology Council.
  3. "How smart data capture solutions on Samsung Galaxy rugged devices are helping transform business operations". Samsung. 27 October 2022.
  4. Bauer, Dennis; Wutzke, Rolf; Bauernhansl, Thomas (2016). "Wear@Work – A New Approach for Data Acquisition Using Wearables". Procedia CIRP. 50: 529–534. doi:10.1016/j.procir.2016.04.121. S2CID 114410108.
  5. Anstee, James (14 January 2022). "Scandit launches smart shelf management for retailers". Electronic Specifier.
  6. "9 Principles of a Smart Data Capture Strategy". iCrunchData. 8 June 2023.
  7. BasuMallick, Chiradeep (30 January 2023). "What Is OCR (Optical Character Recognition)? Meaning, Working, and Software". Spiceworks.
  8. Pressley, Alix (19 January 2023). "Smart data capture unlocks uplifted employee and customer experience". Intelligent CIO.
  9. "Why is Mobile Data Capture Important for Transport Logistic Firms". Dynamsoft. 28 December 2022.
  10. Flannery, Ellen (13 March 2023). "The nurse's journey: how smart data capture will revolutionise hospital processes". Intelligent Health.tech.
  11. "SAS (Scandinavian Airlines) Improves Customer Service and Cuts Costs with Scandit's Barcode Scanning on Smartphones". Business Wire. 9 December 2019.
  12. Vala, Melanie (24 January 2023). "Why the Mobile Experience Is Important for E-Commerce". AIthority.
  13. Kamra, Nitin (2018). Robotic process automation and intelligent character recognition: Smart data capture (PDF). PricewaterhouseCoopers (Report).