Handwritten biometric recognition

Example of handwriting of a sequence of digits, with its dynamic information shown on the right. Note that the digitizing tablet also acquires movements made in the air; these can be identified because their pressure is equal to zero.
Example of dynamic information of handwriting.

Handwritten biometric recognition is the process of identifying the author of a given text from the handwriting style. It is classed as a behavioural biometric because it is based on something that the user has learned to do.

Static and dynamic recognition

Handwritten biometrics can be split into two main categories:

Static: In this mode, the user writes on paper, the writing is digitized with an optical scanner or a camera, and the biometric system performs recognition by analyzing the shape of the text. This mode is also known as "off-line".

Dynamic: In this mode, the user writes on a digitizing tablet, which acquires the text in real time. Acquisition with a stylus-operated PDA is another possibility. Dynamic recognition is also known as "on-line". The dynamic information used for handwriting movement analysis usually includes the pen position over time and the pressure applied by the pen.

Dynamic systems generally achieve better accuracy than static ones. Several technological approaches have been proposed. [1] [2] [3] [4] [5]
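
As noted in the figure caption, a digitizing tablet also records movements made in the air, and these can be recognized because their pressure is zero. The following is a minimal sketch of that idea in Python, not the method of any cited system. The `PenSample` record and the `split_segments` function are hypothetical names introduced here for illustration, and the sample layout (time, x, y, pressure) is an assumption about what a tablet might report.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class PenSample:
    """One tablet reading (hypothetical layout, assumed for illustration)."""
    t: float         # timestamp in seconds
    x: float         # horizontal pen position
    y: float         # vertical pen position
    pressure: float  # 0.0 means the pen is in the air


def split_segments(samples: List[PenSample]) -> List[Tuple[bool, List[PenSample]]]:
    """Group consecutive samples into (pen_down, samples) segments.

    pen_down is True for on-surface strokes (pressure > 0) and
    False for in-air movements (pressure == 0).
    """
    segments: List[Tuple[bool, List[PenSample]]] = []
    for sample in samples:
        pen_down = sample.pressure > 0
        if segments and segments[-1][0] == pen_down:
            # Same pen state as the previous sample: extend the segment.
            segments[-1][1].append(sample)
        else:
            # Pen state changed (or first sample): start a new segment.
            segments.append((pen_down, [sample]))
    return segments


if __name__ == "__main__":
    recording = [
        PenSample(0.00, 10.0, 10.0, 0.4),
        PenSample(0.01, 11.0, 12.0, 0.5),
        PenSample(0.02, 13.0, 12.0, 0.0),
        PenSample(0.03, 15.0, 11.0, 0.0),
        PenSample(0.04, 16.0, 10.0, 0.6),
    ]
    for pen_down, points in split_segments(recording):
        kind = "stroke" if pen_down else "in-air"
        print(kind, len(points))
```

A system could then compute features separately for on-surface strokes and in-air movements; how those features are used is outside the scope of this sketch.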

Difference from OCR

Handwritten biometric recognition should not be confused with optical character recognition (OCR). The goal of handwritten biometrics is to identify the author of a given text, whereas the goal of OCR is to recognize the content of the text, regardless of its author.

References

  1. Chapran, J. (2006). "Biometric Writer Identification: Feature Analysis and Classification". International Journal of Pattern Recognition & Artificial Intelligence. 20 (4): 483–503. doi:10.1142/S0218001406004831.
  2. Schomaker, L. (2007). "Advances in Writer Identification and Verification". Ninth International Conference on Document Analysis and Recognition (ICDAR). pp. 1268–1273. Archived from the original on 2021-01-28. Retrieved 2020-10-12.
  3. Said, H. E. S.; Tan, T. N.; Baker, K. D. (2000). "Personal identification based on handwriting". Pattern Recognition. 33: 149–160. CiteSeerX 10.1.1.408.9131. doi:10.1016/S0031-3203(99)00006-0.
  4. Schlapbach, A.; Liwicki, M.; Bunke, H. (2008). "A writer identification system for on-line whiteboard data". Pattern Recognition. 41 (7): 2381–2397. doi:10.1016/j.patcog.2008.01.006.
  5. Sesa-Nogueras, E.; Faundez-Zanuy, M. (2012). "Biometric recognition using online uppercase handwritten text". Pattern Recognition. 45 (1): 128–144. doi:10.1016/j.patcog.2011.06.002.