Optical mark recognition

Optical mark recognition (OMR) collects data from people by identifying markings on paper. OMR enables the processing of hundreds or even thousands of documents per hour. A common application is in exams, where students mark cells as their answers, which allows very fast automated grading of exam sheets.

Background

OMR test form, with registration marks and drop-out colors, designed to be scanned by a dedicated OMR device

Many OMR devices have a scanner that shines a light onto a form. The device then looks at the contrasting reflectivity of the light at certain positions on the form. It will detect the black marks because they reflect less light than the blank areas on the form.
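As a rough illustration of the reflective approach described above, the following Python sketch classifies form positions as marked or blank by comparing each reflectance reading with the reading for blank paper. The function name, the 0-to-1 reflectance scale and the 0.6 ratio are assumptions made for this example, not values taken from any particular OMR device.

```python
def detect_marks(readings, blank_level, ratio=0.6):
    """Return True for each position whose reflectance is well below blank paper.

    readings    -- reflectance values (0.0 = no light returned, 1.0 = all light)
    blank_level -- reflectance measured on an unmarked area of the form
    ratio       -- hypothetical cutoff: below ratio * blank_level counts as a mark
    """
    cutoff = ratio * blank_level
    return [value < cutoff for value in readings]


# Four positions for one question; the second one has been darkened.
row = [0.82, 0.15, 0.80, 0.78]
print(detect_marks(row, blank_level=0.85))  # [False, True, False, False]
```

Comparing against the blank level of the same form, rather than a fixed absolute value, is one simple way to cope with differences in paper stock and lamp brightness.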

Some OMR devices use forms that are printed on transoptic paper. The device can then measure the amount of light that passes through the paper. It will pick up any black marks on either side of the paper because they reduce the amount of light passing through.

In contrast to the dedicated OMR device, desktop OMR software allows a user to create their own forms in a word processor or computer and print them on a laser printer. The OMR software then works with a common desktop image scanner with a document feeder to process the forms once filled out.

OMR is generally distinguished from optical character recognition (OCR) by the fact that a complicated pattern recognition engine is not required. That is, the marks are constructed in such a way that there is little chance that the OMR device will not read them correctly. This does require the image to have high contrast and an easily recognizable or irrelevant shape. A related field to OMR and OCR is the recognition of barcodes, such as the UPC bar code found on product packaging.

One of the most familiar applications of OMR is the use of #2 pencil (HB in Europe) bubble optical answer sheets in multiple choice question examinations. Students mark their answers, or other personal information, by darkening circles on a form. The sheet is then graded by a scanning machine.

Lozenge marks represent a later technology that is easier to mark and easier to erase. The large "bubble" marks are legacy technology from very early OMR machines that were so insensitive a large mark was required for reliability. In most Asian countries, a special marker is used to fill in an optical answer sheet. Students, likewise, mark answers or other information by darkening circles marked on a pre-printed sheet. Then the sheet is automatically graded by a scanning machine.

Many of today's OMR applications involve people filling in specialized forms. These forms are optimized for computer scanning, with careful registration in the printing, and careful design so that ambiguity is reduced to the minimum possible. Due to its extremely low error rate, low cost and ease-of-use, OMR is a popular method of tallying votes. [1] [2] [3] [4] [5] [6] [7] [8] [9] [10]

OMR marks are also added to items of printed mail so folder inserter equipment can be used. The marks are added to each (normally facing/odd) page of a mail document and consist of a sequence of black dashes that folder inserter equipment scans in order to determine when the mail should be folded then inserted in an envelope.

Optical answer sheet

A response to an SAT math question marked on an optical answer sheet

An optical answer sheet or bubble sheet is a special type of form used in multiple choice question examinations. OMR is used to detect answers. The Scantron Corporation creates many optical answer sheets, although certain uses require their own customized system. [ citation needed ]

Optical answer sheets usually have a set of blank ovals or boxes that correspond to each question, often on separate sheets of paper. Bar codes may mark the sheet for automatic processing, and each series of filled ovals returns a certain value when read. In this way, students' answers can be recorded digitally, or their identity established.
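The idea that each series of filled ovals returns a value can be sketched as follows. The example assumes a hypothetical layout in which every column holds ten ovals for the digits 0 to 9 and the scanner has already reported which ovals are filled; blank or multiply marked columns are flagged rather than guessed. All function names are illustrative.

```python
def decode_column(filled):
    """Decode one column of ten ovals (digits 0-9) into a character.

    filled -- list of ten booleans, True where the oval was read as filled.
    Returns the digit as a string, or "?" if the column is blank or
    has more than one filled oval.
    """
    marked = [digit for digit, is_filled in enumerate(filled) if is_filled]
    return str(marked[0]) if len(marked) == 1 else "?"


def decode_field(columns):
    """Decode a multi-column field, such as a student ID, left to right."""
    return "".join(decode_column(column) for column in columns)


def column_for(digit):
    """Helper for the demo: a column with only the given digit filled."""
    return [position == digit for position in range(10)]


student_id = [column_for(4), column_for(0), column_for(7)]
print(decode_field(student_id))  # prints "407"
```

Returning a placeholder for ambiguous columns leaves the decision to a later review step instead of silently recording a wrong identity.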

Reading

The first optical answer sheets were read by shining a light through the sheet and measuring how much of the light was blocked using phototubes on the opposite side. [11] As some phototubes are mostly sensitive to the blue end of the visible spectrum, [12] blue pens could not be used, as blue inks reflect and transmit blue light. Because of this, number two pencils had to be used to fill in the bubbles—graphite is a very opaque substance which absorbs or reflects most of the light which hits it. [11]

Modern optical answer sheets are read based on reflected light, measuring lightness and darkness. They do not need to be filled in with a number two pencil, though these are recommended over other types (this is due to the lighter marks made by higher-number pencils and the smudges from number 1 pencils). Black ink will be read, though many systems will ignore marks that are the same color the form is printed in. [11] This also allows optical answer sheets to be double-sided because marks made on the opposite side will not interfere with reflectance readings as much as with opacity readings.

Most systems accommodate human error in filling in ovals imprecisely: as long as the marks do not stray into other ovals and the oval is almost filled, the scanner will detect it as filled in.
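One simple way to tolerate imprecise filling is to measure how much of a mark zone is dark and accept the mark once that share passes a cutoff. The sketch below assumes a bitonal image stored as nested lists (1 = dark, 0 = light), uses a rectangular zone in place of a true oval, and picks 0.6 as an arbitrary threshold; none of these choices come from a specific scanner.

```python
def fill_fraction(image, top, left, height, width):
    """Fraction of dark pixels (value 1) inside a rectangular mark zone."""
    dark = sum(
        image[row][col]
        for row in range(top, top + height)
        for col in range(left, left + width)
    )
    return dark / (height * width)


def is_filled(image, zone, threshold=0.6):
    """Treat a zone as filled when enough of it is dark.

    zone      -- (top, left, height, width) of the mark zone
    threshold -- hypothetical cutoff; real systems tune this per form
    """
    return fill_fraction(image, *zone) >= threshold


# A 4x8 bitonal image with two 4x4 zones side by side.
image = [
    [0, 1, 1, 1, 0, 0, 0, 0],
    [1, 1, 1, 1, 0, 0, 1, 0],
    [1, 1, 1, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0, 0, 0],
]
print(is_filled(image, (0, 0, 4, 4)))  # True: 13 of 16 pixels are dark
print(is_filled(image, (0, 4, 4, 4)))  # False: only a stray speck
```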

Designing and printing

OMR sheets must be designed to specific dimensions, with a precision of 0.05 mm. If the dimensions fall outside this tolerance, the reading accuracy of the sheet may vary, so the sheet should be designed, printed and cut precisely.
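To put the 0.05 mm figure in perspective, the short sketch below converts it to scanner pixels at a few common resolutions. The resolutions chosen are assumptions for illustration; the conversion itself only relies on 25.4 mm per inch.

```python
MM_PER_INCH = 25.4


def mm_to_pixels(length_mm, dpi):
    """Convert a physical length in millimetres to scanner pixels."""
    return length_mm / MM_PER_INCH * dpi


for dpi in (200, 300, 600):
    print(dpi, round(mm_to_pixels(0.05, dpi), 2))
# 200 0.39
# 300 0.59
# 600 1.18
```

At these resolutions the tolerance is on the order of a single pixel, which suggests why small printing or cutting errors can shift a mark zone relative to where the reader expects it.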

Errors

It is possible for optical answer sheets to be printed incorrectly, such that all ovals will be read as filled. This occurs if the outline of the ovals is too thick, or is irregular. During the 2008 U.S. presidential election, this occurred with over 19,000 absentee ballots in the Georgia county of Gwinnett, and was discovered after around 10,000 had already been returned. The slight difference was not apparent to the naked eye, and was not detected until a test run was made in late October. This required all ballots to be transferred to correctly printed ones, by sequestered workers of the board of elections, under close observation by members of the Democratic and Republican (but not other) political parties, and county sheriff deputies. The transfer, by law, could not occur until election day (November 4).[ citation needed ]

OMR software

Plain paper OMR survey form, without registration marks and drop-out colors, designed to be scanned by an image scanner and OMR software

OMR software is a computer software application that makes OMR possible on a desktop computer by using an image scanner to process surveys, tests, attendance sheets, checklists, and other plain-paper forms printed on a laser printer.

OMR software is used to capture data from OMR sheets. Dedicated data-capture scanning devices, by contrast, depend on many factors, such as the thickness of the paper, the dimensions of the OMR sheet and the design pattern.

Commercial OMR software

One of the first OMR software packages that used images from common image scanners was Remark Office OMR, made by Gravic, Inc. (originally named Principia Products, Inc.). Remark Office OMR 1.0 was released in 1991.

The need for OMR software originated because early optical mark recognition systems used dedicated scanners and special pre-printed forms with drop-out colors and registration marks. Such forms typically cost US$0.10 to $0.19 a page. [13] In contrast, OMR software users design their own mark-sense forms with a word processor or built-in form editor, print them locally on a printer, and can save thousands of dollars on large numbers of forms. [14]

Identifying optical marks within a form, such as for processing census forms, has been offered by many forms-processing (Batch Transaction Capture) companies since the late 1980s. Mostly this is based on a bitonal (black-and-white) image and a pixel count, with minimum and maximum pixel counts used to eliminate extraneous marks. An example is a mark erased with a dirty eraser, which can look like a legitimate mark once converted into a bitonal image. This method can therefore cause problems when a user changes their mind, so some products started to use grayscale to better identify the intent of the marker; internally, Scantron and NCS scanners used grayscale.
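The difference between the bitonal pixel-count method and a grayscale check can be sketched as follows. The example assumes 8-bit grayscale zones (0 = black, 255 = white) stored as nested lists; the cutoff of 128, the pixel-count limits and the mean-darkness limit are hypothetical values chosen only to make the contrast visible.

```python
def to_bitonal(gray_zone, cutoff=128):
    """Convert grayscale pixels (0 = black, 255 = white) to 1 = dark, 0 = light."""
    return [[1 if pixel < cutoff else 0 for pixel in row] for row in gray_zone]


def bitonal_is_mark(gray_zone, min_pixels=6, max_pixels=16):
    """Accept a zone whose dark-pixel count lies within [min_pixels, max_pixels]."""
    count = sum(sum(row) for row in to_bitonal(gray_zone))
    return min_pixels <= count <= max_pixels


def grayscale_is_mark(gray_zone, max_mean=90):
    """Accept a zone only if it is dark on average, not merely below the cutoff."""
    pixels = [pixel for row in gray_zone for pixel in row]
    return sum(pixels) / len(pixels) <= max_mean


# A firmly pencilled mark and a smudgy erasure, each as a 3x3 grayscale zone.
firm_mark = [[30, 25, 40], [20, 15, 35], [45, 30, 50]]
erasure = [[120, 125, 110], [118, 122, 127], [115, 119, 121]]

print(bitonal_is_mark(firm_mark), grayscale_is_mark(firm_mark))  # True True
print(bitonal_is_mark(erasure), grayscale_is_mark(erasure))      # True False
```

Every pixel of the erasure falls just under the bitonal cutoff, so the pixel count alone accepts it; the grayscale average shows that it is far lighter than a deliberate mark.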

OMR software is also used for adding OMR marks to mail documents so they can be scanned by folder inserter equipment. An example of OMR software is Mail Markup from UK developer Funasset Limited. This software allows the user to configure and select an OMR sequence then apply the OMR marks to mail documents prior to printing.

History

Optical mark recognition (OMR) is the scanning of paper to detect the presence or absence of a mark in a predetermined position. [4] Optical mark recognition has evolved from several other technologies. In the early 19th and 20th centuries, patents were granted for machines that would aid the blind. [2]

OMR is now used as an input device for data entry. Two early forms of OMR are paper tape and punch cards, which use actual holes punched into the medium instead of pencil-filled circles. Paper tape was used as early as 1857 as an input device for the telegraph. [10] Punch cards were created in 1890 and were used as input devices for computers. The use of punch cards declined greatly in the early 1970s with the introduction of personal computers. [8] With modern OMR, where the presence of a pencil-filled bubble is recognized, the recognition is done via an optical scanner.

The first mark sense scanner was the IBM 805 Test Scoring Machine; this read marks by sensing the electrical conductivity of graphite pencil lead using pairs of wire brushes that scanned the page. In the 1930s, Richard Warren at IBM experimented with optical mark sense systems for test scoring, as documented in US Patents 2,150,256 (filed in 1932, granted in 1939) and 2,010,653 (filed in 1933, granted in 1935). The first successful optical mark-sense scanner was developed by Everett Franklin Lindquist as documented in US Patent 3,050,248 (filed in 1955, granted in 1962). Lindquist had developed numerous standardized educational tests, and needed a better test scoring machine than the then-standard IBM 805. The rights to Lindquist's patents were held by the Measurement Research Center until 1968, when the University of Iowa sold the operation to Westinghouse Corporation.

During the same period, IBM also developed a successful optical mark-sense test-scoring machine, as documented in US Patent 2,944,734 (filed in 1957, granted in 1960). IBM commercialized this as the IBM 1230 Optical mark scoring reader in 1962. This and a variety of related machines allowed IBM to migrate a wide variety of applications developed for its mark sense machines to the new optical technology. These applications included a variety of inventory management and trouble reporting forms, most of which had the dimensions of a standard punched card.

While the other players in the educational testing arena focused on selling scanning services, Scantron Corporation, founded in 1972, [15] had a different model; it would distribute inexpensive scanners to schools and make profits from selling the test forms. As a result, many people came to think of all mark-sense forms (whether optically sensed or not) as scantron forms.

In 1983, Westinghouse Learning Corporation was acquired by National Computer Systems (NCS). In 2000, NCS was acquired by Pearson Education, where the OMR technology formed the core of Pearson's Data Management group. In February 2008, M&F Worldwide purchased the Data Management group from Pearson; the group is now part of the Scantron brand. [16]

OMR has been used in many situations, as described below. The use of OMR in inventory systems was a transition between punch cards and bar codes, and OMR is no longer used as much for this purpose. [8] OMR is still used extensively for surveys and testing, though.

Usage

The use of OMR is not limited to schools or data collection agencies; many businesses and health care agencies use OMR to streamline their data input processes and reduce input error. OMR, OCR, and ICR technologies all provide a means of data collection from paper forms. OMR may also be done using an OMR (discrete read head) scanner or an imaging scanner. [17]

Applications

OMR betting form used at the Japan Racing Association Fukushima Racecourse, Japan.
Betting ticket using this form.

There are many other applications for OMR, for example:

Field types

OMR has different fields to provide the format the questioner desires. These fields include:

Capabilities/requirements

Some OMR systems, past and present, require special paper, special ink and a special input reader (Bergeron, 1998). This restricts the types of questions that can be asked and does not allow for much variability when the form is being input. Progress in OMR now allows users to create and print their own forms and use a scanner (preferably with a document feeder) to read the information. [18] The user is able to arrange questions in a format that suits their needs while still being able to easily input the data. [19] OMR systems approach one hundred percent accuracy and take only 5 milliseconds on average to recognize marks. [18] Users can use squares, circles, ellipses and hexagons for the mark zone. The software can then be set to recognize filled-in bubbles, crosses or check marks.

OMR can also be put to personal use. There are all-in-one printers on the market that print the photos a user selects by filling in bubbles for size and paper selection on a printed index sheet. Once the sheet has been filled in, the user places it on the scanner, and the printer prints the photos according to the marks that were made.[ citation needed ]

Disadvantages

There are also some disadvantages and limitations to OMR. If the user wants to gather large amounts of text, then OMR complicates the data collection. [20] There is also the possibility of missing data in the scanning process, and incorrectly numbered or unnumbered pages can lead to pages being scanned in the wrong order. Also, unless safeguards are in place, a page could be rescanned, producing duplicate records and skewing the data. [18]
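One possible safeguard against the problems described above is to check every scanned batch for duplicate, missing and out-of-order pages before the data is accepted. The sketch below assumes that each page carries a machine-readable form identifier and page number (for example from a barcode); the function name, the tuple layout and the sample identifiers are hypothetical.

```python
def check_batch(pages, pages_per_form):
    """Report duplicate, missing and out-of-order pages in a scanned batch.

    pages          -- list of (form_id, page_number) tuples in scan order
    pages_per_form -- number of pages every form is expected to have
    """
    seen = set()
    duplicates = []
    out_of_order = []
    last_page = {}
    for form_id, page_number in pages:
        key = (form_id, page_number)
        if key in seen:
            duplicates.append(key)  # same page scanned twice
            continue
        seen.add(key)
        if page_number < last_page.get(form_id, 0):
            out_of_order.append(key)  # arrived after a later page of the same form
        last_page[form_id] = max(page_number, last_page.get(form_id, 0))
    missing = [
        (form_id, page)
        for form_id in sorted({form for form, _ in pages})
        for page in range(1, pages_per_form + 1)
        if (form_id, page) not in seen
    ]
    return {"duplicates": duplicates, "missing": missing, "out_of_order": out_of_order}


batch = [("A17", 1), ("A17", 2), ("A17", 2), ("B02", 2), ("B02", 1), ("C33", 1)]
print(check_batch(batch, pages_per_form=2))
```

For this sample batch the report lists page 2 of form A17 as a duplicate, page 2 of form C33 as missing, and page 1 of form B02 as out of order.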

As a result of the widespread adoption and ease of use of OMR, standardized examinations can consist primarily of multiple-choice questions, changing the nature of what is being tested.

References

  1. "Optical mark recognition". Archived from the original on June 13, 2006. Retrieved June 13, 2006.
  2. Research Optical Character Recognition | Macmillan Science Library: Computer Sciences. Bookrags.com. 2010-11-02. Retrieved 2015-07-03.
  3. "Optical Scanning Systems". Aceproject.org. Retrieved 2015-07-03.
  4. Haag, S., Cummings, M., McCubbrey, D., Pinsonnault, A., Donovan, R. (2006). Management Information Systems for the Information Age (3rd ed.). Canada: McGraw-Hill Ryerson.
  5. "Statisticians' Lib: Using Scanners and OMR Software for Affordable Data Input". Archived from the original on November 10, 2005. Retrieved June 13, 2006.
  6. "Data Collection on the Cheap". July 2015. Archived from the original (PPT) on 2015-07-22. Retrieved 2015-07-21.
  7. "Remark Office OMR, by Gravic (Principia Products), works with popular image scanners to scan surveys, tests and other plain paper forms". Omrsolutions.com. Retrieved 2015-07-03.
  8. Palmer, Roger C. (1989, Sept). The Basics of Automatic Identification [Electronic version]. Canadian Datasystems, 21 (9), 30-33.
  9. "Forms Processing Technology". Tkvision.com. Archived from the original on 2008-05-11. Retrieved 2015-07-03.
  10. Research Input Devices | Macmillan Science Library: Computer Sciences. Bookrags.com. 2010-11-02. Retrieved 2015-07-03.
  11. Bloomfield, Louis A (29 May 2006). "Question 1529: Why do scantron-type tests only read #2 pencils? Can other pencils work?". HowEverythingWorks.org.
  12. Mullard Technical Handbook Volume 4 Section 4: Photoemissive Cells (1960 Edition).
  13. "Archived copy" (PDF). Archived from the original (PDF) on 2009-03-20. Retrieved 2009-03-12.
  14. Michael Wagenheim. "Grading Biology Exams at a Large State University". RemarkSoftware.com. Retrieved 2015-07-21.
  15. "The Marketplace for Educational Testing". Bc.edu. Retrieved 2015-07-03.
  16. "NCS Pearson, Inc". Archived from the original on June 14, 2010. Retrieved June 14, 2010.
  17. http://datamanagement.scantron.com/pdf/icr-ocr-omr.pdf%5B%5D
  18. Bergeron [who?]
  19. LoPresti, 1996 [who?]
  20. Green, 2000 [who?]