Data processing is the collection and manipulation of digital data to produce meaningful information. [1] Data processing is a form of information processing, which is the modification (processing) of information in any manner detectable by an observer. [note 1]
Data processing may involve various processes, including:
The United States Census Bureau history illustrates the evolution of data processing from manual through electronic procedures.
Although widespread use of the term data processing dates only from the 1950s, [2] data processing functions have been performed manually for millennia. For example, bookkeeping involves functions such as posting transactions and producing reports like the balance sheet and the cash flow statement. Completely manual methods were augmented by the application of mechanical or electronic calculators. A person whose job was to perform calculations manually or using a calculator was called a "computer."
The 1890 United States Census schedule was the first to gather data by individual rather than household. A number of questions could be answered by making a check in the appropriate box on the form. From 1850 to 1880 the Census Bureau employed "a system of tallying, which, by reason of the increasing number of combinations of classifications required, became increasingly complex. Only a limited number of combinations could be recorded in one tally, so it was necessary to handle the schedules 5 or 6 times, for as many independent tallies." [3] "It took over 7 years to publish the results of the 1880 census" [4] using manual processing methods.
The term automatic data processing was applied to operations performed by means of unit record equipment, such as Herman Hollerith's application of punched card equipment for the 1890 United States Census. "Using Hollerith's punchcard equipment, the Census Office was able to complete tabulating most of the 1890 census data in 2 to 3 years, compared with 7 to 8 years for the 1880 census. It is estimated that using Hollerith's system saved some $5 million in processing costs" [4] in 1890 dollars even though there were twice as many questions as in 1880.
Computerized data processing, or electronic data processing represents a later development, with a computer used instead of several independent pieces of equipment. The Census Bureau first made limited use of electronic computers for the 1950 United States Census, using a UNIVAC I system, [3] delivered in 1952.
The term data processing has mostly been subsumed by the more general term information technology (IT). [5] The older term "data processing" is suggestive of older technologies. For example, in 1996 the Data Processing Management Association (DPMA) changed its name to the Association of Information Technology Professionals. Nevertheless, the terms are approximately synonymous.
Commercial data processing involves a large volume of input data, relatively few computational operations, and a large volume of output. For example, an insurance company needs to keep records on tens or hundreds of thousands of policies, print and mail bills, and receive and post payments.
In science and engineering, the terms data processing and information systems are considered too broad, and the term data processing is typically used for the initial stage followed by a data analysis in the second stage of the overall data handling.
Data analysis uses specialized algorithms and statistical calculations that are less often observed in a typical general business environment. For data analysis, software suites like SPSS or SAS, or their free counterparts such as DAP, gretl, or PSPP are often used. These tools are usually helpful for processing various huge data sets, as they are able to handle enormous amount of statistical analysis. [6]
A data processing system is a combination of machines, people, and processes that for a set of inputs produces a defined set of outputs. The inputs and outputs are interpreted as data, facts, information etc. depending on the interpreter's relation to the system.
A term commonly used synonymously with data or storage (codes) processing system is information system . [7] With regard particularly to electronic data processing, the corresponding concept is referred to as electronic data processing system.
A very simple example of a data processing system is the process of maintaining a check register. Transactions— checks and deposits— are recorded as they occur and the transactions are summarized to determine a current balance. Monthly the data recorded in the register is reconciled with a hopefully identical list of transactions processed by the bank.
A more sophisticated record keeping system might further identify the transactions— for example deposits by source or checks by type, such as charitable contributions. This information might be used to obtain information like the total of all contributions for the year.
The important thing about this example is that it is a system, in which, all transactions are recorded consistently, and the same method of bank reconciliation is used each time.
This is a flowchart of a data processing system combining manual and computerized processing to handle accounts receivable, billing, and general ledger
Electronic data interchange (EDI) is the concept of businesses electronically communicating information that was traditionally communicated on paper, such as purchase orders, advance ship notices, and invoices. Technical standards for EDI exist to facilitate parties transacting such instruments without having to make special arrangements.
The history of computing hardware spans the developments from early devices used for simple calculations to today’s complex computers, encompassing advancements in both analog and digital technology.
Herman Hollerith was an American statistician, inventor, and businessman who developed an electromechanical tabulating machine for punched cards to assist in summarizing information and, later, in accounting. His invention of the punched card tabulating machine, patented in 1884, marks the beginning of the era of mechanized binary code and semiautomatic data processing systems, and his concept dominated that landscape for nearly a century.
Information retrieval (IR) in computing and information science is the task of identifying and retrieving information system resources that are relevant to an information need. The information need can be specified in the form of a search query. In the case of document retrieval, queries can be based on full-text or other content-based indexing. Information retrieval is the science of searching for information in a document, searching for documents themselves, and also searching for the metadata that describes data, and for databases of texts, images or sounds.
A punched card is a piece of card stock that stores digital data using punched holes. Punched cards were once common in data processing and the control of automated machines.
Punched tape or perforated paper tape is a form of data storage device that consists of a long strip of paper through which small holes are punched. It was developed from and was subsequently used alongside punched cards, the difference being that the tape is continuous.
A punched card sorter is a machine for sorting decks of punched cards.
Optical mark recognition (OMR) collects data from people by identifying markings on a paper. OMR enables the hourly processing of hundreds or even thousands of documents. A common application of this technology is used in exams, where students mark cells as their answers. This allows for very fast automated grading of exam sheets.
Electronic data processing (EDP) or business information processing can refer to the use of automated methods to process commercial data. Typically, this uses relatively simple, repetitive activities to process large volumes of similar information. For example: stock updates applied to an inventory, banking transactions applied to account and customer master files, booking and ticketing transactions to an airline's reservation system, billing for utility services. The modifier "electronic" or "automatic" was used with "data processing" (DP), especially c. 1960, to distinguish human clerical data processing from that done by computer.
Starting at the end of the nineteenth century, well before the advent of electronic computers, data processing was performed using electromechanical machines collectively referred to as unit record equipment, electric accounting machines (EAM) or tabulating machines. Unit record machines came to be as ubiquitous in industry and government in the first two-thirds of the twentieth century as computers became in the last third. They allowed large volume, sophisticated data-processing tasks to be accomplished before electronic computers were invented and while they were still in their infancy. This data processing was accomplished by processing punched cards through various unit record machines in a carefully choreographed progression. This progression, or flow, from machine to machine was often planned and documented with detailed flowcharts that used standardized symbols for documents and the various machine functions. All but the earliest machines had high-speed mechanical feeders to process cards at rates from around 100 to 2,000 per minute, sensing punched holes with mechanical, electrical, or, later, optical sensors. The operation of many machines was directed by the use of a removable plugboard, control panel, or connection box. Initially all machines were manual or electromechanical. The first use of an electronic component was in 1937 when a photocell was used in a Social Security bill-feed machine. Electronic components were used on other machines beginning in the late 1940s.
A keypunch is a device for precisely punching holes into stiff paper cards at specific locations as determined by keys struck by a human operator. Other devices included here for that same function include the gang punch, the pantograph punch, and the stamp. The term was also used for similar machines used by humans to transcribe data onto punched tape media.
The tabulating machine was an electromechanical machine designed to assist in summarizing information stored on punched cards. Invented by Herman Hollerith, the machine was developed to help process data for the 1890 U.S. Census. Later models were widely used for business applications such as accounting and inventory control. It spawned a class of machines, known as unit record equipment, and the data processing industry.
A data entry clerk, also known as data preparation and control operator, data registration and control operator, and data preparation and registration operator, is a member of staff employed to enter or update data into a computer system. Data is often entered into a computer from paper documents using a keyboard. The keyboards used can often have special keys and multiple colors to help in the task and speed up the work. Proper ergonomics at the workstation is a common topic considered.
An accounting information system (AIS) is a system of collecting, storing and processing financial and accounting data that are used by decision makers. An accounting information system is generally a computer-based method for tracking accounting activity in conjunction with information technology resources. The resulting financial reports can be used internally by management or externally by other interested parties including investors, creditors and tax authorities. Accounting information systems are designed to support all accounting functions and activities including auditing, financial accounting porting, -managerial/ management accounting and tax. The most widely adopted accounting information systems are auditing and financial reporting modules.
A card reader is a data input device that reads data from a card-shaped storage medium and provides the data to a computer. Card readers can acquire data from a card via a number of methods, including: optical scanning of printed text or barcodes or holes on punched cards, electrical signals from connections made or interrupted by a card's punched holes or embedded circuitry, or electronic devices that can read plastic cards embedded with either a magnetic strip, computer chip, RFID chip, or another storage medium.
A computer operator is a role in IT which oversees the running of computer systems, ensuring that the machines, and computers are running properly. The job of a computer operator as defined by the United States Bureau of Labor Statistics is to "monitor and control ... and respond to ... enter commands ... set controls on computer and peripheral devices. This Excludes Data Entry."
A service bureau is a company that provides business services for a fee. The term has been extensively used to describe technology-based services to financial services companies, particularly banks. Service bureaus are a significant sector within the growing 3D printing industry that allow customers to make a decision whether to buy their own equipment or outsource production. Customers of service bureaus typically do not have the scale or expertise to incorporate these services into their internal operations and prefer to outsource them to a service bureau. Outsourced payroll services constitute a commonly provisioned service from a service bureau.
The CDC 1700 is a 16-bit word minicomputer, manufactured by the Control Data Corporation with deliveries beginning in May 1966.
A mechanical computer is a computer built from mechanical components such as levers and gears rather than electronic components. The most common examples are adding machines and mechanical counters, which use the turning of gears to increment output displays. More complex examples could carry out multiplication and division—Friden used a moving head which paused at each column—and even differential analysis. One model, the Ascota 170 accounting machine sold in the 1960s, calculated square roots.
A control-flow diagram (CFD) is a diagram to describe the control flow of a business process, process or review.