Document management system

A document management system (DMS) is a system used to track, manage and store documents and reduce paper. Most are capable of keeping a record of the various versions created and modified by different users (history tracking). In the case of the management of digital documents such systems are based on computer programs. The term has some overlap with the concepts of content management systems. It is often viewed as a component of enterprise content management (ECM) systems and related to digital asset management, document imaging, workflow systems and records management systems.

A content management system (CMS) is a software application that can be used to manage the creation and modification of digital content. CMSs are typically used for enterprise content management (ECM) and web content management (WCM). ECM typically supports multiple users in a collaborative environment by integrating document management, digital asset management and record retention. Alternatively, WCM is the collaborative authoring for websites and may include text and embed graphics, photos, video, audio, maps and programme code that display content and interact with the user. ECM typically includes a WCM function.

Enterprise content management (ECM) extends the concept of content management by adding a time line for each content item and possibly enforcing processes for the creation, approval and distribution of them. Systems that implement ECM generally provide a secure repository for managed items, be they analog or digital, that indexes them. They also include one or more methods for importing content to bring new items under management and several presentation methods to make items available for use.

Operations on a collection of digital assets require the use of a computer application implementing digital asset management (DAM) to ensure that the owner, and possibly their delegates, can perform operations on the data files.



Beginning in the 1980s, a number of vendors began to develop software systems to manage paper-based documents. These systems dealt with paper documents, which included not only printed and published documents, but also photographs, prints, etc.

Later, developers began to write a second type of system which could manage electronic documents, i.e., all those documents, or files, created on computers and often stored on users' local file systems. The earliest electronic document management (EDM) systems managed either proprietary file types or a limited number of file formats. Many of these systems later became known as document imaging systems, because they focused on the capture, storage, indexing and retrieval of image file formats. EDM systems evolved to a point where they could manage any type of file format that could be stored on the network. The applications grew to encompass electronic documents, collaboration tools, security, workflow, and auditing capabilities.

An electronic document is any electronic media content that is intended to be used in either an electronic form or as printed output. Originally, any computer data were considered as something internal: the final data output was always on paper. However, the development of computer networks has made it much more convenient in most cases to distribute electronic documents than printed ones. Improvements in electronic visual display technologies made it possible to view documents on screen instead of printing them.

A file format is a standard way that information is encoded for storage in a computer file. It specifies how bits are used to encode information in a digital storage medium. File formats may be either proprietary or free and may be either unpublished or open.

Document imaging is an information technology category for systems capable of replicating documents commonly used in business. Document imaging systems can take many forms including microfilm, on demand printers, facsimile machines, copiers, multifunction printers, document scanners, computer output microfilm (COM) and archive writers. Document Imaging means the conversion of paper files or microfilm / fiche to digital images.

These systems enabled an organization to capture faxes and forms, to save copies of the documents as images, and to store the image files in the repository for security and quick retrieval (retrieval made possible because the system handled the extraction of the text from the document in the process of capture, and the text-indexer function provided text-retrieval capabilities).

In information technology, a repository is "a central place in which an aggregation of data is kept and maintained in an organized way, usually in computer storage." It "may be just the aggregation of data itself into some accessible place of storage or it may also imply some ability to selectively extract data."

While many EDM systems store documents in their native file format (Microsoft Word or Excel, PDF), some web-based document management systems are beginning to store content in the form of HTML. These policy management systems [1] require content to be imported into the system. However, once content is imported, the software acts like a search engine so users can find what they are looking for faster. The HTML format allows for better application of search capabilities such as full-text searching and stemming. [2]
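As an illustration of how full-text searching and stemming fit together, the following is a minimal sketch, not a description of any particular product. It builds an inverted index over a few hypothetical policy documents and matches queries after reducing words to a common stem. The `stem` function is a deliberately crude suffix stripper standing in for a real stemming algorithm, and all names and sample texts are assumptions made for the example.

```python
import re
from collections import defaultdict

# Naive, illustrative suffix list; real stemmers use far more careful rules.
SUFFIXES = ("ing", "ed", "es", "s")


def stem(word: str) -> str:
    """Strip one common English suffix, keeping at least three letters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def tokenize(text: str) -> list:
    """Lower-case the text, split it into words, and stem each word."""
    return [stem(w) for w in re.findall(r"[a-z]+", text.lower())]


def build_index(docs: dict) -> dict:
    """Map each stemmed term to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index: dict, query: str) -> set:
    """Return ids of documents that contain every stemmed query term."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Hypothetical sample documents.
docs = {
    "policy-1": "Employees must report incidents promptly.",
    "policy-2": "Reporting an incident requires a signed form.",
    "policy-3": "Travel expenses are reimbursed monthly.",
}
index = build_index(docs)
matches = search(index, "reported incidents")
```

Because "reported", "reporting" and "report" all reduce to the same stem, the query matches both `policy-1` and `policy-2` even though neither contains the exact word "reported".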


Document management systems commonly provide storage, versioning, metadata, security, as well as indexing and retrieval capabilities. Here is a description of these components:

Metadata: Metadata is typically stored for each document. Metadata may, for example, include the date the document was stored and the identity of the user storing it. The DMS may also extract metadata from the document automatically or prompt the user to add metadata. Some systems also use optical character recognition on scanned images, or perform text extraction on electronic documents. The resulting extracted text can be used to assist users in locating documents by identifying probable keywords or providing for full text search capability, or can be used on its own. Extracted text can also be stored as a component of metadata, stored with the document, or separately from the document as a source for searching document collections. [3]
Integration: Many document management systems attempt to provide document management functionality directly to other applications, so that users may retrieve existing documents directly from the document management system repository, make changes, and save the changed document back to the repository as a new version, all without leaving the application. Such integration is commonly available for a variety of software tools such as workflow management and content management systems, typically through an application programming interface (API) using open standards such as ODMA, LDAP, WebDAV, and SOAP or RESTful web services. [4] [5]
Capture: Capture primarily involves accepting and processing images of paper documents from scanners or multifunction printers. Optical character recognition (OCR) software is often used, whether integrated into the hardware or as stand-alone software, in order to convert digital images into machine-readable text. Optical mark recognition (OMR) software is sometimes used to extract values of check-boxes or bubbles. Capture may also involve accepting electronic documents and other computer-based files. [6]
Data validation: Data validation rules can check for document failures, missing signatures, misspelled names, and other issues, recommending real-time correction options before importing data into the DMS. Additional processing in the form of harmonization and data format changes may also be applied as part of data validation. [7] [8]
Indexing: Indexing tracks electronic documents. Indexing may be as simple as keeping track of unique document identifiers, but often it takes a more complex form, providing classification through the documents' metadata or even through word indexes extracted from the documents' contents. Indexing exists mainly to support information query and retrieval. One area of critical importance for rapid retrieval is the creation of an index topology or scheme. [9]
Storage: Storage holds electronic documents. It often includes management of those same documents: where they are stored, for how long, migration of the documents from one storage medium to another (hierarchical storage management), and eventual document destruction.
Retrieval: Retrieval fetches electronic documents from storage. Although the notion of retrieving a particular document is simple, retrieval in the electronic context can be quite complex and powerful. Simple retrieval of individual documents can be supported by allowing the user to specify the unique document identifier, and having the system use the basic index (or a non-indexed query on its data store) to retrieve the document. [9] More flexible retrieval allows the user to specify partial search terms involving the document identifier and/or parts of the expected metadata. This would typically return a list of documents which match the user's search terms. Some systems provide the capability to specify a Boolean expression containing multiple keywords or example phrases expected to exist within the documents' contents. The retrieval for this kind of query may be supported by previously built indexes, [9] or may perform more time-consuming searches through the documents' contents to return a list of the potentially relevant documents. See also Document retrieval.
Distribution: A document ready for distribution has to be in a format that cannot be easily altered. An original master copy of the document is usually never used for distribution; rather, an electronic link to the document itself is more common. If a document is to be distributed electronically in a regulatory environment, then additional criteria must be met, including assurances of traceability and versioning, even across other systems. [10] Where the integrity of the document is imperative, these criteria apply to both of the systems between which the document is exchanged.
Security: Document security is vital in many document management applications. Compliance requirements for certain documents can be quite complex depending on the type of documents. For instance, standards such as ISO 9001 and ISO 13485, as well as, in the United States, Food and Drug Administration regulations, dictate how the document control process should be addressed. [11] Document management systems may have a rights management module that allows an administrator to give access to documents based on type to only certain people or groups of people. Document marking at the time of printing or PDF creation is an essential element to preclude alteration or unintended use.
Workflow: Workflow is a complex process, and some document management systems have either a built-in workflow module [12] or can integrate with workflow management tools. [5] There are different types of workflow. Usage depends on the environment to which the electronic document management system (EDMS) is applied. Manual workflow requires a user to view the document and decide whom to send it to. Rules-based workflow allows an administrator to create a rule that dictates the flow of the document through an organization: for instance, an invoice passes through an approval process and then is routed to the accounts-payable department. Dynamic rules allow for branches to be created in a workflow process. For example, an invoice whose amount is lower than a set threshold may follow a different route through the organization than one above it. Advanced workflow mechanisms can manipulate content or signal external processes while these rules are in effect.
Collaboration: Collaboration should be inherent in an EDMS. In its basic form, a collaborative EDMS should allow documents to be retrieved and worked on by an authorized user. Access should be blocked to other users while work is being performed on the document. Other advanced forms of collaboration act in real time, allowing multiple users to view and modify (or mark up) documents at the same time. The resulting document is comprehensive, including all users' additions. Collaboration within document management systems means that the various markups by each individual user during the collaboration session are recorded, allowing document history to be monitored. [13]
Versioning: Versioning is a process by which documents are checked in or out of the document management system, allowing users to retrieve previous versions and to continue work from a selected point. Versioning is useful for documents that change over time and require updating, where it may be necessary to go back to or reference a previous copy. [13]
Searching: Searching finds documents and folders using template attributes or full text search. Documents can be searched using various attributes and document content.
Federated search: This refers to the capability to extend search capabilities to draw results from multiple sources, or from multiple DMSes within an enterprise. [14]
Publishing: Publishing a document involves procedures such as proofreading, peer or public review, authorization, printing, and approval. These steps ensure prudence and logical thinking. Careless handling may introduce inaccuracies into the document and so mislead or upset its users and readers. In legally regulated industries, completion of some of these procedures must be evidenced by the corresponding signatures and the date(s) on which the document was signed. Refer to the ISO divisions of ICS 01.140.40 and 35.240.30 for further information. [15] [16]

The published document should be in a format that is not easily altered without specific knowledge or tools, while remaining read-only or portable. [17]

Hard copy reproduction: Document/image reproduction is often necessary within a document management system, and its supported output devices and reproduction capabilities should be considered. [18]
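The check-out, blocking, and version-history behaviour described under Collaboration and Versioning can be sketched roughly as follows. This is a simplified, in-memory illustration under stated assumptions, not the design of any real DMS: the class `ManagedDocument`, the exception `CheckoutError`, and the user names are all hypothetical, and a production system would also persist versions and record who made each change.

```python
from dataclasses import dataclass, field
from typing import List, Optional


class CheckoutError(Exception):
    """Raised when a user may not check a document out or in."""


@dataclass
class ManagedDocument:
    doc_id: str
    versions: List[str] = field(default_factory=list)  # oldest version first
    locked_by: Optional[str] = None  # user currently holding the check-out

    def check_out(self, user: str) -> str:
        """Lock the document for one user and return the latest content."""
        if self.locked_by is not None and self.locked_by != user:
            raise CheckoutError(f"{self.doc_id} is checked out by {self.locked_by}")
        self.locked_by = user
        return self.versions[-1] if self.versions else ""

    def check_in(self, user: str, content: str) -> int:
        """Store new content as the next version and release the lock."""
        if self.locked_by != user:
            raise CheckoutError(f"{user} has not checked out {self.doc_id}")
        self.versions.append(content)
        self.locked_by = None
        return len(self.versions)  # 1-based version number

    def get_version(self, number: int) -> str:
        """Retrieve an earlier version by its 1-based number."""
        if not 1 <= number <= len(self.versions):
            raise IndexError(f"no version {number} of {self.doc_id}")
        return self.versions[number - 1]


# Hypothetical usage: two users editing one procedure document in turn.
doc = ManagedDocument("SOP-001")
doc.check_out("alice")
v1 = doc.check_in("alice", "Draft procedure")

doc.check_out("bob")
try:
    doc.check_out("alice")  # blocked while bob holds the lock
    blocked = False
except CheckoutError:
    blocked = True
v2 = doc.check_in("bob", "Revised procedure")
```

Keeping every checked-in version in an append-only list is what lets `get_version(1)` still return the original draft after the revision has been stored.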


Many industry associations publish their own lists of document control standards used in their particular field. Some of the relevant ISO documents fall under ICS divisions 01.140.10 and 01.140.20. [19] [20] The ISO has also published a series of standards regarding technical documentation, covered by division 01.110. [21]

Document control

Government regulations require that companies working in certain industries control their documents. These industries include accounting (for example: 8th EU Directive, Sarbanes–Oxley Act), food safety (e.g., Food Safety Modernization Act), ISO (mentioned above), medical-device manufacturing (FDA), manufacture of blood, human cells, and tissue products (FDA), healthcare (JCAHO), and information technology (ITIL). [22] Some industries work under stricter document control requirements due to the type of information they retain for privacy, warranty, or other highly regulated purposes. Examples include Protected Health Information (PHI) as required by HIPAA or construction project documents required for warranty periods. An information systems strategy plan (ISSP) can shape organisational information systems over medium to long-term periods. [23]

Documents stored in a document management system—such as procedures, work instructions, and policy statements—provide evidence of documents under control. Failing to comply can cause fines, the loss of business, or damage to a business's reputation.

The following are important aspects of document control:

Integrated DM

Integrated document management comprises the technologies, tools, and methods used to capture, manage, store, preserve, deliver and dispose of 'documents' across an enterprise. In this context 'documents' are any of a myriad of information assets including images, office documents, graphics, and drawings as well as the new electronic objects such as Web pages, email, instant messages, and video.

Document management software

Paper documents have long been used in storing information. However, paper can be costly and, if used excessively, wasteful. Document management software is more than a simple tool: it lets a user manage access to, track, and edit stored information. It acts as an electronic cabinet that can be used to organize all paper and digital files. [24] The software helps businesses combine paper and digital files and store them in a single hub once paper has been scanned and digital formats have been imported. [25] Web-based document management software is becoming the staple of the industry.



  1. Policy Management System Archived 29 October 2011 at the Wayback Machine
  2. Stemming: Making searching easier Archived 11 January 2012 at the Wayback Machine
  3. Parsons, M. (2004). Effective Knowledge Management for Law Firms. Oxford University Press. p. 234. ISBN 9780195169683. Retrieved 19 May 2018.
  4. Shivakumar, S.K. (2016). Enterprise Content and Search Management for Building Digital Platforms. John Wiley & Sons. p. 93. ISBN 9781119206828. Retrieved 19 May 2018.
  5. Fletcher, A.N.; Brahm, M.; Pargmann, H. (2003). Workflow Management with SAP WebFlow: A Practical Manual. Springer Science & Business Media. pp. 15–16. ISBN 9783540404033. Retrieved 19 May 2018.
  6. Webber, M.; Webber, L. (2016). It Governance: Policies and Procedures. Wolters Kluwer. p. 41-4. ISBN 9781454871323. Retrieved 19 May 2018.
  7. Trinchieri, D. (2003). Evaluation of Integrated Document Management System (IDMS) Options for the Arizona Department of Transportation (ADOT). Arizona Department of Transportation. p. 158. The data validation rules should be embedded in the form itself, rather than accomplished in a post-processing environment. This provides the use an interactive real-time experience. Often data validation requires a database look-up. The rules should allow this database query, providing the user real-time choices based on query results.
  8. Morley, D.; Parker, C.S. (2014). Understanding Computers: Today and Tomorrow, Comprehensive. Cengage Learning. pp. 558–559. ISBN 9781285767277. Retrieved 19 May 2018.
  9. Meurant, G. (2012). Introduction to Electronic Document Management Systems. Academic Press. p. 120. ISBN 9780323140621. Retrieved 19 May 2018.
  10. Sommerville, J.; Craig, N. (2006). Implementing IT in Construction. Routledge. p. 130. ISBN 9781134198986. Retrieved 19 May 2018.
  11. Skipper, S.L. (2015). How to Establish a Document Control System for Compliance with ISO 9001:2015, ISO 13485:2016, and FDA Requirements. ASQ Quality Press. p. 156. ISBN 9780873899178. Retrieved 19 May 2018.
  12. Austerberry, D. (2012). Digital Asset Management. CRC Press. pp. 27–28. ISBN 9781136033629. Retrieved 19 May 2018.
  13. Austerberry, D. (2012). Digital Asset Management. CRC Press. p. 30. ISBN 9781136033629. Retrieved 19 May 2018.
  14. White, M. (2012). Enterprise Search. O'Reilly Media, Inc. pp. 73–74. ISBN 9781449330408. Retrieved 19 May 2018.
  15. International Organization for Standardization. "01.140.40: Publishing". Archived from the original on 6 June 2011. Retrieved 14 July 2008.
  16. International Organization for Standardization. "35.240.30: IT applications in information, documentation and publishing". Archived from the original on 6 June 2011. Retrieved 14 July 2008.
  17. OnSphere Corporation. "SOP Document Management in a Validated Environments" (PDF). Archived from the original (PDF) on 4 September 2011. Retrieved 25 April 2011.
  18. Meurant, G. (2012). Introduction to Electronic Document Management Systems. Academic Press. p. 16. ISBN 9780323140621. Retrieved 19 May 2018.
  19. International Organization for Standardization. "01.140.10: Writing and transliteration". Archived from the original on 7 July 2009. Retrieved 14 July 2008.
  20. International Organization for Standardization. "01.140.20: Information sciences". Archived from the original on 5 December 2008. Retrieved 14 July 2008.
  21. International Organization for Standardization. "01.110: Technical product documentation". Archived from the original on 6 June 2011. Retrieved 15 July 2008.
  22. "Code of Federal Regulations Title 21, Part 1271". Food and Drug Administration. Archived from the original on 10 October 2011. Retrieved 31 January 2012.
  23. Wiggins, Bob (2000). Effective Document Management: Unlocking Corporate Knowledge (2 ed.). Gower. p. 25. ISBN 9780566081484. Archived from the original on 13 January 2018. Retrieved 9 April 2016. At the organisational level an information systems strategy plan (ISSP) is a way to determine in general terms what information systems an organisation should have in place over the medium to long term (typically around three to five years [...]).
  24. "Document Management Systems - A Buyer's Guide". Archived from the original on 10 January 2017. Retrieved 10 January 2017.
  25. Chaouni, Mamoun (5 February 2015). "7 Powerful Advantages of Using a Document Management System" Archived 10 January 2017 at the Wayback Machine