Clinical data management (CDM) is a critical process in clinical research that leads to the generation of high-quality, reliable, and statistically sound data from clinical trials.[1] Clinical data management ensures the collection, integration, and availability of data at appropriate quality and cost. It also supports the conduct, management, and analysis of studies across the spectrum of clinical research as defined by the National Institutes of Health (NIH). The ultimate goal of CDM is to ensure that conclusions drawn from research are well supported by the data. Achieving this goal protects public health and increases confidence in marketed therapeutics.
Job profiles relevant to CDM include clinical researcher, clinical research associate, and clinical research coordinator. The clinical data manager plays a key role in the setup and conduct of a clinical trial. The data collected during a clinical trial form the basis of subsequent safety and efficacy analyses, which in turn drive decision making on product development in the pharmaceutical industry. The clinical data manager is involved in early discussions about data collection options and then oversees development of data collection tools based on the clinical trial protocol. Once subject enrollment begins, the data manager ensures that data are collected, validated, complete, and consistent. The clinical data manager liaises with other data providers (e.g. a central laboratory processing collected blood samples) and ensures that such data are transmitted securely and are consistent with other data collected in the clinical trial. At the completion of the clinical trial, the clinical data manager ensures that all data expected to be captured have been accounted for and that all data management activities are complete. At this stage, the data are declared final (terminology varies, but common descriptions are "Database Lock", "Data Lock" and "Database Freeze"), and the clinical data manager transfers the data for statistical analysis.
Standard operating procedures (SOPs) describe the process to be followed in conducting data management activities and support the obligation to follow applicable laws and guidelines (e.g. ICH GCP and 21CFR Part 11) in the conduct of data management activities.
The data management plan describes the activities to be conducted in the course of processing data. Key topics to cover include the SOPs to be followed, the clinical data management system (CDMS) to be used, a description of data sources, data handling processes, data transfer formats and processes, and quality control procedures.
The case report form (CRF) is the data collection tool for the clinical trial and can be paper or electronic. Paper CRFs are printed, often on No Carbon Required paper, and shipped to the investigative sites conducting the clinical trial for completion, after which they are couriered back to data management. Electronic CRFs enable data to be typed directly into fields using a computer and transmitted electronically to data management. CRF design needs to take into account the information required to be collected by the clinical trial protocol and intended for inclusion in statistical analysis. Where available, standard CRF pages may be re-used for the collection of data common across most clinical trials, e.g. subject demographics.[2][3] Apart from CRF design, electronic trial design also includes edit check programming. Edit checks are used to fire a query message when discrepant data are entered, to map certain data points from one CRF page to another, and to calculate derived fields such as subject age and BMI. Edit checks help investigators enter correct data at the moment of entry and help increase the quality of the clinical trial data.
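The derived-field and range checks described above can be sketched as follows; this is a minimal illustration with hypothetical field names and ranges, not the API of any particular EDC system.

```python
from datetime import date

def derive_age(birth_date: date, visit_date: date) -> int:
    """Derived field: subject age in whole years at the visit date."""
    years = visit_date.year - birth_date.year
    if (visit_date.month, visit_date.day) < (birth_date.month, birth_date.day):
        years -= 1
    return years

def derive_bmi(weight_kg: float, height_cm: float) -> float:
    """Derived field: body mass index from weight and height."""
    height_m = height_cm / 100
    return round(weight_kg / height_m ** 2, 1)

def check_systolic_bp(value: int):
    """Range edit check: fire a query message for an out-of-range entry.

    The 60-250 mmHg range is illustrative only.
    """
    if not 60 <= value <= 250:
        return f"Query: systolic BP {value} mmHg outside expected range 60-250."
    return None

age = derive_age(date(1980, 5, 20), date(2024, 5, 19))  # → 43
bmi = derive_bmi(70.0, 175.0)                           # → 22.9
query = check_systolic_bp(300)                          # fires a query message
```

In a real system such checks are configured within the EDC tool rather than written as standalone functions, but the logic is of this shape.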
For a clinical trial utilizing an electronic CRF, database design and CRF design are closely linked: the electronic CRF enables entry of data into an underlying relational database. For a clinical trial utilizing a paper CRF, the relational database is built separately. In both cases, the relational database allows entry of all data captured on the case report form.
All computer systems used in the processing and management of clinical trial data must undergo validation testing to ensure that they perform as intended and that results are reproducible.
The Clinical Data Interchange Standards Consortium leads the development of global, system-independent data standards which are now commonly used as the underlying data structures for clinical trial data. These describe parameters such as the name, length, and format of each data field (variable) in the relational database.
Validation Rules are electronic checks defined in advance which ensure the completeness and consistency of the clinical trial data.
Once an electronic CRF (eCRF) is built, the clinical data manager (and other parties as appropriate) conducts user acceptance testing (UAT). The tester enters test data into the eCRF and records whether it functions as intended. UAT continues until all identified issues are resolved.
When an electronic CRF is in use, data entry is carried out at the investigative site where the clinical trial is conducted, by site staff who have been granted appropriate access to do so.
When using a paper CRF, the pages are entered by data entry operators. Best practice is for a first pass of data entry to be completed, followed by a second pass or verification step by an independent operator. Any discrepancies between the first and second pass are resolved so that the data entered are a true reflection of those recorded on the CRF. Where an operator is unable to read an entry, the clinical data manager should be notified so that the entry may be clarified with the person who completed the CRF.
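The second-pass verification step amounts to a field-by-field comparison of the two independent entries; a minimal sketch, assuming a simple record layout of field names mapped to entered values:

```python
def verify_double_entry(first_pass: dict, second_pass: dict) -> list:
    """Compare two independent data entry passes field by field.

    Returns a list of (field, first value, second value) tuples for every
    mismatch, which an operator then resolves against the paper CRF.
    """
    discrepancies = []
    for field in first_pass:
        if first_pass[field] != second_pass.get(field):
            discrepancies.append((field, first_pass[field], second_pass.get(field)))
    return discrepancies

# Hypothetical example: the two operators disagree on one field.
pass1 = {"subject_id": "001", "weight_kg": "70", "visit_date": "2024-05-19"}
pass2 = {"subject_id": "001", "weight_kg": "76", "visit_date": "2024-05-19"}
print(verify_double_entry(pass1, pass2))
# → [('weight_kg', '70', '76')]
```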
Data validation is the application of validation rules to the data. For electronic CRFs, the validation rules may be applied in real time at the point of entry. Offline validation may still be required (e.g. for cross-checks between data types).
Where entered data do not pass the validation rules, a data query may be issued to the investigative site where the clinical trial is conducted to request clarification of the entry. Data queries must not be leading (i.e. they must not suggest the correction that should be made). For electronic CRFs, only site staff with appropriate access may modify data entries. For paper CRFs, the clinical data manager applies the data query response to the database, and a copy of the data query is retained at the investigative site. When an item or variable has an error or a query raised against it, it is said to have a "discrepancy" or "query".
EDC systems typically include a discrepancy management tool, also referred to as an "edit check" or "validation check", which is programmed using a programming language such as SAS, PL/SQL, C#, SQL, or Python.
A query is an error message generated when a validation check detects a problem with the data. Validation checks are run automatically whenever a page is saved ("submitted") and can identify problems with a single variable, between two or more variables on the same eCRF page, or between variables on different pages. A variable can have multiple validation checks associated with it.
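The three scopes of check described above (single-variable, same-page, and cross-page) can be illustrated with a small sketch; the variable names, pages, and messages are hypothetical, and real checks are configured inside the EDC system rather than written this way.

```python
from datetime import date

def run_checks(ae_page: dict, enrollment_page: dict) -> list:
    """Return the query messages fired for an adverse event (AE) page."""
    queries = []
    # Single-variable check: value must come from a fixed code list.
    if ae_page["severity"] not in {"mild", "moderate", "severe"}:
        queries.append("Severity must be mild, moderate, or severe.")
    # Same-page cross-variable check: end date cannot precede start date.
    if ae_page["end_date"] and ae_page["end_date"] < ae_page["start_date"]:
        queries.append("AE end date precedes AE start date.")
    # Cross-page check: AE cannot start before informed consent was given.
    if ae_page["start_date"] < enrollment_page["consent_date"]:
        queries.append("AE start date precedes informed consent.")
    return queries

ae = {"severity": "mild", "start_date": date(2024, 1, 5), "end_date": None}
enrollment = {"consent_date": date(2024, 2, 1)}
print(run_checks(ae, enrollment))
# → ['AE start date precedes informed consent.']
```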
Errors can be resolved in several ways.
Samples collected during a clinical trial may be sent to a single central laboratory for analysis. The clinical data manager liaises with the central laboratory and agrees on data formats and transfer schedules in a data transfer agreement. The sample collection date and time may be reconciled against the CRF to ensure that all samples collected have been analyzed.
Analysis of clinical trial data may be carried out by laboratories, image processing specialists, or other third parties. The clinical data manager liaises with such data providers and agrees on data formats and transfer schedules. Data may be reconciled against the CRF to ensure consistency.
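Reconciliation of externally provided data against the CRF is essentially a two-way set comparison on an agreed key (here, an assumed subject-and-date key; the record layout is illustrative):

```python
def reconcile(crf_samples: set, lab_samples: set) -> dict:
    """Compare sample records on the CRF with a laboratory transfer file.

    A mismatch in either direction (a sample on the CRF missing from the
    transfer, or a transferred result with no CRF record) triggers follow-up
    with the site or the data provider.
    """
    return {
        "on_crf_not_in_lab": sorted(crf_samples - lab_samples),
        "in_lab_not_on_crf": sorted(lab_samples - crf_samples),
    }

# Hypothetical keys of (subject ID, collection date):
crf = {("001", "2024-05-19"), ("002", "2024-05-20")}
lab = {("001", "2024-05-19"), ("003", "2024-05-21")}
print(reconcile(crf, lab))
# → {'on_crf_not_in_lab': [('002', '2024-05-20')],
#    'in_lab_not_on_crf': [('003', '2024-05-21')]}
```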
The CRF collects adverse events reported during the conduct of the clinical trial; however, a separate process ensures that serious adverse events are reported quickly. The clinical data manager must ensure that data are reconciled between these processes.
Where the subject is required to record data (e.g. daily symptoms), a diary is provided for completion. Management of this data requires a different approach from CRF data as, for example, it is generally not practical to raise data queries. Patient diaries may be developed in either paper or electronic (eDiary) format. An eDiary generally takes the form of a handheld device that enables the subject to enter the required data and transmits it to a centralized server.
Once all expected data are accounted for, all data queries are closed, all external data have been received and reconciled, and all other data management activities are complete, the database may be finalized.
A variety of status reports is generated and used by the clinical data manager throughout this process.
Quality control is applied at various stages of the clinical data management process and is normally mandated by SOPs.
In computing, a database is an organized collection of data stored and accessed electronically through the use of a database management system. Small databases can be stored on a file system, while large databases are hosted on computer clusters or cloud storage. The design of databases spans formal techniques and practical considerations, including data modeling, efficient data representation and storage, query languages, security and privacy of sensitive data, and distributed computing issues, including supporting concurrent access and fault tolerance.
A relational database is a database based on the relational model of data, as proposed by E. F. Codd in 1970. A system used to maintain relational databases is a relational database management system (RDBMS). Many relational database systems are equipped with the option of using SQL for querying and updating the database.
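A minimal sketch of the relational model as it might apply to clinical trial data, using Python's built-in sqlite3 module; the schema (a subject table referenced by a vitals table) is purely illustrative:

```python
import sqlite3

# In-memory relational database: each vitals row references a subject
# via a foreign key, and SQL is used to query across the two tables.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE subject (id TEXT PRIMARY KEY, arm TEXT)")
conn.execute("""CREATE TABLE vitals (
    subject_id TEXT REFERENCES subject(id),
    visit TEXT,
    systolic_bp INTEGER)""")

conn.execute("INSERT INTO subject VALUES ('001', 'treatment')")
conn.execute("INSERT INTO vitals VALUES ('001', 'baseline', 120)")

# A join reassembles related facts stored in separate tables.
row = conn.execute(
    "SELECT s.arm, v.systolic_bp FROM subject s "
    "JOIN vitals v ON v.subject_id = s.id").fetchone()
print(row)  # → ('treatment', 120)
```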
A database transaction symbolizes a unit of work, performed within a database management system against a database, that is treated in a coherent and reliable way independent of other transactions. A transaction generally represents any change in a database. Transactions in a database environment have two main purposes: to provide reliable units of work that allow correct recovery from failures and keep a database consistent even in cases of system failure, and to provide isolation between programs accessing a database concurrently.
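The all-or-nothing character of a transaction can be demonstrated with sqlite3, whose connection context manager commits on success and rolls back on error (the ledger table is an illustrative example):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ledger (entry TEXT NOT NULL)")

try:
    # The two inserts below form one unit of work: the second violates the
    # NOT NULL constraint, so the whole transaction is rolled back and the
    # first insert is undone as well.
    with conn:
        conn.execute("INSERT INTO ledger VALUES ('debit')")
        conn.execute("INSERT INTO ledger VALUES (NULL)")
except sqlite3.IntegrityError:
    pass

count = conn.execute("SELECT COUNT(*) FROM ledger").fetchone()[0]
print(count)  # → 0: neither row was committed
```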
In computer science, data validation is the process of ensuring that data have undergone data cleansing and are of sufficient quality, that is, that they are both correct and useful. It uses routines, often called "validation rules", "validation constraints", or "check routines", that check for the correctness, meaningfulness, and security of data input to the system. The rules may be implemented through the automated facilities of a data dictionary or by the inclusion of explicit validation logic in the application program.
Acquisition or collection of clinical trial data can be achieved through various methods that may include, but are not limited to, any of the following: paper or electronic medical records, paper forms completed at a site, interactive voice response systems, local electronic data capture systems, or central web based systems.
The Clinical Data Interchange Standards Consortium (CDISC) is a standards developing organization (SDO) dealing with medical research data linked with healthcare, to "enable information system interoperability to improve medical research and related areas of healthcare". The standards support medical research from protocol through analysis and reporting of results and have been shown to decrease resources needed by 60% overall and 70–90% in the start-up stages when they are implemented at the beginning of the research process.
An electronic data capture (EDC) system is a computerized system designed for the collection of clinical data in electronic format for use mainly in human clinical trials. EDC replaces the traditional paper-based data collection methodology to streamline data collection and expedite the time to market for drugs and medical devices. EDC solutions are widely adopted by pharmaceutical companies and contract research organizations (CRO).
Oracle Clinical or OC is a database management system designed by Oracle to provide data management, data entry and data validation functionalities to support Clinical Trial operations.
An entity–attribute–value model (EAV) is a data model optimized for the space-efficient storage of sparse or ad hoc property or data values, intended for situations where runtime usage patterns are arbitrary, subject to user variation, or otherwise unforeseeable using a fixed design. The use case targets applications which offer a large or rich system of defined property types, which are in turn appropriate to a wide set of entities, but where typically only a small, specific selection of these is instantiated for a given entity. This type of data model therefore relates to the mathematical notion of a sparse matrix.
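The EAV idea can be sketched in a few lines: instead of one column per possible attribute, each recorded fact is a (entity, attribute, value) row, and absent attributes simply have no row. The entities and attributes below are hypothetical.

```python
# Sparse properties stored as (entity, attribute, value) triples.
eav_rows = [
    ("subject-001", "smoker", "no"),
    ("subject-001", "systolic_bp", 120),
    ("subject-002", "allergy", "penicillin"),  # attribute absent for 001
]

def properties_of(entity: str) -> dict:
    """Reassemble an entity's sparse properties into a conventional record."""
    return {attr: val for ent, attr, val in eav_rows if ent == entity}

print(properties_of("subject-001"))
# → {'smoker': 'no', 'systolic_bp': 120}
```

The trade-off is that reassembling a record requires a pivot like `properties_of`, whereas a fixed-column design would waste storage on the many attributes most entities never use.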
A case report form is a paper or electronic questionnaire specifically used in clinical trial research. The case report form is the tool used by the sponsor of the clinical trial to collect data from each participating patient. All data on each patient participating in a clinical trial are held and/or documented in the CRF, including adverse events.
A data clarification form (DCF) or data query form is a questionnaire specifically used in clinical research. The DCF is the primary data clarification tool from the trial sponsor or contract research organization (CRO) towards the investigator to clarify discrepancies and ask the investigator for clarification. The DCF is part of the data validation process in a clinical trial.
A clinical data management system, or CDMS, is a tool used in clinical research to manage the data of a clinical trial. The clinical trial data gathered at the investigator site in the case report form are stored in the CDMS. To reduce the possibility of errors due to human entry, the systems employ various means to verify the data. Systems for clinical data management can be self-contained or part of the functionality of a CTMS. A CTMS with clinical data management functionality can help with the validation of clinical data and can also support the site in other important activities, such as building patient registries and assisting in patient recruitment efforts.
A Clinical Trial Management System (CTMS) is a software system used by biotechnology and pharmaceutical industries to manage clinical trials in clinical research. The system maintains and manages planning, performing and reporting functions, along with participant contact information, tracking deadlines and milestones.
Good clinical data management practice (GCDMP) describes the current industry standards for clinical data management, consisting of best business practice and acceptable regulatory standards. In all phases of clinical trials, clinical and laboratory information must be collected and converted to digital form for analysis and reporting purposes. The U.S. Food and Drug Administration and the International Conference on Harmonisation of Technical Requirements for Registration of Pharmaceuticals for Human Use have provided specific regulations and guidelines surrounding this component of the drug and device development process. The effective, efficient, and regulatory-compliant management of clinical trial data is an essential component of drug and device development.
A Clinical Research Coordinator (CRC) is a person responsible for conducting clinical trials using good clinical practice (GCP) under the auspices of a Principal Investigator (PI).
Caisis is an open-source, web-based, patient data management system that integrates research with patient care. The system is freely distributed to promote the collection of standard, well structured data suitable for research and multi-institution collaboration.
A clinical trial portal is a web portal or enterprise portal that primarily serves sponsors and investigators in a clinical trial. Clinical portals can be developed for a particular study, however study-specific portals may be part of larger, clinical sponsor or Contract Research Organization (CRO) portals that cover multiple trials. A clinical portal is typically developed by a sponsor or CRO to facilitate centralized access to relevant information, documentation and online applications by investigational sites participating in a trial, as well as for the monitors, study managers, data managers, medical, safety and regulatory staff that help plan, conduct, manage and review the trial.
An electronic patient-reported outcome (ePRO) is a patient-reported outcome that is collected by electronic methods. ePRO methods are most commonly used in clinical trials, but they are also used elsewhere in health care. As a function of the regulatory process, a majority of ePRO questionnaires undergo the linguistic validation process. When the data is captured for a clinical trial, the data is considered a form of Electronic Source Data.
An electronic trial master file (eTMF) is a trial master file in electronic format. It is a type of content management system for the pharmaceutical industry, providing a formalized means of organizing and storing documents, images, and other digital content for pharmaceutical clinical trials that may be required for compliance with government regulatory agencies. The term eTMF encompasses strategies, methods and tools used throughout the lifecycle of the clinical trial regulated content. An eTMF system consists of software and hardware that facilitates the management of regulated clinical trial content. Regulatory agencies have outlined the required components of eTMF systems that use electronic means to store the content of a clinical trial, requiring that they include: Digital content archiving, security and access control, change controls, audit trails, and system validation.