Data collection

Example of data collection in the biological sciences: Adélie penguins are identified and weighed each time they cross the automated weighbridge on their way to or from the sea.

Data collection or data gathering is the process of gathering and measuring information on targeted variables in an established system, which then enables one to answer relevant questions and evaluate outcomes. Data collection is a research component in all study fields, including physical and social sciences, humanities, [2] and business. While methods vary by discipline, the emphasis on ensuring accurate and honest collection remains the same. The goal for all data collection is to capture evidence that allows data analysis to lead to the formulation of credible answers to the questions that have been posed.

Regardless of the field of study or the preferred way of defining data (quantitative or qualitative), accurate data collection is essential to maintaining research integrity. Selecting appropriate data collection instruments (existing, modified, or newly developed) and providing clearly delineated instructions for their correct use reduce the likelihood of errors.

Methodology

Data collection and validation consists of four steps when it involves taking a census and seven steps when it involves sampling. [3]

A formal data collection process is necessary, as it ensures that the data gathered are both defined and accurate. This way, subsequent decisions based on arguments embodied in the findings are made using valid data. [4] The process provides both a baseline from which to measure and in certain cases an indication of what to improve.

Tools

Data collection system

Data management platform

Data management platforms (DMPs) are centralized storage and analytical systems for data, used mainly in marketing. DMPs compile large amounts of demand and supply data and transform them into discernible information. Marketers may want to receive and use first-, second-, and third-party data. DMPs enable this because they act as the aggregate system for demand-side platforms (DSPs) and supply-side platforms (SSPs). DMPs are integral to optimizing future advertising campaigns.

Data integrity issues

The main reason for maintaining data integrity is to support the detection of errors in the data collection process. Those errors may be intentional (deliberate falsification) or unintentional (random or systematic errors). [5]
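
As a rough illustration of the difference between systematic and random errors, the sketch below compares repeated readings of a reference object whose true value is known. The function name, the 5.000 kg reference mass, and the readings are all hypothetical and are not taken from any cited study.

```python
from statistics import mean, stdev


def summarize_measurement_error(readings, reference_value):
    """Split the error in repeated readings of a known reference into
    a systematic part (bias) and a random part (spread)."""
    errors = [reading - reference_value for reading in readings]
    bias = mean(errors)     # systematic error: a consistent offset
    spread = stdev(errors)  # random error: scatter around that offset
    return bias, spread


# Hypothetical data: a scale reads a 5.000 kg calibration mass five times.
readings = [5.12, 5.08, 5.11, 5.09, 5.10]
bias, spread = summarize_measurement_error(readings, reference_value=5.000)
print(f"bias = {bias:.3f} kg, spread = {spread:.3f} kg")
```

Under these assumptions, a bias that is large relative to the spread points to a systematic problem such as a miscalibrated instrument, whereas a large spread with little bias points to random error.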

There are two approaches that may protect data integrity and secure scientific validity of study results: [6]

Quality assurance (QA)

QA focuses on prevention, which is primarily a cost-effective way to protect the integrity of data collection. Standardization of protocol, with comprehensive and detailed procedure descriptions for data collection, is central to prevention. Poorly written guidelines often increase the risk of failing to identify problems and errors in the research process. Examples of such failures include:

  • Uncertainty of timing, methods and identification of the responsible person
  • Partial listing of items needed to be collected
  • Vague description of data collection instruments instead of rigorous step-by-step instructions on administering tests
  • Failure to recognize exact content and strategies for training and retraining staff members responsible for data collection
  • Unclear instructions for using, making adjustments to, and calibrating data collection equipment
  • No predetermined mechanism to document changes in procedures that occur during the investigation
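
The failures listed above can be turned into a simple completeness check on a written protocol. The following is a minimal sketch, not an established tool: the `CollectionProtocol` class, its field names, and the example values are hypothetical and chosen only to mirror the items in the list.

```python
from dataclasses import dataclass, field


@dataclass
class CollectionProtocol:
    """Hypothetical record of what a written data collection protocol specifies."""
    schedule: str = ""                  # when data are collected
    method: str = ""                    # how data are collected
    responsible_person: str = ""        # who collects the data
    items_to_collect: list = field(default_factory=list)
    instrument_instructions: str = ""   # step-by-step instructions for administering tests
    training_plan: str = ""             # content and strategy for training and retraining staff
    calibration_instructions: str = ""  # how to use, adjust, and calibrate equipment
    change_log_procedure: str = ""      # how changes in procedure are documented


def find_protocol_gaps(protocol):
    """Return the names of protocol fields that were left empty."""
    return [name for name, value in vars(protocol).items() if not value]


# Hypothetical draft protocol with several sections still missing.
draft = CollectionProtocol(
    schedule="every crossing",
    method="automated weighbridge",
    items_to_collect=["individual ID", "body mass"],
)
print(find_protocol_gaps(draft))
```

A check like this only confirms that each section exists. It cannot judge whether the instructions in a section are rigorous enough, which still requires human review.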

User privacy issues

There are serious concerns about the integrity of individual user data collected through cloud computing, because such data are transferred between countries that have different standards of protection for individual user data. [7] Information processing has advanced to the level where user data can now be used to predict what an individual is saying before they even speak. [8]

Quality control (QC)

Because QC actions occur during or after data collection, all the details can be carefully documented. A clearly defined communication structure is a precondition for establishing monitoring systems. Uncertainty about the flow of information should be avoided, because a poorly organized communication structure leads to lax monitoring and can limit opportunities to detect errors. Quality control is also responsible for identifying the actions needed to correct faulty data collection practices and to minimize future occurrences. A team is less likely to recognize the need for these actions if its procedures are vaguely written and are not based on feedback or education.
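
One way to picture QC monitoring is as a set of automated checks run on records after they are collected. The sketch below is illustrative only: the function, the field names, and the plausible range for `mass_kg` are assumptions made for the example.

```python
def run_quality_checks(records, required_fields, valid_ranges):
    """Return (record_index, problem) pairs for records that fail a check."""
    problems = []
    for index, record in enumerate(records):
        # Completeness check: every required field must have a value.
        for name in required_fields:
            if record.get(name) is None:
                problems.append((index, f"missing {name}"))
        # Plausibility check: numeric values must fall inside the stated range.
        for name, (low, high) in valid_ranges.items():
            value = record.get(name)
            if value is not None and not (low <= value <= high):
                problems.append((index, f"{name} out of range: {value}"))
    return problems


# Hypothetical records; the 2.0 to 8.0 kg range is an assumed plausibility limit.
records = [
    {"id": "A01", "mass_kg": 4.2},
    {"id": "A02", "mass_kg": None},
    {"id": None, "mass_kg": 42.0},
]
problems = run_quality_checks(
    records,
    required_fields=["id", "mass_kg"],
    valid_ranges={"mass_kg": (2.0, 8.0)},
)
for index, problem in problems:
    print(index, problem)
```

Each flagged record would then go back through the communication structure described above, so that the responsible person can correct it and the cause can be addressed.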

Data collection problems that necessitate prompt action:

References

  1. Lescroël, A. L.; Ballard, G.; Grémillet, D.; Authier, M.; Ainley, D. G. (2014). Descamps, Sébastien (ed.). "Antarctic Climate Change: Extreme Events Disrupt Plastic Phenotypic Response in Adélie Penguins". PLOS ONE. 9 (1): e85291. Bibcode:2014PLoSO...985291L. doi:10.1371/journal.pone.0085291. PMC 3906005. PMID 24489657.
  2. Vuong, Quan-Hoang; La, Viet-Phuong; Vuong, Thu-Trang; Ho, Manh-Toan; Nguyen, Hong-Kong T.; Nguyen, Viet-Ha; Pham, Hiep-Hung; Ho, Manh-Tung (September 25, 2018). "An open database of productivity in Vietnam's social sciences and humanities for public use". Scientific Data. 5: 180188. Bibcode:2018NatSD...580188V. doi:10.1038/sdata.2018.188. PMC 6154282. PMID 30251992.
  3. Ziafati Bafarasat, A. (2021). "Collecting and validating data: A simple guide for researchers". Advance. Preprint. https://doi.org/10.31124/advance.13637864.v1
  4. Sapsford, Roger; Jupp, Victor. Data Collection and Analysis. ISBN 0-7619-5046-X.
  5. Northern Illinois University (2005). "Data Collection". Responsible Conduct in Data Management. Retrieved June 8, 2019.
  6. Most, Marlene M.; Craddick, Shirley; Crawford, Staci; Redican, Susan; Rhodes, Donna; Rukenbrod, Fran; Laws, Reesa (October 2003). "Dietary quality assurance processes of the DASH-Sodium controlled diet study". Journal of the American Dietetic Association. 103 (10): 1339–1346. doi:10.1016/s0002-8223(03)01080-0. PMID 14520254.
  7. Wang, Faye Fangfei (10 January 2014). Law of Electronic Commercial Transactions: Contemporary Issues in the EU, US and China. Routledge. p. 154. ISBN 978-1-134-11522-8.
  8. "Data, not privacy, is the real danger". NBC News. 4 February 2019.