# Epidemiology

Epidemiology is the study and analysis of the distribution (who, when, and where) and determinants of health and disease conditions in defined populations.

In epidemiology, a risk factor is a variable associated with an increased risk of disease or infection. When evidence is found the term determinant is used as a variable associated with either increased or decreased risk.

In biology, a population is the number of all the organisms of the same group or species, which live in a particular geographical area, and have the capability of interbreeding. The area of a sexual population is the area where inter-breeding is potentially possible between any pair within the area, and where the probability of interbreeding is greater than the probability of cross-breeding with individuals from other areas.

Epidemiology is the cornerstone of public health, and shapes policy decisions and evidence-based practice by identifying risk factors for disease and targets for preventive healthcare. Epidemiologists help with study design, collection, and statistical analysis of data, and with the interpretation and dissemination of results (including peer review and occasional systematic review). Epidemiology has helped develop methodology used in clinical research, public health studies, and, to a lesser extent, basic research in the biological sciences. [1]

Public health has been defined as "the science and art of preventing disease, prolonging life and promoting human health through organized efforts and informed choices of society, organizations, public and private, communities and individuals". Analyzing the health of a population and the threats it faces is the basis for public health. The public can be as small as a handful of people or as large as a village or an entire city; in the case of a pandemic it may encompass several continents. The concept of health takes into account physical, psychological and social well-being. As such, according to the World Health Organization, it is not merely the absence of disease or infirmity.

Evidence-based practice (EBP) is an interdisciplinary approach to clinical practice that has been gaining ground following its formal introduction in 1992. It started in medicine as evidence-based medicine (EBM) and spread to allied health professions, educational fields, and others. EBP is traditionally defined in terms of a "three legged stool" integrating three basic principles: (1) the best available research evidence bearing on whether and why a treatment works, (2) clinical expertise to rapidly identify each patient's unique health state and diagnosis, their individual risks and benefits of potential interventions, and (3) client preferences and values.

Preventive healthcare consists of measures taken for disease prevention, as opposed to disease treatment. Just as health comprises a variety of physical and mental states, so do disease and disability, which are affected by environmental factors, genetic predisposition, disease agents, and lifestyle choices. Health, disease, and disability are dynamic processes which begin before individuals realize they are affected. Disease prevention relies on anticipatory actions that can be categorized as primal, primary, secondary, and tertiary prevention.

Major areas of epidemiological study include disease causation, transmission, outbreak investigation, disease surveillance, environmental epidemiology, forensic epidemiology, occupational epidemiology, screening, biomonitoring, and comparisons of treatment effects such as in clinical trials. Epidemiologists rely on other scientific disciplines like biology to better understand disease processes, statistics to make efficient use of the data and draw appropriate conclusions, social sciences to better understand proximate and distal causes, and engineering for exposure assessment.

In medicine, public health, and biology, transmission is the passing of a pathogen causing communicable disease from an infected host individual or group to a particular individual or group, regardless of whether the other individual was previously infected.

In epidemiology, an outbreak is a sudden increase in occurrences of a disease in a particular time and place. It may affect a small and localized group or impact upon thousands of people across an entire continent. Two linked cases of a rare infectious disease may be sufficient to constitute an outbreak. Outbreaks include epidemics, which term is normally only used for infectious diseases, as well as diseases with an environmental origin, such as a water or foodborne disease. They may affect a region in a country or a group of countries. Pandemics are near-global disease outbreaks.

Disease surveillance is an epidemiological practice by which the spread of disease is monitored in order to establish patterns of progression. The main role of disease surveillance is to predict, observe, and minimize the harm caused by outbreak, epidemic, and pandemic situations, as well as increase knowledge about which factors contribute to such circumstances. A key part of modern disease surveillance is the practice of disease case reporting.

## Etymology

Epidemiology, literally meaning "the study of what is upon the people", is derived from Greek epi, meaning 'upon, among', demos, meaning 'people, district', and logos, meaning 'study, word, discourse', suggesting that it applies only to human populations. However, the term is widely used in studies of zoological populations (veterinary epidemiology), although the term "epizoology" is available, and it has also been applied to studies of plant populations (botanical or plant disease epidemiology). [2]

Logos is a term in Western philosophy, psychology, rhetoric, and religion derived from a Greek word variously meaning "ground", "plea", "opinion", "expectation", "word", "speech", "account", "reason", "proportion", and "discourse". It became a technical term in Western philosophy beginning with Heraclitus, who used the term for a principle of order and knowledge. Logos is the logic behind an argument. Logos tries to persuade an audience using logical arguments and supportive evidence. Logos is a persuasive technique often used in writing and rhetoric.

Plant disease epidemiology is the study of disease in plant populations. Much like diseases of humans and other animals, plant diseases occur due to pathogens such as bacteria, viruses, fungi, oomycetes, nematodes, phytoplasmas, protozoa, and parasitic plants. Plant disease epidemiologists strive for an understanding of the cause and effects of disease and develop strategies to intervene in situations where crop losses may occur. Typically successful intervention will lead to a low enough level of disease to be acceptable, depending upon the value of the crop.

The distinction between "epidemic" and "endemic" was first drawn by Hippocrates, [3] to distinguish diseases that are "visited upon" a population (epidemic) from those that "reside within" a population (endemic). [4] The term "epidemiology" appears to have first been used to describe the study of epidemics in 1802 by the Spanish physician Villalba in Epidemiología Española. [4] Epidemiologists also study the interaction of diseases in a population, a condition known as a syndemic.

Hippocrates of Kos, also known as Hippocrates II, was a Greek physician of the Age of Pericles, who is considered one of the most outstanding figures in the history of medicine. He is often referred to as the "Father of Medicine" in recognition of his lasting contributions to the field as the founder of the Hippocratic School of Medicine. This intellectual school revolutionized medicine in ancient Greece, establishing it as a discipline distinct from other fields with which it had traditionally been associated, thus establishing medicine as a profession.

A syndemic or synergistic epidemic is the aggregation of two or more concurrent or sequential epidemics or disease clusters in a population with biological interactions, which exacerbate the prognosis and burden of disease. The term was developed by Merrill Singer in the mid-1990s. Syndemics develop under health disparity, caused by poverty, stress, or structural violence and are studied by epidemiologists and medical anthropologists concerned with public health, community health and the effects of social conditions on health.

The term epidemiology is now widely applied to cover the description and causation of not only epidemic disease, but of disease in general, and even many non-disease, health-related conditions, such as high blood pressure and obesity. Epidemiology is therefore based upon how the pattern of a disease causes change in the function of people.

Obesity is a medical condition in which excess body fat has accumulated to an extent that it may have a negative effect on health. People are generally considered obese when their body mass index (BMI), a measurement obtained by dividing a person's weight by the square of the person's height, is over 30 kg/m2; the range 25–30 kg/m2 is defined as overweight. Some East Asian countries use lower values. Obesity increases the likelihood of various diseases and conditions, particularly cardiovascular diseases, type 2 diabetes, obstructive sleep apnea, certain types of cancer, osteoarthritis, and depression.

## History

The Greek physician Hippocrates, known as the father of medicine, [5] [6] sought a logic to sickness; he is the first person known to have examined the relationships between the occurrence of disease and environmental influences. [7] Hippocrates believed sickness of the human body to be caused by an imbalance of the four humors (black bile, yellow bile, blood, and phlegm). The cure to the sickness was to remove or add the humor in question to balance the body. This belief led to the application of bloodletting and dieting in medicine. [8] He coined the terms endemic (for diseases usually found in some places but not in others) and epidemic (for diseases that are seen at some times but not others). [9]

### Modern era

In the middle of the 16th century, a doctor from Verona named Girolamo Fracastoro was the first to propose a theory that the very small, unseeable particles that cause disease were alive. They were considered to be able to spread by air, to multiply by themselves, and to be destroyable by fire. In this way he refuted Galen's miasma theory (poison gas in sick people). In 1546 he wrote a book, De contagione et contagiosis morbis, in which he was the first to promote personal and environmental hygiene to prevent disease. The development of a sufficiently powerful microscope by Antonie van Leeuwenhoek in 1675 provided visual evidence of living particles consistent with a germ theory of disease.

During the Ming Dynasty, Wu Youke (1582–1652) developed the idea that some diseases were caused by transmissible agents, which he called Li Qi (戾气 or pestilential factors), when he observed various epidemics rage around him between 1641 and 1644. [10] His book Wen Yi Lun (瘟疫论, Treatise on Pestilence/Treatise of Epidemic Diseases) can be regarded as the main etiological work that brought forward the concept. [11] His concepts were still being considered by the WHO in 2004, in the context of traditional Chinese medicine, when analysing the SARS outbreak. [12]

Another pioneer, Thomas Sydenham (1624–1689), was the first to distinguish the fevers of Londoners in the later 1600s. His theories on cures of fevers met with much resistance from traditional physicians at the time. He was not able to find the initial cause of the smallpox fever he researched and treated. [8]

John Graunt, a haberdasher and amateur statistician, published Natural and Political Observations ... upon the Bills of Mortality in 1662. In it, he analysed the mortality rolls in London before the Great Plague, presented one of the first life tables, and reported time trends for many diseases, new and old. He provided statistical evidence for many theories on disease, and also refuted some widespread ideas on them.

John Snow is famous for his investigations into the causes of the 19th century cholera epidemics, and is also known as the father of (modern) epidemiology. [13] [14] He began by noticing the significantly higher death rates in two areas supplied by the Southwark Company. His identification of the Broad Street pump as the cause of the Soho epidemic is considered the classic example of epidemiology. Snow used chlorine in an attempt to clean the water and removed the handle; this ended the outbreak. This has been perceived as a major event in the history of public health and regarded as the founding event of the science of epidemiology, having helped shape public health policies around the world. [15] [16] However, Snow's research and preventive measures to avoid further outbreaks were not fully accepted or put into practice until after his death.

Other pioneers include Danish physician Peter Anton Schleisner, who in 1849 related his work on the prevention of the epidemic of neonatal tetanus on the Vestmanna Islands in Iceland. [17] [18] Another important pioneer was Hungarian physician Ignaz Semmelweis, who in 1847 brought down infant mortality at a Vienna hospital by instituting a disinfection procedure. His findings were published in 1850, but his work was ill-received by his colleagues, who discontinued the procedure. Disinfection did not become widely practiced until British surgeon Joseph Lister 'discovered' antiseptics in 1865 in light of the work of Louis Pasteur.

In the early 20th century, mathematical methods were introduced into epidemiology by Ronald Ross, Janet Lane-Claypon, Anderson Gray McKendrick, and others. [19] [20] [21] [22]

Another breakthrough was the 1954 publication of the results of the British Doctors Study, led by Richard Doll and Austin Bradford Hill, which lent very strong statistical support to the link between tobacco smoking and lung cancer.

In the late 20th century, with the advancement of biomedical sciences, a number of molecular markers in blood, other biospecimens and the environment were identified as predictors of development or risk of a certain disease. Epidemiological research examining the relationship between these biomarkers, analyzed at the molecular level, and disease was broadly named "molecular epidemiology". Specifically, "genetic epidemiology" has been used for the epidemiology of germline genetic variation and disease. Genetic variation is typically determined using DNA from peripheral blood leukocytes. Since the 2000s, genome-wide association studies (GWAS) have been commonly performed to identify genetic risk factors for many diseases and health conditions.

While most molecular epidemiology studies are still using conventional disease diagnosis and classification systems, it is increasingly recognized that disease progression represents inherently heterogeneous processes differing from person to person. Conceptually, each individual has a unique disease process different from any other individual ("the unique disease principle"), [23] [24] considering the uniqueness of the exposome (a totality of endogenous and exogenous / environmental exposures) and its unique influence on the molecular pathologic process in each individual. Studies to examine the relationship between an exposure and the molecular pathologic signature of disease (particularly cancer) became increasingly common throughout the 2000s. However, the use of molecular pathology in epidemiology posed unique challenges, including a lack of research guidelines and standardized statistical methodologies, and a paucity of interdisciplinary experts and training programs. [25] Furthermore, the concept of disease heterogeneity appears to conflict with the long-standing premise in epidemiology that individuals with the same disease name have similar etiologies and disease processes. To resolve these issues and advance population health science in the era of molecular precision medicine, "molecular pathology" and "epidemiology" were integrated to create a new interdisciplinary field of "molecular pathological epidemiology" (MPE), [26] [27] defined as "epidemiology of molecular pathology and heterogeneity of disease". In MPE, investigators analyze the relationships between (A) environmental, dietary, lifestyle and genetic factors; (B) alterations in cellular or extracellular molecules; and (C) evolution and progression of disease. A better understanding of the heterogeneity of disease pathogenesis will further contribute to elucidating etiologies of disease. The MPE approach can be applied not only to neoplastic diseases but also to non-neoplastic diseases. [28] The concept and paradigm of MPE have become widespread in the 2010s. [29] [30] [31] [32] [33] [34] [35]

By 2012 it was recognized that many pathogens' evolution is rapid enough to be highly relevant to epidemiology, and that therefore much could be gained from an interdisciplinary approach to infectious disease integrating epidemiology and molecular evolution to "inform control strategies, or even patient treatment." [36] [37]

## Types of studies

Epidemiologists employ a range of study designs, from the observational to the experimental, generally categorized as descriptive, analytic (aiming to further examine known associations or hypothesized relationships), and experimental (a term often equated with clinical or community trials of treatments and other interventions). In observational studies, nature is allowed to "take its course," as epidemiologists observe from the sidelines. Conversely, in experimental studies, the epidemiologist is the one in control of all of the factors entering a certain case study. [38] Epidemiological studies are aimed, where possible, at revealing unbiased relationships between exposures such as alcohol or smoking, biological agents, stress, or chemicals and mortality or morbidity. The identification of causal relationships between these exposures and outcomes is an important aspect of epidemiology. Modern epidemiologists use informatics as a tool.

Observational studies have two components, descriptive and analytical. Descriptive observations pertain to the "who, what, where and when of health-related state occurrence". Analytical observations, by contrast, deal more with the ‘how’ of a health-related event. [38] Experimental epidemiology contains three case types: randomized controlled trials (often used for new medicine or drug testing), field trials (conducted on those at a high risk of contracting a disease), and community trials (research on diseases of social origin). [38]

The term 'epidemiologic triad' is used to describe the intersection of Host, Agent, and Environment in analyzing an outbreak.

### Case series

Case-series may refer to the qualitative study of the experience of a single patient, or small group of patients with a similar diagnosis, or to a statistical technique that compares periods during which patients are exposed to a factor with the potential to produce illness with periods when they are unexposed.

The former type of study is purely descriptive and cannot be used to make inferences about the general population of patients with that disease. These types of studies, in which an astute clinician identifies an unusual feature of a disease or a patient's history, may lead to a formulation of a new hypothesis. Using the data from the series, analytic studies could be done to investigate possible causal factors. These can include case-control studies or prospective studies. A case-control study would involve matching comparable controls without the disease to the cases in the series. A prospective study would involve following the case series over time to evaluate the disease's natural history. [39]

The latter type, more formally described as the self-controlled case-series study, divides individual patient follow-up time into exposed and unexposed periods and uses fixed-effects Poisson regression to compare the incidence rate of a given outcome between exposed and unexposed periods. This technique has been extensively used in the study of adverse reactions to vaccination and has been shown in some circumstances to provide statistical power comparable to that available in cohort studies.

### Case-control studies

Case-control studies select subjects based on their disease status. It is a retrospective study. A group of individuals that are disease positive (the "case" group) is compared with a group of disease negative individuals (the "control" group). The control group should ideally come from the same population that gave rise to the cases. The case-control study looks back through time at potential exposures that both groups (cases and controls) may have encountered. A 2×2 table is constructed, displaying exposed cases (A), exposed controls (B), unexposed cases (C) and unexposed controls (D). The statistic generated to measure association is the odds ratio (OR), which is the ratio of the odds of exposure in the cases (A/C) to the odds of exposure in the controls (B/D), i.e. OR = (AD/BC).

| | Cases | Controls |
|---|---|---|
| Exposed | A | B |
| Unexposed | C | D |
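
The odds ratio calculation described above can be sketched in a few lines of Python. This is only a minimal illustration: the function name `odds_ratio` and the example counts are assumptions made for this sketch, not data from any study.

```python
def odds_ratio(a: float, b: float, c: float, d: float) -> float:
    """Odds ratio from the cells of a case-control 2x2 table.

    a: exposed cases      b: exposed controls
    c: unexposed cases    d: unexposed controls
    """
    if b == 0 or c == 0:
        raise ValueError("odds ratio is undefined when B or C is zero")
    # (A/C) / (B/D) simplifies to (A*D) / (B*C)
    return (a * d) / (b * c)


# Hypothetical counts, chosen only for illustration.
example_or = odds_ratio(30, 10, 20, 40)
print(example_or)
```

With these hypothetical counts the odds of exposure among cases are 30/20 and among controls 10/40, so the function returns their ratio.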

If the OR is significantly greater than 1, then the conclusion is "those with the disease are more likely to have been exposed," whereas if it is close to 1 then the exposure and disease are not likely associated. If the OR is far less than one, then this suggests that the exposure is a protective factor in the causation of the disease. Case-control studies are usually faster and more cost effective than cohort studies, but are sensitive to bias (such as recall bias and selection bias). The main challenge is to identify the appropriate control group; the distribution of exposure among the control group should be representative of the distribution in the population that gave rise to the cases. This can be achieved by drawing a random sample from the original population at risk. This has as a consequence that the control group can contain people with the disease under study when the disease has a high attack rate in a population.

A major drawback for case control studies is that, in order to be considered to be statistically significant, the minimum number of cases required at the 95% confidence interval is related to the odds ratio by the equation:

${\displaystyle {\text{total cases}}=A+C=1.96^{2}(1+N)\left({\frac {1}{\ln(OR)}}\right)^{2}\left({\frac {OR+2{\sqrt {OR}}+1}{\sqrt {OR}}}\right)\approx 15.5(1+N)\left({\frac {1}{\ln(OR)}}\right)^{2}}$

where N is the ratio of cases to controls. As the odds ratio approaches 1, ln(OR) approaches 0, so the required number of cases grows without bound, rendering case-control studies all but useless for odds ratios close to 1. For instance, for an odds ratio of 1.5 and cases = controls, the table shown above would look like this:

| | Cases | Controls |
|---|---|---|
| Exposed | 103 | 84 |
| Unexposed | 84 | 103 |

For an odds ratio of 1.1:

| | Cases | Controls |
|---|---|---|
| Exposed | 1732 | 1652 |
| Unexposed | 1652 | 1732 |
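
To see how quickly the required number of cases grows, the equation above can be evaluated directly. The following Python sketch uses the exact form of the equation (before the 15.5 approximation); the function name `min_total_cases` and its parameter names are assumptions introduced here for illustration.

```python
import math


def min_total_cases(odds_ratio: float, n: float = 1.0, z: float = 1.96) -> float:
    """Evaluate the minimum-cases equation for a given odds ratio.

    odds_ratio: the odds ratio to be detected (must be positive and not 1)
    n: the N of the equation (1.0 when cases = controls)
    z: normal quantile for the 95% level, 1.96 as in the equation
    """
    if odds_ratio <= 0 or odds_ratio == 1:
        raise ValueError("odds ratio must be positive and different from 1")
    log_term = (1.0 / math.log(odds_ratio)) ** 2
    root = math.sqrt(odds_ratio)
    shape_term = (odds_ratio + 2.0 * root + 1.0) / root
    return z ** 2 * (1.0 + n) * log_term * shape_term


for value in (1.5, 1.1):
    print(value, round(min_total_cases(value)))
```

With N = 1 this evaluates to roughly 189 cases for an odds ratio of 1.5 and roughly 3,385 for an odds ratio of 1.1, close to the case totals in the two tables above (187 and 3,384).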

### Cohort studies

Cohort studies select subjects based on their exposure status. The study subjects should be at risk of the outcome under investigation at the beginning of the cohort study; this usually means that they should be disease free when the cohort study starts. The cohort is followed through time to assess their later outcome status. An example of a cohort study would be the investigation of a cohort of smokers and non-smokers over time to estimate the incidence of lung cancer. The same 2×2 table is constructed as with the case control study. However, the point estimate generated is the relative risk (RR), which is the probability of disease for a person in the exposed group, Pe = A / (A + B) over the probability of disease for a person in the unexposed group, Pu = C / (C + D), i.e. RR = Pe / Pu.

| | Case | Non-case | Total |
|---|---|---|---|
| Exposed | A | B | (A + B) |
| Unexposed | C | D | (C + D) |
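
The relative risk defined above can be computed from the same four cells. The Python sketch below is a minimal illustration under that definition; the function name `relative_risk` and the example counts are hypothetical.

```python
def relative_risk(a: float, b: float, c: float, d: float) -> float:
    """Relative risk from the cells of a cohort 2x2 table.

    a: exposed cases      b: exposed non-cases
    c: unexposed cases    d: unexposed non-cases
    """
    exposed_total = a + b
    unexposed_total = c + d
    if exposed_total == 0 or unexposed_total == 0:
        raise ValueError("both exposure groups must contain subjects")
    p_exposed = a / exposed_total        # Pe = A / (A + B)
    p_unexposed = c / unexposed_total    # Pu = C / (C + D)
    if p_unexposed == 0:
        raise ValueError("relative risk is undefined when Pu is zero")
    return p_exposed / p_unexposed


# Hypothetical cohort: 30 of 100 exposed and 10 of 100 unexposed develop disease.
example_rr = relative_risk(30, 70, 10, 90)
print(example_rr)
```

In this hypothetical cohort the risk is 0.30 among the exposed and 0.10 among the unexposed, so the exposed group is about three times as likely to develop the disease.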

As with the OR, a RR greater than 1 shows association, where the conclusion can be read "those with the exposure were more likely to develop disease."

Prospective studies have many benefits over case control studies. The RR is a more powerful effect measure than the OR, as the OR is just an estimation of the RR, since true incidence cannot be calculated in a case control study where subjects are selected based on disease status. Temporality can be established in a prospective study, and confounders are more easily controlled for. However, they are more costly, and there is a greater chance of losing subjects to follow-up based on the long time period over which the cohort is followed.

Cohort studies are also limited by the same equation for the number of cases as case-control studies, but, if the base incidence rate in the study population is very low, the number of cases required is reduced by ½.

### Outbreak investigation

For information on investigation of infectious disease outbreaks, please see outbreak investigation.

## Causal inference

Although epidemiology is sometimes viewed as a collection of statistical tools used to elucidate the associations of exposures to health outcomes, a deeper understanding of this science is that of discovering causal relationships.

"Correlation does not imply causation" is a common theme for much of the epidemiological literature. For epidemiologists, the key is in the term inference. Correlation, or at least association between two variables, is a necessary but not sufficient criterion for inference that one variable causes the other. Epidemiologists use gathered data and a broad range of biomedical and psychosocial theories in an iterative way to generate or expand theory, to test hypotheses, and to make educated, informed assertions about which relationships are causal, and about exactly how they are causal.

Epidemiologists emphasize that the "one cause – one effect" understanding is a simplistic misbelief. Most outcomes, whether disease or death, are caused by a chain or web consisting of many component causes. [40] Causes can be distinguished as necessary, sufficient or probabilistic conditions. If a necessary condition can be identified and controlled (e.g., antibodies to a disease agent, energy in an injury), the harmful outcome can be avoided (Robertson, 2015).

In 1965, Austin Bradford Hill proposed a series of considerations to help assess evidence of causation, [41] which have come to be commonly known as the "Bradford Hill criteria". In contrast to the explicit intentions of their author, Hill's considerations are now sometimes taught as a checklist to be implemented for assessing causality. [42] Hill himself said "None of my nine viewpoints can bring indisputable evidence for or against the cause-and-effect hypothesis and none can be required sine qua non." [41]

1. Strength of Association: A small association does not mean that there is not a causal effect, though the larger the association, the more likely that it is causal. [41]
2. Consistency of Data: Consistent findings observed by different persons in different places with different samples strengthen the likelihood of an effect. [41]
3. Specificity: Causation is likely if there is a very specific population at a specific site and a disease with no other likely explanation. The more specific an association between a factor and an effect is, the bigger the probability of a causal relationship. [41]
4. Temporality: The effect has to occur after the cause (and if there is an expected delay between the cause and expected effect, then the effect must occur after that delay). [41]
5. Biological gradient: Greater exposure should generally lead to greater incidence of the effect. However, in some cases, the mere presence of the factor can trigger the effect. In other cases, an inverse proportion is observed: greater exposure leads to lower incidence. [41]
6. Plausibility: A plausible mechanism between cause and effect is helpful (but Hill noted that knowledge of the mechanism is limited by current knowledge). [41]
7. Coherence: Coherence between epidemiological and laboratory findings increases the likelihood of an effect. However, Hill noted that "... lack of such [laboratory] evidence cannot nullify the epidemiological effect on associations". [41]
8. Experiment: "Occasionally it is possible to appeal to experimental evidence". [41]
9. Analogy: The effect of similar factors may be considered. [41]

Epidemiological studies can only go to prove that an agent could have caused, but not that it did cause, an effect in any particular case:

"Epidemiology is concerned with the incidence of disease in populations and does not address the question of the cause of an individual's disease. This question, sometimes referred to as specific causation, is beyond the domain of the science of epidemiology. Epidemiology has its limits at the point where an inference is made that the relationship between an agent and a disease is causal (general causation) and where the magnitude of excess risk attributed to the agent has been determined; that is, epidemiology addresses whether an agent can cause a disease, not whether an agent did cause a specific plaintiff's disease." [43]

In United States law, epidemiology alone cannot prove that a causal association does not exist in general. Conversely, it can be (and is in some circumstances) taken by US courts, in an individual case, to justify an inference that a causal association does exist, based upon a balance of probability.

The subdiscipline of forensic epidemiology is directed at the investigation of specific causation of disease or injury in individuals or groups of individuals in instances in which causation is disputed or is unclear, for presentation in legal settings.

## Population-based health management

Epidemiological practice and the results of epidemiological analysis make a significant contribution to emerging population-based health management frameworks.

Population-based health management encompasses the ability to:

• Assess the health states and health needs of a target population;
• Implement and evaluate interventions that are designed to improve the health of that population; and
• Efficiently and effectively provide care for members of that population in a way that is consistent with the community's cultural, policy and health resource values.

Modern population-based health management is complex, requiring multiple sets of skills (medical, political, technological, mathematical, etc.), of which epidemiological practice and analysis is a core component, unified with management science to provide efficient and effective health care and health guidance to a population. This task requires the forward-looking ability of modern risk management approaches that transform health risk factors, incidence, prevalence and mortality statistics (derived from epidemiological analysis) into management metrics that not only guide how a health system responds to current population health issues, but also how a health system can be managed to better respond to future potential population health issues. [44]

Examples of organizations that use population-based health management and leverage the work and results of epidemiological practice include the Canadian Strategy for Cancer Control, Health Canada Tobacco Control Programs, the Rick Hansen Foundation, and the Canadian Tobacco Control Research Initiative. [45] [46] [47]

Each of these organizations uses a population-based health management framework called Life at Risk that combines epidemiological quantitative analysis with demographics, health agency operational research and economics to perform:

• Population Life Impacts Simulations: Measurement of the future potential impact of disease upon the population with respect to new disease cases, prevalence, premature death as well as potential years of life lost from disability and death;
• Labour Force Life Impacts Simulations: Measurement of the future potential impact of disease upon the labour force with respect to new disease cases, prevalence, premature death and potential years of life lost from disability and death;
• Economic Impacts of Disease Simulations: Measurement of the future potential impact of disease upon private sector disposable income impacts (wages, corporate profits, private health care costs) and public sector disposable income impacts (personal income tax, corporate income tax, consumption taxes, publicly funded health care costs).

## Applied field epidemiology

Applied epidemiology is the practice of using epidemiological methods to protect or improve the health of a population. Applied field epidemiology can include investigating communicable and non-communicable disease outbreaks, mortality and morbidity rates, and nutritional status, among other indicators of health, with the purpose of communicating the results to those who can implement appropriate policies or disease control measures.

### Humanitarian context

As the surveillance and reporting of diseases and other health factors becomes increasingly difficult in humanitarian crisis situations, the methodologies used to report the data are compromised. One study found that less than half (42.4%) of nutrition surveys sampled from humanitarian contexts correctly calculated the prevalence of malnutrition and only one-third (35.3%) of the surveys met the criteria for quality. Among the mortality surveys, only 3.2% met the criteria for quality. As nutritional status and mortality rates help indicate the severity of a crisis, the tracking and reporting of these health factors is crucial.

Vital registries are usually the most effective way to collect data, but in humanitarian contexts these registries can be non-existent, unreliable, or inaccessible. As such, mortality is often inaccurately measured using either prospective demographic surveillance or retrospective mortality surveys. Prospective demographic surveillance is labour-intensive and difficult to implement in a dispersed population. Retrospective mortality surveys are prone to selection and reporting biases. Other methods are being developed, but are not yet common practice. [48] [49] [50] [51]

## Validity: precision and bias

Different fields in epidemiology have different levels of validity. One way to assess the validity of findings is the ratio of false positives (claimed effects that are not correct) to false negatives (studies which fail to support a true effect). In genetic epidemiology, for example, candidate-gene studies produced over 100 false-positive findings for each false negative. By contrast, genome-wide association studies appear close to the reverse, with only one false positive for every 100 or more false negatives. [52] This ratio has improved over time in genetic epidemiology as the field has adopted stringent criteria. By contrast, other epidemiological fields have not required such rigorous reporting and are much less reliable as a result. [52]

### Random error

Random error is the result of fluctuations around a true value because of sampling variability. Random error is just that: random. It can occur during data collection, coding, transfer, or analysis. Examples of random error include: poorly worded questions, a misunderstanding in interpreting an individual answer from a particular respondent, or a typographical error during coding. Random error affects measurement in a transient, inconsistent manner and it is impossible to correct for random error.

There is random error in all sampling procedures. This is called sampling error.

Precision in epidemiological variables is a measure of random error. Precision is also inversely related to random error, so that to reduce random error is to increase precision. Confidence intervals are computed to demonstrate the precision of relative risk estimates. The narrower the confidence interval, the more precise the relative risk estimate.
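
The link between interval width and precision can be sketched numerically. The example below uses the common log-based (Katz) approximation for a 95% confidence interval around a relative risk. The source does not say which interval method is intended, so this choice is an assumption, as are the function name and the cohort counts.

```python
import math


def relative_risk_ci(exposed_cases, exposed_total,
                     unexposed_cases, unexposed_total, z=1.96):
    """Return (relative risk, lower bound, upper bound).

    Uses the log-based approximation: the standard error of ln(RR) is
    sqrt(1/a - 1/n1 + 1/c - 1/n0); z=1.96 gives a 95% interval.
    """
    rr = (exposed_cases / exposed_total) / (unexposed_cases / unexposed_total)
    se_log_rr = math.sqrt(
        1 / exposed_cases - 1 / exposed_total
        + 1 / unexposed_cases - 1 / unexposed_total
    )
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# Two hypothetical cohorts with the same risks (20% vs 10%) but different sizes.
small = relative_risk_ci(20, 100, 10, 100)
large = relative_risk_ci(200, 1000, 100, 1000)

for label, (rr, lower, upper) in (("small", small), ("large", large)):
    print(f"{label}: RR={rr:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

Both cohorts give the same point estimate, but the larger one yields a narrower interval, which is what "more precise" means here.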

There are two basic ways to reduce random error in an epidemiological study. The first is to increase the sample size of the study. In other words, add more subjects to your study. The second is to reduce the variability in measurement in the study. This might be accomplished by using a more precise measuring device or by increasing the number of measurements.
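
The effect of sample size can be sketched with the standard error of a proportion, which shrinks with the square root of the number of subjects. The prevalence of 20% and the sample sizes below are hypothetical, and the formula assumes simple random sampling.

```python
import math


def standard_error_of_proportion(p, n):
    """Standard error of an estimated proportion p from n independent subjects."""
    return math.sqrt(p * (1 - p) / n)


# Quadrupling the sample size halves the standard error.
for n in (100, 400, 1600):
    print(n, round(standard_error_of_proportion(0.2, n), 4))
# 100 0.04
# 400 0.02
# 1600 0.01
```

Because the gain follows a square root, each further halving of random error requires four times as many subjects.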

Note that if the sample size or number of measurements is increased, or a more precise measuring tool is purchased, the cost of the study usually increases. There is usually an uneasy balance between the need for adequate precision and the practical issue of study cost.

### Systematic error

A systematic error or bias occurs when there is a difference between the true value (in the population) and the observed value (in the study) from any cause other than sampling variability. An example of systematic error is if, unknown to you, the pulse oximeter you are using is set incorrectly and adds two points to the true value each time a measurement is taken. The measuring device could be precise but not accurate. Because the error happens in every instance, it is systematic. Conclusions you draw based on that data will still be incorrect. But the error can be reproduced in the future (e.g., by using the same mis-set instrument).

A mistake in coding that affects all responses for that particular question is another example of a systematic error.
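
The pulse oximeter example can be simulated to show why collecting more data does not remove a systematic error. In this sketch the true value of 95, the noise level, and the number of readings are hypothetical; only the two-point offset comes from the text above.

```python
import random
import statistics


def simulate_readings(true_value, n, offset=0.0, noise_sd=1.0, seed=0):
    """Simulate n readings with random noise plus a constant offset (bias)."""
    rng = random.Random(seed)
    return [true_value + offset + rng.gauss(0, noise_sd) for _ in range(n)]


true_spo2 = 95.0  # hypothetical true oxygen saturation
well_set = simulate_readings(true_spo2, 10_000)
mis_set = simulate_readings(true_spo2, 10_000, offset=2.0)

# Averaging many readings cancels the random noise but not the offset.
mean_error_well_set = statistics.mean(well_set) - true_spo2
mean_error_mis_set = statistics.mean(mis_set) - true_spo2
print(round(mean_error_well_set, 2), round(mean_error_mis_set, 2))
```

The mean error of the correctly set device is close to zero, while the mis-set device stays about two points too high however many readings are taken.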

The validity of a study is dependent on the degree of systematic error. Validity is usually separated into two components:

• Internal validity is dependent on the amount of error in measurements, including exposure, disease, and the associations between these variables. Good internal validity implies a lack of error in measurement and suggests that inferences may be drawn at least as they pertain to the subjects under study.
• External validity pertains to the process of generalizing the findings of the study to the population from which the sample was drawn (or even beyond that population to a more universal statement). This requires an understanding of which conditions are relevant (or irrelevant) to the generalization. Internal validity is clearly a prerequisite for external validity.

#### Selection bias

Selection bias occurs when study subjects are selected or become part of the study as a result of a third, unmeasured variable which is associated with both the exposure and outcome of interest. [53] For instance, it has repeatedly been noted that cigarette smokers and non-smokers tend to differ in their study participation rates. (Sackett cites the example of Seltzer et al., in which 85% of non-smokers and 67% of smokers returned mailed questionnaires.) [54] It is important to note that such a difference in response will not lead to bias if it is not also associated with a systematic difference in outcome between the two response groups.
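
The final point above can be checked with expected counts. In this sketch the population sizes, the disease risks, and the 40% response rate for diseased smokers are hypothetical; only the 67% and 85% response rates come from the example cited in the text.

```python
def risk_ratio_among_responders(population, response_prob):
    """Risk ratio computed only from the people expected to respond.

    Both arguments map (exposed, diseased) to a count or a response probability.
    """
    resp = {key: population[key] * response_prob[key] for key in population}
    risk_exposed = resp[(True, True)] / (resp[(True, True)] + resp[(True, False)])
    risk_unexposed = resp[(False, True)] / (resp[(False, True)] + resp[(False, False)])
    return risk_exposed / risk_unexposed


# Hypothetical population: risk 20% in smokers, 10% in non-smokers (true RR = 2).
population = {(True, True): 200, (True, False): 800,
              (False, True): 100, (False, False): 900}

# Response depends on smoking status only.
by_exposure_only = {(True, True): 0.67, (True, False): 0.67,
                    (False, True): 0.85, (False, False): 0.85}

# Response also depends on outcome: diseased smokers respond less (hypothetical).
by_exposure_and_outcome = {(True, True): 0.40, (True, False): 0.67,
                           (False, True): 0.85, (False, False): 0.85}

rr_unbiased = risk_ratio_among_responders(population, by_exposure_only)
rr_biased = risk_ratio_among_responders(population, by_exposure_and_outcome)
print(round(rr_unbiased, 2), round(rr_biased, 2))  # roughly 2.0 and 1.3
```

When response differs only by exposure, the responders still reproduce the true risk ratio; once response also depends on the outcome, the estimate is distorted.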

#### Information bias

Information bias is bias arising from systematic error in the assessment of a variable. [55] An example of this is recall bias. A typical example is again provided by Sackett in his discussion of a study examining the effect of specific exposures on fetal health: "in questioning mothers whose recent pregnancies had ended in fetal death or malformation (cases) and a matched group of mothers whose pregnancies ended normally (controls) it was found that 28% of the former, but only 20% of the latter, reported exposure to drugs which could not be substantiated either in earlier prospective interviews or in other health records". [54] In this example, recall bias probably occurred as a result of women who had had miscarriages having an apparent tendency to better recall and therefore report previous exposures.
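
To see how much apparent association such a reporting difference could create, the two percentages quoted above can be converted into an odds ratio. This is a simplified sketch: it treats the unsubstantiated reports as if they were the only exposure data in a case-control comparison, which is an assumption made for illustration.

```python
def odds_ratio(p_cases, p_controls):
    """Odds ratio from the proportion reporting exposure among cases and controls."""
    odds_cases = p_cases / (1 - p_cases)
    odds_controls = p_controls / (1 - p_controls)
    return odds_cases / odds_controls


# 28% of cases and 20% of controls reported exposures that could not be substantiated.
apparent_or = odds_ratio(0.28, 0.20)
print(round(apparent_or, 2))  # 1.56
```

An odds ratio of about 1.56 would arise here purely from differential reporting, since none of these exposures could be verified in earlier records.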

#### Confounding

Confounding has traditionally been defined as bias arising from the co-occurrence or mixing of effects of extraneous factors, referred to as confounders, with the main effect(s) of interest. [55] [56] A more recent definition of confounding invokes the notion of counterfactual effects. [56] According to this view, when one observes an outcome of interest, say Y = 1 (as opposed to Y = 0), in a given population A which is entirely exposed (i.e. exposure X = 1 for every unit of the population) the risk of this event will be RA1. The counterfactual or unobserved risk RA0 corresponds to the risk which would have been observed if these same individuals had been unexposed (i.e. X = 0 for every unit of the population). The true effect of exposure therefore is: RA1 − RA0 (if one is interested in risk differences) or RA1/RA0 (if one is interested in relative risk). Since the counterfactual risk RA0 is unobservable we approximate it using a second population B and we actually measure the following relations: RA1 − RB0 or RA1/RB0. In this situation, confounding occurs when RA0 ≠ RB0. [56] (NB: Example assumes binary outcome and exposure variables.)
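
The counterfactual definition can be made concrete with numbers. In the sketch below all three risks are hypothetical; in a real study the counterfactual risk RA0 is never observed, so it is included only to show what the substitute population B is standing in for.

```python
def effect_measures(risk_exposed, risk_reference):
    """Risk difference and relative risk of an exposed group against a reference risk."""
    return {
        "risk_difference": risk_exposed - risk_reference,
        "relative_risk": risk_exposed / risk_reference,
    }


# Hypothetical risks, using the notation of the paragraph above.
R_A1 = 0.30  # observed risk in population A, everyone exposed
R_A0 = 0.20  # counterfactual risk in A had nobody been exposed (unobservable)
R_B0 = 0.10  # observed risk in unexposed substitute population B

true_effect = effect_measures(R_A1, R_A0)  # what we want to know
measured = effect_measures(R_A1, R_B0)     # what we can actually compute

# Confounding is present exactly when B is a poor stand-in for unexposed A.
confounded = R_A0 != R_B0
print(true_effect, measured, confounded)
```

With these values the measured relative risk (about 3) overstates the true one (about 1.5), because population B has a lower baseline risk than population A would have had without exposure.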

Some epidemiologists prefer to think of confounding separately from common categorizations of bias since, unlike selection and information bias, confounding stems from real causal effects. [53]

## The profession

To date, few universities offer epidemiology as a course of study at the undergraduate level. One notable undergraduate program exists at Johns Hopkins University, where students who major in public health can take graduate-level courses, including epidemiology, during their senior year at the Bloomberg School of Public Health. [57]

Although epidemiologic research is conducted by individuals from diverse disciplines, including clinically trained professionals such as physicians, formal training is available through master's or doctoral programs including Master of Public Health (MPH), Master of Science of Epidemiology (MSc.), Doctor of Public Health (DrPH), Doctor of Pharmacy (PharmD), Doctor of Philosophy (PhD), and Doctor of Science (ScD). Many other graduate programs, e.g., Doctor of Social Work (DSW), Doctor of Clinical Practice (DClinP), Doctor of Podiatric Medicine (DPM), Doctor of Veterinary Medicine (DVM), Doctor of Nursing Practice (DNP), Doctor of Physical Therapy (DPT), or for clinically trained physicians, Doctor of Medicine (MD) or Bachelor of Medicine and Surgery (MBBS or MBChB) and Doctor of Osteopathic Medicine (DO), include some training in epidemiologic research or related topics, but this training is generally substantially less than that offered in training programs focused on epidemiology or public health. Reflecting the strong historical tie between epidemiology and medicine, formal training programs may be set in either schools of public health or medical schools.

As public health/health protection practitioners, epidemiologists work in a number of different settings. Some epidemiologists work 'in the field'; i.e., in the community, commonly in a public health/health protection service, and are often at the forefront of investigating and combating disease outbreaks. Others work for non-profit organizations, universities, hospitals and larger government entities such as state and local health departments, various Ministries of Health, Doctors without Borders, the Centers for Disease Control and Prevention (CDC), the Health Protection Agency, the World Health Organization (WHO), or the Public Health Agency of Canada. Epidemiologists can also work in for-profit organizations such as pharmaceutical and medical device companies in groups such as market research or clinical development.

## References

### Citations

1. Miquel Porta (2014). A Dictionary of Epidemiology (6th ed.). New York: Oxford University Press. ISBN   978-0-19-997673-7 . Retrieved 16 July 2014.
2. Nutter, Jr., F.W. (1999). "Understanding the interrelationships between botanical, human, and veterinary epidemiology: the Ys and Rs of it all". Ecosys Health. 5 (3): 131–40. doi:10.1046/j.1526-0992.1999.09922.x.
3. Hippocrates. (~200BC). Airs, Waters, Places.
4. Carol Buck, Alvaro Llopis, Enrique Nájera, Milton Terris. (1998). The Challenge of Epidemiology: Issues and Selected Readings. Scientific Publication No. 505. Pan American Health Organization. Washington, DC. p3.
5. Alfredo Morabia (2004). A history of epidemiologic methods and concepts. Birkhäuser. p. 93. ISBN   3-7643-6818-7.
6. Historical Developments in Epidemiology. Chapter 2. Jones & Bartlett Learning LLC.
7. Ray M. Merrill (2010). Introduction to Epidemiology. Jones & Bartlett Learning. p. 24. ISBN   0-7637-6622-4.
8. Merrill, Ray M., PhD, MPH. "An Introduction to Epidemiology, Fifth Edition". Chapter 2: Historic Developments in Epidemiology. Jones and Bartlett Publishing, 2010. Web. 17 September 2012.
9. "Changing Concepts: Background to Epidemiology" (PDF). Duncan & Associates. Retrieved 3 February 2008.
10. Byrne, Joseph P. (16 January 2012). Encyclopedia of the Black Death. ABC-CLIO. p. 76. ISBN 9781598842548. Retrieved 24 February 2019.
11. Guobin, Xu; Yanhui, Chen; Lianhua, Xu (28 March 2018). Introduction to Chinese Culture: Cultural History, Arts, Festivals and Rituals. Springer. p. 70. ISBN   9789811081569 . Retrieved 24 February 2019.
12. "SARS: Clinical Trials on Treatment Using a Combination of Traditional Chinese Medicine and Western Medicine". World Health Organization. Archived from the original on 8 June 2018. Retrieved 24 February 2019.
13. Doctor John Snow Blames Water Pollution for Cholera Epidemic, by David Vachon, UCLA Department of Epidemiology, School of Public Health, May & June 2005
14. John Snow, Father of Epidemiology NPR Talk of the Nation. 24 September 2004
15. Dr. John Snow. John Snow, Inc. and JSI Research & Training Institute, Inc.
16. Ólöf Garðarsdóttir; Loftur Guttormsson (June 2008). "An isolated case of early medical intervention. The battle against neonatal tetanus in the island of Vestmannaeyjar (Iceland) during the 19th century" (PDF). Instituto de Economía y Geografía. Retrieved 19 April 2011.
17. Ólöf Garðarsdóttir; Loftur Guttormsson (25 August 2009). "Public health measures against neonatal tetanus on the island of Vestmannaeyjar (Iceland) during the 19th century". The History of the Family. 14 (3): 266–79. doi:10.1016/j.hisfam.2009.08.004.
18. Statisticians of the Centuries. By C. C. Heyde, Eugene Seneta
19. Anderson Gray McKendrick Archived 22 August 2011 at the Wayback Machine
20. "Origins and early development of the case-control study" (PDF). Archived from the original (PDF) on 18 January 2017. Retrieved 31 August 2013.
21. Ogino S, Fuchs CS, Giovannucci E (2012). "How many molecular subtypes? Implications of the unique tumor principle in personalized medicine". Expert Rev Mol Diagn. 12: 621–8. doi:10.1586/erm.12.46. PMID 22845482.
22. Ogino S, Lochhead P, Chan AT, Nishihara R, Cho E, Wolpin BM, Meyerhardt JA, Meissner A, Schernhammer ES, Fuchs CS, Giovannucci E (2013). "Molecular pathological epidemiology of epigenetics: Emerging integrative science to analyze environment, host, and disease". Mod Pathol. 26: 465–84. doi:10.1038/modpathol.2012.214. PMID 23307060.
23. Ogino S, King EE, Beck AH, Sherman ME, Milner DA, Giovannucci E (2012). "Interdisciplinary education to integrate pathology and epidemiology: Towards molecular and population-level health science". Am J Epidemiol. 176: 659–67. doi:10.1093/aje/kws226.
24. Ogino S, Stampfer M (2010). "Lifestyle factors and microsatellite instability in colorectal cancer: the evolving field of molecular pathological epidemiology". J Natl Cancer Inst. 102: 365–7. doi:10.1093/jnci/djq031. PMID 20208016.
25. Ogino S, Chan AT, Fuchs CS, Giovannucci E (2011). "Molecular pathological epidemiology of colorectal neoplasia: an emerging transdisciplinary and interdisciplinary field". Gut. 60: 397–411. doi:10.1136/gut.2010.217182. PMID 21036793.
26. Field AE, Camargo CA, Ogino S (2013). "The merits of subtyping obesity: one size does not fit all". JAMA. 310: 2147–8. doi:10.1001/jama.2013.281501.
27. Curtin K, Slattery ML, Samowitz WS (2011). "CpG island methylation in colorectal cancer: past, present and future". Pathology Research International. 2011: 902674.
28. Hughes LA, Khalid-de Bakker CA, Smits KM, den Brandt PA, Jonkers D, Ahuja N, Herman JG, Weijenberg MP, van Engeland M (2012). "The CpG island methylator phenotype in colorectal cancer: Progress and problems". Biochim Biophys Acta. 1825: 77–85. doi:10.1016/j.bbcan.2011.10.005. PMID   22056543.
29. Ku CS, Cooper DN, Wu M, Roukos DH, Pawitan Y, Soong R, Iacopetta B (2012). "Gene discovery in familial cancer syndromes by exome sequencing: prospects for the elucidation of familial colorectal cancer type X.". Mod Pathol. 25: 1055–68. doi:10.1038/modpathol.2012.62. PMID   22522846.
30. Chia WK, Ali R, Toh HC (2012). "Aspirin as adjuvant therapy for colorectal cancer-reinterpreting paradigms". Nat Rev Clin Oncol. 9: 561–70. doi:10.1038/nrclinonc.2012.137.
31. Spitz MR, Caporaso NE, Sellers TA (2012). "Integrative cancer epidemiology—the next generation". Cancer Discov. 2: 1087–90. doi:10.1158/2159-8290.cd-12-0424. PMID 23230187.
32. Zaidi N, Lupien L, Kuemmerle NB, Kinlaw WB, Swinnen JV, Smans K (2013). "Lipogenesis and lipolysis: The pathways exploited by the cancer cells to acquire fatty acids". Prog Lipid Res. 52: 585–9. doi:10.1016/j.plipres.2013.08.005. PMID 24001676.
33. Ikramuddin S, Livingston EH (2013). "New Insights on Bariatric Surgery Outcomes". JAMA. 310: 2401–2. doi:10.1001/jama.2013.280927.
34. Little TJ, Allen JE, Babayan SA, Matthews KR, Colegrave N (2012). "Harnessing evolutionary biology to combat infectious disease". Nature Medicine. 18: 217–220. doi:10.1038/nm.2572. PMID 22310693.
35. Pybus OG, Fraser C, Rambaut A (2013). "Evolutionary epidemiology: preparing for an age of genomic plenty". Phil Trans R Soc B. 368: 20120193. doi:10.1098/rstb.2012.0193.
36. "Principles of Epidemiology." Key Concepts in Public Health. London: Sage UK, 2009. Credo Reference. 1 August 2011. Web. 30 September 2012.
37. Hennekens, Charles H.; Julie E. Buring (1987). Mayrent, Sherry L. (ed.). Epidemiology in Medicine. Lippincott, Williams and Wilkins. ISBN 978-0-316-35636-7.
38. Rothman, Kenneth J. (1986). Modern Epidemiology. Boston/Toronto: Little, Brown and Company. ISBN   0-316-75776-4.
39. Hill, Austin Bradford (1965). "The Environment and Disease: Association or Causation?". Proceedings of the Royal Society of Medicine. 58 (5): 295–300. PMID 14283879.
40. Phillips, Carl V.; Karen J. Goodman (October 2004). "The missed lessons of Sir Austin Bradford Hill". Epidemiologic Perspectives and Innovations. 1 (3): 3. doi:10.1186/1742-5573-1-3. PMID 15507128. Archived from the original on 23 May 2008.
41. Green, Michael D.; D. Michal Freedman, and Leon Gordis. Reference Guide on Epidemiology (PDF). Federal Judicial Centre. Archived from the original (PDF) on 27 February 2008. Retrieved 3 February 2008.
42. Neil Myburgh; Debra Jackson. "Measuring Health and Disease I: Introduction to Epidemiology". Archived from the original on 1 August 2011. Retrieved 16 December 2011.
43. Smetanin, P.; P. Kobak (October 2005). Interdisciplinary Cancer Risk Management: Canadian Life and Economic Impacts. 1st International Cancer Control Congress (PDF).
44. Smetanin, P.; P. Kobak (July 2006). A Population-Based Risk Management Framework for Cancer Control. The International Union Against Cancer Conference. Archived from the original (PDF) on 2 February 2014.
45. Smetanin, P.; P. Kobak (July 2005). Selected Canadian Life and Economic Forecast Impacts of Lung Cancer. 11th World Conference on Lung Cancer. Archived from the original (PDF) on 2 February 2014.
46. WHO, "Health topics: Epidemiology." http://www.who.int/topics/epidemiology/en/ Accessed: 30 October 2017.
47. Miquel Porta. A Dictionary of Epidemiology. http://global.oup.com/academic/product/a-dictionary-of-epidemiology-9780199976737?cc=us&lang=en 6th edition, New York, 2014 Oxford University Press ISBN   978-0-19-997673-7 Accessed: 30 October 2017.
48. Prudhon, C & Spiegel, P. "A review of methodology and analysis of nutrition and mortality surveys conducted in humanitarian emergencies from October 1993 to April 2004" Emerging Themes in Epidemiology 2007, 4:10. http://www.ete-online.com/content/4/1/10 Accessed: 30 October 2017.
49. Roberts, B et al. "A new method to estimate mortality in crisis-affected and resource-poor settings: validation study." International Journal of Epidemiology 2010;39:1584–1596. Accessed: 30 October 2017.
50. Ioannidis, J. P. A.; Tarone, R.; McLaughlin, J. K. (2011). "The False-positive to False-negative Ratio in Epidemiologic Studies". Epidemiology. 22 (4): 450–456. doi:10.1097/EDE.0b013e31821b506e. PMID   21490505.
51. Hernán, M. A.; Hernández-Díaz, S.; Robins, J. M. (2004). "A structural approach to selection bias". Epidemiology. 15 (5): 615–625. doi:10.1097/01.ede.0000135174.63482.43. PMID   15308962.
52. Archived 29 August 2017 at the Wayback Machine 24
53. Rothman, K. (2002). Epidemiology: An Introduction. Oxford: Oxford University Press. ISBN   0195135547.
54. Greenland S, Morgenstern H (2001). "Confounding in Health Research". Annu. Rev. Public Health. 22: 189–212. doi:10.1146/annurev.publhealth.22.1.189.
55. "Public Health Studies". Public Health Studies at Johns Hopkins. Retrieved 13 April 2017.