Epidemiology is the indispensable basic science of public health. It provides the logical framework for the facts that enable public health officials to identify important public health problems and to delineate their dimensions. Epidemiologic methods are used to define these health problems; to classify, identify, and elucidate their causes; and to plan and evaluate rational control measures.


In ancient times, epidemics and plagues were terrifying natural phenomena that cried out for a more rational explanation than that they were due to the wrath of god or the machinations of evil spirits. Hippocrates (c. 460–377 b.c.e.) described many kinds of epidemics, and in On Airs, Waters, Places and other writings he offered empirical insights into environmental and behavioral factors that might be associated with certain kinds of disease. Although doctors and others engaged in the healing arts did not clearly understand the concept of contagion until several hundred years later, Fracastorius (c. 1478–1553) identified several ways that infections can be transmitted: by direct contact, by what we now call droplet spread, and by contaminated clothing.

The science of epidemiology took root with empirical observations of epidemics and other causes of death. John Graunt (1620–1674), in London, compiled the first mortality tables from England's bills of mortality. Statistical analyses of deaths due to childbed fever by Ignaz Semmelweis (1818–1865) in Vienna in the mid-nineteenth century and of tuberculosis by Pierre Charles Alexandre Louis (1787–1872) in Paris demonstrated the power of numbers. In London, in 1848 and 1854, meticulous, logical examination of the facts and figures about cholera epidemics by John Snow (1813–1858) revealed the mode of communication of this deadly epidemic disease. Snow is regarded as the founder of modern epidemiology because of his use of such careful methods.

Until early in the twentieth century almost all epidemiology focused on communicable diseases, although Percivall Pott's (1714–1788) observations on cancer of the scrotum in chimney sweeps and James Lind's dietary experiment with fresh fruit to prevent scurvy (1747) were precursors of modern noncommunicable disease epidemiology and clinical trials, respectively. The use of epidemiology in studies of coronary heart disease and cancer, in large-scale trials of many new preventive and therapeutic regimens, in nationwide surveys of health status, and in evaluation of health services came to the fore in the second half of the twentieth century. In the final quarter of the twentieth century, powerful computers, information technology, and more rigorous methodological approaches transformed epidemiology and made it a mandatory feature of clinical science as well as the most fundamental basic science of public health.


The word "epidemiology" was coined in the mid-nineteenth century to describe the scientific study of epidemics. Its meaning has expanded over the years, and present-day epidemiology encompasses the study of all varieties of illness and injury as they affect defined groups of people. In 1983 a committee representing the International Epidemiological Association defined epidemiology as "the study of the distribution and determinants of health-related states or events in specified populations, and the application of this study to control of health problems." Study includes observation, surveillance, hypothesis-testing research projects, analysis of epidemiologic and other kinds of data, and certain other kinds of experiments. Distribution includes analysis of data according to the time scale over which events occur, the places where the events occur, and the categories of persons to whom they occur. Determinants are all the physical, biological, behavioral, social, and cultural factors that influence health. Health-related states or events include diseases, causes of death, behaviors such as the use of tobacco, reactions to preventive regimens, and provision and use of health services. Specified populations are those with identifiable characteristics such as known numbers and age groups. The ultimate aim and purpose of epidemiology, which is to promote, protect, and restore good health, is manifested in the "application of this study to control of health problems."

Epidemiologists attempt to identify, measure, count, and control diseases, injuries, and causes of untimely death; and to relate these events to the associated inherited, environmental, and behavioral factors that cause or contribute to them. One of the great intellectual challenges of epidemiology is to dissect these factors and unravel their connections in order to identify exactly what is ultimately responsible for a particular disease or health problem.


The information used by epidemiologists comes from a diverse array of sources; draws on a wide range of sciences and technologies; and calls on the expertise of technologists and other people engaged in many kinds of crafts. Some connections are obvious: those with vital statistics, biostatistics, microbiology, immunology, and chemistry, and with every clinical specialty from pediatrics to geriatrics and palliative care, and from family practice to hematology and neurosurgery. Other obvious connections are to the social and behavioral sciences, and, less obviously, to animal husbandry, wildlife biology, agricultural science, physics, atmospheric sciences, oceanography, engineering, town planning, education, law enforcement, communications technology, and the media. Epidemiology may be the most ecumenical of all the sciences. Probably no other branch of biomedical science has so many connections to such a wide range of other human activities.


The basis of all epidemiology is the comparison of groups of people. For these comparisons to be valid, it is necessary to convert raw numbers into rates. A rate is a fraction: the upper part (the numerator) is the number of people affected by the problem, event, or condition of interest; the lower part (the denominator) is the number of persons in the population who are at risk of experiencing the problem, event, or condition. Because the events normally continue over a long period, often indefinitely, rates are expressed in relation to a specified time. Since fractions are awkward to deal with, there is commonly a multiplier, and the rate, as shown in the following formula, is expressed in terms of so many per thousand, per hundred thousand, etc., in a specified time, usually a year, though shorter periods are used when circumstances warrant it:

Rate = (number of events in a specified period / population at risk during that period) × 10^n, where 10^n is the multiplier (1,000, 100,000, etc.)

In practice there are many variations in the ways rates are expressed, but the basic elements of events, population at risk, and time are common to all.
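The three elements named above (events, population at risk, and a multiplier applied for a specified period) can be put into a few lines of Python. This is only a minimal sketch: the function name `rate` and the figures are hypothetical and do not come from any real population.

```python
def rate(events, population_at_risk, multiplier=1_000):
    """Events per `multiplier` persons at risk during the period observed."""
    if population_at_risk <= 0:
        raise ValueError("population at risk must be positive")
    return events / population_at_risk * multiplier


# Hypothetical figures: 180 deaths in one year among 40,000 people at risk.
deaths_per_1000 = rate(180, 40_000)            # expressed per 1,000 per year
deaths_per_100k = rate(180, 40_000, 100_000)   # same rate per 100,000 per year

print(f"{deaths_per_1000:.1f} per 1,000; {deaths_per_100k:.0f} per 100,000")
```

Changing the multiplier changes only the units in which the rate is reported, not the underlying fraction.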

Rates have many uses. By comparing rates, epidemiologists can examine the experience of particular groups of people at specified times, in different cities, countries, or occupational groups. The observed differences are the basis for inferences about the reasons for these differences, and are used to test hypotheses about these reasons, for instance about the putative cause of a particular kind of cancer. Besides being an absolute requirement for valid comparisons, rates are used to calculate the risks to individuals and groups of experiencing an event such as a heart attack, the occurrence of cancer, or traffic injury. Comparisons are often rendered invalid, or relatively unreliable, by differences among the populations being compared, frequently because of failure to allow for various kinds of biases and confounding factors. A common problem stems from differences in the age composition of populations that are being compared. This problem is overcome by the procedure of age-adjustment. Another problem is that there may be important qualitative differences, such as health or employment status, between groups that are being compared.

The terms "incidence" and "prevalence" are often confused. Incidence refers to the number of new cases, events, or deaths, that occur in a specified time, usually one year. Prevalence refers to the total number of events or cases, both new and long-term, that are present at a particular point in time. Prevalence is therefore expressed as a number, not a rate, as there is no time dimension involved.


An epidemic is the occurrence of a number of cases of a disease clearly in excess of normal expectation. This is usually a large number when the disease is one of the common infectious fevers, but even a single case of a dangerous contagious disease, such as typhoid, that has long been absent from a community should suffice to activate the highest level of epidemic surveillance and control measures. The occurrence of a small number of cases of a rare variety of cancer, closely clustered in time and space, may also signal an epidemic. Observational and analytic epidemiology blend in the investigation of epidemics. The investigation demands meticulous attention to detail in collecting information about all the cases of the condition, including mild and inconspicuous cases as well as those with florid manifestations, and must include details about all possible associated factors, such as dietary intake (this is especially important in outbreaks of food poisoning), occupation, living conditions, and unusual recent experiences. Particular attention is paid to the index case, the first identified case of a condition. In most infectious disease epidemics, this could be the case that introduced the infection into the affected community. Information is also gathered about healthy people in the same community, aimed at discovering why they have not been affected. Laboratory tests are used to confirm the diagnosis; to identify the pathogenic organism, toxic chemical, or other agent that caused the disease; and to measure immunological responses among both sick and healthy people. Analyzing all this information often clarifies the nature and cause of an epidemic and points the way to appropriate control measures.
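One way to make "clearly in excess of normal expectation" concrete is to ask how surprising an observed count would be if only the usual baseline were operating. The sketch below assumes, purely for illustration, that baseline case counts follow a Poisson distribution; that assumption, the function name `poisson_tail`, and the numbers are not taken from the text, and a real investigation would still rest on the judgment described above.

```python
import math


def poisson_tail(observed, expected):
    """P(X >= observed) when X is Poisson-distributed with mean `expected`."""
    if observed <= 0:
        return 1.0
    below = sum(
        math.exp(-expected) * expected**k / math.factorial(k)
        for k in range(observed)
    )
    return max(0.0, 1.0 - below)


# Hypothetical surveillance figures: about 2 cases a week is normal, 9 were reported.
expected_cases = 2.0
observed_cases = 9
p = poisson_tail(observed_cases, expected_cases)

print(f"Chance of {observed_cases} or more cases at baseline: {p:.5f}")
```

A very small probability suggests the count is unlikely to reflect ordinary fluctuation, but it says nothing about the cause.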

Investigating epidemics can be tedious because it needs to be so painstaking; it can even seem a boring routine task. But often it is as exciting as detective fiction. For example, an epidemic of typhoid in Aberdeen, Scotland, was traced eventually to a contaminated can of processed beef from Argentina. The can had been cooled in a river adjacent to the canning works. As the pressure inside the can fell when it cooled, a partial vacuum was created and typhoid bacilli in raw sewage in the water were sucked into the can through a minute hole.

Identifying the existence of an epidemic sometimes requires unusual vigilance and an ability to make connections among seemingly isolated events. An epidemic of lethal pneumonia among members of the American Legion who attended a convention in Philadelphia in 1976 and then returned to their hometowns before becoming ill would not have come to light without rigorous scrutiny on the part of Epidemic Intelligence Service officers of the Centers for Disease Control. Subsequent investigations led to the identification of Legionnaires' disease.

Techniques of molecular biology, notably DNA typing and the identification of biomarkers, have immensely enhanced the precision of epidemic investigation. It is now possible to trace the exact passage of an infectious agent such as the gonococcus or HIV (human immunodeficiency virus) as it is transmitted by direct contact from one individual to another among a group of people; or to show that coughing by a passenger with open pulmonary tuberculosis on a crowded airline flight can cause primary tuberculous infection of other passengers in the same compartment of that flight; or to determine how certain cancer-causing agents actually induce cancer. Books and articles in the popular press, notably the accounts by the journalist Berton Roueché in the New Yorker, and some TV programs have communicated the excitement and challenge of epidemic investigations.


The application of several analytic methods of epidemiologic study has contributed substantially to scientists' understanding of disease causation, and therefore to control and prevention of many conditions of great public health importance. The available methods are observational epidemiology (the empirical study of naturally occurring events), analytic study, and, under carefully defined conditions and with all due ethical safeguards, human experimentation.

Observational Epidemiology. This method begins with surveillance of populations, using vital and health statistics, including analysis of death rates arranged by age, sex, locality, and cause of death. Other information is derived from notified cases of infectious diseases of public health importance, from registries of cancer or other diseases, and from hospital discharge statistics. Since 1957, the National Center for Health Statistics has continuously conducted a National Health Survey that has carried observational epidemiology to new levels of comprehensiveness.

It is often possible to make imaginative use of many other kinds of available information about defined population groups. Schools and many employers keep records of absences due to sickness, sometimes with reasons for these absences. Police and other law enforcement agencies keep records of calls to settle domestic disputes and of damage due to vandalism, which are useful indicators of social pathologies associated with local variations in the frequency of domestic violence, alcohol abuse, and broken families. All such sources of information combine to make it possible for epidemiologists and public health specialists to produce a multidimensional "community diagnosis." Serial measurements can indicate whether things are improving or getting worse, and in which ways these trends are moving for each of different indicators ranging from adolescent smoking behavior to reasons for long-term disability among the elderly.

Analytic Observational Studies. The possibilities of observational epidemiology are considerable, but not limitless. They are powerfully reinforced by analytic studies. The two main analytic methods are the case-control study and the cohort study.

Careful questioning of patients has enabled many doctors to make inferences about the influence of past experience on present disease. Percivall Pott, an eighteenth-century British physician, observed that cancer of the scrotum occurred among former chimney sweeps, and correctly inferred that it was associated with the accumulation of tar in the skin creases. Two hundred years later, in 1940, Norman Gregg, an ophthalmologist in Sydney, Australia, similarly inferred correctly that the cases he was seeing of congenital cataract must be associated with rubella (German measles), which their mothers had had during early pregnancy.

The case-control study is a systematic extension of routine medical history taking, in which the past histories of patients (the cases) suffering from the condition of interest are compared to the past histories of persons (the controls) who do not have the condition of interest, but who otherwise resemble the cases in such particulars as age and sex. Analysis of data about a series of cases and controls may show differences that are statistically significant. Sometimes only small numbers of cases are required to demonstrate significant differences between cases and controls. This makes the case-control study a suitable way to search for causes of rare conditions. For example, the discovery that a very rare form of liver cancer was strongly associated with occupational exposure to vinyl chloride required only four cases, and the fact that expectant mothers' use of artificial estrogens during early pregnancy can cause cancer of the vagina many years later in their daughters was based on a case-control study of eight cases. Although case-control studies can be flawed by the presence of biases that are often difficult or even impossible to eliminate, they are a valuable method of investigation because they can be done rapidly and at relatively little expense. The findings can be confirmed or refuted by more rigorous research methods such as cohort studies.

A cohort study is conducted by identifying individuals in a defined population who are exposed to varying levels of known or suspected risk for the condition of interest, such as cancer of the lung or coronary heart disease. The population is observed over a certain period, and the death and disease incidence rates among those exposed to varying and known levels of risk are compared. Cohort studies require large numbers, commonly many thousands, and prolonged observation, commonly years or even decades. They are therefore expensive, requiring a large and dedicated staff and maintenance of detailed records of very large numbers of people, only a small proportion of whom will ultimately fall ill and die of the condition of interest. Some cohort studies have become famous. The people of Framingham, Massachusetts, have been the subjects of cohort studies of coronary heart disease since 1948. In 1951, Richard Doll and Austin Bradford Hill began a cohort study of lung cancer in relation to tobacco smoking in a cohort of about 40,000 male British doctors. Later phases of this study have expanded to include risk factors for coronary heart disease and other chronic conditions; and by the late 1990s this study had yielded dramatic evidence of the relationship of tobacco smoking to cancers of many kinds, and to coronary heart disease, chronic obstructive lung disease, and various other life-shortening chronic diseases.

It is possible to get results from a cohort study without waiting many years, if information about exposure to risk factors at some time in the past is available in sufficient detail for a population of sufficient size. A method that permits reliable linking of past and present medical and other relevant records, such as a record linkage system, facilitates this approach. Record linkage is the process of relating information from two or more sets of records (compiled years apart and sometimes by different agencies) about the same individuals. A prerequisite is a way to identify individuals with a high degree of precision, such as a unique numbering system, or a system combining a sequence of digits for birthdate, birthplace, and sex with alphabet letters or a phonetic code used for other details, such as the individual's mother's maiden name. Obviously the logistics of all this make it a costly method, but the yield can justify the expense. This method, known as an historical cohort study, has demonstrated the relationship of childhood cancer and developmental anomalies to prenatal maternal exposure to small diagnostic doses of X-rays. Record linkage and historical cohort studies have also demonstrated a relationship between birthweight and the occurrence of cardiovascular disease in middle age.
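The kind of identifier described above, digits for birthdate, birthplace, and sex combined with a phonetic code for the mother's maiden name, might be built roughly as follows. This is a sketch under stated assumptions: the text does not name a phonetic code, so American Soundex is swapped in as one well-known example, and the key layout, the three-digit birthplace code, and the function names are hypothetical.

```python
from datetime import date

# American Soundex digit assignments; vowels, H, W, and Y carry no digit.
SOUNDEX_CODES = {
    **dict.fromkeys("BFPV", "1"),
    **dict.fromkeys("CGJKQSXZ", "2"),
    **dict.fromkeys("DT", "3"),
    "L": "4",
    **dict.fromkeys("MN", "5"),
    "R": "6",
}


def soundex(name):
    """Four-character Soundex code, e.g. 'Robert' -> 'R163'."""
    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""
    code = letters[0]
    prev = SOUNDEX_CODES.get(letters[0], "")
    for c in letters[1:]:
        digit = SOUNDEX_CODES.get(c, "")
        if digit and digit != prev:
            code += digit
        if c not in "HW":  # H and W do not separate letters sharing a code
            prev = digit
    return (code + "000")[:4]


def linkage_key(birthdate, birthplace_code, sex, mother_maiden_name):
    """Hypothetical key: YYYYMMDD + birthplace code + sex + phonetic code."""
    return f"{birthdate:%Y%m%d}{birthplace_code}{sex.upper()}{soundex(mother_maiden_name)}"


# Two hypothetical records, compiled years apart, with a spelling variation.
key_a = linkage_key(date(1931, 4, 12), "017", "f", "Smith")
key_b = linkage_key(date(1931, 4, 12), "017", "F", "Smyth")
print(key_a, key_b, key_a == key_b)
```

The phonetic code lets the two records match despite the spelling difference; the price is that distinct names can share a code, which is why it is combined with the other fields.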

Experimental Epidemiology. In the 1920s, experimental epidemiology meant observing the passage of infectious pathogens in colonies of rodents, but such experiments are rarely necessary, and the meaning of the term has changed. Experiments in which the investigator studies the effects of intentional alteration or intervention in the course of a disease are now done on humans rather than experimental animals, usually using a randomized controlled-trial design.

The randomized controlled trial (RCT) is a form of human experimentation in which the subjects, usually patients, are randomly allocated to receive either a standard accepted therapeutic or preventive regimen, or an experimental regimen. The purpose of random allocation is to eliminate or minimize bias in the selection of subjects. This greatly enhances the validity of the results. Preferably, the subjects and those observing the trial's results should be unaware of which subjects are receiving the experimental and control regimens, thus eliminating the power of suggestion as a factor influencing the response of individuals to the regimen. There are very important ethical constraints on the conduct of randomized controlled trials. The only ethically acceptable justification for conducting a randomized controlled trial is uncertainty about which of the available regimens is the best, a state of affairs known as "equipoise." It is absolutely essential to obtain the genuinely informed consent of all human subjects on whom a trial is conducted.
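Random allocation can be sketched in a few lines. The example below shuffles a list of subject identifiers and splits it into two arms; the function name `allocate`, the arm labels, and the subject identifiers are hypothetical, and real trials often use more elaborate schemes (blocking, stratification, concealed allocation) that this sketch does not attempt.

```python
import random


def allocate(subject_ids, seed=None):
    """Randomly split subjects into control and experimental arms."""
    rng = random.Random(seed)   # a seed makes the allocation reproducible for audit
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {
        "control": sorted(shuffled[:half]),
        "experimental": sorted(shuffled[half:]),
    }


subjects = [f"S{n:03d}" for n in range(1, 21)]   # 20 hypothetical subjects
arms = allocate(subjects, seed=2024)
print(len(arms["control"]), len(arms["experimental"]))
```

Because chance rather than the investigator decides each assignment, the selection of subjects for either regimen cannot be influenced by anyone's expectations.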


In the final quarter of the twentieth century, physicians in clinical practice discovered the value of epidemiologic methods in enhancing the efficacy of treatment regimens, mainly through rigorous attention to the nature and quality of the evidence on which clinical decisions are based. Evidence-based medicine then moved into public health practice, where it is illuminating decisions about many aspects of public health practice, such as the most effective way to deploy public health nurses in a local health department.


Epidemiology made spectacular progress in several other directions in the 1990s. One was in the application of molecular biology, resulting in what is sometimes called molecular epidemiology. Other advances have been made in genetic epidemiology, where the meeting of molecular genetics with public health, occupational and environmental health, and infant and child health has produced both exciting stories of great progress and difficult ethical and moral problems. What are scientists and physicians to do, for instance, with the newfound knowledge and technical capability to identify defective genes, especially genes that, in interaction with some environmental circumstances, can disqualify certain individuals from particular occupations and can render others ineligible for life insurance? Such dilemmas presage a testing time for society's values.

Another set of new challenges faces epidemiologists who specialize in studies of risk management. The global environment is changing as the burden of greenhouse gases increases and leads to a rise in average global ambient temperatures, and remote sensing and climate models enable us to predict the likely future distribution of vector-borne diseases such as malaria, dengue, and schistosomiasis. A new realm of risk factor analysis is thus emerging, based on future health scenarios that incorporate climate models and, in the most sophisticated applications, include sets of models for future patterns of biodiversity, human settlements, and economic and industrial dynamics. In these ways epidemiologists are helping to plan the public health services that will be needed in the future.

John M. Last

Epidemiology is the study of the frequency, distribution, and determinants of disease in humans. Its aim is the prevention or effective control of disease. The term originated in the study of epidemics, rapidly spreading diseases that affect large numbers of a population (from the Greek epi meaning upon and demos meaning people). Epidemiology touches on ethics in two key areas: The need for competent and honest use of its information, and questions of responsibility raised by the global picture it presents of the health of humanity.

Speculation about the nature and causes of disease dates back to antiquity. The formal history of epidemiology, like that of statistics, begins with the systematic official recording of births and deaths in the seventeenth century, proceeding to the quantitative investigation of diseases with the emergence of scientific medicine in the nineteenth. Based on the theory of probability, statistical inference reached maturity in the early-twentieth century and gradually spread into a wide range of disciplines. Its application to medical research gave rise to biostatistics and contemporary epidemiology.

There is no clear division between the two fields. Epidemiology focuses more on public health issues and the need for valid population-based information, but it uses the theory and methods of biostatistics. Its practitioners tend to be individuals with primary interest and training in medicine or a related science, whereas biostatisticians come from mathematics. They work together as members of the medical research team, in the dynamic context of scientific advances and the latest information technology.

Modern Epidemiology

The mathematical approach to medicine, with the methodical tabulation of patient information on diseases and treatment outcomes, was introduced in the 1830s by the French physician Pierre C. A. Louis (1787–1872). As a notable result of his researches in Paris hospitals, his Numerical Method revealed the uselessness of bloodletting. Inspired by Louis, his British student William Farr (1807–1883) became the central figure in the development of vital statistics in England and the use of statistics to address public health concerns. Farr worked with John Snow (1813–1858), the physician who investigated the cholera epidemic sweeping through London in 1854. Snow's finding that the cholera poison was transmitted in contaminated water from the Broad Street pump was a milestone event in epidemiology and public health. Farr also provided guidance in statistics for Florence Nightingale (1820–1910) to support her work in hospital reform.

The existence of microbes was discovered in the late-seventeenth century by the Dutch lens grinder Antonie van Leeuwenhoek (1632–1723), who saw "animalcules, more than a million for each drop of water" through his microscope (Porter 1998, p. 225). The role of germs as causes of disease was established by Louis Pasteur (1822–1895), French chemist and founder of microbiology. Pasteur invented methods to isolate and culture bacteria, and to destroy them in perishable products by a heat treatment now called pasteurization. He found that inoculation by a weakened culture provided immunity, protection against the disease. This explained the earlier discovery of the English physician Edward Jenner (1749–1823) that vaccination with the milder cowpox protected against smallpox. (Vaccination comes from the Latin vacca meaning cow.) The German physician Robert Koch (1843–1910), founder of bacteriology, further developed techniques of isolating and culturing bacteria. He identified the germ causing anthrax in 1876, tuberculosis in 1882, and cholera in 1883. He contributed to the study of other major diseases, including plague, dysentery, typhoid fever, leprosy, and malaria.

Extensive public health measures of hygiene and immunization, along with the introduction of the sulfonamide drugs in the late 1930s and antibiotics in the 1940s, brought most infectious diseases under control. Attention turned to chronic diseases, by then the leading causes of morbidity and mortality: multicausal diseases with a long latency period and natural course. Two historic discoveries of the mid-twentieth century were tobacco use as a cause of lung cancer, and risk factors for heart disease. From the study of infectious and chronic diseases epidemiology has evolved into a multidimensional approach, defined by disease, exposure, and methods, with focus on new developments in medical science. Its many specialties include cancer, cardiovascular, and aging epidemiology, environmental, nutritional, and occupational epidemiology, clinical and pharmaco-epidemiology, and molecular and genetic epidemiology. With the sequencing of the human genome, genetics is assuming increasing importance across all lines of inquiry. In its principles of studying human populations, epidemiology is related to psychology, sociology, and anthropology, all of which employ statistical inference.

Table 1: Some Basic Terms of Epidemiology

Measures of Morbidity and Mortality
• PREVALENCE (Burden of disease): Number of existing cases of a disease at a given point in time divided by the total population.
• INCIDENCE (Cumulative incidence, risk): Number of new cases of a disease during a given time period divided by the total population at risk.
• INCIDENCE RATE (Incidence density): Number of new cases of a disease during a given time period divided by the total person-time of observation.
• PERSON-TIME (usually person-years): Total disease-free time of all persons in the study, allowing for different starting dates and lengths of time observed.
• CRUDE DEATH RATE: Number of deaths during a given time period divided by the total population.
• STANDARDIZED DEATH RATE: Crude death rate adjusted to control for age or other characteristic to allow valid comparisons using a standard population.

Example of Age-Adjusted Death Rates (2000 US Standard Population)

                                                        Alaska   Florida   United States
Crude death rate per 1,000 population (in 2000)            4.6      10.3             8.9
Percent of population over age 65 (in 2000)                5.7      17.6            12.4
Age-adjusted death rate per 100,000 population (avg. for 1996–2000):
  Breast cancer                                           25.2      25.6            27.7
  Prostate cancer                                         24.2      28.4            32.9

Prevalence, incidence, and death rates are expressed in units of a base (proportion multiplied by base), usually per 1,000 or 100,000 population.
SOURCE: Courtesy of Valerie Miké. Data in example from U.S. Census Bureau website and American Cancer Society (2004).

Table 2: Case-Control Study of Lung Cancer and Smoking

Smokers          Lung Cancer   Controls   Odds Ratio (ad/bc)
Total (men)              649        649
Total (women)             60         60

Historic study showing the association between cigarette smoking and lung cancer. No association would correspond to an odds ratio of 1. P-values obtained by chi-square test for 2×2 tables.
SOURCE: Data from Doll and Hill (1950).

Table 3: Cohort Study of Risk Factors for Coronary Heart Disease: Systolic Blood Pressure

Systolic BP (mmHg)      Age 35–64        Age 65–94
Total Events            516    305       244    269

Average annual incidence per 1,000 persons of coronary heart disease, by systolic blood pressure. Example of relative risk (RR): For men (35–64), systolic BP >180 relative to <120 mmHg, RR = 22/7 = 3.1. No association would correspond to RR = 1. Results of 30-year follow-up in historic Framingham Heart Study of risk factors for cardiovascular disease.
SOURCE: Adapted from Stokes et al. (1989).

Basic Concepts and Methods

Epidemiology may be descriptive or analytic. Descriptive epidemiology reports the general characteristics of a disease in a population. Its methods include case reports, correlational studies (to describe any association between potential risk factors and disease in a given database) and cross-sectional surveys (to determine prevalence of a disease and potential risk factors at a given point in time). Analytic epidemiology uses observational and experimental studies. The latter are clinical trials to test the effectiveness of interventions to treat or prevent a disease. But experimentation on humans is not ethically feasible for studying causes of disease. Observational research designs are thus the primary tools of epidemiology, the main types being case-control and cohort studies. After definition of some basic terms, these are discussed further below.

Interpreting a Statistical Association

Possible Reasons for an Observed Statistical Association
1. CHANCE: The association arose by chance alone. The P-value measures this possibility: it is the probability of observing an association at least as strong as the one seen if there were in fact no true association.
2. BIAS: Systematic errors that distort the results, such as selection bias, recall bias, and observation bias.
3. CONFOUNDING: There is an extraneous, confounding variable (perhaps as yet unknown) that is related to the risk factor being studied and is an independent risk factor for the disease.
4. CAUSE-AND-EFFECT: The risk factor in the observed association is a cause of the disease.

Careful study is required to assess potential biases and confounding variables. General guidelines for establishing causality are provided by Hill's Criteria (Table 6).
SOURCE: Courtesy of Valerie Miké.

MEASURES OF MORBIDITY AND MORTALITY. Some basic concepts of epidemiology are listed in Table 1. It is important to distinguish between the prevalence of a disease and its incidence. Prevalence signifies the amount of disease present at a point in time, such as the proportion of people with adult-onset diabetes in the United States on January 1, 2005. Incidence refers to new cases diagnosed during a given period of time, such as the proportion of U.S. adults diagnosed with diabetes in 2005. The denominator of incidence rate is person-time, a useful concept that allows for inclusion of subjects with different starting dates and lengths of time observed in a study. Causes of a disease can be investigated by observing incidence in a well-defined group of subjects without the disease, and patterns of disease incidence can be compared over time or populations.
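The distinction between cumulative incidence and the person-time incidence rate is easier to see with numbers. The sketch below follows the definitions in Table 1; the six follow-up records and the prevalence figures are hypothetical, chosen only to keep the arithmetic visible.

```python
# Each tuple: (disease-free years observed, developed the disease during follow-up?)
follow_up = [
    (5.0, False),
    (2.5, True),
    (4.0, False),
    (1.0, True),
    (5.0, False),
    (3.5, False),
]

new_cases = sum(1 for _, became_ill in follow_up if became_ill)
persons_at_risk = len(follow_up)
person_years = sum(years for years, _ in follow_up)

# Cumulative incidence (risk): new cases divided by persons at risk at the start.
cumulative_incidence = new_cases / persons_at_risk

# Incidence rate (incidence density): new cases divided by total person-time.
incidence_rate = new_cases / person_years


def prevalence(existing_cases, total_population):
    """Proportion of a population with the disease at one point in time."""
    return existing_cases / total_population


print(f"Cumulative incidence: {cumulative_incidence:.3f}")
print(f"Incidence rate: {incidence_rate * 1_000:.1f} per 1,000 person-years")
print(f"Prevalence: {prevalence(120, 10_000) * 1_000:.0f} per 1,000")
```

Subjects observed for different lengths of time contribute unequally to the denominator of the incidence rate, which is what allows staggered entry and early exits to be handled.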

Mortality is measured in terms of crude death rate, the actual proportion observed, or the standardized death rate, which involves adjustment for some characteristic. The example shows age-adjusted cancer death rates for the states of Alaska and Florida. Alaska has a much lower crude death rate than Florida, but its population is much younger. Both breast and prostate cancer are associated with older age, but after age-adjustment the two states are seen to have similar death rates for these two sites, both lower than the national average. The adjusted figures are meaningless in themselves, but provide for valid comparison of rates across groups and time. U.S. cancer death rates have been adjusted using the 2000 U.S. age distribution to make them comparable back to 1930 and ahead to the future.
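The effect of age adjustment can be sketched with direct standardization, one common way of computing a standardized rate. The two regions, their age bands, and the standard weights below are invented for illustration; they are not the Alaska and Florida figures referred to above.

```python
def crude_rate(deaths, population, per=100_000):
    """Total deaths divided by total population."""
    return per * sum(deaths) / sum(population)


def age_adjusted_rate(deaths, population, standard_weights, per=100_000):
    """Direct standardization: weight each age-specific rate by the
    share of that age band in a chosen standard population."""
    total_weight = sum(standard_weights)
    return per * sum(
        (d / p) * (w / total_weight)
        for d, p, w in zip(deaths, population, standard_weights)
    )


# Hypothetical data for three age bands (young, middle, old).
# Both regions have identical age-specific death rates; only the age mix differs.
young_region_pop = [800_000, 150_000, 50_000]
young_region_deaths = [80, 150, 500]
old_region_pop = [300_000, 300_000, 400_000]
old_region_deaths = [30, 300, 4_000]
standard = [0.5, 0.3, 0.2]  # assumed standard age distribution

crude_young = crude_rate(young_region_deaths, young_region_pop)
crude_old = crude_rate(old_region_deaths, old_region_pop)
adj_young = age_adjusted_rate(young_region_deaths, young_region_pop, standard)
adj_old = age_adjusted_rate(old_region_deaths, old_region_pop, standard)

print(f"crude:    {crude_young:.1f} vs {crude_old:.1f} per 100,000")
print(f"adjusted: {adj_young:.1f} vs {adj_old:.1f} per 100,000")
```

The crude rates differ almost sixfold purely because of age structure, while the adjusted rates coincide. The adjusted value itself depends on the standard chosen, which is why it is useful only for comparison.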

OBSERVATIONAL RESEARCH DESIGNS: CASE-CONTROL AND COHORT STUDIES. A case-control study is retrospective: It identifies a group of people with the disease (cases) and selects a group as similar as possible to the cases but without the disease (controls). The aim is to determine the proportion of each group who were exposed to the risk factor of interest and to compare the two proportions. Table 2 shows results of the case-control study of lung cancer and smoking reported in 1950 by Sir Richard Doll (1912–2005) and Sir Austin Bradford Hill (1897–1991), British pioneers of epidemiology and biostatistics. They identified 649 men and 60 women with lung cancer in twenty London hospitals and matched them with controls of the same age and sex but without lung cancer. The information they collected on all participants included their smoking history. The observed association, measured by the so-called odds ratio (the odds of smoking in cases over the odds of smoking in controls), was clearly statistically significant.
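The odds ratio of a case-control study can be sketched as follows. The counts are hypothetical and are not the figures of Table 2. The confidence interval uses Woolf's logarithmic approximation for an unmatched 2×2 table, which is an assumption made here for simplicity; a matched design such as the one described above would ordinarily call for a matched analysis.

```python
import math


def exposure_odds_ratio(cases_exposed, cases_unexposed,
                        controls_exposed, controls_unexposed):
    """Odds of exposure among cases divided by odds of exposure among controls."""
    odds_cases = cases_exposed / cases_unexposed
    odds_controls = controls_exposed / controls_unexposed
    return odds_cases / odds_controls


def woolf_interval(a, b, c, d, z=1.96):
    """Approximate 95% confidence interval for the odds ratio (a*d)/(b*c)."""
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return math.exp(log_or - z * se), math.exp(log_or + z * se)


# Hypothetical counts: 100 cases (90 exposed), 100 controls (60 exposed).
a, b = 90, 10   # cases: exposed, unexposed
c, d = 60, 40   # controls: exposed, unexposed

odds_ratio = exposure_odds_ratio(a, b, c, d)
lower, upper = woolf_interval(a, b, c, d)
print(f"odds ratio {odds_ratio:.1f}, 95% CI {lower:.2f} to {upper:.2f}")
```

An interval that lies entirely above 1.0, as in this illustration, corresponds to an association that is statistically significant at the conventional 5 percent level.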

A cohort study is usually prospective. (It may be historical, if based on recorded past information.) It identifies a large group (cohort) of individuals who do not have the disease but for whom complete information is available concerning the risk factor(s) of interest; the cohort is then observed for the occurrence of the disease. A noted cohort design was the Framingham Heart Study, initiated by the U.S. Public Health Service in 1948 to identify risk factors for heart disease. Over 5,000 adult residents of Framingham, Massachusetts, men and women with negative test results for cardiovascular disease, agreed to join the study and undergo repeat testing at two-year intervals. The age and test measures at the start of each two-year period were used to classify subjects. Results of a thirty-year follow-up evaluation (part of a multivariate analysis including other risk factors and cardiovascular outcomes) are shown in Table 3, demonstrating a strong association between systolic blood pressure and incidence of coronary heart disease. Other suitable groups for cohort studies are members of professional groups, like doctors and nurses.

There are advantages and disadvantages pertaining to each research design, and the choice depends on the circumstances of the scientific question of interest. Any observed association then requires careful interpretation.

Association or Causation?

Possible reasons for an observed statistical association are listed in Table 4. Chance is what the P-value addresses: the probability of observing an association at least as strong as the one seen if chance alone were at work. Bias refers to systematic errors that do not cancel out with larger sample size, but distort the results in one direction. For example, in a case-control study patients with the disease may be more likely to recall exposure to the risk factor than the controls, leading to recall bias. Bias is a serious problem in observational studies and needs to be assessed in the particular context of each research design. Confounding is the effect of an extraneous variable that is associated with the risk factor, but is also an independent risk factor for the disease. For example, an association between birth rank and Down's syndrome, the genetic disorder Trisomy 21 (an extra copy of chromosome 21), does not imply causality; the confounding variable is maternal age, which is associated with birth rank and is a known risk factor for the disease. There may also be confounding variables as yet unknown, but their potential effects must always be considered.

Table 5. Koch's Postulates for Establishing the Causes of Infectious Diseases, with Molecular Update

Koch's Postulates
1. The microorganism should be found in all cases of the disease in question, and its distribution in the body should be in accordance with the lesions observed.
2. The microorganism should be grown in pure culture in vitro (or outside the body of the host) for several generations.
3. When such a pure culture is inoculated into susceptible animal species, the typical disease must result.
4. The microorganism must again be isolated from the lesions of such experimentally produced disease.

Molecular Koch's Postulates*
1. The phenotype or property under investigation should be significantly associated with pathogenic strains of a species and not with nonpathogenic strains.
2. Specific inactivation of the gene or genes associated with the suspected virulence trait should lead to a measurable decrease in pathogenicity or virulence.
3. Reversion or replacement of the mutated gene with the wild type gene should lead to restoration of pathogenicity or virulence.

Proposed in 1884 by Robert Koch for bacteria, the original wording has been modified to include other microbes. Further versions use molecular biology as a tool to associate microbial agents with disease. Table is adapted from a leading textbook of medical microbiology.
*In addition, guidelines for establishing microbial disease causation in terms of the prevalence of the nucleic acid sequence of a putative pathogen in relation to disease status are given in the third column of the table from which this is taken.
SOURCE: Brooks et al. (2001), p. 134.

The establishment of causation is a long-debated problem in the philosophy of science. In the practical field of medicine, where life-and-death decisions must be made every day, there are guidelines to help assess the role of agents in the etiology of disease. When microbes were being identified as causes of devastating diseases in the late nineteenth century, Robert Koch formulated postulates to prove that a particular microbe causes a given disease. Anticipated by his teacher Jacob Henle (1809–1885), these are also called Henle-Koch Postulates. They are shown in Table 5, along with current updates using molecular biology. The original version claims only necessary causation, not sufficient causation; the microorganism needs a susceptible host. Even more general, the molecular guidelines are expressed in terms of statistical association. But the postulates remain the organizing principle in contemporary studies of microbial etiology, crucial for the identification of newly emerging pathogens that may pose serious threats to public health.

Guidelines for establishing causality in observational studies are listed in Table 6. Formulated by Sir Austin Bradford Hill, they are based on criteria employed in the 1964 U.S. Surgeon General's Report to show that smoking causes lung cancer. Applied in a wider context, they are to be used primarily as an aid to exploration. In general there is no necessary or sufficient condition to establish causality from an observed association. Such conclusions result from a consensus of the scientific community.

Epidemiology and Ethics

The complex, probing methods of epidemiology yield tentative, partial, often conflicting results, replete with qualifications. Taken out of context by interest groups or the media, they can mislead and have harmful consequences. Their correct use requires professional competence and integrity. But beyond these issues of immediate concern, epidemiology plays a larger role. With its adjusted measures allowing comparison of health patterns over space and time, it provides a quantitative aerial video of the globe. Some of the images it presents are troubling.

Table 6. Hill's Criteria for Establishing Causality in Observational Studies

Aspects of Association to Consider
1. STRENGTH: Stronger associations more likely to be causal.
2. CONSISTENCY: Association is observed repeatedly in different populations under different circumstances.
3. SPECIFICITY: Disease outcome is specific to or characteristic of exposure.
4. TEMPORALITY: Exposure precedes disease.
5. BIOLOGIC GRADIENT: Monotone dose-response relationship (increase in exposure corresponds to increase in disease).
6. PLAUSIBILITY: Causal hypothesis is biologically plausible.
7. COHERENCE: Causal interpretation does not conflict with what is known about the natural history and biology of the disease.
8. EXPERIMENTAL EVIDENCE: Removal of putative cause in an intervention or prevention program results in reduction of disease incidence and mortality.
9. ANALOGY: Drug or chemical structurally similar to a known harmful agent may induce similar harmful effects.

Formulated in 1965 by Sir Austin Bradford Hill, these are very general, tentative guidelines, with numerous exceptions and reservations. Aside from temporality, which may be considered part of the definition of causation, there is no necessary or sufficient criterion for establishing the causality of an observed association.
SOURCE: Hill (1965).

There are now more obese than undernourished people living on earth, and their number is increasing rapidly in developing nations. According to a 2000 estimate of the World Health Organization (WHO), there are 220 million adults with Body Mass Index (BMI) < 17, classified as undernourished, and over 300 million with BMI > 30, defined as obese. (BMI is weight in kilograms divided by the square of height in meters.) This global epidemic of obesity, called globesity, brings with it the related conditions of diabetes, hypertension, and heart disease, and the problem is equally serious for children.

The harmful effects of tobacco have been known for half a century, and while the prevalence of smoking has been slowly declining in most industrialized nations, it has been rising steadily in the developing world. It is estimated that the number of smoking-related premature deaths worldwide, 5 million in 2000, will rise to 10 million per year by 2030, with 70 percent occurring in developing countries. Tobacco use will kill more people than malaria, pneumonia, tuberculosis, and diarrhea combined.

In the area of infectious diseases, after decades of exuberant optimism, reality set in with the appearance of acquired immune deficiency syndrome (AIDS) in the 1980s. Homo sapiens lives in a sea of microbes and will never have total control. Vigilance for the emergence of disease-causing strains must be the aim, to detect outbreaks, identify pathogens and their mode of transmission, and seek control and prevention. Knowing the cause may not eliminate the disease, even when possible in principle, if (as with smoking) it hinges on human behavior. AIDS, for example, is preventable. Ongoing threats include new diseases from mutation or isolated animal reservoirs (Ebola, West Nile, severe acute respiratory syndrome [SARS]), resurgence of older strains, drug resistance, targeted release through bioterrorism, and rapid spread through global travel.

At a WHO conference held in Geneva in November 2004, experts issued an urgent appeal for greater international cooperation, and called on governments to make pandemic preparedness part of their national security planning. Of particular concern was the new bird influenza strain A(H5N1), which could mutate and cause a pandemic on the scale of the influenza epidemic of 1918 that killed more than 20 million people. It is estimated that a new pandemic virus could spread around the world in less than six months, infecting 30 percent of the population and killing about 1 percent of those infected. The drug industry would have to prepare billions of doses of the influenza vaccine within weeks of an outbreak to halt its course. There are questions of what could possibly be feasible technologically, the huge investment needed, and the driving force to motivate the effort when it cannot be a matter of fiscal gain.

In March 2005 the British medical journal Lancet published four articles reporting on the appalling state of global infant health care. Four million babies die each year in the first month of life, nearly all in low- and middle-income nations. The highest numbers occur in south-central Asian countries, while the highest rates are generally in Sub-Saharan Africa. It is estimated that three-quarters of these deaths could be prevented with low-cost interventions. A similar number of babies are stillborn and 500,000 mothers die from pregnancy-related causes each year. The moral implications of this public health tragedy are overwhelming.

The problems humanity faces at the start of the twenty-first century are inseparable from dominant worldviews and the interplay of powerful economic and political forces. Epidemiology provides health-related information as a guide to action. Its proper use is an essential component of the Ethics of Evidence, proposed for dealing with the uncertainties of medicine in the framework of contemporary culture (Miké 1999, 2003). The Ethics of Evidence calls for integrating the best evidence of all relevant fields to promote human well-being, anchored in an inescapable moral dimension. Looking to the future, it urges all to be aware, to be informed, and to be responsible.




Epidemiology is the study of the causes and distribution of illness and injury. It constitutes the scientific underpinning of public health practice. According to noted British epidemiologist Sir Richard Doll (1912–2005), “Epidemiology is the simplest and most direct method of studying the causes of disease in humans, and many major contributions have been made by studies that have demanded nothing more than an ability to count, to think logically and to have an imaginative idea.” In practice, epidemiology is applied in the three main areas of public health: safety and injuries, chronic disease, and infectious disease. This article will emphasize examples and applications in infectious disease epidemiology.

History and Scientific Foundations

The first physician known to consider the fundamental concepts of disease causation was the ancient Greek Hippocrates (c. 460–c. 377 BC), who wrote that medical thinkers should consider the climate and seasons, the air, the water that people use, the soil, and people's eating, drinking, and exercise habits in a region. Subsequently and until recent times, these causes of diseases were often considered, but not quantitatively measured. In 1662, John Graunt (1620–1674), a London haberdasher, published an analysis of the weekly reports of births and deaths in London, the first statistical description of population disease patterns. Among his findings, he noted a higher death rate for men than women, a high infant mortality rate, and seasonal variations in mortality. Graunt's study, with its meticulous counting and disease pattern description, laid the foundation for modern public health practice.

Graunt's data collection and analytical methodology were furthered by the physician William Farr, who assumed responsibility for medical statistics for England and Wales in 1839 and set up a system for the routine collection of the numbers and causes of deaths. In analyzing statistical relationships between disease and such circumstances as marital status, occupations such as mining and working with earthenware, elevation above sea level, and imprisonment, he addressed many of the basic methodological issues that contemporary epidemiologists deal with. These issues include defining populations at risk for disease and the relative disease risk between population groups, and considering whether associations between disease and the factors mentioned above might be caused by other factors, such as age, length of exposure to a condition, or overall health.

A generation later public health research came into its own as a practical tool when another British physician, John Snow (1813–1858), tested the hypothesis that a cholera epidemic in London was being transmitted by contaminated water. By examining death rates from cholera, he realized that they were significantly higher in areas supplied with water by the Lambeth and the Southwark and Vauxhall companies, which drew their water from a part of the Thames River that was grossly polluted with sewage. When the Lambeth Company changed the location of its water source to another part of the river that was relatively less polluted, rates of cholera in the areas served by that company declined, while no change occurred among the areas served by the Southwark and Vauxhall. Areas of London served by both companies experienced a cholera death rate that was intermediate between the death rates in the areas supplied by just one of the companies. The geographic pattern of infections was carefully recorded and plotted on a map of London. In recognizing the grand but simple natural experiment posed by the change in the Lambeth Company water source, Snow was able to make a uniquely valuable contribution to epidemiology and public health practice.

After Snow's seminal work, investigations by epidemiologists have come to include many chronic diseases with complex and often still unknown causal agents, and the methods of epidemiology have become similarly complex. Today researchers use genetics, molecular biology, and microbiology as investigative tools, and the methods used to establish relative disease risk make use of the most advanced statistical techniques available. Yet, reliance on meticulous counting and categorizing of cases and the imperative to think logically and avoid the pitfalls in mathematical relationships in medical data remain at the heart of all of the research used to show elevated disease risk in population subgroups and to prove that medical treatments are safe and effective.

Basic Epidemiological Concepts and Terms

The most basic concepts in epidemiology are the measures used to discover whether a statistical association exists between various factors and disease. These measures include various kinds of rates, proportions, and ratios. Mortality (death) and morbidity (disease) rates are the raw material that researchers use in establishing disease causation. Morbidity rates are most usefully expressed in terms of disease incidence (the rate at which members of a population or research sample contract a disease) and prevalence (the proportion of the group that has a disease at a given point in time or over a given period).


INCIDENCE: The number of new cases of a disease or injury that occur in a population during a specified period of time.

MORBIDITY: The term “morbidity” comes from the Latin word morbus, which means sick. In medicine it refers not just to the state of being ill, but also to the severity of the illness. A serious disease is said to have a high morbidity.

MORTALITY: Mortality is the condition of being susceptible to death. The term “mortality” comes from the Latin word mors, which means “death.” Mortality can also refer to the rate of deaths caused by an illness or injury, e.g., “Rabies has a high mortality.”

NOTIFIABLE DISEASES: Diseases that the law requires must be reported to health officials when diagnosed, including active tuberculosis and several sexually transmitted diseases; also called reportable diseases.

PREVALENCE: The actual number of cases of disease (or injury) that exist in a population.

SURVEILLANCE: The systematic collection, analysis, evaluation, interpretation, and dissemination of data. In public health, it assists in the identification of health threats and the planning, implementation, and evaluation of responses to those threats.

The most important task in epidemiology is the assessment or measurement of disease risk. The population at risk is the group of people that could potentially contract a disease, which can range from the entire world population (e.g., at risk for the flu) to a small group of people within a remote and isolated community (e.g., at risk for contracting a particular, ecologically restricted parasite). The most basic measure of a population group's risk for a disease is relative risk: the ratio of the frequency of a disease (most often its incidence, sometimes its prevalence) in one group with particular biological, demographic, or behavioral characteristics to the frequency in another group with different characteristics.

The simplest measure of relative risk is the odds ratio, which is the ratio of the odds that a person in one group has a disease to the odds that a person in a second, comparator group has the disease. The odds of having a disease within a group are the ratio of the proportion of group members who have the disease to the proportion who do not. The comparator is a reference or control population (often the general population in a certain region or jurisdiction). For example, patients with chronic obstructive pulmonary disease (COPD), an inflammatory condition of the lungs associated with smoking and long exposure to air pollution, are at significantly greater risk of contracting community-acquired pneumonia (CAP) compared to a general population group matched on age and gender. Thus in a sample of subjects that includes both COPD patients and subjects who do not have COPD, epidemiologists expect that the odds ratio for the COPD patients contracting CAP would be significantly greater than 1.0.
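The distinction between a ratio of risks and a ratio of odds can be made concrete with a short sketch. The counts below are hypothetical and are not taken from any COPD study; odds are computed here as the number with the disease divided by the number without it.

```python
def risk(cases, total):
    """Proportion of the group that has the disease."""
    return cases / total


def odds(cases, total):
    """Number with the disease divided by the number without it."""
    return cases / (total - cases)


def relative_risk(cases_a, total_a, cases_b, total_b):
    return risk(cases_a, total_a) / risk(cases_b, total_b)


def odds_ratio(cases_a, total_a, cases_b, total_b):
    return odds(cases_a, total_a) / odds(cases_b, total_b)


# Hypothetical sample: 200 COPD patients (30 develop CAP)
# and 200 matched comparison subjects (10 develop CAP).
rr = relative_risk(30, 200, 10, 200)
or_ = odds_ratio(30, 200, 10, 200)
print(f"relative risk {rr:.2f}, odds ratio {or_:.2f}")
```

In this illustration the relative risk is 3.0 and the odds ratio is about 3.35. The two measures are close when the disease is uncommon in both groups and diverge as it becomes more frequent.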

The mortality rate is the ratio of the number of deaths in a population, either in total or disease-specific, to the total number of members of that population. It is usually expressed per 100,000 or per million population, so that the rate can be stated without small decimal fractions. Thus, in 1982, the number of deaths from all causes in the United States was 1,973,000 and the number of people was 231,534,000, yielding a death rate from all causes of 852.1 per 100,000 per year. That same year there were 1,807 deaths from tuberculosis, yielding a disease-specific mortality rate of 7.8 per million per year.
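The arithmetic behind these two rates can be checked with a few lines of code, using only the 1982 figures quoted above. The helper name `rate_per` is chosen here for illustration.

```python
def rate_per(events, population, per):
    """Events divided by population, scaled to a convenient base such as 100,000."""
    return per * events / population


US_POPULATION_1982 = 231_534_000

all_cause = rate_per(1_973_000, US_POPULATION_1982, per=100_000)
tuberculosis = rate_per(1_807, US_POPULATION_1982, per=1_000_000)

print(f"all causes:   {all_cause:.1f} per 100,000 per year")
print(f"tuberculosis: {tuberculosis:.1f} per million per year")
```

Choosing the base (100,000 or one million) changes only the scale of the number reported, not the underlying rate.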

Assessing disease frequency is more complex because of the factors of time and disease duration. For example, disease prevalence can be assessed at a point in time (point prevalence) or over a period of time, usually a year (period prevalence, annual prevalence). This is the prevalence that is usually measured in illness surveys that are reported to the public in the news. Researchers can also measure prevalence over an indefinite time period, as in the case of lifetime prevalence, which is the prevalence of a disease over the course of the entire lives of the people in the population under study up to the point in time when the researchers make the assessment. Researchers calculate this by determining for every person in the study sample whether or not he or she has ever had the disease, or by checking lifetime health records for everybody in the population for the occurrence of the disease, counting the occurrences, and then dividing by the number of people in the population.

The other basic measure of disease frequency is incidence, the number of new cases of a disease that occur in a given period of time. Incidence is a critical statistic in describing the course of a fast-moving epidemic, in which medical decision-makers must know how quickly a disease is spreading. The incidence rate is the key to public health planning because it enables officials to understand what the prevalence of a disease is likely to be in the future. Prevalence is mathematically related to the cumulative incidence of a disease over a period of time as well as the expected duration of a disease, which can be a week in the case of the flu or a lifetime in the case of juvenile onset diabetes. Therefore, incidence not only indicates the rate of new disease cases, but is the basis of the rate of change of disease prevalence.
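One commonly used form of the relationship between prevalence, incidence, and duration assumes a stable population in which incidence and average duration do not change over time. Under that assumption, the odds of prevalence equal the incidence rate multiplied by the mean duration. The sketch below applies this simplification to hypothetical figures; it is not a general model of epidemic dynamics.

```python
def steady_state_prevalence(incidence_per_person_year, mean_duration_years):
    """Prevalence P satisfying P / (1 - P) = incidence x duration.

    Assumes a stable population with constant incidence and duration.
    """
    product = incidence_per_person_year * mean_duration_years
    return product / (1 + product)


# Hypothetical diseases with the same incidence (1 new case per 100 person-years)
# but very different durations.
short_illness = steady_state_prevalence(0.01, 1 / 52)  # lasts about a week
lifelong_illness = steady_state_prevalence(0.01, 40)    # lasts about 40 years

print(f"one-week illness:  {short_illness:.4%} of the population at any time")
print(f"lifelong illness:  {lifelong_illness:.1%} of the population at any time")
```

With identical incidence, the long-lasting condition accumulates to a far higher prevalence, which is why both quantities are needed for planning.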

Epidemiologists use statistical analysis to discover associations between death and disease in populations and various factors—including environmental (e.g., pollution), demographic (age and gender), biological (e.g., body mass index or “BMI” and genetics), social (e.g., educational level), and behavioral (e.g., tobacco smoking, diet or type of medical treatment)—that could be implicated in causing disease.

Familiarity with basic concepts of probability and statistics is essential in understanding health care and epidemiological research. Statistical associations take into account the role of chance in contracting disease. Researchers compare disease rates for two or more population groups that vary in their environmental, genetic, pathogen exposure, or behavioral characteristics and observe whether a particular group characteristic is associated with a difference in rates that is unlikely to have occurred by chance alone.

Applications and Research

Applications in Public Health Practice

Certain concepts are basic to infectious disease epidemiology. These include the infectious agent, which is the organism that can develop within a human host and be passed along to other people via a particular mode of transmission, for example by air, food, or sexual intercourse. Infectious diseases have a geographic scope or occurrence, and take a certain length of time, called the incubation period, to produce disease symptoms. After this incubation period, there is a period during which the individual can pass the infection along to others, called the period of communicability of the disease. The infectivity of a disease is the probability that an infected individual can pass the infection to an uninfected person, and the virulence of an infectious agent is the relative power and pathogenicity possessed by the organism. Populations of animals or human groups that harbor the infectious agent constitute a reservoir of the disease, and an organism such as a tick or insect that carries the infectious agent from such a reservoir to vulnerable individuals is called a vector.

Once an epidemic is underway, public health officials must begin attempts to control it even as they continue to gather epidemiological information about its cause and distribution. These control efforts consist of preventive measures for individuals and groups, which are designed to prevent further spread of the disease, and treatment, which is intended to minimize the period of communicability of the infection as well as reduce morbidity and mortality. Control of patient contacts and the immediate environment are foremost among such preventive measures, which can extend to patient isolation and observance of universal precautions, including handwashing, wearing of gloves and masks, and sterilization in dangerous instances. Epidemic measures, including the abrogation of certain civil rights as in quarantines, are sometimes necessary to contain a communicable disease that has spread within an area, state, or nation. The epidemic may have disaster implications if effective preventive actions are not initiated, and the scope of actions can be international, requiring the coordination of disparate public health capabilities across national boundaries.

Screening Programs

Screening a community using relatively simple diagnostic tests is one of the most powerful tools that healthcare professionals and public health authorities have in preventing or combating disease. Familiar examples of screening include HIV testing to help prevent AIDS, tuberculin testing to screen for tuberculosis, and hepatitis C testing by insurers to detect subclinical infection that could result in liver cirrhosis over the long term. In undertaking a screening program, authorities must always judge whether the benefits of preventing the illness in question outweigh the costs and the number of cases that have been mistakenly identified, called false positives.

The ability of the test to identify true positives (sensitivity) and true negatives (specificity) makes screening a valuable prevention tool. However, the usefulness of a screening test depends heavily on the disease prevalence in the population at risk. If the disease prevalence is very low, there are likely to be more false positives than true positives, which would cast doubt on the usefulness and the cost-effectiveness of the test. For example, if the prevalence of a disease in the population is only 2% and a test with a false positive rate of 4% (normally a good rate for a screening test) is given to everyone, then individuals falsely identified as having the disease would be about twice as frequent as individuals accurately identified with the disease. This would render the test results virtually useless. Public health officials deal with this situation by screening only population subgroups that have a high risk of contracting the disease. In infectious disease, screening tests are valuable for infections with a long latency period, which is the period of time during which an infected individual does not show disease symptoms, or which have a lengthy and ambiguous symptomatic period.
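The 2% prevalence and 4% false positive figures can be worked through explicitly. The text does not state the sensitivity of the test, so the sketch below assumes a perfectly sensitive test (every diseased person tests positive); the population size of 100,000 is likewise an arbitrary choice for illustration.

```python
def screening_outcomes(prevalence, sensitivity, false_positive_rate,
                       population=100_000):
    """Expected true positives, false positives, and positive predictive value."""
    diseased = prevalence * population
    healthy = population - diseased
    true_positives = sensitivity * diseased
    false_positives = false_positive_rate * healthy
    ppv = true_positives / (true_positives + false_positives)
    return true_positives, false_positives, ppv


# Prevalence and false positive rate are from the text; sensitivity is assumed.
tp, fp, ppv = screening_outcomes(prevalence=0.02, sensitivity=1.0,
                                 false_positive_rate=0.04)

print(f"true positives:  {tp:.0f}")
print(f"false positives: {fp:.0f}")
print(f"false positives per true positive: {fp / tp:.2f}")
print(f"share of positive results that are correct: {ppv:.1%}")

# The same test applied to a high-risk subgroup with 20% prevalence.
_, _, ppv_high_risk = screening_outcomes(0.20, 1.0, 0.04)
print(f"in a 20% prevalence subgroup: {ppv_high_risk:.1%}")
```

Under these assumptions roughly two of every three positive results are false in the general population, whereas in the higher-prevalence subgroup most positive results are correct, which is the rationale for targeted screening.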

Clinical Trials

Clinical trials are the experimental branch of epidemiology in which scientific sampling with randomized selection of research subjects is combined with prospective study design and experimental controls involving a placebo or comparator active treatment control group. The statistical analysis used in clinical trials is similar to what is used in other types of epidemiological studies, usually simple counting of cases that improve or deteriorate and comparisons of morbidity and mortality rates between the trial treatment groups.

Clinical trials in infectious disease are most common when a significant follow-up period is available. One such trial was a rigorous test of the effectiveness of condoms in HIV/AIDS prevention. This experiment was reported in 1994 in the New England Journal of Medicine. Although in the United States and Western Europe the transmission of AIDS has been largely within certain high-risk groups, including drug users and homosexual males, worldwide the predominant mode of HIV transmission is heterosexual intercourse. The effectiveness of condoms to prevent HIV transmission is generally acknowledged, but even after more than 25 years of the growth of the epidemic, many people remain ignorant of the scientific support for their preventive value.

A group of European scientists conducted a prospective study of HIV-negative subjects who had no risk factor for AIDS other than a stable heterosexual relationship with an HIV-infected partner. A sample of 304 HIV-negative subjects (196 women and 108 men) was followed for an average of 20 months. During the trial 130 couples (42.8%) ended sexual relations, usually due to the illness or death of the HIV-infected partner. Of the 256 couples that continued having exclusive sexual relationships during follow-up, 124 couples (48.4%) consistently used condoms. None of the seronegative partners among these couples became infected with HIV. On the other hand, among the 121 couples that inconsistently used condoms, the seroconversion rate was 4.8 per 100 person-years.

Because none of the seronegative partners among the consistent condom-using couples became infected, this trial presents extremely powerful evidence of the effectiveness of condom use in preventing AIDS. On the other hand, there appear to be several main reasons why some of the couples did not use condoms consistently. Therefore, the main issue in the journal article shifts from the question of whether or not condoms prevent HIV infection (they clearly do) to the issue of why so many couples do not use condoms in view of the obvious risk. Couples in which the seropositive partner had been infected through drug use were much less likely to use condoms than couples in which the seropositive partner had been infected through sexual relations. Couples with more seriously ill partners at the beginning of the study were significantly more likely to use condoms consistently. Finally, the length of time the couple had been together before the start of the trial was positively associated with condom use.

Impacts and Issues

The control of infectious disease is an urgent mission for epidemiologists employed in various state and federal public health agencies and their partners in private industry and research foundations. The American Public Health Association (APHA) provides guidance for the epidemiology and control of more than 100 communicable diseases that confront public health practitioners at present.

Infectious disease epidemiology requires accurate and timely incidence and prevalence data, such as those provided by comprehensive surveillance of usual and emerging diseases. Although the development of an organized surveillance system is critical to the provision of these data, the system's effectiveness depends on the willingness and ability of health care providers to detect, diagnose, and report the cases that the system is supposed to track. A reporting system functions at four levels: 1) the basic data are collected in the local community where the disease occurs; 2) the data are assembled at the district, state, or provincial level; 3) information is aggregated under national auspices (in the United States, by the Centers for Disease Control and Prevention, or CDC); and 4) for certain prescribed diseases, the national health authority reports the disease information to the World Health Organization (WHO).

The reporting of cases at the local level is mandated for notifiable illnesses that come to the attention of health care providers. Case reports provide patient information, suspected organisms, and dates of onset with the basis for diagnosis, consistent with patient privacy rights. Collective case reports are compiled at the district level by diagnosis, stipulating the number of cases occurring within a prescribed time. Any unusual or group expression of illness that may be of public concern should be reported as an epidemic, whether or not the illness is included in the list of notifiable diseases and whether it is a well-known identified disease or an unknown clinical entity.
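The district-level compilation described above can be sketched as a simple tally of local case reports by diagnosis within a prescribed period. This is an illustration only: the record fields, the function name, and the example records are assumptions made for the sketch, not the format of any actual surveillance system, and patient identifiers are deliberately omitted.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class CaseReport:
    """One local case report; field names are illustrative assumptions."""
    district: str
    diagnosis: str
    onset: date


def collective_report(reports, district, start, end):
    """Count cases per diagnosis for one district with onset in [start, end]."""
    return Counter(
        r.diagnosis
        for r in reports
        if r.district == district and start <= r.onset <= end
    )


# Made-up example records, not real surveillance data.
reports = [
    CaseReport("District A", "tuberculosis", date(2024, 1, 3)),
    CaseReport("District A", "tuberculosis", date(2024, 1, 5)),
    CaseReport("District A", "gonorrhea", date(2024, 1, 6)),
    CaseReport("District B", "malaria", date(2024, 1, 4)),
    CaseReport("District A", "malaria", date(2024, 2, 1)),  # outside the period
]

weekly = collective_report(
    reports, "District A", date(2024, 1, 1), date(2024, 1, 7)
)
print(dict(weekly))
```

The same tally could be repeated at the state or national level by grouping on a broader geographic field, which mirrors the way each level of the reporting system aggregates the data passed up from the level below.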

Because of the emergence or re-emergence of HIV/AIDS and of resistant strains of tuberculosis, malaria, gonorrhea, and E. coli, among others, infectious disease epidemiology, once thought to be waning in importance because of significant advances in public sanitation and immunization programs, has re-emerged as an urgent challenge. Infectious diseases currently threaten to destroy social order in some developing nations and pose extremely difficult public health problems even in the wealthiest societies. Hantavirus infections, once thought to be a serious problem primarily in Asia, have emerged as an epidemic in the southwestern United States. Lyme disease continues to afflict ever larger populations in the northeastern United States; Ebola virus has jumped from monkeys to humans in Africa; and pneumococci are becoming resistant to the antibiotics used to treat infections.

Air travel allows travelers to return home from areas where particular pathogens are endemic well within the incubation period of nearly any infectious disease, which can potentially precipitate an epidemic.

Primary Source Connection

John Snow (1813–1858) was an English physician who made great advances in the understanding of both anesthetics and the spread of disease, especially cholera.

The first cholera pandemic to reach Great Britain arrived in 1831 and caused as much fear and panic as tuberculosis did in the early twentieth century and HIV/AIDS does today. The death rate from cholera was over 50 percent, and medical opinion was sharply divided as to the cause. At the time, John Snow was a doctor's apprentice gaining his first experience with the disease, noting its symptoms of diarrhea and extreme dehydration.

The germ theory of disease, which holds that viruses and bacteria are the causative infectious agents of diseases such as yellow fever, smallpox, typhoid, cholera, and others, was in its infancy at this time. Some doctors accepted the hypothesis of contagion, in which disease spreads from one person to another. Others assumed that “miasmata,” or toxins in the air, spread disease.

Snow first began a serious scientific investigation of cholera transmission during the 1848 London epidemic. In his classic essay, On the Mode of Communication of Cholera, published on August 29, 1849, he postulated that polluted water was a source of cholera—especially water contaminated by the waste of an infected person, a not-uncommon occurrence at the time. When an outbreak erupted a few years later in central London at the end of August 1854, close to where Snow himself lived, he resumed his research.

The historical claim that Snow himself removed the handle of the Broad Street pump, which would, of course, have stopped exposure to the contaminated water, is supported by little evidence and may be a myth. Snow recommended its removal, but the actual removal was probably done by the local curate, Henry Whitehead, several days after the outbreak began.

It is partly thanks to John Snow's work in the Broad Street area that Britain suffered fewer major outbreaks of cholera after this time. An influential figure in medical circles, he was elected president of the Medical Society of London in 1855. Fortunately for British public health, his theory of cholera transmission (from person to person via contaminated water) took hold, and the “environmental” theory eventually died away. Although the actual causative agent, the bacterium Vibrio cholerae, would not be identified until 1883, Snow's preventive methods worked. Indeed, they are still effective today: despite the advent of vaccination and antibiotics, handwashing and the avoidance of contaminated food and water remain fundamental ways of preventing infection.

Because Snow based his investigation on the idea of germ theory, which French microbiologist Louis Pasteur (1822–1895) would later prove, he used a scientific approach and epidemiological study of cholera victims to validate his hypothesis. As his case notes amply demonstrate, much of his research was driven by his patients’ visible suffering.

