Tufts OpenCourseware
Author: James N Hyde, M.A.,S.M.
  • Discuss guidelines for assessing causality
  • Contrast causation and association
  • Describe and contrast the types of observational studies

1. Introduction

“You can observe a lot by just watching.”
Yogi Berra
Former catcher N.Y. Yankees

At its heart, epidemiology is an observational science. The Holy Grail for the epidemiologist is the ability to study and identify causal relationships, whether between a drug or surgical procedure and a favorable clinical outcome, or between a factor such as antecedent viral illness and the incidence of a disease, e.g. multiple sclerosis. Where do all of the hypotheses that are tested in these studies come from? Do they just spontaneously emerge from the experience of public health scientists and clinicians, or do they result from some more organized process?

The answer is both. Ideas for studies may well come from clinical observations, e.g. recent trials of the efficacy of angiogenesis-inhibiting compounds such as angiostatin, or from observations among a large number of people, e.g. dietary fiber and colorectal cancer. In fact, they often derive from “natural experiments” in which chance has resulted in an “exposure” which has led to an outcome. Someone observes these two events and wants to investigate whether or not their relationship is causal. An example is the observation that women with low saturated fat intake have lower rates of breast cancer than women with high saturated fat intake. Because of ethical considerations and cost, mounting a clinical trial to unravel the role of saturated fat intake in breast cancer may not be an option. Hence there is a need to find mechanisms to take advantage of natural experiments in which breast cancer rates in women with high levels of saturated fat intake are compared to breast cancer rates in women with low saturated fat intake.

The term “observational study” refers to the fact that the investigator takes advantage of natural events and studies them, unlike intervention studies, such as a clinical trial, in which the investigator determines who gets the exposure and who does not.

Before moving on to discuss observational studies a word about “causation vs. association.”

1.1. Causation and Association:

Epidemiologists are very wary of using the word “cause.” The reason is not that we are afraid to make definitive statements, but rather that our training drills into us the difference between things that are “associated” with each other and things that are causally related to each other. This is a distinction that lay people, and especially people in the media, often grasp only dimly. For example, consider the association in the figure below showing the relationship between the per capita number of telephones and coronary heart disease (CHD) deaths for 15 countries in the world:

Per Capita Phones and CHD Deaths

Clearly, it’s crazy to assert that telephones cause heart disease. On the other hand, there is a true association depicted here; it’s simply not causal.

Epidemiologists identify three types of non-causal associations:

  1. Chance associations that can occur at random. (A large part of this course will be spent exploring these sorts of events).
  2. Artifactual associations that occur through some error or defect in the design or execution of a study.
  3. Indirect associations in which an exposure is associated with an outcome but through a third factor or variable.1 (What variable or variables might that be in the case of telephones and CHD?)
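An indirect association can be made concrete with a small simulation. The sketch below is only an illustration under stated assumptions: the data are simulated (not the 15-country data in the figure), the third factor is deliberately left unnamed, and the helper names `pearson` and `residuals` are hypothetical. Two variables are each driven by the same third factor but have no effect on each other; they are still clearly correlated until the third factor is accounted for.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def residuals(ys, zs):
    """What is left of y after a simple least-squares regression on z."""
    n = len(ys)
    mz, my = sum(zs) / n, sum(ys) / n
    slope = (sum((z - mz) * (y - my) for z, y in zip(zs, ys))
             / sum((z - mz) ** 2 for z in zs))
    return [(y - my) - slope * (z - mz) for y, z in zip(ys, zs)]

rng = random.Random(0)  # fixed seed so the sketch is repeatable
n = 5000

# Simulated third factor (hypothetical; stands in for whatever links the two).
third = [rng.gauss(0, 1) for _ in range(n)]
# Both variables depend on the third factor, but NOT on each other.
phones = [t + rng.gauss(0, 1) for t in third]
chd = [t + rng.gauss(0, 1) for t in third]

crude = pearson(phones, chd)
adjusted = pearson(residuals(phones, third), residuals(chd, third))
print(f"crude correlation:    {crude:.2f}")
print(f"adjusted correlation: {adjusted:.2f}")
```

The crude correlation is substantial, while the correlation that remains after removing the third factor's contribution is close to zero. Regression residuals are used here only as one simple way of "holding the third factor constant".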

Once you have ruled out each of these non-causal explanations for an apparent association, you are left with one additional possibility: the association is causal. “If you wiggle A then B wiggles too”. This is not proof of causation, but it is additional evidence of its existence. A set of guidelines for assessing causal associations in epidemiologic studies has evolved over the years, starting with the work of Bradford Hill. Many others have contributed to the list of criteria for assessing causal associations. (See Gordis pp. 193-195.)

  1. Temporality: What is the evidence that the exposure precedes the outcome?
  2. Strength: How strong is the association? The stronger the association, the more likely it is that the association is causal.2
  3. Dose-response relationship: Is there a relationship between the magnitude of the exposure and the amount of disease observed?
  4. Replicability: Have other studies demonstrated the same findings or relationship between the exposure and outcome?
  5. Biologic Plausibility: Is this relationship consistent and congruent with current biological knowledge?
  6. Alternative Explanations: Have alternative explanations for the observed results been explored?
  7. Cessation of Exposure: Does the outcome diminish if the exposure is removed?
  8. Specificity: Does the exposure have specific effects or generalized effects? (This is the weakest of the criteria.)
  9. Consistency with other knowledge: How consistent are the findings with other knowledge we currently have?

It is not required that all of these criteria be met. But epidemiologists will feel more confident about putative associations that at least meet the first five of these criteria.

1.1.1. Footnote

1 Note the use of terminology here. We use the word “exposure” to refer to anything whose relationship with an “outcome” is being explored. An exposure can be a drug, a surgical procedure or a cellphone, while an outcome is the factor of interest. An outcome can be a disease, death or emotional well-being, however we define it.

2 You will learn later that the strength of an association is often measured by the magnitude of the relative risk or odds ratio.

2. Types of Observational Studies:

There are two general types of observational studies: descriptive studies and analytic studies. Descriptive epidemiological studies and techniques provide a critical step in unraveling etiology because, as the term suggests, they attempt to describe patterns of disease and antecedent factors as they occur in “free-range” human populations. They often provide the first clues as to the relationship between exposures and outcomes. Examples include fluoride and dental caries, smoking and lung cancer, and Down syndrome and maternal age.

2.1. Descriptive Epidemiology

Descriptive epidemiology is concerned with understanding and describing the patterns of disease and illness in human populations in order to obtain clues to the possible etiology of disease. Every disease or health event you will encounter in medicine has a natural history, i.e. a signature pattern that occurs time and again in the absence of any outside intervention. This pattern often provides clues as to causes. The observation that sailors developed signs and symptoms of scurvy in the absence of citrus fruits led to the stockpiling of these items on British Navy ships even before the relationship was fully understood. Another example is the observation that suicides are directly related to latitude such that as one moves closer to the Arctic Circle, for example, suicide rates (and alcoholism rates) increase. In other words, simple observations of the distribution of these events in human populations led to these discoveries.

2.1.1. Descriptive observational studies fall into several types:

  • Case reports and case series. Example: I saw three patients this year who reported psychotic episodes immediately after watching a reality competition show on television.

Note: there is no comparison group; no detailed definition of either the exposure or the outcome; there is a suggestion that the exposure preceded the outcome but no proof. There is an implied relationship between the two events.

  • Cross-sectional studies: Example: A study of the severity of angina (heart disease-related chest pain) and exercise showed that there was an inverse relationship between the level of exercise and reported chest pain severity.

Note: It is not clear which is the exposure and which is the outcome. Does lack of exercise cause chest pain? Or does chest pain lead to lack of exercise?

  • Ecological studies: These are large-scale studies that provide population-wide estimates of disease rates vs. exposure status. For example, data are often presented comparing various countries around the world. An example is dietary fat intake by country and breast cancer. (See Gordis p. 185-186.)

Note: While these studies can provide intriguing insights, we are looking at summary estimates for the whole population. We never actually know whether or not the individual women with breast cancer are the ones with high dietary fat intake.

While descriptive studies are important because they can lead to hypotheses that in turn can be investigated in analytic studies, they are not good for the purposes of providing definitive answers to causal questions. Too often one will see in the popular literature cross-sectional and ecological studies reported as if they were providing definitive answers to causal questions.

A final note. With cross-sectional studies there is an inherent weakness because there is an inability to ascertain the temporal relationship between the exposure and outcome.

2.2. Analytic Studies:

There are two general types of analytic observational studies: 1.) cohort studies, of which there are both prospective and retrospective types, and 2.) case control studies. The major differences in these studies relate to:

  1. How and when we identify our sample of subjects, and
  2. How and when we measure exposure and outcome.

2.2.1. Cohort Studies:

Cohort studies come in two types: prospective and retrospective. These terms describe the temporal relationship between the investigator and the outcomes of interest. (See the diagram below.) Studies in which the exposure and outcome have already occurred are termed retrospective, while studies in which the outcome has yet to occur are called prospective.

Types of Observational Studies

The term “cohort” is derived from the fact that, in these studies, one begins with the “exposure” by choosing a cohort of people with an exposure and comparing the number of new cases of disease in this group with the number of new cases in a cohort of non-exposed people.

Cohort Studies (prospective)

Often these studies follow a cohort over a relatively long period of time into the future, i.e. prospectively, in order to count the number of new cases of disease that occur. It is this often long follow-up period that can make prospective cohort studies quite expensive to conduct. Several Key Points:

  1. Implicit in the design is the notion that, in terms of the disease of interest, the subjects are “disease-free” on entering the study. (We want to make certain that as investigators we can warrant that the exposure occurred before the outcome.) In order to achieve this, investigators screen subjects at the beginning of the study to make certain that at the time of entry they are “healthy”.

  2. Assessing exposure. Once the cohort is assembled we need to assess the degree of exposure. The major options here are: 1) ask subjects to recall their exposure history; and 2) measure the subjects’ exposure directly.

    Questionnaires are often used to gather exposure information. The accuracy of this method of exposure assessment can be highly variable. For example, most people could accurately report on a questionnaire the number of times they have had a colonoscopy. On the other hand, if you use a questionnaire and ask people how many glasses of wine they drank in the last seven days the results are likely to be far less accurate for a variety of reasons. These include poor memory, social stigma associated with reporting too much consumption, and differences in the size of a “glass” from one household to the next. An additional problem is that there are some exposures that cannot be measured with questionnaires, e.g. radon gas and electromagnetic fields, simply because they are unknowable by the subject.

    Historical records are sometimes used in lieu of questionnaires, e.g. using actual telephone call records to measure the amount of cell phone use in a study of brain cancer among cell phone users.

    Direct measurement of exposure is often best. For example, in studies of caffeine exposure and headache, actual measurement of serum caffeine levels has been used in place of food and dietary histories.

  3. Comparability of exposure groups. At the heart of most difficulties in the conduct of epidemiologic studies is the question of comparability between groups being compared. If, in a study of coffee drinking and blood lipid levels, the groups being compared are different with respect to daily consumption of saturated fat, the results are likely to be invalid. In RCTs we try to take care of this problem by randomly assigning subjects to treatment (exposure) groups, thus creating groups of subjects that are comparable to one another with respect to all but the exposure (treatment).

    In prospective cohort studies we are stuck with what we have in terms of the exposed and non-exposed groups. We often try to place certain restriction criteria on subject recruitment in order to avoid a range of factors that may invalidate the results. For example, in the illustration above we might restrict participation in the study only to people who consume a minimal amount of fat each day, thus reducing the likelihood that the groups will differ widely with respect to this factor.

    The other approach, as you will learn later, is that we use statistical measures or procedures to make the groups statistically similar to one another. In any case, none of these procedures is quite as effective as randomization in assuring comparability between groups.

  4. Assessing outcomes. The tools for assessing health outcomes parallel those for measuring exposures. In general, medical outcomes have harder endpoints than exposures, e.g. decreased serum lipid levels, blood pressure, or survival. In the case of cohort studies, however, the investigator has the enormous benefit of not having to rely on historical records since the outcomes occur after the initiation of the study. Since in a prospective study we know that the outcome always follows the exposure, the problems with the temporal relationship between exposure and outcome that exist with cross-sectional studies do not arise. In this way the prospective study is very similar to the RCT.

  5. Because long follow-up periods are often involved with these studies, there is always the chance of losing subjects or having subjects “lost to follow-up”. RCTs also have this problem. If loss to follow-up is random (i.e., unrelated to either exposure or outcome), it will reduce the number of subjects and events available for analysis, and with it the precision of the estimate we obtain at the end of the study. If, on the other hand, loss to follow-up is related to exposure and outcome, it can invalidate our results.

    For example, suppose that in a study of colorectal cancer and alcohol consumption, subjects who were diagnosed with cancer were likely to drop out of the study and be lost to follow-up. If this occurred equally among drinkers and non-drinkers it would diminish the sample size of the study. If, on the other hand, drinkers were more likely to drop out, we might underestimate the risk of cancer as a result of drinking.

  6. Analytic methods. Most often in cohort studies the measure of interest is the ratio of the rate of new events of illness (e.g. heart attacks, headaches, prostate cancers) in the group with the exposure to the rate of similar events among those without the exposure. (This ratio, called the relative risk, will be discussed in greater detail in Lecture 3 - Descriptive Epidemiology and Descriptive Statistics).

    Prospective cohort studies share many similarities with RCTs, but the principal difference is that exposure in an RCT is assigned by the investigator while in the case of the cohort study, it is assigned by nature or by the subjects themselves, e.g. cigarette smoking. Cohort studies are a good choice for studying rare exposures, for example, people exposed to weightlessness, because the investigator can “assemble a cohort” of former astronauts and simply study their long-term health outcomes. The problem then becomes one of finding a non-exposed group that is comparable in order to assess differential outcomes. A major problem with prospective studies involves the costs of following groups over the long periods of time necessary to assess outcomes. This is especially true of diseases with long induction times, e.g., chronic diseases such as coronary heart disease or cancer. One problem is loss to follow-up, already discussed, while another is cost. If the diseases or outcomes being studied have long induction times then these costs can mount quickly. For example, it would not be unusual to spend $600 a year to follow a subject in a prospective study (costs of maintaining contact, collecting and storing information, several medical tests). Using these figures, a ten-year cohort study with 1000 subjects would cost a minimum of $6 million.
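The follow-up cost arithmetic above can be checked with a few lines of Python. This is a minimal sketch using only the figures quoted in the text; the function name `cohort_follow_up_cost` is hypothetical, and the calculation assumes every subject is followed for the full period with no other study costs.

```python
def cohort_follow_up_cost(subjects, years, cost_per_subject_year):
    """Minimum follow-up cost: every subject followed for every year."""
    return subjects * years * cost_per_subject_year

# Figures from the text: $600 per subject per year, 1000 subjects, ten years.
total = cohort_follow_up_cost(subjects=1000, years=10, cost_per_subject_year=600)
print(f"${total:,}")  # $6,000,000
```

Changing `years` shows how quickly the cost grows when the outcome has a long induction time.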

2.2.2. Retrospective Studies:

An alternative to the prospective approach is the retrospective approach. Here the investigator uses data and information on exposure and outcomes that have already occurred. Using these “historical” sources, the investigator seeks to explore the relationships between exposure and outcome.

Types of Observational Studies

Epidemiologists would almost always prefer to conduct a prospective study if given a chance, but it is not always possible to do so. Retrospective studies have certain advantages:

  1. Cost. They are almost always cheaper to conduct than prospective studies since follow up is not required.
  2. Time. The absence of a need to follow-up means that they can be conducted quickly, often using data and information that have already been collected and stored in records.
  3. Rare outcomes. Prospective studies are not very efficient when studying diseases or health events that are rare, e.g. PKU, which occurs in about 1 in 30,000 newborns. It would be necessary to have a population of 3,000,000 newborns simply to find 90-100 cases of PKU. However, a group of newborns with PKU could be assembled relatively easily by pooling cases from multiple states.

Retrospective Cohort Studies:

Ideally we would like to replicate in a retrospective study all of the features that are part of a prospective cohort study. Sadly, we often have to make serious compromises. Historical exposures are difficult to measure. Retrospective cohort studies, as with prospective studies, begin with exposure and look at subsequent outcomes. (see diagram)

Cohort Studies (retrospective)

It is critical that the investigator determines that exposure precedes the outcome. This is not always easy when utilizing historical data and information. Complicating the issue is the need to use historical data and records since tissue measurements and other forms of direct measurement of exposure are not feasible.

Suppose we wanted to measure the long-term impact of DDT exposure on breast cancer. (DDT has been banned in the U.S. for many years.) As a consequence, it might be difficult to assemble an exposed cohort at this point in time. Even if you did assemble such a cohort, how can you be certain that the subsequent reported disease events did not precede the exposure? If we are using data from interviews and questionnaires, it is quite likely that subjects’ recall of these events can be distorted by temporal distance.

For example, a woman who was recently diagnosed with breast cancer and who has followed news stories about environmental estrogens may, quite naturally, tend to over-estimate her historical exposure as compared to women who are breast cancer free. Additionally, since DDT persists in the environment for long periods of time, how is it possible to find a non-exposed group? As a consequence of these factors, all manner of distortions and error can be introduced. (We will deal with these sources of error at much greater length in subsequent sessions.)

Similarly, in measuring outcomes, investigators often have to rely on medical records and the clinical and medical judgments of others in the past. In some instances, the key diagnostic tools and instruments that are commonly used today may not have been available in the past to validate diagnoses. This problem is best illustrated in the criminal justice system as we learn from cutting edge DNA technologies that many past judgments on guilt and innocence have been in error. The advantage of prospective investigations is that they allow the researcher to establish the criteria for outcomes and they do not require reliance on past medical judgments of others.

The principal axis of analysis with these studies is the same as with prospective cohort studies. In other words, the investigator looks to see if the amount of disease in the exposed group is greater than or less than the amount in the non-exposed group. This ratio, called the relative risk, will be examined in more detail in the next session.
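As a rough illustration of the comparison just described, the sketch below computes a relative risk from the amount of disease in each group. The counts are made up for illustration only, and the function name `relative_risk` is a hypothetical helper, not something defined in the course materials.

```python
def relative_risk(cases_exposed, n_exposed, cases_unexposed, n_unexposed):
    """Risk of disease in the exposed group divided by risk in the non-exposed group."""
    risk_exposed = cases_exposed / n_exposed
    risk_unexposed = cases_unexposed / n_unexposed
    return risk_exposed / risk_unexposed

# Hypothetical cohort: 30 new cases among 1,000 exposed subjects,
# 10 new cases among 1,000 non-exposed subjects.
rr = relative_risk(30, 1000, 10, 1000)
print(f"relative risk = {rr:.2f}")
```

A value above 1 means more disease among the exposed, a value of 1 means no difference between the groups, and a value below 1 means less disease among the exposed.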

2.2.3. Case Control Studies:

The key concept to grasp about case control studies is that they begin with the outcome and proceed to look back at the history of exposure. (See the figure below.) Case-control studies then have the unique advantage of guaranteeing a substantial number of subjects with the disease of interest for study. This is a substantial plus, especially when studying rare disease events. (Remember the example of PKU with an incidence of 1 per 30,000 live births.)
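The arithmetic behind the PKU example shows why starting from the outcome is so efficient for rare diseases. The sketch below uses only the incidence quoted in the text (1 per 30,000 live births); the function names are hypothetical, and the results are averages, so an actual birth cohort would yield somewhat more or fewer cases.

```python
BIRTHS_PER_CASE = 30_000  # incidence quoted in the text: 1 case per 30,000 live births

def expected_cases(births):
    """Average number of cases expected among a given number of live births."""
    return births / BIRTHS_PER_CASE

def births_needed(target_cases):
    """Number of live births needed to expect a given number of cases."""
    return target_cases * BIRTHS_PER_CASE

print(expected_cases(3_000_000))   # 100.0
print(f"{births_needed(100):,}")   # 3,000,000
```

A cohort design would have to follow millions of births to accumulate about 100 cases, whereas a case-control design starts with the cases already in hand.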

Case Control Studies (retrospective)

Similarly, it is important to note that the controls in a case-control study are not like the “controls” in a cohort study, i.e., a non-exposed group. The controls in this instance are subjects without disease, NOT the group without the exposure.3

Footnote

3 The control group in case control studies is sometimes referred to as the “case-referent group” and, as a consequence, these studies are referred to as Case-Referent Studies. However, the term case-control is far more common. As a consequence, we will use the term case-control study throughout this course.

2.2.4. Several Key Points:

  1. Selection of cases. Since, by definition, cases have the outcome of interest, the selection of cases in case-control studies is comparable to assessing the outcome in prospective studies. The investigator needs a stringent case definition, just as one is needed in cohort studies, to ensure that the real outcome of interest is being studied and not some phantom outcome. The difference is that the investigator begins with the outcome.

    Cases are selected either at the time of diagnosis or hospitalization. Some of the mechanisms that are used include case registries, hospitalized patients, specialty clinics or even death certificate records. The use of deceased subjects always presents problems since it rules out personal interviews to collect information on exposure history. This then forces the investigator to rely on next of kin, friends or acquaintances, which can be problematic, especially when the information sought has associated social or other stigma, e.g. substance abuse, sexual behavior.

  2. Comparability and selection of controls. This can be the most problematic part of conducting these studies. Controls should “look” just like cases in every respect except that they do not have the outcome or disease of interest.

    Two general approaches are used. The first is to identify controls from the same hospital or clinic as the cases. We need in this case to identify controls with conditions that are unrelated to the outcomes under study. A second approach involves selecting controls from the community. Again they should represent the population that would have been cases had they developed the disease. For example, it would be important in the case of community controls to choose subjects from areas served by the same hospital as the cases.

    In addition, a procedure called matching, that will be discussed in a later lecture, is often used in selecting controls to ensure that the two groups are comparable with respect to certain demographic factors such as age, sex, education, race, etc. Failing this, statistical procedures are sometimes employed to make the groups more comparable. In essence, the investigator wants to be assured that in making comparisons between cases and controls s/he is comparing apples with apples.

  3. Assessment of exposure. Since exposure assessment is handled in the same way as in retrospective cohort studies, the issues are the same. Problems of reliance on record keeping, the veracity of the human memory and recall, and the need to capture events in the past can plague the conduct of these studies.

  4. Analytic methods. In case-control studies the analytic approach involves assessing the likelihood that someone with the disease (outcome) of interest will report a particular exposure. Since we start with disease and choose a group of control subjects to match these cases, we cannot meaningfully measure the incidence of new cases of disease as we can in cohort studies. The measure that is used is called the Odds Ratio, about which you will learn more next session.
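To preview the odds ratio mentioned in the last point, the sketch below compares the odds of exposure among cases with the odds of exposure among controls. The counts are invented for illustration, and the function name `odds_ratio` is a hypothetical helper. The cross-product form used here is algebraically the same as dividing the two odds.

```python
def odds_ratio(cases_exposed, cases_unexposed, controls_exposed, controls_unexposed):
    """Odds of exposure among cases divided by odds of exposure among controls."""
    # (a / b) / (c / d) rewritten as the cross-product (a * d) / (b * c).
    return (cases_exposed * controls_unexposed) / (cases_unexposed * controls_exposed)

# Hypothetical study: 100 cases (40 report the exposure, 60 do not)
# and 100 controls (25 report the exposure, 75 do not).
or_estimate = odds_ratio(40, 60, 25, 75)
print(f"odds ratio = {or_estimate:.2f}")
```

An odds ratio of 1 would mean that cases and controls report the exposure equally often. Note that nothing in this calculation uses the incidence of disease, which a case-control design cannot supply.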

NOTE: For a summary of strengths and weaknesses of cohort and case control studies see Table 12-1, P. 181 in Gordis.

3. Summary

Observational studies can be powerful tools for better understanding and unraveling causal relationships between exposures and outcomes. Often observational studies provide the only way to study a particular problem because of ethical, cost or time considerations. Because we must measure both exposure and outcomes, the likelihood of introducing errors increases dramatically. Further, the need to use techniques that are less effective than randomization to assure comparability between the groups leads to the potential of further compromising the validity of study findings.

However, sophisticated approaches to the design, conduct and analysis of data and information from these studies can be employed to help minimize these shortcomings. As a consequence, observational studies play an important role in both public health and clinical epidemiology.

4. Ancillary Material

4.1. Readings

4.1.1. Required

  • Read Chapter 8, Cohort Studies, Gordis Text
  • Read Chapter 9, Case-Control and Cross-Sectional Studies, Gordis Text
  • Read Chapter 12, A Pause for Review: Comparing Cohort and Case-Control Studies, Gordis Text
  • Read Chapter 13, Evidence for Causal Relationship, pages 192-201, Gordis Text