In epidemiological investigations, it is crucial to select the appropriate study design in order to gather accurate information. Each type of epidemiological study has its strengths and weaknesses. Selecting the proper study design reduces sources of bias and confounding.
Epidemiological studies can be classified as either observational or experimental (Bonita et al., 2006).
When an epidemiological investigator chooses an observational study, he or she allows nature to take its course: the investigator measures but does not intervene (Bonita et al., 2006).
Observational studies are of two types: descriptive and analytical.
A descriptive study describes the occurrence of a disease in a specific population, and it is usually the first step in an epidemiological investigation.
An analytical study analyzes the relationships between health status and other variables. Most epidemiological studies are analytical. Investigators do not use purely descriptive studies widely, but descriptive data in reports of health statistics are a useful source of ideas for further studies. Descriptive information in which patients with a specific disease who share some characteristics are described, but are not compared with a reference population, often prompts investigators to conduct a more detailed epidemiological study (Bonita et al., 2006). An example is a descriptive study of the epidemiological, demographic, and clinical characteristics of 47 cases of Middle East respiratory syndrome coronavirus disease from Saudi Arabia (Asiri et al., 2013).
Experimental studies (Intervention studies):
Experimental or intervention studies involve an active attempt to change a disease determinant (such as an exposure or a behavior) or the progress of a disease through treatment, and they are similar in design to experiments in other sciences. However, they are subject to additional constraints, since the health of the people in the study group may be at stake. The major experimental study designs include randomized controlled trials, which use patients as subjects (also called clinical trials); field trials, in which the subjects are healthy people; and community trials, in which whole communities are the participants (Bonita et al., 2006).
In all epidemiological studies, it is crucial to establish a clear definition of a case of the disease under investigation by specifying the symptoms, signs, or other characteristics indicating that a person has the condition. A clear definition of an exposed person is also essential; it must contain all the characteristics that identify a person as being exposed to the factor. If an epidemiological study lacks clear definitions of disease and exposure, it is difficult to analyze and interpret the data obtained from it (Bonita et al., 2006).
1- Descriptive studies:
A description of the health status of the community is the first step in any epidemiological investigation. In many countries, a national center for health statistics undertakes descriptive studies. In Saudi Arabia, the Statistics and Information General Department at the Ministry of Health is responsible for compiling epidemiological health data and highlighting the features of the health situation. Pure descriptive studies do not attempt to analyze the relationship between exposure and effect. They usually present mortality statistics and examine patterns of death by age, gender, or ethnicity during specified time periods or in various states. An example is the neonatal mortality rate per 1,000 live births from 2006 to 2016 (MOH, 2016).
Such data can be of great assistance in identifying the factors behind a clear decreasing trend (Bonita et al., 2006). From descriptive studies, epidemiologists develop hypotheses about the causes of these patterns and about the factors that increase the risk of disease (CDC, 2012).
2- Ecological studies:
Ecological (or correlational) studies are used to generate hypotheses. An ecological study takes groups of people, rather than individuals, as the units of analysis. For example, a relationship was found between the spread of MERS-CoV and primary contact with camels (Reeves et al., 2015). Researchers need to test such an observation by controlling for all potential confounders, to exclude the possibility that other characteristics have contributed to the relationship (Bonita et al., 2006).
Another way to conduct ecological studies is to compare populations living in different places at the same time or over a specific period. An example is the use of ecological data in the World Health Chart, available via the Global InfoBase. Such comparisons help the epidemiologist to extract up-to-date health determinants for specific countries and compare them in further analysis (Bonita et al., 2006).
Time series can reduce some of the socioeconomic confounding that is a potential problem in ecological studies. If the time period in a time series is very short, as it is in daily time series studies, confounding is virtually zero because the people in the study serve as their own controls (Bonita et al., 2006).
Although simple to conduct and therefore attractive to epidemiologists, ecological studies are often difficult to interpret because it is rarely possible to examine the various potential explanations for findings directly. Ecological studies usually depend on data collected for other purposes, so data on different exposures and socioeconomic factors may not be available. Also, since the unit of analysis is a group, the link between exposure and effect at the individual level cannot be established. One advantage of ecological studies is that data can be used from populations with widely differing characteristics or extracted from different data sources. The previous chart illustrates this. The rise in deaths during the heat wave that struck France in 2003 correlated well with the rise in temperature, although increasing daily air pollution also played a role. This increase in deaths occurred mainly among elderly people, and the hospitals recorded the cause of death as heart disease, lung disease, or other diseases (Bonita et al., 2006).
Ecological fallacy or bias:
An ecological fallacy (or bias) occurs when a researcher draws inappropriate conclusions from ecological data. The bias arises because an association observed between variables at the group level does not necessarily represent the association between the same variables at the individual level. An example is the lack of a relationship between maternal deaths and the absence of skilled birth attendants in the four regions shown in the following figure. Many other factors affect the outcome of delivery (World Health Report, 2005). Nevertheless, such ecological inferences can provide a good starting point for more detailed epidemiological studies (Bonita et al., 2006).
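The gap between group-level and individual-level associations can be illustrated with a small numerical sketch. The data below are entirely hypothetical (three made-up regions with three individuals each) and are not taken from any of the cited studies; the `pearson` helper and all variable names are assumptions introduced for illustration only. The sketch shows a case where the correlation between group averages is strongly positive while the correlation among individuals within every group is negative.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical (exposure, outcome) pairs for individuals in three regions.
groups = {
    "region_a": [(1, 5), (2, 4), (3, 3)],
    "region_b": [(4, 8), (5, 7), (6, 6)],
    "region_c": [(7, 11), (8, 10), (9, 9)],
}

# Ecological analysis: one averaged data point per region.
mean_exposure = [sum(x for x, _ in people) / len(people) for people in groups.values()]
mean_outcome = [sum(y for _, y in people) / len(people) for people in groups.values()]
ecological_r = pearson(mean_exposure, mean_outcome)

# Individual-level analysis: correlation among the people inside each region.
within_r = {
    name: pearson([x for x, _ in people], [y for _, y in people])
    for name, people in groups.items()
}

print(f"Correlation between region averages: {ecological_r:+.2f}")
for name, r in within_r.items():
    print(f"Correlation within {name}: {r:+.2f}")
```

In this constructed example, a reader who saw only the regional averages would infer that higher exposure goes with a higher outcome, although the opposite holds for individuals within each region.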
3- Cross-sectional studies:
Cross-sectional studies measure the prevalence of disease and are therefore often called prevalence studies. In a cross-sectional study, the epidemiologist measures exposure and effect at the same time. It is not easy to assess the reasons for associations shown in cross-sectional studies. The critical question is whether the exposure precedes or follows the effect, that is, which comes first. If the exposure data are known to represent exposure before the occurrence of any effect, the data from a cross-sectional study can be analyzed like data from a cohort study (Bonita et al., 2006).
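The prevalence measurement described above can be sketched in a few lines. The survey counts below are hypothetical and chosen only to make the arithmetic easy to follow; the function name `prevalence` and the dictionary layout are assumptions, not part of any cited method. The sketch computes overall prevalence, prevalence in the exposed and unexposed groups, and their ratio.

```python
def prevalence(with_condition, sample_size):
    """Proportion of a sample that has the condition at the time of the survey."""
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    return with_condition / sample_size


# Hypothetical cross-sectional survey counts (illustrative only).
survey = {
    "exposed": {"with_disease": 30, "surveyed": 200},
    "unexposed": {"with_disease": 20, "surveyed": 400},
}

prev_exposed = prevalence(
    survey["exposed"]["with_disease"], survey["exposed"]["surveyed"]
)
prev_unexposed = prevalence(
    survey["unexposed"]["with_disease"], survey["unexposed"]["surveyed"]
)

total_with_disease = sum(group["with_disease"] for group in survey.values())
total_surveyed = sum(group["surveyed"] for group in survey.values())
overall_prevalence = prevalence(total_with_disease, total_surveyed)

# Ratio of the two prevalences; it describes an association at one point in
# time and says nothing about whether exposure came before the disease.
prevalence_ratio = prev_exposed / prev_unexposed

print(f"Overall prevalence: {overall_prevalence:.1%}")
print(f"Prevalence among exposed: {prev_exposed:.1%}")
print(f"Prevalence among unexposed: {prev_unexposed:.1%}")
print(f"Prevalence ratio: {prevalence_ratio:.2f}")
```

Because exposure and disease are recorded at the same moment, a ratio like this one cannot by itself settle which came first.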
Cross-sectional studies have several strengths: they are relatively easy and inexpensive to conduct, and they are useful for investigating exposures that are fixed characteristics of individuals, such as ethnicity and gender. In outbreaks of disease, a cross-sectional study that measures several exposures can be the most convenient first step in investigating the cause. Data from cross-sectional studies are a helpful resource for assessing the health care needs of populations. Data from repeated cross-sectional surveys that use independent random samples with standardized definitions and survey methods can indicate trends (Tolonen et al., 2004; Bonita et al., 2003). Each survey should have a clear purpose. Valid surveys need well-designed questionnaires, an appropriate sample of sufficient size, and an acceptable response rate. Many countries regularly conduct cross-sectional surveys of representative samples of their populations, focusing on personal and demographic characteristics, illnesses, and health-related behaviors and activities. The frequency of disease and risk factors can then be examined in relation to age, sex, and ethnicity. Cross-sectional studies of risk factors for chronic diseases have been conducted repeatedly in many countries. The World Health Organization makes data on chronic disease risk factors and the latest chronic disease incidence rates for each country publicly available on its website (Bonita et al., 2006).
4- Case-control studies:
Case-control studies provide a relatively simple way to investigate the causes of diseases, and researchers prefer this design for studying rare diseases. They compare people with a disease (or other outcome variable) of interest with a suitable control (comparison or reference) group of people unaffected by the disease or outcome variable. The investigators collect data on disease occurrence at one point in time and on exposures at a previous point in time (Bonita et al., 2006).
Case-control studies are longitudinal, in contrast to cross-sectional studies (see Figure 3.5). Some researchers call case-control studies retrospective studies, since the investigator looks backward from the disease to a possible cause. This can be confusing, because the terms retrospective and prospective are also used to describe the timing of data collection in relation to the current date. In this sense, a case-control study can be retrospective, when all the data deal with the past, or prospective, when data collection continues with the passage of time (Bonita et al., 2006).
In a case-control study, the epidemiologist begins by selecting the cases, which should represent all the cases in a specified population group. Cases are selected on the basis of disease, not exposure. Controls are people without the disease within the community. Finding a cost-effective method to identify and enroll control subjects can be a challenging aspect of population-based case-control studies (Bernstein, 2006). The most difficult task is to select controls so as to sample the exposure prevalence in the population that produced the cases (Bonita et al., 2006).
Moreover, the selection of controls and cases must not be influenced by exposure status, which should be determined in the same way for both. It is not essential for cases and controls to be all-inclusive; in practice, they can be restricted to any specified subgroup, such as males or females. The controls should represent people who would have been designated study cases had they developed the disease. To avoid the difficulty of separating factors related to causation from those related to survival, case-control studies tend to use new (incident) cases. Nevertheless, case-control studies have often been conducted using prevalence data (for example, case-control studies of congenital malformations). Case-control studies can estimate the relative risk of disease, but they cannot determine the absolute incidence of disease (Bonita et al., 2006).
An essential aspect of case-control studies is determining the start and duration of exposure for both cases and controls. The exposure status of the cases is usually determined after the disease has developed (retrospective data), and the data are usually collected by direct questioning of the affected person or of a relative or friend (Bonita et al., 2006).
The informant’s answers may be influenced by knowledge of the hypothesis under investigation or by the experience of the disease itself. For example, researchers in Papua New Guinea compared the history of meat consumption in people who developed enteritis necroticans with that of people who did not have the disease. People who had the disease reported prior meat consumption more often (50 of 61 cases) than people who were not affected (16 of 57) (Millar et al., 1985). Exposure is sometimes determined by biochemical measurements (e.g., lead in blood or cadmium in urine), which may not accurately reflect the relevant past exposure. For example, lead in blood at age six years is not a good indicator of exposure at age one to two years, which is the age of greatest sensitivity to lead. This problem can be avoided if exposure can be estimated from a recording system (e.g., stored results of routine blood tests) or if the case-control study is conducted prospectively, so that exposure data are collected before the disease develops (Bonita et al., 2006).
The odds ratio is the ratio of the odds of exposure among the cases to the odds of exposure among the controls. For the meat consumption data above, the odds ratio is (50 × 41) / (11 × 16) = 11.6, which shows that the cases were 11.6 times more likely than the controls to have recently eaten meat. If a disease is rare, the odds ratio is very similar to the risk ratio. For the odds ratio to be a good approximation, the cases and controls must be representative of the general population with respect to exposure. However, because the incidence of the disease is unknown, the absolute risk cannot be calculated. An odds ratio should be accompanied by the confidence interval around the point estimate (Bonita et al., 2006).
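The odds ratio calculation can be reproduced as a short sketch using the counts quoted from Millar et al. (1985): 50 of 61 cases and 16 of 57 controls reported meat consumption. The text does not say how the confidence interval should be obtained, so the log-based (Woolf) approximation used below is an assumption chosen for simplicity, and the function name `odds_ratio_with_ci` is hypothetical.

```python
from math import exp, log, sqrt


def odds_ratio_with_ci(exposed_cases, unexposed_cases,
                       exposed_controls, unexposed_controls, z=1.96):
    """Odds ratio with an approximate confidence interval.

    The interval uses the log (Woolf) method, an assumption made for this
    sketch; z = 1.96 corresponds to a 95% interval.
    """
    odds_ratio = (exposed_cases * unexposed_controls) / (
        unexposed_cases * exposed_controls
    )
    se_log_or = sqrt(
        1 / exposed_cases
        + 1 / unexposed_cases
        + 1 / exposed_controls
        + 1 / unexposed_controls
    )
    lower = exp(log(odds_ratio) - z * se_log_or)
    upper = exp(log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Counts as quoted in the text from Millar et al. (1985).
cases_total, cases_exposed = 61, 50
controls_total, controls_exposed = 57, 16

odds_ratio, ci_lower, ci_upper = odds_ratio_with_ci(
    exposed_cases=cases_exposed,
    unexposed_cases=cases_total - cases_exposed,           # 11
    exposed_controls=controls_exposed,
    unexposed_controls=controls_total - controls_exposed,  # 41
)

print(f"Odds ratio: {odds_ratio:.1f}")
print(f"Approximate 95% CI: {ci_lower:.1f} to {ci_upper:.1f}")
```

The point estimate matches the 11.6 reported in the text. The interval is wide because the study is small, which is one reason the estimate should not be reported without it.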
5- Cohort studies:
Cohort studies, sometimes called follow-up or incidence studies, begin with a group of people who are free of disease and classified into subgroups according to exposure to a potential cause of disease or outcome (Bonita et al., 2006).
Researchers define and measure the variables of interest, and the cohort is followed up to see how the subsequent development of new cases of the disease (or other outcome) differs between the subgroups with and without exposure. Since the data on exposure and disease refer to different points in time, cohort studies are longitudinal, like case-control studies. Some researchers call cohort studies prospective studies, but this term is confusing and should be avoided. The term “prospective” refers to the timing of data collection, not to the relationship between exposure and effect. That is why there can be both prospective and retrospective cohort studies. Cohort studies provide the best information about the causation of disease and a direct measurement of the risk of developing the disease. However, cohort studies are major undertakings and may require long periods of follow-up, since disease may occur a long time after exposure. For example, the induction period (the time required for a specific cause to produce an outcome) for leukemia caused by radiation can be many years. Thus, it is necessary to follow up study participants for a long time. Many of the exposures investigated are long-term in nature, and accurate information about them requires data collection over long periods.
However, in the case of tobacco use, many users have quite stable habits, and researchers can collect information about past and current exposure at the time the cohort is defined (Bonita et al., 2006).
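Because a cohort study follows disease-free people forward, risk can be measured directly in each exposure subgroup. The sketch below illustrates this with hypothetical follow-up counts that do not come from any cited study; the function name `cumulative_incidence` and the variable names are assumptions made for illustration.

```python
def cumulative_incidence(new_cases, at_risk):
    """Risk: new cases during follow-up divided by people at risk at the start."""
    if at_risk <= 0:
        raise ValueError("at_risk must be positive")
    return new_cases / at_risk


# Hypothetical cohort, all free of disease at baseline (illustrative only).
exposed_at_risk, exposed_new_cases = 1000, 30
unexposed_at_risk, unexposed_new_cases = 2000, 20

risk_exposed = cumulative_incidence(exposed_new_cases, exposed_at_risk)
risk_unexposed = cumulative_incidence(unexposed_new_cases, unexposed_at_risk)

# Relative measure: how many times higher the risk is in the exposed group.
risk_ratio = risk_exposed / risk_unexposed
# Absolute measure: excess risk in the exposed group.
risk_difference = risk_exposed - risk_unexposed

print(f"Risk in exposed: {risk_exposed:.3f}")
print(f"Risk in unexposed: {risk_unexposed:.3f}")
print(f"Risk ratio: {risk_ratio:.2f}")
print(f"Risk difference: {risk_difference:.3f}")
```

Unlike a case-control study, this design yields the absolute risks themselves, so both the ratio and the difference can be reported.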
In situations with sudden acute exposures, the cause-effect relationship for acute effects may be obvious. Epidemiologists also use cohort studies to investigate late or chronic effects (Bonita et al., 2006). An example of measuring effects over a long period is the catastrophic poisoning of residents around a pesticide factory in Bhopal, India, in 1984 (Lapierre & Moro, 2002).
Cohort studies usually start with exposed and unexposed people who are free of disease. The difficulty of measuring exposure, or of finding existing data on individual exposures, largely determines the feasibility of a cohort study. Using a cohort study to investigate a rare disease is problematic: if the disease is rare in both the exposed and the unexposed group, it may be difficult to obtain a large enough study group (Bonita et al., 2006).
The cost of a cohort study can be reduced by using routine sources of information about mortality or morbidity, such as disease registers or national registers of deaths, as part of the follow-up. One example is the Nurses’ Health Study, a large cohort study designed to be less expensive to run. In 1976, 121,700 married female nurses aged 30–55 years completed the initial Nurses’ Health Study questionnaire. Every two years, self-administered questionnaires were sent to these nurses, who supplied information on their health behaviors and medical histories. The objective of the initial cohort study was to evaluate the health effects of oral contraceptive use. Investigators tested their methods on subgroups of the larger cohort and obtained information on disease outcomes from routine data sources (Colditz et al., 1986).
In addition to studying the relationship between oral contraceptive use and the risk of ovarian and breast cancer, the investigators were able to evaluate other diseases in this cohort, such as heart disease and stroke, as well as the relationship between smoking and the risk of stroke. Although stroke is a relatively common cause of death, it is rare in younger women, so a large cohort is needed (Bonita et al., 2006).
Because cohort studies take healthy people as their starting point, it is possible to study a wide variety of outcomes. The Swedish Twin Registry is an excellent example of the type of data source that can be used to answer many epidemiological questions (Lichtenstein et al., 2002).
Historical cohort studies:
Costs can occasionally be reduced by using a historical cohort, which is identified from records of previous exposure. It is called a historical cohort study because all the exposure and effect (disease) data have been collected before the actual study begins. For example, records of military personnel exposed to radioactive fall-out at nuclear bomb testing sites have been used to examine the possible causal role of fall-out in the development of cancer over the past 30 years (). This sort of design is relatively common for studies of cancer related to occupational exposures.
Nested case-control studies:
The nested case-control design makes cohort studies less expensive. The cases and controls are both chosen from a defined cohort, for which some information on exposures and risk factors is already available (Figure 3.7). Additional information on the new cases and controls selected specifically for the study is then collected and analyzed. This design is particularly useful when measurement of exposure is expensive. An example of a nested case-control study is shown in Box 3.5.