class: split-70 with-border hide-slide-number bg-brand-red
background-image: url("images/USydLogo-black.svg")
background-size: 200px
background-position: 2% 90%

.column.white[.content[

<br><br><br>

# Experimental design

## .black[STAT3022 Applied Linear Models Lecture 22]

<br><br><br>

### .black[2020/02/20]

]]
.column.bg-brand-charcoal[.content.white[

## Today

1. Experimental unit and observational unit
2. Blocking
3. Confounding
4. The ideal and the reality of experimental design

]]

---
class: split-10

.row.bg-brand-blue.white[.content.vmiddle[

# Meet Sir Ronald Fisher

]]
.row[
.split-two[
.column[.content[

* Ronald Fisher worked at Rothamsted Experimental Station (England) from 1919 to 1933, analysing crop experiments.
* Fisher's *The Design of Experiments* (1935) laid the foundations of experimental design.
* Ronald Fisher (1890-1962) spent his last days in Adelaide and his remains were interred within St Peter's Cathedral.

]]
.column.center.bg-brand-gray[.content[

<img src="images/fisher2.png" width="30%" height="30%">

]]
]]

---
class: split-10
count: false

.row.bg-brand-blue.white[.content.vmiddle[

# Experimental Unit and Observational Unit

]]
.row[.split-two[
.column[.content[

## Definitions

* A .brand-blue[treatment] is the description of one of the different experimental conditions to be tested.
* An .brand-blue[experimental unit] is the smallest division of the experimental material such that any two units may receive different treatments in the actual experiment.
* An .brand-blue[observational unit] is the smallest unit on which the response is measured.
* An .brand-blue[analytical unit] is the basic unit in a statistical analysis.

]]
.column.bg-brand-gray[.content[

## Example

* A lady claimed that she was able to tell whether the tea or the milk was added first to a cup.
* The lady was served 8 randomly ordered cups of tea.
* Four were prepared by first adding the milk and the other four by first adding the tea.
She was to select the 4 cups prepared by one method.
* She was fully informed of the experimental method.
* Treatment: Tea preparation - adding milk first or tea first.
* Experimental Unit: The cups.
* Observational Unit: The cups.

]]
]]

---
class: split-10

.row.bg-brand-blue.white[.content.vmiddle[

# Example: Tomatoes

]]
.row[.split-70[
.column[.content[

* Different varieties of tomato are grown in pots, with different composts and different amounts of water.
* Each plant is supported on a vertical stick until it is 1.5 metres high, then all further new growth is wound around a horizontal rail (within the same pot).
* Groups of five adjacent plants are wound around the same rail.
* When the tomatoes are ripe they are harvested and the weight of saleable tomatoes per rail is recorded.
* Treatment: Variety-compost-water combination.
* Experimental Unit: Pots.
* Observational Unit: Rails.
* Each pot (experimental unit) receives a single variety-compost-water combination.
* Each experimental unit (pot) consists of several observational units (rails).

]]
.column.center.bg-brand-gray[.content[

<img src="images/tomato.jpg" width="80%" height="80%">

]]
]]

---
class: split-10

.row.bg-brand-blue.white[.content.vmiddle[

# Example: Asthma

]]
.row[.split-60[
.column[.content[

* Several patients take part in an experiment to compare drugs intended to alleviate the symptoms of chronic asthma.
* For each patient, the drug is changed each month.
* Each patient's peak flow rate is measured every three months.
* Treatment: Drugs.
* Experimental Unit: Patient-month combination.
* Observational Unit: Patient-three-month combination.
* Each observational unit consists of three experimental units.

]]
.column.bg-brand-gray[.content[

<img src="images/AsthmaAdult-650x450.jpg" width="100%" height="100%">

]]
]]

---
class: split-10

.row.bg-brand-blue.white[.content.vmiddle[

# Relationship between EUs and OUs

]]
.row[.split-60[
.column[.content[

1. EUs and OUs are the same. This is the most common case.
2. Each EU consists of a number of OUs. E.g. the tomato experiment.
3. Each OU consists of multiple EUs. E.g. the asthma experiment.
4. EUs and OUs partially overlap but neither is completely contained in the other.

Cases 3 and 4 are poor designs and should be avoided.

The AU is often the OU, but it may differ between statistical analyses. In some analyses, the AU may be an aggregation of several OUs.

]]
.column.bg-brand-gray[.content[

## Mathematical formulation

* Let `\(\tau\)` denote the set of all treatments. The number of treatments is `\(|\tau| = t\)`.
* Let `\(\Omega\)` denote the set of all OUs. The sample size is generally given by `\(|\Omega| = N\)`.
* The .brand-blue[design] is the allocation of treatments to OUs.
* Mathematically, the design is a function `\(T:\Omega \rightarrow \tau\)`.

]]
]]

---
class: split-10

.row.bg-brand-blue.white[.content.vmiddle[

# Blocking

]]
.row[.split-50[
.column[.content[

* .brand-blue[Blocking] is the arrangement of EUs into groups (blocks) of units that are similar to one another.
* Typically, a blocking factor is a source of variability that needs to be controlled because it is not of primary interest to the experimenter.
* Blocking lowers the residual degrees of freedom, but it also removes a source of otherwise unexplained variation and hence can reduce the variance of treatment contrasts.
* Consequently, it can increase the power of the experiment (the probability of rejecting `\(H_0\)` when it is false), particularly when the sample size is small.
* However, blocks that are not homogeneous can decrease the power of an experiment.

]]
.column.bg-brand-gray[.content[

* **Natural discrete divisions** between EUs. E.g. in an experiment with people, gender is an obvious blocking factor.
* **Continuous gradients.** If the experiment is spread out in time or space, there will probably be continuous underlying trends across time or space, but these have no natural boundaries.
* E.g. OUs can be grouped into blocks of units that are contiguous in time or space, such as months or districts.
* E.g. in agricultural field trials, the underlying trend may be changes in soil fertility.
* In this case, the choice of block boundaries can be somewhat arbitrary.

]]
]]

---
class: split-10

.row.bg-brand-blue.white[.content.vmiddle[

# Example: blocking in Weed Control experiment

]]
.row[.split-60[
.column[.content[

### Choice of block in field trials

This experiment compares methods of controlling weeds in a trial where the crop is planted by hand. Here are some considerations when blocking to improve treatment comparisons.

* If possible, blocks should all have the same size.
* If possible, blocks should be big enough to allow each treatment to occur at least once in each block.
* Natural discrete blocks should always be used once they have been recognised.

However, the trial is too large for one person to do all the planting, so several workers are employed. This may introduce an additional source of variation: the individual worker effect.

]]
.column.bg-brand-gray[.content[

<img src="images/narrabri.png" width="100%" height="100%">

.bottom_abs.width100[
.blue[Next we consider further factors that may affect the results: selection bias, observer bias and consent bias ...]
]

]]
]]

---
class: split-10

.row.bg-brand-blue.white[.content.vmiddle[

# Example: biases in Salk vaccine field trial

]]
.row[.split-60[
.column[.content[

* The first polio epidemic hit the United States in 1916; over the following decades polio claimed hundreds of thousands of victims, especially children.
* The National Foundation for Infantile Paralysis (NFIP) was ready to test the vaccine developed by Jonas Salk.
* A controlled experiment was proposed to test the effectiveness of the vaccine on grade 1, 2 and 3 children in selected school districts throughout the country where the risk of polio was high. In total two million children were involved. Note that not all parents consented to their children being vaccinated.
* Treatment? Vaccine
* Experimental unit? Each child
* Observational unit? Each child
* Blocks?
]]
.column.bg-brand-gray[.content[

## Proposed Design

<img src="images/Polio-Age_400x259.jpg" width="80%" height="80%">

* Vaccinate all grade 2 children whose parents would consent, leaving children in grades 1 and 3 as controls.
* Can grade 2 children whose parents did not consent be included as controls?
* Or simply let the non-consenting group be the control group?

]]
]]

---
layout: false
class: split-10

.row.bg-brand-blue.white[.content.vmiddle[

# Example: biases in Salk vaccine field trial

]]
.row[.content[

* .brand-blue[Selection bias] - polio is a contact disease and the incidence rate may vary among children of different grades. Why?
* .brand-blue[Consent bias] - higher-income parents are more likely to consent to treatment than lower-income parents. Why?
* .brand-blue[Consent bias] - children of higher-income parents are more vulnerable to polio. Why?
* .brand-blue[Observer/receiver bias] - the idea of receiving treatment can bias results. Why? This .brand-blue[placebo effect] can be controlled by giving a placebo to the control group.
* .brand-blue[Observer/receiver bias] - many forms of polio are hard to diagnose and, in borderline cases, the diagnosticians could be influenced by knowing whether the child was vaccinated. This motivates .brand-blue[double-blind experiments].
* In a .brand-blue[well-controlled experiment], the responses in the vaccinated and non-vaccinated groups should differ only because of the treatment.

<img src="images/Placebo1.jpg" width="15%" height="15%">
<img src="images/Placebo4.jpg" width="15%" height="15%">
<img src="images/ControlExperiment.png" width="15%" height="15%">

]]

---
class: split-10

.row.bg-brand-blue.white[.content.vmiddle[

# Example: biases in Salk vaccine field trial

]]
.row[.split-60[
.column[.content[

* Polio is a contact disease, so the incidence rate may be higher among grade 2 children than among grade 1 and 3 children;
* Higher-income parents are more likely to consent to treatment than lower-income parents, probably because they have more trust in the medical system;
* Children of higher-income parents are more vulnerable to polio, probably because they are less exposed to unhygienic environments and so develop less natural immunity;
* Knowing that they have received treatment may give children and parents a feeling of protection, leading them to engage in more activities.

What is confounding? A .brand-blue[confounding factor] is a variable that influences both the dependent variable (the outcome) and the independent variable (the treatment received), causing a spurious association and hence biasing the result.

]]
.column.bg-brand-gray[.content[

<img src="images/lead_720_405.jpg" width="90%" height="90%">

* Human judgement often results in substantial bias when assigning treatments to children. To reduce selection bias, .brand-blue[randomisation] is used to select the children to vaccinate, so that observable and unobservable factors other than the treatment that may affect the results are balanced out.

]]
]]

---
class: split-10

.row.bg-brand-blue.white[.content.vmiddle[

# Example: bias in Lanarkshire milk experiment

]]
.row[.split-60[
.column[.content[

* An experiment was conducted in Lanarkshire in the early twentieth century to test whether extra milk affects growth. Extra milk was to be given to a random selection of students at some schools.
* Treatment: Extra milk vs. no extra milk
* Experimental Unit: student (same treatment over time)
* Observational Unit: student (same treatment over time)
* At the end of the experiment it was discovered that the teachers had altered the random allocation to ensure that extra milk was given to the most undernourished children.
* This is an example of .brand-blue[selection bias]. Their good intentions ruined the experiment by confounding the effect of the extra milk with the initial state of health.

]]
.column.bg-brand-gray[.content[

<img src="images/milk.jpg" width="100%" height="100%">

.brand-blue[Random allocation] is a way out.

]]
]]

---
class: split-10

.row.bg-brand-blue.white[.content.vmiddle[

# Example: bias in portacaval shunt experiment

]]
.row[.split-60[
.column[.content[

* A portacaval shunt redirects blood flow in cases of cirrhosis of the liver. The surgery is long and dangerous.
* A study in 1966 detected an increase in life expectancy among patients who had the operation compared with those who did not, and claimed that the surgery was worth the risk.
* However, the design of the experiment was biased toward surgery, another example of .brand-blue[selection bias], as healthier patients tended to have surgery. Hence the increase in life expectancy may be due to the patients being healthier rather than to the surgery.
* This is a .brand-blue[nonrandomised controlled experiment], as the assignment of subjects is based on the investigator's judgement.

]]
.column.bg-brand-gray[.content[

<img src="images/surgeons.jpg" width="100%" height="100%">

]]
]]

---
class: split-10

.row.bg-brand-blue.white[.content.vmiddle[

# Example: Bee attracting chemical

]]
.row[.split-70[
.column[.content[

A professional apple-grower has written to you, making an appointment to discuss an experiment which he is proposing to conduct in the coming growing season. Part of his letter reads:

* There is a new type of chemical spray available, which is supposed to make the apple flowers more attractive to bees.
* Since bees are essential for setting the fruit, I want to investigate these chemicals.
* Two manufacturers are selling the new sprays, under the trade names Buzz!! and Attractabee.
* I propose dividing my orchard into three parts. I shall spray Attractabee onto one part, and Buzz!! onto the second part.
* The third part will be managed in the normal way, with none of these new sprays. I shall then see which part of the orchard does best.

]]
.column.bg-brand-gray[.content[

## How would you design the experiment?

<img src="images/bees.jpg" width="100%" height="100%">

]]
]]

---
class: split-10

.row.bg-brand-blue.white[.content.vmiddle[

# The ideal and the reality

]]
.row[.split-75[
.column[.content[

There may be tension between what a statistician thinks is desirable and what an experimenter wants. In reality we have to balance the statistical ideal against practical constraints.

* Increasing the replication generally increases the power, but it also generally increases the cost of an experiment.
* Testing material may be limited. E.g. there may not be enough seed of some varieties to plant at all field trials.
* Choice of treatments. Does control mean "doing nothing"? E.g. when treating a disease for which a therapy already exists, it is unethical to "do nothing" in an experiment. In this case the treatments should be the new therapy and the existing therapy.
* .brand-blue[Consent bias] occurs when subjects choose whether or not to take part in an experiment. In human or animal trials it may be an ethical issue to withhold treatment from those in the control group or to enforce treatment on those in the treatment group.

]]
.column.bg-brand-gray[.content[

<img src="images/reality-check-ahead.png" width="100%" height="100%">

* What is the .brand-blue[**aim of the experiment**]?

]]
]]
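
---

# Sketch: randomising treatments within blocks

The ideas of blocking and random allocation in this lecture can be illustrated with a short Python sketch for the apple orchard example. It is only one possible design, not the grower's proposal or a recommended answer. The function name `randomised_block_design`, the choice of four blocks and the seed are all hypothetical; the sketch assumes the orchard can be divided into blocks of three similar plots, so that each treatment occurs exactly once in every block.

```python
import random


def randomised_block_design(treatments, n_blocks, seed=None):
    """Allocate every treatment once per block, in random order.

    Returns a dict mapping block number -> list of treatments,
    where position i in the list is the treatment for plot i + 1.
    (Hypothetical helper, written for illustration only.)
    """
    rng = random.Random(seed)  # a seeded generator makes the plan reproducible
    design = {}
    for block in range(1, n_blocks + 1):
        order = list(treatments)  # copy, so the caller's list is not modified
        rng.shuffle(order)        # independent randomisation in each block
        design[block] = order
    return design


# The two sprays from the grower's letter plus the usual management as control.
treatments = ["Attractabee", "Buzz!!", "No spray"]

# Assumed layout: 4 blocks, each containing 3 plots.
design = randomised_block_design(treatments, n_blocks=4, seed=2020)

for block, plots in design.items():
    print(f"Block {block}: {plots}")
```

Because the order is shuffled separately in each block, no human judgement enters the allocation, and block-to-block differences (for example a fertility gradient across the orchard) are not confounded with the sprays.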