class: split-70 with-border hide-slide-number bg-brand-red
background-image: url("images/USydLogo-black.svg")
background-size: 200px
background-position: 2% 90%

.column.white[.content[
<br><br><br>
# Random Effect Models
## .black[STAT3022 Applied Linear Models Lecture 31]
<br><br><br>
### .black[2020/02/20]
]]
.column.bg-brand-charcoal[.content.white[
## Today
1. Random effects model
2. Intraclass correlation
3. Use of the lmer function
]]

---
class: split-10
.row.bg-brand-blue.white[.content.vmiddle[
# Fun fact
]]
.row[.split-60[
.column[.content[
* Churchill Eisenhart (1913-1994) was chief of the Statistical Engineering Laboratory, National Bureau of Standards, USA.
* Eisenhart (1947) made explicit the distinction between fixed and random interpretations of ANOVA and introduced this terminology.
* The random effects model was routinely referred to as .brand-blue[Eisenhart's Model II].
.bottom_abs.width100[
Eisenhart C. (1947) The assumptions underlying the analysis of variance. *Biometrics* **3** 1-21
]
]]
.column.bg-brand-gray[.content[
<img src="images/Churchill-Eisenhart.png" width="50%" height="50%">
]]
]]

---
class: split-10
.row.bg-brand-blue.white[.content.vmiddle[
# Example: Sodium content in beer
]]
.row[.split-50[
.column[.content[
* .brand-blue[Six] brands were selected at random from the large number of North American beer brands.
* From each brand, .brand-blue[eight] 12-ounce ( `\(\approx\)` 3.5 dl) bottles were sampled and the sodium content (in mg) of each bottle was measured.
* What is of interest?
* What is the expected value of sodium content for a given beer brand?
* Does the sodium content differ significantly between brands?
* What is the expected value of sodium content in a beer?
* What is the variability of the sodium content between different beer brands?
]]
.column.bg-brand-gray[.content[
.scroll-box-20[

```r
dat
```

```
   sodium brand rep
1    24.4     1   1
2    22.6     1   2
3    23.8     1   3
4    22.0     1   4
5    24.5     1   5
6    22.3     1   6
7    25.0     1   7
8    24.5     1   8
9    10.2     2   1
10   12.1     2   2
11   10.3     2   3
12   10.2     2   4
13    9.9     2   5
14   11.2     2   6
15   12.0     2   7
16    9.5     2   8
17   19.2     3   1
18   19.4     3   2
19   19.8     3   3
20   19.0     3   4
21   19.6     3   5
22   18.3     3   6
23   20.0     3   7
24   19.4     3   8
25   17.4     4   1
26   18.1     4   2
27   16.7     4   3
28   18.3     4   4
29   17.6     4   5
30   17.5     4   6
31   18.0     4   7
32   16.4     4   8
33   13.4     5   1
34   15.0     5   2
35   14.1     5   3
36   13.1     5   4
37   14.9     5   5
38   15.0     5   6
39   13.4     5   7
40   14.8     5   8
41   21.3     6   1
42   20.2     6   2
43   20.7     6   3
44   20.8     6   4
45   20.1     6   5
46   18.8     6   6
47   21.1     6   7
48   20.3     6   8
```
]
.bottom_abs.width100[
Data from Kutner et al. (2005) *Applied Linear Statistical Models* p. 1078.
]
]]
]]

---
class: split-10
.row.bg-brand-blue.white[.content.vmiddle[
# Example: Sodium content in beer
]]
.row[.split-50[
.column[.content[

```r
ggplot(dat, aes(brand, sodium, fill=brand)) + geom_boxplot() + theme_classic() +
  xlab("Brand") + ylab("Sodium") +
  theme(axis.text=element_text(size=16), axis.title=element_text(size=20)) +
  guides(fill=FALSE) + geom_hline(yintercept=mean(sodium), linetype="dotted")
```
]]
.column.bg-brand-gray[.content[
<img src="lecture31_2020JC_files/figure-html/unnamed-chunk-6-1.svg" style="display: block; margin: auto;" />
]]
]]

---
class: split-10
.row.bg-brand-blue.white[.content.vmiddle[
# Example: Beer - one-way ANOVA model
]]
.row[.split-50[
.column[.content[
* What is the expected value of the sodium content for given brands?

```r
anova(lm(sodium ~ brand - 1))
```

```
Analysis of Variance Table

Response: sodium
          Df  Sum Sq Mean Sq F value    Pr(>F)    
brand      6 15772.3 2628.72  3671.6 < 2.2e-16 ***
Residuals 42    30.1    0.72                      
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
```

```r
round(coef(M1 <- lm(sodium ~ brand - 1)), 1) # brand means
```

```
brand1 brand2 brand3 brand4 brand5 brand6 
  23.6   10.7   19.3   17.5   14.2   20.4 
```
* Does the sodium content differ significantly between brands?
]]
.column.bg-brand-gray[.content[

```r
summary(aov(sodium ~ brand))
```

```
            Df Sum Sq Mean Sq F value Pr(>F)    
brand        5  854.5  170.91   238.7 <2e-16 ***
Residuals   42   30.1    0.72                   
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
```
* What is the expected value of sodium content?
* What is the variability of sodium content between brands?

```r
mi = tapply(sodium, brand, mean); var(mi) # estimates sigA^2 + sig2/8
```

```
[1] 21.36323
```

```r
c(mean(sodium), deviance(M1)/42) # intercept and MSR
```

```
[1] 17.6291667  0.7159524
```
]]
]]

---
class: split-10
.row.bg-brand-blue.white[.content.vmiddle[
# One-way ANOVA model II
]]
.row[.split-50[
.column[.content[
* Let `\(Y_{ij}\)` denote the `\(j\)`-th observation on factor level `\(i\)`.
* We have so far modelled `\(Y_{ij}\)` as
`$$Y_{ij}=\mu+\alpha_i+\epsilon_{ij}, \ j=1,\ldots,n_i, \ i=1,\ldots,t,$$`
where `\(\epsilon_{ij}\sim NID(0,\sigma^2)\)` and `\(\alpha_i\)` is a .brand-blue[fixed effect]. This is at times referred to as one-way ANOVA model .brand-blue[I].
* For one-way ANOVA model .brand-blue[II], we model `\(Y_{ij}\)` as
`$$Y_{ij}=\mu+A_i+\epsilon_{ij}, \ j=1,\ldots,n_i, \ i=1,\ldots,t,$$`
where `\(\color{blue}{A_i\sim NID(0,\sigma^2_A)}\)`, `\(\epsilon_{ij}\sim NID(0,\sigma^2)\)` and `\(A_i\)` is a .brand-blue[random variable] independent of `\(\epsilon_{ij}\)`. This is more commonly referred to as the .brand-blue[random effects model].
* Note: by convention italic upper-case letters denote random variables (although we violate this at times for convenience).
]]
.column.bg-brand-gray[.content[

```r
library(lme4)
M0 <- lmer(sodium ~ 1+(1|brand), data=dat)
summary(M0)
```

```
Linear mixed model fit by REML ['lmerMod']
Formula: sodium ~ 1 + (1 | brand)
   Data: dat

REML criterion at convergence: 148.9

Scaled residuals: 
     Min       1Q   Median       3Q      Max 
-1.90551 -0.68338  0.08232  0.79247  1.64969 

Random effects:
 Groups   Name        Variance Std.Dev.
 brand    (Intercept) 21.274   4.6123  
 Residual              0.716   0.8461  
Number of obs: 48, groups:  brand, 6

Fixed effects:
            Estimate Std. Error t value
(Intercept)   17.629      1.887   9.343
```
So `\(\hat\mu=17.629\)`, `\(\hat\sigma^2=0.716\)` and `\(\hat{\sigma}^2_A=21.274\)`.
]]
]]

---
class: split-10
.row.bg-brand-blue.white[.content.vmiddle[
# Prediction of random effects
]]
.row[.split-50[
.column[.content[
The deviation from the average sodium content for each brand is `\(A_i\)`, predicted by

```r
ranef(M0)
```

```
$brand
  (Intercept)
1   5.9831634
2  -6.9250345
3   1.7011768
4  -0.1286256
5  -3.4023537
6   2.7716735

with conditional variances for "brand" 
```

```r
round(tapply(sodium, brand, mean) - mean(sodium), 3)
```

```
     1      2      3      4      5      6 
 6.008 -6.954  1.708 -0.129 -3.417  2.783 
```
]]
.column.bg-brand-gray[.content[
* The estimation method REML refers to .brand-blue[restricted maximum likelihood].
* Note that we often say we .brand-blue[predict] random effects and .brand-blue[estimate] fixed effects and variance parameters.
* That is, we .brand-blue[predict a realization of a random variable] while we .brand-blue[estimate a parameter of its distribution].
* The model parameters of ANOVA II as a random effects model are `\(\mu\)`, `\(\sigma_A^2\)` and `\(\sigma^2\)`. We do not treat the `\(A_i\)` as model parameters but as random realisations from the population.
* Hence this model has fewer parameters and is preferred when there is a large number of brands with few replicates each, in which case model I would have many `\(\alpha_i\)` parameters without sufficient replicates to estimate them.
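The values from `ranef(M0)` are not the raw brand-mean deviations but shrunken versions of them. A minimal numeric sketch of this shrinkage, using the rounded REML variance components and raw deviations printed above (so the reproduction is only approximate):

```r
# BLUP shrinkage sketch: Ahat_i = lambda * (ybar_i - ybar), with
# lambda = sigA2 / (sigA2 + sig2/n_i) for a balanced one-way design
sigA2 <- 21.274                                   # REML brand variance (rounded)
sig2  <- 0.716                                    # REML residual variance (rounded)
ni    <- 8                                        # bottles per brand
raw   <- c(6.008, -6.954, 1.708, -0.129, -3.417, 2.783)  # ybar_i - ybar
lambda <- sigA2 / (sigA2 + sig2/ni)
round(lambda * raw, 3)  # close to the ranef(M0) values above
```

Because `\(\sigma^2/n_i\)` is tiny relative to `\(\sigma^2_A\)` here, the shrinkage factor is about 0.996 and the predictions are barely distinguishable from the raw deviations.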
]]
]]

---
class: split-10
.row.bg-brand-blue.white[.content.vmiddle[
# Differences between one-way ANOVA model I and II
]]
.row[.split-50[
.column[.content[
Recall under .brand-blue[one-way ANOVA model I], <br>
`\(E(Y_{ij}) = \mu + \alpha_i\)`, <br>
`\(\text{Cov}(Y_{ij},Y_{ik})=\text{Cov}(\mu+\alpha_i+\epsilon_{ij},\mu+\alpha_i+\epsilon_{ik})\)` <br>
`\(\hspace{40mm} =\text{Cov}(\epsilon_{ij},\epsilon_{ik})=0\)` for `\(j\neq k\)`, <br>
and so responses are independent.<br>
.brand-blue[Note:] zero covariance does not imply independence unless the random variables are jointly Normally distributed.

For .brand-blue[one-way ANOVA model II], <br>
the model is
`$$Y_{ij}=\mu+A_i+\epsilon_{ij}, \ j=1,\ldots,n_i, \ i=1,\ldots,t.$$`
* `\(E(Y_{ij}) = \mu\)`, i.e., all observations have the same expected value since `\(E(A_{i}) = 0\)`;
* `\(\text{Var}(Y_{ij})=\sigma^2_A+\sigma^2\)`, where `\(\sigma^2\)` and `\(\sigma^2_A\)` are called the .brand-blue[variance components];
]]
.column.bg-brand-gray[.content[
* `\(\text{Cov}(Y_{ij},Y_{ik})=\text{Cov}(\mu+A_i+\epsilon_{ij},\mu+ A_i+\epsilon_{ik})\)` <br>
`\(\hspace{40mm}=\text{Var}(A_i) = \sigma^2_A\)` for `\(j\neq k\)`, <br>
which means that the observations within groups are not statistically independent!
* This naturally leads to the so-called .brand-blue[intraclass correlation] defined as
`$$\color{blue}{\rho=\frac{\text{Cov}(Y_{ij},Y_{ik})}{\sqrt{\text{Var}(Y_{ij})\text{Var}(Y_{ik})}}=\frac{\sigma^2_A}{\sigma^2_A+\sigma^2}.}$$`
* So far we have assumed that all factor levels of interest are observed.
* In many situations the factor levels are a sample from a larger population of factor levels.
* This situation should be modelled via a .brand-blue[random effects model], also known as .brand-blue[ANOVA model II].
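Plugging the REML variance components reported by `summary(M0)` earlier into this formula gives the estimated intraclass correlation for the beer data (a quick sketch with the rounded values, not a general-purpose function):

```r
# Estimated intraclass correlation: rho = sigA2 / (sigA2 + sig2)
sigA2 <- 21.274   # brand variance component (REML, rounded)
sig2  <- 0.716    # residual variance component (REML, rounded)
round(sigA2 / (sigA2 + sig2), 3)  # 0.967
```

So about 97% of the variability in sodium content is between brands; bottles of the same brand are very strongly correlated.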
]]
]]

---
class: split-10
.row.bg-brand-blue.white[.content.vmiddle[
# Hypothesis test in a random-effects model
]]
.row[.split-50[
.column[.content[
* In the random effects situation the hypothesis of interest may be
`$$\color{blue}{H_0:\sigma^2_A=0\quad \text{vs.}\quad H_1: \sigma^2_A>0}$$`
`\(H_0:\sigma^2_A=0\)` implies .blue[no variability in the factor levels], so `\(A_i=0, \ \forall i\)` (since `\(E(A_i)=0\)` and `\(\text{Var}(A_i)=0\)`) and every observation has mean `\(\mu\)`.
* To test this we use the classical `\(F\)`-test statistic. Other tests such as the likelihood ratio, score and Wald tests exist but they are out of scope for this course.
`$$f=\frac{\text{SS}_A /(t-1)}{\text{RSS}/(n-t)}=\frac{\text{MS}_A}{\text{MSR}}.$$`
* So `\(f\sim F_{t-1,n-t}\)` under `\(H_0\)` as before.
* Thus it is the same test as the one-way ANOVA `\(F\)`-test but the interpretation changes.
]]
.column.bg-brand-gray[.content[
### Inference for the overall mean estimate
* We have `\(\text{E}(\bar{Y}_{\bullet\bullet})=\mu\)`, so `\(\bar{Y}_{\bullet\bullet}\)` is unbiased for `\(\mu\)`.
* And `\(\text{var}\left(\bar{Y}_{\bullet\bullet}\right)=\frac{1}{n}\sigma^2+\left(\frac{\sum_{i=1}^t {n_i^2}}{n^2}\right)\sigma^2_A\)`, which (for a balanced design) is estimated by `\(\frac{\text{SS}_A/(t-1)}{n}\)`.
* Then a 100 `\((1-\alpha)\)` % CI for `\(\mu\)` is:
`$$\hat{\mu} \pm t_{\color{blue}{(t-1)}}(1-\alpha/2) \sqrt{\frac{\color{blue}{\text{SS}_A/(t-1)}}{n}}$$`
where `\(\hat{\mu}=\bar{Y}_{\bullet\bullet}\)`.
* Note that `\(MSR=SSR/(n-t)\)`, which estimates `\(\sigma^2\)`, is replaced by `\(\text{MS}_A=SS_A/(t-1)\)`, which estimates `\(n\,\text{var}(\bar{Y}_{\bullet\bullet})=\sigma^2 + n_i \sigma_A^2\)` in the balanced case, so the df also changes from `\(n-t\)` to `\(t-1\)`.
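Both sides of this variance formula can be checked numerically for the beer data. A sketch assuming the rounded REML variance components and the `\(\text{MS}_A\)` value from the ANOVA table (6 brands, 8 bottles each):

```r
# Check var(Ybar..) = sig2/n + (sum(ni^2)/n^2) * sigA2 against MSA/n
sig2 <- 0.716; sigA2 <- 21.274   # REML variance components (rounded)
ni <- rep(8, 6); n <- sum(ni)    # balanced design: 6 brands x 8 bottles
direct <- sig2/n + sum(ni^2)/n^2 * sigA2
MSA <- 170.906                   # MS_A from anova(lm(sodium ~ brand))
round(c(direct, MSA/n), 3)       # both about 3.561
```

The agreement is approximate only because the plugged-in values are rounded.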
]]
]]

---
class: split-10
.row.bg-brand-blue.white[.content.vmiddle[
# Hypothesis Testing
]]
.row[.split-50[
.column[.content[

```r
alpha=0.01; t=6; n=6*8; mu.hat=mean(sodium)
MSA <- anova(lm(sodium ~ brand))[1, 3]
c(mu.hat-qt(1-alpha/2, t-1)*sqrt(MSA/n),
  mu.hat+qt(1-alpha/2, t-1)*sqrt(MSA/n))
```

```
[1] 10.02076 25.23757
```
which is wider than the one-sample 99% CI of (15.94813, 19.31020) because there is an extra source of variability from the random effects.

```r
anova(lm(sodium ~ brand))
```

```
Analysis of Variance Table

Response: sodium
          Df Sum Sq Mean Sq F value    Pr(>F)    
brand      5 854.53 170.906  238.71 < 2.2e-16 ***
Residuals 42  30.07   0.716                      
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
```
]]
.column.bg-brand-gray[.content[

```r
t.test(sodium, conf.level=1 - alpha) # test mu = 0
```

```
	One Sample t-test

data:  sodium
t = 28.153, df = 47, p-value < 2.2e-16
alternative hypothesis: true mean is not equal to 0
99 percent confidence interval:
 15.94813 19.31020
sample estimates:
mean of x 
 17.62917 
```
]]
]]

---
class: split-10
.row.bg-brand-blue.white[.content.vmiddle[
# Fixed vs. random effects
]]
.row[.split-50[
.column[.content[
### Theoretical considerations
* Fixed effects are constant values but random effects follow a distribution.
* Effects are fixed if the individual effects (the fixed `\(\alpha_i\)`) are themselves of interest; they are random if interest lies in the underlying population, summarised by the variance estimate `\(\sigma_A^2\)`.
* When a sample exhausts the population, the corresponding variable is fixed; when the sample is a small (i.e., negligible) part of the population the corresponding variable is random.
* If an effect is assumed to be a realised value of a random variable, it is called a random effect.
* Fixed effects are estimated by least squares, while random effects are predicted with shrinkage (e.g. as BLUPs after REML estimation of the variance components).
]]
.column.bg-brand-gray[.content[
### Question to ask
How does inference change when a set of effects is changed from fixed to random?
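The change shows up directly in the standard error of `\(\hat\mu\)`: model I uses MSR, model II uses `\(\text{MS}_A\)`. A numeric sketch with the rounded mean squares from the beer ANOVA shown earlier:

```r
# SE of the grand mean under the two models (balanced design, n = 48)
MSR <- 0.716; MSA <- 170.906; n <- 48
se_fixed  <- sqrt(MSR/n)  # model I: brand effects treated as fixed
se_random <- sqrt(MSA/n)  # model II: matches the lmer SE of 1.887
round(c(se_fixed, se_random), 3)  # 0.122 and 1.887
```

Treating brands as random makes inference on `\(\mu\)` far less precise, which is why the 99% CI on the previous slide is so much wider than the one-sample interval.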
.bottom_abs.width100[
Gelman (2005) Analysis of variance - why it is more important than ever. *The Annals of Statistics*. **33** (1) 1-53.
]
]]
]]