class: split-70 with-border hide-slide-number bg-brand-red background-image: url("images/USydLogo-black.svg") background-size: 200px background-position: 2% 90% .column.white[.content[ <br><br><br> # Two-way ANOVA Part I ## .black[STAT3022 Applied Linear Models Lecture 18] <br><br><br> ### .black[2020/02/18] ]] .column.bg-brand-charcoal[.content.white[ ## Today 1. Two-way analysis of variance 1. Main effects 1. Interaction effects ]] --- class: split-10 .row.bg-brand-blue.white[.content.vmiddle[ # .yellow[Main effects] two-way ANOVA model ]] .row[.split-two[ .column[.content[ * Suppose that there are two factors, .red[A] and .blue[B], which may be related to the response variable. * The most straightforward way of modelling these factors is to assume that their *effects are additive*. * A main effects two-way ANOVA model with factors .red[A] and .blue[B] is `$$Y_{ijk} = \mu + \color{red}{\alpha_i} + \color{blue}{\beta_j} + \epsilon_{ijk},\quad \epsilon_{ijk} \sim NID(0,\sigma^2)$$` where * `\(i=1,\ldots,\color{red}{a}\)`, `\(j=1,,\ldots,\color{blue}{b}\)` and `\(k=1,\ldots,n_{ij}\)`, * `\(Y_{ijk}\)` is the response for the `\(k\)`th unit at level `\(i\)` of factor A and level `\(j\)` of factor B. * `\(\color{red}{\alpha_1, \ldots, \alpha_a}\)` and `\(\color{blue}{\beta_1, \ldots, \beta_b}\)` are parameters describing the main effects of .red[A] and .blue[B], respectively. ]] .column.bg-brand-gray[.content[ * So here the total sample size, `$$\displaystyle n=\sum_{i=1}^a\sum_{j=1}^b n_{ij}.$$` ### Constraints * As before, we require constraints on `\(\color{red}{\alpha_1, \ldots, \alpha_a}\)` and `\(\color{blue}{\beta_1, \ldots, \beta_b}\)`. * If you include the intercept/overall mean in the model, `R` by default uses the treatment constraint, i.e. `\(\color{red}{\alpha_1 = 0}\)` and `\(\color{blue}{\beta_1 =0}\)`. ]] ]] --- class: split-10 .row.bg-brand-blue.white[.content.vmiddle[ # Fertilizer data ]] .row[.split-60[ .column[.content[ * Suppose there are two blocks under each treatment. * Hence the treatment and block make up two factors. * There are 3 replicates for each treatment and block combination. <br><br> <img src="lecture18_2020JC_files/figure-html/unnamed-chunk-4-1.png" style="display: block; margin: auto;" /> ]] .column.bg-brand-gray[.content[ .scroll-box-20[ ```r dat #long form ``` ``` Y Trt Block Rep 1 8 A 1 1 2 10 A 1 2 3 12 A 1 3 4 13 A 2 1 5 14 A 2 2 6 12 A 2 3 7 20 B 1 1 8 19 B 1 2 9 22 B 1 3 10 8 B 2 1 11 10 B 2 2 12 14 B 2 3 13 9 C 1 1 14 2 C 1 2 15 7 C 1 3 16 5 C 2 1 17 11 C 2 2 18 8 C 2 3 19 14 D 1 1 20 12 D 1 2 21 7 D 1 3 22 8 D 2 1 23 12 D 2 2 24 8 D 2 3 25 5 0 1 1 26 3 0 1 2 27 4 0 1 3 28 7 0 2 1 29 11 0 2 2 30 5 0 2 3 ``` ] ]] ]] --- class: split-10 .row.bg-brand-blue.white[.content.vmiddle[ # Writing the model as a multiple linear regression ]] .row[.split-50[ .column[.content[ * Specifically with treatment constraints, `$$\boldsymbol{Y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\epsilon}; \quad \text{where }\boldsymbol{\epsilon}\sim N(\boldsymbol{0}, \sigma^2\mathbf{I}_{n})$$` `\(\boldsymbol{\beta} = (\mu, \alpha_2, ..., \alpha_5, \beta_2)^\top\)`, `\(\boldsymbol{\epsilon}=(\epsilon_{ijk})\)`, `\(\boldsymbol{Y}=(Y_{ijk})\)`, * Design matrix `\(\mathbf{X}\)` is a `\(30\times 6\)` matrix given to the right. * Estimation can be performed by the method of least squares or ML in the usual way, ie `\(\hat{\boldsymbol{\beta}}=(\mathbf{X}^\top \mathbf{X})^{-1}\mathbf{X}^\top \boldsymbol{Y}\)`. * Fitted values and residuals are also defined as usual, ie `\(\widehat{\boldsymbol{Y}} = \mathbf{X}\hat{\boldsymbol{\beta}}\)` and `\(\boldsymbol{R}=\boldsymbol{Y}-\mathbf{X}\hat{\boldsymbol{\beta}}\)`. ]] .column.bg-brand-gray[.content[ .scroll-box-20[ ```r (X <- model.matrix(~ Trt + Block, data=dat)) ``` ``` (Intercept) TrtA TrtB TrtC TrtD Block2 1 1 1 0 0 0 0 2 1 1 0 0 0 0 3 1 1 0 0 0 0 4 1 1 0 0 0 1 5 1 1 0 0 0 1 6 1 1 0 0 0 1 7 1 0 1 0 0 0 8 1 0 1 0 0 0 9 1 0 1 0 0 0 10 1 0 1 0 0 1 11 1 0 1 0 0 1 12 1 0 1 0 0 1 13 1 0 0 1 0 0 14 1 0 0 1 0 0 15 1 0 0 1 0 0 16 1 0 0 1 0 1 17 1 0 0 1 0 1 18 1 0 0 1 0 1 19 1 0 0 0 1 0 20 1 0 0 0 1 0 21 1 0 0 0 1 0 22 1 0 0 0 1 1 23 1 0 0 0 1 1 24 1 0 0 0 1 1 25 1 0 0 0 0 0 26 1 0 0 0 0 0 27 1 0 0 0 0 0 28 1 0 0 0 0 1 29 1 0 0 0 0 1 30 1 0 0 0 0 1 attr(,"assign") [1] 0 1 1 1 1 2 attr(,"contrasts") attr(,"contrasts")$Trt [1] "contr.treatment" attr(,"contrasts")$Block [1] "contr.treatment" ``` ] ]] ]] --- class: split-10 .row.bg-brand-blue.white[.content.vmiddle[ # Estimation result ]] .row[.split-60[ .column[.content[ ```r M1 <- lm(Y ~ Trt + Block, data=dat) broom::tidy(M1) ``` ``` # A tibble: 6 x 5 term estimate std.error statistic p.value <chr> <dbl> <dbl> <dbl> <dbl> 1 (Intercept) 6.10 1.62 3.76 0.000974 2 TrtA 5.67 2.10 2.70 0.0124 3 TrtB 9.67 2.10 4.61 0.000112 4 TrtC 1.17 2.10 0.556 0.583 5 TrtD 4.33 2.10 2.07 0.0497 6 Block2 -0.533 1.33 -0.402 0.691 ``` ```r (beta <- solve(t(X) %*% X) %*% t(X) %*% dat$Y) ``` ``` [,1] (Intercept) 6.1000000 TrtA 5.6666667 TrtB 9.6666667 TrtC 1.1666667 TrtD 4.3333333 Block2 -0.5333333 ``` ]] .column.bg-brand-gray[.content[ .scroll-box-20[ ```r data.frame(fitted = X %*% beta, residual = dat$Y - X %*% beta) ``` ``` fitted residual 1 11.766667 -3.7666667 2 11.766667 -1.7666667 3 11.766667 0.2333333 4 11.233333 1.7666667 5 11.233333 2.7666667 6 11.233333 0.7666667 7 15.766667 4.2333333 8 15.766667 3.2333333 9 15.766667 6.2333333 10 15.233333 -7.2333333 11 15.233333 -5.2333333 12 15.233333 -1.2333333 13 7.266667 1.7333333 14 7.266667 -5.2666667 15 7.266667 -0.2666667 16 6.733333 -1.7333333 17 6.733333 4.2666667 18 6.733333 1.2666667 19 10.433333 3.5666667 20 10.433333 1.5666667 21 10.433333 -3.4333333 22 9.900000 -1.9000000 23 9.900000 2.1000000 24 9.900000 -1.9000000 25 6.100000 -1.1000000 26 6.100000 -3.1000000 27 6.100000 -2.1000000 28 5.566667 1.4333333 29 5.566667 5.4333333 30 5.566667 -0.5666667 ``` ] ]] ]] --- class: split-10 .row.bg-brand-blue.white[.content.vmiddle[ # Example: student battery experiment ]] .row[.split-40[ .column[.content[ * A student experiment was run to test the performance of .red[4 brands of batteries] under .blue[2 different environments] (room temperature and cold). * For each of the 8 treatment combindations, .grey[2 batteries] of a particular brand were put into a flashlight. * The flashlight was then turned on and allowed to run until the light went out. * The number of minutes the flashlight stayed on was recorded. Each treatment condition was run twice. ]] .column.bg-brand-gray[.content[ ```r skimr::skim(dat) ``` <img src="images/L18Skim.png" width="90%" height="90%"> ```r Y=dat$TimetoFail; t=dat$Temp; b=dat$Brand; g=dat$Group m=mean(Y); mt=tapply(Y,t,mean); mb=tapply(Y,b,mean) c(m,mt,mb) #overall mean & group means ``` ``` Cold Room A B C D 132.9375 89.3750 176.5000 132.7500 136.5000 125.5000 137.0000 ``` ]] ]] --- class: split-10 .row.bg-brand-blue.white[.content.vmiddle[ # Main Effect Model 1: Temp + Brand ]] .row[.split-50[ .column[.content[ ``` TimetoFail Temp Brand Group 1 181 Room A RA 2 187 Room B RB 3 150 Room C RC 4 173 Room D RD 5 85 Cold A CA 6 80 Cold B CB 7 93 Cold C CC 8 87 Cold D CD 9 180 Room A RA 10 192 Room B RB 11 159 Room C RC 12 190 Room D RD 13 85 Cold A CA 14 87 Cold B CB 15 100 Cold C CC 16 98 Cold D CD ``` ```r mu=mt[1]+mb[1]-m #calculation of coeff c(unname(mu),mt[2]-mt[1],mb[2]-mb[1], mb[3]-mb[1],mb[4]-mb[1]) ``` ``` Room B C D 89.1875 87.1250 3.7500 -7.2500 4.2500 ``` ]] .column.bg-brand-gray[.content[ ```r *M1 <- lm(TimetoFail ~ Temp + Brand, data=dat) print(broom::tidy(M1), digits = 4) ``` ``` # A tibble: 5 x 5 term estimate std.error statistic p.value <chr> <dbl> <dbl> <dbl> <dbl> 1 (Intercept) 89.2 6.71 13.3 0.0000000405 2 TempRoom 87.1 6.00 14.5 0.0000000161 3 BrandB 3.75 8.49 0.442 0.667 4 BrandC -7.25 8.49 -0.854 0.411 5 BrandD 4.25 8.49 0.501 0.627 ``` ```r anova(M1) ``` ``` Analysis of Variance Table Response: TimetoFail Df Sum Sq Mean Sq F value Pr(>F) Temp 1 30363.1 30363.1 210.630 1.612e-08 *** Brand 3 338.2 112.7 0.782 0.5284 Residuals 11 1585.7 144.2 --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 ``` SE within a factor is the same. ]] ]] --- class: split-10 .row.bg-brand-blue.white[.content.vmiddle[ # Main Effect Model 1: Temp + Brand ]] .row[.split-50[ .column.bg-brand-gray[.content[ Calculate predicted values with baseline Cold & A. ```r beta=summary(M1)$coeff[,1] predc=c(beta[1]+beta[2], beta[1], beta[1]+beta[2]+beta[3],beta[1]+beta[3], beta[1]+beta[2]+beta[4],beta[1]+beta[4], beta[1]+beta[2]+beta[5],beta[1]+beta[5]) predc=as.vector(predc) newdata <- expand.grid( Temp=c("Room", "Cold"), Brand=c("A", "B", "C", "D")) preddata<-cbind(newdata,Pred=predict(M1,newdata)) cbind(preddata,predc) ``` ``` Temp Brand Pred predc 1 Room A 176.3125 176.3125 2 Cold A 89.1875 89.1875 3 Room B 180.0625 180.0625 4 Cold B 92.9375 92.9375 5 Room C 169.0625 169.0625 6 Cold C 81.9375 81.9375 7 Room D 180.5625 180.5625 8 Cold D 93.4375 93.4375 ``` ]] .column.bg-white[.content[ ```r (X <- model.matrix(~ Temp + Brand, data=dat)) ``` ``` (Intercept) TempRoom BrandB BrandC BrandD 1 1 1 0 0 0 2 1 1 1 0 0 3 1 1 0 1 0 4 1 1 0 0 1 5 1 0 0 0 0 6 1 0 1 0 0 7 1 0 0 1 0 8 1 0 0 0 1 9 1 1 0 0 0 10 1 1 1 0 0 11 1 1 0 1 0 12 1 1 0 0 1 13 1 0 0 0 0 14 1 0 1 0 0 15 1 0 0 1 0 16 1 0 0 0 1 attr(,"assign") [1] 0 1 2 2 2 attr(,"contrasts") attr(,"contrasts")$Temp [1] "contr.treatment" attr(,"contrasts")$Brand [1] "contr.treatment" ``` ]] ]] --- class: split-10 .row.bg-brand-blue.white[.content.vmiddle[ # One-way ANOVA Model 2: Group ]] .row[.split-50[ .column[.content[ ```r *M2=lm(TimetoFail~Group,data=dat) print(summary(M2)$coeff,digits=4) #baseline Cold & A ``` ``` Estimate Std. Error t value Pr(>|t|) (Intercept) 85.0 4.384 19.3890 5.198e-08 GroupCB -1.5 6.200 -0.2419 8.149e-01 GroupCC 11.5 6.200 1.8549 1.007e-01 GroupCD 7.5 6.200 1.2097 2.609e-01 GroupRA 95.5 6.200 15.4037 3.136e-07 GroupRB 104.5 6.200 16.8554 1.556e-07 GroupRC 69.5 6.200 11.2100 3.596e-06 GroupRD 96.5 6.200 15.5650 2.892e-07 ``` ```r anova(M2) ``` ``` Analysis of Variance Table Response: TimetoFail Df Sum Sq Mean Sq F value Pr(>F) Group 7 31979 4568.5 118.86 1.894e-07 *** Residuals 8 307 38.4 --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 ``` ]] .column.bg-brand-gray[.content[ ```r newdata2=data.frame(Group=levels(dat$Group)) preddata2=cbind(newdata2,Pred=predict(M2,newdata2)) cbind(preddata2,tapply(Y,g,mean)) ``` ``` Group Pred tapply(Y, g, mean) 1 CA 85.0 85.0 2 CB 83.5 83.5 3 CC 96.5 96.5 4 CD 92.5 92.5 5 RA 180.5 180.5 6 RB 189.5 189.5 7 RC 154.5 154.5 8 RD 181.5 181.5 ``` * Intercept is the mean of group CA and others are difference in mean from CA. * RSS is much less but the number of regression parameters also increases. * Predicted values are their group means. ]] ]] --- layout: false class: split-10 .row.bg-brand-blue.white[.content.vmiddle[ # One-way ANOVA Model 2: Group ]] .row[.content[ <br> <img src="lecture18_2020JC_files/figure-html/unnamed-chunk-21-1.svg" style="display: block; margin: auto;" /> * The black dot is the fitted value. * The fitted value is the mean (middle) of the two replicates. ]] --- class: split-10 .row.bg-brand-blue.white[.content.vmiddle[ # Main effects model vs. one-way ANOVA model ]] .row[.split-two[ .column[.content[ ```r ggplot(dat, aes(Brand, TimetoFail, color=Temp)) + geom_point(size=5, alpha=1/2) + theme_bw(base_size=18) + labs(y="Time to Fail") + geom_point(data=preddata, aes(Brand, Pred), color="black", size=5, alpha=1/2) + ggtitle("Model 1: Temp+Brand") ``` <img src="lecture18_2020JC_files/figure-html/unnamed-chunk-22-1.png" style="display: block; margin: auto;" /> * The black dot is the fitted value under `M1`. * Which looks better? ]] .column.bg-brand-gray[.content[ ```r ggplot(dat, aes(Brand, TimetoFail, color=Temp)) + geom_point(size=5, alpha=1/2) + theme_bw(base_size=18) + labs(y="Time to Fail") + geom_point(data=preddata2r, aes(Brand, Pred), color="black", size=5, alpha=1/2) + ggtitle("Model 2: Group") ``` <img src="lecture18_2020JC_files/figure-html/unnamed-chunk-24-1.png" style="display: block; margin: auto;" /> ]] ]] --- class: split-10 .row.bg-brand-blue.white[.content.vmiddle[ # .yellow[Interaction] model ]] .row[.split-two[ .column[.content[ * Another possibility is to model all `\(t=ab\)` treatment (combinations): `$$M_{AB} : Y_{ijk} = \mu_{ij} + \epsilon_{ijk},\; i=1,\ldots,a,\; j=1,\ldots,b.$$` * This is essentially a one-way ANOVA Model 2. * For such a model we can calculate * TSS with `\((t-1) = (ab-1)\)` df * the RSS with `\((n-ab)\)` df * the F-test for the hypothesis of .brand-blue[overall] treatment effects i.e. `$$H_0\,:\, \mu_{11} = \mu_{12} = \ldots = \mu_{ab}$$` ]] .column.bg-brand-gray[.content[ ```r anova(M2) ``` ``` Analysis of Variance Table Response: TimetoFail Df Sum Sq Mean Sq F value Pr(>F) Group 7 31979 4568.5 118.86 1.894e-07 *** Residuals 8 307 38.4 --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 ``` * If `\(H_0\)` of null model is rejected, the treatment effects is partitioned by considering the mean response `\(\mu_{ij}\)`: `$$\mu_{ij} = \mu + \color{red}{\alpha_i} + \color{blue}{\beta_j} + \color{purple}{(\alpha\beta)_{ij}},$$` with treatment constraints `\(\alpha_1 = \beta_1=0\)`. * The consequence of these constraints mean is `\((\alpha\beta)_{1j}=0\)` for `\(j=1,...,b\)` and `\((\alpha\beta)_{i1}=0\)` for `\(i=1,...,a\)`. ]] ]] --- class: split-10 .row.bg-brand-blue.white[.content.vmiddle[ # Design matrix `\(\mathbf{X}\)` under treatment constraints ]] .row[.content[ ```r model.matrix(~ Temp*Brand, data=dat) ``` ``` (Intercept) TempRoom BrandB BrandC BrandD TempRoom:BrandB TempRoom:BrandC TempRoom:BrandD 1 1 1 0 0 0 0 0 0 2 1 1 1 0 0 1 0 0 3 1 1 0 1 0 0 1 0 4 1 1 0 0 1 0 0 1 5 1 0 0 0 0 0 0 0 6 1 0 1 0 0 0 0 0 7 1 0 0 1 0 0 0 0 8 1 0 0 0 1 0 0 0 9 1 1 0 0 0 0 0 0 10 1 1 1 0 0 1 0 0 11 1 1 0 1 0 0 1 0 12 1 1 0 0 1 0 0 1 13 1 0 0 0 0 0 0 0 14 1 0 1 0 0 0 0 0 15 1 0 0 1 0 0 0 0 16 1 0 0 0 1 0 0 0 attr(,"assign") [1] 0 1 2 2 2 3 3 3 attr(,"contrasts") attr(,"contrasts")$Temp [1] "contr.treatment" attr(,"contrasts")$Brand [1] "contr.treatment" ``` ]] --- class: split-10 .row.bg-brand-blue.white[.content.vmiddle[ # Design matrix `\(\mathbf{X}\)` ]] .row[.split-30[ .column[.content[ Under treatment constraints `$$\small\mathbf{X}=\begin{pmatrix}1 & \color{blue}{1} & \color{red}{0} & \color{red}{0} & \color{red}{0} & \color{purple}{0} & \color{purple}{0} & \color{purple}{0} \\ 1 & \color{blue}{1} & \color{red}{1} & \color{red}{0} & \color{red}{0} & \color{purple}{1} & \color{purple}{0} & \color{purple}{0} \\ 1 & \color{blue}{1} & \color{red}{0} & \color{red}{1} & \color{red}{0} & \color{purple}{0} & \color{purple}{1} & \color{purple}{0} \\ 1 & \color{blue}{1} & \color{red}{0} & \color{red}{0} & \color{red}{1} & \color{purple}{0} & \color{purple}{0} & \color{purple}{1} \\ 1 & \color{blue}{0} & \color{red}{0} & \color{red}{0} & \color{red}{0} & \color{purple}{0} & \color{purple}{0} & \color{purple}{0} \\ 1 & \color{blue}{0} & \color{red}{1} & \color{red}{0} & \color{red}{0} & \color{purple}{0} & \color{purple}{0} & \color{purple}{0} \\ 1 & \color{blue}{0} & \color{red}{0} & \color{red}{1} & \color{red}{0} & \color{purple}{0} & \color{purple}{0} & \color{purple}{0} \\ 1 & \color{blue}{0} & \color{red}{0} & \color{red}{0} & \color{red}{1} & \color{purple}{0} & \color{purple}{0} & \color{purple}{0} \\ 1 & \color{blue}{1} & \color{red}{0} & \color{red}{0} & \color{red}{0} & \color{purple}{0} & \color{purple}{0} & \color{purple}{0} \\ 1 & \color{blue}{1} & \color{red}{1} & \color{red}{0} & \color{red}{0} & \color{purple}{1} & \color{purple}{0} & \color{purple}{0} \\ 1 & \color{blue}{1} & \color{red}{0} & \color{red}{1} & \color{red}{0} & \color{purple}{0} & \color{purple}{1} & \color{purple}{0} \\ 1 & \color{blue}{1} & \color{red}{0} & \color{red}{0} & \color{red}{1} & \color{purple}{0} & \color{purple}{0} & \color{purple}{1} \\ 1 & \color{blue}{0} & \color{red}{0} & \color{red}{0} & \color{red}{0} & \color{purple}{0} & \color{purple}{0} & \color{purple}{0} \\ 1 & \color{blue}{0} & \color{red}{1} & \color{red}{0} & \color{red}{0} & \color{purple}{0} & \color{purple}{0} & \color{purple}{0} \\ 1 & \color{blue}{0} & \color{red}{0} & \color{red}{1} & \color{red}{0} & \color{purple}{0} & \color{purple}{0} & \color{purple}{0} \\ 1 & \color{blue}{0} & \color{red}{0} & \color{red}{0} & \color{red}{1} & \color{purple}{0} & \color{purple}{0} & \color{purple}{0} \\\end{pmatrix}$$` ]] .column.bg-brand-gray[.content[ Without constraints `$$\boldsymbol{Y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\epsilon}\quad\text{where}\quad\mathbf{X}=\begin{pmatrix}\boldsymbol{1}_n & \mathbf{X}_A & \mathbf{X}_B & \mathbf{X}_{AB}\end{pmatrix}$$` `$$\boldsymbol{Y} = \boldsymbol{1}_n\mu + \mathbf{X}_A\begin{pmatrix}\beta_1 \\\beta_2\end{pmatrix} + \mathbf{X}_B\begin{pmatrix}\alpha_1\\\alpha_2\\\alpha_3\\\alpha_4\end{pmatrix}+ \mathbf{X}_{AB}\begin{pmatrix}(\alpha\beta)_{11} \\(\alpha\beta)_{12}\\\vdots\\(\alpha\beta)_{24}\end{pmatrix} + \boldsymbol{\epsilon}$$` .scroll-box-10[ `$$\small \mathbf{X}_A=\begin{matrix}\mathbf{X}_A\\\begin{pmatrix}0 & 1 \\ 0 & 1 \\ 0 & 1 \\ 0 & 1 \\ 1 & 0 \\ 1 & 0 \\ 1 & 0 \\ 1 & 0 \\ 0 & 1 \\ 0 & 1 \\ 0 & 1 \\ 0 & 1 \\ 1 & 0 \\ 1 & 0 \\ 1 & 0 \\ 1 & 0 \\ \end{pmatrix}\end{matrix}\quad \mathbf{X}_B=\begin{matrix}\mathbf{X}_B\\\begin{pmatrix}1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ \end{pmatrix}\end{matrix}\quad \mathbf{X}_{AB}=\begin{matrix}\mathbf{X}_{AB}\\\begin{pmatrix}0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\ \end{pmatrix}\end{matrix}$$` ] ]] ]] --- class: split-10 .row.bg-brand-blue.white[.content.vmiddle[ # Column spaces with examples ]] .row[.content[ `$$\mathbf{X}_B\begin{pmatrix}\beta_1\\\beta_2\\\beta_3\\\beta_4\end{pmatrix}=\begin{pmatrix}\beta_1\\\beta_2\\\beta_3\\\beta_4\\\beta_1\\\beta_2\\\beta_3\\\beta_4\\\beta_1\\\beta_2\\\beta_3\\\beta_4\\\beta_1\\\beta_2\\\beta_3\\\beta_4\\\end{pmatrix},\quad V_{B}=\text{col}(\mathbf{X}_B)=\left\{\begin{pmatrix}a\\b\\c\\d\\a\\b\\c\\d\\a\\b\\c\\d\\a\\b\\c\\d\\\end{pmatrix} \middle| ~a,b,c,d \in \mathbb{R}\right\}, \qquad \begin{pmatrix}1 \\ 8 \\ 2 \\ -3\\1 \\ 8 \\ 2 \\ -3\\1 \\ 8 \\ 2 \\ -3\\1 \\ 8 \\ 2 \\ -3\\\end{pmatrix},\begin{pmatrix}3\\3\\3\\3\\3\\3\\3\\3\\3\\3\\3\\3\\3\\3\\3\\3\\\end{pmatrix}\in V_B$$` ]] --- class: split-10 .row.bg-brand-blue.white[.content.vmiddle[ # Vector spaces for two-way ANOVA model ]] .row[.split-30[ .column[.content[ * `\(V_0 = \text{col}(\boldsymbol{1}_n)\)` * `\(V_A = \text{col}(\mathbf{X}_A)\)` * `\(V_B = \text{col}(\mathbf{X}_B)\)` * `\(V_{AB} = \text{col}(\mathbf{X}_{AB})\)` Note if `\(\boldsymbol{v} \in V_0\)` then `\(\boldsymbol{v} \in V_A\)` as well as `\(\boldsymbol{v} \in V_B\)` and `\(\boldsymbol{v} \in V_{AB}\)`. So we define additional vector subspaces: * `\(W_A = V_A - V_0\)` (mean adj) * `\(W_B = V_B - V_0\)` * `\(W_{AB} = V_{AB} - V_0 - W_A - W_B\)` * `\(\hspace{17mm} = V_{AB} - V_A - V_B + V_0\)` * `\(W_A + W_B + W_{AB} = V_{AB} - V_0\)` ]] .column.bg-brand-gray[.split-40[ .row[.content[ Projection matrices for these subspaces without constraints are: .pull-left[ * `\(\mathbf{P}_{V_0} = \boldsymbol{1}_n(\boldsymbol{1}_n^\top\boldsymbol{1}_n)^{-1}\boldsymbol{1}_n^\top\)` * `\(\mathbf{P}_{V_A} = \mathbf{X}_A(\mathbf{X}_A^\top\mathbf{X}_A)^{-1}\mathbf{X}_A^\top\)` * `\(\mathbf{P}_{V_B} = \mathbf{X}_B(\mathbf{X}_B^\top\mathbf{X}_B)^{-1}\mathbf{X}_B^\top\)` * `\(\mathbf{P}_{V_{AB}} = \mathbf{X}_{AB}(\mathbf{X}_{AB}^\top\mathbf{X}_{AB})^{-1}\mathbf{X}_{AB}^\top\)` ] .pull-right[ * `\(\mathbf{P}_{W_A} = \mathbf{P}_{V_A} - \mathbf{P}_{V_0}\)` * `\(\mathbf{P}_{W_B} = \mathbf{P}_{V_B} - \mathbf{P}_{V_0}\)` * `\(\mathbf{P}_{W_{AB}} = \mathbf{P}_{V_{AB}} - \mathbf{P}_{W_A}-\mathbf{P}_{W_B} -\mathbf{P}_{V_0}\)` ] ]] .row[.content[ `\begin{eqnarray} \underbrace{(\mathbf{I}_n - \mathbf{P}_{V_0})}_{\text{Total}}\boldsymbol{Y}&=&\underbrace{\mathbf{P}_{W_A}}_{\text{Factor } A}\boldsymbol{Y}+ \underbrace{\mathbf{P}_{W_B}}_{\text{Factor } B}\boldsymbol{Y}+ \underbrace{\mathbf{P}_{W_{AB}}}_{A,B \text{ interaction}}\boldsymbol{Y} + \underbrace{(\mathbf{I}_n - \mathbf{P}_{V_{AB}})}_{\text{Residual}}\boldsymbol{Y}\\ &=& \left(\mathbf{P}_{W_{A}} + \mathbf{P}_{W_{B}} + \mathbf{P}_{W_{AB}}\right) \boldsymbol{Y} + (\mathbf{I}_n - \mathbf{P}_{V_{AB}})\boldsymbol{Y}\\ &=& \underbrace{(\mathbf{P}_{V_{AB}} - \mathbf{P}_{V_0})}_{\text{Treatment}}\boldsymbol{Y}+ \underbrace{(\mathbf{I}_n - \mathbf{P}_{V_{AB}})}_{\text{Residual}}\boldsymbol{Y} \end{eqnarray}` * assuming that `\(W_A \perp W_B\)` or `\(\mathbf{P}_{W_A}\mathbf{P}_{W_B} = \mathbf{0}\)` (check in Lecture 19). * `\(V_{AB} - V_0\)` gives the mean adjusted SS for .brand-blue[Group] in Model 2 and so it is the SS for .brand-blue[Temp*Brand] in Model 3 as it equals to `\(W_A + W_B + W_{AB}\)`. ]]]] ]] --- class: split-10 .row.bg-brand-blue.white[.content.vmiddle[ # Two-way ANOVA (interaction) model ]] .row[.split-two[ .column[.content[ ```r X0 <- model.matrix(~ 1, data=dat) XA <- model.matrix(~ -1 + Temp, data=dat) XB <- model.matrix(~ -1 + Brand, data=dat) XAB <- model.matrix(~ -1 + Temp:Brand, data=dat) y <- dat$TimetoFail PV0 <-X0 %*% solve(t(X0) %*% X0) %*% t(X0) PWA <- XA %*% solve(t(XA) %*% XA) %*% t(XA) - PV0 PWB <- XB %*% solve(t(XB) %*% XB) %*% t(XB) - PV0 PVAB <- XAB %*% solve(t(XAB) %*% XAB) %*% t(XAB) PWAB <- PVAB - PWA - PWB - PV0 PRes <- diag(nrow(dat)) - PVAB data.frame("Source"=c("A", "B", "A:B", "Res"), "df"=c(qr(PWA)$rank, qr(PWB)$rank, qr(PWAB)$rank, qr(PRes)$rank), "SS"=c(t(y) %*% PWA %*% y, t(y) %*% PWB %*% y, t(y) %*% PWAB %*% y, t(y) %*% PRes %*% y)) ``` ``` Source df SS 1 A 1 30363.0625 2 B 3 338.1875 3 A:B 3 1278.1875 4 Res 8 307.5000 ``` ```r anova(lm(TimetoFail ~ Brand*Temp, data=dat)) ``` ``` Analysis of Variance Table Response: TimetoFail Df Sum Sq Mean Sq F value Pr(>F) Brand 3 338.2 112.7 2.9328 0.099409 . Temp 1 30363.1 30363.1 789.9333 2.774e-09 *** Brand:Temp 3 1278.2 426.1 11.0846 0.003198 ** Residuals 8 307.5 38.4 --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 ``` ]] .column.bg-brand-gray[.content[ ```r M3 <- lm(TimetoFail ~ Temp + Brand + Temp:Brand, data=dat) # or *M3 <- lm(TimetoFail ~ Temp*Brand, data=dat) anova(M3) ``` ``` Analysis of Variance Table Response: TimetoFail Df Sum Sq Mean Sq F value Pr(>F) Temp 1 30363.1 30363.1 789.9333 2.774e-09 *** Brand 3 338.2 112.7 2.9328 0.099409 . Temp:Brand 3 1278.2 426.1 11.0846 0.003198 ** Residuals 8 307.5 38.4 --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 ``` * `\(\text{RSS}_{M3}=\text{RSS}_{M2}\)`. * `\(\text{SS}_{AB:M3}+\text{RSS}_{M3}=\text{RSS}_{M1}\)`. * `\(\text{SS}_{A:M3}+\text{SS}_{B:M3}+\text{SS}_{AB:M3}=\text{TSS}_{M2}\)`. * `\(\text{DF}_{Trt:M3}=1+3+3=7=\text{DF}_{Trt:M2}\)`. ]] ]] --- layout: false class: split-10 .row.bg-brand-blue.white[.content.vmiddle[ # Matrices X0, XA, XB, XAB ]] .row[.content[ ```r cbind(X0,XA,XB,XAB) ``` ``` (Intercept) TempCold TempRoom BrandA BrandB BrandC BrandD TempCold:BrandA TempRoom:BrandA TempCold:BrandB TempRoom:BrandB TempCold:BrandC TempRoom:BrandC TempCold:BrandD TempRoom:BrandD 1 1 0 1 1 0 0 0 0 1 0 0 0 0 0 0 2 1 0 1 0 1 0 0 0 0 0 1 0 0 0 0 3 1 0 1 0 0 1 0 0 0 0 0 0 1 0 0 4 1 0 1 0 0 0 1 0 0 0 0 0 0 0 1 5 1 1 0 1 0 0 0 1 0 0 0 0 0 0 0 6 1 1 0 0 1 0 0 0 0 1 0 0 0 0 0 7 1 1 0 0 0 1 0 0 0 0 0 1 0 0 0 8 1 1 0 0 0 0 1 0 0 0 0 0 0 1 0 9 1 0 1 1 0 0 0 0 1 0 0 0 0 0 0 10 1 0 1 0 1 0 0 0 0 0 1 0 0 0 0 11 1 0 1 0 0 1 0 0 0 0 0 0 1 0 0 12 1 0 1 0 0 0 1 0 0 0 0 0 0 0 1 13 1 1 0 1 0 0 0 1 0 0 0 0 0 0 0 14 1 1 0 0 1 0 0 0 0 1 0 0 0 0 0 15 1 1 0 0 0 1 0 0 0 0 0 1 0 0 0 16 1 1 0 0 0 0 1 0 0 0 0 0 0 1 0 ``` * The first column is .brand-blue[X0], the next two is .brand-blue[XA], the next 4 is .brand-blue[XB] and the rest is .brand-blue[XAB]. * Matrices .brand-blue[XA], .brand-blue[XB] and .brand-blue[XAB] have ranks equal to the numbers of columns individually. ]] --- class: split-10 .row.bg-brand-blue.white[.content.vmiddle[ # Matrices PWA, PWB ]] .row[.split-50[ .column[.content[ ```r PWA #1/8-1/16; 0-1/16 ``` ``` 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 1 0.0625 0.0625 0.0625 0.0625 -0.0625 -0.0625 -0.0625 -0.0625 0.0625 0.0625 0.0625 0.0625 -0.0625 -0.0625 -0.0625 -0.0625 2 0.0625 0.0625 0.0625 0.0625 -0.0625 -0.0625 -0.0625 -0.0625 0.0625 0.0625 0.0625 0.0625 -0.0625 -0.0625 -0.0625 -0.0625 3 0.0625 0.0625 0.0625 0.0625 -0.0625 -0.0625 -0.0625 -0.0625 0.0625 0.0625 0.0625 0.0625 -0.0625 -0.0625 -0.0625 -0.0625 4 0.0625 0.0625 0.0625 0.0625 -0.0625 -0.0625 -0.0625 -0.0625 0.0625 0.0625 0.0625 0.0625 -0.0625 -0.0625 -0.0625 -0.0625 5 -0.0625 -0.0625 -0.0625 -0.0625 0.0625 0.0625 0.0625 0.0625 -0.0625 -0.0625 -0.0625 -0.0625 0.0625 0.0625 0.0625 0.0625 6 -0.0625 -0.0625 -0.0625 -0.0625 0.0625 0.0625 0.0625 0.0625 -0.0625 -0.0625 -0.0625 -0.0625 0.0625 0.0625 0.0625 0.0625 7 -0.0625 -0.0625 -0.0625 -0.0625 0.0625 0.0625 0.0625 0.0625 -0.0625 -0.0625 -0.0625 -0.0625 0.0625 0.0625 0.0625 0.0625 8 -0.0625 -0.0625 -0.0625 -0.0625 0.0625 0.0625 0.0625 0.0625 -0.0625 -0.0625 -0.0625 -0.0625 0.0625 0.0625 0.0625 0.0625 9 0.0625 0.0625 0.0625 0.0625 -0.0625 -0.0625 -0.0625 -0.0625 0.0625 0.0625 0.0625 0.0625 -0.0625 -0.0625 -0.0625 -0.0625 10 0.0625 0.0625 0.0625 0.0625 -0.0625 -0.0625 -0.0625 -0.0625 0.0625 0.0625 0.0625 0.0625 -0.0625 -0.0625 -0.0625 -0.0625 11 0.0625 0.0625 0.0625 0.0625 -0.0625 -0.0625 -0.0625 -0.0625 0.0625 0.0625 0.0625 0.0625 -0.0625 -0.0625 -0.0625 -0.0625 12 0.0625 0.0625 0.0625 0.0625 -0.0625 -0.0625 -0.0625 -0.0625 0.0625 0.0625 0.0625 0.0625 -0.0625 -0.0625 -0.0625 -0.0625 13 -0.0625 -0.0625 -0.0625 -0.0625 0.0625 0.0625 0.0625 0.0625 -0.0625 -0.0625 -0.0625 -0.0625 0.0625 0.0625 0.0625 0.0625 14 -0.0625 -0.0625 -0.0625 -0.0625 0.0625 0.0625 0.0625 0.0625 -0.0625 -0.0625 -0.0625 -0.0625 0.0625 0.0625 0.0625 0.0625 15 -0.0625 -0.0625 -0.0625 -0.0625 0.0625 0.0625 0.0625 0.0625 -0.0625 -0.0625 -0.0625 -0.0625 0.0625 0.0625 0.0625 0.0625 16 -0.0625 -0.0625 -0.0625 -0.0625 0.0625 0.0625 0.0625 0.0625 -0.0625 -0.0625 -0.0625 -0.0625 0.0625 0.0625 0.0625 0.0625 ``` ]] .column.bg-brand-gray[.content[ ```r PWB #1/4-1/16; 0-1/16 ``` ``` 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 1 0.1875 -0.0625 -0.0625 -0.0625 0.1875 -0.0625 -0.0625 -0.0625 0.1875 -0.0625 -0.0625 -0.0625 0.1875 -0.0625 -0.0625 -0.0625 2 -0.0625 0.1875 -0.0625 -0.0625 -0.0625 0.1875 -0.0625 -0.0625 -0.0625 0.1875 -0.0625 -0.0625 -0.0625 0.1875 -0.0625 -0.0625 3 -0.0625 -0.0625 0.1875 -0.0625 -0.0625 -0.0625 0.1875 -0.0625 -0.0625 -0.0625 0.1875 -0.0625 -0.0625 -0.0625 0.1875 -0.0625 4 -0.0625 -0.0625 -0.0625 0.1875 -0.0625 -0.0625 -0.0625 0.1875 -0.0625 -0.0625 -0.0625 0.1875 -0.0625 -0.0625 -0.0625 0.1875 5 0.1875 -0.0625 -0.0625 -0.0625 0.1875 -0.0625 -0.0625 -0.0625 0.1875 -0.0625 -0.0625 -0.0625 0.1875 -0.0625 -0.0625 -0.0625 6 -0.0625 0.1875 -0.0625 -0.0625 -0.0625 0.1875 -0.0625 -0.0625 -0.0625 0.1875 -0.0625 -0.0625 -0.0625 0.1875 -0.0625 -0.0625 7 -0.0625 -0.0625 0.1875 -0.0625 -0.0625 -0.0625 0.1875 -0.0625 -0.0625 -0.0625 0.1875 -0.0625 -0.0625 -0.0625 0.1875 -0.0625 8 -0.0625 -0.0625 -0.0625 0.1875 -0.0625 -0.0625 -0.0625 0.1875 -0.0625 -0.0625 -0.0625 0.1875 -0.0625 -0.0625 -0.0625 0.1875 9 0.1875 -0.0625 -0.0625 -0.0625 0.1875 -0.0625 -0.0625 -0.0625 0.1875 -0.0625 -0.0625 -0.0625 0.1875 -0.0625 -0.0625 -0.0625 10 -0.0625 0.1875 -0.0625 -0.0625 -0.0625 0.1875 -0.0625 -0.0625 -0.0625 0.1875 -0.0625 -0.0625 -0.0625 0.1875 -0.0625 -0.0625 11 -0.0625 -0.0625 0.1875 -0.0625 -0.0625 -0.0625 0.1875 -0.0625 -0.0625 -0.0625 0.1875 -0.0625 -0.0625 -0.0625 0.1875 -0.0625 12 -0.0625 -0.0625 -0.0625 0.1875 -0.0625 -0.0625 -0.0625 0.1875 -0.0625 -0.0625 -0.0625 0.1875 -0.0625 -0.0625 -0.0625 0.1875 13 0.1875 -0.0625 -0.0625 -0.0625 0.1875 -0.0625 -0.0625 -0.0625 0.1875 -0.0625 -0.0625 -0.0625 0.1875 -0.0625 -0.0625 -0.0625 14 -0.0625 0.1875 -0.0625 -0.0625 -0.0625 0.1875 -0.0625 -0.0625 -0.0625 0.1875 -0.0625 -0.0625 -0.0625 0.1875 -0.0625 -0.0625 15 -0.0625 -0.0625 0.1875 -0.0625 -0.0625 -0.0625 0.1875 -0.0625 -0.0625 -0.0625 0.1875 -0.0625 -0.0625 -0.0625 0.1875 -0.0625 16 -0.0625 -0.0625 -0.0625 0.1875 -0.0625 -0.0625 -0.0625 0.1875 -0.0625 -0.0625 -0.0625 0.1875 -0.0625 -0.0625 -0.0625 0.1875 ``` ]] ]] --- layout: false class: split-10 .row.bg-brand-blue.white[.content.vmiddle[ # Two way ANOVA table for factorial design ]] .row[.content[ Source | df | SS | MS | F ------------- | ------------- | ------------- | ------------- | ------------- Factor A | `\(a-1\)` | `\(\text{SS}_A=\sum\limits_{i=1}^a \frac{Y_{i \bullet \bullet}^2}{br}-\frac{Y_{\bullet \bullet \bullet}^2}{n}=br\sum\limits_{i=1}^a \bar{Y}_{i \bullet \bullet}^2-n\bar{Y}_{\bullet \bullet \bullet}^2\)` | `\(\text{MS}_A=\frac{\text{SS}_A}{a-1}\)` | `\(F_A=\frac{\text{MS}_A}{\text{MSR}}\)` Factor B | `\(b-1\)` | `\(\text{SS}_B=\sum\limits_{j=1}^b \frac{Y_{\bullet j \bullet}^2}{ar}-\frac{Y_{\bullet \bullet \bullet}^2}{n}=ar\sum\limits_{j=1}^b \bar{Y}_{\bullet j \bullet}^2-n\bar{Y}_{\bullet \bullet \bullet}^2\)` | `\(\text{MS}_B=\frac{\text{SS}_B}{b-1}\)` | `\(F_B=\frac{\text{MS}_B}{\text{MSR}}\)` Interactions | `\((a-1)(b-1)\)` | `\(\text{SS}_{AB}=\sum\limits_{i=1}^a\sum\limits_{j=1}^b \frac{Y_{ij \bullet }^2}{n_{ij}}-\sum\limits_{i=1}^a\frac{Y_{i \bullet \bullet}^2}{br}-\sum\limits_{j=1}^b \frac{Y_{\bullet j \bullet}^2}{ar}+\frac{Y_{\bullet \bullet \bullet}^2}{n}\)` | `\(\text{MS}_{AB}=\frac{\text{SS}_{AB}}{(a-1)(b-1)}\)` | `\(F_{AB}=\frac{\text{MS}_{AB}}{\text{MSR}}\)` Residuals | `\(ab(r-1)\)` | `\(\text{SSR}=(r-1)\sum\limits_{i=1}^a\sum\limits_{j=1}^b s_{ij}^2\)` | `\(\text{MSR}=\frac{\text{SSR}}{ab(r-1)}\)` | **Total** | `\(abr-1\)` | `\(\text{SSTo}=\sum\limits_{i=1}^a\sum\limits_{j=1}^b\sum\limits_{k=1}^r Y_{ijk}^2 - n\bar{Y}^2\)` | | * `\(\bar{Y}_{i \bullet \bullet}=\frac{1}{br} \sum\limits_{j=1}^{b}\sum\limits_{k=1}^r Y_{ijk}\)`, `\(\bar{Y}_{\bullet j \bullet}=\frac{1}{ar} \sum\limits_{i=1}^{a} \sum\limits_{k=1}^r Y_{ijk}\)`, `\(\bar{Y}=\frac{1}{n} \sum\limits_{i=1}^{a} \sum\limits_{j=1}^{b}\sum\limits_{k=1}^r Y_{ijk}\)` and `\(s_{ij}^2=\frac{1}{r-1}(\sum\limits_{k=1}^r Y_{ijk}^2-r\bar{Y}_{ij \cdot}^2)\)`. * `\(\text{SS}_{AB} = r \sum\limits_{i=1}^a \sum\limits_{j=1}^b\bar Y_{ij \bullet}-br\sum\limits_{i=1}^a \bar{Y}_{i \bullet \bullet}^2-ar\sum\limits_{j=1}^b \bar{Y}_{\bullet j \bullet}^2 +n\bar{Y}_{\bullet \bullet \bullet}^2=\color{blue}{\text{TSS}}-\text{SS}_A - \text{SS}_B\)` since `\(W_{AB} = V_{AB} - V_A - V_B + V_0\)`. ]] --- layout: false class: split-10 .row.bg-brand-blue.white[.content.vmiddle[ # Decomposition of total variation in `\(Y\)` ]] .row[.content[ <img src="images/FTest3.png" width="80%" height="60%"> ]]