ANCOVA

class: split-70 with-border hide-slide-number bg-brand-red
background-image: url("images/USydLogo-black.svg")
background-size: 200px
background-position: 2% 90%

.column.white[.content[

# ANCOVA
## .black[STAT3022 Applied Linear Models Lecture 28]

### .black[2020/02/20]

]]

.column.bg-brand-charcoal[.content.white[
## Today

1. Analysis of covariance
2. Concomitant variable
3. Common slope model

]]

---

.row.bg-brand-blue.white[.content.vmiddle[
# Fun fact
]]
.row[.split-60[
.column[.content[

Eden and Fisher (1927) *Studies in crop variation: IV*. The experimental determination of the value of top dressings with cereals introduced a method to factor out the effects of conditions that are not part of the experimental design.

]]
.column.bg-brand-gray[.content[

<img src="images/Youngronaldfisher2.jpg" width="80%" height="80%"> 
]]
]]

---

.row.bg-brand-blue.white[.content.vmiddle[
# Example: Optimal fish meal for ducklings
]]
.row[.split-70[
.column[.content[
* Experiment was to determine the optimum amount of fish meal to be used as a supplement in rations for young growing ducklings.

* There were four groups of differing amount of supplementary rations (5%, 10%, 15% and 20%) with five ducks in each group.  
*	The `age` of the ducks (in weeks) and the `weight` of the ducks (in grams) are measured at the end of the experiment.

|Weight | Age | Weight | Age | Weight | Age | Weight | Age |
|---|---|---|---|---|---|---|---|
|46.1 | 1 | 67.8 | 2 | 52.2 | 1 | 100.0 | 2 |
|151.6| 5| 142.8| 4   | 190.8| 4| 249.2|4 |
|287.1| 8| 365.2| 7   | 367.2| 6| 525.0| 6|
|545.9| 11| 637.7| 10| 547.8| 7| 942.8|9 |
|634.1| 12| 855.9| 12| 1012.8| 11| 1210.2|12 |

]]
.column.bg-brand-gray[.content[

.bottom_abs.width100[
Castillo (1938) Determination of the optimum amount of fsh meal rations. Philippine Agriculturist 27 (18).
]

]]
]]

---

.row.bg-brand-blue.white[.content.vmiddle[
# Example: Fish meal one-way ANOVA
]]
.row[.split-60[
.column[.content[

```r
Meal<-factor(rep(c("5%","10%","15%","20%"),times=5),
 levels=c("5%", "10%", "15%", "20%"))
Age<-c(1,2,1,2,5,4,4,4,8,7,6,6,11,10,7,9,12,12,11,12)
Weight<-c( 46.1, 67.8,52.2,100,151.6,142.8,190.8,
 249.2,287.1,365.2,367.2,525,545.9,637.7,547.8,
 942.8,634.1,855.9,1012.8,1210.2)
dat <- data.frame(Age, Weight, Meal)
summary(aov(log(Weight) ~ Meal, data=dat))
```

```
              Df Sum Sq Mean Sq F value Pr(>F)
  Meal         3  0.989  0.3296   0.288  0.834
  Residuals   16 18.331  1.1457
```

* The meal effect is insignificant. Why?

]]
.column.bg-brand-gray[.content[

```r
dat
```

```
     Age Weight Meal
  1    1   46.1   5%
  2    2   67.8  10%
  3    1   52.2  15%
  4    2  100.0  20%
  5    5  151.6   5%
  6    4  142.8  10%
  7    4  190.8  15%
  8    4  249.2  20%
  9    8  287.1   5%
  10   7  365.2  10%
  11   6  367.2  15%
  12   6  525.0  20%
  13  11  545.9   5%
  14  10  637.7  10%
  15   7  547.8  15%
  16   9  942.8  20%
  17  12  634.1   5%
  18  12  855.9  10%
  19  11 1012.8  15%
  20  12 1210.2  20%
```

]]
]]

---

.row.bg-brand-blue.white[.content.vmiddle[
# Example: Fish meal plots
]]
.row[.split-50[
.column[.content[

```r
ggplot(dat, aes(Meal, log(Weight), col=Meal)) +  
  geom_boxplot() + theme_classic() + 
  guides(col=FALSE) + coord_flip()
```

<img src="lecture28_2020JC_files/figure-html/unnamed-chunk-5-1.svg" style="display: block; margin: auto;" />
* No clear difference between 4 boxplots.

]]
.column.bg-brand-gray[.content[

```r
ggplot(dat, aes(Age, log(Weight), col=Meal)) + 
  geom_point(size=3) + geom_line() + 
  theme_classic()
```

* Consistent order in Meal across Age.

]]
]]

---

.row.bg-brand-blue.white[.content.vmiddle[
# Analysis of Covariance
]]
.row[.split-40[
.column[.content[

* Analysis of covariance (ANCOVA) incorporates regression with (treatment) groups into the analysis of designed experiments to adjust treatment comparisons for concomitant variables}.  
* .brand-blue[Concomitant variables] ( `$x_1, x_2, \dots, x_k$` ) are quantitative extraneous variables that are thought to influence the response that was not taken into account in the design of experiment. 
* It is assumed that the concomitant variables are .brand-blue[not affected] by the treatments.	
* That is, the two factors of meal and age are .brand-blue[independent].
]]
.column.bg-brand-gray[.content[

* The model for one treatment factor and one concomitant variable is given as

`$Y_{ij} = \mu + \beta x_{ij} + \alpha_i + \epsilon_{ij},\hspace{3mm} \epsilon\sim NID(0,\sigma^2),\hspace{2mm} i = 1,\ldots,t,$` 
`$j= 1,\dots, n_i.$`
* The ANOVA is effectively carried out .brand-blue[on the residuals after regression] on the true but unknown `$\beta$`'s,

`$$\underbrace{Y_{ij}-\beta_1 x_{1ij}-\ldots -\beta_k x_{kij}}_{\approx R_{ij} \text{ from regression on conc. var.}}=\mu_{i}+\epsilon_{ij},\hspace{3mm} i=1,\ldots,t.$$`

* This is a regression model with .brand-blue[different intercepts] for the treatment groups but .brand-blue[common slope].

`$Y_{ij}=\underbrace{(\mu+\alpha_i)}_{\beta_{0,i}}+\beta x_{ij} + \epsilon_{ij},\hspace{2mm}\epsilon\sim NID(0,\sigma^2),\hspace{2mm} i=1,\ldots,t,$`
`$j= 1,\dots,n_i.$`

]]
]]

---

.row.bg-brand-blue.white[.content.vmiddle[
#  Example: Fish meal ANCOVA (not orthogonal)
]]
.row[.content[

```r
summary(M1 <- aov(log(Weight) ~ Age + Meal, data = dat)) #Meal depends on age; Meal is significant now.
```

```
              Df Sum Sq Mean Sq F value   Pr(>F)    
  Age          1 16.458  16.458  258.89 7.18e-11 ***
  Meal         3  1.909   0.636   10.01 0.000716 ***
  Residuals   15  0.954   0.064                     
  ---
  Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
```

```r
summary(aov(log(Weight) ~ Meal + Age, data = dat)) #Age depends on Mean. Make sense? Which one is correct?
```

```
              Df Sum Sq Mean Sq F value   Pr(>F)    
  Meal         3  0.989   0.330   5.184   0.0117 *  
  Age          1 17.378  17.378 273.365 4.88e-11 ***
  Residuals   15  0.954   0.064                     
  ---
  Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
```

```r
round(lm(M1)$coeff,3) #So age comes first.
```

```
  (Intercept)         Age     Meal10%     Meal15%     Meal20% 
        3.566       0.255       0.306       0.622       0.818
```

]]

---

.row.bg-brand-blue.white[.content.vmiddle[
# Example: Fish meal - common slope model
]]
.row[.split-50[
.column[.content[

* Meal becomes significant after allowing for the variability due to age within each meal group.

`$$Y_{ij} = \mu + \beta x_{ij} + \alpha_i + \epsilon_{ij},\hspace{3mm} \epsilon\sim NID(0,\sigma^2),$$`
`$i = 1,\ldots,t,\;\; j= 1,\dots, n_i.$`
So for each meal group,
$$
\widehat{Y}=
`\begin{cases}
(3.566 + 0.000) + 0.255x, & \text{if in Meal 5% group} \\
(3.566 + 0.306) + 0.255x, & \text{if in Meal 10% group}\\
(3.566 + 0.622) + 0.255x, & \text{if in Meal 15% group}\\
(3.566 + 0.818) + 0.255x, & \text{if in Meal 20% group}\\
\end{cases}`
$$
### Thinking Question
How is this different to ftting a simple linear regression model to each group?

Can we have different intercept to each group as well?

]]
.column.bg-brand-gray[.content[

```r
gg_color_hue <- function(n) {
hues = seq(15,375,length = n + 1)
hcl(h = hues, l = 65, c = 100)[1:n]}
line.df <- data.frame(intercept=lm(M1)$coeff[1] + 
 c(0, lm(M1)$coeff[3:5]), slope=lm(M1)$coef[2], 
 Meal=c("5%", "10%", "15%", "20%"))
ggplot(dat, aes(Age, log(Weight), col=Meal)) + 
 geom_point(size=3) + theme_classic() +
 geom_abline(slope=line.df$slope, 
 intercept=line.df$intercept, col=gg_color_hue(4))
```

]]
]]

---

.row.bg-brand-blue.white[.content.vmiddle[
# ANCOVA: Treatment contrast
]]
.row[.content[
* Consider a contrast `$\sum_{i=1}^t c_i \alpha_i,\hspace{3mm}  \sum_{i=1}^t c_i = 0.$`
* In one-way ANOVA, it is estimated by

`$$\sum_{i=1}^t c_i\hat{\alpha}_i=\sum_{i=1}^t c_i \Big(\bar{Y}_{i\bullet}-\hat{\beta} \bar{x}_{i\bullet}\Big),$$`
where `$\hat{\beta}$` is the estimate for the coefficient of `$x$`.

* Then the variance of this contrast estimate is

`$$\text{var}\left(\sum_{i=1}^t c_i\Big(\bar{Y}_{i\bullet}- \hat{\beta}\bar{x}_{i\bullet}\Big)\right)
=\sigma^2\bigg( \sum_{i=1}^t\frac{c_i^2}{n_i}+\frac{(\sum_{i=1}^t c_i \bar{x}_{i\bullet})^2}{S_{0xx}}\bigg).$$`

* `$S_{0xx}$` is the residual SS if a one-way model is fitted to the `$x$`-data, ie taking Age as `$Y$` variable and Meal as `$x$` variable. As `$x$` is the outcome variable, this is Total `$\text{S}_{xx}$` minus SS for meal.

]]

---

.row.bg-brand-blue.white[.content.vmiddle[
# ANCOVA: Treatment contrast
]]
.row[.split-50[
.column[.content[

* Find a 95% CI for `$\alpha_3 - \alpha_2$` with `$c_1=0,c_2=-1$`, `$c_3=1,c_4=0$` and `$n_i\equiv 5$` and `$\hat{\sigma}^2 = 0.064$` (P.7).

```r
round(c(tapply(Age, Meal, mean),
        tapply(log(Weight),Meal,mean)),3)
```

```
     5%   10%   15%   20%    5%   10%   15%   20% 
  7.400 7.000 5.800 6.600 5.453 5.658 5.668 6.067
```
* From one-way ANOVA on the `$x$`-values, `$S_{0xx} = 267.2.$`

```r
anova(lm(Age ~ Meal))
```

```
  Analysis of Variance Table
  
  Response: Age
            Df Sum Sq Mean Sq F value Pr(>F)
  Meal       3    7.0  2.3333  0.1397 0.9347
  Residuals 16  267.2 16.7000
```

]]
.column.bg-brand-gray[.content[

```r
sig2=sigma(M1)^2; S0xx=sum(lm(Age ~ Meal)$res^2) 
round(c(sig2,S0xx),4)
```

```
  [1]   0.0636 267.2000
```

`$\sum_{i=1}^t c_i \Big(\bar{Y}_{i\bullet}-\hat{\beta} \bar{x}_{i\bullet}\Big)$` 
`$=(5.668-0.255\times 5.8)-(5.658-0.255\times 7)=0.316$`

* So standard error for `$\hat\alpha_3 - \hat\alpha_2$` is `$\sqrt{ \sigma^2 \bigg[ \sum_{i=1}^t\frac{c_i^2}{n_i} + \frac{(\sum_{i=1}^t c_i \bar{x}_{i\bullet})^2}{S_{0xx}} \bigg] }$`
`$=\sqrt{0.0636\bigg[ 2(\frac{1}{5})+\frac{(5.8-7)^2}{267.2}\bigg]}=0.161$`

So the 95% CI for `$\alpha_3 - \alpha_2$` is
`$0.316 \pm t_{15}(0.975) \times 0.161= (-0.026114, \ 0.658219)$`

and so the difference is insignificant.

]]
]]

---

.row.bg-brand-blue.white[.content.vmiddle[
# Example: Linear ANCOVA model
]]
.row[.split-50[
.column[.content[
* For more complex models it is better to proceed via the linear model theory.
* Construct the design matrix `$\boldsymbol{X}$` with `$\mathbf{1}$` in the first column, 
	the `$x_{ij}$` values in the second (if there are `$k$` covariates use columns `$2,\ldots,k+1$`) and the remaining columns are 0-1 factor codes (treatment constraint incorporated).
*  `$\boldsymbol{\theta}=(\mu,\beta,\alpha_2,\dots,\alpha_t)$`
* `$\hat{\boldsymbol{\theta}} = (\boldsymbol{X}^\top \boldsymbol{X})^{-1}\boldsymbol{X}^\top \boldsymbol{Y}$`
* Write the contrasts of interest as `$\mathbf{c}^\top\boldsymbol{\theta}$`.
* The estimator is `$\mathbf{c}^\top\hat{\boldsymbol{\theta}}$` and its variance is `$\sigma^2 \mathbf{c}^\top(\boldsymbol{X}^\top \boldsymbol{X})^{-1}\mathbf{c}$`.
* These results can be used to construct CIs and tests. How?

]]
.column.bg-brand-gray[.content[
### LS estimators

```r
x.t <- model.matrix(~Meal - 1) #dummy var for factor 
X <- cbind(1, Age, x.t[,-1])
b <- solve( t(X) %*% X) %*% t(X) %*% log(Weight)
round(b[,1],3)
```

```
              Age Meal10% Meal15% Meal20% 
    3.566   0.255   0.306   0.622   0.818
```

```r
round(coef(M1)[1:4],3); round(coef(M1)[5],3)
```

```
  (Intercept)         Age     Meal10%     Meal15% 
        3.566       0.255       0.306       0.622
```

```
  Meal20% 
    0.818
```

]]
]]

---

.row.bg-brand-blue.white[.content.vmiddle[
# Example: design matrix and LS estimate
]]
.row[.split-40[
.column[.content[

```
       Age Meal10% Meal15% Meal20%
  1  1   1       0       0       0
  2  1   2       1       0       0
  3  1   1       0       1       0
  4  1   2       0       0       1
  5  1   5       0       0       0
  6  1   4       1       0       0
  7  1   4       0       1       0
  8  1   4       0       0       1
  9  1   8       0       0       0
  10 1   7       1       0       0
  11 1   6       0       1       0
  12 1   6       0       0       1
  13 1  11       0       0       0
  14 1  10       1       0       0
  15 1   7       0       1       0
  16 1   9       0       0       1
  17 1  12       0       0       0
  18 1  12       1       0       0
  19 1  11       0       1       0
  20 1  12       0       0       1
```

]]
.column.bg-brand-gray[.content[
### Covariance of LS estimators

```r
res <- log(Weight) - X %*% b
RSS <- t(res) %*% res; RSS <- RSS[1,1]
covLS <- RSS/15 * solve(t(X)%*%X)
round(covLS, 4)
```

```
                      Age Meal10% Meal15% Meal20%
           0.0257 -0.0018 -0.0134 -0.0155 -0.0141
  Age     -0.0018  0.0002  0.0001  0.0004  0.0002
  Meal10% -0.0134  0.0001  0.0255  0.0129  0.0128
  Meal15% -0.0155  0.0004  0.0129  0.0260  0.0130
  Meal20% -0.0141  0.0002  0.0128  0.0130  0.0256
```

```r
round(vcov(M1),4)
```

```
              (Intercept)     Age Meal10% Meal15% Meal20%
  (Intercept)      0.0257 -0.0018 -0.0134 -0.0155 -0.0141
  Age             -0.0018  0.0002  0.0001  0.0004  0.0002
  Meal10%         -0.0134  0.0001  0.0255  0.0129  0.0128
  Meal15%         -0.0155  0.0004  0.0129  0.0260  0.0130
  Meal20%         -0.0141  0.0002  0.0128  0.0130  0.0256
```

]]
]]

---

.row.bg-brand-blue.white[.content.vmiddle[
# Example: Predict weight and pairwise constrast
]]
.row[.split-50[
.column[.content[

### Predict weight at Age=10 and Meal=10%

```r
predict.at=data.frame(Age=10,Meal="10%")
pred0=predict(M1,newdata=predict.at,
        interval="prediction",level=0.95)
exp(pred0) #take exp for weight
```

```
         fit      lwr      upr
  1 615.7019 338.9529 1118.412
```

```r
x0=c(1,10,1,0,0); y0=sum(x0*b) #check calculation
sig2=sigma(M1)^2
sey0=sqrt(t(x0)%*%covLS%*%x0+sig2)
round(exp(c(y0,y0-qt(0.975,15)*sey0,
            y0+qt(0.975,15)*sey0)),4)
```

```
  [1]  615.7019  338.9529 1118.4115
```

]]
.column.bg-brand-gray[.content[
###  Pairwise CI
Construct a 95% CI to compare the average effect of the 10% and 15% fish meal diets using the transformed data.

```r
c <- matrix(c(0,0,-1, 1,0))
est <- t(c)%*%b
se <- sqrt(t(c)%*%covLS%*%c)
round(c(est,se),4)
```

```
  [1] 0.3161 0.1605
```

```r
t <- qt(0.975, 15)
round(c(est - t * se, est + t * se),6) #agree
```

```
  [1] -0.026114  0.658219
```

]]
]]

---

.row.bg-brand-blue.white[.content.vmiddle[
# Example: different slopes for each Meal group?
]]
.row[.split-50[
.column[.content[

```r
M2 <- lm(log(Weight) ~ Age*Meal, data = dat)
round(coef(M2)[1:4],4); round(coef(M2)[5:8],4)
```

```
  (Intercept)         Age     Meal10%     Meal15% 
       3.7039      0.2364      0.1866      0.2389
```

```
      Meal20% Age:Meal10% Age:Meal15% Age:Meal20% 
       0.7464      0.0160      0.0610      0.0085
```

```r
b=coef(M2)
```

So for each meal group,
$$
\widehat{Y}=
`\begin{cases}
(3.704 + 0.000) + (0.236+0.000)x, & \text{if 5% meal} \\
(3.704 + 0.187) + (0.236+0.016)x, & \text{if 10% meal}\\
(3.704 + 0.239) + (0.236+0.061)x, & \text{if 15% meal}\\
(3.704 + 0.746) + (0.236+0.009)x, & \text{if 20% meal}\\
\end{cases}`
$$
]]
.column.bg-brand-gray[.content[

```r
gg_color_hue <- function(n) {
hues = seq(15,375,length = n + 1)
hcl(h = hues, l = 65, c = 100)[1:n]}
line.df <- data.frame(intercept=coef(M2)[1]+ 
 c(0,coef(M2)[3:5]), slope=coef(M2)[2]+c(0,coef(M2)[6:8]), 
 Meal=c("5%", "10%", "15%", "20%"))
ggplot(dat, aes(Age, log(Weight), col=Meal)) + 
 geom_point(size=3) + theme_classic() +
 geom_abline(slope=line.df$slope, 
 intercept=line.df$intercept, col=gg_color_hue(4))
```

]]
]]

---

.row.bg-brand-blue.white[.content.vmiddle[
# Example: design matrix with interaction terms
]]
.row[.split-60[
.column[.content[

```r
X
```

```
       Age Meal10% Meal15% Meal20% Meal10% Meal15% Meal20%
  1  1   1       0       0       0       0       0       0
  2  1   2       1       0       0       2       0       0
  3  1   1       0       1       0       0       1       0
  4  1   2       0       0       1       0       0       2
  5  1   5       0       0       0       0       0       0
  6  1   4       1       0       0       4       0       0
  7  1   4       0       1       0       0       4       0
  8  1   4       0       0       1       0       0       4
  9  1   8       0       0       0       0       0       0
  10 1   7       1       0       0       7       0       0
  11 1   6       0       1       0       0       6       0
  12 1   6       0       0       1       0       0       6
  13 1  11       0       0       0       0       0       0
  14 1  10       1       0       0      10       0       0
  15 1   7       0       1       0       0       7       0
  16 1   9       0       0       1       0       0       9
  17 1  12       0       0       0       0       0       0
  18 1  12       1       0       0      12       0       0
  19 1  11       0       1       0       0      11       0
  20 1  12       0       0       1       0       0      12
```

]]
.column.bg-brand-gray[.content[

```r
round(anova(M2),2)
```

```
 Analysis of Variance Table
 
 Response: log(Weight)
 Df Sum Sq Mean Sq F value Pr(>F) 
 Age 1 16.46 16.46 240.82 <2e-16 ***
 Meal 3 1.91 0.64 9.31 <2e-16 ***
 Age:Meal 3 0.13 0.04 0.65 0.6 
 Residuals 12 0.82 0.07 
 ---
 Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
```
* It is clear that the interaction term is not significant. 
* There is no significant difference in slope.

]]
]]

---

.row.bg-brand-blue.white[.content.vmiddle[
# A note on ANCOVA 
]]
.row[.content[
* ANCOVA is sensible only if common slope model is a good fit to the data and if the covariate values are unrelated to the groups.

]]