class: split-70 with-border hide-slide-number bg-brand-red
background-image: url("images/USydLogo-black.svg")
background-size: 200px
background-position: 2% 90%

.column.white[.content[
<br><br><br>
# Nested Factors
## .black[STAT3022 Applied Linear Models Lecture 26]
<br><br><br>
### .black[2020/02/20]
]]
.column.bg-brand-charcoal[.content.white[
## Today

1. Nested factors
2. Factors in R
3. Symbolic model formulae
]]

---
class: split-10

.row.bg-brand-blue.white[.content.vmiddle[
# Statistical packages
]]
.row[.split-60[
.column[.content[
* John Chambers was the primary developer of the statistical programming language S in 1976.
* R, based on S, was developed by Ross Ihaka and Robert Gentleman at the University of Auckland, New Zealand in 1993.
* Wilkinson & Rogers (1973) describe the symbolic model formulae on which the major statistical programming languages (S-Plus, R, Genstat, etc.) are now based.

.bottom_abs.width100[
Wilkinson, G. N. and Rogers, C. E. (1973) Symbolic Description of Factorial Models for Analysis of Variance. *Journal of the Royal Statistical Society: Series C (Applied Statistics)* **22** (3) 392-399
]
]]
.column.bg-brand-gray[.content[
<img src="images/PackageDevelopers.png" width="90%" height="90%">
]]
]]

---
class: split-10

.row.bg-brand-blue.white[.content.vmiddle[
# Example - Scientists in labs
]]
.row[.split-70[
.column[.content[
* An experiment measures the concentration of a certain chemical in a compound.
* Primary aim of the experiment: .brand-blue[Do different laboratories provide systematically different results?]
* Here, each scientist works in only one lab, so scientist is said to be .brand-blue[nested] within lab.
* Nesting means the model must .brand-blue[not include the main effect of the nested factor], only the effect of the nested factor .brand-blue[within] the factor that nests it. Here scientist cannot enter both as a main effect and as an effect nested within lab.
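
The nesting idea above can be sketched in R. This is only an illustration under assumed labels: the labs `L1` to `L3`, the scientists `s1` and `s2`, and the names `design`, `scientist.id`, `tab` and `nested_terms` are hypothetical and not part of the experiment.

```r
# Hypothetical layout (not data from the experiment): 3 labs,
# 2 scientists per lab, 2 measurements per scientist.
design <- expand.grid(rep = 1:2,
                      scientist = c("s1", "s2"),
                      lab = c("L1", "L2", "L3"))

# Give every scientist a unique label: "s1" in L1 and "s1" in L2
# are different people.
design$scientist.id <- interaction(design$lab, design$scientist, drop = TRUE)

# Each scientist.id row has counts in exactly one lab column: nested.
tab <- xtabs(~ scientist.id + lab, data = design)
tab

# The nesting formula keeps lab and lab:scientist, with no scientist main effect.
nested_terms <- attr(terms(~ lab / scientist), "term.labels")
nested_terms
```

Relabelling with `interaction()` is one possible way to make the nesting visible; the cross-tabulation is then zero everywhere except in one lab per scientist.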
]]
.column.bg-brand-gray[.content[
<img src="images/nest.jpg" width="100%" height="100%">
]]
]]

---
class: split-10

.row.bg-brand-blue.white[.content.vmiddle[
# Modelling with nested factors
]]
.row[.split-50[
.column[.content[
* A two-factor ANOVA model with factor B nested within factor A is written as
`$$Y_{ijk} = \mu + \alpha_i + (\alpha\beta)_{ij} + \epsilon_{ijk},\quad \epsilon_{ijk} \sim NID(0,\sigma^2)$$`
where `\(i=1,\ldots,\color{red}{a}\)`, `\(j=1,\ldots,\color{blue}{b}\)` and `\(k=1,\ldots,n_{ij}=r\)` for simplicity.
* As usual, the parameters are constrained, say by setting the parameter of the first (baseline) level to zero for every factor.
* The model above contains the main effect for A and an interaction term A `\(\times\)` B, but no main effect for B. With the grand mean taken as implicit, it can be written symbolically as
`$$\text{A + A:B}$$`
]]
.column.bg-brand-gray[.content[
### Labelling factor in R

```r
data <- c(1,2,2,3,1,2,3,3,1,2,3,3,1)
fdata <- factor(data)
fdata
```

```
 [1] 1 2 2 3 1 2 3 3 1 2 3 3 1
Levels: 1 2 3
```

```r
rdata <- factor(data,labels=c("I","II","III"))
rdata
```

```
 [1] I   II  II  III I   II  III III I   II  III III I  
Levels: I II III
```
]]
]]

---
class: split-10

.row.bg-brand-blue.white[.content.vmiddle[
# Labelling Factor in R
]]
.row[.split-50[
.column[.content[
```r
mons <- factor(c("March", "April", "January", "November",
  "January", "September", "October", "September",
  "November", "August", "January", "November",
  "November", "February", "May", "August", "July",
  "December", "August", "August", "September",
  "November", "February", "April"))
```

The levels are ordered alphabetically by default, so

```r
table(mons)
```

```
mons
    April    August  December  February   January      July     March       May 
        2         4         1         2         3         1         1         1 
 November   October September 
        5         1         3 
```
]]
.column.bg-brand-gray[.content[
### Changing Factor Order in R

```r
mons <- factor(mons, levels=c("January", "February",
  "March", "April", "May", "June", "July", "August",
  "September", "October", "November", "December"))
```

The levels now follow the specified order; `June`, which has no observations, appears with a count of 0.

```r
table(mons)
```

```
mons
  January  February     March     April       May      June      July    August 
        3         2         1         2         1         0         1         4 
September   October  November  December 
        3         1         5         1 
```
]]
]]

---
class: split-10

.row.bg-brand-blue.white[.content.vmiddle[
# Numeric Factor in R
]]
.row[.split-50[
.column[.content[
```r
fert <- factor(c(10,20,20,50,10,20,10,50,20))
mean(fert)
```

```
Warning in mean.default(fert): argument is not numeric or logical: returning NA
```

```
[1] NA
```

```r
mean(as.numeric(fert))
```

```
[1] 1.888889
```

`as.numeric()` returns the internal integer codes of the factor, not the original values.
]]
.column.bg-brand-gray[.content[
```r
mean(as.numeric(levels(fert)[fert]))
```

```
[1] 23.33333
```

```r
mean(as.numeric(as.character(fert)))
```

```
[1] 23.33333
```
]]
]]

---
class: split-10

.row.bg-brand-blue.white[.content.vmiddle[
# Symbolic model formulae
]]
.row[.split-50[
.column[.content[
Recall `\(Y_{ijk} = \mu + \alpha_i + (\alpha\beta)_{ij} + \epsilon_{ijk}\)` can be written symbolically as A + A:B

### Operators:

* .brand-blue[interaction] operator: e.g. A:B
* .brand-blue[crossing] operator: e.g.
  * A `\(\ast\)` B = A + B + A:B
  * (A + B) `\(\ast\)` C = A + B + C + (A + B):C <br> = A + B + C + A:C + B:C
* .brand-blue[nesting] operator, e.g. A/B = A + A:B, suppressing the marginal term B, or more generally, <br> L/M = L + FAC(L):M <br> where FAC(L) is the interaction of all factors in L. E.g.
  * (A `\(\ast\)` B) / C = A + B + A:B + A:B:C
  * (A + B) / C = A + B + A:B:C
]]
.column.bg-brand-gray[.content[
* .brand-blue[deletion] operator: delete specified terms from the preceding term, e.g.
  * A `\(\ast\)` B `\(\ast\)` C `\(-\)` A : B : C <br> = A + B + C + A : B + A : C + B : C
  * A `\(\ast\)` B `\(\ast\)` C `\(-\)` A `\(\ast\)` B <br> = C + A : C + B : C + A : B : C
* Simplification, e.g.
  * A : B : A = A : B
  * A + B + A = A + B.
* The interaction operator is .brand-blue[distributive].
E.g., <br> (A + B) : C = A : C + B : C
]]
]]

---
class: split-10

.row.bg-brand-blue.white[.content.vmiddle[
# Example - Pollen data
]]
.row[.split-70[
.column[.content[
* A Canadian botanist was interested in the abundance of pine pollen in cores taken from the bottom of several bogs in northern Alberta.
* The botanist sampled pollen at .brand-blue[three depths]: shallow, medium and deep (0.5, 2 and 3 meter depths respectively).
* She took .brand-blue[two samples of peat] at each of these depths, and prepared .brand-blue[2 slides] from each of the six samples .brand-blue[for microscopic examination].
* The .brand-blue[number of pollen grains] from the microscope slides is the response.

**Sample**   |**Shallow**|**Sum**|**Medium**|**Sum**|**Deep**|**Sum**|**Total**|
-------------|-----------|-------|----------|-------|--------|-------|---------|
**Sample 1** | 12,14     | 26    | 16,12    | 28    | 21,29  | 50    | 104
**Sample 2** | 10,7      | 17    | 10,19    | 29    | 33,30  | 63    | 109
**Total**    |           | 43    |          | 57    |        | 113   | 213
]]
.column.bg-brand-gray[.content[
<img src="images/pollen.jpg" width="70%" height="70%">
<img src="images/pollen2.png" width="70%" height="70%">
]]
]]

---
class: split-10

.row.bg-brand-blue.white[.content.vmiddle[
# Example - Pollen data: Nested ANOVA tables
]]
.row[.split-50[
.column[.content[
```r
dat <- read.table(file = "data/pollen.txt", header = TRUE)
dat
```

```
   Sample.id   Depth Sample Count
1        no1 shallow     s1    12
2        no1 shallow     s1    14
3        no2 shallow     s2    10
4        no2 shallow     s2     7
5        no3  medium     s1    16
6        no3  medium     s1    12
7        no4  medium     s2    10
8        no4  medium     s2    19
9        no5    deep     s1    21
10       no5    deep     s1    29
11       no6    deep     s2    33
12       no6    deep     s2    30
```

* Each .brand-blue[Sample.id] group is one .brand-blue[Depth] by .brand-blue[Sample] combination.
* We will see that the SS for Sample.id contains the SS for Depth (as well as for Sample).
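
A quick check of the grouping described above, as a sketch: the data frame is typed in from the printed table so it runs without `data/pollen.txt`, and the names `nest_tab`, `cell` and `cell_tab` are introduced here for illustration only.

```r
# Pollen data typed in from the printed table above.
dat <- data.frame(
  Sample.id = factor(rep(paste0("no", 1:6), each = 2)),
  Depth     = factor(rep(c("shallow", "medium", "deep"), each = 4)),
  Sample    = factor(rep(rep(c("s1", "s2"), each = 2), times = 3)),
  Count     = c(12, 14, 10, 7, 16, 12, 10, 19, 21, 29, 33, 30)
)

# Each Sample.id falls in exactly one Depth: Sample.id is nested in Depth.
nest_tab <- table(dat$Sample.id, dat$Depth)
nest_tab

# Sample.id is the same grouping as the six Depth-by-Sample cells.
cell <- interaction(dat$Depth, dat$Sample, drop = TRUE)
cell_tab <- table(dat$Sample.id, cell)
cell_tab
```

Every row and every column of `cell_tab` has a single non-zero entry, which is why the two labellings lead to the same fitted groups.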
]]
.column.bg-brand-gray[.content[
```r
M1 <- lm(Count~Depth/Sample, data=dat)
anova(M1)
```

```
Analysis of Variance Table

Response: Count
             Df Sum Sq Mean Sq F value  Pr(>F)   
Depth         2 686.00  343.00 22.4918 0.00163 **
Depth:Sample  3  62.75   20.92  1.3716 0.33847   
Residuals     6  91.50   15.25                   
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
```

* To see why the nested term has 3 df, look at its parameters: one `s2` versus `s1` contrast within each of the three depths.
* As the main effect of .brand-blue[Sample] is suppressed, `\(\text{SS}_{\text{Sample}}\)` is pooled into the nested term: `\(\text{SS}^{\text{pooled}}_{\text{Depth:Sample}}=\text{SS}_{\text{Sample}}+\text{SS}_{\text{Depth:Sample}}\)`.
* `\(\text{SS}_{\text{Depth}}+\text{SS}^{\text{pooled}}_{\text{Depth:Sample}}=\text{SS}_{\text{Sample.id}}\)`, which is the treatment SS (TSS = `\(\text{SS}_{\text{A}}+\text{SS}_{\text{B}}+\text{SS}_{\text{A:B}}\)`) for the 6 groups in a one-way ANOVA.
]]
]]

---
class: split-10

.row.bg-brand-blue.white[.content.vmiddle[
# Calculation of SS
]]
.row[.split-50[
.column[.content[
```r
Y=dat$Count; d=dat$Depth
s=dat$Sample; si=dat$Sample.id
syd=tapply(Y,d,sum)       #totals by depth
sys=tapply(Y,s,sum)       #totals by sample
sysi=tapply(Y,si,sum)     #totals by sample.id
sy=sum(Y); n=length(Y)
nd=n/length(unique(d))    #obs per depth
ns=n/length(unique(s))    #obs per sample
nsi=n/length(unique(si))  #obs per sample.id
c(n,nd,ns,nsi)
```

```
[1] 12  4  6  2
```

```r
c(sy,syd,sys)
```

```
           deep  medium shallow      s1      s2 
    213     113      57      43     104     109 
```

```r
sysi
```

```
no1 no2 no3 no4 no5 no6 
 26  17  28  29  50  63 
```
]]
.column.bg-brand-gray[.content[
```r
syy=sum(Y^2)-sy^2/n          #SS total
ssd=sum(syd^2)/nd-sy^2/n     #SS depth (A)
sss=sum(sys^2)/ns-sy^2/n     #SS sample (B)
sssi=sum(sysi^2)/nsi-sy^2/n  #SS sam.id, TSS (A+B+A:B)
ssi=sssi-ssd                 #SS (pooled) depth:sample (B+A:B)
ssr=syy-sssi                 #RSS
round(c(syy,ssd,sss,sssi,ssi,ssr),2)
```

```
[1] 840.25 686.00   2.08 748.75  62.75  91.50
```

`\begin{eqnarray*}
\text{SS}_{\text{Total}}&=&12^2+14^2+\dots+30^2-\frac{213^2}{12}=840.25\\
\text{SS}_{\text{Depth}}&=&\frac{113^2+57^2+43^2}{4}-\frac{213^2}{12}=686\\
\text{SS}_{\text{Sample}}&=&\frac{104^2+109^2}{6}-\frac{213^2}{12}=2.083\\
\text{SS}_{\text{SamID}}&=&\frac{26^2+\dots+63^2}{2}-\frac{213^2}{12}=748.75 \ \text{(TSS; gps)}\\
\text{SS}^{\text{pooled}}_{\text{Dep:Sam}}&=&\text{SS}_{\text{SamID}}-\text{SS}_{\text{Depth}}=748.75-686=62.75\\
\text{RSS}&=&\text{SS}_{\text{Total}}-\text{SS}_{\text{SamID}}=840.25-748.75=91.5
\end{eqnarray*}`
]]
]]

---
class: split-10

.row.bg-brand-blue.white[.content.vmiddle[
# Pollen data: nested ANOVA parameter estimates
]]
.row[.split-60[
.column[.content[
```r
summary(M1)
```

```
Call:
lm(formula = Count ~ Depth/Sample, data = dat)

Residuals:
   Min     1Q Median     3Q    Max 
-4.500 -1.625  0.000  1.625  4.500 

Coefficients:
                      Estimate Std. Error t value Pr(>|t|)    
(Intercept)             25.000      2.761   9.054 0.000102 ***
Depthmedium            -11.000      3.905  -2.817 0.030482 *  
Depthshallow           -12.000      3.905  -3.073 0.021861 *  
Depthdeep:Samples2       6.500      3.905   1.664 0.147072    
Depthmedium:Samples2     0.500      3.905   0.128 0.902303    
Depthshallow:Samples2   -4.500      3.905  -1.152 0.293020    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 3.905 on 6 degrees of freedom
Multiple R-squared:  0.8911,    Adjusted R-squared:  0.8004 
F-statistic:  9.82 on 5 and 6 DF,  p-value: 0.007459
```
]]
.column.bg-brand-gray[.content[
```r
mysi=tapply(Y,si,mean); my=mean(Y)
mysi
```

```
 no1  no2  no3  no4  no5  no6 
13.0  8.5 14.0 14.5 25.0 31.5 
```

`no1` to `no6` are the 6 group means, in the order (shallow, s1), (shallow, s2), <br> (medium, s1), (medium, s2), <br> .brand-blue[(deep, s1)], (deep, s2)

Note: the baseline is `Depth` deep with `Sample` s1. <br> This is Sample.id no5.
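
The baseline and the three nested contrasts can also be read off the design matrix. A sketch, with the data typed in from the printed table and the names `X` and `fit` introduced here for illustration:

```r
# Pollen data typed in from the printed table (stands in for data/pollen.txt).
dat <- data.frame(
  Sample.id = factor(rep(paste0("no", 1:6), each = 2)),
  Depth     = factor(rep(c("shallow", "medium", "deep"), each = 4)),
  Sample    = factor(rep(rep(c("s1", "s2"), each = 2), times = 3)),
  Count     = c(12, 14, 10, 7, 16, 12, 10, 19, 21, 29, 33, 30)
)

# Design matrix of the nested model: intercept, 2 Depth columns and
# 3 Depth:Sample columns (one s2 contrast within each depth).
X <- model.matrix(~ Depth / Sample, data = dat)
colnames(X)

# The intercept is the (deep, s1) cell mean, i.e. the mean of no5.
fit <- lm(Count ~ Depth / Sample, data = dat)
coef(fit)["(Intercept)"]
mean(dat$Count[dat$Sample.id == "no5"])
```

The three `Depth...:Samples2` columns account for the 3 df of the nested term in the ANOVA table.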
]]
]]

---
layout: false
class: split-10

.row.bg-brand-blue.white[.content.vmiddle[
# Calculation of parameter estimates (not asked)
]]
.row[.content[
For a 2-way factorial design under treatment constraints with `\((i,j)=(1,1)\)` as the baseline:
`$$\mu=\bar{Y}_{11\bullet},\hspace{2mm}\alpha_i=\bar{Y}_{i 1 \bullet}-\bar{Y}_{11\bullet},\hspace{2mm}\beta_j=\bar{Y}_{1 j \bullet}-\bar{Y}_{11\bullet},\hspace{2mm}(\alpha\beta)_{ij}= \bar{Y}_{ij\bullet}-\bar{Y}_{i1 \bullet}-\bar{Y}_{1 j\bullet}+\bar{Y}_{11\bullet}$$`
For a 2-way nested design with `\(\beta_j\)` absorbed into `\((\alpha\beta)_{ij}\)`:
`$$(\alpha\beta)'_{ij}=\beta_j+(\alpha\beta)_{ij}=\bar{Y}_{1 j \bullet}-\bar{Y}_{11\bullet}+\bar{Y}_{ij\bullet}-\bar{Y}_{i1 \bullet}-\bar{Y}_{1 j\bullet}+\bar{Y}_{11\bullet}=\bar{Y}_{ij\bullet}-\bar{Y}_{i1 \bullet}$$`

`\(\mu=\bar{Y}_{11\bullet}\)` is (deep, s1) in no5. <br>
`\(\alpha_2=\bar{Y}_{2 1 \bullet}-\bar{Y}_{11\bullet}\)` where `\(\bar{Y}_{2 1 \bullet}\)` is (medium, s1) in no3<br>
`\(\alpha_3=\bar{Y}_{3 1 \bullet}-\bar{Y}_{11\bullet}\)` where `\(\bar{Y}_{3 1 \bullet}\)` is (shallow, s1) in no1<br>
`\((\alpha\beta)'_{12}= \bar{Y}_{12\bullet}-\bar{Y}_{11 \bullet}\)` where `\(\bar{Y}_{12\bullet}\)` is (deep, s2) in no6<br>
`\((\alpha\beta)'_{22}= \bar{Y}_{22\bullet}-\bar{Y}_{21 \bullet}\)` where `\(\bar{Y}_{22\bullet}\)` is (medium, s2) in no4<br>
`\((\alpha\beta)'_{32}= \bar{Y}_{32\bullet}-\bar{Y}_{31 \bullet}\)` where `\(\bar{Y}_{32\bullet}\)` is (shallow, s2) in no2<br>

```r
unname(c(mysi[5],mysi[3]-mysi[5],mysi[1]-mysi[5],
         mysi[6]-mysi[5],mysi[4]-mysi[3],mysi[2]-mysi[1])) #par est
```

```
[1]  25.0 -11.0 -12.0   6.5   0.5  -4.5
```
]]

---
class: split-10

.row.bg-brand-blue.white[.content.vmiddle[
# Example - Pollen data: Reordering factors
]]
.row[.split-40[
.column[.content[
### Reordering factors

```r
dat$Depth
```

```
 [1] shallow shallow shallow shallow medium  medium  medium  medium  deep   
[10] deep    deep    deep   
Levels: deep medium shallow
```

```r
Depth.fac <- factor(dat$Depth,levels=
                    c("shallow","medium","deep"))
Depth.fac
```

```
 [1] shallow shallow shallow shallow medium  medium  medium  medium  deep   
[10] deep    deep    deep   
Levels: shallow medium deep
```

```r
unname(c(mysi[1],mysi[3]-mysi[1],mysi[5]-mysi[1],
         mysi[2]-mysi[1],mysi[4]-mysi[3],mysi[6]-mysi[5]))
```

```
[1] 13.0  1.0 12.0 -4.5  0.5  6.5
```
]]
.column.bg-brand-gray[.content[
```r
M2 <- lm(Count ~ Depth.fac/Sample)
summary(M2)
```

```
Call:
lm(formula = Count ~ Depth.fac/Sample)

Residuals:
   Min     1Q Median     3Q    Max 
-4.500 -1.625  0.000  1.625  4.500 

Coefficients:
                          Estimate Std. Error t value Pr(>|t|)   
(Intercept)                 13.000      2.761   4.708   0.0033 **
Depth.facmedium              1.000      3.905   0.256   0.8064   
Depth.facdeep               12.000      3.905   3.073   0.0219 * 
Depth.facshallow:Samples2   -4.500      3.905  -1.152   0.2930   
Depth.facmedium:Samples2     0.500      3.905   0.128   0.9023   
Depth.facdeep:Samples2       6.500      3.905   1.664   0.1471   
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 3.905 on 6 degrees of freedom
Multiple R-squared:  0.8911,    Adjusted R-squared:  0.8004 
F-statistic:  9.82 on 5 and 6 DF,  p-value: 0.007459
```
]]
]]

---
class: split-10

.row.bg-brand-blue.white[.content.vmiddle[
# Why not simply label each sample differently?
]]
.row[.split-50[
.column[.content[
### Yes, you could, using .brand-blue[Sample.id] instead.

```r
anova(lm(Count ~ Depth/Sample.id, data=dat)) #nested
```

```
Analysis of Variance Table

Response: Count
                Df Sum Sq Mean Sq F value  Pr(>F)   
Depth            2 686.00  343.00 22.4918 0.00163 **
Depth:Sample.id  3  62.75   20.92  1.3716 0.33847   
Residuals        6  91.50   15.25                   
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
```

* This is exactly the same as M1, which uses .brand-blue[Sample] instead of .brand-blue[Sample.id].
* This is because .brand-blue[Sample] and .brand-blue[Sample.id] define the same groups when nested within Depth.
* With .brand-blue[Sample.id], this .brand-blue[nested] model gives the same table as the .brand-blue[interaction] or .brand-blue[main] effect models.
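
One way to confirm that the two labellings describe the same model is to compare fitted values. A sketch, with the data typed in from the printed table and the names `fit_nested`, `fit_id` and `fit_oneway` introduced here for illustration:

```r
# Pollen data typed in from the printed table (stands in for data/pollen.txt).
dat <- data.frame(
  Sample.id = factor(rep(paste0("no", 1:6), each = 2)),
  Depth     = factor(rep(c("shallow", "medium", "deep"), each = 4)),
  Sample    = factor(rep(rep(c("s1", "s2"), each = 2), times = 3)),
  Count     = c(12, 14, 10, 7, 16, 12, 10, 19, 21, 29, 33, 30)
)

fit_nested <- lm(Count ~ Depth / Sample, data = dat)     # M1
fit_id     <- lm(Count ~ Depth / Sample.id, data = dat)  # relabelled samples
fit_oneway <- lm(Count ~ Sample.id, data = dat)          # six groups

# All three fit the six cell means, so fitted values and RSS agree.
all.equal(fitted(fit_nested), fitted(fit_id))
all.equal(fitted(fit_nested), fitted(fit_oneway))
c(deviance(fit_nested), deviance(fit_id), deviance(fit_oneway))
```

`Depth/Sample.id` is rank deficient (most `Depth:Sample.id` columns are empty), so some of its coefficients are reported as `NA`, but the fit itself is unchanged.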
]]
.column.bg-brand-gray[.content[
```r
anova(lm(Count ~ Depth*Sample.id, data=dat)) #interact
```

```
Analysis of Variance Table

Response: Count
          Df Sum Sq Mean Sq F value  Pr(>F)   
Depth      2 686.00  343.00 22.4918 0.00163 **
Sample.id  3  62.75   20.92  1.3716 0.33847   
Residuals  6  91.50   15.25                   
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
```

```r
anova(lm(Count ~ Depth+Sample.id, data=dat)) #main
```

```
Analysis of Variance Table

Response: Count
          Df Sum Sq Mean Sq F value  Pr(>F)   
Depth      2 686.00  343.00 22.4918 0.00163 **
Sample.id  3  62.75   20.92  1.3716 0.33847   
Residuals  6  91.50   15.25                   
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
```
]]
]]

---
class: split-10

.row.bg-brand-blue.white[.content.vmiddle[
# Swap the order and main vs interaction
]]
.row[.split-50[
.column[.content[
#### .brand-blue[On the previous page,]

* SS for Depth is .brand-blue[marginal] as it enters first.
* SS for Sample.id is .brand-blue[conditional] on Depth. It equals the marginal `\(\text{SS}_{\text{Sample.id}}\)` minus `\(\text{SS}_{\text{Depth}}\)`, which is `\(\text{SS}^{\text{pooled}}_{\text{Depth:Sample.id}}\)`, i.e. `\(748.75-686=62.75\)`. Pooled SS = conditional SS.
* So if the order is swapped, the SS will change.
* This change of SS applies to both the .brand-blue[interaction] and .brand-blue[main effect] models.

#### .brand-blue[On the right,]

* SS for Sample.id is marginal: `\(\text{SS}_{\text{Sample.id}}=748.75\)`.
* The six Sample.id groups contain the Depth groups and so absorb the Depth factor.
* This is a .brand-blue[one-way ANOVA] with six Sample.id groups, and `\(\text{SS}_{\text{Sample.id}}\)` is the TSS with 6-1 = 5 df.
]]
.column.bg-brand-gray[.content[
```r
anova(lm(Count ~ Sample.id*Depth, data=dat)) #interact
```

```
Analysis of Variance Table

Response: Count
          Df Sum Sq Mean Sq F value   Pr(>F)   
Sample.id  5 748.75  149.75  9.8197 0.007459 **
Residuals  6  91.50   15.25                    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
```

```r
anova(lm(Count ~ Sample.id+Depth, data=dat)) #main
```

```
Analysis of Variance Table

Response: Count
          Df Sum Sq Mean Sq F value   Pr(>F)   
Sample.id  5 748.75  149.75  9.8197 0.007459 **
Residuals  6  91.50   15.25                    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
```
]]
]]

---
class: split-10

.row.bg-brand-blue.white[.content.vmiddle[
# Parameter estimates of 1-way ANOVA on Sample.id
]]
.row[.split-60[
.column[.content[
```r
summary(lm(Count ~ Sample.id, data=dat))
```

```
Call:
lm(formula = Count ~ Sample.id, data = dat)

Residuals:
   Min     1Q Median     3Q    Max 
-4.500 -1.625  0.000  1.625  4.500 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)   
(Intercept)    13.000      2.761   4.708   0.0033 **
Sample.idno2   -4.500      3.905  -1.152   0.2930   
Sample.idno3    1.000      3.905   0.256   0.8064   
Sample.idno4    1.500      3.905   0.384   0.7141   
Sample.idno5   12.000      3.905   3.073   0.0219 * 
Sample.idno6   18.500      3.905   4.737   0.0032 **
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 3.905 on 6 degrees of freedom
Multiple R-squared:  0.8911,    Adjusted R-squared:  0.8004 
F-statistic:  9.82 on 5 and 6 DF,  p-value: 0.007459
```
]]
.column.bg-brand-gray[.content[
```r
mysi=tapply(Y,si,mean); my=mean(Y)
mysi
```

```
 no1  no2  no3  no4  no5  no6 
13.0  8.5 14.0 14.5 25.0 31.5 
```

```r
c(mysi[1],mysi[2]-mysi[1],
  mysi[3]-mysi[1],mysi[4]-mysi[1],
  mysi[5]-mysi[1],mysi[6]-mysi[1])
```

```
 no1  no2  no3  no4  no5  no6 
13.0 -4.5  1.0  1.5 12.0 18.5 
```
]]
]]

---
class: split-10

.row.bg-brand-blue.white[.content.vmiddle[
# Two way ANOVA model
]]
.row[.split-50[
.column[.content[
```r
anova(lm(Count ~ Depth*Sample, data=dat))
```

```
Analysis of Variance Table

Response: Count
             Df Sum Sq Mean Sq F value  Pr(>F)   
Depth         2 686.00  343.00 22.4918 0.00163 **
Sample        1   2.08    2.08  0.1366 0.72437   
Depth:Sample  2  60.67   30.33  1.9891 0.21742   
Residuals     6  91.50   15.25                   
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
```

```r
anova(lm(Count ~ Depth/Sample, data=dat))
```

```
Analysis of Variance Table

Response: Count
             Df Sum Sq Mean Sq F value  Pr(>F)   
Depth         2 686.00  343.00 22.4918 0.00163 **
Depth:Sample  3  62.75   20.92  1.3716 0.33847   
Residuals     6  91.50   15.25                   
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
```
]]
.column.bg-brand-gray[.content[
```r
anova(lm(Count ~ Sample*Depth, data=dat))
```

```
Analysis of Variance Table

Response: Count
             Df Sum Sq Mean Sq F value  Pr(>F)   
Sample        1   2.08    2.08  0.1366 0.72437   
Depth         2 686.00  343.00 22.4918 0.00163 **
Sample:Depth  2  60.67   30.33  1.9891 0.21742   
Residuals     6  91.50   15.25                   
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
```

* `\(\text{TSS} = 686+2.08+60.67=748.75=\text{SS}_{\text{Sample.id}}\)`
* `\(\text{SS}^{\text{pooled}}_{\text{Dep:Sam}} = 62.75=60.67+2.08=\text{SS}_{\text{Dep:Sam}}+\text{SS}_{\text{Sample}}\)`
* The .brand-blue[nested] model `Count ~ Depth/Sample` and the .brand-blue[factorial] model `Count ~ Depth*Sample` are now different, unlike the case of .brand-blue[Sample.id].
* The factors in the interaction effect model are .brand-blue[orthogonal], as the SS are unchanged after reordering.
]]
]]

---
layout: false
class: split-10

.row.bg-brand-blue.white[.content.vmiddle[
# Example - Pollen data: Comments
]]
.row[.content[
* The levels of `Depth` are ordered `deep`, `medium` and `shallow` by (alphabetical) default, so `deep` is the reference level for the treatment constraint.
* There is clear evidence that pollen count varies with depth (p = 0.0016).
* There is no evidence of an effect of `Sample` within `Depth` (p = 0.34). In other words, there do not seem to be systematic differences between the two samples taken at each depth.
* The table of parameter estimates indicates (from the fixed effects) that pollen count is highest at the deep level of the bog.
* The nested `Depth:Sample` terms estimate the effect of `Sample s2` relative to `Sample s1` (baseline) at each depth.
]]