class: split-70 with-border hide-slide-number bg-brand-red background-image: url("images/USydLogo-black.svg") background-size: 200px background-position: 2% 90% .column.white[.content[ <br><br><br> # Exam Revision ## .black[STAT3022 Applied Linear Models] <br><br><br> ### .black[2020/05/23] ]] .column.bg-brand-charcoal[.content.white[ 1. Important information 2. Revision ]] --- class: split-10 .row.bg-brand-blue.white[.content.vmiddle[ # Exam Information ]] .row[.content[ * The exam is 2 hours with 10 minutes reading time and 15 minutes document upload time. * Read exam information sheet. * There are Three questions NOT of equal value. * All materials from week 1 to 13 are covered. * Exam consultation hours to be announced a little later. Do complete the course survey! ]] --- class: split-10 .row.bg-brand-blue.white[.content.vmiddle[ # What to study ]] .row[.split-two[ .column[.content[ ## Focus will be on applied * Go through tutorial and assignment questions. * Know well in particular (not exclusive): * Multiple linear regression * Outliers and high leverage points * Variable selection * General F-test * Contrasts * ANOVA * Experimental design * Nested design * Random effect model ]] .column.bg-brand-gray[.content[ ## Don't put too much focus * The derivation of REML * Robust regression (only how to interpret and use it) ]] ]] --- class: split-10 .row.bg-brand-blue.white[.content.vmiddle[ # Averages by block of corn yields, for treatment 111 ]] .row[.split-50[ .column[.content[ Lets read a data and play around. ```r dat <- read.csv(file="data/ant111br.csv",header=T) dat[7:10,]; dim(dat) ``` ``` site parcel id ears harvwt 7 WEAN I 115.0 43.5 5.040 8 WLAN I 151.0 50.0 2.020 9 DBAN II 15.5 46.0 4.805 10 LFAN II 51.0 46.5 4.770 ``` ``` [1] 32 5 ``` * site: a factor with levels DBAN LFAN NSAN ORAN OVAN TEAN WEAN WLAN * parcel: a factor with levels I II III IV * parcel & site are 2 crossed factors, each with equal size. Others are numeric. * take harvwt as outcomes. .brand-blue[Q0. What design is it?] ]] .column.bg-brand-gray[.content[ Full model ```r M0=lm(harvwt ~ 1,data=dat) Mf=lm(harvwt ~ . ,data=dat) anova(Mf) ``` ``` Analysis of Variance Table Response: harvwt Df Sum Sq Mean Sq F value Pr(>F) site 7 70.339 10.0485 30.9528 4.3e-09 *** parcel 3 4.406 1.4686 4.5238 0.01481 * id 1 0.411 0.4114 1.2673 0.27429 ears 1 2.876 2.8756 8.8578 0.00776 ** Residuals 19 6.168 0.3246 --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 ``` Andrews DF; Herzberg AM, 1985. Data. A Collection of Problems from Many Fields for the Student and Research Worker. Springer-Verlag. (pp. 339-353) ]] ]] --- class: split-10 .row.bg-brand-blue.white[.content.vmiddle[ # Visualisation of the data ]] .row[.split-50[ .column[.content[ ```r par(mfrow=c(1,2)) boxplot(harvwt~site) boxplot(harvwt~parcel) ``` <img src="ExamRevision_2020JC_files/figure-html/unnamed-chunk-6-1.svg" style="display: block; margin: auto;" /> ]] .column.bg-brand-gray[.content[ ```r plot(dat) ``` <img src="ExamRevision_2020JC_files/figure-html/unnamed-chunk-7-1.svg" style="display: block; margin: auto;" /> ]] ]] --- class: split-10 .row.bg-brand-blue.white[.content.vmiddle[ # Stepwise variable selection for M1 and general F test ]] .row[.split-50[ .column[.content[ ```r M1=step(Mf,scope=list(lower=harvwt~1,upper=harvwt~.), direction="both",k=log(32)) #AIC should be BIC ``` ``` Start: AIC=-7.63 harvwt ~ site + parcel + id + ears Df Sum of Sq RSS AIC - parcel 3 0.173 6.342 -17.138 - id 1 0.076 6.244 -10.702 <none> 6.168 -7.628 - ears 1 2.876 9.044 1.151 - site 7 50.230 56.399 38.929 Step: AIC=-17.14 harvwt ~ site + id + ears Df Sum of Sq RSS AIC <none> 6.342 -17.1384 - id 1 1.903 8.244 -12.2067 + parcel 3 0.173 6.168 -7.6283 - ears 1 3.331 9.672 -7.0952 - site 7 51.289 57.631 29.2235 ``` ]] .column.bg-brand-gray[.content[ ```r anova(M1) ``` ``` Analysis of Variance Table Response: harvwt Df Sum Sq Mean Sq F value Pr(>F) site 7 70.339 10.0485 34.860 1.769e-10 *** id 1 4.189 4.1885 14.531 0.0009531 *** ears 1 3.331 3.3308 11.555 0.0025755 ** Residuals 22 6.342 0.2883 --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 ``` ```r c(var(harvwt),var(harvwt)*(length(harvwt)-1)) ``` ``` [1] 2.716135 84.200180 ``` .brand-blue[Q1. What is the F test stat to test M1 against Mf?] .brand-blue[Q2. What is the F test stat to test M0 against M1?] ]] ]] --- class: split-10 .row.bg-brand-blue.white[.content.vmiddle[ # General F test and adjusted `\(R^2\)` ]] .row[.split-50[ .column[.content[ ```r anova(M0,M1,Mf) #check ``` ``` Analysis of Variance Table Model 1: harvwt ~ 1 Model 2: harvwt ~ site + id + ears Model 3: harvwt ~ site + parcel + id + ears Res.Df RSS Df Sum of Sq F Pr(>F) 1 31 84.200 2 22 6.342 9 77.859 26.648 6.229e-09 *** 3 19 6.168 3 0.173 0.178 0.91 --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 ``` .brand-blue[Q3. What is the adjusted] `\(\color{blue}{R^2}\)` .brand-blue[in M1?] ```r summary(M1)$adj.r.squared #check ``` ``` [1] 0.8938746 ``` ]] .column.bg-brand-gray[.content[ ```r M2=lm(harvwt~id+parcel,data=dat) round(c(sigma(M2),augment(M2)$.hat[1]),4) ``` ``` [1] 1.6838 0.1997 ``` ```r unname(summary(M2)$res[1]) ``` ``` [1] 1.955654 ``` .brand-blue[Q4. What is the Cooks distance of the 1st obs in M2? Is it a Y-outlier?] ```r c(augment(M2)$.cooksd[1],qf(0.5,5,32-5))#check ``` ``` [1] 0.08414524 0.89244181 ``` ]] ]] --- class: split-10 .row.bg-brand-blue.white[.content.vmiddle[ # Prediction ]] .row[.split-60[ .column[.content[ ```r round(coef(M2),6) ``` ``` (Intercept) id parcelII parcelIII parcelIV 3.192797 0.003850 0.621552 0.631907 0.917547 ``` ```r round(vcov(M2),6) ``` ``` (Intercept) id parcelII parcelIII parcelIV (Intercept) 0.576330 -0.001695 -0.338289 -0.324411 -0.308732 id -0.001695 0.000013 -0.000123 -0.000229 -0.000349 parcelII -0.338289 -0.000123 0.709951 0.356566 0.357704 parcelIII -0.324411 -0.000229 0.356566 0.712832 0.360559 parcelIV -0.308732 -0.000349 0.357704 0.360559 0.718176 ``` ```r c(qt(0.975,32-5)) ``` ``` [1] 2.051831 ``` .brand-blue[Q5. What is the predicted harvwt for parcel=I (baseline) when id=10 and 95% PI using M2?] ]] .column.bg-brand-gray[.content[ ```r pred0=data.frame(id=10,parcel="I")#check predict(M2,newdata=pred0, interval="prediction",level=0.95) ``` ``` fit lwr upr 1 3.231294 -0.5403101 7.002899 ``` ]] ]] --- class: split-10 .row.bg-brand-blue.white[.content.vmiddle[ # Constrast and multiple comparison ]] .row[.split-50[ .column[.content[ ```r M1=lm(harvwt~site + id + ears,data=dat) M2=lm(harvwt~id + parcel,data=dat) ``` .brand-blue[Q6. What are the model equations for M1 and M2?] ```r M2a=lm(harvwt~id*parcel) round(coef(M2a)[1:4],6); round(coef(M2a)[5:8],6); ``` ``` (Intercept) id parcelII parcelIII 3.396007 0.002298 0.319258 0.591695 ``` ``` parcelIV id:parcelII id:parcelIII id:parcelIV 0.410399 0.002257 0.000455 0.003477 ``` .brand-blue[Q7. What is the design matrix] `\(\color{blue}{\mathbf{X}}\)` .brand-blue[of obs 7-10 in P.4 for M2a? What are the regression lines of harvwt against id for parcel=I & III using M2a?] ]] .column.bg-brand-gray[.content[ ```r M3=lm(harvwt~parcel,data=dat); sigma(M3) ``` ``` [1] 1.688135 ``` ```r tapply(harvwt,parcel,length) ``` ``` I II III IV 8 8 8 8 ``` ```r tapply(harvwt,parcel,mean) ``` ``` I II III IV 3.696875 4.355000 4.396875 4.718125 ``` .brand-blue[Q8. What is the CI for the diff between parcel 1 and others using M3? Is the diff significant? Is this diff orthogonal to diff of parcel 1 & 2?] .brand-blue[Q9. What if the diff is corrected for all possible pairwise tests using Bonferroni, Tukey and Scheffe?] ]] ]] --- class: split-10 .row.bg-brand-blue.white[.content.vmiddle[ # Contrast in ANCOVA model ]] .row[.split-60[ .column[.content[ ```r M2=lm(harvwt~id + parcel,data=dat); round(coef(M2),6) ``` ``` (Intercept) id parcelII parcelIII parcelIV 3.192797 0.003850 0.621552 0.631907 0.917547 ``` ```r S0xx=deviance(lm(id~parcel,data=dat)); S0xx ``` ``` [1] 219012.2 ``` ```r round(tapply(harvwt,parcel,mean),4) ``` ``` I II III IV 3.6969 4.3550 4.3969 4.7181 ``` ```r tapply(id,parcel,mean) ``` ``` I II III IV 130.9375 140.4375 148.6250 157.8750 ``` ]] .column.bg-brand-gray[.content[ ```r sigma(M2) ``` ``` [1] 1.683784 ``` ```r qt(0.975,27) ``` ``` [1] 2.051831 ``` .brand-blue[Q10. What if the diff is with an ANCOVA model using M2?] ]] ]] --- class: split-10 .row.bg-brand-blue.white[.content.vmiddle[ # Quantitative factors ]] .row[.split-40[ .column[.content[ Suppose the four parcels have 1.1, 1.4, 1.8, 2.1 kg of nitrogen respectively. ```r x=rep(c(1.1, 1.4, 1.8, 2.1),each=8) M3a=lm(harvwt~x) ``` .brand-blue[Q11. What is the linear component of the SS that measures the linear effect of nitrogen on harvwt? How to select the suitable higher order?] ```r anova(M3a) #check ``` ``` Analysis of Variance Table Response: harvwt Df Sum Sq Mean Sq F value Pr(>F) x 1 3.715 3.7153 1.3849 0.2485 Residuals 30 80.485 2.6828 ``` ]] .column.bg-brand-gray[.content[ ```r MOP <- lm(harvwt ~ poly(x, degree = 3)) #higer deg is singular print(broom::tidy(MOP), digits=2) ``` ``` # A tibble: 4 x 5 term estimate std.error statistic p.value <chr> <dbl> <dbl> <dbl> <dbl> 1 (Intercept) 4.29 0.298 14.4 1.86e-14 2 poly(x, degree = 3)1 1.93 1.69 1.14 2.63e- 1 3 poly(x, degree = 3)2 -0.476 1.69 -0.282 7.80e- 1 4 poly(x, degree = 3)3 0.681 1.69 0.403 6.90e- 1 ``` ]] ]] --- class: split-10 .row.bg-brand-blue.white[.content.vmiddle[ # Two-way `\(8 \times 4\)` factorial design ]] .row[.split-50[ .column[.content[ ```r M4=lm(harvwt~site*parcel,data=dat) anova(M4) #why RSS=0? ``` ``` Analysis of Variance Table Response: harvwt Df Sum Sq Mean Sq F value Pr(>F) site 7 70.339 10.0485 parcel 3 4.406 1.4686 site:parcel 21 9.455 0.4502 Residuals 0 0.000 ``` Suppose 2 similar sites can be combined to create 2 repeats. ```r siteC=rep(c("A","B","B","A","C","D","C","D"),4) ``` So parcel and sitC are two .brand-blue[crossed] factors with 2 repeats for each combination. ]] .column.bg-brand-gray[.content[ ```r M4a=lm(harvwt~siteC*parcel,data=dat) sigma(M4a) ``` ``` [1] 1.151487 ``` ```r tapply(harvwt,parcel,sum) ``` ``` I II III IV 29.575 34.840 35.175 37.745 ``` ```r tapply(harvwt,siteC,sum) ``` ``` A B C D 47.200 25.190 41.435 23.510 ``` .brand-blue[Q12. What is the ANOVA table for M4a given SS_SC=52.091? Is the parcel effect significant?] ]] ]] --- class: split-10 .row.bg-brand-blue.white[.content.vmiddle[ # Two-way `\(4^2\)` factorial design and ANOVA table ]] .row[.split-50[ .column[.content[ ```r anova(M4a) #check ``` ``` Analysis of Variance Table Response: harvwt Df Sum Sq Mean Sq F value Pr(>F) siteC 3 52.091 17.3637 13.0956 0.0001415 *** parcel 3 4.406 1.4686 1.1076 0.3750266 siteC:parcel 9 6.488 0.7209 0.5437 0.8220754 Residuals 16 21.215 1.3259 --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 ``` Remark: We can also have `\(2^4\)` factorial design by dichotomise the 4 variables. Remember standard order. ]] .column.bg-brand-gray[.content[ ```r anova(lm(harvwt~parcel*siteC,data=dat))#Same as M4a? ``` ``` Analysis of Variance Table Response: harvwt Df Sum Sq Mean Sq F value Pr(>F) parcel 3 4.406 1.4686 1.1076 0.3750266 siteC 3 52.091 17.3637 13.0956 0.0001415 *** parcel:siteC 9 6.488 0.7209 0.5437 0.8220754 Residuals 16 21.215 1.3259 --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 ``` .brand-blue[Q13. Why the SS remains unchanged if the two categorial variables swap order in M4a? What if we have ancova model with one categorial and one continuous variables?] ]] ]] --- class: split-10 .row.bg-brand-blue.white[.content.vmiddle[ # ANCOVA and nested factor ]] .row[.split-40[ .column[.content[ ```r anova(lm(harvwt~parcel*id,data=dat))#ANCOVA ``` ``` Analysis of Variance Table Response: harvwt Df Sum Sq Mean Sq F value Pr(>F) parcel 3 4.406 1.4686 0.4630 0.7107 id 1 3.246 3.2459 1.0234 0.3218 parcel:id 3 0.430 0.1434 0.0452 0.9869 Residuals 24 76.118 3.1716 ``` ```r anova(lm(harvwt~id*parcel,data=dat))#change ``` ``` Analysis of Variance Table Response: harvwt Df Sum Sq Mean Sq F value Pr(>F) id 1 4.102 4.1025 1.2935 0.2666 parcel 3 3.549 1.1831 0.3730 0.7732 id:parcel 3 0.430 0.1434 0.0452 0.9869 Residuals 24 76.118 3.1716 ``` ]] .column.bg-brand-gray[.content[ Create also nested factor siteP under parcel (I to IV) with **16 levels**. 2 repeats each. |I|II|III|IV| |-----|-----|------|-----| |AI BI CI DI| AII BII CII DII | AIII BIII CIII DIII | AIV BIV CIV DIV| Note that crossed factor siteC under parcel (I to IV) has **4 levels**. 2 repeats each. |I|II|III|IV| |-----|-----|------|-----| |A B C D|A B C D|A B C D|A B C D| ```r siteP=paste(siteC,parcel) ``` ```r M4b=lm(harvwt~parcel/siteP,data=dat) M4c=lm(harvwt~parcel/siteC,data=dat) ``` .brand-blue[Q14. If nested factor siteP (or crossed factor siteC) are to be hidden in the output, what is the SS for parcel:siteP?] ]] ]] --- class: split-10 .row.bg-brand-blue.white[.content.vmiddle[ # Nested factor ]] .row[.split-50[ .column[.content[ ```r anova(M4b)#check with M4c. Same siteP & siteC. ``` ``` Analysis of Variance Table Response: harvwt Df Sum Sq Mean Sq F value Pr(>F) parcel 3 4.406 1.4686 1.1076 0.37503 parcel:siteP 12 58.580 4.8816 3.6817 0.00845 ** Residuals 16 21.215 1.3259 --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 ``` ```r anova(M4a)#compare with nested design ``` ``` Analysis of Variance Table Response: harvwt Df Sum Sq Mean Sq F value Pr(>F) siteC 3 52.091 17.3637 13.0956 0.0001415 *** parcel 3 4.406 1.4686 1.1076 0.3750266 siteC:parcel 9 6.488 0.7209 0.5437 0.8220754 Residuals 16 21.215 1.3259 --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 ``` ]] .column.bg-brand-gray[.content[ ```r anova(M4c) ``` ``` Analysis of Variance Table Response: harvwt Df Sum Sq Mean Sq F value Pr(>F) parcel 3 4.406 1.4686 1.1076 0.37503 parcel:siteC 12 58.580 4.8816 3.6817 0.00845 ** Residuals 16 21.215 1.3259 --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 ``` .brand-blue[Q15. If siteP (repeat your answer with siteC), which are nested under parcel forms the EU and the OU are each repeats, how is RSS splitted for the EU and OU under the main effect model (repeat your answer with the interaction model with siteC as a nested factor)?] ]] ]] --- class: split-10 .row.bg-brand-blue.white[.content.vmiddle[ # Nested design, main, crossed & nested factors ]] .row[.split-50[ .column[.content[ ```r summary(aov(harvwt~parcel+Error(siteP),data=dat))#main ``` ``` Error: siteP Df Sum Sq Mean Sq F value Pr(>F) parcel 3 4.41 1.469 0.301 0.824 Residuals 12 58.58 4.882 Error: Within Df Sum Sq Mean Sq F value Pr(>F) Residuals 16 21.21 1.326 ``` EU: parcel with 4-1 df is within siteP with 16-1 df. .brand-blue[RSS is SS for siteC & siteC:parcel with 3+9=12 df.] OU: .brand-blue[Remaining RSS with 31-15=16 df.] RSS and its 28 df is splitted into part in EU with SS for siteC & .brand-blue[siteC:parcel] (12 df) and in OU with remaining RSS (16 df). ]] .column.bg-brand-gray[.content[ ```r summary(aov(harvwt~parcel+Error(siteC),data=dat))#main ``` ``` Error: siteC Df Sum Sq Mean Sq F value Pr(>F) Residuals 3 52.09 17.36 Error: Within Df Sum Sq Mean Sq F value Pr(>F) parcel 3 4.406 1.469 1.325 0.288 Residuals 25 27.703 1.108 ``` EU: siteC has 4-1 df. .brand-blue[RSS is SS for siteC.] OU: parcel is not nested but crossed with siteC. So parcel with 4-1 df goes outside siteC (EU) with 32-1 df in total. .brand-blue[RSS has SS for siteC:parcel & RSS with 6.49+21.22=27.71 at 9+16=25 df.] RSS and its 28 df is splitted into part in EU with SS for siteC (3 df) and in OU with SS for .brand-blue[siteC:parcel] and remaining RSS (9+16 df). ]] ]] --- class: split-10 .row.bg-brand-blue.white[.content.vmiddle[ # Nested design, interaction, crossed & nested factors ]] .row[.split-50[ .column[.content[ ```r summary(aov(harvwt~parcel/siteC+Error(siteP), data=dat)) #interaction ``` ``` Error: siteP Df Sum Sq Mean Sq parcel 3 4.41 1.469 parcel:siteC 12 58.58 4.882 Error: Within Df Sum Sq Mean Sq F value Pr(>F) Residuals 16 21.21 1.326 ``` EU: parcel and parcel:site (RSS in main effect) are within siteP. OU: remaining RSS same as main effect model. Diff is on treatment SS. RSS are the same with 16 df. The 9+3 df of RSS in main model goes to parcel:siteC as treatment SS. ]] .column.bg-brand-gray[.content[ ```r summary(aov(harvwt~parcel/siteC+Error(siteC), data=dat)) #interaction ``` ``` Error: siteC Df Sum Sq Mean Sq parcel:siteC 3 52.09 17.36 Error: Within Df Sum Sq Mean Sq F value Pr(>F) parcel 3 4.406 1.4686 1.108 0.375 parcel:siteC 9 6.488 0.7209 0.544 0.822 Residuals 16 21.215 1.3259 ``` EU: SS for parcel:siteC is SS for siteC. The nested model parcel/siteC does not allow siteC to appear. OU: parcel & parcel:siteC are within OU, outside siteC. RSS is the remaining RSS with 16 df. ]] ]] --- class: split-10 .row.bg-brand-blue.white[.content.vmiddle[ # Latin square & balance incomplete block design ]] .row[.split-50[ .column[.content[ No LSD is possible for this data with 2 categorical variables as LSD has 2 block factors and 1 treatment factor. For a 4 level treatment say, since `\(n=16\)`, only main effect model (no repeats for interaction) is possible. .brand-blue[Q16. What is the cyclic LSD and its main effect model equation?] To create BICBD, take parcel as `\(k=4\)` treatments, set `\(r=6\)` (instead of `\(r=8\)`), `\(b=8\)` and `\(k=3\)`. Total `\(n=24\)` and so drop 8 observations. BICBD is ```r datr=dat[-c(7,8,13,14,19,20,25,26),] ``` |.gray[A]|.gray[A]|.gray[A]|.gray[A]|.gray[A]|.gray[A]|.red[B]|.red[B]| |-----|-----|------|-----|-----|-----|-----|-----| |.red[B]|.red[B]|.red[B]|.red[B]|.blue[C]|.blue[C]|.blue[C]|.blue[C]| |.blue[C]|.blue[C]|.green[D]|.green[D]|.green[D]|.green[D]|.green[D]|.green[D]| ]] .column.bg-brand-gray[.content[ Randomisation: col (1,2,3,4,5,6,7,8) to (3,6,1,7,4,8,2,5) |.gray[A]|.gray[A]|.gray[A]|.red[B]|.gray[A]|.red[B]|.gray[A]|.gray[A]| |-----|-----|------|-----|-----|-----|-----|-----| |.red[B]|.blue[C]|.red[B]|.blue[C]|.red[B]|.blue[C]|.red[B]|.blue[C]| |.green[D]|.green[D]|.blue[C]|.green[D]|.green[D]|.green[D]|.blue[C]|.green[D]| Randomisation: row (2,3,1),(3,1,2),(1,3,2),(1,2,3),(3,1,2),(3,2,1),(2,3,1),(2,1,3) |.red[B]|.green[D]|.gray[A]|.red[B]|.green[D]|.green[D]|.red[B]|.blue[C]| |-----|-----|------|-----|-----|-----|-----|-----| |.green[D]|.gray[A]|.blue[C]|.blue[C]|.gray[A]|.blue[C]|.blue[C]|.gray[A]| |.gray[A]|.blue[C]|.red[B]|.green[D]|.red[B]|.red[B]|.gray[A]|.green[D]| Data is |.red[3.17]|.green[3.515]|.gray[5.16]|.red[4.6]|.green[7.07]|.green[2.65]|.red[4.77]|.blue[5.565]| |-----|-----|------|-----|-----|-----|-----|-----| |.green[1.97]|.gray[2.65]|.blue[5.065]|.blue[6.34]|.gray[6.79]|.blue[3.16]|.blue[4.33]|.gray[3.255]| |.gray[1.73]|.blue[2.79]|.red[4.805]|.green[6.125]|.red[7.365]|.red[2.665]|.gray[2.93]|.green[6.235]| ]] ]] --- class: split-10 .row.bg-brand-blue.white[.content.vmiddle[ # Balance incomplete block design ]] .row[.split-50[ .column[.content[ ```r M5=lm(harvwtr~block+parcelr) round(tapply(harvwtr,parcelr,mean),4) ``` ``` I II III IV 3.7525 4.5625 4.5417 4.7392 ``` ```r round(tapply(harvwtr,block,mean)[1:4],4) ``` ``` 1 2 3 4 2.2900 2.9850 5.0100 5.6883 ``` ```r round(tapply(harvwtr,block,mean)[5:8],4) ``` ``` 5 6 7 8 7.0750 3.1150 4.0100 5.0183 ``` ]] .column.bg-brand-gray[.content[ ```r anova(M5) ``` ``` Analysis of Variance Table Response: harvwtr Df Sum Sq Mean Sq F value Pr(>F) block 7 53.483 7.6404 15.0542 2.642e-05 *** parcelr 3 4.160 1.3866 2.7322 0.08646 . Residuals 13 6.598 0.5075 --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 ``` .brand-blue[Q17. Why is it a BICBD? What is the diff between parcel I and II and SE of the diff?] ]] ]] --- class: split-10 .row.bg-brand-blue.white[.content.vmiddle[ # Random effect model ]] .row[.split-50[ .column[.content[ ```r library(lme4) M6=lmer(harvwt~parcel+(1|site)) summary(M6); ranef(M6) ``` ``` $site (Intercept) DBAN 0.56668029 LFAN -0.08044263 NSAN -2.10300025 ORAN 2.50566115 OVAN 0.51653423 TEAN -1.19917728 WEAN 1.17917855 WLAN -1.38543406 with conditional variances for "site" ``` .brand-blue[Q18. What is model equation in terms of the matrics and vectors] `\(\color{blue}{\mathbf{X}}\)`, `\(\color{blue}{\boldsymbol{\beta}}\)`, `\(\color{blue}{\mathbf{Z}}\)`, `\(\color{blue}{\boldsymbol{u}}\)` .brand-blue[for M5? What are their estimates and the matrix ] `\(\color{blue}{\mathbf{X}^\top\mathbf{X}}\)`.brand-blue[? When should random effect model be used?] ]] .column.bg-brand-gray[.content[ ``` Linear mixed model fit by REML ['lmerMod'] Formula: harvwt ~ parcel + (1 | site) REML criterion at convergence: 87.2 Scaled residuals: Min 1Q Median 3Q Max -1.4282 -0.6298 0.1011 0.4391 1.4907 Random effects: Groups Name Variance Std.Dev. site (Intercept) 2.3983 1.5486 Residual 0.4503 0.6711 Number of obs: 32, groups: site, 8 Fixed effects: Estimate Std. Error t value (Intercept) 3.6969 0.5967 6.195 parcelII 0.6581 0.3355 1.961 parcelIII 0.7000 0.3355 2.086 parcelIV 1.0212 0.3355 3.044 Correlation of Fixed Effects: (Intr) prclII prcIII parcelII -0.281 parcelIII -0.281 0.500 parcelIV -0.281 0.500 0.500 ``` ]] ]]