class: split-70 with-border hide-slide-number bg-brand-red background-image: url("images/USydLogo-black.svg") background-size: 200px background-position: 2% 90% .column.white[.content[ <br><br><br> # Latin Square Design ## .black[STAT3022 Applied Linear Models Lecture 25] <br><br><br> ### .black[2020/02/17] ]] .column.bg-brand-charcoal[.content.white[ ## Today 1. Latin square design ]] --- class: split-10 .row.bg-brand-blue.white[.content.vmiddle[ # Example - brands of tyres ]] .row[.split-70[ .column[.content[ * A fleet owner wants to determine whether or not different brands of tyres exhibits different amounts of tread loss after 20,000 kms. * There are 4 brands A, B, C and D. * Four different cars are available for the road test. What are: * Treatment: four tyre brands * Experimental unit: tyre of a car * Observational unit: tyre of a car and how many of there are each? ]] .column.bg-brand-gray[.content[ <img src="images/typre.jpg" width="100%" height="100%"> ]] ]] --- class: split-10 .row.bg-brand-blue.white[.content.vmiddle[ # Example - Tyres: Poor designs? ]] .row[.split-50[ .column[.content[ <br> |**Car**|**1**|**2**|**3**|**4**| |-----|-----|------|-----|-----| |Front L|A|B|C|D| |Front R|A|B|C|D| |Back L|A|B|C|D| |Back R|A|B|C|D| * This design is poor because any differences in brands are completely confounded with car difference. * Indeed, it is .brand-blue[not a complete block design] since not all treatments occur in each block if each car forms a block. * Additionally, it is unlikely to be randomised. ]] .column.bg-brand-gray[.content[ <br> |**Car**|**1**|**2**|**3**|**4**| |-----|-----|------|-----|-----| |Front L|A|A|A|A| |Front R|B|B|B|B| |Back L|C|C|C|C| |Back R|D|D|D|D| * Reduce the car difference effect by testing each brand of tyre on each car. * The cars are the blocks in the design. * Each treatment (tyre brand) is observed in each block (car). * The resulting design is a .brand-blue[complete block design] as every treatment occurs exactly the same number of times in each block. ]] ]] --- class: split-10 .row.bg-brand-blue.white[.content.vmiddle[ # Latin square design ]] .row[.split-50[ .column[.content[ In the previous example, it is quite likely that the position of the tyre on the car will affect the response. * Rear tyres wear differently to front tyres. * Introduce position as a second blocking variable in the design to control variation in two dimensions. The introduction of such a second blocking variable results in a .brand-blue[Latin square] design. ### Definition of standard Latin square design A standard `\(t^2\)` Latin square design with treatment labels A, B, ... has first column and first row sorted alphabetically and .brand-blue[every letter occurs in every column and every row exactly once]. (Latin refers to the use of Latin letters for the treatments). ]] .column.bg-brand-gray[.content[ ### Example - `\(2 \times 2\)` LSD There is only one standard `\(2\times 2\)` LSD because from the definition we have the first row and column are given by | |**Column 1**|**Column 2**| |-----|------------|------------| |**Row 1**|A|B| |**Row 2**|B|?| Thus ? = A is fixed. * Small squares have very few degrees of freedom for experimental error. * The number of treatments must equal the number of replicates. * Cannot evaluate interactions between any pair. ]] ]] --- class: split-10 .row.bg-brand-blue.white[.content.vmiddle[ # The standard `\(4\times 4\)` LSD ]] .row[.split-50[ .column[.content[ * We aim to have every brand in every car and in every position. We begin with setting the first row and column to be ABCD to create a standard `\(4 \times 4\)` LSD. Other designs can be obtained by relabelling. |**A**|**B**|**C**|**D**| |-----|-----|------|-----| |**B**|x |x |x | |**C**| | | | |**D**| | | || * For the three cells, it can be ACD, .red[**ADC**], CAD, .red[**CDA**], .red[**DAC**], DCA but only the red color ones are possible. * The second column can be filled to maintain symmetric pattern to fulfill the requirement of all treatments in each row and column. ]] .column.bg-brand-gray[.content[ * Taking .red[**ADC**] as an example, there are two designs: |**A**|**B**|**C**|**D**||**A**|**B**|**C**|**D**| |-----|-----|------|-----||-----|-----|------|-----| |**B**|.red[A] |.red[D] |.red[C] ||**B**|.red[A] |.red[D] |.red[C] | |**C**|.red[D] |.blue[B] |.blue[A] ||**C**|.red[D] |.blue[A] |.blue[B] | |**D**|.red[C] |.blue[A] | B ||**D**|.red[C] |.blue[B] | A | * Take .red[**DAC**] as an example, there is one design: |**A**|**B**|**C**|**D**| |-----|-----|------|-----| |**B**|.red[D] |.red[A] |.red[C] | |**C**|.red[A] |.blue[D] |.blue[B] | |**D**|.red[C] |.blue[B] | A | ]] ]] --- class: split-10 .row.bg-brand-blue.white[.content.vmiddle[ # Cyclic permutation in Latin square design ]] .row[.split-50[ .column[.content[ * We can also quickly fill the remaining cells by shifting the first row by one spot to the left when writing the second row etc which is the .red[**CDA**] case. |**A**|**B**|**C**|**D**| |-----|-----|------|-----| |**B**|C|D|A| |**C**|D|A|B| |**D**|A|B|C| ]] .column.bg-brand-gray[.content[ * For `\(t\)` treatments we can construct a Latin square design by cyclic permutation of the front row, as left. * Other possible non-standard Latin squares can be obtained by permuting rows and columns, eg from that cyclic design to the design below by ordering the columns (1,2,3,4) to (4,1,3,2). |**D**|**A**|**C**|**B**| |-----|-----|------|-----| |**A**|B|D|C| |**B**|C|A|D| |**C**|D|B|A| * Not systematic but still have all treatments in each row and column. ]] ]] --- class: split-10 .row.bg-brand-blue.white[.content.vmiddle[ # Linear regression model for LSD's ]] .row[.split-50[ .column[.content[ Linear regression model for the Latin square design is `$$Y_{ij} = \mu + \alpha_i + \beta_j + \gamma_{k(i,j)} + \epsilon_{ijk},\; \epsilon_{ijk} \sim NID(0,\sigma^2)$$` where `\(i,j=1,\ldots,t\)`, `\(k(i,j)\ne k(i,j')\)` if `\(j\ne j'\)` and `\(k(i,j)\ne k(i',j)\)` if `\(i\ne i'\)`, `\(Y_{ij}\)` is the observation on the `\(i\)` row and `\(j\)` column receiving treatment `\(k(i,j)\)`. * Constraints apply to the above model as per usual. * There are `\(n=t^2\)` observations in total. <br> .brand-blue[For the cyclic LSD] `$$\begin{array} {cccc} k(1,1)=1, & k(1,2)=2, & k(1,3)=3, & k(1,4)=4 \\ k(2,1)=2, & k(2,2)=3, & k(2,3)=4, & k(3,4)=1 \\ k(3,1)=3, & k(3,2)=4, & k(3,3)=1, & k(3,4)=2 \\ k(4,1)=4, & k(4,2)=1, & k(4,3)=2, & k(4,4)=3 \end{array}$$` ]] .column.bg-brand-gray[.content[ ### Analysis of Latin square design * The analysis now proceeds in much the same way as for a block design. * Except we have two blocking terms (rows and columns) plus a treatment term. ### Three-way ANOVA for LSD The sums of squares are `\begin{eqnarray*} \text{SS}_{\text{Row}}&=&\sum_{i=1}^t\frac{Y_{i \bullet}^2}{t}-\frac{Y^2_{\bullet\bullet}}{n}\\ \text{SS}_{\text{Col}}&=&\sum_{i=1}^t\frac{Y_{\bullet j}^2}{t}-\frac{Y^2_{\bullet\bullet}}{n}\\ \text{SS}_{\text{Trt}}&=&\sum_{i=1}^t\frac{Y_{k( \bullet)}^2}{t}-\frac{Y^2_{\bullet\bullet}}{n} \end{eqnarray*}` ]] ]] --- class: split-10 .row.bg-brand-blue.white[.content.vmiddle[ # Three-way ANOVA for LSD: ANOVA table ]] .row[.split-60[ .column[.content[ Again, report the blocking variables before the treatment factor: Source of Variation | Df | Sum Sq | Mean Sq ------------- | ------------- | ------------- | ------------- Row | `\(t-1\)` | `\(\text{SS}_{\text{Row}}\)` | `\(\text{SS}_{\text{Row}}/(t-1)\)` Column | `\(t-1\)` | `\(\text{SS}_{\text{Col}}\)` | `\(\text{SS}_{\text{Col}}/(t-1)\)` Treatment | `\(t-1\)` | `\(\text{SS}_{\text{Trt}}\)` | `\(\text{SS}_{\text{Trt}}/(t-1)\)` Residual | `\((t-1)(t-2)\)` | `\(\text{RSS}\)` | `\(\text{RSS}/\text{Df}_{H_1}\)` Total | `\(t^2-1\)` | `\(\text{S}_{yy}\)` | * The degrees of freedom for the RSS under `\(H_1\)` are $$ \text{df}_{H_1} = t^2 - 1 - 3(t-1)=t^2-3t+2=(t-1)(t-2).$$ ]] .column.bg-brand-gray[.content[ ### Example - Tyres: Old design ```r set.seed(17) T <- c("A","B","C","D") des2 <- replicate(4, sample(T)) dimnames(des2) <- list( c("Front L", "Front R", "Back L", "Back R"), c(1:4)) des2 ``` ``` 1 2 3 4 Front L "B" "C" "C" "A" Front R "A" "D" "B" "D" Back L "D" "A" "D" "B" Back R "C" "B" "A" "C" ``` which is **not** a Latin square design because the first row has two C. ]] ]] --- class: split-10 .row.bg-brand-blue.white[.content.vmiddle[ # Example - Modified tread tyre loss experiment ]] .row[.split-50[ .column[.content[ The design is |**B**|**D**|**C**|**A**| |-----|-----|------|-----| |**D**|B|A|C| |**C**|A|D|B| |**A**|C|B|D| The observed data are: | |**1**|**2**|**3**|**4**|.brand-blue[**Total**]| |-----|-----|------|-----|-----|-----| |Front L|.red[17]|.green[14]|.blue[13]|.gray[13]|.brand-blue[**57**]| |Front R|.green[14]|.red[14]|.gray[13]|.blue[8]|.brand-blue[**49**]| |Back L|.blue[12]|.gray[12]|.green[10]|.red[9]|.brand-blue[**43**]| |Back R|.gray[13]|.blue[11]|.red[11]|.green[9]|.brand-blue[**44**]| |.brand-blue[**Total**]|.brand-blue[**56**]|.brand-blue[**51**]|.brand-blue[**47**]|.brand-blue[**39**]|.brand-blue[**193**]| Brand total are A: `\(Y_{1(\bullet)}=\color{grey}{51}\)`, B: `\(Y_{2(\bullet)}=\color{red}{51}\)`, C: `\(Y_{3(\bullet)}=\color{blue}{44}\)`, D: `\(Y_{4(\bullet)}=\color{green}{47}\)`. ]] .column.bg-brand-gray[.content[ The factor sums of squares are `\begin{eqnarray*} \text{SS}_{\text{Car}}&=&\frac{56^2+\dots+39^2}{4}-\frac{193^2}{16}=38.7\\ \text{SS}_{\text{Position}}&=&\frac{57^2+\dots+44^2}{4}-\frac{193^2}{16}=30.7\\ \text{SS}_{\text{Brand}}&=&\frac{51^2+\dots+47^2}{4}-\frac{193^2}{16}=8.7 \end{eqnarray*}` So we get the following ANOVA table **Source of Variation** | **Df** | **Sum Sq** | **Mean Sq** | **F** | ------------- | ------------- | ------------- | -------------|-------| Cars | 3 | 38.7 | 13.8 |26.9 Positioin | 3 | 30.7 | 11.0 | 21.3 Tyre | 3 | 8.7 | 2.9 | 6.0 Residual | 6 | 2.8 |0.47 | Total |15 | 80.9 | 5.4 | ]] ]] --- class: split-10 .row.bg-brand-blue.white[.content.vmiddle[ # Example - Modified tyres: Treatment effect? ]] .row[.split-50[ .column[.content[ The test for differences in mean tread loss ie. `\(H_0: \gamma_1=\gamma_2=\gamma_3=\gamma_4\)` is ```r 1-pf(6.0,3,6) ``` ``` [1] 0.03079579 ``` so there is sufficient evidence that the tyre brands are different in tread loss. ### Error variance in LSD * An estimate for the error variance is `$$\hat \sigma^2=\text{RSS}/\text{df}_{H_1}= \frac{\text{RSS}}{(t-2)(t-1)}.$$` ]] .column.bg-brand-gray[.content[ ### Analysis in R ```r Y <- c(17,14,12,13,14,14,12,11,13,13,10,11,13,8,9,9) car <- factor(rep(1:4,each=4)) position <- factor(rep(1:4,4)) brand <- c("B","D","C","A", "D","B","A","C", "C","A","D","B", "A","C","B","D") dat <- data.frame(Y, car, brand, position) M1 <- lm(Y ~ car + position + brand, dat) anova(M1) ``` ``` Analysis of Variance Table Response: Y Df Sum Sq Mean Sq F value Pr(>F) car 3 38.688 12.8958 26.9130 0.000705 *** position 3 30.688 10.2292 21.3478 0.001330 ** brand 3 8.688 2.8958 6.0435 0.030318 * Residuals 6 2.875 0.4792 --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 ``` ]] ]] --- class: split-10 .row.bg-brand-blue.white[.content.vmiddle[ # ANOVA table from a balance design ]] .row[.split-50[ .column[.content[ ```r M2 <- lm(Y ~ position + car + brand, dat) anova(M2) ``` ``` Analysis of Variance Table Response: Y Df Sum Sq Mean Sq F value Pr(>F) position 3 30.688 10.2292 21.3478 0.001330 ** car 3 38.688 12.8958 26.9130 0.000705 *** brand 3 8.688 2.8958 6.0435 0.030318 * Residuals 6 2.875 0.4792 --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 ``` ]] .column.bg-brand-gray[.content[ ```r M3 <- lm(Y ~ brand + position + car, dat) anova(M3) ``` ``` Analysis of Variance Table Response: Y Df Sum Sq Mean Sq F value Pr(>F) brand 3 8.688 2.8958 6.0435 0.030318 * position 3 30.688 10.2292 21.3478 0.001330 ** car 3 38.688 12.8958 26.9130 0.000705 *** Residuals 6 2.875 0.4792 --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 ``` So swapping the order of the three factors in the model has no effect on the Sum of Squares. The desing is .brand-blue[orthogonal]. ]] ]] --- layout: false class: split-10 .row.bg-brand-blue.white[.content.vmiddle[ # Parameter estimates ]] .row[.content[ ```r options(scipen=999); round(coef(M1),4) #coeff ``` ``` (Intercept) car2 car3 car4 position2 position3 16.875 -1.250 -2.250 -4.250 -2.000 -3.500 position4 brandB brandC brandD -3.250 0.000 -1.750 -1.000 ``` ```r yc=tapply(Y,car,mean); yp=tapply(Y,position,mean); yb=tapply(Y,brand,mean); ym=mean(Y) c(yc,yp,yb,ym) #gp means by car, position and treatment; overall mean ``` ``` 1 2 3 4 1 2 3 4 A B 14.0000 12.7500 11.7500 9.7500 14.2500 12.2500 10.7500 11.0000 12.7500 12.7500 C D 11.0000 11.7500 12.0625 ``` ```r c(yc[1]+yp[1]+yb[1]-2*ym,yc[2]-yc[1],yc[3]-yc[1],yc[4]-yc[1],yp[2]-yp[1],yp[3]-yp[1],yp[4]-yp[1],yb[2]-yb[1], yb[3]-yb[1],yb[4]-yb[1]) #check coeff ``` ``` 1 2 3 4 2 3 4 B C D 16.875 -1.250 -2.250 -4.250 -2.000 -3.500 -3.250 0.000 -1.750 -1.000 ``` ]] --- layout: false class: split-10 .row.bg-brand-blue.white[.content.vmiddle[ # Predict values ]] .row[.content[ Note: `\(\hat Y_{ij}=\mu+\alpha_i+\beta_j+\gamma_{k(i,j)}\)`, `\(i,j,k=1, \dots,4\)`, `\(\alpha_1=\beta_1=\gamma_1=0\)`, `\((i,j,k)\)` denotes car, position and treatment. E.g `\(Y_2 = \mu+\beta_2+\gamma_4\)`, `\((i,j,k)=(1,2,4)\)` and `\(Y_8 = \mu + \alpha_2 + \beta_4 + \gamma_3\)`, `\((i,j,k)=(2,4,3)\)`. ```r ypred=Y-residuals(M1); ypred #predict values ``` ``` 1 2 3 4 5 6 7 8 9 10 11 16.875 13.875 11.625 13.625 14.625 13.625 12.125 10.625 12.875 12.625 10.125 12 13 14 15 16 11.375 12.625 8.875 9.125 8.375 ``` ```r c=coef(M1) # mu, c2,c3,c4, p2,p3,p4, t2,t3,t4 yp=c(c[1]+c[8], c[1]+c[5]+c[10], c[1]+c[6]+c[9], c[1]+c[7], c[1]+c[2]+c[10],c[1]+c[2]+c[5]+c[8],c[1]+c[2]+c[6], c[1]+c[2]+c[7]+c[9], c[1]+c[3]+c[9], c[1]+c[3]+c[5], c[1]+c[3]+c[6]+c[10],c[1]+c[3]+c[7]+c[8], c[1]+c[4], c[1]+c[4]+c[5]+c[9],c[1]+c[4]+c[6]+c[8], c[1]+c[4]+c[7]+c[10]) unname(yp) #check predict values ``` ``` [1] 16.875 13.875 11.625 13.625 14.625 13.625 12.125 10.625 12.875 12.625 [11] 10.125 11.375 12.625 8.875 9.125 8.375 ``` ]] --- layout: false class: split-10 .row.bg-brand-blue.white[.content.vmiddle[ # Treatment contrasts ]] .row[.content[ It follows for any treatment contrast `$$\sum_{k=1}^t c_k \gamma_k, \hspace{3mm} \text{where} \hspace{3mm} \sum_{k=1}^t c_k = 0,$$` that `\(\text{E}\left(\sum\limits_{k=1}^t c_k \bar{Y}_{k(\bullet)}\right)=\sum\limits_{k=1}^t c_k \gamma_k\)`, and the estimator `\(\sum\limits_{k=1}^t c_k \bar{Y}_{k(\bullet)}\)` can be shown to have standard error `$$\text{SE}\left(\sum\limits_{k=1}^t c_k \bar{Y}_{k(\bullet)}\right)=\sqrt{\sum\limits_{k=1}^t c_k^2 \frac{\hat{\sigma}^2}{t}},$$` where `\(\hat{\sigma}^2\)` is the estimated error variance of the LSD. <br> Note: `\(\text{E}\left(\bar{Y}_{k(\bullet)}\right)= \mu + \frac{1}{t}\sum\limits_{i=1}^t \alpha_i + \frac{1}{t}\sum\limits_{j=1}^t \beta_j + \gamma_k+\bar{\epsilon}_{k(\bullet)}\)`. Hence `\(\text{E}\left(\bar{Y}_{k_1(\bullet)}-\bar{Y}_{k_2(\bullet)}\right)= \gamma_{k_1}-\gamma_{k_2}\)` since `\(\text{E}\left(\bar{\epsilon}_{k(\bullet)}\right)=0\)`. ]] --- class: split-10 .row.bg-brand-blue.white[.content.vmiddle[ # Examples ]] .row[.split-40[ .row[.split-one[ .column[.content[ ### Sudoku * Any solution to a Sudoku puzzle is a Latin square, with the additional restriction that the nine `\(3 \times 3\)` disjoint subsquares must also contain the digits 1-9. * Find at least one Latin square solution for the following Sudoku-type LSD (easy with Sudoku solver otherwise very hard): ]] ]] .row[.split-three[ .column.bg-gray.white[.content[ ### Find all `\(5 \times 5\)` LSD solutions | **1** | | | | | |-------|-------|-------|-------|-------| | | | **1** | **5** | **3** | | | **5** | | **2** | | | | **1** | **5** | | **2** | | | **3** | | **1** | || ]] .column.bg-gray.white[.content[ ### One solution: | **1** | .yellow[**2**] | .yellow[**4**] | .yellow[**3**] | .yellow[**5**] | |-------|-------|-------|-------|-------| | .yellow[**2**] | .yellow[**4**] | **1** | **5** | **3** | | .yellow[**4**] | **5** | .yellow[**3**] | **2** | .yellow[**1**] | | .yellow[**3**] | **1** | **5** | .yellow[**4**] | **2** | | .yellow[**5**] | **3** | .yellow[**2**] | **1** | .yellow[**4**] || ]] .column.bg-gray.white[.content[ ### Another solution: | **1** | .yellow[**2**] | .yellow[**3**] | .yellow[**4**] | .yellow[**5**] | |-------|-------|-------|-------|-------| | .yellow[**2**] | .yellow[**4**] | **1** | **5** | **3** | | .yellow[**3**] | **5** | .yellow[**4**] | **2** | .yellow[**1**] | | .yellow[**4**] | **1** | **5** | .yellow[**3**] | **2** | | .yellow[**5**] | **3** | .yellow[**2**] | **1** | .yellow[**4**] || ]] ]]]]