class: split-70 with-border hide-slide-number bg-brand-red background-image: url("images/USydLogo-black.svg") background-size: 200px background-position: 2% 90% .column.white[.content[ <br><br><br> # Data Wrangling and Visualisation in R ## .black[STAT3022 Applied Linear Models Lecture 2] <br><br><br> ### .black[2020/02/14] ]] .column.bg-brand-charcoal[.content.white[ ## Today 1. Using .yellow[ggplot2] for data visualisation. 2. Using .yellow[dplyr] and .yellow[tidyr] for data wrangling. ]] --- class: split-10 .row.bg-brand-blue.white[.content.vmiddle[ # Data Visualisation ]] .row[.content.black[ ## Why make your graphs in `R`? * The graphs are easily .indigo[reproducible]. * You can make .indigo[publication quality] graphs. ## How to make your graphs in `R`? * `R` has many contributed packages that extend the standard `base` installation. * Today we will learn about the `ggplot2` R package. .blockquote[ * What is `base`? * Can you name some functions in `base` that generate a graph? ] ]] --- class: split-30 with-border .column.bg-brand-blue[.content.center.vmiddle[ <img src="images/hex-ggplot2.png" width="80%"/> ]] .column[.content[ # `ggplot2` R package * `ggplot2` is a powerful data visualisation R package with a large community following that is built on the .indigo[layered grammar of graphics] by Wickham (2008). * One reason it is so powerful is that it is easy to extend, which has led to many extension packages. * `ggplot2` uses `qplot` or `ggplot` to make graphics. * `qplot` is useful for making quick graphs (especially when data is not in a `data.frame`) but `ggplot` is advisable for most occasions. * We will only cover `ggplot`. * To get started, load the package: ```r library(ggplot2) # or library(tidyverse) ``` .bottom_abs.width100.font_small[ Wickham (2008) Practical tools for exploring data and models. PhD Thesis. 
] ]] --- class: split-70 with-border .column[.content[ # Layered Grammar of Graphics * Every `ggplot2` object has three key components: 1. .indigo[data], 1. A set of .indigo[aesthetic mappings] between variables in the data and visual properties (e.g. color, size), 1. At least one .indigo[layer] describing how to render each observation; usually created with a .indigo[geom] function. ```r str(iris) ``` ``` 'data.frame': 150 obs. of 5 variables: $ Sepal.Length: num 5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ... $ Sepal.Width : num 3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ... $ Petal.Length: num 1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ... $ Petal.Width : num 0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ... $ Species : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1 1 1 1 1 ... ``` ]] .column.bg-brand-blue[.content[ ```r ggplot(data=iris) + aes(x=Sepal.Length, y=Sepal.Width) + geom_point() ``` <img src="images/irisplot.svg" width="100%"> ]] --- class: split-70 with-border .column[.content[ # Every .black[layer] has: 1. `geom` - the geometric object used to display the data, and `stat` - the statistical transformation to apply to the data for this layer. 1. .indigo[data] and .indigo[mapping] (aesthetics), which are usually inherited from the `ggplot()` object. 1. `position` - the position adjustment applied to overlapping objects. ```r p <- ggplot(iris, aes(Sepal.Length, Sepal.Width)) p + geom_point() # blank + geom layer ``` which is a short-hand for: ```r p + layer(geom="point", stat="identity", position="identity") ``` ]] .column.bg-brand-blue.white[.content[ Every ggplot object has: 1. Data 1. Aesthetic mapping 1. Layer(s) The purpose of a layer is to display: * the raw .yellow[data], * a .yellow[statistical summary], or * additional .yellow[metadata] such as context, annotations, and references. 
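
The three purposes of a layer can be combined in one plot. The following is a minimal sketch using the built-in `iris` data; the object name `p_layers`, the annotation text and its position are illustrative assumptions and are not part of the original slides.

```r
library(ggplot2)

# One plot, three layers: raw data, a statistical summary, and metadata.
# `p_layers` is a hypothetical name chosen for this sketch.
p_layers <- ggplot(iris, aes(Sepal.Length, Sepal.Width)) +
  geom_point() +                                 # layer 1: the raw data
  geom_smooth(method = "lm", formula = y ~ x) +  # layer 2: a statistical summary (fitted line)
  annotate("text", x = 7, y = 4.3,               # layer 3: metadata (illustrative position)
           label = "n = 150 flowers")

# Each `+ geom_*()` or `annotate()` call appends one layer to the plot object.
length(p_layers$layers)
```

Printing `p_layers` draws all three layers in the order they were added, so later layers are drawn on top of earlier ones.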
]] --- class: split-40 .row[ .split-50[ .column[.content[ # Some `geom` objects ```r p <- ggplot(iris, aes(Species, Sepal.Width)) class(p) ``` ``` [1] "gg" "ggplot" ``` ]] .column[.content.vmiddle[ .img-fill[![](images/iris.png)] <br> .font_small[ Image source:<br> http://suruchifialoke.com/2016-10-13-machine-learning-tutorial-iris-classification/ ] ]]] ] .row[ .split-four[ .column.center[ ```r p + geom_blank() ``` ![](lecture02_2020JC_files/figure-html/unnamed-chunk-9-1.png)<!-- --> ] .column.center[ ```r p + geom_point() ``` ![](lecture02_2020JC_files/figure-html/unnamed-chunk-10-1.png)<!-- --> ] .column.center[ ```r p + geom_boxplot() ``` ![](lecture02_2020JC_files/figure-html/unnamed-chunk-11-1.png)<!-- --> ] .column.center[ ```r p + geom_violin() ``` ![](lecture02_2020JC_files/figure-html/unnamed-chunk-12-1.png)<!-- --> ] ]] --- class: split-20 .row[.content.nopadding[ # Drawing lines ```r p <- ggplot(iris, aes(Petal.Length, Petal.Width)) + geom_point(colour="gray") ``` ]] .row[ .split-two[ .row[ .split-two[ .column.center[ ```r p + geom_abline(intercept=-0.4,slope=0.4) ``` <img src="lecture02_2020JC_files/figure-html/unnamed-chunk-14-1.png" style="display: block; margin: auto;" /> ] .column.center[ ```r p + geom_smooth(method="lm") ``` <img src="lecture02_2020JC_files/figure-html/unnamed-chunk-15-1.png" style="display: block; margin: auto;" /> ] ]] .row[ .split-two[ .column.center[ ```r p + geom_hline(yintercept=0) ``` <img src="lecture02_2020JC_files/figure-html/unnamed-chunk-16-1.png" style="display: block; margin: auto;" /> ] .column.center[ ```r p + geom_vline(xintercept=0) ``` <img src="lecture02_2020JC_files/figure-html/unnamed-chunk-17-1.png" style="display: block; margin: auto;" /> ] ]]]] --- class: split-20 .row[.content.nopadding[ # Distribution by group ```r p <- ggplot(iris, aes(Petal.Width, fill=Species)) ``` ]] .row[ .split-two[ .row[ .split-two[ .column.center[ ```r p + geom_dotplot() ``` <img 
src="lecture02_2020JC_files/figure-html/unnamed-chunk-19-1.png" style="display: block; margin: auto;" /> ] .column.center[ ```r p + geom_histogram() ``` <img src="lecture02_2020JC_files/figure-html/unnamed-chunk-20-1.png" style="display: block; margin: auto;" /> ] ]] .row[ .split-two[ .column.center[ ```r p + geom_density() ``` <img src="lecture02_2020JC_files/figure-html/unnamed-chunk-21-1.png" style="display: block; margin: auto;" /> ] .column.center[ ```r p + geom_freqpoly(aes(color=Species)) ``` <img src="lecture02_2020JC_files/figure-html/unnamed-chunk-22-1.png" style="display: block; margin: auto;" /> ] ]]]] --- class: split-70 with-border .column[.content.pad10px[ .font_sm55[ geom Description ---------------- ---------------------------------------------------------------- geom_abline Reference lines: horizontal, vertical, and diagonal geom_bar Bar charts geom_bin2d Heatmap of 2d bin counts geom_blank Draw nothing geom_boxplot A box and whiskers plot (in the style of Tukey) geom_contour 2d contours of a 3d surface geom_count Count overlapping points geom_density Smoothed density estimates geom_density_2d Contours of a 2d density estimate geom_dotplot Dot plot geom_errorbarh Horizontal error bars geom_hex Hexagonal heatmap of 2d bin counts geom_freqpoly Histograms and frequency polygons geom_jitter Jittered points geom_crossbar Vertical intervals: lines, crossbars & errorbars geom_map Polygons from a reference map geom_path Connect observations geom_point Points geom_polygon Polygons geom_qq_line A quantile-quantile plot geom_quantile Quantile regression geom_ribbon Ribbons and area plots geom_rug Rug plots in the margins geom_segment Line segments and curves geom_smooth Smoothed conditional means geom_spoke Line segments parameterised by location, direction and distance geom_label Text geom_raster Rectangles geom_violin Violin plot ]]] .column.bg-brand-blue[.content.vmiddle.white.center[ # .font-mono.font_large[geom] ]] --- class: split-70 
.column[.content.font_sm100[ # Statistical Transformation ```r head(iris[, c("Petal.Width", "Species")]) # raw data ``` ``` Petal.Width Species 1 0.2 setosa 2 0.2 setosa 3 0.2 setosa 4 0.2 setosa 5 0.2 setosa 6 0.4 setosa ``` `stat_bin(bins=7, mapping=aes(Petal.Width, fill=Species))`
<i class="fas fa-arrow-down faa-none animated "></i>
Under the hood, the raw data are transformed into summary statistics, which are then passed on to the `geom`; for `stat_bin`, the default is `geom="bar"`. ``` fill y count x xmin xmax density ncount 1 #619CFF 0 0 0.0 -0.2 0.2 0.0 0.0000000 2 #00BA38 0 0 0.0 -0.2 0.2 0.0 0.0000000 3 #F8766D 34 34 0.0 -0.2 0.2 1.7 1.0000000 4 #619CFF 0 0 0.4 0.2 0.6 0.0 0.0000000 5 #00BA38 0 0 0.4 0.2 0.6 0.0 0.0000000 6 #F8766D 16 16 0.4 0.2 0.6 0.8 0.4705882 ``` ]] .column[.vmiddle[ ![](lecture02_2020JC_files/figure-html/unnamed-chunk-26-1.png)<!-- --> ]] --- class: split-20 .row[.content[ # Using `stat` with different `geom` objects ```r p <- ggplot(iris, aes(Petal.Width, fill=Species)) ``` ]] .row[ .split-two[ .row[ .split-two[ .column.center[ ```r p + stat_bin() ``` <img src="lecture02_2020JC_files/figure-html/unnamed-chunk-28-1.png" style="display: block; margin: auto;" /> ] .column.center[ ```r p + stat_bin(geom="bar") ``` <img src="lecture02_2020JC_files/figure-html/unnamed-chunk-29-1.png" style="display: block; margin: auto;" /> ] ]] .row[ .split-two[ .column.center[ ```r p + stat_bin(geom="point") ``` <img src="lecture02_2020JC_files/figure-html/unnamed-chunk-30-1.png" style="display: block; margin: auto;" /> ] .column.center[ ```r p + stat_bin(geom="line") ``` <img src="lecture02_2020JC_files/figure-html/unnamed-chunk-31-1.png" style="display: block; margin: auto;" /> ] ]]]] --- class: split-70 with-border .column[.content.pad10px[ .font_sm74[ stat Description -------------------- ---------------------------------------------------------------- stat_count Bar charts stat_bin_2d Heatmap of 2d bin counts stat_boxplot A box and whiskers plot (in the style of Tukey) stat_contour 2d contours of a 3d surface stat_sum Count overlapping points stat_density Smoothed density estimates stat_density_2d Contours of a 2d density estimate stat_bin_hex Hexagonal heatmap of 2d bin counts stat_bin Histograms and frequency polygons stat_qq_line A quantile-quantile plot stat_quantile Quantile regression stat_smooth 
Smoothed conditional means stat_spoke Line segments parameterised by location, direction and distance stat_ydensity Violin plot stat_sf Visualise sf objects stat_ecdf Compute empirical cumulative distribution stat_ellipse Compute normal confidence ellipses stat_function Compute function for each x value stat_identity Leave data as is stat_sf_coordinates Extract coordinates from 'sf' objects stat_summary_bin Summarise y values at unique/binned x stat_summary_2d Bin and summarise in 2d (rectangle & hexagons) stat_unique Remove duplicates ]]] .column.bg-blue[.content.vmiddle.white.center[ # .font-mono.font_large[stat] ]] --- class: bg-brand-red middle center white # Customisation -- <br><br> ### There are so many ways to customise a `ggplot`. --- class: split-two with-border .column[.content[ # Changing Color There are many color palettes available, e.g. ```r library(RColorBrewer) ggplot(iris, aes(Petal.Width, * fill=Species)) + geom_dotplot() + * scale_fill_brewer(palette="Set3") ``` ![](lecture02_2020JC_files/figure-html/unnamed-chunk-33-1.png)<!-- --> ]] .column[ ![](lecture02_2020JC_files/figure-html/unnamed-chunk-34-1.png)<!-- --> ] --- class: split-two with-border .column[.content[ # Grey-scale ```r ggplot(iris, aes(Petal.Width, fill=Species)) + geom_dotplot() + * scale_fill_grey() ``` <img src="lecture02_2020JC_files/figure-html/unnamed-chunk-35-1.png" style="display: block; margin: auto;" /> ]] .column[.content[ # Manual scale .font_sm85[ ```r ggplot(iris, aes(Petal.Width, fill=Species)) + geom_dotplot() + *scale_fill_manual( * values=c("red","blue", "green"), * labels=c("setosa", "versicolor", "virginica")) ``` <img src="lecture02_2020JC_files/figure-html/unnamed-chunk-36-1.png" style="display: block; margin: auto;" /> ]]] --- class: split-two with-border .column[.content[ # Color variable is `factor` .font_sm60[ ```r ggplot(iris, aes(Petal.Width, Petal.Length, * color=Species)) + geom_point(size=2) + * scale_color_brewer(palette="Set1") ``` <img 
src="lecture02_2020JC_files/figure-html/unnamed-chunk-37-1.png" style="display: block; margin: auto;" /> ]]] .column[.content[ # Color variable is continuous .font_sm60[ ```r ggplot(iris, aes(Petal.Width, Petal.Length, * color=Sepal.Length)) + geom_point(size=2) + * scale_color_distiller(palette="YlGnBu") ``` <img src="lecture02_2020JC_files/figure-html/unnamed-chunk-38-1.png" style="display: block; margin: auto;" /> ]]] --- class: bg-brand-red middle center white # Data Wrangling -- <br><br> ### You may need to wrangle the data to get it in the right form for <br> `ggplot` (or other purposes). --- class: split-70 .column[.content[ ```r library(agridat) # data is inside here library(dplyr) # for data wrangling; loaded together with library(tidyverse) str(pearl.kernels) # or glimpse(pearl.kernels) ``` ``` 'data.frame': 59 obs. of 6 variables: $ ear: Factor w/ 4 levels "Ear08","Ear09",..: 1 1 1 1 1 1 1 1 1 1 ... $ obs: Factor w/ 15 levels "Obs01","Obs02",..: 1 2 3 4 5 6 7 8 9 10 ... $ ys : int 352 322 298 332 305 313 308 311 327 308 ... $ yt : int 102 49 75 101 101 100 86 101 101 92 ... $ ws : int 52 82 108 71 86 90 95 92 78 95 ... $ wt : int 26 79 51 28 40 29 43 28 26 37 ... ``` * We are using the data `pearl.kernels` loaded from `library(agridat)`. * The data contains the counts of yellow/white and sweet/starchy kernels on each of 4 maize ears by 15 observers. * I want to get the counts for the 8th maize ear by observer 1 (plant pathologist). 
```r (ear8obs1 <- pearl.kernels %>% filter(ear=="Ear08" & obs=="Obs01")) ``` ``` ear obs ys yt ws wt 1 Ear08 Obs01 352 102 52 26 ``` .bottom_abs.width100.font_small[ Pearl, Raymond (1911) The Personal Equation in Breeding Experiments Involving Certain Characters of Maize. *Biological Bulletin* **21**, 339-366 ] ]] .column.bg-brand-yellow[.content.vmiddle[ .img-fill[![](images/corn-parts.png)] .font_small[Image source: http://corncommentary.com/2012/05/22/using-the-kfc-kernel-for-cellulosic/ ] ]] --- class: split-50 with-border .column[.content[ # Help! The data: ```r ear8obs1 ``` ``` ear obs ys yt ws wt 1 Ear08 Obs01 352 102 52 26 ``` How do I make the graph below in `ggplot`? ![](lecture02_2020JC_files/figure-html/unnamed-chunk-43-1.png)<!-- --> ]] .column.bg-brand-yellow[.content[ ```r ggplot(ear8obs1, aes(x=..., y=...)) + geom_bar() ``` {{content}} ]] -- What if the data were shaped as below? ``` Type Count Color Kernel 1 ys 352 Yellow Starchy 2 yt 102 Yellow Sweet 3 ws 52 White Starchy 4 wt 26 White Sweet ``` {{content}} -- How do I get the data in this shape easily? 
{{content}} -- ```r ear8obs1 %>% * tidyr::gather("Type", "Count", ys:wt) ``` ``` ear obs Type Count 1 Ear08 Obs01 ys 352 2 Ear08 Obs01 yt 102 3 Ear08 Obs01 ws 52 4 Ear08 Obs01 wt 26 ``` --- class: split-70 with-border .column[.content[ ## Data Wrangling Get the counts for the 8th maize ear by observer 1 (plant pathologist): ```r maize <- pearl.kernels %>% filter(ear=="Ear08" & obs=="Obs01") %>% select(ys, yt, ws, wt) %>% tidyr::gather("Type", "Count", ys:wt) %>% mutate(Color=case_when( Type %in% c("ys", "yt") ~ "Yellow", Type %in% c("ws", "wt") ~ "White" ),Kernel=case_when( Type %in% c("ys", "ws") ~ "Starchy", Type %in% c("yt", "wt") ~ "Sweet")) maize ``` ``` Type Count Color Kernel 1 ys 352 Yellow Starchy 2 yt 102 Yellow Sweet 3 ws 52 White Starchy 4 wt 26 White Sweet ``` ] ]] .column.bg-brand-yellow[.content.vmiddle.nopadding[ ]] --- class: split-70 with-border .column.bg-white[.content[ # Example: Observer 1 for Maize Ear 8 ```r ggplot(maize, aes(Kernel, Count, fill=Color)) + geom_bar(stat="identity") ``` <img src="lecture02_2020JC_files/figure-html/unnamed-chunk-47-1.png" style="display: block; margin: auto;" /> ]] .column.bg-brand-yellow[.content.vmiddle[ <img src="images/corn-farm.jpg" width="100%"> .font_small[ Image Source:<br> https://agrifarmingtips.com/maize-cultivation-process/] ]] --- class: split-70 with-border count: false .column[.content[ # Example: Observer 1 for Maize Ear 8 ```r ggplot(maize, aes(Kernel, Count, fill=Color)) + geom_bar(stat="identity", color="black") + scale_fill_manual(values=c("white", "yellow"), labels=c("White", "Yellow")) + guides(fill="none") + theme_minimal(base_size = 20) ``` <img src="lecture02_2020JC_files/figure-html/unnamed-chunk-48-1.png" style="display: block; margin: auto;" /> ]] .column.bg-brand-yellow[.content.vmiddle[ <img src="images/corn-farm.jpg" width="100%"> .font_small[ Image Source:<br> https://agrifarmingtips.com/maize-cultivation-process/] ]] --- class: split-20 .row[.content.nopadding[ ## Position for 
`geom_bar` (each call includes `stat="identity"`) ```r p2 <- ggplot(maize, aes("",Count,fill=Type)) ``` ]] .row[ .split-two[ .row[ .split-two[ .column.center[ ```r p2 + geom_bar() ``` <img src="lecture02_2020JC_files/figure-html/unnamed-chunk-51-1.png" style="display: block; margin: auto;" /> ] .column.center[ ```r p2 + geom_bar(position="stack") ``` <img src="lecture02_2020JC_files/figure-html/unnamed-chunk-53-1.png" style="display: block; margin: auto;" /> ] ]] .row[ .split-two[ .column.center[ ```r p2 + geom_bar(position="dodge") ``` <img src="lecture02_2020JC_files/figure-html/unnamed-chunk-55-1.png" style="display: block; margin: auto;" /> ] .column.center[ ```r p2 + geom_bar(position="fill") ``` <img src="lecture02_2020JC_files/figure-html/unnamed-chunk-57-1.png" style="display: block; margin: auto;" /> ] ]]]] --- class: split-10 .row[.content[ ## Coordinate system ]] .row[.content[ .split-two[ .row[ .split-two[ .column[.content[ .center[ ```r p + geom_bar() ``` <img src="lecture02_2020JC_files/figure-html/unnamed-chunk-60-1.png" style="display: block; margin: auto;" /> ]]] .column[.content.center[ ```r p + geom_bar() + coord_polar(theta="y") ``` <img src="lecture02_2020JC_files/figure-html/unnamed-chunk-62-1.png" style="display: block; margin: auto;" /> ]]]] .row[ .split-two[ .column[.content.center[ ```r p + geom_bar() + coord_flip() ``` <img src="lecture02_2020JC_files/figure-html/unnamed-chunk-64-1.png" style="display: block; margin: auto;" /> ]] .column[.content.center[ ```r p + geom_bar() + coord_polar(theta="y", direction=-1) ``` <img src="lecture02_2020JC_files/figure-html/unnamed-chunk-66-1.png" style="display: block; margin: auto;" /> ]]]]]]] .bottom_abs.width100[All `geom_bar` calls include the arguments `stat="identity"` and `color="black"`.] --- class: split-20 .row[.content[ ## Overplotting ```r g <- ggplot(pearl.kernels, aes(ear, ys, color=ear, size=1)) + xlab(NULL) + guides(color="none", size="none") + ylab("No. 
of Yellow\n Starchy Kernel") ``` ]] .row[ .split-two[ .row[ .split-two[ .column[.content.center[ ```r g + geom_point() ``` ![](lecture02_2020JC_files/figure-html/unnamed-chunk-68-1.png)<!-- --> ]] .column[.content.center[ ```r g + geom_point(position="jitter") ``` ![](lecture02_2020JC_files/figure-html/unnamed-chunk-69-1.png)<!-- --> ]]]] .row[ .split-two[ .column[.content.center[ ```r g + geom_point(alpha=1 / 3) ``` ![](lecture02_2020JC_files/figure-html/unnamed-chunk-70-1.png)<!-- --> ]] .column[.content.center[ ```r g + geom_point(alpha=1 / 6) ``` ![](lecture02_2020JC_files/figure-html/unnamed-chunk-71-1.png)<!-- --> ]]]] ]] --- class: split-two .row[.split-70[ .column[.content[ # Massaging data to tidy form ```r maize_all <- pearl.kernels %>% tidyr::gather("Type", "Count", ys:wt) %>% mutate(Color=ifelse(substr(Type, 1, 1)=="y", "Yellow", "White"), Kernel=ifelse(substr(Type, 2, 2)=="s", "Starchy", "Sweet"), obs=factor(as.integer(substring(obs, 4, 5)))) ``` ]] .column[.content[ .img-fill[![](images/tidyr-spread-gather.gif)] ]] ]] .row[.split-two[ .column[.content[ ```r head(pearl.kernels) ``` ``` ear obs ys yt ws wt 1 Ear08 Obs01 352 102 52 26 2 Ear08 Obs02 322 49 82 79 3 Ear08 Obs03 298 75 108 51 4 Ear08 Obs04 332 101 71 28 5 Ear08 Obs05 305 101 86 40 6 Ear08 Obs06 313 100 90 29 ``` ]] .column[.content[ ```r head(maize_all) ``` ``` ear obs Type Count Color Kernel 1 Ear08 1 ys 352 Yellow Starchy 2 Ear08 2 ys 322 Yellow Starchy 3 Ear08 3 ys 298 Yellow Starchy 4 Ear08 4 ys 332 Yellow Starchy 5 Ear08 5 ys 305 Yellow Starchy 6 Ear08 6 ys 313 Yellow Starchy ``` ]] ]] --- class: split-two .column[.content[ # Faceting ```r ggplot(maize_all, aes(obs, Count, fill=Type)) + geom_bar(stat="identity") + xlab("Observer") + * facet_wrap(~ear) ``` <img src="lecture02_2020JC_files/figure-html/unnamed-chunk-75-1.png" style="display: block; margin: auto;" /> ]] .column[.content[ # ```r ear8 <- maize_all %>% filter(ear=="Ear08") %>% ggplot(aes(obs, Count, fill=Type)) + 
geom_bar(stat="identity", show.legend=FALSE) + labs(tag="(A)", title="Ear 8", x="Observer") + * facet_grid(Color ~ Kernel) ``` <img src="lecture02_2020JC_files/figure-html/unnamed-chunk-77-1.png" style="display: block; margin: auto;" /> ]] --- class: split-70 .column[.content[ # Patching Plots Together ```r library(patchwork) ear8 + ear9 + ear10 + ear11 + plot_layout(ncol = 2) ``` ![](lecture02_2020JC_files/figure-html/unnamed-chunk-79-1.png)<!-- --> ]] .column.bg-brand-blue[.content.vmiddle.center[ <img src="images/hex-patchwork.png" width="80%"> ]] --- class: split-60 with-border .column.bg-brand-blue.white[.content[ # Changing Labels ```r g <- ggplot(vargas.wheat1.traits, aes(NGS, yield)) + geom_point(size=3) + geom_point(aes(colour=gen)) + geom_smooth(se=FALSE, method="lm") + facet_wrap(~year) + * labs(colour="Genotype") + # changes the title of the colour legend * labs(x="Number of grains per spikelet") + # same as xlab(..) * labs(y="Yield (kg/ha)") + # same as ylab(..) * labs(title="Durum Wheat at Ciudad Obregon, Mexico 1990-1995") + # same as ggtitle(..) * labs(subtitle="Source: Vargas et al. (1998) Interpreting Genotype x Environment Interaction in Wheat by Partial Least Squares Regression.") # same as ggtitle(subtitle=..) 
``` ]] .column[.content.vmiddle.center[ ![](lecture02_2020JC_files/figure-html/unnamed-chunk-81-1.png)<!-- --> ]] --- class: split-60 with-border .column.bg-brand-blue.white[.content[ # Theme - customise the look ```r *g ``` ]] .column[.content.vmiddle.center[ ![](lecture02_2020JC_files/figure-html/unnamed-chunk-82-1.png)<!-- --> ]] --- class: split-60 with-border count: false .column.bg-brand-blue.white[.content[ # Theme - customise the look ```r g + *theme(legend.position="bottom") ``` ]] .column[.content.vmiddle.center[ ![](lecture02_2020JC_files/figure-html/unnamed-chunk-83-1.png)<!-- --> ]] --- class: split-60 with-border count: false .column.bg-brand-blue.white[.content[ # Theme - customise the look ```r g + theme(legend.position="bottom", *plot.title=element_text(face="bold", size=15)) ``` ]] .column[.content.vmiddle.center[ ![](lecture02_2020JC_files/figure-html/unnamed-chunk-84-1.png)<!-- --> ]] --- class: split-60 with-border count: false .column.bg-brand-blue.white[.content[ # Theme - customise the look ```r g + theme(legend.position="bottom", plot.title=element_text(face="bold", size=15), *plot.subtitle=element_text(face="italic", size=8)) ``` ]] .column[.content.vmiddle.center[ ![](lecture02_2020JC_files/figure-html/unnamed-chunk-85-1.png)<!-- --> ]] --- class: split-60 with-border count: false .column.bg-brand-blue.white[.content[ # Theme - customise the look ```r g + theme(legend.position="bottom", plot.title=element_text(face="bold", size=15), plot.subtitle=element_text(face="italic", size=8), *panel.background=element_rect(fill="white")) ``` ]] .column[.content.vmiddle.center[ ![](lecture02_2020JC_files/figure-html/unnamed-chunk-86-1.png)<!-- --> ]] --- class: split-60 with-border count: false .column.bg-brand-blue.white[.content[ # Theme - customise the look ```r g + theme(legend.position="bottom", plot.title=element_text(face="bold", size=15), plot.subtitle=element_text(face="italic", size=8), panel.background=element_rect(fill="white"), 
*panel.border=element_rect(colour="grey20", fill=NA)) ``` ]] .column[.content.vmiddle.center[ ![](lecture02_2020JC_files/figure-html/unnamed-chunk-87-1.png)<!-- --> ]] --- class: split-60 with-border count: false .column.bg-brand-blue.white[.content[ # Theme - customise the look ```r g + theme(legend.position="bottom", plot.title=element_text(face="bold", size=15), plot.subtitle=element_text(face="italic", size=8), panel.background=element_rect(fill="white"), panel.border=element_rect(colour="grey20", fill=NA), *panel.grid=element_line(colour="grey92")) ``` ]] .column[.content.vmiddle.center[ ![](lecture02_2020JC_files/figure-html/unnamed-chunk-88-1.png)<!-- --> ]] --- class: split-60 with-border count: false .column.bg-brand-blue.white[.content[ # Theme - customise the look ```r g + theme(legend.position="bottom", plot.title=element_text(face="bold", size=15), plot.subtitle=element_text(face="italic", size=8), panel.background=element_rect(fill="white"), panel.border=element_rect(colour="grey20", fill=NA), panel.grid=element_line(colour="grey92"), *panel.grid.minor=element_line(size=rel(0.5))) ``` ]] .column[.content.vmiddle.center[ ![](lecture02_2020JC_files/figure-html/unnamed-chunk-89-1.png)<!-- --> ]] --- class: split-60 with-border count: false .column.bg-brand-blue.white[.content[ # Theme - customise the look ```r g + theme(legend.position="bottom", plot.title=element_text(face="bold", size=15), plot.subtitle=element_text(face="italic", size=8), panel.background=element_rect(fill="white"), panel.border=element_rect(colour="grey20", fill=NA), panel.grid=element_line(colour="grey92"), panel.grid.minor=element_line(size=rel(0.5)), *strip.background=element_rect(fill="grey85", colour="grey20")) ``` ]] .column[.content.vmiddle.center[ ![](lecture02_2020JC_files/figure-html/unnamed-chunk-90-1.png)<!-- --> ]] --- class: split-60 with-border count: false .column.bg-brand-blue.white[.content[ # Theme - customise the look ```r g + theme(legend.position="bottom", 
plot.title=element_text(face="bold", size=15), plot.subtitle=element_text(face="italic", size=8), panel.background=element_rect(fill="white"), panel.border=element_rect(colour="grey20", fill=NA), panel.grid=element_line(colour="grey92"), panel.grid.minor=element_line(size=rel(0.5)), strip.background=element_rect(fill="grey85", colour="grey20"), *legend.key=element_rect(fill="white")) ``` ]] .column[.content.vmiddle.center[ ![](lecture02_2020JC_files/figure-html/unnamed-chunk-91-1.png)<!-- --> ]] --- class: split-60 with-border .column.bg-brand-blue.white[.content[ # Theme - customise the look ```r g + theme(legend.position="bottom", plot.title=element_text(face="bold", size=15), plot.subtitle=element_text(face="italic", size=8), panel.background=element_rect(fill="white"), panel.border=element_rect(colour="grey20", fill=NA), panel.grid=element_line(colour="grey92"), panel.grid.minor=element_line(size=rel(0.5)), strip.background=element_rect(fill="grey85", colour="grey20"), legend.key=element_rect(fill="white")) ``` or use a pre-defined theme: ```r g + *theme_bw() + theme(legend.position="bottom", plot.title=element_text(face="bold", size=14), plot.subtitle=element_text(face="italic", size=8)) ``` ]] .column[.content.vmiddle.center[ ![](lecture02_2020JC_files/figure-html/unnamed-chunk-93-1.png)<!-- --> ]] --- class: split-50 .column.bg-white[.content.center[ # More Pre-Defined Themes ```r g + theme_gray() ``` ![](lecture02_2020JC_files/figure-html/unnamed-chunk-94-1.png)<!-- --> ```r g + theme_classic() ``` ![](lecture02_2020JC_files/figure-html/unnamed-chunk-95-1.png)<!-- --> ]] .column.bg-white[.content.center[ # ```r g + theme_minimal() ``` ![](lecture02_2020JC_files/figure-html/unnamed-chunk-96-1.png)<!-- --> ```r g + theme_dark() ``` ![](lecture02_2020JC_files/figure-html/unnamed-chunk-97-1.png)<!-- --> ]] --- layout: true class: split-60 .column.bg-brand-red.white[.content[ # Summary * Using functions such as `filter` and `mutate` from `dplyr` to wrangle data. 
* Using the function `gather` from `tidyr` to change the data from wide to long form. * Using `ggplot` from `ggplot2` to make many sorts of plots. ]] .column.bg-brand-charcoal.white[.content[ # Next lesson * Revisiting simple linear regression. * Maximum likelihood estimation. ]] --- class: show-10 --- count: false