Data processing with R tidyverse
R is a powerful language for data science in many research disciplines, but it has a steep learning curve. The tidyverse group of packages provides a dialect that greatly simplifies data importing, cleaning, processing and visualization, and supports reproducible workflows. It replaces many intricacies of R with clear, easy-to-learn syntax.
The course will run from May 14th to May 17th, 2019.
The four-day course provides a complete introduction to data science in R with the tidyverse. Participants should have basic experience with a programming environment such as Matlab or Octave, or with another programming language. Alternatively, they should complete a simple free online course, such as the one offered by DataCamp.
The course will not go deep into statistics. It focuses instead on getting data ready, exploratory analysis, visualization and handling models. Preparing data takes up to 90% of the time spent in analysis, and speeding this up is the mission of this course.
Day 1 will review the basics of R, loading data with the readr package, and Markdown.
Day 2 will introduce tidying and organising data via the tidyr and dplyr packages as well as ggplot2 for visualisation.
Day 3 will look at functional programming tools using the purrr package, which greatly simplifies repeating operations. Many statistical packages have complicated and idiosyncratic data structures. The broom package helps to convert them to consistent data structures.
On Day 4, participants are encouraged to analyse their own data, convert existing code to the tidyverse or carry out a project.
Students will receive 2 ECTS credits in category 1, which requires handing in a short report written in R Markdown at the end of the course.
Speakers: Aurélien GINOLHAC (LSRU) and Roland KRAUSE (ELIXIR-LU/LCSB)
- - -