Tutoriales

Introduction to Machine Learning with tidymodels

Instructores: Dra. Hannah Frick & Dr. Max Kuhn

Fecha y hora: 18/10 - 9:00 - 12:30 & 13:30 - 17:00

Idioma: English

Descripción: This workshop provides an introduction to machine learning with R using the tidymodels framework, a collection of packages for modeling and machine learning using tidyverse principles. We will build, evaluate, and compare predictive models. Along the way, we’ll learn about key concepts in machine learning including overfitting and resampling. Learners will gain knowledge about good predictive modeling practices, as well as hands-on experience using tidymodels packages like parsnip, rsample, yardstick, and workflows.

Is this workshop for me?

This course assumes intermediate R knowledge. This workshop is for you if:

You can use the magrittr pipe %>% and/or native pipe |>
You are familiar with functions from dplyr, tidyr, and ggplot2
You can read data into R, transform and reshape data, and make a wide variety of graphs

We expect participants to have some exposure to basic statistical concepts, but NOT intermediate or expert familiarity with modeling or machine learning.

Using and contributing to the data.table package for efficient big data analysis

Instructor: Toby Dylan Hocking

Fecha y hora: 18/10 - 9:00-12:30

Idioma: English

Descripción: data.table is one of the most efficient open-source in-memory data manipulation packages available today. First released to CRAN by Matt Dowle in 2006, it continues to grow in popularity, and now over 1400 other CRAN packages depend on data.table. This three hour tutorial will start with data reading from CSV, discuss basic and advanced data manipulation topics, and finally will end with a discussion about how you can contribute to data.table. In each part of the tutorial, you will be asked to solve a few exercises, to practice each new concept.

Materiales: slides

Outline:

Efficiently reading CSV files into R: fread
Introducing general form of a data.table query - DT[i, j, by], or for those familiar with SQL: DT[where, select|update, group by]
Subsets and joins - exploring similarities between subsets and joins is key to understanding how data.table works.
Fast and flexible grouped aggregations and updates
quick look at other new and useful features in the recent releases, including CSV file writer (fwrite) and rolling/non-equi joins.
Using data.table in your own package that you want to put on CRAN.
Contributing to data.table via issues, pull requests, translations, travel awards, community governance.

Background knowledge: Familiarity with base R and/or SQL is useful but is not absolutely essential.

Requirements: You will need your laptop with the latest version of R and latest stable (CRAN) version of data.table already installed.

Instructor biography: Toby Dylan Hocking is a contributor to data.table, professor of computer science at Northern Arizona University, and principal investigator for a National Science Foundation funded project about expanding the open-source ecosystem around data.table.

Package Authors: Matt Dowle, Arun Srinivasan, and many other contributors, including myself.

Link: http://r-datatable.com

Creating data plots for effective decision-making using statistical inference with R

Instructora: Dra. Dianne Cook

Fecha y hora: 18/10 - 13:30-17:00

Idioma: English

Materiales: https://dicook.github.io/LatinR/

Descripción:

Review of making effective plots using ggplot2’s grammar of graphics:
- Organising your data to enable mapping variables to graphical elements,
- Common plot descriptions as scripts,
- Do’s and don’ts following cognitive perception principles.
Making decisions and inferential statements based on data plots
- What is your plot testing? Determining the hypothesis based on the type of plot.
- Creating null samples to build lineups for comparison and testing.
- Conducting a lineup test using your friends to determine whether what you see is real or spurious, and to determine the best design for your plot.

Background: Participants should have a good working knowledge of R, and tidyverse, and some experience with ggplot2. Familiarity with the material in R4DS (https://r4ds.hadley.nz) is helpful.

Creación de reportes reproducibles con Quarto

Instructora: Riva Quiroga

Fecha y hora: 18/10 - 13:30 - 17:00

Idioma: Español

Descripción: En este tutorial se hará una introducción a Quarto, un sistema de publicación científica y técnica que permite crear contenido dinámico usando R, Python, Julia y Observable. Durante la sesión se abordarán los aspectos generales de su uso para crear reportes con R, con especial énfasis en el trabajo en formato HTML. Para ello, se mostrará paso a paso cómo crear un reporte reproducible, cómo parametrizar su contenido, cómo editar su apariencia y cómo publicarlo utilizando GitHub Pages y Netlify.

Para poder seguir sin problema las actividades del tutorial, es necesario tener algún grado de experiencia con el operador “pipe” (en cualquiera de sus dos versiones: %>% o |>), con las funciones principales del paquete dplyr (como filter, summarize y group_by) y con el paquete ggplot2 (por ejemplo, tener una idea general de qué hacen las funciones geom_* o saber cómo modificar la escala del eje “y” de un gráfico). Para quienes tengan interés en la publicación de un reporte utilizando el servicio GitHub Pages, es necesario tener al menos un manejo inicial de git (saber cómo hacer commits y enviar cambios a un repositorio personal).

Introducción a Shiny - Buenas prácticas en un entorno de producción

Instructores: Agustin Perez Santangelo, Oriol Senan y Federico Rivadeneira

Fecha y hora: 18/10 - 9:00 - 12:30

Idioma: Español

Descripción: El tutorial consistirá en la introducción de conceptos básicos haciendo foco en las buenas prácticas de desarrollo. El mismo constará de tres partes:

Estructura inicial y básica de un proyecto junto a las herramientas que podemos utilizar para asistirnos en esta tarea.
Modularización y buenas prácticas de desarrollo haciendo hincapié en conceptos de desarrollo con ejemplos.
Optimización y performance de una aplicación introduciendo paquetes y buenas prácticas.
PRE-REQUISITOS:
- Conocimientos básicos de R / Shiny
- Para la parte práctica
- Laptop con RStudio, usando R > 4.2
- Clonar https://github.com/Appsilon/latin-r-2023 y seguir las instrucciones del README