Workshops

Personal R Administration

Description

  • Does the release of a new R version fill you with dread?
  • Are there passwords in your R code?
  • Do you look at the output of a failed package installation and think to yourself, “WTF?!”

If you said yes to any of those questions, then you need Personal R Administration. You’ll come away with tips, tricks, tweaks, and some hacks for building data science dev environments that you won’t be afraid to come back to in a year.

David Aja and Shannon Pileggi

E. David Aja is a Software Engineer at Posit. Before joining Posit, he worked as a data scientist in the public sector.

Shannon Pileggi (she/her) is a Lead Data Scientist at The Prostate Cancer Clinical Trials Consortium, a frequent blogger, and a member of the R-Ladies Global leadership team. She enjoys automating data wrangling and data outputs, and making both data insights and learning new material digestible.

teal Mastery: From Pre-built Modules to Custom Module Creation

Description

This session provides a comprehensive introduction to teal programming, starting with creating a simple teal application from scratch. You’ll learn the fundamentals of building a basic teal app and understand its core components. Next, we will explore the practical use of pre-built modules from teal.modules.general and teal.modules.clinical, demonstrating how these ready-to-use components can streamline the development of robust teal applications. Participants will gain hands-on experience in integrating these modules into their projects. The workshop will then focus on building upon this foundation by learning how to create custom teal modules that meet specific project needs. You’ll learn how to leverage the core features of the teal framework to develop tailored solutions and take your skills to the next level. This session will provide practical insights and coding examples, empowering you to extend and customize your teal applications beyond the capabilities of pre-built modules. By the end of this workshop, you will have a comprehensive understanding of both pre-built and custom module development in teal, making it an ideal choice for beginners and intermediate learners looking to expand their R skills with teal.

Dony Unardi

Dony Unardi is a Principal Data Scientist at Genentech, and the Engineering Team Lead in the development effort of an open-source R product called teal, a Shiny-based R package focused on interactive and reproducible data analysis and visualization in clinical trials.

Survival analysis with tidymodels

Description

Survival analysis is now supported across the tidymodels framework, a collection of R packages for modeling and machine learning using tidyverse principles. It covers the entire predictive modeling workflow from data splitting, resampling, feature engineering, model fitting, and performance evaluation to tuning. It provides a consistent interface with composable functions that allow beginners a safe start and advanced users access to more specialized techniques such as feature engineering on text data or tuning via racing methods. The addition of dedicated performance metrics has enabled us to support tuning of survival models and unlock the entire framework for survival analysis. This workshop focuses on the core components of tidymodels to get you up and running with predictive survival analysis.

This workshop is for you if you

  • are familiar with basic survival analysis, such as censoring of time-to-event data, Kaplan-Meier curves, and proportional hazards models
  • are familiar with the basic predictive modeling workflow, such as splitting data into training and test sets, resampling, and tuning via grid search
  • want to learn how to leverage the tidymodels framework for survival analysis

Hannah Frick

Hannah Frick is a software engineer on the tidymodels team at Posit. She holds a PhD in statistics and has worked in interdisciplinary research and data science consultancy. She is a co-founder of R-Ladies Global.

Title TBD

Description

TBD

Cara Thompson

Cara is a data visualisation consultant with an academic background, specialising in helping research teams and data-driven organisations turn their data insights into clear and compelling visualisations.

Following her PhD in Psychology and a spell teaching research methods at Edinburgh Uni, she embarked on a career in psychometrics at the Royal College of Surgeons of Edinburgh. After ten years of helping surgeons and other medical professionals understand complex patterns in exam data, she set out as an independent data visualisation consultant and launched her business, “Building Stories with Data”, to continue crafting innovative dataviz solutions for a range of different organisations.

She lives in Edinburgh, Scotland, with her husband and two young daughters. Cara regularly shares coding tips for dataviz online, and genuinely enjoys helping others level up their dataviz skills through talks, bespoke toolkits, organisational training, and one-to-one coaching.

Enhancing Scientific Equity: A Spanish Introduction to Using R for Biostatistical and Data Science Programming

Promover la Equidad Científica: Una Introducción al uso de R para la programación en Bioestadística y Ciencia de Datos, en Español.

Description

Despite the abundant resources available for learning R, most of these materials are primarily accessible to English speakers. This language barrier significantly restricts access for individuals who do not speak English proficiently. As a result, Spanish-speaking communities often face considerable challenges in accessing software training opportunities. This disparity leads to inequities in the distribution and utilization of scientific technologies, which is particularly concerning given the increasing importance of digital skills. To mitigate these challenges and promote inclusivity, we propose conducting a programming workshop in Spanish during the conference. This initiative aims to bridge the gap by providing Spanish-speaking participants with equal opportunities to engage with and benefit from technological advancements. By doing so, we not only enhance individual capabilities but also contribute to a more equitable distribution of educational resources in the scientific community. This workshop will equip attendees with basic skills in R. Our primary objective is to familiarize participants with RStudio and its key features for generating reproducible reports. We will guide attendees through the process of creating and managing projects in RStudio and introduce them to creating reproducible manuscripts using Quarto documents. The workshop will utilize a publicly available dataset from the CDC, which contains information on drug use and suicidal ideation among adolescents, as a practical example of using R for academic research in public health. We will explain how to use functions such as filter, mutate, summarize, and select from the tidyverse suite of packages. We will conclude by demonstrating how to use ggplot2 to create visualizations in R. By the end of the workshop, participants will have created a reproducible document in HTML format, detailing the data cleaning steps and analysis of a significant, contemporary social issue.
This presentation aims to close the gap in programming literacy among Spanish-speaking researchers and promote methods for reproducible scientific inquiry.

Descripción

A pesar de los abundantes recursos disponibles para aprender R, la mayoría de estos materiales son accesibles principalmente para angloparlantes. Esta barrera del idioma restringe significativamente el acceso de personas que no hablan inglés con fluidez. Como resultado, las comunidades de habla hispana a menudo enfrentan desafíos considerables para acceder a oportunidades de capacitación en software. Esta disparidad conduce a desigualdades en la distribución y utilización de las tecnologías científicas, lo que es particularmente preocupante dada la creciente importancia de las habilidades digitales en la actualidad. Para mitigar estos desafíos y promover la inclusión, proponemos realizar un taller de programación en español durante la conferencia. Esta iniciativa tiene como objetivo cerrar la brecha brindando a los participantes de habla hispana igualdad de oportunidades para interactuar y beneficiarse de los avances tecnológicos. Al hacerlo, no sólo mejoramos las capacidades individuales sino que también contribuimos a una distribución más equitativa de los recursos educativos en la comunidad científica. Este taller equipará a los asistentes con habilidades básicas en R. Nuestro objetivo principal es familiarizar a los participantes con RStudio y sus características clave para generar informes reproducibles. Guiaremos a los asistentes a través del proceso de creación y gestión de proyectos en RStudio y les presentaremos la creación de manuscritos reproducibles utilizando documentos Quarto. El taller utilizará un conjunto de datos disponible públicamente de los CDC, que contiene información sobre el uso de drogas y la ideación suicida entre adolescentes, como un ejemplo práctico del uso de R para la investigación académica en salud pública. Explicaremos cómo utilizar funciones como filter, mutate, summarize y select del conjunto de paquetes tidyverse. Concluiremos demostrando cómo usar ggplot2 para crear visualizaciones en R.
Al final del taller, los participantes habrán creado un documento reproducible en formato HTML, detallando los pasos de limpieza de datos y el análisis de un problema social contemporáneo importante. Esta presentación tiene como objetivo cerrar la brecha en la alfabetización en programación entre los investigadores de habla hispana y promover métodos para la investigación científica reproducible.

Catalina Canizares-Escobar and Francisco Cardozo

Catalina Cañizares is a passionate data scientist and a Ph.D. candidate in Social Welfare, dedicated to using data to gain insights into emotional disorders. She has been delving deep into data analysis, especially with R. Her focus? Making data understandable and useful. She specializes in cleaning and merging data, and loves exploring data with tools like tidyverse, table1, gtsummary, and skimr, among others. She is also interested in using Machine Learning models, with the tidymodels package, to better understand emotional disorders. Plus, she is all about keeping things clear and reproducible by using tools such as Quarto.

Francisco Cardozo is a PhD candidate in prevention science and community health. He specializes in applying quantitative techniques to evaluate the efficacy of prevention programs, focusing on understanding the dynamics of how and for whom these programs are most effective. He is dedicated to developing precise measurements and analyses that inform decisions about program operations. Francisco is passionate about translating scientific findings into practical, real-world applications.

Catalina Cañizares es una científica de datos apasionada y candidata Ph.D. en trabajo social. Ella se dedica al uso de datos para obtener información sobre los trastornos emocionales. Ha estado profundizando en el análisis de datos, especialmente con R, y su enfoque es hacer que los datos sean comprensibles y útiles. Se especializa en limpiar y fusionar datos, y le encanta explorar datos con herramientas como tidyverse, table1, gtsummary y skimr, entre otras. También le interesa utilizar modelos de Machine Learning, con el paquete tidymodels, para comprender mejor los trastornos emocionales.

Francisco Cardozo es candidato a PhD en Ciencias de la Prevención y Salud Comunitaria. Se especializa en aplicar técnicas cuantitativas para evaluar la eficacia de programas de prevención, centrándose en entender la dinámica de cómo y para quién estos programas son más efectivos. Está dedicado a desarrollar mediciones y análisis que informen decisiones sobre cómo operar los programas. Francisco siente pasión por traducir hallazgos científicos en aplicaciones prácticas.