Unit Testing for R Developers

2: Advancing your skills

Jonathan Sidi, Daniel Sabanes Bove

What are we going to talk about?

  • Understand the philosophy behind unit testing and how it relates to other tests
  • Gain more confidence in your testing framework by examining coverage

What is the structure of unit tests?

  • Setup: Set up the inputs for the test.
  • Compute: Compute the result which will be tested.
  • Expect: Define the expected result.
  • Compare: Compare the actual with the expected result.

Each test should target one specific characteristic or functionality of the package.

If a test does too many things at once, you have to search longer for the bug when it fails.
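The four steps above can be sketched with testthat as follows. This is a minimal illustration, assuming testthat is installed; the choice of mean() as the function under test is ours and is not taken from the slides.

```r
library(testthat)

test_that("mean() of a simple numeric vector is the arithmetic average", {
  # Setup: inputs for the test
  x <- c(1, 2, 3, 4)

  # Compute: the result which will be tested
  result <- mean(x)

  # Expect: the expected result, worked out by hand
  expected <- 2.5

  # Compare: actual versus expected
  expect_equal(result, expected)
})
```

The test checks a single behavior, so a failure points directly at the averaging logic.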

Who do you write tests for?

Unit testing is a way to communicate the package requirements to various stakeholders.

Test Flavors

There are different flavors of unit tests that we write:

  • Developer Tests
    • Tests that help the software developer speed up and iterate over different versions
  • Quality Assurance (QA) Tests
    • Proving that the package isn’t broken or that functions are returning the expected answers

Test Flavors (cont’d)

Developer Tests

Remove Fear of Change

  • Assist with package design
  • Guide for refactoring

Reduce Development Time

  • Explain code to others
  • Pinpoint errors
  • Minimize debugging/manual testing

QA Tests

  • Encode object requirements
  • Catch performance degradation

Planning tests for success

  • Write tests to evaluate the exported behavior that the package user would invoke.
    • Sets up the ability to refactor code.
    • Tests will not have to be rewritten as the package iterates and improves.
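As a rough sketch of testing exported behavior only: the functions below are hypothetical, with summarise_scores() standing in for an exported function and drop_missing() for an internal helper. The test calls only the exported function, so the helper can be renamed, rewritten, or removed without touching the test.

```r
library(testthat)

# Hypothetical internal helper: an implementation detail that may change.
drop_missing <- function(x) x[!is.na(x)]

# Hypothetical exported function: this is what the package user calls.
summarise_scores <- function(x) {
  x <- drop_missing(x)
  c(n = length(x), mean = mean(x))
}

# The test exercises only the exported behavior.
test_that("summarise_scores() ignores missing values", {
  result <- summarise_scores(c(10, NA, 20))
  expect_equal(result, c(n = 2, mean = 15))
})
```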

Planning tests for success (cont’d)

  • Write tests that you will not need to update or maintain because of a dependency.
    • Because you can’t control how other maintainers are developing packages and managing their lifecycle.
    • Note: Unit tests still allow breaking dependency changes to be detected early, e.g. CRAN runs reverse dependency checks before accepting new package versions.

Testing surface areas

  • When writing unit tests for functions it is important to take into account the surface area of the tests
  • In this example we have three unit tests on base::mean.default
  • Tests B and C have large surface areas: if they fail, you will still need to investigate the cause of the failure.
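To make the idea of surface area concrete, here is a hypothetical pair of tests on mean(). These are not the three tests from the original example; they are an assumed illustration contrasting a small and a large surface area.

```r
library(testthat)

# Small surface area: one behavior of mean(), one expectation.
test_that("mean() removes NA values when na.rm = TRUE", {
  expect_equal(mean(c(1, NA, 3), na.rm = TRUE), 2)
})

# Large surface area: NA handling, trimming, and rounding are checked together,
# so a failure does not say which of them broke.
test_that("mean() works with several options combined", {
  x <- c(1, 2, 3, NA, 100)
  result <- round(mean(x, trim = 0.25, na.rm = TRUE), 1)
  expect_equal(result, 2.5)
})
```

If the second test fails, the cause could be in any of the three steps, which is exactly the extra investigation a large surface area brings.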

Adding new features with a safety net

When you are developing new features in a package make sure to prepare the area with unit tests for the expected behavior of the feature.

This will serve two main purposes:

  1. Communicate the goals of the feature to others and to yourself.
  2. Free you to develop the idea, writing messy code first and refactoring it later, while the tests keep the basic requirements under control.
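A minimal sketch of this workflow, using a hypothetical clamp() feature: the tests are written first, a messy draft is made to pass them, and the refactored version is checked against the same tests. All function names here are assumptions made for illustration.

```r
library(testthat)

# Tests written up front: they state what the hypothetical feature must do.
check_clamp <- function(clamp) {
  test_that("clamp() keeps values inside [lower, upper]", {
    expect_equal(clamp(5, lower = 0, upper = 10), 5)
    expect_equal(clamp(-3, lower = 0, upper = 10), 0)
    expect_equal(clamp(42, lower = 0, upper = 10), 10)
  })
}

# First, messy draft of the feature.
clamp_draft <- function(x, lower, upper) {
  if (x < lower) {
    return(lower)
  } else {
    if (x > upper) return(upper) else return(x)
  }
}

# Refactored version; the same tests still have to pass.
clamp_refactored <- function(x, lower, upper) {
  pmin(pmax(x, lower), upper)
}

check_clamp(clamp_draft)
check_clamp(clamp_refactored)
```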

Some strategies for learning

  1. Pair up with experienced developer buddies.
  2. Help out with refactoring or features on GitHub repositories:
    • Developers usually list what they are working on and are happy to get an extra pair of hands to tackle issues.
    • The process of merging a pull request will teach you hands on.
  3. Clone repositories of packages you are familiar with:
    • Run the tests you find there.
    • You will learn a lot about testing, coding and strategies.

What can and should be unit tested?

  1. Script
    • Single files with functions can have unit tests
  2. Packages
    • This is the standard object to write tests for
  3. Shiny apps
    • Shiny app UI and reactivity can be tested
  4. Data derivation
    • Testing data preprocessing pipelines for expected characteristics of columns
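For the data derivation case, a test can check the expected characteristics of a derived column. The sketch below uses plain testthat and a hypothetical derive_bmi() step; the column names and the plausibility range are assumptions chosen for the example.

```r
library(testthat)

# Hypothetical derivation step: adds body mass index to a data frame.
derive_bmi <- function(data) {
  data$bmi <- data$weight_kg / (data$height_m^2)
  data
}

test_that("derive_bmi() adds a plausible, complete bmi column", {
  raw <- data.frame(weight_kg = c(60, 80), height_m = c(1.6, 1.8))
  derived <- derive_bmi(raw)

  # Column is present, numeric, complete, and within an assumed plausible range.
  expect_named(derived, c("weight_kg", "height_m", "bmi"))
  expect_type(derived$bmi, "double")
  expect_false(anyNA(derived$bmi))
  expect_true(all(derived$bmi > 10 & derived$bmi < 60))
})
```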

Overview of unit testing frameworks

  1. script/packages: testthat, tinytest, box, RUnit
  2. shiny apps: shinytest2, reactor, crrry
  3. data: pointblank, assertr, validate, dataReporter
  4. plots: vdiffr

Testing plot functions

  • The vdiffr package saves a vector graphics file of the expected plot outcome
  • Works for both base plots and ggplots
  • Beware of false positive test failures though
    • Different operating systems can produce slightly different plot outcomes
    • Failures then require manual visual checks

Testing plot functions (cont’d)

  • Alternative for ggplot:
    • use ggplot2::layer_data() function to extract layer information
    • use that for comparisons in tests instead of vector graphics
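A sketch of the layer_data() approach, assuming ggplot2 and testthat are installed. The plotting function plot_counts() is hypothetical; the test compares the data computed for the first layer instead of a rendered image.

```r
library(testthat)
library(ggplot2)

# Hypothetical plotting function under test.
plot_counts <- function(data) {
  ggplot(data, aes(x = group, y = count)) +
    geom_col()
}

test_that("plot_counts() draws one bar per group with the right heights", {
  data <- data.frame(group = c("a", "b", "c"), count = c(3, 1, 2))
  p <- plot_counts(data)

  # Extract the computed data of the first layer.
  bars <- layer_data(p, i = 1)

  expect_equal(nrow(bars), 3)
  expect_equal(bars$y, c(3, 1, 2))
})
```

Because the comparison is on numbers rather than on a graphics file, it is less likely to be affected by rendering differences between platforms.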

Communicating Tests: Coverage

  1. covr
    • R package that evaluates the % of lines of code that are tested
    • Use covr::package_coverage() to analyze current package coverage
  2. covrpage
    • R package that summarizes covr statistics into simple reports that can be shared
    • Use covrpage::covrpage() to create the report page
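As a rough sketch of how covr might be used from a script: the helper coverage_summary() below is hypothetical, and it assumes that covr is installed and that path points at the root of a package that already has tests.

```r
# Hypothetical helper; `path` is assumed to be the root of a package with tests.
coverage_summary <- function(path = ".") {
  cov <- covr::package_coverage(path = path)
  list(
    percent = covr::percent_coverage(cov), # overall coverage percentage
    untested = covr::zero_coverage(cov)    # locations never hit by a test
  )
}

# Usage, from the package root:
# coverage_summary(".")
```

The untested element is useful as a to-do list: it shows where new tests would raise coverage.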

Mapping the Logic

Summary

  • Philosophy of testing:
    • Aggregate of all tests should cover the whole functionality
    • But each test on its own should be specific and only have small surface
    • Developer vs. Quality Assurance Tests
    • Not just R packages: also scripts, Shiny apps, data
  • How to communicate test coverage:
    • Use covr to calculate coverage
    • Use covrpage to create a summary report

Outlook: Automating Tests

When developing on a version control platform such as GitHub, GitLab, or Bitbucket, you can run your tests automatically through CI/CD on each commit or, when working with other developers, on each pull request.

Brought to you by the Software Engineering Working Group

rconsortium.github.io/asa-biop-swe-wg