R Submission Pilot 7
Pilot 7 - Benchmark Data and Test Cases
Goal: Develop and curate modern, realistic, and reusable synthetic clinical trial datasets that can serve as a high-quality public benchmark for evaluating R tools and AI skills for regulatory purposes.
The pharmaceutical ecosystem is seeing rapid growth of open-source R tools and AI-enabled applications across clinical development, analysis, and regulatory submissions. However, objective benchmarking and evaluation of these tools are currently constrained by the lack of high-quality, publicly available clinical trial datasets. Existing public datasets, such as CDISC Pilot 1, are outdated, limited in scale and complexity, and insufficient for evaluating modern workflows. This project addresses that gap by providing modern, realistic, and reusable synthetic clinical trial data to support tool demonstration, method evaluation, and community education.
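To make the idea concrete, the sketch below shows what a few rows of a synthetic, ADSL-style subject-level dataset might look like in base R. This is a minimal, hypothetical illustration only: the variable names follow common CDISC ADaM conventions, but the values are random and are not the Pilot 7 benchmark data.

```r
# Hypothetical sketch only: a tiny synthetic ADSL-style dataset.
# Values are randomly generated and purely illustrative; they are
# NOT the actual Pilot 7 benchmark data.
set.seed(2024)
n <- 6
adsl <- data.frame(
  USUBJID = sprintf("SYN-%03d", seq_len(n)),                   # unique subject ID
  TRT01P  = sample(c("Placebo", "Active"), n, replace = TRUE), # planned treatment
  AGE     = round(rnorm(n, mean = 62, sd = 8)),                # age at baseline
  SEX     = sample(c("F", "M"), n, replace = TRUE)
)
print(adsl)
```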
Current Activities:
The project now includes two complementary parts:
- Part A: Benchmark datasets and test cases for evaluating open-source tools
- Part B: Benchmark test cases for evaluating pharma open-source skills, starting with group sequential design (GSD)
Part B is being developed in collaboration with BBSW, which is supporting the work by sponsoring shared tokens for the automated evaluation pipeline. The first skill in this effort is available in the pharma_skills repository; a minimal illustration of a group sequential design in R is sketched below.
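As a pointer to what such a test case exercises, here is a minimal sketch of a canonical three-look group sequential design using the gsDesign package, which is commonly used for GSD in R. The package choice and all parameter values are illustrative assumptions, not a specification of the Part B test cases.

```r
library(gsDesign)

# Illustrative three-look group sequential design (not the Part B spec):
# one-sided alpha = 0.025, 90% power, O'Brien-Fleming-like efficacy spending.
design <- gsDesign(
  k = 3,          # two interim analyses plus the final analysis
  test.type = 1,  # one-sided test, efficacy bound only
  alpha = 0.025,  # one-sided type I error
  beta = 0.1,     # type II error (90% power)
  sfu = sfLDOF    # Lan-DeMets O'Brien-Fleming-like spending function
)

design$upper$bound      # efficacy boundaries on the Z scale
gsBoundSummary(design)  # tabulated boundaries and crossing probabilities
```

An automated evaluation along these lines could, for instance, check whether a tool or AI skill reproduces the boundary values and crossing probabilities for a given design specification.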
Links:
R Submission Pilot 7 Development Repository
Pilot 7 Part B - Open-Source Skills and Test Cases
Key team members:
Developer team:
- Yilong Zhang (Meta)
- Eric Zhang (Eikon)
- Ning Leng (AbbVie)
- Erick Scott (Keiji AI)
- Eric Nantz (Eli Lilly)
- Jimeng Sun (Keiji AI)