Docker and renv Strategy for Reproducible R Development: A Practical Example

Published

March 10, 2025


  1. Introduction

Reproducing an R Markdown analysis on another machine often fails because of package version mismatches and differences in the computing environment. Combining renv (which pins package versions) with Docker (which pins the R version, the operating system, and system libraries) lets collaborators run the same analysis in the same environment on different machines.

In this example, two developers collaborate on an R Markdown analysis using:

• renv for package dependency management
• Docker for containerizing the computing environment
• GitHub for version control and collaboration
• DockerHub for sharing a pre-configured R environment

  2. Step-by-Step Example: Collaborative Reproducible Development with Docker and renv

2.1 Developer 1: Setting Up the Project Locally

Step 1: Create a GitHub Repository and Clone It Locally

Developer 1 creates a new GitHub repository (penguins-analysis) and clones it:


git clone https://github.com/username/penguins-analysis.git
cd penguins-analysis

Step 2: Initialize renv for Dependency Management

Inside the project directory, Developer 1 initializes renv:


install.packages("renv")  # Install renv if not already installed
renv::init()              # Initialize renv for the project

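This creates an renv.lock file that records exact package versions, along with a project .Rprofile and an renv/ directory containing activate.R, the script that activates the project library when R starts in this directory.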
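
As a quick sanity check after initialization, the files that renv::init() is expected to create can be verified from the shell. The sketch below assumes renv's default project layout (renv.lock, .Rprofile, renv/activate.R); the function name check_renv_files is hypothetical.

```shell
# Hypothetical helper: confirm that renv's project files exist in a directory.
# Assumes the default layout produced by renv::init().
check_renv_files() {
  project_dir="${1:-.}"
  missing=0
  for f in renv.lock .Rprofile renv/activate.R; do
    if [ ! -f "$project_dir/$f" ]; then
      echo "missing: $f"
      missing=1
    fi
  done
  # Exit status 0 only when every expected file was found
  return "$missing"
}
```

Running `check_renv_files .` from the project root prints one line per missing file and returns a non-zero status if any are absent.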

Step 3: Install Required R Packages

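Since the analysis will use the palmerpenguins dataset, Developer 1 installs the required packages and snapshots the library. By default, renv::snapshot() records only packages that the project's code actually references, so the snapshot should be run again once peng1.Rmd has been written: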


install.packages("ggplot2")
install.packages("palmerpenguins")
renv::snapshot()  # Save exact package versions to renv.lock
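
To confirm that a snapshot actually recorded a package, the lockfile can be inspected directly. The sketch below assumes renv.lock's usual JSON layout, in which each package entry contains a "Package": "<name>" field; lock_has_package is a hypothetical helper, and a JSON-aware tool would be more robust than grep.

```shell
# Hypothetical helper: succeed if renv.lock contains an entry for a package.
# Assumes each entry includes a field of the form  "Package": "<name>"
lock_has_package() {
  pkg="$1"
  lockfile="${2:-renv.lock}"
  grep -Eq "\"Package\":[[:space:]]*\"$pkg\"" "$lockfile"
}
```

For example, `lock_has_package ggplot2 || echo "ggplot2 is not recorded in renv.lock"` flags a lockfile that was snapshotted before any code referenced ggplot2.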

2.2 Developer 1: Writing the Initial R Markdown File (peng1.Rmd)

Listing: peng1.Rmd Created by Developer 1

Developer 1 creates peng1.Rmd with an initial plot of flipper length vs. bill length:

---
title: "Palmer Penguins Analysis"
author: "Developer 1"
date: "2025-03-10"
output: html_document
---
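
A minimal sketch of how the complete file could be created from the shell, assuming the plot is a ggplot2 scatter plot of the flipper_length_mm and bill_length_mm columns of the penguins data frame. The chunk name, axis orientation, colour mapping, and labels are illustrative assumptions, not Developer 1's exact code.

```shell
# Three backticks, built indirectly so the R Markdown chunk fence can be
# written from inside this listing.
fence=$(printf '\140\140\140')

# Write a sketch of peng1.Rmd; the R chunk is an assumed implementation
# of the "flipper length vs. bill length" plot.
cat > peng1.Rmd <<EOF
---
title: "Palmer Penguins Analysis"
author: "Developer 1"
date: "2025-03-10"
output: html_document
---

${fence}{r flipper-bill-plot}
library(ggplot2)
library(palmerpenguins)

# Flipper length (y) against bill length (x), coloured by species
ggplot(penguins, aes(x = bill_length_mm, y = flipper_length_mm, color = species)) +
  geom_point(na.rm = TRUE) +
  labs(x = "Bill length (mm)", y = "Flipper length (mm)")
${fence}
EOF
```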


2.3 Developer 1: Creating a Docker Image

Step 4: Write a Minimal Dockerfile (Without peng1.Rmd)

Developer 1 creates a Dockerfile that does not include peng1.Rmd, ensuring that Developer 2’s local files are used when running the container.

# Use R 4.1.0 as base image
FROM rocker/r-ver:4.1.0

# Set the working directory inside the container
WORKDIR /workspace

# Install renv
RUN R -e "install.packages('renv', repos='https://cloud.r-project.org')"

# Copy only the lockfile and renv's activation script, keeping the renv/ directory layout
COPY renv.lock renv.lock
COPY renv/activate.R renv/activate.R

# Restore the package versions recorded in renv.lock
RUN R -e "renv::restore()"

CMD ["/bin/bash"]

Step 5: Build and Push the Docker Image

Developer 1 builds the Docker image:

docker build -t username/penguins-analysis:v1 .

To push the image to DockerHub:

docker login
docker push username/penguins-analysis:v1
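
The build and push steps can also be chained so that an image is pushed only if the build succeeds. This is a small sketch; build_and_push is a hypothetical helper name, and it assumes docker login has already been run.

```shell
# Hypothetical helper: build the image from the current directory, then push it.
# The default tag is the one used in this example; `docker login` must have
# been run beforehand. The push is skipped if the build fails.
build_and_push() {
  image="${1:-username/penguins-analysis:v1}"
  docker build -t "$image" . && docker push "$image"
}
```

Calling `build_and_push` with no argument uses the v1 tag from this example; passing a different tag, such as one ending in :v2, publishes a new version without overwriting the old one.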

2.4 Developer 1: Push to GitHub and Communicate to Developer 2

Step 6: Commit and Push to GitHub

Developer 1 commits the project files and pushes them to GitHub (the Docker image itself does not contain peng1.Rmd):

git add .
git commit -m "Initial renv setup and Docker environment (without Rmd)"
git push origin main

Developer 1 then shares these instructions with Developer 2:

1. Clone the GitHub repository.
2. Pull the prebuilt Docker image from DockerHub.
3. Run the container interactively, mounting the local repository.
4. Edit the peng1.Rmd file and generate the report.
5. Push changes back to GitHub.

2.5 Developer 2: Running the Analysis

Step 7: Clone the Repository and Pull the Docker Image

Developer 2 clones the repository:

git clone https://github.com/username/penguins-analysis.git
cd penguins-analysis

Pull the Docker image:

docker pull username/penguins-analysis:v1

Step 8: Run Docker Interactively with Local Repository

Since peng1.Rmd is not included in the Docker image, Developer 2 mounts the local repo inside the container:

docker run --rm -it -v "$(pwd):/workspace" -w /workspace username/penguins-analysis:v1 /bin/bash

This allows Developer 2 to:

• Use the renv-restored environment from the container.
• Access and modify peng1.Rmd directly from their local machine.
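
A bind mount hides whatever the image stored at the mount point, so it can be worth confirming non-interactively that the packages still load when the local repository is mounted over /workspace. The sketch below wraps the same docker run options (without -it) in a hypothetical helper, run_in_project, that runs a single command and exits.

```shell
# Hypothetical helper: run one command in the project image with the current
# directory bind-mounted at /workspace. The first argument is the image tag;
# everything after it is the command to run inside the container.
run_in_project() {
  image="$1"
  shift
  docker run --rm -v "$(pwd):/workspace" -w /workspace "$image" "$@"
}
```

For example, `run_in_project username/penguins-analysis:v1 Rscript -e 'library(ggplot2); library(palmerpenguins)'` checks that both packages load with the mount in place. The same helper could render the report with `Rscript -e 'rmarkdown::render("peng1.Rmd")'`, assuming rmarkdown is recorded in renv.lock and pandoc is available in the image; neither is installed explicitly in this example.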

2.6 Developer 2: Extending the Analysis

Step 9: Modify peng1.Rmd to Include a Second Plot

Developer 2 adds a new plot for body mass vs. bill length:

Listing: peng1.Rmd as Modified by Developer 2


---
title: "Palmer Penguins Analysis"
author: "Developer 2"
date: "2025-03-10"
output: html_document
---
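
A minimal sketch of the added chunk, appended to the existing file from the shell. It assumes peng1.Rmd already loads ggplot2 and palmerpenguins in an earlier chunk and that the plot maps bill_length_mm to x and body_mass_g to y; the chunk name and labels are illustrative, not Developer 2's exact code.

```shell
# Three backticks, built indirectly so the R Markdown chunk fence can be
# written from inside this listing.
fence=$(printf '\140\140\140')

# Append an assumed "body mass vs. bill length" chunk to peng1.Rmd
cat >> peng1.Rmd <<EOF

${fence}{r mass-bill-plot}
# Body mass (y) against bill length (x), coloured by species
ggplot(penguins, aes(x = bill_length_mm, y = body_mass_g, color = species)) +
  geom_point(na.rm = TRUE) +
  labs(x = "Bill length (mm)", y = "Body mass (g)")
${fence}
EOF
```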

Step 10: Commit and Push Changes Back to GitHub

Developer 2 commits and pushes the changes:

git add peng1.Rmd
git commit -m "Added second plot: Body Mass vs. Bill Length"
git push origin main

  3. Conclusion: Achieving Full Reproducibility

By following these steps:

• renv ensures that package versions are identical across environments.
• Docker guarantees that the R version and OS-level dependencies are consistent.
• GitHub enables collaborative development.
• DockerHub provides a shared, pre-configured execution environment.
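
These guarantees hold only while the lockfile and the image agree with each other. As one illustrative consistency check, the sketch below compares the R version recorded in renv.lock with the tag of the rocker/r-ver base image. It assumes the first "Version" field in renv.lock belongs to the "R" section, as in lockfiles renv normally writes; check_r_version is a hypothetical helper.

```shell
# Hypothetical helper: compare the R version recorded in renv.lock with the
# tag of the rocker/r-ver base image named in the Dockerfile.
# Assumes the first "Version" field in renv.lock is the R version.
check_r_version() {
  lockfile="${1:-renv.lock}"
  dockerfile="${2:-Dockerfile}"
  lock_ver=$(sed -n 's/.*"Version":[[:space:]]*"\([^"]*\)".*/\1/p' "$lockfile" | head -n 1)
  image_ver=$(sed -n 's/^FROM[[:space:]]*rocker\/r-ver:\([^[:space:]]*\).*/\1/p' "$dockerfile" | head -n 1)
  if [ -n "$lock_ver" ] && [ "$lock_ver" = "$image_ver" ]; then
    echo "R $lock_ver matches in both files"
  else
    echo "R version mismatch: renv.lock=$lock_ver Dockerfile=$image_ver"
    return 1
  fi
}
```

Run from the project root as `check_r_version`, it returns a non-zero status when the two versions differ, which makes it usable as a pre-commit or CI check.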


Citation

BibTeX citation:
@online{(ryy) glenn thomas2025,
  author = {(Ryy) Glenn Thomas, Ronald},
  title = {Docker and Renv {Strategy} for {Reproducible} {R}
    {Development:} {A} {Practical} {Example}},
  date = {2025-03-10},
  url = {https://focusonr.org/posts/share_R_code_via_docker/docker_renv.html},
  langid = {en}
}
For attribution, please cite this work as:
(Ryy) Glenn Thomas, Ronald. 2025. “Docker and Renv Strategy for Reproducible R Development: A Practical Example.” March 10, 2025. https://focusonr.org/posts/share_R_code_via_docker/docker_renv.html.