Docker and renv Strategy for Reproducible R Development: A Practical Example
1 Introduction
Reproducibility in R Markdown workflows is hard to maintain because package versions and computing environments differ between machines. Combining renv (for package management) with Docker (for OS and system-level reproducibility) gives a workflow that runs the same way on each collaborator's machine.
In this example, two developers collaborate on an R Markdown analysis using:
• renv for package dependency management
• Docker for containerizing the computing environment
• GitHub for version control and collaboration
• DockerHub for sharing a pre-configured R environment
⸻
2 Step-by-Step Example: Collaborative Reproducible Development with Docker and renv
2.1 Developer 1: Setting Up the Project Locally
Step 1: Create a GitHub Repository and Clone It Locally
Developer 1 creates a new GitHub repository (penguins-analysis) and clones it:
git clone https://github.com/username/penguins-analysis.git
cd penguins-analysis
Step 2: Initialize renv for Dependency Management
Inside the project directory, Developer 1 initializes renv:
install.packages("renv") # Install renv if not already installed
renv::init() # Initialize renv for the project
This creates an renv.lock file to track exact package versions.
Step 3: Install Required R Packages
Since the analysis will use the palmerpenguins dataset, Developer 1 installs the required packages:
install.packages("ggplot2")
install.packages("palmerpenguins")
renv::snapshot() # Save exact package versions to renv.lock
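Before building an image from the lockfile, it can help to confirm that the snapshot recorded the packages you expect. By default renv records the packages that project files actually use, so a package installed before any script or Rmd references it may be left out. The sketch below lists package names and versions from renv.lock with awk. The helper name lock_versions is hypothetical, and the parsing assumes renv's usual pretty-printed JSON layout with one field per line.

```shell
#!/bin/sh
# lock_versions FILE
# Print "package version" pairs from an renv lockfile.
# Assumes the pretty-printed layout renv writes: one JSON field per line,
# with "Package" appearing before "Version" inside each package record.
lock_versions() {
  awk -F'"' '
    /"Package":/              { pkg = $4 }
    /"Version":/ && pkg != "" { print pkg, $4; pkg = "" }
  ' "$1"
}

# Example: list what the snapshot recorded, if a lockfile is present.
if [ -f renv.lock ]; then
  lock_versions renv.lock
fi
```

If ggplot2 or palmerpenguins is missing from the output, running renv::snapshot() again once peng1.Rmd exists should pick them up.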
⸻
2.2 Developer 1: Writing the Initial R Markdown File (peng1.Rmd)
Listing: peng1.Rmd Created by Developer 1
Developer 1 creates peng1.Rmd with an initial plot of flipper length vs. bill length:
---
title: "Palmer Penguins Analysis"
author: "Developer 1"
date: "2025-03-10"
output: html_document
---
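The listing above shows only the YAML front matter. As a rough sketch of what the complete file might contain, the shell function below writes the front matter followed by a single ggplot2 chunk. The chunk name, chunk options, aesthetics, and the helper name write_peng1 are all assumptions for illustration, not the original listing; the column names come from the palmerpenguins dataset.

```shell
#!/bin/sh
# write_peng1 [FILE]
# Create a hypothetical peng1.Rmd: the front matter from the listing plus
# one assumed plotting chunk (not the author's original code).
write_peng1() {
  out="${1:-peng1.Rmd}"
  fence='```'   # held in a variable so the Rmd chunk fences can be written from a heredoc
  cat > "$out" <<EOF
---
title: "Palmer Penguins Analysis"
author: "Developer 1"
date: "2025-03-10"
output: html_document
---

${fence}{r flipper-vs-bill, message=FALSE, warning=FALSE}
library(ggplot2)
library(palmerpenguins)

# Flipper length against bill length, colored by species
ggplot(penguins, aes(x = bill_length_mm, y = flipper_length_mm, color = species)) +
  geom_point() +
  labs(x = "Bill length (mm)", y = "Flipper length (mm)")
${fence}
EOF
}
```

Calling write_peng1 with no argument writes peng1.Rmd in the current directory; passing a path writes there instead.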
⸻
2.3 Developer 1: Creating a Docker Image
Step 4: Write a Minimal Dockerfile (Without peng1.Rmd)
Developer 1 creates a Dockerfile that does not include peng1.Rmd, ensuring that Developer 2’s local files are used when running the container.
# Use R 4.1.0 as base image
FROM rocker/r-ver:4.1.0
# Set the working directory inside the container
WORKDIR /workspace
# Install renv (dependencies are restored further down, once renv.lock is copied in)
RUN R -e "install.packages('renv', repos='https://cloud.r-project.org')"
# Copy only the renv.lock and renv infrastructure
COPY renv.lock renv/activate.R /workspace/
# Restore the R package environment
RUN R -e "renv::restore()"
CMD ["/bin/bash"]
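docker build stops at the COPY step if either source file is missing from the build context, so a quick pre-flight check can save a failed build. The following is a minimal sketch; check_build_context is a hypothetical helper, and its file list simply mirrors the COPY line in the Dockerfile above.

```shell
#!/bin/sh
# check_build_context [DIR]
# Report any file the Dockerfile's COPY instruction expects but DIR lacks.
# Returns 0 when everything is present, 1 otherwise.
check_build_context() {
  dir="${1:-.}"
  missing=0
  for f in Dockerfile renv.lock renv/activate.R; do
    if [ ! -f "$dir/$f" ]; then
      echo "missing: $f" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

One way to use it is to gate the build on the check, for example by running check_build_context && docker build -t username/penguins-analysis:v1 . from the project root.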
Step 5: Build and Push the Docker Image
Developer 1 builds the Docker image:
docker build -t username/penguins-analysis:v1 .
To push the image to DockerHub:
docker login
docker push username/penguins-analysis:v1
⸻
2.4 Developer 1: Push to GitHub and Communicate to Developer 2
Step 6: Commit and Push to GitHub
Developer 1 commits and pushes the project to GitHub. The Docker image itself does not contain peng1.Rmd:
git add .
git commit -m "Initial renv setup and Docker environment (without Rmd)"
git push origin main
Developer 1 then shares these instructions with Developer 2:
1. Clone the GitHub repository.
2. Pull the prebuilt Docker image from DockerHub.
3. Run the container interactively, mounting the local repository.
4. Edit peng1.Rmd and generate the report.
5. Push changes back to GitHub.
⸻
2.5 Developer 2: Running the Analysis
Step 7: Clone the Repository and Pull the Docker Image
Developer 2 clones the repository:
git clone https://github.com/username/penguins-analysis.git
cd penguins-analysis
Pull the Docker image:
docker pull username/penguins-analysis:v1
Step 8: Run Docker Interactively with Local Repository
Since peng1.Rmd is not included in the Docker image, Developer 2 mounts the local repo inside the container:
docker run --rm -it -v "$(pwd):/workspace" -w /workspace username/penguins-analysis:v1 /bin/bash
This allows Developer 2 to:
• Use the renv-restored environment from the container.
• Access and modify peng1.Rmd directly from their local machine.
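The same bind mount also works for a one-off, non-interactive render of the report. The sketch below wraps that in a small function. It assumes the rmarkdown package is recorded in renv.lock and that pandoc is available in the image; neither is set up explicitly by the Dockerfile above, and the rocker/r-ver base image may not ship pandoc. The render_report name and the DOCKER and IMAGE override variables are hypothetical conveniences.

```shell
#!/bin/sh
# Image name taken from the steps above; set IMAGE to use another tag.
IMAGE="${IMAGE:-username/penguins-analysis:v1}"

# render_report [FILE]
# Render an R Markdown file inside the container, with the current
# directory bind-mounted at /workspace. Set DOCKER=echo for a dry run
# that prints the arguments instead of starting a container.
render_report() {
  rmd="${1:-peng1.Rmd}"
  "${DOCKER:-docker}" run --rm \
    -v "$(pwd):/workspace" -w /workspace \
    "$IMAGE" \
    Rscript -e "rmarkdown::render('$rmd')"
}
```

Because the output is written under /workspace, the rendered HTML file appears in the local repository directory once the container exits.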
⸻
2.6 Developer 2: Extending the Analysis
Step 9: Modify peng1.Rmd to Include a Second Plot
Developer 2 adds a new plot for body mass vs. bill length:
Listing: peng1.Rmd as Modified by Developer 2
---
title: "Palmer Penguins Analysis"
author: "Developer 2"
date: "2025-03-10"
output: html_document
---
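Only the front matter of the modified file appears in the listing above. As a hedged sketch of the change described in Step 9, the function below appends a second chunk plotting body mass against bill length to an existing file. The chunk name, options, aesthetics, and the helper name append_mass_plot are assumptions for illustration, not Developer 2's actual edit; the column names come from the palmerpenguins dataset.

```shell
#!/bin/sh
# append_mass_plot [FILE]
# Append an assumed second chunk (body mass vs. bill length) to an
# existing Rmd file, leaving its current contents untouched.
append_mass_plot() {
  out="${1:-peng1.Rmd}"
  fence='```'   # held in a variable so the Rmd chunk fences can be written from a heredoc
  cat >> "$out" <<EOF

${fence}{r mass-vs-bill, message=FALSE, warning=FALSE}
library(ggplot2)
library(palmerpenguins)

# Body mass against bill length, colored by species
ggplot(penguins, aes(x = bill_length_mm, y = body_mass_g, color = species)) +
  geom_point() +
  labs(x = "Bill length (mm)", y = "Body mass (g)")
${fence}
EOF
}
```

The author field in the front matter would still need to be edited by hand, since the function only appends to the end of the file.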
Step 10: Commit and Push Changes Back to GitHub
Developer 2 commits and pushes the changes:
git add peng1.Rmd
git commit -m "Added second plot: Body Mass vs. Bill Length"
git push origin main
⸻
3 Conclusion: Achieving Full Reproducibility
By following these steps:
• renv ensures that package versions are identical across environments.
• Docker guarantees that the R version and OS-level dependencies are consistent.
• GitHub enables collaborative development.
• DockerHub provides a shared, pre-configured execution environment.
Citation
@online{(ryy) glenn thomas2025,
author = {(Ryy) Glenn Thomas, Ronald},
title = {Docker and Renv {Strategy} for {Reproducible} {R}
{Development:} {A} {Practical} {Example}},
date = {2025-03-10},
url = {https://focusonr.org/posts/share_R_code_via_docker/docker_renv.html},
langid = {en}
}