focusonr
  • Home
  • Blog
  • rgtlab.org

On this page

  • 1 Introduction
    • 1.1 Motivations
    • 1.2 Objectives
  • 2 Prerequisites and Setup
  • 3 What Are Morphometrics?
  • 4 Getting Started: Initial Exploration
    • 4.1 Species and Morphometric Overview
    • 4.2 Species-Specific Patterns
    • 4.3 Looking for Relationships
  • 5 Building a Model
    • 5.1 Fitting the Simple Regression
    • 5.2 Making Predictions
  • 6 Checking Our Work
    • 6.1 Things to Watch Out For
  • 7 What Did We Learn?
    • 7.1 Lessons Learnt
      • 7.1.1 Technical Skills
      • 7.1.2 Gotchas and Pitfalls
    • 7.2 Limitations
    • 7.3 Opportunities for Improvement
  • 8 Wrapping Up
  • 9 See Also
  • 10 Reproducibility
  • 11 Let’s Connect

Other Formats

  • PDF

Palmer Penguins Part 1: Exploratory Data Analysis and Simple Regression

Getting acquainted with Antarctic penguin morphometrics and their predictive relationships

r
regression
zzcollab
reproducibility

Exploring the Palmer Penguins dataset reveals how much a single morphometric measurement can predict about penguin body mass. A simple regression model provides a strong baseline for subsequent analyses.

Author

Ronald ‘Ryy’ G. Thomas

Published

January 1, 2025

Penguins gathered along the Antarctic shoreline

Photo: Antarctic penguins. Licensed under CC BY 2.0 via Wikimedia Commons.

1 Introduction

How much information can a single flipper measurement carry? Working with the Palmer Penguins dataset reveals that one well-chosen predictor can explain a surprising amount of variation in body mass, despite initial assumptions that predicting body mass from morphometrics would require a complex model.

The Palmer Penguins dataset was collected by Dr. Kristen Gorman at Palmer Station, Antarctica, and contains morphometric measurements for three species: Adelie (Pygoscelis adeliae), Chinstrap (Pygoscelis antarcticus), and Gentoo (Pygoscelis papua). It has become a widely used teaching dataset in the R community, offering a richer and more realistic alternative to the classic iris dataset.

This post walks through an initial encounter with the data, from exploratory analysis to a simple regression model. The relationship between flipper length and body mass proves remarkably strong, though ignoring species differences leaves a lot on the table.

More formally, this analysis is presented as an instance of the Project Compendium tier of the Workflow Construct described in post 52. The penguins dataset is the construct’s reference example; running it through the zzcollab compendium pattern (Dockerfile + renv.lock + Makefile + R/ + analysis/) produces an artefact that another analyst can rebuild byte-for-byte. The construct-tier post that documents the compendium pattern itself is post 29; this post is one worked example of that pattern.

1.1 Motivations

  • To practise a complete exploratory data analysis workflow on a real biological dataset rather than simulated data.
  • To examine how well simple morphometric measurements can predict body mass without any species information.
  • To provide a concrete example of Simpson’s Paradox and why ignoring grouping variables can be misleading.
  • To demonstrate that working with penguin data makes statistics feel more tangible than abstract textbook examples.
  • To establish a baseline model that subsequent posts can improve upon.

1.2 Objectives

  1. Load and clean the Palmer Penguins dataset, documenting missing values and data quality.
  2. Conduct exploratory data analysis with summary statistics and visualisations for each species.
  3. Compute correlations among morphometric variables and identify the strongest predictor of body mass.
  4. Fit a simple linear regression model (body mass ~ flipper length), report R-squared and confidence intervals, and examine residual diagnostics.

Errors and better approaches are welcome; see the Feedback section at the end.

A quiet moment before the analysis begins.

2 Prerequisites and Setup

Background: This post assumes familiarity with basic R syntax and the tidyverse. No prior knowledge of regression modelling is required.

Required packages:

install.packages(
  c("palmerpenguins", "tidyverse", "broom",
    "corrplot", "GGally", "patchwork", "knitr")
)

Load libraries:

library(palmerpenguins)
library(tidyverse)
library(broom)
library(corrplot)
library(GGally)
library(patchwork)
library(knitr)

theme_set(theme_minimal(base_size = 12))

penguin_colors <- c(
  "Adelie"    = "#FF6B6B",
  "Chinstrap" = "#9B59B6",
  "Gentoo"    = "#2E86AB"
)

3 What Are Morphometrics?

Morphometrics is the quantitative study of biological form: measuring the size and shape of organisms. In penguin research, this means recording bill length, bill depth, flipper length, and body mass for each individual. These measurements serve as proxies for overall health and condition, and they help ecologists monitor population trends without invasive procedures.

The central question in this post is whether a single measurement, flipper length, can reliably predict body mass. If it can, field researchers gain a quick, non-invasive way to assess penguin condition.

4 Getting Started: Initial Exploration

The analysis begins by loading the data and checking for missing values. The raw dataset has 344 observations, but some records are incomplete.

data(penguins)

cat(
  "Palmer Penguins Dataset Overview\n",
  "================================\n",
  "Dimensions:",
  nrow(penguins), "observations x",
  ncol(penguins), "variables\n"
)
Palmer Penguins Dataset Overview
 ================================
 Dimensions: 344 observations x 8 variables
missing_counts <- sapply(
  penguins, function(x) sum(is.na(x))
)
cat(
  "Missing values:",
  sum(missing_counts), "total\n\n"
)
Missing values: 19 total
penguins_clean <- penguins |> drop_na()
cat(
  "Clean dataset:",
  nrow(penguins_clean),
  "observations (removed",
  nrow(penguins) - nrow(penguins_clean),
  "incomplete cases)\n\n"
)
Clean dataset: 333 observations (removed 11 incomplete cases)
glimpse(penguins_clean)
Rows: 333
Columns: 8
$ species           <fct> Adelie, Adelie, Adelie, Adelie, Adelie, Adelie, Adel…
$ island            <fct> Torgersen, Torgersen, Torgersen, Torgersen, Torgerse…
$ bill_length_mm    <dbl> 39.1, 39.5, 40.3, 36.7, 39.3, 38.9, 39.2, 41.1, 38.6…
$ bill_depth_mm     <dbl> 18.7, 17.4, 18.0, 19.3, 20.6, 17.8, 19.6, 17.6, 21.2…
$ flipper_length_mm <int> 181, 186, 195, 193, 190, 181, 195, 182, 191, 198, 18…
$ body_mass_g       <int> 3750, 3800, 3250, 3450, 3650, 3625, 4675, 3200, 3800…
$ sex               <fct> male, female, female, female, male, female, male, fe…
$ year              <int> 2007, 2007, 2007, 2007, 2007, 2007, 2007, 2007, 2007…

After removing incomplete cases, 333 observations remain across three species. That is a modest but workable sample for exploratory modelling.

4.1 Species and Morphometric Overview

The three species differ noticeably in both sample size and average body mass. Gentoo penguins are considerably larger than Adelie and Chinstrap.

species_summary <- penguins_clean |>
  group_by(species) |>
  summarise(
    n = n(),
    body_mass_mean =
      round(mean(body_mass_g), 0),
    flipper_length_mean =
      round(mean(flipper_length_mm), 1),
    .groups = "drop"
  ) |>
  mutate(
    percentage = round(n / sum(n) * 100, 1)
  )

kable(
  species_summary,
  caption = paste(
    "Species Distribution and",
    "Key Morphometrics"
  ),
  col.names = c(
    "Species", "N", "Body Mass (g)",
    "Flipper Length (mm)", "% of Dataset"
  )
)
Species Distribution and Key Morphometrics
Species N Body Mass (g) Flipper Length (mm) % of Dataset
Adelie 146 3706 190.1 43.8
Chinstrap 68 3733 195.8 20.4
Gentoo 119 5092 217.2 35.7
p_species <- ggplot(
  species_summary,
  aes(x = species, y = n, fill = species)
) +
  geom_col(alpha = 0.8) +
  geom_text(
    aes(
      label = paste0(n, "\n(", percentage, "%)")
    ),
    vjust = -0.5, size = 3.5
  ) +
  scale_fill_manual(values = penguin_colors) +
  labs(
    title = "Species Distribution",
    x = "Species", y = "Count"
  ) +
  theme_minimal() +
  theme(legend.position = "none")

p_relationship <- ggplot(
  penguins_clean,
  aes(
    x = flipper_length_mm,
    y = body_mass_g,
    color = species
  )
) +
  geom_point(alpha = 0.7, size = 1.5) +
  geom_smooth(
    method = "lm", se = FALSE, size = 0.8
  ) +
  scale_color_manual(values = penguin_colors) +
  labs(
    title = "Flipper Length vs Body Mass",
    x = "Flipper Length (mm)",
    y = "Body Mass (g)",
    color = "Species"
  ) +
  theme_minimal()

eda_overview <- p_species + p_relationship
print(eda_overview)

ggsave(
  "eda-overview.png",
  plot = eda_overview,
  width = 10, height = 5, dpi = 300
)
Two-panel figure: bar chart of species counts on the left and scatter plot of flipper length versus body mass on the right with species-coloured regression lines.
Figure 1: Species distribution and morphometric relationship overview showing sample sizes and the key flipper-body mass relationship across species.

The scatter plot on the right immediately suggested a strong positive relationship between flipper length and body mass, and the three species clearly occupy somewhat different regions of the plot.

4.2 Species-Specific Patterns

morphometric_summary <- penguins_clean |>
  group_by(species) |>
  summarise(
    n = n(),
    body_mass_mean =
      round(mean(body_mass_g), 0),
    body_mass_ci =
      round(
        1.96 * sd(body_mass_g) / sqrt(n()),
        1
      ),
    flipper_length_mean =
      round(mean(flipper_length_mm), 1),
    flipper_length_ci =
      round(
        1.96 * sd(flipper_length_mm) /
          sqrt(n()),
        1
      ),
    .groups = "drop"
  )

kable(
  morphometric_summary,
  caption = paste(
    "Morphometric Statistics by Species",
    "(plus/minus 95% CI)"
  ),
  col.names = c(
    "Species", "N", "Body Mass (g)",
    "+/- 95% CI",
    "Flipper Length (mm)", "+/- 95% CI"
  )
)
Morphometric Statistics by Species (plus/minus 95% CI)
Species N Body Mass (g) +/- 95% CI Flipper Length (mm) +/- 95% CI
Adelie 146 3706 74.4 190.1 1.1
Chinstrap 68 3733 91.4 195.8 1.7
Gentoo 119 5092 90.1 217.2 1.2
p_comparison <- ggplot(
  penguins_clean,
  aes(x = species, fill = species)
) +
  geom_boxplot(
    aes(y = body_mass_g),
    alpha = 0.7,
    position = position_dodge(0.8)
  ) +
  scale_fill_manual(values = penguin_colors) +
  labs(
    title = paste(
      "Body Mass Distribution by Species"
    ),
    subtitle = paste(
      "Gentoo penguins are notably larger",
      "than Adelie and Chinstrap"
    ),
    x = "Species",
    y = "Body Mass (g)"
  ) +
  theme_minimal() +
  theme(legend.position = "none")

print(p_comparison)

ggsave(
  "species-comparison.png",
  plot = p_comparison,
  width = 8, height = 5, dpi = 300
)
Box plot of body mass by species with Adelie median around 3750g, Chinstrap around 3700g, and Gentoo around 5000g.
Figure 2: Body mass distributions by species showing Gentoo penguins are substantially larger than Adelie and Chinstrap.

Gentoo penguins stand out clearly, with a median body mass roughly 1250g higher than the other two species. This species-level variation will matter a great deal when we move to multiple regression in the next post.

A wider view of the research landscape

Stepping back to see the bigger picture.

4.3 Looking for Relationships

A correlation matrix across the four numeric morphometric variables identifies which measurement has the strongest linear association with body mass.

numeric_vars <- penguins_clean |>
  select(
    bill_length_mm, bill_depth_mm,
    flipper_length_mm, body_mass_g
  )

correlation_matrix <- cor(numeric_vars)

body_mass_cors <- correlation_matrix[
  "body_mass_g",
] |>
  sort(decreasing = TRUE) |>
  round(3)

cat(
  "Correlations with Body Mass:\n"
)
Correlations with Body Mass:
cat(sprintf(
  "  Flipper Length: %s (strongest)\n",
  body_mass_cors["flipper_length_mm"]
))
  Flipper Length: 0.873 (strongest)
cat(sprintf(
  "  Bill Length: %s\n",
  body_mass_cors["bill_length_mm"]
))
  Bill Length: 0.589
cat(sprintf(
  "  Bill Depth: %s (negative)\n",
  body_mass_cors["bill_depth_mm"]
))
  Bill Depth: -0.472 (negative)
png(
  "correlation-matrix.png",
  width = 6, height = 6,
  res = 300, units = "in"
)
corrplot(
  correlation_matrix,
  method = "color",
  type = "upper",
  order = "hclust",
  tl.cex = 1.0,
  tl.col = "black",
  addCoef.col = "black",
  number.cex = 0.8,
  title = "Morphometric Correlations",
  mar = c(0, 0, 2, 0)
)
dev.off()
quartz_off_screen 
                2 
Upper triangular correlation matrix with colour-coded cells and numeric coefficients for four morphometric variables.
Figure 3: Correlation matrix showing flipper length as the strongest predictor of body mass (r = 0.87).

Flipper length emerged as the clear winner, with a correlation of 0.87 against body mass. This makes biological sense: longer flippers correspond to larger birds overall. The negative correlation with bill depth was initially surprising, but it reflects the confounding influence of species (Gentoo penguins have shallower bills but heavier bodies).

5 Building a Model

5.1 Fitting the Simple Regression

With flipper length identified as the strongest predictor, a simple linear regression model is appropriate.

simple_model <- lm(
  body_mass_g ~ flipper_length_mm,
  data = penguins_clean
)

model_coefficients <- tidy(
  simple_model, conf.int = TRUE
)
model_metrics <- glance(simple_model)

cat("Simple Linear Model Results:\n")
Simple Linear Model Results:
cat("============================\n")
============================
cat(sprintf(
  "R-squared: %.3f (%.1f%% variance explained)\n",
  model_metrics$r.squared,
  model_metrics$r.squared * 100
))
R-squared: 0.762 (76.2% variance explained)
cat(sprintf(
  "RMSE: %.1f grams\n",
  sigma(simple_model)
))
RMSE: 393.3 grams
cat(sprintf(
  "F-statistic: %.1f (p < 0.001)\n",
  model_metrics$statistic
))
F-statistic: 1060.3 (p < 0.001)
intercept <- model_coefficients$estimate[1]
slope <- model_coefficients$estimate[2]
slope_ci_lower <- model_coefficients$conf.low[2]
slope_ci_upper <- model_coefficients$conf.high[2]

cat("\nModel Equation:\n")

Model Equation:
cat(sprintf(
  "Body Mass = %.1f + %.1f x Flipper Length\n",
  intercept, slope
))
Body Mass = -5872.1 + 50.2 x Flipper Length
cat(sprintf(
  "Slope 95%% CI: [%.1f, %.1f] grams/mm\n",
  slope_ci_lower, slope_ci_upper
))
Slope 95% CI: [47.1, 53.2] grams/mm

An R-squared of 0.76 indicates that flipper length alone accounts for about three-quarters of the variation in body mass. That is a surprisingly strong result for a single predictor.

5.2 Making Predictions

To make the model concrete, predictions for three representative flipper lengths with 95% confidence intervals are generated.

new_data <- tibble(
  flipper_length_mm = c(180, 200, 220)
)
predictions <- predict(
  simple_model,
  newdata = new_data,
  interval = "confidence"
)

cat("Example Predictions (95% CI):\n")
Example Predictions (95% CI):
for (i in 1:nrow(new_data)) {
  cat(sprintf(
    "  %dmm flippers: %.0f g [%.0f, %.0f]\n",
    new_data$flipper_length_mm[i],
    predictions[i, "fit"],
    predictions[i, "lwr"],
    predictions[i, "upr"]
  ))
}
  180mm flippers: 3155 g [3079, 3232]
  200mm flippers: 4159 g [4116, 4201]
  220mm flippers: 5162 g [5090, 5233]

The confidence intervals are reasonably narrow, which is encouraging. A penguin with 200mm flippers would be predicted to weigh roughly 4200g, give or take a few hundred grams.

model_plot <- ggplot(
  penguins_clean,
  aes(x = flipper_length_mm, y = body_mass_g)
) +
  geom_point(
    aes(color = species), alpha = 0.6
  ) +
  geom_smooth(
    method = "lm",
    color = "black",
    fill = "gray80"
  ) +
  scale_color_manual(values = penguin_colors) +
  labs(
    title = paste(
      "Simple Linear Regression:",
      "Body Mass ~ Flipper Length"
    ),
    subtitle = paste(
      "Gray band shows 95%",
      "confidence interval"
    ),
    x = "Flipper Length (mm)",
    y = "Body Mass (g)",
    color = "Species"
  ) +
  theme_minimal()

print(model_plot)

ggsave(
  "simple-regression-model.png",
  plot = model_plot,
  width = 8, height = 5, dpi = 300
)
Scatter plot with regression line showing positive relationship between flipper length and body mass, with gray confidence band and species-coloured points.
Figure 4: Simple linear regression of body mass on flipper length with 95% confidence band.

6 Checking Our Work

Before drawing conclusions, examining the residuals reveals whether the model assumptions hold.

penguins_with_predictions <- penguins_clean |>
  mutate(
    predicted = predict(simple_model),
    residuals = residuals(simple_model),
    standardized_residuals =
      rstandard(simple_model)
  )

outliers <- which(
  abs(
    penguins_with_predictions$standardized_residuals
  ) > 2.5
)
cat("Model Assumption Checks:\n")
Model Assumption Checks:
cat(sprintf(
  "  Potential outliers: %d (>2.5 SD)\n",
  length(outliers)
))
  Potential outliers: 5 (>2.5 SD)
cat(sprintf(
  "  Residual standard error: %.1f grams\n",
  sigma(simple_model)
))
  Residual standard error: 393.3 grams
diagnostic_plot <- ggplot(
  penguins_with_predictions,
  aes(
    x = predicted,
    y = standardized_residuals
  )
) +
  geom_point(
    aes(color = species), alpha = 0.6
  ) +
  geom_hline(
    yintercept = c(-2, 0, 2),
    linetype = c("dashed", "solid", "dashed"),
    color = c("red", "black", "red")
  ) +
  scale_color_manual(values = penguin_colors) +
  labs(
    title = "Model Residuals Diagnostic",
    subtitle = paste(
      "Species clustering suggests",
      "missing predictors"
    ),
    x = "Predicted Body Mass (g)",
    y = "Standardized Residuals",
    color = "Species"
  ) +
  theme_minimal()

print(diagnostic_plot)

ggsave(
  "model-diagnostics.png",
  plot = diagnostic_plot,
  width = 8, height = 5, dpi = 300
)
Scatter plot of standardized residuals versus predicted values with points coloured by species, showing distinct clustering patterns.
Figure 5: Residual plot showing species clustering, indicating the model is missing important predictors.

The residual plot reveals a clear pattern: the three species form distinct clusters. This indicates that the simple model, while useful, is leaving systematic variation on the table. Species identity clearly matters, and ignoring it introduces bias into the predictions.

6.1 Things to Watch Out For

  1. Simpson’s Paradox is real. Ignoring species can reverse or obscure the true relationship between variables. Always check for grouping structure.
  2. Missing data is not random. The 11 incomplete cases dropped may not be missing at random. It is worth investigating why measurements were missing before assuming complete-case analysis is safe.
  3. Correlation is not slope. A correlation of 0.87 sounds high, but the RMSE of 393g means individual predictions can be off by quite a bit.
  4. Extrapolation is dangerous. The model is only valid within the observed flipper length range (172-231mm). Predictions outside this range are unreliable.
  5. Residual patterns demand attention. When residuals cluster by group, the model is telling you something important is missing.

Looking forward to what comes next

Pausing to reflect before drawing conclusions.

7 What Did We Learn?

7.1 Lessons Learnt

Conceptual Understanding:

  • Flipper length alone explains 76.2% of body mass variance (R-squared = 0.762), which is a remarkably strong single-predictor result in biological data.
  • Species-level differences are substantial: Gentoo penguins average roughly 5000g compared to approximately 3700g for Adelie and Chinstrap.
  • The negative correlation between bill depth and body mass (r = -0.47) is a classic example of confounding by species, not a true biological relationship.
  • Confidence intervals around the slope (approximately 46-54 grams per mm of flipper length) are narrow enough to be practically useful for field estimation.

7.1.1 Technical Skills

  • Using broom::tidy() and broom::glance() produces tidy model summaries that integrate cleanly into a tidyverse workflow.
  • corrplot provides a quick visual summary of multivariate relationships that is easier to scan than a numeric matrix.
  • Combining patchwork with ggplot2 enables side-by-side panel figures without wrestling with par(mfrow) or grid graphics.
  • The predict() function with interval = "confidence" gives immediate access to uncertainty quantification for new observations.

7.1.2 Gotchas and Pitfalls

  • Dropping NA rows silently can change species proportions; always report how many cases were removed and from which groups.
  • The default geom_smooth(method = "lm") plots a confidence band for the mean, not a prediction interval; these are different and should not be confused.
  • Saving a plot with ggsave() and then including it with include_graphics() can produce slightly different dimensions than the inline rendering; set explicit width and height to ensure consistency.
  • Colour palettes that look fine on screen may not be distinguishable in greyscale or to colour-blind readers; always use shape or linetype as a secondary encoding.

7.2 Limitations

  • The dataset contains only 333 complete observations across three species, which limits the complexity of models that can be reliably estimated.
  • Data were collected between 2007 and 2009 at a single Antarctic station; climate and environmental conditions may have changed since then.
  • The simple model treats all species as a homogeneous population, which the residual analysis clearly shows is inappropriate.
  • Measurement error in morphometric variables is not accounted for in the model.
  • Sex is available in the data but was not included in this initial analysis; it may explain additional variance.
  • The causal direction is assumed (flipper length predicts mass), but both variables are expressions of overall body size.

7.3 Opportunities for Improvement

  1. Add species as a categorical predictor to account for the between-species variance visible in the residuals.
  2. Include bill length and bill depth as additional predictors in a multiple regression framework.
  3. Explore interaction terms between species and flipper length to allow different slopes per species.
  4. Use cross-validation to assess out-of-sample prediction accuracy rather than relying solely on in-sample R-squared.
  5. Compare linear models to non-parametric approaches such as random forests to see whether the linear assumption is costing predictive accuracy.
  6. Investigate the missing data mechanism before dropping incomplete cases.

8 Wrapping Up

This first exploration of the Palmer Penguins dataset revealed that flipper length is a remarkably strong predictor of body mass, but that ignoring species differences leaves substantial room for improvement.

Even a simple regression model can tell a compelling story when paired with careful diagnostic checks. The residual clustering by species was the most instructive finding: it shows exactly where the model falls short and points directly toward the next analytical step.

For analysts working through this kind of analysis, the advice is to always plot the residuals and look for structure. The numbers alone (R-squared, p-values) do not tell the whole story.

Main takeaways:

  • Flipper length explains 76.2% of body mass variance (R-squared = 0.762).
  • RMSE of approximately 393g provides a practical measure of prediction accuracy.
  • Species clustering in residuals indicates the model is missing a key predictor.
  • Adding species information in Part 2 will improve R-squared to over 0.860.

9 See Also

Related posts in this series:

  • Part 2: Multiple Regression and Species Effects
  • Part 3: Advanced Models and Cross-Validation
  • Part 4: Model Diagnostics and Interpretation
  • Part 5: Random Forest vs Linear Models

Key resources:

  • Palmer Penguins R Package – official documentation and vignettes
  • R for Data Science (2e) – Wickham, Cetinkaya-Rundel, and Grolemund
  • Introduction to Statistical Learning – James, Witten, Hastie, and Tibshirani
  • Tidy Modeling with R – Kuhn and Silge

10 Reproducibility

Data source: The palmerpenguins R package (version 0.1.1), which provides the penguins dataset collected by Dr. Kristen Gorman at Palmer Station, Antarctica.

Analysis pipeline:

Rscript analysis/scripts/01_prepare_data.R
Rscript analysis/scripts/02_fit_models.R
Rscript analysis/scripts/03_generate_figures.R
quarto render index.qmd

Session information:

R version 4.5.3 (2026-03-11)
Platform: aarch64-apple-darwin20
Running under: macOS Tahoe 26.4.1

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/lib/libRblas.0.dylib 
LAPACK: /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.12.1

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

time zone: America/Los_Angeles
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] knitr_1.51           patchwork_1.3.2      GGally_2.2.1        
 [4] corrplot_0.95        broom_1.0.11         lubridate_1.9.4     
 [7] forcats_1.0.0        stringr_1.6.0        dplyr_1.2.1         
[10] purrr_1.2.1          readr_2.2.0          tidyr_1.3.2         
[13] tibble_3.3.1         ggplot2_4.0.2        tidyverse_2.0.0     
[16] palmerpenguins_0.1.1

loaded via a namespace (and not attached):
 [1] generics_0.1.4     lattice_0.22-9     stringi_1.8.7      hms_1.1.4         
 [5] digest_0.6.39      magrittr_2.0.5     evaluate_1.0.5     grid_4.5.3        
 [9] timechange_0.3.0   RColorBrewer_1.1-3 fastmap_1.2.0      Matrix_1.7-4      
[13] plyr_1.8.9         jsonlite_2.0.0     backports_1.5.0    mgcv_1.9-4        
[17] scales_1.4.0       textshaping_1.0.4  cli_3.6.5          rlang_1.1.7       
[21] splines_4.5.3      withr_3.0.2        yaml_2.3.12        otel_0.2.0        
[25] tools_4.5.3        parallel_4.5.3     tzdb_0.5.0         ggstats_0.9.0     
[29] png_0.1-8          vctrs_0.7.2        R6_2.6.1           lifecycle_1.0.5   
[33] htmlwidgets_1.6.4  ragg_1.5.0         pkgconfig_2.0.3    pillar_1.11.1     
[37] gtable_0.3.6       glue_1.8.0         Rcpp_1.1.1         systemfonts_1.3.1 
[41] xfun_0.57          tidyselect_1.2.1   farver_2.1.2       nlme_3.1-168      
[45] htmltools_0.5.9    rmarkdown_2.31     labeling_0.4.3     compiler_4.5.3    
[49] S7_0.2.1          

11 Let’s Connect

Have questions, suggestions, or spot an error? Let me know.

  • GitHub: rgt47
  • Twitter/X: @rgt47
  • LinkedIn: Ronald Glenn Thomas
  • Email: Contact form

I would enjoy hearing from you if:

  • You spot an error or a better approach to any of the code in this post.
  • You have suggestions for topics you would like to see covered.
  • You want to discuss R programming, data science, or reproducible research.
  • You have questions about anything in this tutorial.
  • You just want to say hello and connect.

Copyright 2023-2026, Ronald ‘Ryy’ G. Thomas. The lab’s other activities live at rgtlab.org.