Author
Name Claire Descombes
Affiliation Universitätsklinik für Neurochirurgie, Inselspital Bern
Degree MSc Statistics and Data Science, University of Bern
Contact claire.descombes@insel.ch

The reference material for this course, as well as some useful literature to deepen your knowledge of R, can be found at the bottom of the page.


The base R plotting functions are effective and user-friendly, so we will begin by exploring their features.

Next, we will take a look at the functions offered by ggplot2, a system for declarative graphics creation. You provide the data and specify how to map variables to aesthetics and which graphical primitives to use; ggplot2 then takes care of the details. Although it is slightly more technical than base R, it produces very aesthetically pleasing results.

To explore R’s plotting capabilities, we’ll continue working with the NHANES datasets. Specifically, we’ll use the merged data frame that combines the demo, bpx, bmx and smq data sets. If you still have it loaded from Chapter 2, you can use it directly.

To be able to plot survival curves, we will also need survival data for the NHANES data set. Those can be found publicly as well (for adults only), on the National Death Index. I already merged this survival data with the NHANES data set, please download it from the data_sets folder and import it into R.

# Load the merged_nhanes CSV file
merged_nhanes <- read.csv("/home/claire/Documents/GitHub/rforphysicians/data_sets/merged_nhanes.csv")

# Load the merged_nhanes_with_mort CSV file
merged_nhanes_with_mort <- read.csv("/home/claire/Documents/GitHub/rforphysicians/data_sets/merged_nhanes_with_mort.csv")

1 Including plots using base R

1.1 Scatter plots

plot(x = merged_nhanes$RIDAGEYR, y = merged_nhanes$BPXSY1,
     xlab = "Age of participant [years]",
     ylab = "Systolic blood pressure (1st reading) [mmHg]",
     main = "Scatterplot of age vs. systolic BP",
     col = c("steelblue"))

💡 The arguments xlab and ylab allow you to define a label for the x and y axes respectively. The argument main defines the main title of the plot.

1.2 Box plots

boxplot(merged_nhanes$BPXSY1 ~ merged_nhanes$RIAGENDR,
        names = c("Male", "Female"),
        xlab = "Gender",
        ylab = "Systolic blood pressure (1st reading) [mmHg]",
        main = "Boxplot of BP by gender",
        col = c("salmon", "skyblue"))

1.3 Bar plots

# We define the age groups in which to calculate the mean systolic blood pressure and calculate the mean systolic blood pressure by age group
avg_bp <- merged_nhanes %>%
  filter(!is.na(BPXSY1), !is.na(RIDAGEYR)) %>%             # Remove NAs
  mutate(age_group = cut(RIDAGEYR, breaks = seq(10, 80, 10))) %>%  # Define age groups
  group_by(age_group) %>%
  summarise(mean_bp = mean(BPXSY1, na.rm = TRUE)) %>%
  ungroup()

barplot(height = avg_bp$mean_bp,
        names.arg = avg_bp$age_group,
        xlab = "Age group [years]",
        ylab = "Average systolic BP [mmHg]",
        main = "Bar plot of average systolic BP by age group",
        col = "steelblue")

💡 cut() divides the range of the imputed vector into intervals and codes its values according to which interval they fall. The leftmost interval corresponds to level one, the next leftmost to level two and so on.

1.4 Histograms

hist(merged_nhanes$BMXBMI,
     breaks = 30,
     freq = TRUE,
     col = "steelblue",
     xlab = "Body Mass Index (BMI)",
     main = "Histogram of BMI")

💡 The argument breaks is a vector giving the breakpoints between histogram cells or a function to compute the vector of breakpoints.

💡 The argument freq is a logical: if TRUE, the histogram graphic is a representation of frequencies, the counts component of the result; if FALSE, probability densities, component density, are plotted.

1.5 Line plots

# We define the age groups in which to calculate the mean systolic blood pressure and calculate the mean systolic blood pressure by age group
avg_bp_with_midpoints <- merged_nhanes %>%
  filter(!is.na(BPXSY1), !is.na(RIDAGEYR)) %>%
  mutate(age_group = cut(RIDAGEYR, breaks = seq(10, 80, 10))) %>%
  group_by(age_group) %>%
  summarise(mean_bp = mean(BPXSY1)) %>%
  ungroup() %>%
  mutate(
    # Add midpoints
    age_mid = seq(15, 75, 10)
  )

plot(avg_bp_with_midpoints$age_mid, avg_bp_with_midpoints$mean_bp, type = "b",
     xlab = "Age (midpoint of group) [years]",
     ylab = "Mean systolic BP [mmHg]",
     main = "Mean systolic BP by age group",
     col = "steelblue")

💡 The argument type is 1-character string giving the type of plot desired. The following values are possible:

  • “p” for points
  • “l” for lines
  • “b” for both points and lines
  • “c” for empty points joined by lines
  • “o” for overplotted points and lines
  • “s” and “S” for stair steps and “h” for histogram-like vertical lines

1.6 Pie charts

(gender_table <- table(merged_nhanes$RIAGENDR))
## 
## Female   Male 
##   3298   3251
pie(x = gender_table,
    labels = c("Female", "Male"),
    col = c("skyblue", "salmon"),
    main = "Gender Distribution")

💡 table() uses cross-classifying factors to build a contingency table of the counts at each combination of factor levels.

Personally, I would recommend avoiding pie charts, as the human eye is not very good at recognising angle ratios. I favour bar plots, for example.

1.7 Mosaic plots

(smoking_gender_table <- table(merged_nhanes$RIAGENDR, merged_nhanes$SMQ020))
##         
##          Don't know   No Refused  Yes
##   Female          2 1817       0  879
##   Male            1 1220       1 1398
mosaicplot(smoking_gender_table[,c("Yes","No")], # we select only the 'yes/no' answers
           main = "Mosaic plot of smoking status by gender",
           xlab = "Gender",
           ylab = "Smoked at least 100 cigarettes?",
           col = c("steelblue"))

1.8 Forest plots

Base R does not include a built-in function for forest plots, but you can use a package like forestplot for that, or ggplot2 (see below).

library(forestplot)

# Calculate summary statistics by gender
summary_stats <- merged_nhanes %>%
  filter(!is.na(BPXSY1)) %>%                      # Remove NAs for BPXSY1
  group_by(RIAGENDR) %>%                          # Group by gender
  summarise(
    mean = mean(BPXSY1),
    sd = sd(BPXSY1),
    n = n(),
    lower = mean - 1.96 * sd / sqrt(n),
    upper = mean + 1.96 * sd / sqrt(n)
  ) %>%
  ungroup()

# Create the forest plot
forestplot(labeltext = c("Female","Male"),
           mean = summary_stats$mean,
           lower = summary_stats$lower,
           upper = summary_stats$upper,
           zero = 120,
           grid = TRUE,
           ci.vertices = TRUE,
           xlab = "Systolic BP (95% CI) [mmHg]",
           col = fpColors(box = "steelblue", line = "steelblue"),
           title = "Forest plot of systolic BP by gender")

1.9 Kaplan-Meier curves

Base R requires the survival package in order to draw Kaplan-Meier survival curves.

library(survival)

# Filter out data outside of window (> 24 months)
merged_nhanes_no_outliers <- merged_nhanes_with_mort[merged_nhanes_with_mort$PERMTH_INT <= 24, ]

# Fit Kaplan–Meier survival curves
fit <- survfit(Surv(PERMTH_INT, MORTSTAT) ~ 1, 
               data = merged_nhanes_no_outliers)

# Plot in base R
plot(fit,
     col = "steelblue",
     lwd = 2,
     xlab = "Time (months)",
     ylab = "Survival probability",
       ylim = c(0.7,1),
     main = "Kaplan–Meier survival curve")

1.10 Exercises

✏️ Exercise on plotting with base R n°1: Plot the first, second and third systolic blood pressure readings by gender using base R. Display all three plots (1st, 2nd and 3rd readings) in the same output panel.

💡 Hint: Try using the par() function (e.g. par(mfrow=c(2,3)); do not forget to reset it to par(mfrow=c(1,1)) afterwards!)

✏️ Exercise on plotting with base R n°2: Create a histogram showing the systolic blood pressure (1st reading) for each smoking status category (excluding those who refused to provide their smoking status). Set the number of breaks to 20. You should get three different plots. Display them all in the same output panel.

2 Including plots using ggplot2

# Start by installing ggplot2
install.packages("ggplot2")
library(ggplot2) # after installation

# Note that the ggplot2 package is included in some R package collections, e.g. the tidyverse collection
install.packages("tidyverse")
library(tidyverse) # after installation

ggplot2 is based on the grammar of graphics — the idea that every graph can be built from the same basic components:

  • a data set,
  • a coordinate system, and
  • geoms: visual marks (like points, lines, bars) that represent data.

Structure of a ggplot2 graph:

How to build a plot:

  1. Start with ggplot(), where you provide the dataset (and optionally the aesthetic mapping).
  2. Add layers, such as:
  • geom_point() (scatter plot)
  • geom_histogram() (histogram)
  • geom_line() (line plot)
  • scales (e.g., scale_y_log10())
  • faceting (e.g., facet_wrap())
  • coordinate systems (e.g., coord_flip())
  1. Define aesthetics using aes(): these describe how variables in the data are mapped to visual properties like: x, y, color, fill, linetype, size, etc.

💡 You can place aes() either in ggplot() (to apply it globally) or inside the individual geom_*() functions (to apply it locally).

The geom function must be chosen according to the type of data examined (two continuous variables, one categorical and one continuous variable, etc.).

2.1 Scatter plots

ggplot(merged_nhanes, aes(x = RIDAGEYR, y = BPXSY1)) +
  geom_point(alpha = 0.5, col = "steelblue") +
  labs(title = "Scatterplot of age vs. systolic BP",
       x = "Age of participant [years]",
       y = "Systolic blood pressure (1st reading) [mmHg]") +
  theme_minimal()

Or equivalently:

ggplot(merged_nhanes) +
  geom_point(aes(x = RIDAGEYR, y = BPXSY1), alpha = 0.5, col = "steelblue") +
  labs(title = "Scatterplot of age vs. systolic BP",
       x = "Age of participant [years]",
       y = "Systolic blood pressure (1st reading) [mmHg]") +
  theme_minimal()

2.2 Box plots

ggplot(data = merged_nhanes, aes(x = factor(RIAGENDR, labels = c("Male", "Female")), y = BPXSY1, fill = factor(RIAGENDR))) +
  geom_boxplot() +
  scale_fill_manual(values = c("salmon", "skyblue")) +
  labs(title = "Box plot of systolic BP by gender",
       x = "Gender",
       y = "Systolic BP (1st reading) [mmHg]") +
  theme_minimal() +
  theme(legend.position = "none")

2.3 Bar plots

ggplot(avg_bp, aes(x = age_group, y = mean_bp)) +
  geom_bar(stat = "identity", fill = "steelblue") +
  labs(title = "Bar plot of mean systolic BP by age group",
       x = "Age group [years]",
       y = "Mean systolic BP [mmHg]") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

2.4 Histograms

ggplot(merged_nhanes, aes(x = BMXBMI)) +
  geom_histogram(bins = 30, fill = "steelblue", color = "white") +
  labs(title = "Histogram of BMI",
       x = "Body Mass Index (BMI)",
       y = "Count") +
  theme_minimal()

💡 A quick reminder that the number of bins in a histogram is extremely important and should be given careful consideration, as the appearance of the data can vary significantly depending on this choice, which could potentially be misleading.

An example with the age of the household reference person:

ggplot(merged_nhanes, aes(x = DMDHRAGE)) +
  geom_histogram(bins = 40, fill = "steelblue", color = "white") +
  labs(title = "Histogram of age of the household reference person (40 bins)",
       x = "Age [years]",
       y = "Count") +
  theme_minimal()

ggplot(merged_nhanes, aes(x = DMDHRAGE)) +
  geom_histogram(bins = 20, fill = "steelblue", color = "white") +
  labs(title = "Histogram of age of the household reference person (20 bins)",
       x = "Age [years]",
       y = "Count") +
  theme_minimal()

ggplot(merged_nhanes, aes(x = DMDHRAGE)) +
  geom_histogram(bins = 10, fill = "steelblue", color = "white") +
  labs(title = "Histogram of age of the household reference person (10 bins)",
       x = "Age [years]",
       y = "Count") +
  theme_minimal()

2.5 Line plots

ggplot(avg_bp_with_midpoints, aes(x = age_mid, y = mean_bp)) +
  geom_line(col = "steelblue") +
  geom_point(col = "steelblue") +
  labs(title = "Mean systolic BP by age group",
       x = "Age (midpoint of group) [years]",
       y = "Mean systolic BP [mmHg]") +
  theme_minimal()

2.6 Pie charts

# Count proportions for each gender
gender_counts <- merged_nhanes %>%
  filter(!is.na(RIAGENDR)) %>%                     # Remove NAs if any
  count(Gender = RIAGENDR) %>%                      # Count by gender, rename column to 'Gender'
  mutate(
    prop = n / sum(n),                              # Compute proportion
    ypos = cumsum(prop) - 0.5 * prop                # Compute y position for label
  )

ggplot(gender_counts, aes(x = "", y = prop, fill = Gender)) +
  geom_col(width = 1) +
  coord_polar(theta = "y") +
  geom_text(aes(y = ypos, label = scales::percent(prop)), color = "white") +
  scale_fill_manual(values = c("salmon", "skyblue")) +
  labs(title = "Pie chart of gender distribution") +
  theme_void()

2.7 Mosaic plots

ggplot2 does not include a built-in function for mosaic plots, but can be extended with the ggmoisaic package for that.

library(ggmosaic)

ggplot(data = merged_nhanes[merged_nhanes$SMQ020 %in% c("Yes","No"),]) +
  geom_mosaic(aes(x = product(RIAGENDR), fill = SMQ020)) +
  scale_fill_manual(values = c("salmon", "skyblue")) +
  labs(x = "Gender", y = "Smoked at least 100 cigarettes?", title = "Mosaic plot of smoking status by gender") +
  theme_minimal() +
  theme(legend.position = "none")

2.8 Forest plots

# Calculate summary statistics by gender
forest_df <- merged_nhanes %>%
  filter(!is.na(BPXSY1), !is.na(RIAGENDR)) %>%
  group_by(Gender = RIAGENDR) %>%
  summarise(
    mean = mean(BPXSY1),
    sd = sd(BPXSY1),
    n = n(),
    lower = mean - 1.96 * sd / sqrt(n),
    upper = mean + 1.96 * sd / sqrt(n),
    .groups = "drop"
  )

ggplot(forest_df, aes(x = mean, y = Gender)) +
  geom_point(size = 3, color = "steelblue") +
  geom_errorbarh(aes(xmin = lower, xmax = upper), height = 0.2, color = "steelblue") +
  geom_vline(xintercept = 120, linetype = "dashed") +
  labs(title = "Forest plot of systolic BP by gender",
       x = "Systolic BP (95% CI) [mmHg]",
       y = "") +
  theme_minimal()

2.9 Kaplan-Meier curves

ggplot2 does not include a built-in function for KM curves, but can be extended with the survminer package for that.

library(survminer)

# Fit Kaplan–Meier survival curves (already done)
fit <- survfit(Surv(PERMTH_INT, MORTSTAT) ~ 1, 
               data = merged_nhanes_no_outliers)

# Plot with ggsurvplot
ggsurvplot(fit,
           data = merged_nhanes_no_outliers,
           palette = "steelblue",
           conf.int = TRUE,  
           xlab = "Time (months)",
           ylab = "Survival probability",
           ylim = c(0.7, 1),
           risk.table = TRUE,
           title = "Kaplan–Meier survival curve")

2.10 Exercises

✏️ Exercise on plotting with ggplot2: Using the same age groups as in the examples, create a bar plot showing the median BMI by age group. Then produce two further bar plots showing the same data for each gender, and display them in the same output panel, using the same y-axis to allow for better comparison.

💡 Hints: Try using the grid.arrange() function from the gridExtra package (e.g. grid.arrange(plot1, plot2, ncol=2)), and the argument ylim in the ggplot() function.

3 Saving plots

3.1 Base R

We would use the jpeg() function to save a plot as a JPEG image and the png() function to save it as a PNG image. Please note that after the plotting and saving, we need to call the function dev.off() to save the file.

We can specify the resolution we want with arguments width and height. We can also specify the full path of the file we want to save if we don’t want to save it in the current directory.

plotting_directory <- "/home/claire/Documents/GitHub/rforphysicians/docs/plots/"

jpeg(file = paste0(plotting_directory,"hist_BMI.jpeg"), 
     width = 10, 
     height = 6, 
     units = "in",
     res=100)
hist(merged_nhanes$BMXBMI,
     breaks = 30,
     freq = TRUE,
     col = "steelblue",
     xlab = "Body Mass Index (BMI)",
     main = "Histogram of BMI")
dev.off()

png(file= paste0(plotting_directory,"pie_chart_gender.png"), 
    width = 10, 
    height = 6, 
    units="in",
    res=100)
pie(x = gender_table,
    labels = c("Female", "Male"),
    col = c("skyblue", "salmon"),
    main = "Gender Distribution")
dev.off()

3.2 ggplot2

Saving plots created using the ggplot2 package is very easy with the ggsave() function. However, the plot must first be stored in a variable; otherwise, the function will simply save the last plot displayed. The function guesses the type of graphics from the file extension.

We can specify the resolution we want with arguments width and height. We can also specify the full path of the file we want to save if we don’t want to save it in the current directory.

boxplot_syst_bp_by_gender <- 
  ggplot(data = merged_nhanes, aes(x = factor(RIAGENDR, labels = c("Male", "Female")), y = BPXSY1, fill = factor(RIAGENDR))) +
  geom_boxplot() +
  scale_fill_manual(values = c("salmon", "skyblue")) +
  labs(title = "Box plot of systolic BP by gender",
       x = "Gender",
       y = "Systolic BP (1st reading) [mmHg]") +
  theme_minimal() +
  theme(legend.position = "none")

ggsave(filename = "boxplot_syst_bp_by_gender.png",
  plot = boxplot_syst_bp_by_gender,
  bg = 'white',
  width = 10,
  height = 6,
  units = c("in"),
  path = plotting_directory)

3.3 Exercises

✏️ Exercise on saving plots n°1: Save the three box plots from the plotting exercise using base R n°1 as one JPEG image.

✏️ Exercise on saving plots n°2: Save the two bar plots from the plotting exercise using ggplot2 as one PNG image.

4 Solutions to the exercises

☑️ Exercise on plotting with base R n°1

par(mfrow=c(1,3))
boxplot(merged_nhanes$BPXSY1 ~ merged_nhanes$RIAGENDR,
        names = c("Male", "Female"),
        xlab = "Gender",
        ylab = "Systolic blood pressure (1st reading) [mmHg]",
        main = "Boxplot of BP by gender",
        col = c("salmon", "skyblue"))
boxplot(merged_nhanes$BPXSY2 ~ merged_nhanes$RIAGENDR,
        names = c("Male", "Female"),
        xlab = "Gender",
        ylab = "Systolic blood pressure (2nd reading) [mmHg]",
        main = "Boxplot of BP by gender",
        col = c("salmon", "skyblue"))
boxplot(merged_nhanes$BPXSY3 ~ merged_nhanes$RIAGENDR,
        names = c("Male", "Female"),
        xlab = "Gender",
        ylab = "Systolic blood pressure (3rd reading) [mmHg]",
        main = "Boxplot of BP by gender",
        col = c("salmon", "skyblue"))

par(mfrow=c(1,1))

☑️ Exercise on plotting with base R n°2

# We identify the different smoking status
table(merged_nhanes$SMQ040)
## 
##  Every day Not at all    Refused  Some days 
##        878       1210          2        187
par(mfrow=c(3,1))
hist(merged_nhanes[merged_nhanes$SMQ040=="Not at all",]$BPXSY1,
     breaks = 20,
     freq = TRUE,
     col = "darkgreen",
     xlab = "Systolic blood pressure (1st reading) [mmHg]",
     main = "Histogram of systolic BP: not smoking")
hist(merged_nhanes[merged_nhanes$SMQ040=="Some days",]$BPXSY1,
     breaks = 20,
     freq = TRUE,
     col = "orange2",
     xlab = "Systolic blood pressure (1st reading) [mmHg]",
     main = "Histogram of systolic BP: smoking every other day")
hist(merged_nhanes[merged_nhanes$SMQ040=="Every day",]$BPXSY1,
     breaks = 20,
     freq = TRUE,
     col = "coral3",
     xlab = "Systolic blood pressure (1st reading) [mmHg]",
     main = "Histogram of systolic BP: smoking every day")

par(mfrow=c(1,1))

# In case you were wondering why the BP is higher in the group of non-smokers
merged_nhanes$age_group <- cut(merged_nhanes$RIDAGEYR, breaks = seq(10, 80, 10))
table(merged_nhanes[merged_nhanes$SMQ040!="Refused",]$SMQ040, merged_nhanes[merged_nhanes$SMQ040!="Refused",]$age_group)
##             
##              (10,20] (20,30] (30,40] (40,50] (50,60] (60,70] (70,80]
##   Every day       16     165     187     179     172     111      48
##   Not at all       3      84     152     148     228     283     312
##   Some days        5      60      35      30      26      24       7

☑️ Exercise on plotting with ggplot2

# We define the age groups
age_groups <- cut(x = merged_nhanes$RIDAGEYR, breaks = seq(10, 80, 10))
head(age_groups)
## [1] (20,30] (10,20] (40,50] (10,20] (20,30] (10,20]
## Levels: (10,20] (20,30] (30,40] (40,50] (50,60] (60,70] (70,80]
# We calculate the median BMI by age group
(med_bmi <- aggregate(BMXBMI ~ age_groups, data = merged_nhanes, FUN = function(x) median(x, na.rm = TRUE)))
ggplot(med_bmi, aes(x = age_groups, y = BMXBMI)) +
  geom_bar(stat = "identity", fill = "steelblue") +
  labs(title = "Bar plot of median BMI by age group",
       x = "Age group [years]",
       y = "Median BMI") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

# We calculate the median BMI by age group and by gender
(med_bmi <- aggregate(BMXBMI ~ age_groups + RIAGENDR, data = merged_nhanes, FUN = function(x) median(x, na.rm = TRUE)))
med_bmi_fem <- med_bmi[med_bmi$RIAGENDR == "Female",]
med_bmi_male <- med_bmi[med_bmi$RIAGENDR == "Male",]

plot1 <- ggplot(med_bmi_fem, aes(x = age_groups, y = BMXBMI)) +
  geom_bar(stat = "identity", fill = "skyblue") +
  labs(title = "Median BMI, gender = female",
       x = "Age group [years]",
       y = "Median BMI") +
  theme_minimal() +
  ylim(c(0,30)) +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))
plot2 <- ggplot(med_bmi_male, aes(x = age_groups, y = BMXBMI)) +
  geom_bar(stat = "identity", fill = "salmon") +
  labs(title = "Median BMI, gender = male",
       x = "Age group [years]",
       y = "Median BMI") +
  theme_minimal() +
  ylim(c(0,30)) +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

library(gridExtra)
grid.arrange(plot1, plot2, ncol=2)

☑️ Exercise on saving plots n°1

plotting_directory <- "/home/claire/Documents/GitHub/rforphysicians/docs/plots/" # your plotting directory here

jpeg(file = paste0(plotting_directory,"boxplots_bp_by_gender.jpeg"), 
     width = 10, 
     height = 6, 
     units = "in",
     res=100)
par(mfrow=c(1,3))
boxplot(merged_nhanes$BPXSY1 ~ merged_nhanes$RIAGENDR,
        names = c("Male", "Female"),
        xlab = "Gender",
        ylab = "Systolic blood pressure (1st reading) [mmHg]",
        main = "Boxplot of BP by gender",
        col = c("salmon", "skyblue"))
boxplot(merged_nhanes$BPXSY2 ~ merged_nhanes$RIAGENDR,
        names = c("Male", "Female"),
        xlab = "Gender",
        ylab = "Systolic blood pressure (2nd reading) [mmHg]",
        main = "Boxplot of BP by gender",
        col = c("salmon", "skyblue"))
boxplot(merged_nhanes$BPXSY3 ~ merged_nhanes$RIAGENDR,
        names = c("Male", "Female"),
        xlab = "Gender",
        ylab = "Systolic blood pressure (3rd reading) [mmHg]",
        main = "Boxplot of BP by gender",
        col = c("salmon", "skyblue"))
dev.off()
par(mfrow=c(1,1))

☑️ Exercise on saving plots n°2

plotting_directory <- "/home/claire/Documents/GitHub/rforphysicians/docs/plots/" # your plotting directory here

bar_plot_med_bmi_by_gender <- grid.arrange(plot1, plot2, ncol=2)
ggsave(filename = "bar_plot_med_bmi_by_gender.png",
  plot = bar_plot_med_bmi_by_gender,
  bg = 'white',
  width = 10,
  height = 6,
  units = c("in"),
  path = plotting_directory)

References

Alexander Henzi. 2021. “Programming and Data Analysis with R.” Lecture notes.
Burns, Patrick. n.d. The R Inferno. Accessed May 8, 2025. https://www.burns-stat.com/documents/books/the-r-inferno/.
CDC. 2025. “National Death Index.” Data Linkage. https://www.cdc.gov/nchs/linked-data/mortality-files/index.html.
ChatGPT.” n.d. Accessed January 26, 2025. https://chatgpt.com.
Christopher J. Endres. 2025. “Introducing nhanesA.” https://cran.r-project.org/web/packages/nhanesA/vignettes/Introducing_nhanesA.html.
“Create Elegant Data Visualisations Using the Grammar of Graphics.” n.d. Accessed January 26, 2025. https://ggplot2.tidyverse.org/.
David, Author. 2016. BIRT Joins.” MBSE Chaos. https://mbsechaos.wordpress.com/2016/05/24/birt-joins/.
Elena Kosourova. n.d. RStudio Tutorial for Beginners: A Complete Guide.” Accessed January 26, 2025. https://www.datacamp.com/tutorial/r-studio-tutorial.
Grolemund, Hadley Wickham and Garrett. n.d. R for Data Science. Accessed May 8, 2025. https://r4ds.had.co.nz/introduction.html.
Mayer, Michael. 2025. “Mayer79/Statistical_computing_material.” https://github.com/mayer79/statistical_computing_material.
Patrick Burns. n.d. Impatient R. Accessed May 8, 2025. https://www.burns-stat.com/documents/tutorials/impatient-r/.
P-Value.” 2025. Wikipedia. https://en.wikipedia.org/w/index.php?title=P-value&oldid=1305292611.
“Synthetic Dataset for AI in Healthcare.” n.d. Accessed May 9, 2025. https://www.kaggle.com/datasets/smmmmmmmmmmmm/synthetic-dataset-for-ai-in-healthcare.
“The Comprehensive R Archive Network.” n.d. Accessed January 26, 2025. https://stat.ethz.ch/CRAN/.
W. N. Venables, D. M. Smith and the R Core Team. n.d. “An Introduction to R.” Accessed May 8, 2025. https://cran.r-project.org/doc/manuals/r-release/R-intro.html.
Wickham, Hadley. n.d. Advanced R. Accessed May 8, 2025. https://adv-r.hadley.nz/introduction.html.