Author
Name: Claire Descombes
Affiliation: Universitätsklinik für Neurochirurgie, Inselspital Bern
Degree: MSc Statistics and Data Science, University of Bern
Contact: claire.descombes@insel.ch

The reference material for this course, as well as some useful literature to deepen your knowledge of R, can be found at the bottom of the page.

1 Useful functions for efficient data handling

Function(s) | Package | Description
Selecting and filtering
subset() | base R | Select rows and columns of a data frame
select(), rename() | dplyr | Select or rename columns
filter() | dplyr | Select rows by condition
slice() | dplyr | Select rows by position
distinct() | dplyr | Return unique/distinct rows
duplicated() | base R | Logical vector: is row duplicated?
Sorting and ordering
order(), rank() | base R | Sort indices or compute ranks
arrange() | dplyr | Sort by one or more columns
Creating or transforming variables
transform() | base R | Add or overwrite columns in a data frame
mutate(), transmute() | dplyr | Create or overwrite columns; transmute() returns only new ones
cumsum() | base R | Cumulative sum of a vector
with() | base R | Evaluate expressions using data frame columns directly
across() | dplyr | Apply function(s) to multiple columns (within mutate() or summarize())
Grouped operations
aggregate(), tapply(), rowsum(), by() | base R | Perform grouped calculations
ave() | base R | Grouped transformations (e.g., group-wise mean)
group_by(), ungroup() | dplyr | Group rows by variable levels
summarize() | dplyr | Compute summary statistics (often after group_by())
Combining data
rbind(), cbind() | base R | Bind rows or columns
merge() | base R | Join data frames by key
bind_rows(), bind_cols() | dplyr | Bind data frames by row or column
left_join(), inner_join(), right_join() | dplyr | Join data frames by key
Reshaping data
reshape() | base R | Reshape data (complex interface)
expand.grid() | base R | Create all combinations (cross-join)
pivot_wider(), pivot_longer() | tidyr | Reshape data between wide and long formats
Descriptive statistics
mean(), median(), sd(), quantile(), min(), max() | base R | Univariate statistics
rowSums(), rowMeans(), colSums(), colMeans() | base R | Row- or column-wise statistics of data frames or matrices
summary() | base R | Summarize object (e.g., per column of data frame)
table(), prop.table(), addmargins() | base R | Absolute and relative frequency tables
cor(), cov() | base R | Bivariate statistics (correlation, covariance)
Exploration and structure
head(), tail() | base R | Show first or last elements of an object
nrow(), ncol(), dim() | base R | Number of rows, columns, or both
str() | base R | Display structure of an object
Apply functions and utilities
lapply() | base R | Apply a function over list/data frame elements
match(), %in% | base R | Matching and membership testing
match.arg() | base R | Match a value against a set of allowed values (useful in functions)
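
As a rough illustration of how some of the base R functions listed above fit together, the sketch below runs them on a small made-up data frame. The `patients` and `wards` objects, their column names, and all values are assumptions invented for this example; they are not course data.

```r
# Hypothetical toy data; names and values are made up for illustration
patients <- data.frame(
  id    = 1:6,
  ward  = c("A", "B", "A", "C", "B", "A"),
  age   = c(34, 58, 41, 67, 29, 50),
  score = c(12, 15, 9, 20, 11, 14)
)

# Selecting and filtering: keep rows with age > 40 and three columns
older <- subset(patients, age > 40, select = c(id, ward, age))

# Sorting: reorder rows by decreasing age
sorted <- patients[order(patients$age, decreasing = TRUE), ]

# Creating variables: add the cumulative sum of score as a new column
patients <- transform(patients, score_cum = cumsum(score))

# Grouped operations: mean score per ward
ward_means <- aggregate(score ~ ward, data = patients, FUN = mean)

# Combining data: join ward-level information by the key column "ward"
wards  <- data.frame(ward = c("A", "B", "C"), floor = c(1, 2, 2))
joined <- merge(patients, wards, by = "ward")

# Descriptive statistics: relative frequencies of each ward
ward_freq <- prop.table(table(patients$ward))

# Exploration: inspect the structure of the joined result
str(joined)
```

With dplyr loaded, the same steps would typically be written with filter() and select(), arrange(), mutate(), group_by() followed by summarize(), and left_join(), as listed in the table.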

2 Useful functions for plotting

Function | Package | Description
plot() | base R | General-purpose plotting (scatter, lines, etc.)
boxplot() | base R | Box plots by groups
pie() | base R | Pie charts from frequency tables
barplot() | base R | Bar plots for summary statistics
mosaicplot() | base R | Mosaic plot for contingency tables
hist() | base R | Histogram of numeric data
forestplot() | forestplot | Forest plot with advanced features
ggplot() | ggplot2 | Initialize a ggplot object
aes() | ggplot2 | Define aesthetic mappings (x, y, color, fill)
geom_point() | ggplot2 | Scatter plot layer
geom_line() | ggplot2 | Line plot layer
geom_histogram() | ggplot2 | Histogram layer
geom_bar() | ggplot2 | Bar plot layer (can be stacked, grouped)
geom_boxplot() | ggplot2 | Box plot layer
geom_mosaic() | ggmosaic | Mosaic plot layer
labs() | ggplot2 | Add titles and axis labels
scale_fill_manual() | ggplot2 | Manual fill color scale
theme_minimal() | ggplot2 | Minimalistic plot theme
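
To show how a few of the plotting functions above are called in practice, here is a minimal sketch using the built-in `mtcars` data set that ships with R. The choice of data set, the variables plotted, the titles, and the grey fill colors are assumptions made for this example. The ggplot2 part only runs if that package is installed.

```r
# Built-in example data (datasets package); chosen here only for illustration
cyl_counts <- table(mtcars$cyl)

# Base R: histogram, box plot by group, and bar plot of frequencies
hist(mtcars$mpg, main = "Miles per gallon", xlab = "mpg")
boxplot(mpg ~ cyl, data = mtcars, xlab = "Cylinders", ylab = "mpg")
barplot(cyl_counts, xlab = "Cylinders", ylab = "Count")

# ggplot2: the same box plot built from layers (skipped if ggplot2 is missing)
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(mtcars, aes(x = factor(cyl), y = mpg, fill = factor(cyl))) +
    geom_boxplot() +
    scale_fill_manual(values = c("4" = "grey80", "6" = "grey60", "8" = "grey40")) +
    labs(title = "Fuel efficiency by cylinder count",
         x = "Cylinders", y = "mpg", fill = "Cylinders") +
    theme_minimal()
  print(p)
}
```

The base R calls draw directly to the active graphics device, whereas ggplot() builds an object that is drawn when printed, which is why the sketch stores the plot in `p` and calls print() explicitly.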

References

Burns, Patrick. n.d. Impatient R. Accessed May 8, 2025. https://www.burns-stat.com/documents/tutorials/impatient-r/.
Burns, Patrick. n.d. The R Inferno. Accessed May 8, 2025. https://www.burns-stat.com/documents/books/the-r-inferno/.
“ChatGPT.” n.d. Accessed January 26, 2025. https://chatgpt.com.
“Create Elegant Data Visualisations Using the Grammar of Graphics.” n.d. Accessed January 26, 2025. https://ggplot2.tidyverse.org/.
David, Author. 2016. “BIRT Joins.” MBSE Chaos. https://mbsechaos.wordpress.com/2016/05/24/birt-joins/.
Endres, Christopher J. 2025. “Introducing nhanesA.” https://cran.r-project.org/web/packages/nhanesA/vignettes/Introducing_nhanesA.html.
Henzi, Alexander. 2021. “Programming and Data Analysis with R.” Lecture notes.
Kosourova, Elena. n.d. “RStudio Tutorial for Beginners: A Complete Guide.” Accessed January 26, 2025. https://www.datacamp.com/tutorial/r-studio-tutorial.
Mayer, Michael. 2025. “Mayer79/Statistical_computing_material.” https://github.com/mayer79/statistical_computing_material.
“Synthetic Dataset for AI in Healthcare.” n.d. Accessed May 9, 2025. https://www.kaggle.com/datasets/smmmmmmmmmmmm/synthetic-dataset-for-ai-in-healthcare.
“The Comprehensive R Archive Network.” n.d. Accessed January 26, 2025. https://stat.ethz.ch/CRAN/.
Venables, W. N., D. M. Smith, and the R Core Team. n.d. “An Introduction to R.” Accessed May 8, 2025. https://cran.r-project.org/doc/manuals/r-release/R-intro.html.
Wickham, Hadley. n.d. Advanced R. Accessed May 8, 2025. https://adv-r.hadley.nz/introduction.html.
Wickham, Hadley, and Garrett Grolemund. n.d. R for Data Science. Accessed May 8, 2025. https://r4ds.had.co.nz/introduction.html.