Selecting and filtering

| Function | Package | Description |
| --- | --- | --- |
| subset() | base R | Select rows and columns of a data frame |
| select(), rename() | dplyr | Select or rename columns |
| filter() | dplyr | Select rows by condition |
| slice() | dplyr | Select rows by position |
| distinct() | dplyr | Return unique/distinct rows |
| duplicated() | base R | Logical vector flagging rows that repeat an earlier row |

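As a minimal sketch of the selecting and filtering functions listed above, the following uses a small made-up data frame (`df` and its columns are hypothetical, not from the source). The dplyr lines run only if dplyr happens to be installed.

```r
# Hypothetical example data (an assumption for illustration)
df <- data.frame(
  id    = c(1, 2, 2, 3, 4),
  group = c("a", "b", "b", "a", "c"),
  score = c(10, 25, 25, 40, 15)
)

# subset(): keep rows with score > 12 and only the id and score columns
high <- subset(df, score > 12, select = c(id, score))

# duplicated(): TRUE for a row that repeats an earlier row
dup <- duplicated(df)   # FALSE FALSE TRUE FALSE FALSE

# Dropping the flagged rows leaves the unique rows
uniq <- df[!dup, ]

# Rough dplyr equivalents, evaluated only when dplyr is available
if (requireNamespace("dplyr", quietly = TRUE)) {
  high_dplyr <- dplyr::select(dplyr::filter(df, score > 12), id, score)
  uniq_dplyr <- dplyr::distinct(df)
  first_two  <- dplyr::slice(df, 1:2)   # rows by position
}
```

Note that `duplicated()` marks only the second and later copies, so `df[!duplicated(df), ]` keeps the first occurrence of each row.
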
Sorting and ordering

| Function | Package | Description |
| --- | --- | --- |
| order(), rank() | base R | Return the index permutation that sorts a vector, or compute ranks |
| arrange() | dplyr | Sort by one or more columns |

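The difference between `order()` and `rank()` is easy to mix up, so here is a small sketch with made-up values (`scores` and `df` are hypothetical). `order()` gives the positions to index with; `rank()` gives each element's rank, averaging ties by default.

```r
scores <- c(30, 10, 20, 10)

idx    <- order(scores)   # 2 4 3 1: positions that would sort the vector
sorted <- scores[idx]     # 10 10 20 30
rk     <- rank(scores)    # 4.0 1.5 3.0 1.5: ties share the average rank

# Sorting a data frame: group ascending, then score descending
df <- data.frame(group = c("b", "a", "b", "a"),
                 score = c(30, 10, 20, 40))
sorted_df <- df[order(df$group, -df$score), ]

# Rough dplyr equivalent, evaluated only when dplyr is available
if (requireNamespace("dplyr", quietly = TRUE)) {
  sorted_dplyr <- dplyr::arrange(df, group, dplyr::desc(score))
}
```
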
Creating or transforming variables

| Function | Package | Description |
| --- | --- | --- |
| transform() | base R | Add or overwrite columns in a data frame |
| mutate(), transmute() | dplyr | Create or overwrite columns; transmute() returns only new ones |
| cumsum() | base R | Cumulative sum of a vector |
| with() | base R | Evaluate expressions using data frame columns directly |
| across() | dplyr | Apply function(s) to multiple columns (within mutate() or summarize()) |

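A brief sketch of the column-creation functions above, using a hypothetical two-column data frame. The dplyr part assumes a dplyr version that provides `across()` and is skipped if dplyr is not installed.

```r
# Hypothetical example data (an assumption for illustration)
df <- data.frame(x = c(1, 2, 3), y = c(10, 20, 30))

# transform(): add `total` and overwrite `y`; every expression is
# evaluated against the original columns, so total uses the old y
df2 <- transform(df, total = x + y, y = y / 10)

# cumsum(): running total of a vector
running <- cumsum(df$y)              # 10 30 60

# with(): refer to columns without the df$ prefix
weighted <- with(df, sum(x * y))     # 1*10 + 2*20 + 3*30 = 140

# Rough dplyr equivalents, evaluated only when dplyr is available
if (requireNamespace("dplyr", quietly = TRUE)) {
  added   <- dplyr::mutate(df, total = x + y)      # keeps x and y
  only    <- dplyr::transmute(df, total = x + y)   # keeps only total
  doubled <- dplyr::mutate(df, dplyr::across(c(x, y), function(v) v * 2))
}
```
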
Grouped operations

| Function | Package | Description |
| --- | --- | --- |
| aggregate(), tapply(), rowsum(), by() | base R | Perform grouped calculations |
| ave() | base R | Grouped transformations (e.g., group-wise mean) |
| group_by(), ungroup() | dplyr | Group rows by variable levels |
| summarize() | dplyr | Compute summary statistics (often after group_by()) |

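The grouped functions above differ mainly in the shape of what they return. A minimal sketch on made-up data (`df`, `g`, and `v` are hypothetical names): `aggregate()` and `tapply()` return one value per group, while `ave()` returns one value per row.

```r
# Hypothetical example data (an assumption for illustration)
df <- data.frame(g = c("a", "a", "b", "b", "b"),
                 v = c(1, 3, 2, 4, 6))

# aggregate(): data frame with one row per group
agg <- aggregate(v ~ g, data = df, FUN = mean)   # a: 2, b: 4

# tapply(): named array with one entry per group
tap <- tapply(df$v, df$g, sum)                   # a: 4, b: 12

# rowsum(): group sums as a matrix with group labels as row names
rs <- rowsum(df$v, df$g)

# ave(): group mean repeated for each row (FUN defaults to mean)
df$g_mean <- ave(df$v, df$g)                     # 2 2 4 4 4

# Rough dplyr equivalents, evaluated only when dplyr is available
if (requireNamespace("dplyr", quietly = TRUE)) {
  grouped   <- dplyr::group_by(df, g)
  summaries <- dplyr::summarize(grouped, mean_v = mean(v))
  centered  <- dplyr::ungroup(dplyr::mutate(grouped, dev = v - mean(v)))
}
```
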
Combining data

| Function | Package | Description |
| --- | --- | --- |
| rbind(), cbind() | base R | Bind rows or columns |
| merge() | base R | Join data frames by key |
| bind_rows(), bind_cols() | dplyr | Bind data frames by row or column |
| left_join(), inner_join(), right_join() | dplyr | Join data frames by key |

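To illustrate binding versus joining, here is a small sketch with two hypothetical data frames, `left` and `right`, sharing an `id` key. In `merge()`, the `all.x` argument switches from an inner join to a left join.

```r
# Hypothetical example data (an assumption for illustration)
left  <- data.frame(id = c(1, 2, 3), name  = c("x", "y", "z"))
right <- data.frame(id = c(2, 3, 4), value = c(20, 30, 40))

# merge(): inner join by default, keeping only ids present in both
inner <- merge(left, right, by = "id")

# all.x = TRUE keeps every row of `left`; unmatched values become NA
left_j <- merge(left, right, by = "id", all.x = TRUE)

# rbind() stacks rows (column names must match); cbind() adds columns
stacked <- rbind(left, data.frame(id = 4, name = "w"))
side    <- cbind(left, flag = c(TRUE, FALSE, TRUE))

# Rough dplyr equivalents, evaluated only when dplyr is available
if (requireNamespace("dplyr", quietly = TRUE)) {
  inner_dplyr <- dplyr::inner_join(left, right, by = "id")
  left_dplyr  <- dplyr::left_join(left, right, by = "id")
  stack_dplyr <- dplyr::bind_rows(left, data.frame(id = 4, name = "w"))
}
```
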
Reshaping data

| Function | Package | Description |
| --- | --- | --- |
| reshape() | base R | Reshape data (complex interface) |
| expand.grid() | base R | Create all combinations (cross-join) |
| pivot_wider(), pivot_longer() | tidyr | Reshape data between wide and long formats |

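Because the `reshape()` interface is hard to remember, a hedged sketch of one wide-to-long conversion may help. The data frame `wide` and the column names `wave` and `score` are made up for illustration; the tidyr lines run only if tidyr is installed.

```r
# Hypothetical wide data: one row per id, one column per measurement
wide <- data.frame(id = 1:2, score_1 = c(5, 6), score_2 = c(7, 8))

# reshape(): stack score_1 and score_2 into a single `score` column,
# with `wave` recording which original column each value came from
long <- reshape(wide, direction = "long",
                varying = c("score_1", "score_2"),
                v.names = "score",
                timevar = "wave", times = 1:2,
                idvar = "id")

# expand.grid(): every combination of the supplied values
grid <- expand.grid(x = 1:2, y = c("a", "b", "c"))   # 2 * 3 = 6 rows

# Rough tidyr equivalents, evaluated only when tidyr is available
if (requireNamespace("tidyr", quietly = TRUE)) {
  long_tidy <- tidyr::pivot_longer(wide,
                                   cols = c("score_1", "score_2"),
                                   names_to = "wave",
                                   names_prefix = "score_",
                                   values_to = "score")
  wide_again <- tidyr::pivot_wider(long_tidy,
                                   names_from = "wave",
                                   values_from = "score",
                                   names_prefix = "score_")
}
```
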
Descriptive statistics

| Function | Package | Description |
| --- | --- | --- |
| mean(), median(), sd(), quantile(), min(), max() | base R | Univariate statistics |
| rowSums(), rowMeans(), colSums(), colMeans() | base R | Row- or column-wise statistics of data frames or matrices |
| summary() | base R | Summarize object (e.g., per column of data frame) |
| table(), prop.table(), addmargins() | base R | Absolute and relative frequency tables |
| cor(), cov() | base R | Bivariate statistics (correlation, covariance) |

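The descriptive functions above can be tried on a few made-up vectors. In this sketch all names (`x`, `y`, `m`, `tab`) and values are hypothetical; `y` is built as an exact linear function of `x` so that the correlation is easy to check by hand.

```r
x <- c(2, 4, 4, 6, 9)
y <- 2 * x + 1                 # exact linear function of x

avg <- mean(x)                 # 5
med <- median(x)               # 4
s   <- sd(x)                   # sqrt(7): sample sd, n - 1 denominator
q   <- quantile(x, c(0.25, 0.5, 0.75))

# Row- and column-wise statistics of a 2 x 3 matrix
m  <- matrix(1:6, nrow = 2)
rs <- rowSums(m)               # 9 12
cm <- colMeans(m)              # 1.5 3.5 5.5

# Frequency tables: counts, proportions, and counts with a total
tab   <- table(c("a", "b", "a", "c", "a"))
props <- prop.table(tab)       # a: 0.6, b: 0.2, c: 0.2
tot   <- addmargins(tab)       # adds a "Sum" entry

r  <- cor(x, y)                # 1 for an exact increasing linear relation
cv <- cov(x, y)                # 2 * var(x) = 14
```
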
Exploration and structure

| Function | Package | Description |
| --- | --- | --- |
| head(), tail() | base R | Show first or last elements of an object |
| nrow(), ncol(), dim() | base R | Number of rows, columns, or both |
| str() | base R | Display structure of an object |

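A quick sketch of these inspection functions on a hypothetical ten-row data frame (the name `df` and its columns are assumptions for illustration):

```r
# Hypothetical example data
df <- data.frame(id = 1:10, value = seq(0.5, 5, by = 0.5))

first3 <- head(df, 3)    # first three rows
last2  <- tail(df, 2)    # last two rows (ids 9 and 10)

n_rows <- nrow(df)       # 10
n_cols <- ncol(df)       # 2
shape  <- dim(df)        # 10 2

# str() prints a compact overview: dimensions, column types, first values
str(df)
```
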
Apply functions and utilities

| Function | Package | Description |
| --- | --- | --- |
| lapply() | base R | Apply a function over list/data frame elements |
| match(), %in% | base R | Matching and membership testing |
| match.arg() | base R | Match a value against a set of allowed values (useful in functions) |
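
The utilities above are easiest to see in use. This sketch relies on made-up inputs, and `center()` is a hypothetical helper written only to show the usual `match.arg()` idiom: the allowed values are the argument's default vector, and the first one is the default.

```r
# lapply(): apply a function to each column of a data frame
df <- data.frame(a = c(1, 2, 3), b = c(4, 5, 6))
col_means <- lapply(df, mean)                 # list(a = 2, b = 5)

# match(): position of each value in a lookup table, NA if absent
pos <- match(c("b", "z"), c("a", "b", "c"))   # 2 NA

# %in%: membership test, always TRUE or FALSE
is_in <- c("b", "z") %in% c("a", "b", "c")    # TRUE FALSE

# match.arg(): validate a choice argument inside a function
# (center() is a hypothetical helper, not part of any package)
center <- function(x, method = c("mean", "median")) {
  method <- match.arg(method)   # errors on values outside the allowed set
  switch(method,
         mean   = x - mean(x),
         median = x - median(x))
}

by_mean   <- center(c(1, 2, 6))             # -2 -1  3
by_median <- center(c(1, 2, 6), "median")   # -1  0  4
```

`match.arg()` also accepts unambiguous abbreviations, so `center(x, "med")` selects the median variant.
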