Author | |
---|---|
Name | Claire Descombes |
Affiliation | Universitätsklinik für Neurochirurgie, Inselspital Bern |
Degree | MSc Statistics and Data Science, University of Bern |
Contact | claire.descombes@insel.ch |
The reference material for this course, as well as some useful literature to deepen your knowledge of R, can be found at the bottom of the page.
To explore R’s statistical tools, we’ll continue working with the NHANES datasets. Specifically, we’ll use the merged data frame that combines the `demo`, `bpx`, `bmx` and `smq` data sets. If you still have it loaded from Chapters 2 and 3, you can use it directly. Otherwise, you can download it from the `data_sets` folder and import it into R.
To be able to use survival data, we will also need the NHANES data set complemented with mortality data. If you still have it loaded from Chapter 3, you can use it directly. Otherwise, you can download it from the `data_sets` folder and import it into R.
# Load the merged_nhanes CSV file
merged_nhanes <- read.csv("/home/claire/Documents/GitHub/rforphysicians/data_sets/merged_nhanes.csv")
# Load the merged_nhanes_with_mort CSV file
merged_nhanes_with_mort <- read.csv("/home/claire/Documents/GitHub/rforphysicians/data_sets/merged_nhanes_with_mort.csv")
This chapter will follow the structure outlined below. We will learn how to perform a selection of statistical tests and models in R, categorising them by their objective (e.g. comparing two means or analysing survival).
Section | Topic | Subtopics |
---|---|---|
4.1 Tests for comparing two groups | Tests comparing means or proportions between two groups | • Student’s t-test • Wilcoxon-Mann-Whitney test (Mann-Whitney U test) • Fisher’s exact test • McNemar test |
4.2 Tests for more than two groups | Tests for comparing multiple groups | • Kruskal-Wallis test • Friedman test • Pearson’s chi-square test |
4.3 Tests for distribution and normality | Tests for the distribution of data | • Lilliefors / Kolmogorov-Smirnov-Lilliefors test |
4.4 Tests for survival analysis | Tests for time-to-event data | • Log-rank test |
4.5 Correlation and association tests | Tests for relationships between variables | • Pearson correlation test • Spearman correlation test |
4.6 Predictive modelling and regression | Predictive models and regression techniques | • Generalized Linear Models (GLMs): linear regression, logistic regression, ordinal regression, Cox (proportional hazards) regression • Multivariable regression • Mixed Effects Models • Generalized Additive Models (GAMs) • Generalized Additive Mixed Models (GAMMs) |
Before we begin, here is an overview of the type of data required by each test/model. More tests and models are presented than will be discussed in the course, but I have included them for the sake of completeness. If you feel lost on that topic at any point in the chapter, just scroll up again to have a look at this table.
Data type | 1 sample | 2 paired samples | 2 unpaired samples | >2 paired samples | >2 unpaired samples | Continuous predictor |
---|---|---|---|---|---|---|
Binary | • Binomial test | • McNemar test | • Chi-square test • Fisher’s exact test | • Cochran’s Q test | • Chi-square test • Extensions of Fisher’s test | • Logistic regression |
Nominal | • Chi-square goodness-of-fit test | | • Chi-square test • Extensions of Fisher’s test | | • Chi-square test | • Multinomial regression |
Ordinal | • Wilcoxon signed-rank test • Sign test | • Sign test • Wilcoxon signed-rank test on differences | • Mann-Whitney U test (Wilcoxon rank-sum test) | • Friedman test | • Kruskal-Wallis test | • Ordinal regression |
Continuous | • One-sample t-test | • Paired t-test • Wilcoxon signed-rank test on differences | • Two-sample t-test | • Repeated measures ANOVA | • ANOVA | • Linear regression |
Time-to-event | • One-sample log-rank test | | • Log-rank test | | | • Cox regression • Weibull regression |
Before we start looking at many different statistical hypothesis tests, let us recall the basic structure they all share by exploring in more depth the Student’s t-test for two independent continuous samples.
Generally | Student’s t-test for two independent samples |
---|---|
A test aims to assess a claim we make about some parameter (a correlation, a difference in means, etc.) of the distribution of the population of interest. | Working with two patient cohorts, the smokers and the non-smokers, one could ask whether they share the same mean of a continuous variable, e.g. the BMI. |
The data itself is expected to fulfil some assumptions. | As the BMI data is continuous, the cohorts are large (> 100 patients each, of sizes n_1 and n_2) and assumed to be independent, and there is no reason to assume they would differ in the variance of the BMI, an unpaired Student’s t-test appears to be the right choice. |
The ‘status quo’/no-difference formulation of the question we are asking is called the null hypothesis. | The null hypothesis, in our example, is that our two cohorts (the smokers and the non-smokers) share the same mean BMI. |
The hypothesis we are usually interested in proving, the one showing a difference between two quantities, is called the alternative hypothesis. | The alternative hypothesis, in our example, is that our two cohorts (the smokers and the non-smokers) do not share the same mean BMI. |
Assuming the null hypothesis is true, a specific quantity, called the test statistic, is expected to follow a given distribution. | If the null hypothesis is true, then the t statistic t = \frac{\bar{X}_1-\bar{X}_2}{s_p \cdot \sqrt{\frac{1}{n_1}+\frac{1}{n_2}}} (where s_p is the pooled standard deviation) follows a Student’s t-distribution with n_1+n_2-2 degrees of freedom. |
For any quantity whose distribution is (assumed to be) known, it is possible to compute its probability of falling within a certain range; for a given significance level \alpha between 0 and 1, one can determine an interval in which the statistic falls with probability 1-\alpha. | Letting \alpha be 0.05, as is common in medicine, we can determine an interval in which our t statistic lies with a probability of 95% (it can be computed in R or using a table). |
If the computed statistic falls inside this 1-\alpha interval (p > \alpha), then we cannot reject the null hypothesis, and have to look for further evidence to prove our claim. | If the t statistic falls into the range we computed, then we cannot say that we saw a significant difference in the means. |
If the computed statistic falls outside this 1-\alpha interval (p \leq \alpha), then we can reject the null hypothesis, and say that the claim we made about the data is statistically significant at the 1-\alpha confidence level. | If the t statistic does not fall into the range we computed, then we can say that we observe a statistically significant difference in the means, with a confidence of 95%. |
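The interval mentioned above is easy to compute in R. Here is a minimal sketch using `qt`, the quantile function of the t-distribution, with the sample sizes from the BMI example later in this chapter:

```r
# Two-sided critical values of the t-distribution at alpha = 0.05
alpha <- 0.05
n1 <- 3239; n2 <- 3212   # sample sizes from the BMI example below
qt(c(alpha / 2, 1 - alpha / 2), df = n1 + n2 - 2)
# If the observed t statistic falls outside this interval,
# the null hypothesis is rejected at the 5% level.
```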
A few important comments:
1. Null hypothesis: the two groups share the same mean.
2. Type of data: continuous.
3. Requirements: independent, (approximately) normally distributed samples (or large sample sizes), and equal variances for the classical (pooled) t-test.
💡 Independent samples are randomly selected so that their observations are not dependent on the values of other observations.
4. R function
`t.test(x, ...)`: performs one- and two-sample t-tests on vectors of data.
5. Important arguments
t.test(x, y = NULL,
alternative = c("two.sided", "less", "greater"),
mu = 0, paired = FALSE, var.equal = FALSE,
conf.level = 0.95, ...)
- `x, y`: the vector(s) that contain the data (only one vector in the case of a one-sample t-test, two vectors otherwise).
- `alternative`: a character string specifying the alternative hypothesis; must be one of “two.sided” (default), “greater” or “less”.
- `mu`: a number indicating the true value of the mean (or of the difference in means if you are performing a two-sample test).
- `paired`: a logical indicating whether you want a paired t-test.
- `var.equal`: a logical indicating whether to treat the two variances as being equal. If TRUE, the pooled variance is used to estimate the variance; otherwise the Welch (or Satterthwaite) approximation to the degrees of freedom is used.
- `conf.level`: confidence level of the interval (default: 0.95).

💡 “Paired” means there is an obvious and meaningful one-to-one correspondence between the data in the first set and the data in the second set, e.g. blood pressure in a set of patients before and after administering a medicine.
6. Example

Let us test if the mean BMI (`BMXBMI`) is significantly different between men and women (`RIAGENDR`).
# We quickly check the distribution of the BMI using a Q-Q-plot (even though,
# with such a large sample, this isn't required)
qqnorm(y = merged_nhanes$BMXBMI)
# We create two vectors, with the BMIs of women and men
bmi_women <- merged_nhanes$BMXBMI[merged_nhanes$RIAGENDR == "Female"]
bmi_men <- merged_nhanes$BMXBMI[merged_nhanes$RIAGENDR == "Male"]
# We start by checking how many non-NA entries we have, then compute the means
sum(!is.na(bmi_women));sum(!is.na(bmi_men))
## [1] 3239
## [1] 3212
mean(bmi_women, na.rm = TRUE);mean(bmi_men, na.rm = TRUE)
## [1] 28.35542
## [1] 27.34091
qqplot(x = bmi_women, y = bmi_men)
# We compare their means using a t-test
t.test(x = bmi_women, y = bmi_men, alternative = 'two.sided', var.equal = TRUE, paired = FALSE)
##
## Two Sample t-test
##
## data: bmi_women and bmi_men
## t = 5.8288, df = 6449, p-value = 5.854e-09
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 0.6733092 1.3557093
## sample estimates:
## mean of x mean of y
## 28.35542 27.34091
- The `t` value is the statistic of the t-test.
- `df` are the degrees of freedom (a characteristic of the underlying distribution of the statistic); they depend on the test used (one- or two-sample, paired or unpaired). In this case it is the sum of the two sample sizes minus 2 (3239 + 3212 - 2 = 6449).
- The `p-value` is the probability of getting such a `t` value under the null hypothesis. It is very small (5.854e-09), so at an \alpha-level of 0.05 we say that there is a statistically significant difference between the means of the two groups.

💡 `qqnorm` is a function that produces a normal Q-Q plot of the values in `y`. `qqplot` produces a Q-Q plot of two datasets.
The z-test is similar to the t-test (continuous, independent and normally distributed variable(s)), and its null hypothesis is also that the means of two populations are equal, but it requires the standard deviation of the population(s) to be known.

You should favour a z-test over a t-test if the standard deviation of the population is known and the sample size is at least 30.

There is no z-test function in base R, but one is available in packages such as BSDA (`z.test(x, ...)`).
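For illustration only, here is a minimal sketch of a two-sample z-test, assuming the BSDA package is installed. The population standard deviations `sigma.x` and `sigma.y` must be supplied; the values used here are purely hypothetical.

```r
library(BSDA)

# Hypothetical known population standard deviations -- in practice
# these must come from outside the sample.
z.test(x = na.omit(bmi_women), y = na.omit(bmi_men),
       sigma.x = 7, sigma.y = 6,
       alternative = "two.sided")
```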
Non-parametric (= makes minimal assumptions about the underlying distribution of the data being studied) “alternative” to the t-test. Note that we do not test exactly the same thing: the t-test compares means, the Wilcoxon test compares distributions.
💡 If working on paired data, the test is called Wilcoxon signed-rank test, and on unpaired data it is called Wilcoxon rank-sum test (also called Mann–Whitney U test, Mann–Whitney–Wilcoxon test, or Wilcoxon–Mann–Whitney test).
1. Null hypothesis: the two samples come from the same distribution (no location shift between the groups).
2. Type of data: ordinal or continuous.
3. Requirements: independent observations (for the rank-sum test); no assumption of normality.
4. R function
`wilcox.test(x, ...)`: performs one- and two-sample Wilcoxon tests on vectors of data; the latter is also known as the ‘Mann-Whitney’ test.
5. Important arguments
wilcox.test(x, y = NULL,
alternative = c("two.sided", "less", "greater"),
mu = 0, paired = FALSE, exact = NULL, correct = TRUE,
conf.int = FALSE, conf.level = 0.95,
tol.root = 1e-4, digits.rank = Inf, ...)
- `x, y`: the vector(s) that contain the data (only one vector in the case of a one-sample Wilcoxon test, two vectors otherwise).
- `alternative`: a character string specifying the alternative hypothesis; must be one of “two.sided” (default), “greater” or “less”.
- `mu`: a number specifying an optional parameter used to form the null hypothesis. Set `mu = 0` (default) to test if the samples have the same distribution.
- `paired`: a logical indicating whether you want a paired test. If only `x` is given, or if both `x` and `y` are given and `paired` is TRUE, a Wilcoxon signed-rank test of the null hypothesis that the distribution of `x` (in the one-sample case) or of `x - y` (in the paired two-sample case) is symmetric about `mu` is performed. Otherwise, if both `x` and `y` are given and `paired` is FALSE, a Wilcoxon rank-sum test (equivalent to the Mann-Whitney test) is carried out. In this case, the null hypothesis is that the distributions of `x` and `y` differ by a location shift of `mu`, and the alternative is that they differ by some other location shift.
- `conf.level`: confidence level of the interval (default: 0.95).

6. Example

Let us test if the distribution of the BMI (`BMXBMI`) is significantly different between men and women (`RIAGENDR`).
# We create two vectors, with the BMIs of women and men
bmi_women <- merged_nhanes$BMXBMI[merged_nhanes$RIAGENDR == "Female"]
bmi_men <- merged_nhanes$BMXBMI[merged_nhanes$RIAGENDR == "Male"]
# We compare their distributions using a Wilcoxon test
wilcox.test(x = bmi_women, y = bmi_men, alternative = 'two.sided', mu = 0, paired = FALSE)
##
## Wilcoxon rank sum test with continuity correction
##
## data: bmi_women and bmi_men
## W = 5464414, p-value = 0.0004466
## alternative hypothesis: true location shift is not equal to 0
- The `W` value is the statistic of the Wilcoxon test (also called the `U` statistic).
- The `p-value` is the probability of getting such a `W` value under the null hypothesis. It is very small (0.0004466), so at an \alpha-level of 0.05 we say that there is a statistically significant difference between the distributions of the two groups.

Fisher’s exact test focuses on categorical variables and small sample sizes.
It tests whether two categorical variables are associated. Unlike the chi-square test, its p-value is exact rather than based on a large-sample approximation, which makes it particularly suitable for small samples and sparse contingency tables.
1. Null hypothesis: the two categorical variables are independent (no association between the rows and columns of the contingency table).
2. Type of data: categorical (classically binary, in a 2\times2 contingency table).
3. Requirements: independent observations; particularly suitable for small samples, where the chi-square approximation is unreliable.
4. R function
`fisher.test(x, ...)`: tests the null hypothesis of independence of rows and columns in a contingency table with fixed marginals.
5. Important arguments
fisher.test(x, y = NULL, workspace = 200000, hybrid = FALSE,
hybridPars = c(expect = 5, percent = 80, Emin = 1),
control = list(), or = 1, alternative = "two.sided",
conf.int = TRUE, conf.level = 0.95,
simulate.p.value = FALSE, B = 2000)
- `x`: either a two-dimensional contingency table in matrix form, or a factor object.
- `y`: a factor object; ignored if `x` is a matrix.
- `alternative`: a character string specifying the alternative hypothesis; must be one of “two.sided” (default), “greater” or “less”. Only used in the 2\times2 case.
- `conf.int`: logical indicating if a confidence interval for the odds ratio in a 2\times2 table should be computed (and returned).
- `conf.level`: confidence level for the returned confidence interval. Only used in the 2\times2 case and if `conf.int = TRUE`.

6. Example
Let us test whether smoking status (`SMQ020`, having smoked at least 100 cigarettes in one’s life) is associated with gender (`RIAGENDR`).
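Here is a sketch of how this can be done, assuming `SMQ020` is coded as “Yes”/“No” in the merged data set (the coding is an assumption; adapt the filter to your data):

```r
# Keep only the Yes/No answers so that we get a clean 2x2 table
smokers <- merged_nhanes[merged_nhanes$SMQ020 %in% c("Yes", "No"), ]

# factor() drops unused levels before building the contingency table
tab <- table(factor(smokers$RIAGENDR), factor(smokers$SMQ020))
tab

# Fisher's exact test of independence between gender and smoking status
fisher.test(tab)
```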
The McNemar test applies to paired categorical data, e.g. the same patients classified before and after a treatment.
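A minimal sketch with invented paired data (the vectors below are purely hypothetical): the same eight patients are classified before and after an intervention, and `mcnemar.test` checks whether the proportion of “Yes” answers changed.

```r
# Hypothetical paired classifications of the same 8 patients
before <- factor(c("Yes", "Yes", "No", "No", "Yes", "No", "No", "Yes"))
after  <- factor(c("No",  "Yes", "No", "Yes", "No",  "No", "No", "Yes"))

# McNemar test on the 2x2 table of paired outcomes
mcnemar.test(table(before, after))
```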
The Kruskal-Wallis test generalizes the Wilcoxon-Mann-Whitney test to more than two groups.
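As a sketch, one could compare the distribution of the BMI across the education levels of the NHANES demographics data; the column name `DMDEDUC2` is an assumption about the merged data set.

```r
# Kruskal-Wallis test: does the BMI distribution differ across
# education levels? (DMDEDUC2 assumed present in merged_nhanes)
kruskal.test(BMXBMI ~ DMDEDUC2, data = merged_nhanes)
```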
The Friedman test generalizes paired tests (e.g. the Wilcoxon signed-rank test) to more than two related groups.
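A minimal sketch with invented data: pain scores of five patients, each measured under three treatments (an unreplicated complete block design, as `friedman.test` requires).

```r
# Hypothetical repeated measurements: 5 patients x 3 treatments
pain <- data.frame(
  score     = c(3, 5, 4,  2, 4, 3,  4, 6, 5,  3, 3, 2,  5, 7, 6),
  treatment = factor(rep(c("A", "B", "C"), times = 5)),
  patient   = factor(rep(1:5, each = 3))
)

# Friedman test: treatments as groups, patients as blocks
friedman.test(score ~ treatment | patient, data = pain)
```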
Pearson’s chi-square test complements Fisher’s exact test; it relies on a large-sample approximation and is therefore better suited for larger samples.
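As a sketch, it can be run on the same kind of contingency table as Fisher’s exact test (again assuming `SMQ020` is present in the merged data):

```r
# Pearson's chi-square test of independence between gender and smoking
chisq.test(table(merged_nhanes$RIAGENDR, merged_nhanes$SMQ020))
```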
The Lilliefors (Kolmogorov-Smirnov-Lilliefors) test checks for deviations from normality and helps decide when to use parametric vs. non-parametric tests.
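A minimal sketch, assuming the nortest package (which provides the Lilliefors test) is installed; `shapiro.test` in base R is an alternative for samples of at most 5000 observations.

```r
library(nortest)

# Lilliefors (Kolmogorov-Smirnov) test for normality of the BMI values
lillie.test(na.omit(merged_nhanes$BMXBMI))
```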
The log-rank test analyses time-to-event data by comparing the survival curves (e.g. Kaplan-Meier curves) of two or more groups.
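A sketch with the survival package: the column names `permth_int` (follow-up time in months) and `mortstat` (0 = alive, 1 = deceased) are assumptions about the mortality-augmented data set; adapt them to your own column names.

```r
library(survival)

# Kaplan-Meier curves by gender (column names are assumptions)
km <- survfit(Surv(permth_int, mortstat) ~ RIAGENDR,
              data = merged_nhanes_with_mort)
plot(km, col = c("red", "blue"), xlab = "Months", ylab = "Survival")

# Log-rank test: do the survival curves differ between men and women?
survdiff(Surv(permth_int, mortstat) ~ RIAGENDR,
         data = merged_nhanes_with_mort)
```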
Pearson’s correlation test assesses the linear relationship between two continuous, normally distributed variables.
Spearman’s correlation test is its non-parametric alternative, suitable for monotonic relationships.
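A sketch of both tests, correlating BMI with systolic blood pressure; the column name `BPXSY1` is an assumption about the merged data.

```r
# Pearson: linear relationship between BMI and systolic blood pressure
cor.test(merged_nhanes$BMXBMI, merged_nhanes$BPXSY1, method = "pearson")

# Spearman: monotonic relationship, no normality assumption
cor.test(merged_nhanes$BMXBMI, merged_nhanes$BPXSY1, method = "spearman")
```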
Purpose: GLMs are an extension of linear models that allow for non-normal distributions of the response variable (e.g., binary, count, or categorical outcomes). They offer more flexibility than traditional linear regression by using different link functions and error distributions.
Key Features:
Linear relationship: GLMs assume a linear relationship between the predictors and the transformed response variable.
Link function: links the linear predictor to the mean of the outcome distribution. Common link functions include the identity (linear regression), the logit (logistic regression) and the log (Poisson regression).
Error distributions: GLMs can be applied with various error distributions:
* Normal for continuous data (linear regression)
* Binomial for binary data (logistic regression)
* Poisson for count data
Assumptions
Common Applications
Purpose: used to model the relationship between a continuous dependent variable and one or more independent variables.
Assumptions: linearity, normality of residuals, homoscedasticity, independence.
Example Application: predicting the price of a house based on square footage, number of rooms, etc.
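A minimal sketch on the NHANES data, assuming the age column is `RIDAGEYR`:

```r
# Simple linear regression: BMI as a function of age
fit_lm <- lm(BMXBMI ~ RIDAGEYR, data = merged_nhanes)
summary(fit_lm)
```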
Purpose: used when the dependent variable is binary (e.g. yes/no, success/failure).
Assumptions: linear relationship between the log-odds of the outcome and the predictors.
Example Application: predicting the likelihood of a disease based on age, gender, and other factors.
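A sketch on the NHANES data, using a derived binary outcome (obesity, BMI >= 30) and again assuming the age column is `RIDAGEYR`:

```r
# Derived binary outcome: obese yes/no
merged_nhanes$obese <- as.integer(merged_nhanes$BMXBMI >= 30)

# Logistic regression: odds of obesity as a function of age and gender
fit_logit <- glm(obese ~ RIDAGEYR + RIAGENDR,
                 data = merged_nhanes, family = binomial)
summary(fit_logit)

# Exponentiated coefficients are odds ratios
exp(coef(fit_logit))
```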
Purpose: used for survival analysis, particularly when studying the time to an event (e.g. time to death or relapse).
Assumptions: proportional hazards, meaning the effect of a predictor on the hazard rate is constant over time.
Example Application: analysing the impact of age, treatment type, and other covariates on patient survival times.
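A sketch with the survival package on the mortality-augmented data; as above, `permth_int`, `mortstat` and `RIDAGEYR` are assumptions about the column names.

```r
library(survival)

# Cox proportional hazards model: mortality as a function of age and gender
fit_cox <- coxph(Surv(permth_int, mortstat) ~ RIDAGEYR + RIAGENDR,
                 data = merged_nhanes_with_mort)
summary(fit_cox)
```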
Purpose: an extension of linear or logistic regression with more than one predictor variable.
Assumptions: similar to linear and logistic regression, but more complex due to the multiple predictors.
Example Application: predicting a health outcome (e.g. cholesterol levels) based on multiple lifestyle factors (diet, exercise, genetics).
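A sketch extending the linear regression above to several predictors (`RIDAGEYR` and `SMQ020` are assumptions about the columns):

```r
# Multivariable linear regression: several predictors at once
fit_multi <- lm(BMXBMI ~ RIDAGEYR + RIAGENDR + SMQ020,
                data = merged_nhanes)
summary(fit_multi)
```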
Purpose: Mixed effects models allow for the inclusion of both fixed and random effects, providing flexibility for hierarchical or grouped data. They are especially useful when there is variation between groups or subjects.
Key Features
Fixed effects: the main predictors of interest (e.g., treatment, age), which are assumed to have the same effect across all groups.
Random effects: account for variability across groups or clusters (e.g., random intercepts for subjects, or random slopes for measurements over time).
Assumptions
Common Applications
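As a common application, here is a minimal sketch of a linear mixed effects model, assuming the lme4 package is installed and using its built-in `sleepstudy` data (reaction times of subjects measured over several days of sleep deprivation):

```r
library(lme4)

# Fixed effect: Days of sleep deprivation.
# Random effects: intercept and slope per Subject.
fit_mixed <- lmer(Reaction ~ Days + (Days | Subject), data = sleepstudy)
summary(fit_mixed)
```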
Purpose: GAMs extend GLMs by allowing for non-linear relationships between predictors and the outcome. This is useful when the relationship between the independent and dependent variables is not linear.
Key Features
Non-linear terms: Uses smooth functions (e.g., splines) for predictors, allowing for flexibility in modeling.
Additive structure: The model assumes that the total effect is an additive combination of linear and smooth non-linear terms.
Link function: Like GLMs, GAMs can use different link functions depending on the distribution of the outcome variable.
Common Applications: Modelling complex relationships in patient data where the effect of treatment or time may not be linear.
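A minimal sketch with the mgcv package (assumed installed), modelling a smooth, possibly non-linear effect of age on BMI (`RIDAGEYR` assumed present):

```r
library(mgcv)

# GAM: smooth effect of age on BMI
fit_gam <- gam(BMXBMI ~ s(RIDAGEYR), data = merged_nhanes)
summary(fit_gam)

# Visualise the fitted smooth term
plot(fit_gam)
```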
Purpose: GAMMs combine the flexibility of GAMs with random effects, useful for hierarchical or clustered data.
Key Features: Like GAMs, but with the inclusion of random effects to account for variability between groups.
Applications: Ideal for longitudinal studies or hierarchical data where both non-linear relationships and random effects are present.
Assumptions
Example Application: Analysing patient data where outcomes are influenced by both individual patient characteristics and random hospital-specific effects (e.g., variability between hospitals).
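A minimal sketch with mgcv’s `gamm`, again using lme4’s `sleepstudy` data for illustration: a smooth effect of days of sleep deprivation plus a random intercept per subject.

```r
library(mgcv)
library(lme4)   # only needed here for the sleepstudy example data

# GAMM: smooth fixed effect + random intercept per Subject
fit_gamm <- gamm(Reaction ~ s(Days, k = 5),
                 random = list(Subject = ~1),
                 data = sleepstudy)

# gamm returns both an lme and a gam object; summarise the gam part
summary(fit_gamm$gam)
```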