Simulation Studies

Mixed Effects Models

R Packages

library(tidyverse)
library(mvtnorm)
library(brms)
theme_set(theme_bw())

Longitudinal Studies

  • Longitudinal Studies

  • Linear Mixed-Effects Models

  • Generalized Linear Mixed-Effects Models

  • Multilevel Models

Longitudinal Studies

Longitudinal studies are designed to determine whether there are time effects on an outcome of interest.

The data typically recorded on an experimental unit multiple times.

The repeated measurements are considered correlated due to sharing the experimental unit.

Correlated Outcomes

Correlated outcomes causes the standard error of the estimated to be inflated.

Therefore, inferential procedures are unreliable.

Data Structure

\[ Y_{i} = (Y_{i1}, Y_{i2}, \ldots, Y_{in_i}) \]

Modelling

  • Generalized Least Squares
  • Generalized Estimating Equations
  • Mixed Effects Models

Linear Mixed-Effects Models

  • Longitudinal Studies

  • Linear Mixed-Effects Models

  • Generalized Linear Mixed-Effects Models

  • Multilevel Models

Linear Models

\[ Y_{ij} = \beta_0 + \beta_1X_{ij} + \varepsilon_{ij} \]

Random Effects Model

\[ Y_{ij} = b_{0i} + b_{1i}X_{ij} + \varepsilon_i \]

Mixed Effects Model

\[ Y_{ij} = \beta_0 + \beta_1X_{ij} + b_{0i} + b_{1i}X_{ij} + \varepsilon_{ij} \]

  • \(b_{i0}\sim N(0, \sigma^2_{b0})\)
  • \(b_{i1}\sim N(0, \sigma^2_{b1})\)
  • \(\varepsilon_{ij} \sim N(0, \sigma^2_\varepsilon)\)
  • \(b_i \perp \varepsilon_{ij}\)

Simulation Study

  • 500 Units (patients)
  • For each unit, 5 repeated measurements
  • \(Y_{ij} = 4 + 1.9 X_{i1} + 2.3 t_j + b_{i0} + b_{i1} t_j + \varepsilon_{ij}\)
  • \(\varepsilon_{ij}\sim N(0, 1.1^2)\)
  • \(b_{i0}\sim N(0, 1.8^2)\)
  • \(b_{i1}\sim N(0,1.5^2)\)

Simulation

Code
df <- tibble()
beta0 <- 4
beta1 <- 1.9
beta2 <- 2.3
t <- 0:4

n <- 500
for(i in 1:n){
  b0 <- rnorm(1, sd = 1.8)
  b1 <- rnorm(1, sd = 1.5)
  x.vec <- rmvnorm(1, sigma = diag(rep(1, 5))) |> as.vector()
  y <- beta0 + beta1 * t + beta2 * x.vec + b0 + b1 * t  + rnorm(5, sd = 1.1) |> as.vector()
  df <- bind_rows(df,
                  tibble(y = y, x = x.vec, t = t, id = i))

}

Bayesian Analysis in R

Code
brm(y ~ x + t + (t|id),
    data = df,
    family = gaussian(),
    cores = 4)

Generalized Linear Mixed-Effects Models

  • Longitudinal Studies

  • Linear Mixed-Effects Models

  • Generalized Linear Mixed-Effects Models

  • Multilevel Models

GLMM

\[ g\{E(Y_{ij})\} = \beta_0 + \beta_1X_{ij} + b_{0i} + b_{1i}X_{ij} \]

Simulation Study

  • 500 Units (patients)
  • For each unit, 5 repeated measurements
  • \(g\{E(Y_{ij})\} = 2 - 1.56 X_{i1} - 1.3 t_j + b_{i0} + b_{i1} t_j\)
  • \(Y_{ij} \sim Bern(p_{ij})\)
  • \(b_{i0}\sim N(0, 1.8^2)\)
  • \(b_{i1}\sim N(0,1.5^2)\)

Simulation

Code
df <- tibble()
beta0 <- 2
beta1 <- -1.56
beta2 <- -1.3
t <- 0:4

n <- 500
for(i in 1:n){
  b0 <- rnorm(1, sd = 1.8)
  b1 <- rnorm(1, sd = 1.5)
  x.vec <- rmvnorm(1, sigma = diag(rep(1, 5))) |> as.vector()
  mu <- boot::inv.logit(beta0 + beta1 * t + beta2 * x.vec + b0 + b1 * t)
  y <- rbinom(5, 1, mu)
  df <- bind_rows(df,
                  tibble(y = y, x = x.vec, t = t, id = i))

}

Bayesian Analysis in R

Code
brm(y ~ t + x + (t|id),
    data = df,
    family = bernoulli(),
    cores = 4)

Multilevel Models

  • Longitudinal Studies

  • Linear Mixed-Effects Models

  • Generalized Linear Mixed-Effects Models

  • Multilevel Models

Multilevel Models

Multilevel models are utilized to account for correlation among multiple grouping settings.

Longitudinal Biomedical Study

Multilevel Models can be used to model patients in a facility, and repeated measurements in a patient.

The first level is the facility. There are a set number of facilities in a study.

The next level are the patients within a facility. Additionally, each patient has a set of observations.

It is thought that the patients within the facility share a number of characteristics. This will bias the standard errors in a model if not accounted for.

Data Structure

  • \(Y_i=(Y_{i1}, Y_{i2}, \ldots, Y_{in_{i}}) ^\mathrm T\): A vector containing all the outcomes in a facility.
  • \(Y_{ij}=(Y_{ij1}, Y_{ij2}, \ldots, Y_{ijn_{ij}}) ^\mathrm T\) A vector containing all the the outcomes in a patient.
  • \(Y_{ijk}\): An individual outcome.

Model

\[ Y_{ijk} = \beta_0 + \beta_1 X_{ij} + \beta_2 t_{ijk} + b_i + b_{ij} + \varepsilon_{ijk} \]

Simulation Study

  • 500 Facilities
  • For each facility, there are 30 patients
  • For each patient there are 5 repeated measurements
  • \(Y_{ijk} = 2 - 1.56 X_{i1} - 1.3 t_j + b_{ij} + b_{i} + \varepsilon_{ijk}\)
  • \(\varepsilon_{ijk} \sim N(0, 4)\)
  • \(b_{i}\sim N(0, .6^2)\)
  • \(b_{ij}\sim N(0,1.1^2)\)

Simulation

Code
df <- tibble()
beta0 <- 2
beta1 <- -1.56
beta2 <- -1.3
t <- 0:4

n <- 25
nn <- 30
j <- 1

for(i in 1:n){
  xi <- rnorm(1, sd = .6)
  for(ii in 1:nn){
    b0 <- rnorm(1, sd = 1.1)
    x.vec <- rmvnorm(1, sigma = diag(rep(1, 5))) |> as.vector()
    y <- beta0 + beta1 * t + beta2 * x.vec + 
                            b0 + xi + rnorm(5, sd = 2) |> 
      as.vector()
    df <- bind_rows(df,
                    tibble(y = y, x = x.vec, t = t, 
                           id = i, id2 = ii, jid = j))
    j <- j + 1
  }
}

Bayesian Analysis in R

Code
brm(y ~ x + t + (1|jid) + (1|id),
    data = df,
    family = gaussian(),
    iter = 2000,
    cores = 4)

Frog Study

Frog Species

Two of the frog species seems similar to each other.

Could they be evolutionarily related to each other?

We need to account the relatedness using random effects.

Random Effects on Species

\[ \left(\begin{array}{c}S_1\\S_2\\S_3\end{array}\right) \sim N\left\{\left(\begin{array}{c}0\\0\\0\end{array}\right), \left(\begin{array}{ccc}\sigma_{1}^2 & \sigma_{12} & \sigma_{13}\\\sigma_{12} & \sigma_{2}^2 & \sigma_{23}\\\sigma_{13} & \sigma_{23} & \sigma_{3}^2\end{array}\right) \right\} \]

Model

\[ Y_{ijk} = \beta_0 + \beta_1 X_{i1} + \beta_2 t_j + b_{ij} + b_{i} + b_{sp(ij)} + \varepsilon_{ijk} \]

Simulation Study

  • 15 Housing Unit
  • For each Housing Unit, there are 15 frogs
  • For each Housing Unit, there are 5 frogs of each species
  • For each frog, there are 4 repeated measurements
  • \(Y_{ijk} = 2 - 1.56 X_{i1} - 1.3 t_j + b_{ij} + b_{i} + b_{sp(ij)} + \varepsilon_{ijk}\)
  • \(\varepsilon_{ijk} \sim N(0, 4)\)
  • \(b_{i}\sim N(0, .6^2)\)
  • \(b_{ij}\sim N(0,1.1^2)\)
  • \(b_{sp(ij)} \sim N_3(0, \Sigma)\)

Simulation

Code
x1 <- c(1, 0.80, 0.10)
x2 <- c(0.80, 1, 0.43)
x3 <- c(0.10, 0.43, 1)
corr_x <- as.matrix(rbind(x1, x2, x3))
cov_x <- 2.2^2 * corr_x 
spe_id <- rep(1:3, times = 10)
spe_re <- rmvnorm(1, sigma = cov_x)

rownames(corr_x) <- c("A", "B", "C")
colnames(corr_x) <- c("A", "B", "C")

df <- tibble()
beta0 <- 2
beta1 <- -1.56
beta2 <- -1.3
t <- 0:3

n <- 15
nn <- 30
j <- 1

for(i in 1:n){
  xi <- rnorm(1, sd = .6)
  for(ii in 1:nn){
    b0 <- rnorm(1, sd = 1.1)
    x.vec <- rmvnorm(1, sigma = diag(rep(1, 4))) |> as.vector()
    y <- beta0 + beta1 * t + beta2 * x.vec + 
          b0 + xi + spe_re[spe_id[ii]] + rnorm(4, sd = 2) |> 
      as.vector()
    df <- bind_rows(df,
                    tibble(y = y, x = x.vec, t = t, 
                           id = i, id2 = ii, jid = j, 
                           spe = case_when(spe_id[ii] == 1 ~ "A",
                                           spe_id[ii] == 2 ~ "B",
                                           spe_id[ii] == 3 ~ "C")))
    j <- j + 1
  }
}

Bayesian Analysis in R

Code
brm(y ~ x + t + (1|jid) + (1|id) + (1|gr(spe, cov = A)),
    data = df,
    data2 = list(A = corr_x),
    family = gaussian(),
    prior = c(
      prior(normal(0, 10), "b"),
      prior(normal(0, 50), "Intercept"),
      prior(student_t(3, 0, 20), "sd"),
      prior(student_t(3, 0, 20), "sigma")
    ),
    control = list(adapt_delta = 0.99,
                   max_treedepth = 30),
    iter = 4000,
    cores = 4)