marginal-structural.Rmd

# Marginal Structural Model

This is a demonstration of a simple marginal structural model for estimation of so-called 'causal' effects using inverse probability weighting.

Example data is from, and comparison made to, the <span class="pack" style = "">ipw</span> package.  See more [here](https://www.jstatsoft.org/article/view/v043i13/v43i13.pdf).


## Data Setup

This example is from the helpfile at `?ipwpoint`.

```{r msm-setup}
library(tidyverse)
library(ipw)

set.seed(16)

n = 1000
simdat = data.frame(l = rnorm(n, 10, 5))
a_lin  = simdat$l - 10
pa = plogis(a_lin)

simdat = simdat %>% 
  mutate(
    a = rbinom(n, 1, prob = pa),
    y = 10 * a + 0.5 * l + rnorm(n, -10, 5)
  )

ipw_result = ipwpoint(
  exposure = a,
  family   = "binomial",
  link     = "logit",
  numerator   = ~ 1,
  denominator = ~ l,
  data = simdat
)

summary(ipw_result$ipw.weights)
ipwplot(ipw_result$ipw.weights)
```


We create the weights as follows using the probabilities from a logistic regression.

```{r msm-weights}
ps_num = fitted(glm(a ~ 1, data = simdat, family = 'binomial'))
ps_num[simdat$a == 0] = 1 - ps_num[simdat$a == 0]

ps_den = fitted(glm(a ~ l, data = simdat, family = 'binomial'))
ps_den[simdat$a == 0] = 1 - ps_den[simdat$a == 0]

wts = ps_num / ps_den
```

Compare the weights.

```{r msm-wts-compare}
rbind(summary(wts), summary(ipw_result$ipw.weights))
```


Add inverse probability weights to the data if desired.

```{r msm-add-weights}
simdat = simdat %>% 
  mutate(sw = ipw_result$ipw.weights)
```


## Function

Create the likelihood function for using the weights.

```{r msm-func}
msm_ll <- function(
  par,             # parameters to be estimated; first is taken to be sigma
  X,               # model matrix
  y,               # target variable
  wts              # estimated weights
) {
  beta  = par[-1]
  lp    = X %*% beta
  sigma = exp(par[1])  # exponentiated value to stay positive
  
  ll = dnorm(y, mean = lp, sd = sigma, log = TRUE)  
  -sum(ll * wts)   # weighted likelihood
  
  # same as
  # ll = dnorm(y, mean = lp, sd = sigma)^wts
  # -sum(log(ll))
}
```


## Estimation

We want to estimate the marginal structural model for the causal effect of `a` on `y` corrected for confounding by `l`, using inverse probability weighting with robust standard error from the <span class="pack" style = "">survey</span> package.  Create the matrices for estimation, estimate the model, and extract results.

```{r msm-ml}
X = cbind(1, simdat$a)
y = simdat$y

fit = optim(
  par = c(sigma = 0, intercept = 0, b = 0),
  fn  = msm_ll,
  X   = X,
  y   = y,
  wts = wts,
  hessian = TRUE,
  method  = 'BFGS',
  control = list(abstol = 1e-12)
)

dispersion = exp(fit$par[1])^2
beta = fit$par[-1]
```

Now we compute the standard errors. The following uses the <span class="pack" style = "">survey</span> package raw version to get the appropriate standard errors, which the <span class="pack" style = "">ipw</span> approach uses.


```{r msm-se-1}
glm_basic = glm(y ~ a, data = simdat, weights = wts)     # to get unscaled cov
res = resid(glm_basic, type = 'working')                 # residuals
glm_vcov_unsc = summary(glm_basic)$cov.unscaled          # weighted vcov unscaled by dispersion solve(crossprod(qr(X)))
estfun = X * res * wts                  
x = estfun %*% glm_vcov_unsc 
```


## Comparison

```{r msm-svy}
library("survey")

fit_msm = svyglm(
  y ~ a,
  design = svydesign(~ 1, weights = ~ sw, data = simdat)
)

summary(fit_msm)
```

Now  get the standard errors.

```{r msm-se-2}
se = sqrt(diag(crossprod(x) * n/(n-1)))                  # a robust standard error
se_robust = sqrt(diag(sandwich::sandwich(glm_basic)))    # an easier way to get it
se_msm    = sqrt(diag(vcov(fit_msm)))                        # extract from msm model
```

Compare standard errors.

```{r msm-se-compare}
tibble(se, se_robust, se_msm)
```

Inspect the general fit and compare with the other.

```{r msm-comparison- show, echo=FALSE}
tibble(
  Estimate  = beta,
  init_se   = sqrt(diag(solve(fit$hessian)))[c('intercept', 'b')],  # same as scaled se from glm_basic
  se_robust = se_robust,
  t = Estimate/se,
  p = 2*pt(abs(t), df = n - ncol(X), lower.tail = FALSE),  
  dispersion = dispersion             
) %>% 
  kable_df(caption = 'msm_ll')

broom::tidy(fit_msm) %>% 
  kable_df(caption = 'svyglm')
```


## Source

Original code available at https://github.com/m-clark/Miscellaneous-R-Code/blob/master/ModelFitting/ipw.R