BayesPiecewiseICAR: Tutorial and Details

This is a tutorial for how to use my package BayesPiecewiseICAR which fits a hazard function under the assumption of a piecewise hazard. This Bayesian sampler has a hierarchical structure that allows for shared dependency between the hazard heights and additionally allows for a changing dimension in the number of split points. I used this code in my analysis of Mass Shooting Data (with no censoring corresponding to the shooting times). Before getting into how to use the code extensively, I want to detail how the model is formulated.

The Model

Although this model can be used for many time to event purposes, it’s primarily intended for modeling patient survival which contains censoring indicators. Let $(Y_i,I_i)$ denote the patient survival time and censoring indicator, respectively. If you do not have censoring, you can use $I_i=1$ for all $i$. We assume that the hazard distribution for $(\mathbf{Y},\mathbf{I})$ is peicewise constant $e^{\lambda_{j+1}}$ on disjoint intervals $(s_j,s_{j+1})$ where $s_0=0. This produces the following survival distribution and corresponding likelihood.

Next we put prior distributions on the parameters along with second level prior. The prior distributions are as follows:

Where $\Sigma_s$ is defined by the ICAR formulation in the following manner where $c_\lambda$ is a hyperparameter:

The MCMC for this allows reversible jump moves, with the dimension of $\mathbf{s}$ and $\mathbf{\lambda}$ being permitted to change so this requires an additional parameter space adjustment to the acceptance probability, which I will not discuss here. The other two moves, sampling $\sigma^{-2}$, $\mu$ and $\mathbf{\lambda}|J$ are gibbs and metropolis samplers.

How to use the function

Let’s take a look at the example, which I adjusted to better illustrate what’s happening (will take more time than the normal example)

library(BayesPiecewiseICAR)

Y=rexp(100,1/20)
I=rbinom(100,1,.5) – Simulates Data
a1=.7
b1=.7 – Both of these are priors for $\sigma^{-2}$
phi=3 – Mean number of prior split points
Jmax=20 – Maximum number of split points allowed
cl1=.25 –Tuning parameter for proposals of $\lambda_j$
clam1=.5   – Spatial Dependency of  the ICAR specification
J1=10 – Starts the sampler with 10 split points, which are randomly generated
hyper=c(a1,b1,phi,Jmax,cl1,J1,clam1)
B=10000 – Using 10k iterations to show how it works
X=ICARBHSampler(Y,I,B,hyper) – Runs sampler

Which should produce the following graphs (or something like it)

The first graph displays a histogram of the number of split points in the sample, clearly it’s suggesting there are no split points, which makes sense because the exponential distribution has one piece in it’s constant hazard. The second and third graphs are trace plots for the level 2 hyper parameters. Now let’s investigate what’s happening more by looking at the objects in the list of X.

par(mfrow=c(1,1))

plot(1:length(X[[1]]),X[[1]],type=”l”, main=”Trace of # of Split Points”, xlab=”Iteration #”, ylab=”# of Split Points”)

As you can see, the MCMC is starting out with 10 split points and is immediately reducing this to 0-1 split points as it moves through the rest of the sampler. We can see how this is done by looking at the split point locations in X[[2]] and the $\lambda_j$ values stored in X[[3]].

As you can see the vectors are shrinking immediately (with the $\lambda_j$ sampler taking a few iterations to start sampling well.)

What we can do with the posterior

Firstly I would suggest burning in the sample, say 50% in the following way:

J=X[[1]][(B*burn):B]

S=X[[2]][(B*burn):B,]

Lam=X[[3]][(B*burn):B,]

Now this example is probably not the best suggestion since we only will have 0-1 split points with an exponential data set, but in general, say you want to plot the posterior locations of the split points with a mode of M split points in the posterior. The following code will give you the posterior means of $\mathbf{S}|J=M$ and $\mathbf{\lambda}|J=M$

l1=matrix(rep(NA,(M+1)*nrow(Lam)),nrow=nrow(Lam))
s1=matrix(rep(NA,(M+2)*nrow(Lam)),nrow=nrow(Lam))

for(b in 1:nrow(l1)){
if(is.na(S[b,M+3])==TRUE && is.na(S[b,M+2])==FALSE){
for(k in 1:(M+2)){
s1[b,k]=S[b,k]
}

for(k in 1:(M+1)){
l1[b,k]=Lam[b,k]
}

}

}

colMeans(l1,na.rm=TRUE)
colMeans(s1,na.rm=TRUE)

After having the matrices l1 and s1, you can remove the NA rows (which is where $J \ne M$ in the posterior) and plot the densities of the location of the split points. This is how you can get graphs like this.

Which if you have more than one changepoint, you can plot them all on the same graph with different colors.

I hope this helps explain how to use the code and what model is being used. Don’t hesitate to email me if you have any questions.

If you enjoyed this post, take a look at some of my other Adventures in Statistics.

-Andrew G. Chapple