Tuesday, July 19, 2011

Appendix to Advanced Bayesian Methods for Medical Test Accuracy

The code listed below can be copied and pasted to the WinBUGS worksheet and executed according to the instructions below.

The following will introduce the fundamentals of executing a Bayesian analysis with WinBUGS.
Consider the following example of an exercises stress test given by the 2 by 2 table below.

 Exercise Stress Test and Heart Disease

                                                                             CAD
EST
D = 0
D = 1
X = 0
327
208
X = 1
115
818



This information is contained in the first list statement of the following WinBUGS code, where a 1 is added to each cell frequency. This is in effect assumes a uniform prior is appropiate for the analysis.The Document below appears in the WinBUGS worksheeet.




# Measures of accuracy
# Binary Scores
Model;
{
# Dirichlet distribution for cell probabilities
g00~dgamma(a00,2)
g01~dgamma(a01,2)
g10~dgamma(a10,2)
g11~dgamma(a11,2)

h<-g00+g01+g10+g11
# the theta have a Dirichlet distribution
theta00<-g00/h
theta01<-g01/h
theta10<-g10/h
theta11<-g11/h
# the basic test accuracies are below
tpf<-theta11/(theta11+theta01)
se<-tpf
sp<-1-fpf
fpf<-theta10/(theta10+theta00)
tnf<-theta00/(theta00+theta10)
fnf<-theta01/(theta01+theta11)
ppv<-theta11/(theta10+theta11)
npv<-theta00/(theta00+theta01)
pdlr<-tpf/fpf
ndlr<-fnf/tnf

}

# Exercise Stress Test Pepe
# Uniform Prior (add one to each cell of the table frequencies !)
list(a00=328,a01=209,a10=116,a11=819)
# initial values
list(g00=1,g01=1,g10=1,g11=1)




In order to execute the analysis, the following sequence should be performed. The analysis will consisit of estimating the test accuracy with the true and false positive fractions and the postive and negative predictive values. 65,000 observations will be generated from the posterior distributions.
1.left click 'model' of the tool bar then left click on specification. This activates the specification tool.
2.active the word 'model' in the third line of the  code and click on the check model button of the specifcation tool.
3. activate the word 'list' from the first list statement of the code and click on the load data button of the specification tool.
4. click on the compile button of the specification tool.
5.click on the word 'list' of the second list statement of the code. This contains the inital values for the MCMC simulation.
6. click on the inference menu of the tool bar and click on sample monitor tool
7.in the node box type the word tpf, then click on set button of the sample monitor tool
8. in the node box, type fpf then click on the set button
9. type the word ppv in the node box and click on the set button
10. type he word npv in the node box then click on the set button
11.type * in the node box
12.type 5000 in the beg box of the sample monitor tool. The simulation is initiated with 5000 observations that are ignored for the posterior analysis.
13. click on the word model of the tool bar then click on the word update. This activates the update tool bar
14.type 70000 in the updates box and type 100 in the refresh box

You are now ready to execute the analysis!

15. click on the update box of the update tool bar. This initiates the simulation of 65000 observations from the posterior distribution of the four parameters.

You are now ready to veiw the output ( the posterior analysis of the simulation)

16. In the sample monitor tool click on the word stats and you will see the output below. Note that .7967 is the mean of the posterior distribution of the true positive fraction tpf. The error column is the simulation error for estimating the tpf with 65000 observations. Note also the posterior distribution of the fpf has a mean and standard deviation of .2612 and .0208 respectively and the 95% credible interval is (.2215,.3033).For additional information refer to the appendix of the book.


Parameter
Mean
SD
Error
Lower 2 1/2
Median
Upper 2 1/2
TPF
.7967
.0125
<.00001
.7716
.7968
.8208
FPF
.2612
.0208
<.00001
.2215
.2608
.3033






Friday, July 15, 2011

Chapter 11 Code for Advanced Bayesian Methods for Medical Test Accuracy


 Bugs Code 11.1

# one test
model;
{
for(i in 1:N){tpr[i]<-(tp[i]+.5)/(tp[i]+fn[i]+.05)}
for(i in 1:N){fpr[i]<-(fp[i]+.5)/(fp[i]+tn[i]+.05)}

for(i in 1:N){ u[i]<-logit(fpr[i])}

for(i in 1:N){ v[i]<-logit(tpr[i])}

for(i in 1:N){ b[i]<-v[i]-u[i]}

for(i in 1:N){s[i]<-v[i]+u[i]}


# bilogistic regression of b on s

for(i in 1:N){ b[i]~dlogis(mu[i],tau)

mu[i]<-beta[1]+beta[2]*s[i]}

for(i in 1:2){ beta[i]~dnorm(.0000,.0001)}

tau~dgamma(.0001,.0001)

P<-1+exp(-beta[1]/2)

# accuracy of test

Q<-1/P

#sroc curve assumes slope is 0
r1<-exp(-beta[1])
for( i in 1:N){r2[i]<-(1-fpr[i])/fpr[i]}
for( i in 1:N){r3[i]<-1+r1*r2[i]}
for( i in 1:N){sroc[i]<-1/r3[i]}

}
# data from DeVries et al. duplex mode
list(N=8, tn=c(516,89,235,262,488,48,156,376),
fn=c(28,8,23,20,14,7,2,31),
fp=c(20,12,5,22,9,3,14,12),
tp=c(78,59,75,89,118,48,39,121))

# data from meijer et al.
list( N=20,
tn=c(23,35,10,9,23,30,60,48,12,38,23,60,9,42,27,50,35,5,37,20),
fn=c(2,1,1,1,1,1,1,1,1,1,1,1,1,1,2,1,1,1,1,1),
fp=c(1,2,1,1,1,6,7,4,4,1,6,7,1,4,3,5,1,1,3,3),
tp=c(36,28,29,29,16,19,20,18,88,12,35,14,25,53,38,25,13,26,44,76))


# initial values
list( beta=c(0,0), tau=1))

Bugs Code 11.2

model;
{
for(i in 1:N){tpr[i]<-(tp[i]+.5)/(tp[i]+fn[i]+.05)}
for(i in 1:N){fpr[i]<-(fp[i]+.5)/(fp[i]+tn[i]+.05)}

for(i in 1:N){ u[i]<-logit(fpr[i])}

for(i in 1:N){ v[i]<-logit(tpr[i])}

for(i in 1:N){ b[i]<-v[i]-u[i]}

for(i in 1:N){s[i]<-v[i]+u[i]}


# bilogistic regression of b on s

for(i in 1:N){ b[i]~dlogis(mu[i],tau)

mu[i]<-beta[1]+beta[2]*s[i]}

for(i in 1:2){ beta[i]~dnorm(.0000,.0001)}

tau~dgamma(.0001,.0001)

P<-1+exp(-beta[1]/2)

# accuracy of test

Q<-1/P
# sroc curve, does not assume slope is zero

r1<-exp(-beta[1]/(1-beta[2]))
for( i in 1:N){r2[i]<-(1-fpr[i])/fpr[i]}
r3<-(1+beta[2])/(1-beta[2])
for( i in 1:N){r4[i]<-pow(r2[i],r3)}
for( i in 1:N){r5[i]<-1+r1*r4[i]}
for( i in 1:N){sroc[i]<-1/r5[i]}


}
# data from DeVries et al. duplex mode
list(N=8, tn=c(516,89,235,262,488,48,156,376),
fn=c(28,8,23,20,14,7,2,31),
fp=c(20,12,5,22,9,3,14,12),
tp=c(78,59,75,89,118,48,39,121))


# initial values
list( beta=c(0,0), tau=1))

Bugs Code 11.3

# one test no covariates
model;
{

for(i in 1:N){ u[i]<-logit(fpr[i])}

for(i in 1:N){ v[i]<-logit(tpr[i])}

for(i in 1:N){ b[i]<-v[i]-u[i]}

for(i in 1:N){s[i]<-v[i]+u[i]}


# bilogistic regression of b on s

for(i in 1:N){ b[i]~dlogis(mu[i],tau)

mu[i]<-beta[1]+beta[2]*s[i]}

for(i in 1:2){ beta[i]~dnorm(.0000,.0001)}

tau~dgamma(.0001,.0001)

P<-1+exp(-beta[1]/2)

# accuracy of test

Q<-1/P

#sroc curve assumes slope is 0
r1<-exp(-beta[1])
for( i in 1:N){r2[i]<-(1-fpr[i])/fpr[i]}
for( i in 1:N){r3[i]<-1+r1*r2[i]}
for( i in 1:N){sroc[i]<-1/r3[i]}

}
# data from Pakos et al.
list(N=19,
tpr=c(.67,.75,.85,.35,.99,.95,.90,.99,.80,.78,.95,.61,.90,.89,.88,.93,.93,.43,.38
),
fpr=c(.15,.05,.11,.17,.22,.33,.33,.01,.33,.60,.43,.04,.17,.01,.10,.70,.01,.01,.57))

# data from Huebner et al.

list(N=10, tpr=c(.96,.90,.99,.98,.92,.96,.99,.99,.95,.94),
 fpr=c(.03,.01,.33,.17,.20,.01,.31,.29,.21,.01))


# initial values
list( beta=c(0,0), tau=1))

BUGS CODE 11.4
# two tests
model;
{

# for test 1
for(i in 1:N1){ u1[i]<-logit(fpr1[i])}

for(i in 1:N1){ v1[i]<-logit(tpr1[i])}

for(i in 1:N1){ b1[i]<-v1[i]-u1[i]}

for(i in 1:N1){ s1[i]<-v1[i]+u1[i]}

# bilogistic regression test 1

for(i in 1:N1){ b1[i]~dlogis(mu1[i],tau1)

mu1[i]<-beta1[1]+beta1[2]*s1[i]}

for(i in 1:2){ beta1[i]~dnorm(.0000,.0001)}

tau1~dgamma(.0001,.0001)

P1<-1+exp(-beta1[1]/2)

# accuracy of test 1

Q1<-1/P1

#sroc test 1, assumes slope is 0

r11<-exp(-beta1[1])
for( i in 1:N1){r12[i]<-(1-fpr1[i])/fpr1[i]}
for( i in 1:N1){r13[i]<-1+r11*r12[i]}
for( i in 1:N1){sroc1[i]<-1/r13[i]}
# for test 2
for(i in 1:N2){ u2[i]<-logit(fpr2[i])}

for(i in 1:N2){ v2[i]<-logit(tpr2[i])}

for(i in 1:N2){ b2[i]<-v2[i]-u2[i]}

for(i in 1:N2){ s2[i]<-v2[i]+u2[i]}

# bilogistic regression test 2

for(i in 1:N2){ b2[i]~dlogis(mu2[i],tau2)

mu2[i]<-beta2[1]+beta2[2]*s2[i]}

for(i in 1:2){ beta2[i]~dnorm(.0000,.0001)}

tau2~dgamma(.0001,.0001)

#sroc test 2, assumes slope is zero

r21<-exp(-beta2[1])
for( i in 1:N2){r22[i]<-(1-fpr2[i])/fpr2[i]}
for( i in 1:N2){r23[i]<-1+r21*r22[i]}
for( i in 1:N2){sroc2[i]<-1/r23[i]}

P2<-1+exp(-beta2[1]/2)

# accuracy of test 2

Q2<-1/P2

# difference in accuracy of two tests

d<-Q1-Q2

}

# below is data from DeVries et al. for duplex and color
# duplex is test 1
# color is test 2
list( N1 = 8, fpr1=c(.04,.12,.02,.08,.02,.06,.08,.03),
     tpr1=c(.74,.88,.77,.82,.89,.87,.95,.80),
N2=6, fpr2=c(.01,.01,.02,.06,.05,.05),
         tpr2=c(.90,.99,.88,.89,.99,.96))

# horsthuis et al. study
# test 1 is US and test 2 is MR
list( N1 =9, fpr1=c(.10,.01,.06,.12,.07,.07,.01,.33,.04),
     tpr1=c(.78,.93,.90,.81,.93,.88,.87,.96,.92),
N2=7, fpr2=c(.01,.01,.39,.39,.01,.08,.15),
         tpr2=c(.99,.99,.91,.87,.82,.95,.89))


list(beta1=c(0,0), tau1=1, beta2=c(0,0),tau2=1)

Bugs Code 11.5

# one test with covariates
model;
{

for(i in 1:N){ u[i]<-logit(fpr[i])}

for(i in 1:N){ v[i]<-logit(tpr[i])}

for(i in 1:N){ b[i]<-v[i]-u[i]}

for(i in 1:N){s[i]<-v[i]+u[i]}


# bilogistic regression of b on s

for(i in 1:N){ b[i]~dlogis(mu[i],tau)

mu[i]<-beta[1]+beta[2]*s[i]+neta[1]*x1[i]+neta[2]*x2[i]}

for(i in 1:2){ beta[i]~dnorm(.0000,.0001)}
for(i in 1:2){ neta[i]~dnorm(.0000,.0001)}
tau~dgamma(.0001,.0001)

P<-1+exp(-beta[1]/2)

# accuracy of test

Q<-1/P

#sroc curve assumes slope is 0
r1<-exp(-beta[1])
for( i in 1:N){r2[i]<-(1-fpr[i])/fpr[i]}
for( i in 1:N){r3[i]<-1+r1*r2[i]}
for( i in 1:N){sroc[i]<-1/r3[i]}

}
# data from Horsthuis et al.
#x1 is fraction with ulcerative colitis
# x2 is fraction of males
list(N=9,tpr=c(.78,.935,.90,.812,.931,.884,.87,.96,.92),
                fpr=c(.10,.01,.05,.12,.07,.07,.01,.33,.03),
# the first component of x2 is the average of the remaining 8
                x1=c(.322,.093,.322,1,.406,.643,1,1,.642),
                x2=c(.4088,.097,.457,.571,.423,.465,.366,.571,.321))

# Pakos et al.
list(N=19,
tpr=c(.67,.75,.85,.35,.99,.95,.90,.99,.80,.78,.95,.61,.90,.89,.88,.93,.93,.43,.38),
fpr=c(.15,.26,.11,.17,.22,.33,.33,.01,.33,.6,.43,.04,.17,.01,.10,.70,.01,.01,.57),

# components 2,3,4,14, and 17 of x1were given the age 54, the avg of remaining
# components 2,3,4,14, and 17 of x2 were given the value .70, the avg of remaining
# x1 is the average age
#x2 is the percentage of males
x1=c(59,54,54,54,48,57,61,56,66,45,58,48,47,54,60,58,54,48,46),
x2=c(83,70,70,70,81,53,68,67,86,72,82,76,71,70,58,68,70,59,67))

# initial values
list( beta=c(0,0),neta=c(0,0), tau=1))

Bugs Code 11.6

# two tests  with covariates
model;
{

for(i in 1:N1){ u1[i]<-logit(fpr1[i])}

for(i in 1:N1){ v1[i]<-logit(tpr1[i])}

for(i in 1:N1){ b1[i]<-v1[i]-u1[i]}

for(i in 1:N1){s1[i]<-v1[i]+u1[i]}


# bilogistic regression of b on s

for(i in 1:N1){ b1[i]~dlogis(mu1[i],tau1)

mu1[i]<-beta1[1]+beta1[2]*s1[i]+neta1[1]*x11[i]+neta1[2]*x12[i]}

for(i in 1:2){ beta1[i]~dnorm(.0000,.0001)}
for(i in 1:2){ neta1[i]~dnorm(.0000,.0001)}
tau1~dgamma(.0001,.0001)

P1<-1+exp(-beta1[1]/2)

# accuracy of test 1

Q1<-1/P1

#sroc curve assumes slope is 0
r11<-exp(-beta1[1])
for( i in 1:N1){r12[i]<-(1-fpr1[i])/fpr1[i]}
for( i in 1:N1){r13[i]<-1+r11*r12[i]}
for( i in 1:N1){sroc1[i]<-1/r13[i]}

 # for test 2

for(i in 1:N2){ u2[i]<-logit(fpr2[i])}

for(i in 1:N2){ v2[i]<-logit(tpr2[i])}

for(i in 1:N2){ b2[i]<-v2[i]-u2[i]}

for(i in 1:N2){s2[i]<-v2[i]+u2[i]}

# bilogistic regression of b2 on s2

for(i in 1:N2){ b2[i]~dlogis(mu2[i],tau2)

mu2[i]<-beta2[1]+beta2[2]*s2[i]+neta2[1]*x21[i]+neta2[2]*x22[i]}

for(i in 1:2){ beta2[i]~dnorm(.0000,.0001)}
for(i in 1:2){ neta2[i]~dnorm(.0000,.0001)}
tau2~dgamma(.0001,.0001)

P2<-1+exp(-beta2[1]/2)

# accuracy of test

Q2<-1/P2

#sroc curve assumes slope is 0
r21<-exp(-beta2[1])
for( i in 1:N2){r22[i]<-(1-fpr2[i])/fpr2[i]}
for( i in 1:N2){r23[i]<-1+r21*r22[i]}
for( i in 1:N2){sroc2[i]<-1/r23[i]}

d<-Q1-Q2

}

# data from Horsthuis et al. for US and MRI
# US is test 1 and MRI is test 2
#x11 is fraction with crohn disease
# x12 is fraction of males

list(N1=9,tpr1=c(.78,.935,.90,.812,.931,.884,.87,.96,.92),
                fpr1=c(.10,.01,.05,.12,.07,.07,.01,.33,.03),
# the first component of x12 is the average of the remaining 8
                x11=c(.322,.093,.322,1,.406,.643,1,1,.642),
                x12=c(.4088,.097,.457,.571,.423,.465,.366,.571,.321),
# the following is for test 2 or MRI
                N2=7,
                x21=c(.6,.54,1,1,.34,.36,1),
                x22=c(.6,.42,.46,.36,.49,.56,.52),

                tpr2=c(.99,.99,.913,.87,.818,.956,.889),
                 fpr2=c(.18,.01,.29,.29,.01,.08,.143))


# initial values
list( beta1=c(0,0),neta1=c(0,0), tau1=1,beta2=c(0,0),neta2=c(0,0), tau2=1))

BUGS CODE 11.7
model;
{

# binomial distributions for the two tests
for(i in 1:17){x[i]~dbin(p[i],m[i])}
for(i in 1:17){y[i]~dbin(q[i],n[i])}
# logit models for the two complication rates
for( i in 1:17){logit(p[i])<-theta[i]}
for( i in 1:17){logit(q[i])<-phi[i]}
# prior distributions
for( i in 1:17){theta[i]~dnorm(muth,tauth)}
for( i in 1:17){phi[i]~dnorm(muphi,tauphi)}
muth~dnorm(0.000,.0001)
muphi~dnorm(0.000,.0001)
tauth~dgamma(.00001,.00001)
tauphi~dgamma(.00001,.00001)
# mean of complication rates
pee<-mean(p[])
qee<-mean(q[])
d<-pee-qee

}
# heinrich et al. meta analysis
list( m=c(123,72,54,210,58,105,76,134,35,25,64,54,32,100,20,59,101),
n=c(125,76,48,204,56,116,77,125,35,25,65,49,32,50,19,60,99),
x=c(6,5,1,14,0,4,2,12,2,4,2,5,1,7,1,0,0),
y=c(7,0,5,9,1,1,0,17,9,4,17,14,0,3,0,0,2))
# initial values
# must initiate other chain with specification tool
list(muth=0,muphi=0,tauth=1,tauphi=1)

Bugs Code 11.8

# one test assumes beta[2] is not zero
model;
{

for(i in 1:N){ u[i]<-logit(fpr[i])}

for(i in 1:N){ v[i]<-logit(tpr[i])}

for(i in 1:N){ b[i]<-v[i]-u[i]}

for(i in 1:N){s[i]<-v[i]+u[i]}

# bilogistic regression of b on s

for(i in 1:N){ b[i]~dlogis(mu[i],tau)

mu[i]<-beta[1]+beta[2]*s[i]}

for(i in 1:2){ beta[i]~dnorm(.0000,.0001)}

tau~dgamma(.0001,.0001)

P<-1+exp(-beta[1]/2)

# accuracy of test
Q<-1/P
# sroc curve, does not assume slope is zero

r1<-exp(-beta[1]/(1-beta[2]))
for( i in 1:N){r2[i]<-(1-fpr[i])/fpr[i]}
r3<-(1+beta[2])/(1-beta[2])
for( i in 1:N){r4[i]<-pow(r2[i],r3)}
for( i in 1:N){r5[i]<-1+r1*r4[i]}
for( i in 1:N){sroc[i]<-1/r5[i]}
}
# data from Huebner et al.
list(N=10, tpr=c(.96,.90,.99,.98,.92,.96,.99,.99,.95,.94),
 fpr=c(.03,.01,.33,.17,.20,.01,.31,.29,.21,.01))
# initial values
list( beta=c(0,0), tau=1))





BUGS CODE 11.9

model;
{

# binomial distributions for the two tests
# Compares complications of two tests with Covariates
for(i in 1:17){x[i]~dbin(p[i],m[i])}
for(i in 1:17){y[i]~dbin(q[i],n[i])}
# logit models for the two complication rates
for( i in 1:17){logit(p[i])<-theta[i]+neta1[1]*x1[i]+neta1[2]*x2[i]}
for( i in 1:17){logit(q[i])<-theta[i]+phi[i]+neta2[1]*y1[i]+neta2[2]*y2[i]}
# prior distributions
neta1[1]~dnorm(0.000,.0001)
neta1[2]~dnorm(0.000,.0001)

neta2[1]~dnorm(0.000,.0001)
neta2[2]~dnorm(0.000,.0001)

for( i in 1:17){theta[i]~dnorm(muth,tauth)}
for( i in 1:17){phi[i]~dnorm(muphi,tauphi)}
muth~dnorm(0.000,.0001)
muphi~dnorm(0.000,.0001)
tauth~dgamma(.00001,.00001)
tauphi~dgamma(.00001,.00001)
# mean of complication rates
pee<-mean(p[])
qee<-mean(q[])
d<-pee-qee

}
# heinrich et al. meta analysis
list( m=c(123,72,54,210,58,105,76,134,35,25,64,54,32,100,20,59,101),
n=c(125,76,48,204,56,116,77,125,35,25,65,49,32,50,19,60,99),
x=c(6,5,1,14,0,4,2,12,2,4,2,5,1,7,1,0,0),
y=c(7,0,5,9,1,1,0,17,9,4,17,14,0,3,0,0,2),
# x1 is the age for control group
x1=c(68.3,65.4,65,70.5,60.6,60.6,67,71,70,73.8,71.1,62,67,55,61.9,62.4,61),
#y1 is age for the treatment group
y1=c(69.5,67.1,66,72.4,61.1,62.1,67.3,72.8,72,72,70.6,65,69,52,60.2,63,59),
# x2 is the % women for control group
x2=c(50,36,52,40,31,25,33,31,31,44,36,28,12,18,23,19,20),
# y2 is % women for treatment group
y2=c(57,24,33,32,27,24,30,28.4,28.4,44,46,31,16,22,23,7,12))
# initial values
# must initiate other chain with specification tool
list(muth=0,muphi=0,tauth=1,tauphi=1, neta1=c(0,0), neta2=c(0,0))