invGauss

R library for fitting a Mixture Inverse Gaussian distribution to survival data

Note

invGauss has been tested to a limited extent. There is always a risk of erroneous results.

Comments or bug reports can be sent to hakon.gjessing@uib.no.

Last updated: 28 March, 2014, with some minor modifications on 20 May, 2022.

Version 1.1 was an upgrade from 1.0, including some changes to default settings, the use of analytic gradients, and a selection of optimization methods. Version 1.2 contains some minor fixes.

Install & load

invGauss is on CRAN and thus available for installation through the standard package handling system in R.

install.packages("invGauss")
library(invGauss)

Background

invGauss is designed to fit an Inverse Gaussian distribution to survival data (with possible censoring). The Inverse Gaussian distribution is derived as the barrier-hitting distribution of a Brownian motion with drift. To achieve higher flexibility, the drift parameter m itself is modeled as a Gaussian random variable, giving different drift parameters for different individuals. Since some individuals will have a drift away from the barrier, the resulting hitting time distribution will be defective, i.e. it integrates to less than one. The model also incorporates covariates, which can be included in the form

$$\mu_i = g(a^T x_i), \qquad c_i = h(b^T y_i)$$

$\mu_i$ and $c_i$ are the individual values of drift and initial distance from the barrier, respectively, for individual $i$. $x_i$ and $y_i$ are covariate vectors, and $g$ and $h$ are link functions. Typical link functions are the identity or exp. invGauss estimates both parameter vectors $a$ and $b$, in addition to $\tau$, which is the standard deviation of the Gaussian random variable describing drift. The $x_i$ and $y_i$ covariate vectors can “overlap” in that they can consist of the same or different covariates.
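To make the covariate structure concrete, here is a minimal base-R sketch of how individual drift and initial-distance values follow from the two linear predictors. The toy data frame, the coefficient values, and the choice of identity link for drift and exp link for distance are all assumptions made for illustration; the code is not taken from the package internals.

```r
# Toy covariates (hypothetical values, not from d.oropha.rec)
toy <- data.frame(sex = c(0, 1, 1), stage = c(1, 2, 3))

# Design matrices, each with an intercept column.
# The two covariate sets overlap: 'sex' appears in both.
X <- model.matrix(~ sex, data = toy)          # covariates x_i for drift
Y <- model.matrix(~ sex + stage, data = toy)  # covariates y_i for distance

# Hypothetical parameter vectors a and b (chosen only for illustration)
a <- c(0.3, -0.1)
b <- c(0.5, 0.2, 0.1)

# Assumed link functions: identity for drift, exp for initial distance
g <- identity
h <- exp

mu_i <- g(drop(X %*% a))  # individual drift values
c_i  <- h(drop(Y %*% b))  # individual initial distances (always positive)

print(data.frame(toy, mu_i = mu_i, c_i = c_i))
```

An exp link keeps every initial distance positive, whereas an identity link leaves the drift free to take either sign.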

The model is intended as an illustration of how one can model an underlying unobserved disease process where only the final outcome is observed.
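The defectiveness of the hitting time distribution can be checked numerically. The sketch below assumes a process that starts at distance c from the barrier, has unit diffusion coefficient, and has an individual drift toward the barrier that is normally distributed with mean $\mu$ and standard deviation $\tau$. Under these assumptions the marginal density of the hitting time and the probability of ever hitting the barrier have closed forms. The function names `mig_density` and `hit_probability` are made up for this example and are not part of invGauss.

```r
# Marginal first-hitting-time density under the ASSUMED parametrization:
# initial distance c0, unit diffusion, drift m ~ N(mu, tau^2) toward the barrier.
# (c0 is used instead of c to avoid masking base::c.)
mig_density <- function(t, c0, mu, tau) {
  v <- 1 + tau^2 * t
  # Work on the log scale to avoid overflow for very small t
  log_f <- log(c0) - 0.5 * log(2 * pi) - 1.5 * log(t) - 0.5 * log(v) -
    (c0 - mu * t)^2 / (2 * t * v)
  ifelse(t > 0, exp(log_f), 0)
}

# Probability of ever hitting the barrier:
# P(m >= 0) + E[exp(2 * c0 * m); m < 0] for m ~ N(mu, tau^2)
hit_probability <- function(c0, mu, tau) {
  pnorm(mu / tau) +
    exp(2 * c0 * mu + 2 * c0^2 * tau^2) * pnorm(-mu / tau - 2 * c0 * tau)
}

# Hypothetical parameter values, chosen only for illustration
c0 <- 2; mu <- 0.5; tau <- 1

numeric_mass <- integrate(mig_density, lower = 0, upper = Inf,
                          c0 = c0, mu = mu, tau = tau)$value
closed_mass <- hit_probability(c0, mu, tau)

print(c(numeric = numeric_mass, closed_form = closed_mass))
```

The two values should agree up to integration error and lie below one; the missing mass corresponds to individuals whose drift carries them away from the barrier.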

For more details, we refer to:
  • Aalen OO, Borgan Ø, Gjessing HK. Survival and Event History Analysis: A Process Point of View. Springer-Verlag, 2008.
  • Aalen OO, Gjessing HK. Understanding the Shape of the Hazard Rate: A Process Point of View. Statistical Science, 2001, Vol. 16, No. 1, 1-22.
  • Aalen OO. Phase type distributions in survival analysis. Scandinavian Journal of Statistics, 1995, Vol. 22, Issue 4, 447-463.

Examples of use

The dataset d.oropha.rec is included in the library. To make it available, use

data(d.oropha.rec)

invGauss can then be run with, for instance,

res <- invGauss(formula.mu = Surv(time, status) ~ 1, formula.c = ~ cond + nstage + tstage, data = d.oropha.rec)

which corresponds to Model 5 in Table 10.2, page 412 in Aalen, Borgan, Gjessing (2008).

The results can be summarized with

summary(res)

which should produce this

[Image: Result]

Another example would be

res1 <- invGauss(formula.mu = Surv(time, status) ~ sex + cond + nstage + tstage, formula.c = ~ sex + cond + nstage + tstage, data = d.oropha.rec, opti.method = "Nelder-Mead")

which corresponds to Model 4 in Aalen, Borgan, Gjessing (2008). (Note that the "Nelder-Mead" method has been replaced by "BFGS" as default in Version 1.1.)

Help files

More details about usage can be found in the help files. In R, use any of the following to view the help files:

?invGauss
?summary.invGauss
help(package = "invGauss")

Details

Here are some details of the estimation procedure:

[Image: Details]

Author

Håkon K. Gjessing

Principal Investigator / Professor
