R: Validate Predicted Probabilities (2024)

val.prob {rms}R Documentation

Validate Predicted Probabilities

Description

The val.prob function is useful for validatingpredicted probabilities against binary events.

Given a set of predicted probabilities p or predicted log oddslogit, and a vector of binary outcomes y that were notused in developing the predictions p or logit,val.prob computes the following indexes and statistics: Somers'D_{xy} rank correlation between p and y[2(C-.5), C=ROC area], Nagelkerke-Cox-Snell-Maddala-MageeR-squared index, Discrimination index D [ (Logistic modelL.R. \chi^2 - 1)/n], L.R. \chi^2,its P-value, Unreliability index U, \chi^2with 2 d.f. for testing unreliability (H0: intercept=0, slope=1), itsP-value, the quality index Q, Brier score (averagesquared difference in p and y), Intercept, andSlope, E_{max}=maximum absolute difference in predictedand loess-calibrated probabilities, Eavg, the average in same,E90, the 0.9 quantile of same, the Spiegelhalter Z-test for calibration accuracy, and its two-tailed P-value. Ifpl=TRUE, plots fitted logistic calibration curve and optionally a smooth nonparametric fit usinglowess(p,y,iter=0) and grouped proportions vs. mean predictedprobability in group. If the predicted probabilities or logits areconstant, the statistics are returned and no plot is made.Eavg, Emax, E90 were from linear logistic calibration beforerms 4.5-1.

When group is present, different statistics are computed,different graphs are made, and the object returned by val.prob isdifferent. group specifies a stratification variable.Validations are done separately by levels of group and overall. Aprint method prints summary statistics and several quantiles ofpredicted probabilities, and a plot method plots calibrationcurves with summary statistics superimposed, along with selectedquantiles of the predicted probabilities (shown as tick marks oncalibration curves). Only the lowess calibration curve isestimated. The statistics computed are the average predictedprobability, the observed proportion of events, a 1 d.f. chi-squarestatistic for testing for overall mis-calibration (i.e., a test of theobserved vs. the overall average predicted probability of the event)(ChiSq), and a 2 d.f. chi-square statistic for testingsimultaneously that the intercept of a linear logistic calibration curveis zero and the slope is one (ChiSq2), average absolutecalibration error (average absolute difference between thelowess-estimated calibration curve and the line of identity,labeled Eavg), Eavg divided by the difference between the0.95 and 0.05 quantiles of predictive probabilities (Eavg/P90), a"median odds ratio", i.e., the anti-log of the median absolutedifference between predicted and calibrated predicted log odds of theevent (Med OR), the C-index (ROC area), the Brier quadratic errorscore (B), a chi-square test of goodness of fit based on theBrier score (B ChiSq), and the Brier score computed on calibrated rather than rawpredicted probabilities (B cal). The first chi-square test is atest of overall calibration accuracy ("calibration in the large"), andthe second will also detect errors such as slope shrinkage caused byoverfitting or regression to the mean. See Cox (1970) for both of thesescore tests. The goodness of fit test based on the (uncalibrated) Brierscore is due to Hilden, Habbema, and Bjerregaard (1978) and is discussedin Spiegelhalter (1986). When group is present you can alsospecify sampling weights (usually frequencies), to obtainedweighted calibration curves.

To get the behavior that results from a grouping variable being presentwithout having a grouping variable, use group=TRUE. In theplot method, calibration curves are drawn and labeled by defaultwhere they are maximally separated using the labcurve function.The following parameters do not apply when group is present:pl, smooth, logistic.cal, m, g,cuts, emax.lim, legendloc, riskdist,mkh, connect.group, connect.smooth. The followingparameters apply to the plot method but not to val.prob:xlab, ylab, lim, statloc, cex.

Usage

val.prob(p, y, logit, group, weights=rep(1,length(y)), normwt=FALSE, pl=TRUE, smooth=TRUE, logistic.cal=TRUE, xlab="Predicted Probability", ylab="Actual Probability", lim=c(0, 1), m, g, cuts, emax.lim=c(0,1), legendloc=lim[1] + c(0.55 * diff(lim), 0.27 * diff(lim)), statloc=c(0,0.99), riskdist=c("predicted", "calibrated"), cex=.7, mkh=.02, connect.group=FALSE, connect.smooth=TRUE, g.group=4, evaluate=100, nmin=0)## S3 method for class 'val.prob'print(x, ...)## S3 method for class 'val.prob'plot(x, xlab="Predicted Probability", ylab="Actual Probability", lim=c(0,1), statloc=lim, stats=1:12, cex=.5, lwd.overall=4, quantiles=c(.05,.95), flag, ...)

Arguments

p

predicted probability

y

vector of binary outcomes

logit

predicted log odds of outcome. Specify either p or logit.

group

a grouping variable. If numeric this variable is grouped intog.group quantile groups (default is quartiles). Set group=TRUE touse the group algorithm but with a single stratum for val.prob.

weights

an optional numeric vector of per-observation weights (usually frequencies),used only if group is given.

normwt

set to TRUE to make weights sum to the number of non-missing observations.

pl

TRUE to plot calibration curves and optionally statistics

smooth

plot smooth fit to (p,y) using lowess(p,y,iter=0)

logistic.cal

plot linear logistic calibration fit to (p,y)

xlab

x-axis label, default is "Predicted Probability" for val.prob.

ylab

y-axis label, default is "Actual Probability" forval.prob.

lim

limits for both x and y axes

m

If grouped proportions are desired, average no. observations per group

g

If grouped proportions are desired, number of quantile groups

cuts

If grouped proportions are desired, actual cut points for constructingintervals, e.g. c(0,.1,.8,.9,1) or seq(0,1,by=.2)

emax.lim

Vector containing lowest and highest predicted probability over which tocompute Emax.

legendloc

If pl=TRUE, list with components x,y or vector c(x,y) for upper left cornerof legend for curves and points. Default is c(.55, .27) scaled tolim. Use locator(1) to use the mouse, FALSE to suppress legend.

statloc

D_{xy}, C, R^2, D, U, Q, Brier score, Intercept, Slope, and E_{max}will be added to plot, usingstatloc as the upper left corner of a box (default is c(0,.9)).You can specify a list or a vector. Use locator(1)for the mouse, FALSE to suppress statistics. This is plotted after the curve legends.

riskdist

Use "calibrated" to plot the relative frequency distribution ofcalibrated probabilities after dividing into 101 bins from lim[1] tolim[2].Set to "predicted" (the default as of rms 4.5-1) to use raw assigned risk, FALSE to omit risk distribution.Values are scaled so that highest bar is 0.15*(lim[2]-lim[1]).

cex

Character size for legend or for table of statistics when group is given

mkh

Size of symbols for legend. Default is 0.02 (see par()).

connect.group

Defaults to FALSE to only represent group fractions as triangles.Set to TRUE to also connect with a solid line.

connect.smooth

Defaults to TRUE to draw smoothed estimates using a dashed line.Set to FALSE to instead use dots at individual estimates.

g.group

number of quantile groups to use when group is given and variable isnumeric.

evaluate

number of points at which to store the lowess-calibration curve.Default is 100. If there are more than evaluate unique predictedprobabilities, evaluate equally-spaced quantiles of the uniquepredicted probabilities, with linearly interpolated calibrated values,are retained for plotting (and stored in the object returned byval.prob.

nmin

applies when group is given. When nmin > 0, val.prob will notstore coordinates of smoothed calibration curves in the outer tails,where there are fewer than nmin raw observations represented inthose tails. If for example nmin=50, the plot function will onlyplot the estimated calibration curve from a to b, where there are50 subjects with predicted probabilities < a and > b.nmin is ignored when computing accuracy statistics.

x

result of val.prob (with group in effect)

...

optional arguments for labcurve (through plot). Commonly usedoptions are col (vector of colors for the strata plus overall) andlty. Ignored for print.

stats

vector of column numbers of statistical indexes to write on plot

lwd.overall

line width for plotting the overall calibration curve

quantiles

a vector listing which quantiles should be indicated on eachcalibration curve using tick marks. The values in quantiles can beany number of values from the following: .01, .025, .05, .1, .25, .5, .75, .9, .95, .975, .99.By default the 0.05 and 0.95 quantiles are indicated.

flag

a function of the matrix of statistics (rows representing groups)returning a vector of character strings (one value for each group, including"Overall"). plot.val.prob will print this vector of charactervalues to the left of the statistics. The flag function can refer to columns of the matrix used as input to the function bytheir names given in the description above. The default functionreturns "*" if either ChiSq2 or B ChiSq issignificant at the 0.01 level and " " otherwise.

Details

The 2 d.f. \chi^2 test and Med OR exclude predicted orcalibrated predicted probabilities \leq 0 to zero or \geq 1,adjusting the sample size as needed.

Value

val.prob without group returns a vector with the following namedelements: Dxy, R2, D, D:Chi-sq, D:p,U, U:Chi-sq, U:p, Q, Brier,Intercept, Slope, S:z, S:p, Emax.When group is present val.prob returns an object of classval.prob containing a list with summary statistics and calibrationcurves for all the strata plus "Overall".

Author(s)

Frank Harrell
Department of Biostatistics, Vanderbilt University
fh@fharrell.com

References

Harrell FE, Lee KL, Mark DB (1996): Multivariable prognostic models:Issues in developing models, evaluating assumptions and adequacy, andmeasuring and reducing errors. Stat in Med 15:361–387.

Harrell FE, Lee KL (1987): Using logistic calibration to assess theaccuracy of probability predictions (Technical Report).

Miller ME, Hui SL, Tierney WM (1991): Validation techniques forlogistic regression models. Stat in Med 10:1213–1226.

Stallard N (2009): Simple tests for the external validation of mortalityprediction scores. Stat in Med 28:377–388.

Harrell FE, Lee KL (1985): A comparison of the discriminationof discriminant analysis and logistic regression under multivariatenormality. In Biostatistics: Statistics in Biomedical, Public Health,and Environmental Sciences. The Bernard G. Greenberg Volume, ed. PKSen. New York: North-Holland, p. 333–343.

Cox DR (1970): The Analysis of Binary Data, 1st edition, section 4.4.London: Methuen.

Spiegelhalter DJ (1986):Probabilistic prediction in patient management.Stat in Med 5:421–433.

Rufibach K (2010):Use of Brier score to assess binary predictions. JClin Epi 63:938-939

Tjur T (2009):Coefficients of determination in logistic regressionmodels-A new proposal:The coefficient of discrimination. Am Statist63:366–372.

See Also

validate.lrm, lrm.fit, lrm,labcurve,wtd.stats, scat1d

Examples

# Fit logistic model on 100 observations simulated from the actual # model given by Prob(Y=1 given X1, X2, X3) = 1/(1+exp[-(-1 + 2X1)]),# where X1 is a random uniform [0,1] variable. Hence X2 and X3 are # irrelevant. After fitting a linear additive model in X1, X2,# and X3, the coefficients are used to predict Prob(Y=1) on a# separate sample of 100 observations. Note that data splitting is# an inefficient validation method unless n > 20,000.set.seed(1)n <- 200x1 <- runif(n)x2 <- runif(n)x3 <- runif(n)logit <- 2*(x1-.5)P <- 1/(1+exp(-logit))y <- ifelse(runif(n)<=P, 1, 0)d <- data.frame(x1,x2,x3,y)f <- lrm(y ~ x1 + x2 + x3, subset=1:100)pred.logit <- predict(f, d[101:200,])phat <- 1/(1+exp(-pred.logit))val.prob(phat, y[101:200], m=20, cex=.5) # subgroups of 20 obs.# Validate predictions more stringently by stratifying on whether# x1 is above or below the medianv <- val.prob(phat, y[101:200], group=x1[101:200], g.group=2)vplot(v)plot(v, flag=function(stats) ifelse( stats[,'ChiSq2'] > qchisq(.95,2) | stats[,'B ChiSq'] > qchisq(.95,1), '*', ' ') )# Stars rows of statistics in plot corresponding to significant# mis-calibration at the 0.05 level instead of the default, 0.01plot(val.prob(phat, y[101:200], group=x1[101:200], g.group=2), col=1:3) # 3 colors (1 for overall)# Weighted calibration curves# plot(val.prob(pred, y, group=age, weights=freqs))

[Package rms version 6.7-1 Index]

R: Validate Predicted Probabilities (2024)
Top Articles
Latest Posts
Article information

Author: Francesca Jacobs Ret

Last Updated:

Views: 6450

Rating: 4.8 / 5 (68 voted)

Reviews: 91% of readers found this page helpful

Author information

Name: Francesca Jacobs Ret

Birthday: 1996-12-09

Address: Apt. 141 1406 Mitch Summit, New Teganshire, UT 82655-0699

Phone: +2296092334654

Job: Technology Architect

Hobby: Snowboarding, Scouting, Foreign language learning, Dowsing, Baton twirling, Sculpting, Cabaret

Introduction: My name is Francesca Jacobs Ret, I am a innocent, super, beautiful, charming, lucky, gentle, clever person who loves writing and wants to share my knowledge and understanding with you.