Title: | Compute FAB (Frequentist and Bayes) Conformal Prediction Intervals |
---|---|
Description: | Computes and plots prediction intervals for numerical data or prediction sets for categorical data using prior information. Empirical Bayes procedures to estimate the prior information from multi-group data are included. See, e.g., Bersson and Hoff (2022) <arXiv:2204.08122> "Optimal Conformal Prediction for Small Areas". |
Authors: | Elizabeth Bersson [aut, cre, cph] |
Maintainer: | Elizabeth Bersson <[email protected]> |
License: | GPL-3 |
Version: | 1.0.4 |
Built: | 2024-11-22 04:37:34 UTC |
Source: | https://github.com/betsybersson/fabprediction |
This function computes the Bayesian prediction set for a multinomial conjugate family.
bayesMultinomialPrediction( Y, alpha = 0.15, gamma = rep(1, length(Y)), category_names = 1:length(Y) )
Y |
Observed data vector of length K containing counts of observations from each of the K categories |
alpha |
Prediction mis-coverage rate |
gamma |
Dirichlet prior concentration for the K categories |
category_names |
Category names (optional) |
pred object
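As a rough illustration of the idea, the following base-R sketch forms a Bayesian prediction set under a Dirichlet-multinomial model. It assumes the set is built by adding categories in decreasing order of posterior predictive probability until the covered mass reaches 1 - alpha; the helper name `bayes_set_sketch` is hypothetical, and the package's internal rule and return structure may differ.

```r
# Hypothetical helper, not part of the package.
bayes_set_sketch <- function(Y, alpha = 0.15, gamma = rep(1, length(Y))) {
  # Posterior predictive probability of each category for one new draw
  pred_prob <- (Y + gamma) / sum(Y + gamma)
  # Visit categories from most to least probable
  ord <- order(pred_prob, decreasing = TRUE)
  # Keep the smallest prefix whose cumulative mass reaches 1 - alpha
  n_keep <- which(cumsum(pred_prob[ord]) >= 1 - alpha - 1e-12)[1]
  list(set = sort(ord[seq_len(n_keep)]), prob = pred_prob)
}

res <- bayes_set_sketch(Y = c(8, 1, 1, 0), alpha = 0.15)
res$set
```

With a uniform prior (`gamma` all ones), sparsely observed categories still receive positive predictive mass, which is why a category with a single observation can enter the set.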
This function computes a Bayesian prediction interval based on a normal model.
bayesNormalPrediction(Y, alpha = 0.15, mu = 0, tau2 = 1)
Y |
Observed data vector |
alpha |
Prediction error rate |
mu |
Prior expected mean of the population mean |
tau2 |
Prior expected variance of the population mean |
pred object
This function computes a conformal prediction region under the distance-from-average non-conformity measure. That is, $|a + b z^*| \le |c_i + d_i z^*|$, where $i$ indexes the training data.
dtaPrediction(Y, alpha = 0.15)
Y |
Observed data vector |
alpha |
Prediction error rate |
pred object
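The distance-from-average measure can be illustrated with a brute-force, grid-based version of full conformal prediction. This sketch is an assumption-laden stand-in: the helper `dta_conformal_sketch` and its `grid` argument are hypothetical, and the package presumably solves the inequalities exactly rather than scanning a grid.

```r
# Hypothetical helper: grid-based full conformal prediction with the
# distance-from-average score |y - mean of the augmented sample|.
dta_conformal_sketch <- function(Y, alpha = 0.15, grid = NULL) {
  if (is.null(grid)) {
    pad <- 3 * diff(range(Y))
    grid <- seq(min(Y) - pad, max(Y) + pad, length.out = 2001)
  }
  n <- length(Y)
  keep <- vapply(grid, function(z) {
    aug <- c(Y, z)                      # training data plus candidate value
    scores <- abs(aug - mean(aug))      # non-conformity score of every point
    # Conformal p-value: share of scores at least as large as the candidate's
    mean(scores >= scores[n + 1]) > alpha
  }, logical(1))
  range(grid[keep])
}

set.seed(1)
Y <- rnorm(20)
interval <- dta_conformal_sketch(Y, alpha = 0.15)
interval
```

A candidate value is retained when its score is not among the most extreme in the augmented sample, so the retained values surround the sample mean.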
This function returns an NxN identity matrix.
eye(N)
N |
dimension of square matrix |
NxN identity matrix
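For comparison, base R produces the same kind of matrix with `diag()`. The snippet below uses only base R and makes no claim about how `eye` is implemented.

```r
N <- 3
I3 <- diag(nrow = N)   # base R: N x N identity matrix

# Multiplying by the identity leaves a matrix unchanged
A <- matrix(1:9, nrow = 3)
all(I3 %*% A == A)
```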
This function computes a FAB conformal prediction set as described in Bersson and Hoff 2023.
fabCategoricalPrediction( Y, alpha = 0.15, gamma = rep(1, length(Y)), category_names = 1:length(Y) )
Y |
Observed data vector of length K containing counts of observations from each of the K categories |
alpha |
Prediction mis-coverage rate |
gamma |
Dirichlet prior concentration for the K categories |
category_names |
Category names (optional) |
pred object
This function computes a FAB conformal prediction region as described in Bersson and Hoff 2022.
fabContinuousPrediction(Y, alpha = 0.15, mu = 0, tau2 = 1)
Y |
Observed data vector |
alpha |
Prediction error rate |
mu |
Prior expected mean of the population mean |
tau2 |
Prior expected variance of the population mean |
pred object
A package for computing and plotting prediction intervals for numerical data or prediction sets for categorical data using prior information. Empirical Bayes procedures to estimate the prior information from multi-group data are included.
Elizabeth Bersson
Maintainer: Elizabeth Bersson <[email protected]>
E. Bersson and P.D. Hoff. (2023) Frequentist Prediction Sets for Species Abundance using Indirect Information. Preprint.
E. Bersson and P.D. Hoff. (2023) Optimal Conformal Prediction for Small Areas. Journal of Survey Statistics and Methodology, forthcoming.
This function returns empirical Bayesian estimates for a specified group from the conjugate normal spatial Fay-Herriot model.
fayHerriotEB(j, Y, group, W = NA, X = NA)
j |
Index of the group for which EB values are obtained; a numeric value that appears in group |
Y |
Data vector |
group |
Index vector of the same length as Y |
W |
Non-standardized adjacency matrix |
X |
Group-level covariates |
empirical Bayesian estimates of the population mean and its variance
Method of moment matching to obtain an initial guess of the MLE, as in Minka (2000).
initMoM(D)
D |
matrix (JxK) of counts; each row is a sample from a MN distribution with K categories |
initial guess of the MLE
This function computes a prediction interval under assumed normality.
normalPrediction(Y, alpha = 0.15)
Y |
Observed data vector |
alpha |
Prediction error rate |
pred object
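For reference, the textbook prediction interval for one new observation from an i.i.d. normal sample with unknown mean and variance is the sample mean plus or minus a t quantile times s * sqrt(1 + 1/n). The sketch below implements that formula in base R; the helper name `normal_pi_sketch` is hypothetical, and whether `normalPrediction` uses exactly this form is an assumption.

```r
# Hypothetical helper: classical normal-theory prediction interval.
normal_pi_sketch <- function(Y, alpha = 0.15) {
  n <- length(Y)
  # t quantile with n - 1 degrees of freedom; sqrt(1 + 1/n) accounts for
  # uncertainty in both the new draw and the estimated mean
  half_width <- qt(1 - alpha / 2, df = n - 1) * sd(Y) * sqrt(1 + 1 / n)
  c(lower = mean(Y) - half_width, upper = mean(Y) + half_width)
}

Y <- c(1.2, 0.7, 1.9, 1.4, 0.9, 1.6)
pi85 <- normal_pi_sketch(Y, alpha = 0.15)
pi95 <- normal_pi_sketch(Y, alpha = 0.05)
```

Lowering `alpha` widens the interval, since a smaller error rate requires more coverage.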
Plot a 'pred' object constructed for a categorical response
## S3 method for class 'pred'
plot(x, ...)
x |
pred object: a list of class pred containing the objects data and bound |
... |
additional parameters passed to the default plot method |
Plots the pred object. In more detail, the command 'plot(obj)' plots the empirical densities of each category; mass shown in red indicates inclusion in the prediction set.
This function returns plug-in values for a conjugate normal spatial Fay-Herriot model.
pluginValues(Y, group, W = NA, X = NA)
Y |
Data vector |
group |
Group membership of each entry in Y |
W |
Adjacency matrix |
X |
Group-level covariates |
plug-in values of spatial Fay-Herriot model
Obtain gradient of the marginal Dirichlet-multinomial likelihood
polyaGradient(D, gamma, Nj = rowSums(D), K = ncol(D))
D |
matrix (JxK) of counts; each row is a sample from a MN distribution with K categories |
gamma |
current value of prior concentration parameter |
Nj |
sample sizes of the J groups |
K |
number of categories |
gradient
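The gradient can be written down from the standard Dirichlet-multinomial (Polya) log-likelihood, whose derivative with respect to the k-th concentration is a sum of digamma differences over the J groups. The sketch below is derived from that standard formula, not from the package source; the helper names `polya_loglik_sketch` and `polya_gradient_sketch` are hypothetical.

```r
# Hypothetical helpers based on the standard Polya log-likelihood
# (multinomial coefficients are dropped because they do not depend on gamma).
polya_loglik_sketch <- function(D, gamma) {
  A <- sum(gamma)
  Nj <- rowSums(D)
  sum(lgamma(A) - lgamma(Nj + A)) +
    sum(lgamma(sweep(D, 2, gamma, "+"))) - nrow(D) * sum(lgamma(gamma))
}

polya_gradient_sketch <- function(D, gamma) {
  A <- sum(gamma)
  Nj <- rowSums(D)
  # Term shared by every coordinate
  common <- sum(digamma(A) - digamma(Nj + A))
  # Coordinate-specific term, summed over the J groups
  specific <- colSums(digamma(sweep(D, 2, gamma, "+"))) -
    nrow(D) * digamma(gamma)
  common + specific
}

D <- matrix(c(3, 1, 0,
              2, 2, 1,
              5, 0, 1), nrow = 3, byrow = TRUE)
gamma <- c(1.5, 1, 0.5)
grad <- polya_gradient_sketch(D, gamma)
```

A finite-difference comparison against `polya_loglik_sketch` is a convenient sanity check for an analytic gradient of this kind.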
Obtain Hessian of the marginal Dirichlet-multinomial likelihood
polyaHessian(D, gamma, Nj = rowSums(D), K = ncol(D))
D |
matrix (JxK) of counts; each row is a sample from a MN distribution with K categories |
gamma |
current value of prior concentration parameter |
Nj |
sample sizes of the J groups |
K |
number of categories |
Hessian
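Under the same standard Dirichlet-multinomial log-likelihood, the Hessian is a constant matrix plus a diagonal matrix, both built from trigamma differences. The sketch below follows that standard form; `polya_hessian_sketch` is a hypothetical name and is not taken from the package source.

```r
# Hypothetical helper based on the standard Polya log-likelihood.
polya_hessian_sketch <- function(D, gamma) {
  A <- sum(gamma)
  Nj <- rowSums(D)
  K <- ncol(D)
  # Contribution shared by every entry of the Hessian
  common <- sum(trigamma(A) - trigamma(Nj + A))
  # Extra contribution on the diagonal only
  diag_part <- colSums(trigamma(sweep(D, 2, gamma, "+"))) -
    nrow(D) * trigamma(gamma)
  matrix(common, K, K) + diag(diag_part, nrow = K)
}

# One group, two categories, a single observation in category 1
D <- matrix(c(1, 0), nrow = 1)
H <- polya_hessian_sketch(D, gamma = c(1, 1))
H
```

Because the matrix is a diagonal plus a rank-one term, it can be inverted cheaply, which is what makes Newton-type updates practical for this likelihood.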
This function returns the MLE of the prior concentration from a marginal Dirichlet-multinomial likelihood. The default method iterates a Newton-Raphson algorithm until convergence.
polyaMLE( D, init = NA, method = "Newton_Raphson", epsilon = 1e-04, print_progress = FALSE )
D |
matrix (JxK) of counts; each row is a sample from a MN distribution with K categories |
init |
If NA, use the moment-matching procedure to obtain good initial values |
method |
"Newton_Raphson", "fixed_point", "separate", "precision_only" |
epsilon |
convergence diagnostic |
print_progress |
if TRUE, print progress to screen |
mle of prior concentration from marginal Dirichlet-multinomial likelihood
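To give a feel for the iteration, here is a base-R sketch of the well-known fixed-point update for the Dirichlet-multinomial concentration (as in Minka, 2000). It illustrates a fixed-point scheme, not the default Newton-Raphson method, and it is not the package's code: the helper `polya_fixed_point_sketch`, its `max_iter` argument, the all-ones starting value, and the simulated data are assumptions made for the example.

```r
# Hypothetical helper: fixed-point iteration for the Polya MLE.
polya_fixed_point_sketch <- function(D, init = rep(1, ncol(D)),
                                     epsilon = 1e-4, max_iter = 1000) {
  gamma <- init
  Nj <- rowSums(D)
  for (iter in seq_len(max_iter)) {
    A <- sum(gamma)
    num <- colSums(digamma(sweep(D, 2, gamma, "+"))) -
      nrow(D) * digamma(gamma)
    den <- sum(digamma(Nj + A) - digamma(A))
    gamma_new <- gamma * num / den
    # Stop when the largest coordinate change falls below the tolerance
    if (max(abs(gamma_new - gamma)) < epsilon) return(gamma_new)
    gamma <- gamma_new
  }
  gamma
}

# Simulated multi-group counts: 200 groups, 3 categories, 30 draws per group
set.seed(1)
gamma_true <- c(2, 1, 0.5)
D <- t(replicate(200, {
  p <- rgamma(3, shape = gamma_true)          # unnormalized Dirichlet draw
  rmultinom(1, size = 30, prob = p / sum(p))[, 1]
}))
gamma_hat <- polya_fixed_point_sketch(D)
```

Fixed-point updates are simple and keep the estimate positive, but they typically need more iterations than Newton-type steps.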
This function computes a prediction interval from a number of methods.
predictionInterval(Y, method = "FAB", alpha = 0.15, mu = 0, tau2 = 1)
Y |
Observed data vector |
method |
Choice of prediction method. Options include FAB, DTA, direct, Bayes. |
alpha |
Prediction error rate |
mu |
Prior expected mean of the population mean |
tau2 |
Prior expected variance of the population mean |
pred object containing prediction interval bounds and interval coverage
# example data
data(radon)
y_county9 = radon$radon[radon$group==9]
fab.region = predictionInterval(y_county9, method = "FAB", alpha = .15, mu = 0.5, tau2 = 1)
fab.region$bounds
plot(fab.region)
This function computes a prediction set from a number of methods.
predictionSet( Y, method = "FAB", alpha = 0.15, gamma = rep(1, length(Y)), category_names = 1:length(Y) )
Y |
Observed data vector |
method |
Choice of prediction method. Options include FAB, direct, Bayes. |
alpha |
Prediction mis-coverage rate |
gamma |
Dirichlet prior concentration for FAB/Bayes methods |
category_names |
Category names (optional) |
pred object containing prediction set and interval coverage
# obtain example categorical data
set.seed(1)
prob = rdirichlet(50:1)
y = rmultinom(1,15,prob)
fab.set = predictionSet(y, method = "FAB", gamma = c(50:1))
plot(fab.set)
Data from a national US EPA survey of household radon values. The county index is contained in the group column.
data(radon)
A matrix.
US Environmental Protection Agency (1992) National residential radon survey: summary report. Washington, DC; DOI EPA402-R-92-011.
Generate a random sample from a Dirichlet distribution
rdirichlet(gamma)
gamma |
Prior concentration vector of length K |
a vector of length K that is a random sample from a Dirichlet distribution
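One standard way to draw from a Dirichlet distribution is to normalize independent gamma variates whose shapes equal the concentration parameters. The base-R sketch below shows that construction; `rdirichlet_sketch` is a hypothetical name, and whether the package's `rdirichlet` uses this construction is an assumption.

```r
# Hypothetical helper: Dirichlet draw via normalized gamma variates.
rdirichlet_sketch <- function(gamma) {
  x <- rgamma(length(gamma), shape = gamma, rate = 1)
  x / sum(x)   # normalize so the components sum to one
}

set.seed(1)
p <- rdirichlet_sketch(c(5, 3, 2))
p
```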
Row standardize a matrix
row_standardize(W)
W |
matrix |
row-standardized matrix
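Row standardization is usually understood as dividing each row by its sum, so that every row sums to one. The base-R sketch below shows that operation on a small adjacency matrix; it assumes this conventional definition and does not reproduce the package's code.

```r
# Small adjacency matrix: unit 1 neighbors units 2 and 3
W <- matrix(c(0, 1, 1,
              1, 0, 0,
              1, 0, 0), nrow = 3, byrow = TRUE)

# Dividing by rowSums recycles down the columns, so entry [i, j] is
# divided by the sum of row i. Rows that sum to zero would yield NaN.
W_std <- W / rowSums(W)
W_std
```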
Adjacency matrix for MN counties based on group index that matches radon data.
data(W)
A matrix.