We propose a novel two-stage methodology to accurately and efficiently analyze individuals' impediments to receiving the COVID-19 vaccine, with appropriate choices of interpretable variables and their interactions. The first stage, pre-screening, relies on the Bayes factor, a widely used Bayesian tool, to quickly assess the association between each variable and the response. This allows us to filter out apparently irrelevant variables and avoid unnecessary computational burden and modeling challenges. In the second stage, BMARS-based classification, the unknown function is fitted with product spline basis functions, which automatically fine-tune the selection of key variables and their interactions.

### Stage I: Bayes-factor-based pre-screening

In our COVID-19 vaccination data analysis, the dimension of potential key variables is typically too high to apply Bayesian nonparametric models directly. It is therefore necessary to reduce the dimensionality of the variable space. We propose to exploit the model-comparison ability of the Bayes factor and use it as a screening step to reduce the dimension. Since our goal is to predict vaccine impediments, the task is a binary classification problem. We therefore chose a method widely used for classification, the probit model, in which the conditional probability of one of the two possible attitudes toward the vaccine equals a linear combination of the underlying variables transformed by the cumulative distribution function of the standard Gaussian^{30,31}. A common approach for classification is to combine the regression model with a probit link using auxiliary variables. Specifically, in the classification framework we use \(z\) to denote the observed response, which is a binary variable, and \(y\) as the auxiliary variable. We set the binary \(z\) to 1 if \(y>0\) and 0 otherwise. The probabilistic model is defined as \(p(z=1 \mid y)=\Phi (y)\), where \(\Phi\) is the standard Gaussian cumulative distribution function and \(y \sim \mathcal {N}(\varvec{\beta }^\top {\textbf {x}}+\beta _0, \sigma ^2)\), where \({\textbf {x}}\) is the \(p^*\)-dimensional vector of explanatory variables (covariates), \(\varvec{\beta }\) is the vector of regression parameters, and \(\sigma ^2\) is the error variance.

High-dimensional data analysis is always a daunting task. When the dimension \(p^*\) is high, we run into the problem known as "the curse of dimensionality"^{32}. Although high-dimensional variables usually provide more information, they also lead to higher computational costs: the convergence of optimization algorithms or Bayesian samplers in a high-dimensional space is usually very slow. High dimensionality can also harm estimation accuracy, owing to the difficulty of searching such a space. Therefore, effective and accurate variable selection is essential in high-dimensional modeling.

Pre-screening is a popular technique for quickly filtering out unimportant variables, making variable selection more efficient in a much lower-dimensional space using a simpler model (such as a linear model), especially in ultrahigh-dimensional cases. Pre-screening methods usually assume that if a variable is important for predicting the response, it will be marginally associated with the response. Different measures of this association have been studied, for example the p-value^{32,33,34}. However, the pre-screening approach has not been fully explored in the Bayesian paradigm.

We use an off-the-shelf Bayesian tool, the Bayes factor^{35,36}, for pre-screening. More specifically, the Bayes factor is a Bayesian alternative to classical hypothesis testing, and it plays an important role in model comparison and selection. In essence, the Bayes factor measures how strongly the data support one candidate model over another. It is defined as the ratio of the marginal likelihoods of two candidate models, typically regarded as a null and an alternative hypothesis. The general formula is as follows.

$$\begin{aligned} \text {Bayes}\ \text {factor} = \frac{p(D \mid M_1)}{p(D \mid M_2)} = \frac{p(M_1 \mid D)\, p(M_2)}{p(M_2 \mid D)\, p(M_1)}, \end{aligned}$$

where \(D\) denotes the available data and \(M_1\) and \(M_2\) denote two candidate models. A larger value of this ratio indicates more support for \(M_1\), and vice versa.

More specifically, to assess the effect of the \(j\)th variable \(x_{j}\) with corresponding regression parameter \(\beta _{j}\), we calculate the Bayes factor \(\text {BF}_j\) via the probit regression model as follows

$$\begin{aligned} \text {BF}_j = \frac{p({\textbf {z}} \mid \mathscr {H}_1)}{p({\textbf {z}} \mid \mathscr {H}_0)}, \end{aligned}$$

where hypothesis \(\mathscr {H}_1\) assumes that \(y \sim \mathcal {N}(\beta _j x_j+\beta _0, \sigma _{j}^2)\), hypothesis \(\mathscr {H}_0\) assumes that \(y \sim \mathcal {N}(\beta _0, \sigma ^2)\), the prior for \(\beta _j\) is the Gaussian distribution \(p(\beta _j)\sim \mathcal {N}(0,\alpha )\), and conjugate priors are used for the variances.

To compute the intractable marginal likelihood \(p({\textbf {z}} \mid \mathscr {H}_1)\) (integrated over \(\varvec{\beta }\)), we use the Laplace approximation^{37,38,39}. Specifically, under \(\mathscr {H}_1\), the posterior distribution of \(\beta _j\) is

$$\begin{aligned} p(\beta _j \mid D)&\propto p(D \mid \beta _j)\, p(\beta _j) = f(\beta _j), \end{aligned}$$

(1)

$$\begin{aligned} \log f(\beta _j)&= \log p(D \mid \beta _j) + \log p(\beta _j) = \sum _{i=1}^N \log \Phi (z_i\beta _j x_{ij}) - \frac{1}{2}\beta _j^2. \end{aligned}$$

(2)

Let \(\beta _j^*\) be a maximizer of \(f\); we can then calculate the negative Hessian at \(\beta _j^*\)

$$\begin{aligned} A = - \nabla \nabla \log f(\beta _j^*) = \sum _{i=1}^N \left[ v_i(s_i + v_i)x_{ij}^2\right] + 1, \quad v_i = \frac{\mathcal {N}(s_i \mid 0, 1)}{\Phi (s_i)}, \quad s_i = z_i\beta _j^* x_{ij}. \end{aligned}$$

(3)

Then the approximate posterior can be written as \(Q(\beta _j) = \mathcal {N}(\beta _j \mid \beta _j^*, A^{-1})\). Thus, we can approximate the marginal likelihood

$$\begin{aligned} p(D \mid \mathscr {H}_1) \approx \prod _{i=1}^N \int p(z_i \mid \beta _j)\, Q(\beta _j)\, d\beta _j = \prod _{i=1}^N \Phi \left( \frac{z_i\beta _j^* x_{ij}}{\sqrt{x_{ij}A^{-1}x_{ij} + 1}}\right) . \end{aligned}$$

(4)

A larger value of \(\text {BF}_j\) indicates a preference for hypothesis \(\mathscr {H}_1\) over hypothesis \(\mathscr {H}_0\), implying a potential key role of \({\textbf {x}}_j\) in predicting \({\textbf {z}}\). After calculating \(\{\text {BF}_j, j=1,\cdots ,p^*\}\), we can choose the top-ranked variables with respect to \(\text {BF}_j\). Say we select \(p\) explanatory variables out of the \(p^*\) candidates. We then use these \(p\) selected variables \(\varvec{x}\) in the Bayesian nonparametric classification model.
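The Stage-I screen can be sketched in a few lines of code. The snippet below follows Eqs. (1)–(4) for a single candidate variable: it finds the mode \(\beta_j^*\) numerically, forms the negative Hessian \(A\), and compares the Laplace-approximate marginal likelihood against the null with \(\beta_j = 0\), for which \(p(z_i \mid \mathscr{H}_0) = \Phi(0) = 1/2\). Labels are sign-coded (\(z_i \in \{-1,+1\}\)), the intercept is dropped, and all data and names are illustrative assumptions, not the study's implementation.

```python
# Hedged sketch of Bayes-factor pre-screening via Laplace approximation.
# Assumes sign-coded labels z_i in {-1, +1}, a N(0, 1) prior on beta_j,
# and a null model with beta_j = 0, so p(z_i | H0) = Phi(0) = 1/2.
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import norm

def log_f(beta, zs, xj):
    # Log posterior kernel of Eq. (2): probit log-likelihood + log prior
    return norm.logcdf(zs * beta * xj).sum() - 0.5 * beta**2

def log_bayes_factor(zs, xj):
    # Mode beta* of f, found numerically (one-dimensional problem)
    res = minimize_scalar(lambda b: -log_f(b, zs, xj),
                          bounds=(-10, 10), method="bounded")
    b_star = res.x
    # Negative Hessian A at the mode, Eq. (3)
    s = zs * b_star * xj
    v = norm.pdf(s) / norm.cdf(s)
    A = np.sum(v * (s + v) * xj**2) + 1.0
    # Laplace-approximate marginal likelihood under H1, Eq. (4)
    log_m1 = norm.logcdf(zs * b_star * xj / np.sqrt(xj**2 / A + 1.0)).sum()
    log_m0 = len(zs) * norm.logcdf(0.0)   # null model: beta_j = 0
    return log_m1 - log_m0                # log BF_j

# Toy data: one informative and one irrelevant covariate
rng = np.random.default_rng(1)
n = 300
x_signal = rng.normal(size=n)
x_noise = rng.normal(size=n)
zs = np.where(2.0 * x_signal + rng.normal(size=n) > 0, 1.0, -1.0)

log_bf_signal = log_bayes_factor(zs, x_signal)
log_bf_noise = log_bayes_factor(zs, x_noise)
```

Ranking the covariates by their (log) Bayes factors and keeping the top \(p\) is then a one-line sort; on data like the above, the informative covariate receives a much larger value than the irrelevant one.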

### Stage II: BMARS-based classification modeling

In Stage II, we use a flexible nonlinear method to relate the response \(z\) to the explanatory variables selected in Stage I. More specifically, we use Bayesian multivariate adaptive regression splines (BMARS)^{27,28}, a Bayesian version of the flexible nonparametric regression and classification method MARS^{40}. We extend the previously defined linear probit model to nonlinear modeling using product spline basis functions. We use the probit model defined in the previous section: for the \(i\)th observation, \(p(z_{i}=1 \mid y_{i})=\Phi (y_{i})\), \(i=1,\cdots ,n\). Next, we use BMARS to relate the auxiliary variable \(y\) to the explanatory variables \({\textbf {x}}\) through a regression model. In BMARS, the product spline basis functions are used not only to model the unknown function \(f\), but also to automatically select the nonlinear interactions among the variables. The mapping between the selected variables \({\textbf {x}}_i \in \mathscr {R}^p\) and the auxiliary variable \(y_i\) is as follows

$$\begin{aligned} y_i&= f({\textbf {x}}_i) + \varepsilon _i, \quad \hat{f}({\textbf {x}}_i) = \sum _{j=1}^m \alpha _j B_j({\textbf {x}}_i), \quad \varepsilon _i {\mathop {\sim }\limits ^{\text {i.i.d.}}} \mathcal {N}(0, \sigma ^2), \end{aligned}$$

(5)

where \(m\) is the number of basis functions and \(\alpha _j\) denotes the coefficient of the basis function \(B_j\), which is defined as

$$\begin{aligned} B_j({\textbf {x}}_i) = \left\{ \begin{array}{ll} 1, &{} j=1, \\ \prod _{q=1}^{Q_j} \left[ s_{qj}\cdot ({\textbf {x}}_{i,v(q,j)} - t_{qj})\right] _+, &{} j\in \{2,3,\cdots ,m\}, \end{array} \right. \end{aligned}$$

(6)

where \(s_{qj} \in \{-1,1\}\), \(v(q,j)\) denotes the index of the variable entering the \(q\)th factor, the indices in the set \(\{v(q,j);q=1,\cdots ,Q_j\}\) are not repeated, \(t_{qj}\) is the partition (knot) location, \((\cdot )_+ = \max (0,\cdot )\), and \(Q_j\) is the polynomial degree of the basis function \(B_j\), which also indicates the number of variables involved in \(B_j\).
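The product basis of Eq. (6) is easy to evaluate directly. The helper below is an illustrative sketch (the function name and argument layout are ours, not from the BMARS reference): each later basis is a product of hinge factors \([s \cdot (x_v - t)]_+\) over its \(Q_j\) distinct variables, and an empty specification recovers the constant basis \(B_1 = 1\).

```python
# Illustrative evaluation of one product spline basis B_j from Eq. (6).
import numpy as np

def mars_basis(x, signs, var_idx, knots):
    """Evaluate B_j at a single input vector x.

    signs, var_idx, knots encode (s_qj, v(q, j), t_qj) for q = 1..Q_j;
    empty lists return the constant basis B_1 = 1.
    """
    value = 1.0
    for s, v, t in zip(signs, var_idx, knots):
        value *= max(0.0, s * (x[v] - t))   # one hinge factor
    return value

x = np.array([0.8, -0.3, 1.5])
b1 = mars_basis(x, [], [], [])                    # constant basis, Q_j = 0
b2 = mars_basis(x, [+1], [0], [0.5])              # single hinge: (x_0 - 0.5)_+
b3 = mars_basis(x, [+1, -1], [0, 1], [0.5, 0.0])  # two-way interaction, Q_j = 2
```

Because each factor is zero below (or above, for \(s_{qj}=-1\)) its knot, a basis function is active only on a rectangular region of the covariate space, which is what makes the selected interactions interpretable.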

For the probit model, the posterior distribution is not available in explicit form, so we use a Markov chain Monte Carlo (MCMC) algorithm to simulate from it. Since the model dimension \(m\) is unknown, we use the reversible jump Metropolis-Hastings algorithm^{41}. More specifically, the model parameters of interest within the Bayesian framework of BMARS^{27} include the number of basis functions \(m\), as well as their degrees of interaction \(Q_j\), their coefficients \(\alpha _j\), their associated split points \(t_{qj}\), and the sign indicators \(s_{qj}\). We write \(\varvec{\theta }^{(m)} = \{ \mathscr {B}_1,\cdots ,\mathscr {B}_m \}\), where \(\mathscr {B}_j\) denotes the parameters \((Q_j, \alpha _j, t_{1j}, \cdots , t_{Q_j,j}, s_{1j}, \cdots , s_{Q_j,j})\) of the basis function \(B_j\). Then the hierarchical model can be written as

$$\begin{aligned} p(m, \varvec{\theta }^{(m)}, {\textbf {y}}) = p(m)\, p(\varvec{\theta }^{(m)} \mid m)\, p({\textbf {y}} \mid m, \varvec{\theta }^{(m)}), \end{aligned}$$

(7)

and the joint posterior for the parameters \(m\) and \(\varvec{\theta }^{(m)}\) can be written in the following factorized form

$$\begin{aligned} p(m, \varvec{\theta }^{(m)} \mid {\textbf {y}}) = p(m \mid {\textbf {y}})\, p(\varvec{\theta }^{(m)} \mid m,{\textbf {y}}). \end{aligned}$$

(8)

In this algorithm, we update the model randomly using one of three moves: (a) changing a knot position, (b) creating a basis function, or (c) deleting a basis function, and then correct the proposed sample through a Metropolis-Hastings step^{42,43}. Under this sampling scheme, samples built on important variables are more likely to be accepted, which enables automatic feature selection by the algorithm and is crucial for drawing policy implications.
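The three structural moves can be illustrated with a deliberately simplified toy sampler. The sketch below is not the full reversible-jump algorithm of BMARS: coefficients are profiled out by least squares, a BIC-style penalty stands in for the prior on model size, and the dimension-matching terms of reversible jump are omitted. It only shows how change/birth/death proposals on the knot set are filtered through a Metropolis-style accept step; all names and values are our own illustrative assumptions.

```python
# Toy change/birth/death search over one-dimensional hinge bases,
# with least-squares coefficients and a BIC-style size penalty.
# A simplification of reversible-jump MCMC, for illustration only.
import numpy as np

rng = np.random.default_rng(2)
n = 150
x = rng.uniform(-1, 1, size=n)
y = np.maximum(0.0, x - 0.2) + 0.1 * rng.normal(size=n)  # one hinge + noise

def design(knots):
    # Constant basis plus one hinge basis per knot
    cols = [np.ones(n)] + [np.maximum(0.0, x - t) for t in knots]
    return np.column_stack(cols)

def log_score(knots):
    B = design(knots)
    coef, *_ = np.linalg.lstsq(B, y, rcond=None)
    rss = np.sum((y - B @ coef) ** 2)
    # Gaussian log-likelihood up to constants, minus a BIC-style penalty
    return -0.5 * n * np.log(rss / n) - 0.5 * len(knots) * np.log(n)

knots = []                                   # start from the constant model
for _ in range(500):
    move = rng.choice(["change", "birth", "death"])
    proposal = list(knots)
    if move == "birth":
        proposal.append(rng.uniform(-1, 1))           # create a basis function
    elif move == "death" and proposal:
        proposal.pop(rng.integers(len(proposal)))     # delete a basis function
    elif move == "change" and proposal:
        proposal[rng.integers(len(proposal))] = rng.uniform(-1, 1)  # move a knot
    # Metropolis-style accept/reject on the penalized fit
    if np.log(rng.uniform()) < log_score(proposal) - log_score(knots):
        knots = proposal
```

On data generated by a single hinge, the search quickly accepts a birth move and keeps at least one knot; proposals that do not improve the penalized fit enough are rejected, which is the mechanism by which unimportant structure is filtered out.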