Research Article

Bayesian expectile regression with single-index models

[version 1; peer review: awaiting peer review]
PUBLISHED 15 Apr 2026

This article is included in the Fallujah Multidisciplinary Science and Innovation gateway.

Abstract

Single-index expectile regression models provide a flexible semiparametric regression framework for high-dimensional covariates and capture parameter heterogeneity and nonlinearity, especially when interest lies in different parts of the conditional distribution of the outcome. Bayesian approaches have not yet been studied for such regression models. In this paper, we propose a Bayesian single-index expectile regression model that uses the asymmetric normal distribution (AND) as the error distribution. We design an MCMC method for posterior estimation. Simulations and a real data analysis suggest that the proposed approach performs well compared with an existing non-Bayesian approach.

Keywords

Bayesian inference, Expectile regression, Single-index model, Asymmetric normal distribution

1. Introduction

Expectiles of a probability distribution $F$, like its quantiles, characterize different points of the distribution, but they are determined by tail expectations rather than tail probabilities. Expectiles depend on both the tail realizations and their probabilities, while quantiles depend only on the frequency of tail observations. There exists a one-to-one mapping from expectiles to quantiles [27], i.e. for each $\tau$th expectile there is a corresponding $\theta$th quantile, where $\tau, \theta \in (0,1)$. Hence, expectiles can be used to estimate quantiles.

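The $\tau$th expectile of $F$ is the quantity $e_\tau$ that satisfies

\[
\tau = \frac{\int_{-\infty}^{e_\tau} |x - e_\tau| \, dF(x)}{\int_{-\infty}^{\infty} |x - e_\tau| \, dF(x)}. \tag{1}
\]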

Expectiles generalize the mean, and the expectile loss generalizes the mean squared error, in the same way that quantiles generalize the median and the quantile loss generalizes the mean absolute error. The standard regression model aims to estimate the conditional expectation of the outcome variable $y$ given the vector of covariates $x$, i.e., $E(y \mid x)$. In many applications, however, it is necessary to study the conditional distribution beyond its mean. A convenient tool for this purpose is expectile regression, introduced in [17], which models the relationship between the covariates and the conditional expectiles of the outcome variable. The methodology generalizes mean regression and is closely related to quantile regression. Expectiles are the points that minimize an asymmetric quadratic loss function $\ell_\tau(t) = t^2 \, |\tau - I(t < 0)|$, rather than the absolute (check) loss function $\ell_\theta(t) = t \, (\theta - I(t < 0))$ used in quantile regression. The expectile level $\tau$ and the quantile level $\theta$ determine the degree of asymmetry of the respective loss functions. Examining the asymmetric quadratic loss function reveals many of the properties that make expectiles an attractive measure of risk.
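To make the two loss functions and the defining ratio in (1) concrete, the following is a minimal Python sketch (standard library only). The function names and the toy sample are illustrative assumptions, not part of the paper or of the authors' R code. The sample expectile is found by iterating the weighted-mean fixed point that results from setting the derivative of the summed expectile loss to zero.

```python
def expectile_loss(t, tau):
    """Asymmetric quadratic loss t^2 * |tau - I(t < 0)|."""
    indicator = 1.0 if t < 0 else 0.0
    return t * t * abs(tau - indicator)


def check_loss(t, theta):
    """Quantile (check) loss t * (theta - I(t < 0))."""
    indicator = 1.0 if t < 0 else 0.0
    return t * (theta - indicator)


def sample_expectile(xs, tau, tol=1e-12, max_iter=1000):
    """Minimise sum_i expectile_loss(x_i - e, tau) over e.

    The first-order condition is a weighted mean whose weights depend on e,
    so we iterate that fixed point starting from the ordinary mean.
    """
    e = sum(xs) / len(xs)
    for _ in range(max_iter):
        weights = [tau if x >= e else 1.0 - tau for x in xs]
        new_e = sum(w * x for w, x in zip(weights, xs)) / sum(weights)
        if abs(new_e - e) < tol:
            return new_e
        e = new_e
    return e


def lower_tail_ratio(xs, e):
    """Empirical analogue of the right-hand side of (1)."""
    below = sum(abs(x - e) for x in xs if x < e)
    total = sum(abs(x - e) for x in xs)
    return below / total


# Hypothetical toy sample, used only for illustration.
data = [1.0, 2.0, 4.0, 7.0, 11.0]
for tau in (0.1, 0.5, 0.9):
    e_tau = sample_expectile(data, tau)
    print(tau, round(e_tau, 4), round(lower_tail_ratio(data, e_tau), 4))
```

For $\tau = 0.5$ all weights are equal, so the sample expectile reduces to the sample mean; for other levels the printed ratio should reproduce $\tau$ up to rounding.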

Compared to the mean, expectiles are more sensitive to extreme values and to the shape of the distribution in general. Furthermore, standard regression implicitly assumes normally distributed residuals, while such an assumption is not necessary in expectile regression. Expectile regression often leads to more efficient estimators than quantile regression, especially when the underlying error distribution is close to normal or when interest lies in extreme values of the conditional distribution. Expectile regression models can be estimated with iterative algorithms, such as weighted least squares or stochastic gradient descent, and reliable estimation approaches have been developed in both the classical and Bayesian literatures.
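As an illustration of the weighted least squares idea mentioned above, here is a minimal Python sketch for a linear expectile regression with one covariate. It is a generic textbook-style scheme under stated assumptions (a straight-line model, hypothetical function names and toy data), not the single-index estimator studied in this paper: the weights $\tau$ and $1-\tau$ are recomputed from the signs of the current residuals and the weighted line is refitted until the coefficients stop changing.

```python
def weighted_line_fit(x, y, w):
    """Closed-form weighted least squares for y = a + b * x."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def expectile_line(x, y, tau, tol=1e-10, max_iter=200):
    """Iteratively reweighted least squares for the tau-th expectile line."""
    a, b = weighted_line_fit(x, y, [1.0] * len(x))  # start from OLS
    for _ in range(max_iter):
        # Weight tau for non-negative residuals, 1 - tau for negative ones.
        w = [tau if yi - (a + b * xi) >= 0 else 1.0 - tau
             for xi, yi in zip(x, y)]
        a_new, b_new = weighted_line_fit(x, y, w)
        if abs(a_new - a) + abs(b_new - b) < tol:
            return a_new, b_new
        a, b = a_new, b_new
    return a, b


# Hypothetical toy data, used only for illustration.
x_demo = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
y_demo = [0.1, 1.9, 4.3, 5.8, 8.4, 9.7]
for tau in (0.1, 0.5, 0.9):
    print(tau, expectile_line(x_demo, y_demo, tau))
```

At $\tau = 0.5$ the weights are constant and the fit coincides with ordinary least squares.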

Since its inception, expectile regression has attracted considerable interest in the literature. It has been applied in many different areas: finance and risk management [3, 6, 14], actuarial science [4, 7, 16], ecology and environmental studies [10, 21, 24], social sciences [19, 28], and so on. In recent decades, there has also been considerable interest in nonparametric and semiparametric models. The single-index model provides an efficient way of coping with high-dimensional nonparametric estimation problems, offers more flexibility, and captures parameter heterogeneity and nonlinearity. Expectile regression with single-index models is an efficient method for modeling asymmetric relationships in data while achieving dimension reduction. There is a large literature on classical methods for expectile regression with single-index models; we refer to [12, 13] for an overview. In contrast, a Bayesian method for estimating expectile regression with a single-index model has not yet been proposed.

In this paper, we consider a single-index expectile regression model. For a given expectile level $0 < \tau < 1$ and training data $\{(x_i, y_i)\}_{i=1}^{n}$, it is given by

\[
E_{y_i \mid x_i}(\tau) = \phi(x_i^\top \beta), \quad i = 1, \dots, n.
\]

Here, $E_{y_i \mid x_i}(\tau)$ is the $\tau$th expectile function of $y_i$ given $x_i$, $x_i \in \mathbb{R}^k$ is the covariate vector for the $i$-th observation, $y_i$ is the response corresponding to the covariate vector $x_i$, $\phi(\cdot)$ is the unknown univariate link function, and $\beta = (\beta_1, \beta_2, \dots, \beta_k)^\top$ is the parametric index vector, which implicitly depends on the desired expectile level $\tau$. Following [15] and [12], for the sake of identifiability we assume that $\|\beta\| = 1$ and that the first component of $\beta$ is positive, where $\|\cdot\|$ denotes the Euclidean norm.
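The identifiability constraint can be illustrated with a short Python sketch. The helper names are hypothetical; the code only rescales a raw index vector to unit Euclidean norm, flips its sign when the first component is negative, and computes the one-dimensional index for a covariate vector.

```python
import math


def normalize_index(beta):
    """Rescale beta to unit Euclidean norm with a non-negative first component."""
    norm = math.sqrt(sum(b * b for b in beta))
    if norm == 0.0:
        raise ValueError("the index vector must be non-zero")
    unit = [b / norm for b in beta]
    if unit[0] < 0:
        # Flipping the sign of beta corresponds to reflecting the unknown link.
        unit = [-b for b in unit]
    return unit


def single_index(x, beta):
    """One-dimensional index x' beta that is passed to the link function."""
    return sum(xj * bj for xj, bj in zip(x, beta))


beta_raw = [-3.0, 4.0]           # hypothetical raw index vector
beta_id = normalize_index(beta_raw)
print(beta_id, single_index([1.0, 2.0], beta_id))
```

Without such a constraint, any rescaling of the index vector could be absorbed into the unknown link function, so the vector would not be uniquely determined.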

The single-index regression model is a form of dimension reduction in regression: the covariate vector $x_i$ is reduced to a one-dimensional index, so that $\phi(x_i^\top \beta)$ is a univariate function instead of a $k$-variate one, which allows for more interpretable and efficient modeling of the outcome of interest. In this paper, we establish a hierarchical Bayesian model using the asymmetric normal distribution (AND). As shown in Figure 1, the shape of the AND depends on the expectile level $\tau$. For detailed information on Bayesian expectile regression methods, see [20, 23, 25, 26]. Following [11] and [29], we assign a Gaussian process prior to $\phi$ to obtain a flexible nonparametric expectile regression model.

Figure 1. Probability density functions of the AND with $\sigma^2 = 1$ and $\tau = 0.10, 0.25, 0.50$ [30].

This paper proceeds as follows. In Section 2, we introduce single-index expectile regression and the proposed Bayesian hierarchical model, and derive the corresponding MCMC sampler. Simulation studies are presented in Section 3, followed by a real data example in Section 4. Conclusions are given in Section 5.

2. Methods

2.1 Single-Index expectile regression

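In single-index expectile regression, the regression coefficients $\beta$ can be estimated by optimizing the following empirical loss function

\[
\min_{\beta} \sum_{i=1}^{n} \ell_\tau\big(y_i - \phi(x_i^\top \beta)\big), \tag{2}
\]
where the loss function is
\[
\ell_\tau\big(y_i - \phi(x_i^\top \beta)\big) = \big(y_i - \phi(x_i^\top \beta)\big)^2 \, \big|\tau - I\big(y_i - \phi(x_i^\top \beta) < 0\big)\big|. \tag{3}
\]

Equivalently, we may write (3) as

\[
\ell_\tau\big(y_i - \phi(x_i^\top \beta)\big) =
\begin{cases}
\tau \big(y_i - \phi(x_i^\top \beta)\big)^2, & \text{if } y_i - \phi(x_i^\top \beta) \ge 0, \\
(1 - \tau) \big(y_i - \phi(x_i^\top \beta)\big)^2, & \text{if } y_i - \phi(x_i^\top \beta) < 0.
\end{cases}
\]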
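The empirical objective (2) can be sketched in Python as follows. In the paper the link function is unknown; purely for illustration, this sketch assumes a known link (here `math.sin`) and uses hypothetical names and toy data.

```python
import math


def expectile_objective(beta, X, y, phi, tau):
    """Sum over i of the asymmetric quadratic loss of y_i - phi(x_i' beta)."""
    total = 0.0
    for x_i, y_i in zip(X, y):
        index = sum(x_ij * b_j for x_ij, b_j in zip(x_i, beta))
        resid = y_i - phi(index)
        # Piecewise form of (3): weight tau if resid >= 0, else 1 - tau.
        weight = tau if resid >= 0 else 1.0 - tau
        total += weight * resid * resid
    return total


# Hypothetical toy data generated without noise from an assumed link.
X_demo = [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
beta_true = [0.6, 0.8]
y_demo = [math.sin(sum(a * b for a, b in zip(x, beta_true))) for x in X_demo]

print(expectile_objective(beta_true, X_demo, y_demo, math.sin, 0.25))
print(expectile_objective([1.0, 0.0], X_demo, y_demo, math.sin, 0.25))
```

The objective vanishes at the index vector that generated the noise-free responses and is positive elsewhere, which is the behaviour an optimizer or a posterior sampler would exploit.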

Rather than minimizing the expectile loss (2) directly, we construct a Markov chain whose stationary distribution is the joint posterior of the expectile regression coefficients $\beta$, with a likelihood that is maximized at the minimizer of (2). Minimizing the asymmetric quadratic loss (3) is equivalent to maximizing a likelihood based on the AND; see [20, 23, 25]. The density function of an AND$(\mu_i, \sigma^2, \tau)$ is

\[
p(y_i) = \sqrt{\frac{2}{\sigma^2 \pi}} \left( \frac{\sqrt{\tau(1-\tau)}}{\sqrt{\tau} + \sqrt{1-\tau}} \right) \exp\left( -\frac{v_i \big(y_i - \phi(x_i^\top \beta)\big)^2}{2\sigma^2} \right), \tag{4}
\]
where $\tau$ is the skew parameter, $\sigma^2$ is the scale parameter, $\mu_i = \phi(x_i^\top \beta)$ is the location parameter, and $v_i = |\tau - I(y_i - \phi(x_i^\top \beta) < 0)|$. Minimizing (2) is therefore equivalent to maximizing the likelihood of $y_1, \dots, y_n$ under the assumption that each $y_i$ follows an AND with location $\mu_i = \phi(x_i^\top \beta)$.
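The following Python sketch evaluates an AND density and checks numerically that it integrates to one. It assumes the normalizing constant $\sqrt{2/(\pi\sigma^2)}\,\sqrt{\tau(1-\tau)}/(\sqrt{\tau}+\sqrt{1-\tau})$ paired with the exponent $-v(y-\mu)^2/(2\sigma^2)$; this pairing is the one that makes the density proper, and the function names are illustrative.

```python
import math


def and_pdf(y, mu, sigma2, tau):
    """Asymmetric normal density with location mu, scale sigma2 and skew tau."""
    resid = y - mu
    v = tau if resid >= 0 else 1.0 - tau  # |tau - I(resid < 0)|
    const = (math.sqrt(2.0 / (math.pi * sigma2))
             * math.sqrt(tau * (1.0 - tau))
             / (math.sqrt(tau) + math.sqrt(1.0 - tau)))
    return const * math.exp(-v * resid * resid / (2.0 * sigma2))


def trapezoid(f, lo, hi, n=20000):
    """Composite trapezoidal rule on [lo, hi] with n sub-intervals."""
    h = (hi - lo) / n
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, n):
        total += f(lo + i * h)
    return total * h


for tau in (0.10, 0.25, 0.50):
    area = trapezoid(lambda t: and_pdf(t, 0.0, 1.0, tau), -40.0, 40.0)
    print(tau, area)
```

Because the log-density equals a constant minus $v_i (y_i - \mu_i)^2 / (2\sigma^2)$, maximizing the AND likelihood over the location is the same as minimizing the asymmetric quadratic loss.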

2.2 Bayesian hierarchical model

Following [5, 8, 11, 29], we model the nonparametric link function $\phi$ by a Gaussian process (GP) prior with mean zero and covariance function $C(\cdot,\cdot)$, i.e. $\phi \sim GP(0, C(\cdot,\cdot))$, where

\[
C(\beta^\top x_i, \beta^\top x_j) = r \exp\left( -\frac{(x_i - x_j)^\top \beta \beta^\top (x_i - x_j)}{b} \right). \tag{5}
\]

Here, $r$ and $b$ are two hyperparameters. Following [11] and [29], we replace $\beta/\sqrt{b}$ with a new index vector, still denoted by $\beta$, to simplify the estimation procedure. Thus, the covariance function in (5) can be written as

\[
C(\beta^\top x_i, \beta^\top x_j) = r \exp\left( -(x_i - x_j)^\top \beta \beta^\top (x_i - x_j) \right). \tag{6}
\]
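A minimal Python sketch of the covariance matrix implied by (6) is given below. It uses the fact that $(x_i - x_j)^\top \beta \beta^\top (x_i - x_j)$ is the squared difference of the two indices, so the kernel is a squared-exponential kernel on the one-dimensional index. The function name and the toy inputs are assumptions made for illustration.

```python
import math


def gp_covariance(X, beta, r):
    """n x n matrix with entries r * exp(-(x_i'beta - x_j'beta)^2)."""
    index = [sum(xj * bj for xj, bj in zip(x, beta)) for x in X]
    return [[r * math.exp(-(si - sj) ** 2) for sj in index] for si in index]


# Hypothetical covariates and index vector.
X_demo = [[0.0, 0.0], [1.0, 0.0], [0.5, 2.0]]
C_n = gp_covariance(X_demo, [1.0, 0.0], r=2.0)
for row in C_n:
    print([round(v, 4) for v in row])
```

The diagonal equals $r$, and two observations with the same index value are perfectly correlated under the prior even if their covariate vectors differ, which is where the dimension reduction enters the GP.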

To proceed with a Bayesian analysis, we assign Laplace priors to $\beta_j$, $j = 1, \dots, k$, of the form [2, 22]

\[
f(\beta \mid \sigma, \lambda) = \prod_{j=1}^{k} \frac{\lambda_j}{2\sigma} \exp\left\{ -\frac{\lambda_j |\beta_j|}{\sigma} \right\}, \tag{7}
\]
which extends the Bayesian Lasso [18] by allowing different penalization parameters ($\lambda_j > 0$) for different regression coefficients. We further place a Gamma prior on each $\lambda_j$ of the form $p(\lambda_j) \propto \lambda_j^{a-1} \exp\{-b\lambda_j\}$, the prior $p(\sigma^2) \propto 1/\sigma^2$ on $\sigma^2$, and an inverse Gamma prior on $r$, i.e. $r \sim IG(c, d)$, where $c$ and $d$ are two hyperparameters. Thus, a fully Bayesian approach for expectile adaptive lasso regression with a single-index model can be described as follows:
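\[
y_i \mid \beta, \sigma^2 \sim \mathrm{AND}\big(\phi(x_i^\top \beta), \sigma^2, \tau\big), \tag{8}
\]
\[
\beta \mid \sigma^2, \lambda \sim \prod_{j=1}^{k} \frac{\lambda_j}{2\sigma} \exp\left\{ -\frac{\lambda_j |\beta_j|}{\sigma} \right\}, \tag{9}
\]
\[
\phi_n \mid \beta, r \sim GP\big(0, C_n(\cdot,\cdot)\big), \tag{10}
\]
\[
p(\sigma^2) \propto 1/\sigma^2, \tag{11}
\]
\[
p(\lambda_j) \propto \lambda_j^{a-1} \exp\{-b\lambda_j\}, \tag{12}
\]
\[
p(r) \propto \left(\frac{1}{r}\right)^{c+1} \exp\left(-\frac{d}{r}\right), \tag{13}
\]
where $\lambda = (\lambda_1, \dots, \lambda_k)^\top$ and $\phi_n = (\phi_1, \dots, \phi_n)^\top = \big(\phi(x_1^\top \beta), \dots, \phi(x_n^\top \beta)\big)^\top$.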
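The parametric part of this hierarchy can be collected into a single log-prior function, as in the Python sketch below. It covers the priors on $\beta$, $\lambda$, $\sigma^2$ and $r$ up to additive constants and leaves out the GP prior on the link. The function name, argument order and the hyperparameter values in the example are assumptions for illustration only.

```python
import math


def log_prior(beta, lam, sigma2, r, a, b, c, d):
    """Unnormalised log prior of (beta, lambda, sigma^2, r)."""
    sigma = math.sqrt(sigma2)
    total = 0.0
    for beta_j, lam_j in zip(beta, lam):
        # Laplace prior on beta_j with its own penalty lambda_j, as in (9).
        total += math.log(lam_j) - math.log(2.0 * sigma) - lam_j * abs(beta_j) / sigma
        # Gamma prior kernel on lambda_j, as in (12).
        total += (a - 1.0) * math.log(lam_j) - b * lam_j
    # Scale-invariant prior on sigma^2, as in (11).
    total += -math.log(sigma2)
    # Inverse Gamma prior kernel on r, as in (13).
    total += -(c + 1.0) * math.log(r) - d / r
    return total


# Hypothetical values; the hyperparameters below are not taken from the paper.
print(log_prior([0.5, -0.2, 0.0], [1.0, 2.0, 0.5], sigma2=1.5, r=2.0,
                a=1.0, b=0.1, c=2.0, d=1.0))
```

A function of this kind is what a Metropolis-Hastings step would combine with the log-likelihood when evaluating an acceptance ratio.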

2.3 MCMC sampling

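The posterior distribution of all parameters of interest is explored with an MCMC sampling algorithm; the full conditional distributions are given below.

  • 1. Sample the regression coefficients $\beta$ from their posterior using a random-walk Metropolis-Hastings step,

\[
\begin{aligned}
f(\beta \mid \sigma^2, \lambda, r) &\propto \int f(y \mid \sigma^2, \phi_n) \, f(\phi_n \mid \beta, r) \, d\phi_n \; f(\beta \mid \sigma^2, \lambda) \\
&\propto \big(\det[V + C_n]\big)^{-1/2} \exp\left\{ -\frac{y^\top (V + C_n)^{-1} y}{2\sigma^2} \right\} \prod_{j=1}^{k} \exp\left\{ -\frac{\lambda_j |\beta_j|}{\sqrt{\sigma^2}} \right\},
\end{aligned} \tag{14}
\]
where the weight matrix $V = \mathrm{diag}(v_1, \dots, v_n)$ adjusts for the expectile loss, with $v_i = |\tau - I(y_i - \phi(x_i^\top \beta) < 0)|$.
  • 2. Sample the hyperparameter $r$ from its posterior using a random-walk Metropolis-Hastings step,

\[
\begin{aligned}
f(r \mid \beta, \sigma^2, \lambda) &\propto \int f(y \mid \sigma^2, \phi_n) \, f(\phi_n \mid \beta, r) \, d\phi_n \; f(r) \\
&\propto \big(\det[V + C_n]\big)^{-1/2} \exp\left\{ -\frac{y^\top (V + C_n)^{-1} y}{2\sigma^2} \right\} \left(\frac{1}{r}\right)^{c+1} \exp\left(-\frac{d}{r}\right).
\end{aligned} \tag{15}
\]
  • 3. Sample the nonparametric link function $\phi_n$ from a multivariate normal distribution with mean $\mu_n = C_n (V + C_n)^{-1} y$ and variance $\Sigma_n = C_n (V + C_n)^{-1} V$.

  • 4. Sample $\sigma^2$ from an inverse Gamma (IG) distribution with shape parameter $(n - 1 + k)/2$ and scale parameter $\sum_{i=1}^{n} v_i \big(y_i - \phi(x_i^\top \beta)\big)^2 / 2 + \sum_{j=1}^{k} \lambda_j |\beta_j|$.

  • 5. Sample $\lambda_j$ from a Gamma distribution with shape parameter $a + 1$ and rate parameter $b + |\beta_j| / \sqrt{\sigma^2}$.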
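The building blocks used in the steps above can be sketched generically in Python. This is not the authors' sampler: it shows a plain random-walk Metropolis-Hastings update for an arbitrary log-target, together with Gamma and inverse Gamma draws written with the standard library, where `random.gammavariate` takes a shape and a scale. All function names, the toy target and the numerical settings are assumptions for illustration.

```python
import math
import random


def rw_metropolis_step(current, log_target, step, rng):
    """One random-walk Metropolis-Hastings update for a list-valued parameter."""
    proposal = [c + rng.gauss(0.0, step) for c in current]
    log_ratio = log_target(proposal) - log_target(current)
    # Accept with probability min(1, exp(log_ratio)).
    if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
        return proposal, True
    return current, False


def draw_lambda(beta_j, sigma, a, b, rng):
    """Gamma draw with shape a + 1 and rate b + |beta_j| / sigma."""
    rate = b + abs(beta_j) / sigma
    return rng.gammavariate(a + 1.0, 1.0 / rate)  # second argument is a scale


def draw_inverse_gamma(shape, scale, rng):
    """If G ~ Gamma(shape, 1), then scale / G ~ IG(shape, scale)."""
    return scale / rng.gammavariate(shape, 1.0)


# Toy run: the target is a standard normal, chosen only to exercise the step.
rng = random.Random(2026)
state, accepted, draws = [0.0], 0, []
for _ in range(5000):
    state, ok = rw_metropolis_step(state, lambda z: -0.5 * z[0] ** 2, 1.0, rng)
    accepted += ok
    draws.append(state[0])
print("acceptance rate:", accepted / len(draws))
print("lambda draw:", draw_lambda(0.3, 1.0, a=1.0, b=0.1, rng=rng))
print("sigma^2 draw:", draw_inverse_gamma(3.0, 2.0, rng))
```

In a full implementation the log-target for $\beta$ and $r$ would be the logarithm of (14) and (15), which additionally requires the determinant and inverse of $V + C_n$; those linear algebra steps are omitted here.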

3. Simulation studies

In this section, we investigate the estimation accuracy of the proposed approach (BESIM) and compare its performance with a non-Bayesian single-index expectile regression [12], referred to as "ESIM". We simulate data from the model

\[
y_i = \phi(x_i^\top \beta) + \sigma(x_i^\top \beta) \, u_i, \quad i = 1, \dots, n. \tag{16}
\]

The covariates are simulated independently from the uniform distribution on $[0,1]$ and the $u_i$ are i.i.d. $N(0,3)$. We consider four sample sizes ($n = 50, 150, 250, 500$), and simulations are repeated 150 times for each combination of $n$ and $\tau \in \{0.10, 0.20, 0.30, 0.40, 0.50, 0.60, 0.70, 0.80, 0.90\}$.

3.1 Simulation 1

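The simulation setup is similar to Example 1 in [11, 29], with different parameter values for the regression coefficients and error distribution. We generate data sets from model (16), where $\phi(s) = \sin\left(\frac{\pi(s - A)}{C - A}\right)$, $\beta = (\beta_1, \beta_2, \beta_3)^\top = \frac{1}{\sqrt{3}}(1, 0, 0)^\top$, $A = \frac{\sqrt{3}}{2} - \frac{1.645}{\sqrt{12}}$, $C = \frac{\sqrt{3}}{2} + \frac{1.645}{\sqrt{12}}$, $\sigma(x_i^\top \beta) = 0.5$ and $u_i \sim N(0, 3)$.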
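A data-generating sketch for this design is given below in Python. It rests on several assumptions that should be checked against the authors' R code: the constants are read as $A = \sqrt{3}/2 - 1.645/\sqrt{12}$ and $C = \sqrt{3}/2 + 1.645/\sqrt{12}$, the index vector as $(1, 0, 0)^\top/\sqrt{3}$, and $N(0,3)$ as a normal distribution with variance 3. The function names and the seed are illustrative.

```python
import math
import random

# Assumed reading of the constants in the link function.
A = math.sqrt(3.0) / 2.0 - 1.645 / math.sqrt(12.0)
C = math.sqrt(3.0) / 2.0 + 1.645 / math.sqrt(12.0)
BETA = [1.0 / math.sqrt(3.0), 0.0, 0.0]  # assumed reading of the index vector


def phi(s):
    """Sine-bump link function of the simulation design."""
    return math.sin(math.pi * (s - A) / (C - A))


def simulate_single_index(n, beta, sigma_fn, rng):
    """Draw (X, y) from y_i = phi(x_i'beta) + sigma_fn(x_i'beta) * u_i."""
    X = [[rng.random() for _ in beta] for _ in range(n)]  # uniform covariates
    y = []
    for x in X:
        s = sum(xj * bj for xj, bj in zip(x, beta))
        u = rng.gauss(0.0, math.sqrt(3.0))  # N(0, 3) read as variance 3
        y.append(phi(s) + sigma_fn(s) * u)
    return X, y


rng = random.Random(123)
X_sim, y_sim = simulate_single_index(50, BETA, lambda s: 0.5, rng)
print(len(X_sim), len(y_sim), X_sim[0], y_sim[0])
```

Passing a different `sigma_fn`, for example `lambda s: 1.0 + math.sin(s)`, would give a heteroscedastic variant of the same generator.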

As a Bayesian point estimator we use the posterior mean based on 15,000 MCMC iterations after a burn-in of 1,000 iterations. The resulting estimates, based on 150 replications, are summarized as boxplots in Figures 2 and 3. These boxplots display the estimated coefficients for BESIM and ESIM with $\tau \in \{0.10, 0.50, 0.90\}$. In general, the boxplots give the impression that the Bayesian estimates (BESIM) are more precise and stable than the classical estimates (ESIM). Mean squared errors (MSE) of the estimates over the 150 replications in each case are shown in Tables 1 and 2 for the four sample sizes $n = 50, 150, 250$ and $500$ and all nine expectile levels. Overall, the MSE results suggest that the proposed method generally performs better than the ESIM method.
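The MSE entries reported per coefficient can be computed as in the Python sketch below. The paper does not spell out the formula, so the sketch assumes the usual definition: the average, over replications, of the squared difference between the estimate and the true coefficient. The function name and the numbers are hypothetical.

```python
def coefficient_mse(estimates, truth):
    """Per-coefficient MSE across replications.

    estimates: list of replications, each a list of k coefficient estimates.
    truth:     list of the k true coefficient values.
    """
    n_rep = len(estimates)
    return [
        sum((est[j] - truth[j]) ** 2 for est in estimates) / n_rep
        for j in range(len(truth))
    ]


# Hypothetical estimates from three replications of a two-coefficient model.
demo_estimates = [[0.9, 0.1], [1.2, -0.1], [1.0, 0.0]]
print(coefficient_mse(demo_estimates, [1.0, 0.0]))
```

With posterior means from BESIM or point estimates from ESIM in `estimates`, one call per method, sample size and expectile level would fill one row of the tables.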

Table 1. Comparison of MSE results in Simulation 1 for BESIM and ESIM based on 150 replications when n=50 and n=150 .

τ     Method  β1 (n=50)   β2 (n=50)   β3 (n=50)   β1 (n=150)  β2 (n=150)  β3 (n=150)
0.10  ESIM    0.0195543   0.0147065   0.0350343   0.0462157   0.0091205   0.0400359
0.10  BESIM   0.0263048   0.0181031   0.0200267   0.0382094   0.0276661   0.0088778
0.20  ESIM    0.0213920   0.0170607   0.0144794   0.0187969   0.0431808   0.0427004
0.20  BESIM   0.0206956   0.0167130   0.0158080   0.0065181   0.0360150   0.0328985
0.30  ESIM    0.0323232   0.0316921   0.0358028   0.0304921   0.0359099   0.0416645
0.30  BESIM   0.0384871   0.0459071   0.0180494   0.0126898   0.0031730   0.0167676
0.40  ESIM    0.0368509   0.0341720   0.0431401   0.0460528   0.0293005   0.0366979
0.40  BESIM   0.0034700   0.0044148   0.0470265   0.0030763   0.0442098   0.0328749
0.50  ESIM    0.0350871   0.0415319   0.0367816   0.0481550   0.0443593   0.0085951
0.50  BESIM   0.0017297   0.0386636   0.0176232   0.0442574   0.0122571   0.0295646
0.60  ESIM    0.0133486   0.0161901   0.0031393   0.0488498   0.0079065   0.0070142
0.60  BESIM   0.0124804   0.0332511   0.0073802   0.0203862   0.0494447   0.0395940
0.70  ESIM    0.0091733   0.0226894   0.0080150   0.0022625   0.0355585   0.0243689
0.70  BESIM   0.0253354   0.0371858   0.0027166   0.0086380   0.0480610   0.0149382
0.80  ESIM    0.0363234   0.0051813   0.0354510   0.0325703   0.0204459   0.0225520
0.80  BESIM   0.0371311   0.0235138   0.0393036   0.0218888   0.0047231   0.0315689
0.90  ESIM    0.0463590   0.0071871   0.0494710   0.0213533   0.0262922   0.0169395
0.90  BESIM   0.0440110   0.0481442   0.0460268   0.0072926   0.0305382   0.0360484

Table 2. Comparison of MSE results in Simulation 1 for BESIM and ESIM based on 150 replications when n=250 and n=500 .

τ     Method  β1 (n=250)  β2 (n=250)  β3 (n=250)  β1 (n=500)  β2 (n=500)  β3 (n=500)
0.10  ESIM    0.0023184   0.0018601   0.0160216   0.0159654   0.0016879   0.0168082
0.10  BESIM   0.0079913   0.0161750   0.0157870   0.0195653   0.0146954   0.0055722
0.20  ESIM    0.0159583   0.0151175   0.0186634   0.0111082   0.0030300   0.0183915
0.20  BESIM   0.0044731   0.0052732   0.0043108   0.0126903   0.0101296   0.0134150
0.30  ESIM    0.0100353   0.0160354   0.0066963   0.0165764   0.0114315   0.0162483
0.30  BESIM   0.0101673   0.0017985   0.0104027   0.0147564   0.0170205   0.0065727
0.40  ESIM    0.0125452   0.0126779   0.0197187   0.0097915   0.0116293   0.0095833
0.40  BESIM   0.0098691   0.0020233   0.0189907   0.0141596   0.0117513   0.0149975
0.50  ESIM    0.0022255   0.0184194   0.0032427   0.0015221   0.0013674   0.0045875
0.50  BESIM   0.0145696   0.0016308   0.0183293   0.0165074   0.0127137   0.0040005
0.60  ESIM    0.0177884   0.0027987   0.0158476   0.0030793   0.0110199   0.0069175
0.60  BESIM   0.0044845   0.0135674   0.0072586   0.0103915   0.0072386   0.0132245
0.70  ESIM    0.0036283   0.0079414   0.0104264   0.0101376   0.0184989   0.0047943
0.70  BESIM   0.0072167   0.0035641   0.0036285   0.0013724   0.0038420   0.0131117
0.80  ESIM    0.0149326   0.0028020   0.0024689   0.0067298   0.0170474   0.0179972
0.80  BESIM   0.0091541   0.0188816   0.0172111   0.0041089   0.0141367   0.0104179
0.90  ESIM    0.0040758   0.0166996   0.0122218   0.0084545   0.0125620   0.0037598
0.90  BESIM   0.0164282   0.0095112   0.0040479   0.0149395   0.0064739   0.0199554

Figure 2. Boxplots summarizing the estimators of $\beta$ for $n = 50, 150$ in Simulation 1. For example, 'BESIM 50' denotes BESIM with $n = 50$.

Figure 3. Boxplots summarizing the estimators of $\beta$ for $n = 250, 500$ in Simulation 1. For example, 'BESIM 250' denotes BESIM with $n = 250$.

3.2 Simulation 2

In this simulation study, we simulate data from model (16), where $\beta = (1, 1, 1)^\top / \sqrt{3}$ and $\sigma(x_i^\top \beta) = 1 + \sin(x_i^\top \beta)$. Mean squared errors (MSE) of the estimates over the 150 replications in each case are shown in Tables 3 and 4 for the four sample sizes $n = 50, 150, 250$ and $500$ and all nine expectile levels. Again, the MSE results suggest that the proposed method generally performs better than the ESIM method.

Table 3. Comparison of MSE results in Simulation 2 for BESIM and ESIM based on 150 replications when n=50 and n=150 .

τ     Method  β1 (n=50)   β2 (n=50)   β3 (n=50)   β1 (n=150)  β2 (n=150)  β3 (n=150)
0.10  ESIM    0.0107191   0.0180942   0.0194673   0.0175731   0.0155628   0.0063909
0.10  BESIM   0.0245536   0.0102368   0.0204030   0.0240220   0.0154682   0.0035139
0.20  ESIM    0.0231226   0.0072586   0.0189763   0.0061164   0.0018900   0.0075795
0.20  BESIM   0.0127013   0.0232348   0.0072481   0.0059392   0.0160106   0.0013501
0.30  ESIM    0.0117925   0.0122501   0.0109938   0.0181383   0.0151146   0.0077859
0.30  BESIM   0.0204627   0.0240346   0.0167902   0.0168055   0.0094831   0.0111475
0.40  ESIM    0.0234465   0.0021388   0.0167361   0.0099125   0.0242583   0.0018942
0.40  BESIM   0.0037376   0.0049627   0.0183407   0.0049792   0.0217180   0.0238382
0.50  ESIM    0.0184912   0.0019595   0.0114310   0.0163468   0.0084606   0.0191856
0.50  BESIM   0.0156928   0.0132497   0.0158988   0.0136745   0.0229315   0.0227252
0.60  ESIM    0.0016549   0.0216356   0.0030284   0.0153109   0.0218318   0.0242549
0.60  BESIM   0.0073266   0.0215684   0.0159264   0.0199063   0.0069357   0.0141723
0.70  ESIM    0.0151522   0.0102854   0.0222214   0.0124963   0.0045482   0.0173132
0.70  BESIM   0.0126653   0.0068737   0.0053582   0.0053046   0.0069645   0.0223847
0.80  ESIM    0.0249352   0.0028472   0.0135214   0.0220645   0.0126694   0.0234537
0.80  BESIM   0.0152664   0.0064641   0.0151681   0.0194387   0.0052556   0.0020447
0.90  ESIM    0.0100017   0.0205785   0.0157309   0.0045215   0.0080002   0.0154013
0.90  BESIM   0.0056914   0.0147402   0.0067066   0.0078914   0.0214991   0.0227348

Table 4. Comparison of MSE results in Simulation 2 for BESIM and ESIM based on 150 replications when n=250 and n=500 .

τ     Method  β1 (n=250)  β2 (n=250)  β3 (n=250)  β1 (n=500)  β2 (n=500)  β3 (n=500)
0.10  ESIM    0.0035559   0.0060365   0.0092645   0.0040971   0.0023224   0.0074677
0.10  BESIM   0.0072896   0.0030640   0.0059215   0.0029716   0.0019371   0.0031230
0.20  ESIM    0.0091081   0.0018402   0.0054281   0.0089195   0.0095040   0.0096261
0.20  BESIM   0.0074059   0.0087256   0.0075169   0.0022337   0.0012525   0.0078955
0.30  ESIM    0.0099915   0.0049505   0.0088972   0.0021822   0.0085752   0.0096027
0.30  BESIM   0.0073508   0.0032648   0.0090581   0.0068663   0.0038764   0.0060716
0.40  ESIM    0.0078945   0.0029055   0.0065168   0.0087705   0.0017163   0.0084206
0.40  BESIM   0.0086904   0.0068451   0.0020872   0.0097546   0.0031230   0.0025890
0.50  ESIM    0.0071910   0.0029777   0.0058355   0.0085510   0.0016177   0.0033637
0.50  BESIM   0.0010848   0.0023306   0.0045046   0.0049204   0.0033204   0.0098460
0.60  ESIM    0.0020000   0.0027075   0.0017791   0.0035235   0.0059186   0.0024654
0.60  BESIM   0.0035538   0.0035468   0.0095305   0.0047394   0.0077648   0.0060467
0.70  ESIM    0.0077003   0.0082037   0.0055475   0.0020283   0.0044568   0.0049402
0.70  BESIM   0.0035222   0.0033758   0.0043629   0.0017790   0.0050371   0.0083322
0.80  ESIM    0.0024182   0.0021317   0.0063671   0.0056524   0.0016956   0.0046949
0.80  BESIM   0.0045966   0.0031001   0.0023693   0.0055960   0.0085606   0.0049265
0.90  ESIM    0.0097013   0.0016838   0.0098522   0.0029054   0.0098691   0.0033133
0.90  BESIM   0.0090078   0.0079991   0.0064896   0.0056303   0.0067910   0.0030292

4. Boston housing data

We examine the proposed method using the Boston Housing data (BHD). The BHD were collected by [9] in a study of the impact of clean air on housing prices and are available in the R package spdep. The data contain information on 506 census tracts in the Boston area. The variables are described in Table 5.

Table 5. Description of covariates in the BHD.

Covariate  Description
x1         Per capita crime rate by town
x2         Proportion of residential land zoned for lots over 25,000 square feet
x3         Proportion of non-retail business acres per town
x4         Charles River dummy variable
x5         Nitric oxide concentration
x6         Average number of rooms per dwelling
x7         Percentage of owner-occupied dwellings built before 1940
x8         Weighted distances to five employment centers in Boston
x9         Radial highway accessibility index
x10        Full-value property tax rate per $10,000
x11        Pupil-teacher ratio by town
x12        $1000(Bk - 0.63)^2$, where $Bk$ is the proportion of Black residents
x13        Percentage of lower-status population
y          Median value of owner-occupied dwellings

Table 6. MSE based on 5 -fold cross-validation results for the BHD.

Test error by expectile level
Method  τ = 0.1  τ = 0.2  τ = 0.3  τ = 0.4  τ = 0.5  τ = 0.6  τ = 0.7  τ = 0.8  τ = 0.9
ESIM    0.053    0.053    0.052    0.051    0.050    0.052    0.052    0.053    0.053
BESIM   0.046    0.046    0.044    0.042    0.040    0.045    0.046    0.045    0.046

As in Tables 1 to 4, we consider nine expectile levels, $\tau = 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8$ and $0.9$, and use 5-fold cross-validation to evaluate the performance of both approaches (ESIM and BESIM). Table 6 shows that BESIM attains a lower test error than its non-Bayesian counterpart, ESIM, at every expectile level considered.
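The cross-validation scheme can be sketched in Python as below. The paper does not state how folds were formed or exactly how the test error was computed, so this sketch assumes random folds of near-equal size and a mean squared prediction error on the held-out fold. The `fit` and `predict` callables are hypothetical placeholders for either estimator.

```python
import random


def kfold_indices(n, k=5, seed=0):
    """Shuffle 0..n-1 and deal the indices into k folds of near-equal size."""
    order = list(range(n))
    random.Random(seed).shuffle(order)
    return [order[i::k] for i in range(k)]


def cross_validated_error(X, y, fit, predict, k=5, seed=0):
    """Average squared prediction error over all held-out observations."""
    squared_errors = []
    for held_out in kfold_indices(len(y), k, seed):
        held = set(held_out)
        train = [i for i in range(len(y)) if i not in held]
        model = fit([X[i] for i in train], [y[i] for i in train])
        for i in held_out:
            squared_errors.append((y[i] - predict(model, X[i])) ** 2)
    return sum(squared_errors) / len(squared_errors)


# Placeholder "model": predict the training mean, ignoring the covariates.
def fit_mean(X_train, y_train):
    return sum(y_train) / len(y_train)


def predict_mean(model, x):
    return model


X_demo = [[float(i)] for i in range(20)]
y_demo = [0.5 * i for i in range(20)]
print([len(f) for f in kfold_indices(506, 5)])
print(cross_validated_error(X_demo, y_demo, fit_mean, predict_mean))
```

Using the same folds (same seed) for both methods keeps the comparison paired, which is a design choice of this sketch rather than something stated in the paper.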

5. Conclusion

Single-index expectile regression models provide a flexible semiparametric regression framework for high-dimensional covariates and capture parameter heterogeneity and nonlinearity, especially when interest lies in different parts of the conditional distribution of the outcome. In this paper, we introduce Bayesian expectile regression with a single-index model (BESIM) and develop a Bayesian hierarchical formulation for it. Simulations and a real data study suggest that BESIM generally performs better than ESIM.

Ethical considerations

This study does not involve human participants or animals, therefore ethical approval was not required.

Data availability

The underlying data is available in Zenodo. https://doi.org/10.5281/zenodo.184795341

Data are available under the terms of the Creative Commons Attribution 4.0 International license (CC-BY 4.0).

Software availability

The R code supporting the findings of this study is openly available in the Zenodo repository at: https://doi.org/10.5281/zenodo.18681405 [30].

This code is also accessible via GitHub at: https://github.com/zena158/Bayesian-Single-Index-Expectile-Regression/tree/v1.0.1.

how to cite this article
Abdulhasan Z and Alhamzawi R. Bayesian expectile regression with single-index models [version 1; peer review: awaiting peer review]. F1000Research 2026, 15:516 (https://doi.org/10.12688/f1000research.174712.1)