Title: | Shrinkage for Extreme Partial Least-Squares (SEPaLS) |
---|---|
Description: | Regression context for the Partial Least Squares framework for extreme values. Computes the Shrinkage for Extreme Partial Least-Squares (SEPaLS) estimators, an adaptation of the original Partial Least Squares (PLS) method tailored to the extreme-value framework. The SEPaLS project is joint work by Stephane Girard, Hadrien Lorenzo and Julyan Arbel. R code to replicate the results of the paper is available at <https://github.com/hlorenzo/SEPaLS_simus>. Extremes within PLS were already studied by one of the authors, see M Bousebata, G Enjolras, S Girard (2023) <doi:10.1016/j.jmva.2022.105101>. |
Authors: | Stephane Girard [aut], Julyan Arbel [aut], Hadrien Lorenzo [aut, cre, cph] |
Maintainer: | Hadrien Lorenzo <[email protected]> |
License: | MIT + file LICENSE |
Version: | 0.1.0 |
Built: | 2025-02-17 04:44:34 UTC |
Source: | https://github.com/hlorenzo/sepals |
Bootstrap function for SEPaLS estimator.
bootstrap.SEPaLS( X, Y, yn, type = c("vMF", "Laplace"), mu0 = NULL, kappa0 = NULL, lambda = NULL, B = 20 )
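X |
numeric matrix of covariates, one row per observation. |
Y |
numeric response, one value per row of X. |
yn |
numeric threshold on the response Y (yn = 1 in the example). |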
type |
character, either "vMF" or "Laplace". Selects which SEPaLS estimator is computed. |
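mu0 |
numeric vector with one entry per column of X, used when type = "vMF". |
kappa0 |
numeric scalar, used when type = "vMF". |
lambda |
numeric scalar, used when type = "Laplace" (lambda = 0.01 in the example). |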
B |
positive integer. The number of bootstrap samples on which the SEPaLS directions are estimated. Defaults to 20. |
A list with two elements:
ws: a $B \times p$ matrix, each row being the SEPaLS direction estimated on one bootstrap sample.
cor: the correlation with the response of each estimated direction, evaluated on the Out-Of-Bag (OOB) sample.
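The returned list pairs each bootstrap direction with an out-of-bag correlation. The base R sketch below illustrates that general scheme only. It does not reproduce the package's code: `tail_direction` and `oob_bootstrap` are hypothetical names, and the direction estimator (normalized empirical covariance between covariates and response among rows with response at least yn) is an assumed stand-in for the SEPaLS estimator.

```r
# Hypothetical stand-in for a direction estimator (NOT the SEPaLS code):
# normalized empirical covariance between X and Y among rows with Y >= yn.
# Needs at least two tail rows in the sample it is given.
tail_direction <- function(X, Y, yn) {
  keep <- Y >= yn
  v <- drop(cov(X[keep, , drop = FALSE], Y[keep]))
  v / sqrt(sum(v^2))
}

# Out-of-bag bootstrap: estimate on the resampled rows, score on the rows
# that were never drawn.
oob_bootstrap <- function(X, Y, yn, B = 20) {
  n <- nrow(X)
  Y <- as.vector(Y)
  ws <- matrix(NA_real_, nrow = B, ncol = ncol(X))
  cors <- rep(NA_real_, B)
  for (b in seq_len(B)) {
    in_bag <- sample.int(n, n, replace = TRUE)  # bootstrap sample
    oob <- setdiff(seq_len(n), in_bag)          # rows left out of the sample
    ws[b, ] <- tail_direction(X[in_bag, , drop = FALSE], Y[in_bag], yn)
    # Correlation between the OOB projection on the direction and the OOB response
    cors[b] <- cor(drop(X[oob, , drop = FALSE] %*% ws[b, ]), Y[oob])
  }
  list(ws = ws, cor = cors)
}

set.seed(5)
n <- 500
p <- 4
X <- matrix(rnorm(n * p), n, p)
beta <- c(2, 1, 0, 0) / sqrt(5)
Y <- drop(X %*% beta)^3 + rnorm(n)
res <- oob_bootstrap(X, Y, yn = 1, B = 10)
dim(res$ws)  # one row per bootstrap sample, one column per covariate
```

In this sketch the rows of `res$ws` have unit norm, so a boxplot of its columns shows the spread of each coordinate of the direction across bootstrap samples.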
library(SEPaLS)
set.seed(5)
n <- 3000
p <- 10
X <- matrix(rnorm(n*p),n,p)
# True direction: five non-zero coefficients, normalized to unit length
beta <- c(5:1,rep(0,p-5)) ; beta <- beta/sqrt(sum(beta^2))
# Cubic link with Gaussian noise
Y <- (X%*%beta)^3 + rnorm(n)
boot.sepals_Laplace <- bootstrap.SEPaLS(X,Y,yn=1,type="Laplace",lambda=0.01, B=100)
# Spread of each coordinate of the direction across the bootstrap samples
boxplot(boot.sepals_Laplace$ws);abline(h=0,col="red",lty=2)
Maximum Likelihood estimator
maximum_Likelihood_SEPaLS(X, Y, yn)
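X |
numeric matrix of covariates, one row per observation. |
Y |
numeric response, one value per row of X. |
yn |
the quantile corresponding to the lowest values of Y (the example sets it with quantile(Y, probs = pp)). |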
The maximum likelihood estimator.
library(SEPaLS)
n <- 3000
p <- 10
X <- matrix(rnorm(n*p),n,p)
# True direction: five non-zero coefficients, normalized to unit length
beta <- c(5:1,rep(0,p-5)) ; beta <- beta/sqrt(sum(beta^2))
Y <- X%*%beta + rnorm(n,sd=1/3)
# One estimate per threshold, with yn swept over the quantiles of Y
estimators <- do.call(rbind,lapply(seq(0,1,length.out=100),function(pp){
  yn <- quantile(Y,probs = pp)
  maximum_Likelihood_SEPaLS(X,Y,yn)
}))
# Coefficient paths: red for the five non-zero coordinates, black for the others
matplot(estimators,type="l",lty=1,col=c(rep(2,5),rep(1,p-5)))
# Horizontal lines at the true coefficients
abline(h=beta/sqrt(sum(beta^2)),col=c(rep(2,5),rep(1,p-5)))
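The example above sweeps yn over the quantiles of Y. The short base R sketch below shows how quickly the number of observations beyond the threshold shrinks as the quantile level rises. It assumes, for illustration only, that an observation counts as "in the tail" when its response is at least yn. The variable names are hypothetical and nothing here calls the package.

```r
# Assumption for illustration: an observation is "in the tail" when Y >= yn.
set.seed(1)
Y <- rnorm(3000)

# Candidate thresholds taken as empirical quantiles of the response
probs <- c(0.5, 0.9, 0.99)
thresholds <- quantile(Y, probs = probs)

# Number of observations at or above each threshold
tail_sizes <- sapply(thresholds, function(yn) sum(Y >= yn))
tail_sizes
```

With 3000 distinct observations and R's default quantile type, these counts are 1500, 300 and 30. A high threshold therefore leaves few points to estimate a direction from, which is the trade-off to keep in mind when choosing yn.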
A subset of data from the 'agreste' French governmental website <https://agreste.agriculture.gouv.fr/agreste-web/servicon/I.2/listeTypeServicon/>.
data(ricaCarrots)
'ricaCarrots'
A list of 3 objects:
a vector. The production of carrots (open field) (in quintals) for 598 French farms.
a matrix. The 259 covariates describing the same 598 French farms.
a matrix. Description of the 259 covariates.
<https://agreste.agriculture.gouv.fr/agreste-web/servicon/I.2/listeTypeServicon/>
Function to compute the SEPaLS estimators
SEPaLS( X, Y, yn, type = c("vMF", "Laplace"), mu0 = NULL, kappa0 = NULL, lambda = NULL )
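X |
numeric matrix of covariates, one row per observation. |
Y |
numeric response, one value per row of X. |
yn |
numeric threshold on the response Y (yn = 1 in the example). |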
type |
character, either "vMF" or "Laplace". Selects which SEPaLS estimator is computed. |
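mu0 |
numeric vector with one entry per column of X, used when type = "vMF" (a unit-norm vector in the example). |
kappa0 |
numeric scalar, used when type = "vMF" (kappa0 = 1 in the example). |
lambda |
numeric scalar, used when type = "Laplace" (lambda = 0.01 in the example). |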
The SEPaLS estimators are built depending on the value given to type:
vMF: the estimator is proportional to
$$\hat{\beta}_{ml}(y_n) + \kappa_0 \mu_0,$$
where $\hat{\beta}_{ml}(y_n)$ is the EPLS estimator, which coincides with the maximum-likelihood estimator of SEPaLS for a threshold $y_n$.
Laplace: the estimator is proportional to
$$S_\lambda\left(\hat{\beta}_{ml}(y_n)\right),$$
where $S_\lambda$ is the soft-thresholding operator of threshold $\lambda$.
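The Laplace variant is described above in terms of a soft-thresholding operator. As a minimal sketch, the standard componentwise definition of that operator can be written in base R as follows. The function name `soft_threshold` and the input vector `b` are hypothetical and chosen for illustration. Whether the package applies the operator in exactly this way is an assumption.

```r
# Standard componentwise soft-thresholding:
# shrink each entry towards zero by lambda, and set it to zero when its
# absolute value does not exceed lambda.
soft_threshold <- function(x, lambda) {
  sign(x) * pmax(abs(x) - lambda, 0)
}

# Hypothetical direction estimate with two small coordinates
b <- c(0.80, 0.55, 0.20, 0.05, -0.02)

s <- soft_threshold(b, lambda = 0.1)
s  # the last two coordinates are set exactly to zero

# "Proportional to" leaves the scale free; rescale to unit norm for a direction
w <- s / sqrt(sum(s^2))
```

Larger values of the threshold zero out more coordinates, which is why this variant yields sparse directions.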
A SEPaLS estimator
library(SEPaLS)
set.seed(1)
n <- 3000
p <- 10
X <- matrix(rnorm(n*p),n,p)
# True direction: five non-zero coefficients, normalized to unit length
beta <- c(5:1,rep(0,p-5)) ; beta <- beta/sqrt(sum(beta^2))
# Cubic link with Gaussian noise
Y <- (X%*%beta)^3 + rnorm(n,sd=1/3)
# Random unit-norm direction passed as mu0
mu0 <- rnorm(p) ; mu0 <- mu0/sqrt(sum(mu0^2))
sepals_vMF <- SEPaLS(X,Y,yn=1,type="vMF",mu0=mu0,kappa0=1)
sepals_Laplace <- SEPaLS(X,Y,yn=1,type="Laplace",lambda=0.01)