Title: | Dimension Reduction, Regression and Discrimination for Chemometrics |
---|---|
Description: | Data exploration and prediction with a focus on high-dimensional data and chemometrics. The package was initially designed around partial least squares regression and discrimination models and variants, in particular locally weighted PLS models (LWPLS). It has since been expanded to many other methods for analyzing high-dimensional data. The name 'rchemo' comes from the fact that the package is oriented to chemometrics, but most of the provided methods are fully generic to other domains. Functions such as transform(), predict(), coef() and summary() are available. Tuning the predictive models is facilitated by the generic functions gridscore() (validation dataset) and gridcv() (cross-validation). Faster versions are also available for models based on latent variables (LVs) (gridscorelv() and gridcvlv()) and ridge regularization (gridscorelb() and gridcvlb()). |
Authors: | Marion Brandolini-Bunlon [aut, cre], Benoit Jaillais [aut], Jean-Michel Roger [aut], Matthieu Lesnoff [aut] |
Maintainer: | Marion Brandolini-Bunlon <[email protected]> |
License: | GPL-3 |
Version: | 0.1-3 |
Built: | 2024-11-10 02:58:51 UTC |
Source: | https://github.com/chemhouse-group/rchemo |
Calculation of the centers (means) of classes of row observations of a data set.
aggmean(X, y = NULL)
X |
Data ( |
y |
Class membership ( |
ct |
centers (column-wise means) |
lev |
classes |
ni |
number of observations in each class |
n <- 8 ; p <- 6 X <- matrix(rnorm(n * p, mean = 10), ncol = p, byrow = TRUE) y <- sample(1:2, size = n, replace = TRUE) aggmean(X, y) data(forages) Xtrain <- forages$Xtrain ytrain <- forages$ytrain table(ytrain) u <- aggmean(Xtrain, ytrain)$ct headm(u) plotsp(u, col = 1:4, main = "Means") x <- Xtrain[1:20, ] plotsp(x, ylab = "Absorbance", col = "grey") u <- aggmean(x)$ct plotsp(u, col = "red", add = TRUE, lwd = 2)
Computation of the AIC and Mallows's criteria for univariate PLSR models (Lesnoff et al. 2021). This function may receive modifications in the future (work in progress).
aicplsr( X, y, nlv, algo = NULL, meth = c("cg", "div", "cov"), correct = TRUE, B = 50, print = FALSE, ...)
X |
A |
y |
A vector of length |
nlv |
The maximal number of latent variables (LVs) to consider in the model. |
algo |
a PLS algorithm. Default to |
meth |
Method used for estimating |
correct |
Logical. If |
B |
For |
print |
Logical. If |
... |
Optional arguments to pass in |
For a model with latent variables (LVs), function aicplsr calculates AIC and Cp by:

AIC = n * log(SSR / n) + 2 * (df + 1)

Cp = SSR + 2 * s2 * df

where SSR is the sum of squared residuals for the current evaluated model, df the estimated PLSR model complexity (i.e. nb. of model's degrees of freedom), s2 an estimate of the irreducible error variance (computed from a low-biased model) and n the number of training observations.
By default (argument correct), the small sample size correction (so-called AICc) is applied to AIC and Cp to reduce the bias.
The function returns two estimates of Cp (cp1 and cp2), each corresponding to a different estimate of s2.
The model complexity df can be computed by three methods (argument meth).
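As a purely numerical illustration of the formulas above (hypothetical values, not produced by aicplsr):

n <- 100      # number of training observations
SSR <- 35     # sum of squared residuals of the evaluated model
df <- 8       # estimated model complexity
s2 <- .30     # estimate of the irreducible error variance
aic <- n * log(SSR / n) + 2 * (df + 1)
cp <- SSR + 2 * s2 * df
aic
cp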
crit |
dataframe with |
delta |
dataframe with the differences between the estimated values of |
opt |
vector with the optimal number of latent variables in the model (i.e. minimizing aic, cp1 and cp2 values) |
Burnham, K.P., Anderson, D.R., 2002. Model selection and multimodel inference: a practical informationtheoretic approach, 2nd ed. Springer, New York, NY, USA.
Burnham, K.P., Anderson, D.R., 2004. Multimodel Inference: Understanding AIC and BIC in Model Selection. Sociological Methods & Research 33, 261-304. https://doi.org/10.1177/0049124104268644
Efron, B., 2004. The Estimation of Prediction Error. Journal of the American Statistical Association 99, 619-632. https://doi.org/10.1198/016214504000000692
Eubank, R.L., 1999. Nonparametric Regression and Spline Smoothing, 2nd ed, Statistics: Textbooks and Monographs. Marcel Dekker, Inc., New York, USA.
Hastie, T., Tibshirani, R.J., 1990. Generalized Additive Models, Monographs on statistics and applied probablity. Chapman and Hall/CRC, New York, USA.
Hastie, T., Tibshirani, R., Friedman, J., 2009. The elements of statistical learning: data mining, inference, and prediction, 2nd ed. Springer, NewYork.
Hastie, T., Tibshirani, R., Wainwright, M., 2015. Statistical Learning with Sparsity: The Lasso and Generalizations. CRC Press
Hurvich, C.M., Tsai, C.-L., 1989. Regression and Time Series Model Selection in Small Samples. Biometrika 76, 297. https://doi.org/10.2307/2336663
Lesnoff, M., Roger, J.M., Rutledge, D.N., Submitted. Monte Carlo methods for estimating Mallows's Cp and AIC criteria for PLSR models. Illustration on agronomic spectroscopic NIR data. Journal of Chemometrics.
Mallows, C.L., 1973. Some Comments on Cp. Technometrics 15, 661-675. https://doi.org/10.1080/00401706.1973.10489103
Ye, J., 1998. On Measuring and Correcting the Effects of Data Mining and Model Selection. Journal of the American Statistical Association 93, 120-131. https://doi.org/10.1080/01621459.1998.10474094
Zuccaro, C., 1992. Mallows'Cp Statistic and Model Selection in Multiple Linear Regression. International Journal of Market Research. 34, 1-10. https://doi.org/10.1177/147078539203400204
data(cassav) Xtrain <- cassav$Xtrain ytrain <- cassav$ytrain nlv <- 25 res <- aicplsr(Xtrain, ytrain, nlv = nlv) names(res) headm(res$crit) z <- res$crit oldpar <- par(mfrow = c(1, 1)) par(mfrow = c(1, 4)) plot(z$df[-1]) plot(z$aic[-1], type = "b", main = "AIC") plot(z$cp1[-1], type = "b", main = "Cp1") plot(z$cp2[-1], type = "b", main = "Cp2") par(oldpar)
ASD NIRS dataset, with gaps in the spectra at wavelengths 1000 and 1800 nm.
data(asdgap)
A list with 1 element: the data frame with 5 spectra and 2151 variables.
Thanks to J.-F. Roger (Inrae, France) and M. Ecarnot (Inrae, France) for the method.
data(asdgap) names(asdgap) X <- asdgap$X numcol <- which(colnames(X) == "1000" | colnames(X) == "1800") numcol plotsp(X, lwd = 1.5) abline(v = as.numeric(colnames(X)[1]) + numcol - 1, col = "grey", lty = 3)
Functions managing blocks of data.
- blockscal
: Autoscales a list of blocks (i.e. sets of columns) of a training X-data and, optionally, the blocks of new X-data. The scaling factor (computed on the training data) is the "norm" of the block, i.e. the square root of the sum of the variances of the columns of the block (see the sketch after this list).
- mblocks
: Makes a list of blocks from X-data.
- hconcat
: Concatenates horizontally the blocks of a list.
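A minimal sketch of how the block scaling factor described above can be computed for a single block (illustration only; blockscal itself works on a list of blocks and may also use row weights):

n <- 10
block <- matrix(rnorm(n * 3), ncol = 3)    # one block = a set of columns
disp <- sqrt(sum(apply(block, 2, var)))    # "norm" of the block
zblock <- block / disp                     # autoscaled block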
blockscal(Xtrain, X = NULL, weights = NULL) mblocks(X, blocks) hconcat(X)
Xtrain |
A list of blocks of training X-data |
X |
For |
blocks |
A list (of same length as the number of blocks) giving the column numbers in |
weights |
Weights ( |
For mblocks
: a list of blocks of X-data.
For hconcat
: a matrix concatenating a list of data blocks.
For blockscal
:
Xtrain |
A list of blocks of training X-data, after block autoscaling. |
X |
A list of blocks of new X-data, after block autoscaling. |
disp |
The scaling factor (computed on the training). |
The second example is equivalent to MB-PLSR.
n <- 10 ; p <- 10 Xtrain <- matrix(rnorm(n * p), ncol = p) ytrain <- rnorm(n) m <- 2 Xtest <- matrix(rnorm(m * p), ncol = p) colnames(Xtest) <- paste("v", 1:p, sep = "") Xtrain Xtest blocks <- list(1:2, 4, 6:8) zXtrain <- mblocks(Xtrain, blocks = blocks) zXtest <- mblocks(Xtest, blocks = blocks) zXtrain blockscal(zXtrain, zXtest) res <- blockscal(zXtrain, zXtest) hconcat(res$Xtrain) hconcat(res$X) ## example of equivalence with MB-PLSR n <- 10 ; p <- 10 Xtrain <- matrix(rnorm(n * p), ncol = p) ytrain <- rnorm(n) m <- 2 Xtest <- matrix(rnorm(m * p), ncol = p) colnames(Xtest) <- paste("v", 1:p, sep = "") Xtrain Xtest blocks <- list(1:2, 4, 6:8) X1 <- mblocks(Xtrain, blocks = blocks) X1 <- lapply(1:length(X1), function(x) scale(X1[[x]])) res <- blockscal(X1) zXtrain <- hconcat(res$Xtrain) nlv <- 3 fm <- plskern(zXtrain, ytrain, nlv = nlv)
A NIRS dataset (absorbance) describing the concentration of a natural pigment in samples of tropical shrubs. Spectra were recorded from 400 to 2498 nm at 2 nm intervals.
data(cassav)
A list with the following components:
For the reference (calibration) data:
Xtrain
A matrix whose rows are the NIR absorbance spectra (= log10(1 / Reflectance)).
ytrain
A vector of the response variable (pigment concentration).
year
A vector of the year of data collection (2009 to 2012; the test set corresponds to year 2013).
For the test data:
Xtest
A matrix whose rows are the NIR absorbance spectra (= log10(1 / Reflectance)).
ytest
A vector of the response variable (pigment concentration).
Davrieux, F., Dufour, D., Dardenne, P., Belalcazar, J., Pizarro, M., Luna, J., Londono, L., Jaramillo, A., Sanchez, T., Morante, N., Calle, F., Becerra Lopez-Lavalle, L., Ceballos, H., 2016. LOCAL regression algorithm improves near infrared spectroscopy predictions when the target constituent evolves in breeding populations. Journal of Near Infrared Spectroscopy 24, 109. https://doi.org/10.1255/jnirs.1213
CIAT Cassava Project (Colombia), CIRAD Qualisud Research Unit, and funded mainly by the CGIAR Research Program on Roots, Tubers and Bananas (RTB) with support from CGIAR Trust Fund contributors (https://www.cgiar.org/funders/).
data(cassav) str(cassav)
Conjugate gradient algorithm (CG) for the normal equations (CGLS algorithm 7.4.1, Bjorck 1996, p.289)
cglsr(X, y, nlv, reorth = TRUE, filt = FALSE) ## S3 method for class 'Cglsr' coef(object, ..., nlv = NULL) ## S3 method for class 'Cglsr' predict(object, X, ..., nlv = NULL)
X |
For the main function: Training X-data ( |
y |
Univariate training Y-data ( |
nlv |
The number(s) of CG iterations. |
reorth |
Logical. If |
filt |
Logical. If |
object |
For auxiliary functions: A fitted model, output of a call to the main functions. |
... |
For auxiliary functions: Optional arguments. Not used. |
The code for re-orthogonalization (Hansen 1998) and filter factors (Vogel 1987, Hansen 1998) computations is a transcription (with few adaptations) of the matlab function 'cgls' (Saunders et al. https://web.stanford.edu/group/SOL/software/cgls/; Hansen 2008).
The filter factors can be used to compute the model complexity of CGLSR and PLSR models (see dfplsr_cg).
Data X and y are internally centered.
Missing values are not allowed.
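For intuition, a compact sketch of the classical CGLS iteration for the normal equations (Bjorck 1996), on centered data, without the re-orthogonalization and filter-factor computations; this illustrates the principle only and is not the exact implementation of cglsr:

cgls_sketch <- function(X, y, nlv) {
    X <- scale(X, center = TRUE, scale = FALSE)    # centering, as in cglsr
    y <- y - mean(y)
    b <- rep(0, ncol(X))                           # current solution
    r <- y                                         # residual of the LS problem
    s <- crossprod(X, r)                           # gradient of the normal equations
    d <- s                                         # search direction
    gam <- sum(s^2)
    for(i in seq_len(nlv)) {
        q <- X %*% d
        alpha <- gam / sum(q^2)
        b <- b + alpha * d
        r <- r - alpha * q
        s <- crossprod(X, r)
        gamnew <- sum(s^2)
        d <- s + (gamnew / gam) * d
        gam <- gamnew
        }
    b
    }
## e.g. cgls_sketch(X, y, nlv = 5) on the data of the example below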
For cglsr
:
B |
matrix with the model coefficients for the fixed nlv. |
gnew |
squared norm of the s vector |
xmeans |
variable means for the training X-data |
ymeans |
variable means for the training Y-data |
F |
If |
For coef.Cglsr
:
int |
intercept value. |
B |
matrix with the model coefficients. |
For predict.Cglsr
:
pred |
list of matrices, with the predicted values for each number |
Bjorck, A., 1996. Numerical Methods for Least Squares Problems, Other Titles in Applied Mathematics. Society for Industrial and Applied Mathematics. https://doi.org/10.1137/1.9781611971484
Hansen, P.C., 1998. Rank-Deficient and Discrete Ill-Posed Problems, Mathematical Modeling and Computation. Society for Industrial and Applied Mathematics. https://doi.org/10.1137/1.9780898719697
Hansen, P.C., 2008. Regularization Tools version 4.0 for Matlab 7.3. Numer Algor 46, 189-194. https://doi.org/10.1007/s11075-007-9136-9
Manne R. Analysis of two partial-least-squares algorithms for multivariate calibration. Chemometrics Intell. Lab. Syst. 1987; 2: 187-197.
Phatak A, De Hoog F. Exploiting the connection between PLS, Lanczos methods and conjugate gradients: alternative proofs of some properties of PLS. J. Chemometrics 2002; 16: 361-367.
Vogel, C. R., "Solving ill-conditioned linear systems using the conjugate gradient method", Report, Dept. of Mathematical Sciences, Montana State University, 1987.
data(ozone) z <- ozone$X u <- which(!is.na(rowSums(z))) X <- z[u, -4] y <- z[u, 4] dim(X) headm(X) Xtest <- X[1:2, ] ytest <- y[1:2] nlv <- 10 fm <- cglsr(X, y, nlv = nlv) coef(fm) coef(fm, nlv = 1) predict(fm, Xtest) predict(fm, Xtest, nlv = 1:3) pred <- predict(fm, Xtest)$pred msep(pred, ytest) cglsr(X, y, nlv = 5, filt = TRUE)$F
Finding and removing duplicated row observations in datasets.
checkdupl(X, Y = NULL, digits = NULL)
X |
A dataset. |
Y |
A dataset compared to |
digits |
The number of digits when rounding the data before the duplication test. Default to |
a dataframe with the row numbers in the first and second datasets that are identical, and the values of the variables.
X1 <- matrix(c(1:5, 1:5, c(1, 2, 7, 4, 8)), nrow = 3, byrow = TRUE) dimnames(X1) <- list(1:3, c("v1", "v2", "v3", "v4", "v5")) X2 <- matrix(c(6:10, 1:5, c(1, 2, 7, 6, 12)), nrow = 3, byrow = TRUE) dimnames(X2) <- list(1:3, c("v1", "v2", "v3", "v4", "v5")) X1 X2 checkdupl(X1, X2) checkdupl(X1) checkdupl(matrix(rnorm(20), nrow = 5)) res <- checkdupl(X1) s <- unique(res$rownum2) zX1 <- X1[-s, ] zX1
Find and count NA values in each row observation of a dataset.
checkna(X)
X |
A dataset. |
A data frame summarizing the numbers of NA by rows.
X <- data.frame( v1 = c(NA, rnorm(9)), v2 = c(NA, rnorm(8), NA), v3 = c(NA, NA, NA, rnorm(7)) ) X checkna(X)
Variable selection for high-dimensional data with the COVSEL method (Roger et al. 2011).
covsel(X, Y, nvar = NULL, scaly = TRUE, weights = NULL)
X |
X-data ( |
Y |
Y-data ( |
nvar |
Number of variables to select in |
scaly |
If |
weights |
Weights ( |
sel |
A dataframe where variable |
weights |
The weights used for the row observations. |
Roger, J.M., Palagos, B., Bertrand, D., Fernandez-Ahumada, E., 2011. CovSel: Variable selection for highly multivariate and multi-response calibration: Application to IR spectroscopy. Chemometrics and Intelligent Laboratory Systems 106, 216-223.
n <- 6 ; p <- 4 X <- matrix(rnorm(n * p), ncol = p) Y <- matrix(rnorm(n * 2), ncol = 2) covsel(X, Y, nvar = 3)
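For intuition, a minimal sketch of the selection-and-deflation idea behind COVSEL, on centered and unweighted data: at each step the variable with the largest (squared) covariance with Y is selected, then X and Y are orthogonalized to that variable. Illustration of the principle only, not the exact implementation of covsel:

covsel_sketch <- function(X, Y, nvar) {
    X <- scale(X, center = TRUE, scale = FALSE)
    Y <- scale(as.matrix(Y), center = TRUE, scale = FALSE)
    sel <- integer(0)
    for(i in seq_len(nvar)) {
        z <- rowSums(crossprod(X, Y)^2)    # squared covariances (up to a constant)
        j <- which.max(z)
        sel <- c(sel, j)
        x <- X[, j]
        P <- diag(nrow(X)) - tcrossprod(x) / sum(x^2)    # projector orthogonal to x
        X <- P %*% X
        Y <- P %*% Y
        }
    sel
    }
## e.g. covsel_sketch(X, Y, nvar = 3) on the data of the example above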
Calculation of the first derivatives, by finite differences, of the row observations (e.g. spectra) of a dataset.
dderiv(X, n = 5, ts = 1)
X |
X-data ( |
n |
The number of points (i.e. columns of |
ts |
A scaling factor for the finite differences (by default, |
A matrix of the transformed data.
data(cassav) X <- cassav$Xtest n <- 15 Xp_derivate1 <- dderiv(X, n = n) Xp_derivate2 <- dderiv(dderiv(X, n), n) oldpar <- par(mfrow = c(1, 1)) par(mfrow = c(1, 2)) plotsp(X, main = "Signal") plotsp(Xp_derivate1, main = "Corrected signal") abline(h = 0, lty = 2, col = "grey") par(oldpar)
Polynomial de-trend transformation of row observations (e.g. spectra) of a dataset. The function fits an orthogonal polynomial of a given degree to each observation and returns the residuals.
detrend(X, degree = 1)
X |
X-data ( |
degree |
Degree of the polynomial. |
detrend uses function poly of package stats.
A matrix of the transformed data.
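A minimal sketch of the de-trend principle for a single observation, assuming the column index is used as the polynomial regressor (illustration only; the internal computation of detrend may differ in its details):

p <- 100
x <- sin(seq(0, 3, length = p)) + .05 * seq_len(p)    # signal with a linear trend
base <- stats::poly(seq_len(p), degree = 1)           # orthogonal polynomial basis
xcorr <- stats::resid(lm(x ~ base))                   # residuals = de-trended signal
plot(x, type = "l")
lines(xcorr, col = "red")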
data(cassav) X <- cassav$Xtest degree <- 1 Xp <- detrend(X, degree = degree) oldpar <- par(mfrow = c(1, 1)) par(mfrow = c(1, 2)) plotsp(X, main = "Signal") plotsp(Xp, main = "Corrected signal") abline(h = 0, lty = 2, col = "grey") par(oldpar)
Computation of the model complexity (number of degrees of freedom) of univariate PLSR models (with intercept). See Lesnoff et al. 2021 for an illustration.
- dfplsr_cg
: (1) Estimation from the CGLSR algorithm (Hansen, 1998).
(2) Monte Carlo estimation (Ye, 1998 and Efron, 2004). Details in relation with the functions are given in Lesnoff et al. 2021:
- dfplsr_cov
: The covariances are computed by parametric bootstrap (Efron, 2004, Eq. 2.16). The residual variance is estimated from a low-biased model.
- dfplsr_div
: The divergencies are computed by perturbation analysis (Ye, 1998 and Efron, 2004). This is a Stein unbiased risk estimation (SURE) of df.
dfplsr_cg(X, y, nlv, reorth = TRUE) dfplsr_cov( X, y, nlv, algo = NULL, maxlv = 50, B = 30, print = FALSE, ...) dfplsr_div( X, y, nlv, algo = NULL, eps = 1e-2, B = 30, print = FALSE, ...)
X |
A |
y |
A vector of length |
nlv |
The maximal number of latent variables (LVs) to consider in the model. |
reorth |
For |
algo |
a PLS algorithm. Default to |
maxlv |
For |
eps |
For |
B |
For |
print |
Logical. If |
... |
Optional arguments to pass in the function defined in |
Missing values are not allowed.
The example below reproduces the numerical illustration given by Kramer & Sugiyama 2011 on the Ozone data (Fig. 1, center).
The function from the R package "plsdof" v0.2-9 (Kramer & Braun 2019) is used for the reference df calculations (values df.kramer in the example), and automatically scales the X matrix before PLS. The example also scales X for consistency when using the other functions.
For the Monte Carlo estimations, B should be increased for more stability.
A list of outputs :
df |
vector with the model complexity for the models with |
cov |
For |
Efron, B., 2004. The Estimation of Prediction Error. Journal of the American Statistical Association 99, 619-632. https://doi.org/10.1198/016214504000000692
Hastie, T., Tibshirani, R.J., 1990. Generalized Additive Models, Monographs on statistics and applied probablity. Chapman and Hall/CRC, New York, USA.
Hastie, T., Tibshirani, R., Friedman, J., 2009. The elements of statistical learning: data mining, inference, and prediction, 2nd ed. Springer, NewYork.
Hastie, T., Tibshirani, R., Wainwright, M., 2015. Statistical Learning with Sparsity: The Lasso and Generalizations. CRC Press
Kramer, N., Braun, M.L., 2007. Kernelizing PLS, degrees of freedom, and efficient model selection, in: Proceedings of the 24th International Conference on Machine Learning, ICML 07. Association for Computing Machinery, New York, NY, USA, pp. 441-448. https://doi.org/10.1145/1273496.1273552
Kramer, N., Sugiyama, M., 2011. The Degrees of Freedom of Partial Least Squares Regression. Journal of the American Statistical Association 106, 697-705. https://doi.org/10.1198/jasa.2011.tm10107
Kramer, N., Braun, M. L. 2019. plsdof: Degrees of Freedom and Statistical Inference for Partial Least Squares Regression. R package version 0.2-9. https://cran.r-project.org
Lesnoff, M., Roger, J.M., Rutledge, D.N., 2021. Monte Carlo methods for estimating Mallow's Cp and AIC criteria for PLSR models. Illustration on agronomic spectroscopic NIR data. Journal of Chemometrics, 35(10), e3369. https://doi.org/10.1002/cem.3369
Stein, C.M., 1981. Estimation of the Mean of a Multivariate Normal Distribution. The Annals of Statistics 9, 1135-1151.
Ye, J., 1998. On Measuring and Correcting the Effects of Data Mining and Model Selection. Journal of the American Statistical Association 93, 120-131. https://doi.org/10.1080/01621459.1998.10474094
Zou, H., Hastie, T., Tibshirani, R., 2007. On the degrees of freedom of the lasso. The Annals of Statistics 35, 2173-2192. https://doi.org/10.1214/009053607000000127
## EXAMPLE 1 data(ozone) z <- ozone$X u <- which(!is.na(rowSums(z))) X <- z[u, -4] y <- z[u, 4] dim(X) Xs <- scale(X) nlv <- 12 res <- dfplsr_cg(Xs, y, nlv = nlv) df.kramer <- c(1.000000, 3.712373, 6.456417, 11.633565, 12.156760, 11.715101, 12.349716, 12.192682, 13.000000, 13.000000, 13.000000, 13.000000, 13.000000) znlv <- 0:nlv plot(znlv, res$df, type = "l", col = "red", ylim = c(0, 15), xlab = "Nb components", ylab = "df") lines(znlv, znlv + 1, col = "grey40") points(znlv, df.kramer, pch = 16) abline(h = 1, lty = 2, col = "grey") legend("bottomright", legend=c("dfplsr_cg","Naive df","df.kramer"), col=c("red","grey40","black"), lty=c(1,1,0), pch=c(NA,NA,16), bty="n") ## EXAMPLE 2 data(ozone) z <- ozone$X u <- which(!is.na(rowSums(z))) X <- z[u, -4] y <- z[u, 4] dim(X) Xs <- scale(X) nlv <- 12 B <- 50 u <- dfplsr_cov(Xs, y, nlv = nlv, B = B) v <- dfplsr_div(Xs, y, nlv = nlv, B = B) df.kramer <- c(1.000000, 3.712373, 6.456417, 11.633565, 12.156760, 11.715101, 12.349716, 12.192682, 13.000000, 13.000000, 13.000000, 13.000000, 13.000000) znlv <- 0:nlv plot(znlv, u$df, type = "l", col = "red", ylim = c(0, 15), xlab = "Nb components", ylab = "df") lines(znlv, v$df, col = "blue") lines(znlv, znlv + 1, col = "grey40") points(znlv, df.kramer, pch = 16) abline(h = 1, lty = 2, col = "grey") legend("bottomright", legend=c("dfplsr_cov","dfplsr_div","Naive df","df.kramer"), col=c("blue","red","grey40","black"), lty=c(1,1,1,0), pch=c(NA,NA,NA,16), bty="n")
Direct kernel PLSR (DKPLSR) (Bennett & Embrechts 2003). The method builds kernel Gram matrices and then runs a usual PLSR algorithm on them. This is faster (but not equivalent) to the "true" NIPALS KPLSR algorithm such as described in Rosipal & Trejo (2001).
dkplsr(X, Y, weights = NULL, nlv, kern = "krbf", ...) ## S3 method for class 'Dkpls' transform(object, X, ..., nlv = NULL) ## S3 method for class 'Dkpls' coef(object, ..., nlv = NULL) ## S3 method for class 'Dkplsr' predict(object, X, ..., nlv = NULL)
X |
For the main function: Matrix with the training X-data ( |
Y |
Matrix with the training Y-data ( |
weights |
vector of weights ( |
nlv |
For the main function: The number(s) of LVs to calculate. — For auxiliary functions: The number(s) of LVs to consider. |
kern |
Name of the function defining the considered kernel for building the Gram matrix. See |
... |
Optional arguments to pass in the kernel function defined in |
object |
For auxiliary functions: A fitted model, output of a call to the main function. |
For dkplsr
:
X |
Matrix with the training X-data ( |
fm |
List with the outputs of the PLSR (( |
K |
kernel Gram matrix |
kern |
kernel function |
dots |
Optional arguments passed in the kernel function |
For transform.Dkplsr
: A matrix with the projection of the new X-data on the X-scores.
For predict.Dkplsr
:
pred |
A list of matrices ( |
K |
kernel Gram matrix ( |
For coef.Dkplsr
:
int |
matrix (1,nlv) with the intercepts |
B |
matrix (n,nlv) with the coefficients |
The second example concerns the fitting of the function sinc(x) described in Rosipal & Trejo 2001 p. 105-106
Bennett, K.P., Embrechts, M.J., 2003. An optimization perspective on kernel partial least squares regression, in: Advances in Learning Theory: Methods, Models and Applications, NATO Science Series III: Computer & Systems Sciences. IOS Press Amsterdam, pp. 227-250.
Rosipal, R., Trejo, L.J., 2001. Kernel Partial Least Squares Regression in Reproducing Kernel Hilbert Space. Journal of Machine Learning Research 2, 97-123.
## EXAMPLE 1 n <- 6 ; p <- 4 Xtrain <- matrix(rnorm(n * p), ncol = p) ytrain <- rnorm(n) Ytrain <- cbind(y1 = ytrain, y2 = 100 * ytrain) m <- 3 Xtest <- Xtrain[1:m, , drop = FALSE] Ytest <- Ytrain[1:m, , drop = FALSE] ; ytest <- Ytest[1:m, 1] nlv <- 2 fm <- dkplsr(Xtrain, Ytrain, nlv = nlv, kern = "krbf", gamma = .8) transform(fm, Xtest) transform(fm, Xtest, nlv = 1) coef(fm) coef(fm, nlv = 1) predict(fm, Xtest) predict(fm, Xtest, nlv = 0:nlv)$pred pred <- predict(fm, Xtest)$pred msep(pred, Ytest) nlv <- 2 fm <- dkplsr(Xtrain, Ytrain, nlv = nlv, kern = "kpol", degree = 2, coef0 = 10) predict(fm, Xtest, nlv = nlv) ## EXAMPLE 2 x <- seq(-10, 10, by = .2) x[x == 0] <- 1e-5 n <- length(x) zy <- sin(abs(x)) / abs(x) y <- zy + rnorm(n, 0, .2) plot(x, y, type = "p") lines(x, zy, lty = 2) X <- matrix(x, ncol = 1) nlv <- 3 fm <- dkplsr(X, y, nlv = nlv) pred <- predict(fm, X)$pred plot(X, y, type = "p") lines(X, zy, lty = 2) lines(X, pred, col = "red")
Direct kernel ridge regression (DKRR), following the same approach as for DKPLSR (Bennett & Embrechts 2003). The method builds kernel Gram matrices and then runs an RR algorithm on them. This is not equivalent to the "true" KRR (= LS-SVM) algorithm.
dkrr(X, Y, weights = NULL, lb = 1e-2, kern = "krbf", ...) ## S3 method for class 'Dkrr' coef(object, ..., lb = NULL) ## S3 method for class 'Dkrr' predict(object, X, ..., lb = NULL)
X |
For the main function: Training X-data ( |
Y |
Training Y-data ( |
weights |
Weights ( |
lb |
A value of regularization parameter |
kern |
Name of the function defining the considered kernel for building the Gram matrix. See |
... |
Optional arguments to pass in the kernel function defined in |
object |
For the auxiliary functions: A fitted model, output of a call to the main function. |
For dkrr
:
X |
Matrix with the training X-data ( |
fm |
List with the outputs of the RR (( |
K |
kernel Gram matrix |
kern |
kernel function |
dots |
Optional arguments passed in the kernel function |
For predict.Dkrr
:
pred |
A list of matrices ( |
K |
kernel Gram matrix ( |
For coef.Dkrr
:
int |
matrix (1,nlv) with the intercepts |
B |
matrix (n,nlv) with the coefficients |
df |
model complexity (number of degrees of freedom) |
The second example concerns the fitting of the function sinc(x) described in Rosipal & Trejo 2001 p. 105-106
Bennett, K.P., Embrechts, M.J., 2003. An optimization perspective on kernel partial least squares regression, in: Advances in Learning Theory: Methods, Models and Applications, NATO Science Series III: Computer & Systems Sciences. IOS Press Amsterdam, pp. 227-250.
Rosipal, R., Trejo, L.J., 2001. Kernel Partial Least Squares Regression in Reproducing Kernel Hilbert Space. Journal of Machine Learning Research 2, 97-123.
## EXAMPLE 1 n <- 6 ; p <- 4 Xtrain <- matrix(rnorm(n * p), ncol = p) ytrain <- rnorm(n) Ytrain <- cbind(y1 = ytrain, y2 = 100 * ytrain) m <- 3 Xtest <- Xtrain[1:m, , drop = FALSE] Ytest <- Ytrain[1:m, , drop = FALSE] ; ytest <- Ytest[1:m, 1] lb <- 2 fm <- dkrr(Xtrain, Ytrain, lb = lb, kern = "krbf", gamma = .8) coef(fm) coef(fm, lb = .6) predict(fm, Xtest) predict(fm, Xtest, lb = c(0.1, .8)) pred <- predict(fm, Xtest)$pred msep(pred, Ytest) lb <- 2 fm <- dkrr(Xtrain, Ytrain, lb = lb, kern = "kpol", degree = 2, coef0 = 10) predict(fm, Xtest) ## EXAMPLE 2 x <- seq(-10, 10, by = .2) x[x == 0] <- 1e-5 n <- length(x) zy <- sin(abs(x)) / abs(x) y <- zy + rnorm(n, 0, .2) plot(x, y, type = "p") lines(x, zy, lty = 2) X <- matrix(x, ncol = 1) fm <- dkrr(X, y, lb = .01, gamma = .5) pred <- predict(fm, X)$pred plot(X, y, type = "p") lines(X, zy, lty = 2) lines(X, pred, col = "red")
Prediction of the normal probability density of multivariate observations.
dmnorm(X = NULL, mu = NULL, sigma = NULL) ## S3 method for class 'Dmnorm' predict(object, X, ...)
X |
For the main function: Training data ( |
mu |
The mean ( |
sigma |
The covariance matrix ( |
object |
For the auxiliary functions: A result of a call to |
... |
For the auxiliary functions: Optional arguments. |
For dmnorm
:
mu |
means of the X variables |
Uinv |
inverse of the Cholesky decomposition of the covariance matrix |
det |
squared determinant of the Cholesky decomposition of the covariance matrix |
For predict
:
pred |
Prediction of the normal probability density of new multivariate observations |
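For intuition, a minimal sketch of how the normal density can be evaluated from the same ingredients (means, Cholesky factorization of the covariance matrix); illustration only, not the exact internal code of dmnorm:

Xtrain <- matrix(rnorm(40 * 2), ncol = 2)
xnew <- c(.2, -.1)                          # a new observation
mu <- colMeans(Xtrain)
S <- cov(Xtrain)
U <- chol(S)                                # S = t(U) %*% U
z <- (xnew - mu) %*% solve(U)               # "whitened" deviation
p <- length(mu)
dens <- exp(-.5 * sum(z^2)) / ((2 * pi)^(p / 2) * prod(diag(U)))
dens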
data(iris) X <- iris[, 1:2] Xtrain <- X[1:40, ] Xtest <- X[40:50, ] fm <- dmnorm(Xtrain) fm k <- 50 x1 <- seq(min(Xtrain[, 1]), max(Xtrain[, 1]), length.out = k) x2 <- seq(min(Xtrain[, 2]), max(Xtrain[, 2]), length.out = k) zX <- expand.grid(x1, x2) pred <- predict(fm, zX)$pred contour(x1, x2, matrix(pred, nrow = 50)) points(Xtest, col = "red", pch = 16)
Faster alternative to aggregate to calculate a summary statistic over data subsets.
dtagg uses function data.table::data.table of package data.table.
dtagg(formula, data, FUN = mean, ...)
formula |
A formula (left- and right-hand sides) defining the variable and the aggregation levels on which the statistic is calculated. |
data |
A dataframe. |
FUN |
Function defining the statistic to compute (default to |
... |
Optional additional arguments to pass through |
A dataframe with the values of the aggregation level(s) and the corresponding computed statistic value.
dat <- data.frame(matrix(rnorm(2 * 100), ncol = 2)) names(dat) <- c("y1", "y2") dat$typ1 <- sample(1:2, size = nrow(dat), TRUE) dat$typ2 <- sample(1:3, size = nrow(dat), TRUE) headm(dat) dtagg(y1 ~ 1, data = dat) dtagg(y1 ~ typ1 + typ2, data = dat) dtagg(y1 ~ typ1 + typ2, data = dat, trim = .2)
The function builds a table of dummy variables from a qualitative variable. A binary (i.e. 0/1) variable is created for each level of the qualitative variable.
dummy(y)
y |
A qualitative variable. |
Y |
A matrix of dummy variables (i.e. binary variables), each representing a given level of the qualitative variable. |
lev |
levels of the qualitative variable. |
ni |
number of observations per level of the qualitative variable. |
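As a cross-check of the principle, the same indicator table can be built directly with base R (illustration only):

y <- c(1, 1, 3, 2, 3)
lev <- sort(unique(y))
Y <- outer(y, lev, "==") * 1    # one 0/1 column per level of y
colnames(Y) <- lev
Y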
y <- c(1, 1, 3, 2, 3) dummy(y) y <- c("B", "a", "B") dummy(y) dummy(as.factor(y))
Pre-processing of an X-dataset by external parameter orthogonalization (EPO; Roger et al. 2003). The objective is to remove from a dataset X some "detrimental" information (e.g. humidity effect) represented by a dataset D.
EPO consists in orthogonalizing the row observations of X to the detrimental sub-space defined by the first nlv non-centered PCA loading vectors of D.
Function eposvd uses a SVD factorization of D and returns M, the orthogonalization matrix, and P, the considered loading vectors of D.
eposvd(D, nlv)
D |
A dataset |
nlv |
The number of first loadings vectors of |
The data corrected from the detrimental information can be computed by Xcorr = X %*% M.
Rows of the corrected matrix Xcorr are orthogonal to the loading vectors (columns of P): Xcorr %*% P = 0.
M |
orthogonalization matrix. |
P |
detrimental directions matrix (p, nlv) (loadings of D = columns of P). |
Roger, J.-M., Chauchard, F., Bellon-Maurel, V., 2003. EPO-PLS external parameter orthogonalisation of PLS application to temperature-independent measurement of sugar content of intact fruits. Chemometrics and Intelligent Laboratory Systems 66, 191-204. https://doi.org/10.1016/S0169-7439(03)00051-0
Roger, J.-M., Boulet, J.-C., 2018. A review of orthogonal projections for calibration. Journal of Chemometrics 32, e3045. https://doi.org/10.1002/cem.3045
n <- 4 ; p <- 8 X <- matrix(rnorm(n * p), ncol = p) m <- 3 D <- matrix(rnorm(m * p), ncol = p) nlv <- 2 res <- eposvd(D, nlv = nlv) M <- res$M P <- res$P M P
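A minimal sketch of the correction step described in the details, on simulated data (illustration of the principle):

n <- 4 ; p <- 8
X <- matrix(rnorm(n * p), ncol = p)    # data to correct
D <- matrix(rnorm(3 * p), ncol = p)    # detrimental information
res <- eposvd(D, nlv = 2)
Xcorr <- X %*% res$M                   # corrected data
round(Xcorr %*% res$P, 10)             # ~ 0: rows of Xcorr orthogonal to the columns of P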
----- Matrix of distances between the row observations of two datasets X and Y:
- euclsq
: Squared Euclidean distance.
- mahsq
: Squared Mahalanobis distance.
----- Matrix of distances between the row observations of a dataset X and a vector mu:
- euclsq_mu
: Squared Euclidean distance.
- mahsq_mu
: Squared Mahalanobis distance.
euclsq(X, Y = NULL) euclsq_mu(X, mu) mahsq(X, Y = NULL, Uinv = NULL) mahsq_mu(X, mu, Uinv = NULL)
X |
X-data ( |
Y |
Data ( |
mu |
Vector ( |
Uinv |
For Mahalanobis distance. The inverse of a Choleski factorization matrix of the covariance matrix of |
A distance matrix.
n <- 5 ; p <- 3 X <- matrix(rnorm(n * p), ncol = p) euclsq(X) as.matrix(stats::dist(X)^2) euclsq(X, X) Y <- X[c(1, 3), ] euclsq(X, Y) euclsq_mu(X, Y[2, ]) i <- 3 euclsq(X, X[i, , drop = FALSE]) euclsq_mu(X, X[i, ]) S <- cov(X) * (n - 1) / n i <- 3 mahsq(X)[i, , drop = FALSE] stats::mahalanobis(X, X[i, ], S) mahsq(X) Y <- X[c(1, 3), ] mahsq(X, Y)
Factorial discriminant analysis (FDA). The functions maximize the compromise p'Bp / p'Wp, i.e. max p'Bp with constraint p'Wp = 1, where B is the between-class and W the within-class covariance matrix. Vectors p are the linear discriminant coefficients "LD".
- fda
: Eigen factorization of W^(-1)B.
- fdasvd
: Weighted SVD factorization of the matrix of the class centers.
If W is singular, W^(-1) is replaced by a MP (Moore-Penrose) pseudo-inverse.
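For intuition, a conceptual sketch of the eigen factorization of W^(-1)B on the iris data; this illustrates the principle only and does not necessarily reproduce the exact scaling conventions of fda:

data(iris)
X <- as.matrix(iris[, 1:4])
y <- iris[, 5]
Xc <- scale(X, scale = FALSE)                       # column-centered data
lev <- levels(y)
ni <- as.numeric(table(y))
ct <- t(sapply(lev, function(l) colMeans(Xc[y == l, ])))    # class centers
W <- Reduce("+", lapply(lev, function(l) {
    Z <- scale(Xc[y == l, ], scale = FALSE)         # deviations from the class center
    crossprod(Z)
    })) / nrow(X)                                   # within covariance (biased version)
B <- crossprod(sqrt(ni / nrow(X)) * ct)             # between covariance
e <- eigen(solve(W) %*% B)
P <- Re(e$vectors[, 1:2])                           # first 2 LDs (up to scaling and sign)
Tscores <- Xc %*% P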
fda(X, y, nlv = NULL) fdasvd(X, y, nlv = NULL) ## S3 method for class 'Fda' transform(object, X, ..., nlv = NULL) ## S3 method for class 'Fda' summary(object, ...)
X |
For the main functions: Training X-data ( |
y |
Training class membership ( |
nlv |
For the main functions: The number(s) of LVs to calculate. — For the auxiliary functions: The number(s) of LVs to consider. |
object |
For the auxiliary functions: A fitted model, output of a call to the main function. |
... |
For the auxiliary functions: Optional arguments. Not used. |
For fda
and fdasvd
:
T |
X-scores matrix (n,nlv). |
P |
X-loadings matrix (p,nlv) = coefficients of the linear discriminant function = "LD" of function lda of package MASS. |
Tcenters |
projection of the class centers in the score space. |
eig |
vector of eigen values |
sstot |
total variance |
W |
unbiased within covariance matrix |
xmeans |
means of the X variables |
lev |
y levels |
ni |
number of observations per level of the y variable |
For transform.Fda
: scores of the new X-data in the model.
For summary.Fda
:
explvar |
Explained variance by PCA of the class centers in transformed scale. |
Saporta G., 2011. Probabilités analyse des données et statistique. Editions Technip, Paris, France.
data(iris) X <- iris[, 1:4] y <- iris[, 5] table(y) fm <- fda(X, y) headm(fm$T) transform(fm, X[1:3, ]) summary(fm) plotxy(fm$T, group = y, ellipse = TRUE, zeroes = TRUE, pch = 16, cex = 1.5, ncol = 2) points(fm$Tcenters, pch = 8, col = "blue", cex = 1.5)
A NIRS dataset (pre-processed absorbance) describing the class membership of forages. Spectra were recorded from 1100 to 2498 nm at 2 nm intervals.
data(forages)
A list with 4 components: Xtrain
, ytrain
, Xtest
, ytest
.
For the reference (calibration) data:
Xtrain
A matrix whose rows are the pre-processed NIR absorbance spectra (= log10(1 / Reflectance)).
ytrain
A vector of the response variable (class membership).
For the test data:
Xtest
A matrix whose rows are the pre-processed NIR absorbance spectra (= log10(1 / Reflectance)).
ytest
A vector of the response variable (class membership).
data(forages) str(forages)
Function getknn selects the nearest neighbours of each row observation of a new data set (= query) within a training data set, based on a dissimilarity measure.
getknn uses function get.knnx of package FNN (Beygelzimer et al.), available on CRAN.
getknn(Xtrain, X, k = NULL, diss = c("eucl", "mahal"), algorithm = "brute", list = TRUE)
Xtrain |
Training X-data ( |
X |
New X-data ( |
k |
The number of nearest neighbors to select in |
diss |
The type of dissimilarity used. Possible values are "eucl" (default; Euclidean distance) or "mahal" (Mahalanobis distance). |
algorithm |
Search algorithm used for Euclidean and Mahalanobis distances. Default to |
list |
If |
A list of outputs, such as:
nn |
A dataframe ( |
d |
A dataframe ( |
listnn |
Same as |
listd |
Same as |
n <- 10 p <- 4 X <- matrix(rnorm(n * p), ncol = p) Xtrain <- X Xtest <- X[c(1, 3), ] m <- nrow(Xtest) k <- 3 getknn(Xtrain, Xtest, k = k) fm <- pcasvd(Xtrain, nlv = 2) Ttrain <- fm$T Ttest <- transform(fm, Xtest) getknn(Ttrain, Ttest, k = k, diss = "mahal")
Functions for cross-validating predictive models.
The functions return "scores" (average error rates) of predictions for a given model and a grid of parameter values, calculated from a cross-validation process.
- gridcv
: Can be used for any model.
- gridcvlv
: Specific to models using regularization by latent variables (LVs) (e.g. PLSR). Much faster than gridcv
.
- gridcvlb
: Specific to models using ridge regularization (e.g. RR). Much faster than gridcv
.
gridcv(X, Y, segm, score, fun, pars, verb = TRUE) gridcvlv(X, Y, segm, score, fun, nlv, pars = NULL, verb = TRUE) gridcvlb(X, Y, segm, score, fun, lb, pars = NULL, verb = TRUE)
X |
Training X-data ( |
Y |
Training Y-data ( |
segm |
|
score |
A function calculating a prediction score (e.g. |
fun |
A function corresponding to the predictive model. |
nlv |
For |
lb |
For |
pars |
A list of named vectors. Each vector must correspond to an argument of the model function and gives the parameter values to consider for this argument. (see details) |
verb |
Logical. If |
Argument pars (the grid) must be a list of named vectors, each vector corresponding to an argument of the model function and giving the parameter values to consider for this argument. This list can be built with function mpars, which returns all the combinations of the input parameters; see the examples.
For gridcvlv, pars must not contain nlv (nb. of LVs), and for gridcvlb, it must not contain lb (the regularization parameter lambda).
Dataframes with the prediction scores for the grid.
Examples are given:
- with PLSR, using gridcv and gridcvlv (much faster)
- with PLSLDA, using gridcv and gridcvlv (much faster)
- with RR, using gridcv and gridcvlb (much faster)
- with KRR, using gridcv and gridcvlb (much faster)
- with LWPLSR, using gridcvlv
## EXAMPLE WITH PLSR n <- 50 ; p <- 8 X <- matrix(rnorm(n * p), ncol = p) y <- rnorm(n) Y <- cbind(y, 10 * rnorm(n)) K = 3 segm <- segmkf(n = n, K = K, nrep = 1) segm nlv <- 5 pars <- mpars(nlv = 1:nlv) pars gridcv( X, Y, segm, score = msep, fun = plskern, pars = pars, verb = TRUE) gridcvlv( X, Y, segm, score = msep, fun = plskern, nlv = 0:nlv, verb = TRUE) ## EXAMPLE WITH PLSLDA n <- 50 ; p <- 8 X <- matrix(rnorm(n * p), ncol = p, byrow = TRUE) y <- sample(c(1, 4, 10), size = n, replace = TRUE) K = 3 segm <- segmkf(n = n, K = K, nrep = 1) segm nlv <- 5 pars <- mpars(nlv = 1:nlv, prior = c("unif", "prop")) pars gridcv( X, y, segm, score = err, fun = plslda, pars = pars, verb = TRUE) pars <- mpars(prior = c("unif", "prop")) pars gridcvlv( X, y, segm, score = err, fun = plslda, nlv = 1:nlv, pars = pars, verb = TRUE) ## EXAMPLE WITH RR n <- 50 ; p <- 8 X <- matrix(rnorm(n * p), ncol = p) y <- rnorm(n) Y <- cbind(y, 10 * rnorm(n)) K = 3 segm <- segmkf(n = n, K = K, nrep = 1) segm lb <- c(.1, 1) pars <- mpars(lb = lb) pars gridcv( X, Y, segm, score = msep, fun = rr, pars = pars, verb = TRUE) gridcvlb( X, Y, segm, score = msep, fun = rr, lb = lb, verb = TRUE) ## EXAMPLE WITH KRR n <- 50 ; p <- 8 X <- matrix(rnorm(n * p), ncol = p) y <- rnorm(n) Y <- cbind(y, 10 * rnorm(n)) K = 3 segm <- segmkf(n = n, K = K, nrep = 1) segm lb <- c(.1, 1) gamma <- 10^(-1:1) pars <- mpars(lb = lb, gamma = gamma) pars gridcv( X, Y, segm, score = msep, fun = krr, pars = pars, verb = TRUE) pars <- mpars(gamma = gamma) gridcvlb( X, Y, segm, score = msep, fun = krr, lb = lb, pars = pars, verb = TRUE) ## EXAMPLE WITH LWPLSR n <- 50 ; p <- 8 X <- matrix(rnorm(n * p), ncol = p) y <- rnorm(n) Y <- cbind(y, 10 * rnorm(n)) K = 3 segm <- segmkf(n = n, K = K, nrep = 1) segm nlvdis <- 5 h <- c(1, Inf) k <- c(10, 20) nlv <- 5 pars <- mpars(nlvdis = nlvdis, diss = "mahal", h = h, k = k) pars res <- gridcvlv( X, Y, segm, score = msep, fun = lwplsr, nlv = 0:nlv, pars = pars, verb = TRUE) res
Functions for tuning predictive models on a validation set.
The functions return "scores" (average error rates) of predictions for a given model and a grid of parameter values, calculated on a validation dataset.
- gridscore
: Can be used for any model.
- gridscorelv
: Specific to models using regularization by latent variables (LVs) (e.g. PLSR). Much faster than gridscore
.
- gridscorelb
: Specific to models using ridge regularization (e.g. RR). Much faster than gridscore
.
gridscore(Xtrain, Ytrain, X, Y, score, fun, pars, verb = FALSE) gridscorelv(Xtrain, Ytrain, X, Y, score, fun, nlv, pars = NULL, verb = FALSE) gridscorelb(Xtrain, Ytrain, X, Y, score, fun, lb, pars = NULL, verb = FALSE)
Xtrain |
Training X-data ( |
Ytrain |
Training Y-data ( |
X |
Validation X-data ( |
Y |
Validation Y-data ( |
score |
A function calculating a prediction score (e.g. |
fun |
A function corresponding to the predictive model. |
nlv |
For |
lb |
For |
pars |
A list of named vectors. Each vector must correspond to an argument of the model function and gives the parameter values to consider for this argument. (see details) |
verb |
Logical. If |
Argument pars (the grid) must be a list of named vectors, each vector corresponding to an argument of the model function and giving the parameter values to consider for this argument. This list can be built with function mpars, which returns all the combinations of the input parameters; see the examples.
For gridscorelv, pars must not contain nlv (nb. of LVs), and for gridscorelb, it must not contain lb (the regularization parameter lambda).
A dataframe with the prediction scores for the grid.
Examples are given:
- with PLSR, using gridscore and gridscorelv (much faster)
- with PLSLDA, using gridscore and gridscorelv (much faster)
- with RR, using gridscore and gridscorelb (much faster)
- with KRR, using gridscore and gridscorelb (much faster)
- with LWPLSR, using gridscorelv
## EXAMPLE WITH PLSR n <- 50 ; p <- 8 Xtrain <- matrix(rnorm(n * p), ncol = p, byrow = TRUE) ytrain <- rnorm(n) Ytrain <- cbind(ytrain, 10 * rnorm(n)) m <- 3 Xtest <- Xtrain[1:m, ] Ytest <- Ytrain[1:m, ] ; ytest <- Ytest[, 1] nlv <- 5 pars <- mpars(nlv = 1:nlv) pars gridscore( Xtrain, Ytrain, Xtest, Ytest, score = msep, fun = plskern, pars = pars, verb = TRUE ) gridscorelv( Xtrain, Ytrain, Xtest, Ytest, score = msep, fun = plskern, nlv = 0:nlv, verb = TRUE ) fm <- plskern(Xtrain, Ytrain, nlv = nlv) pred <- predict(fm, Xtest)$pred msep(pred, Ytest) ## EXAMPLE WITH PLSLDA n <- 50 ; p <- 8 X <- matrix(rnorm(n * p), ncol = p, byrow = TRUE) y <- sample(c(1, 4, 10), size = n, replace = TRUE) Xtrain <- X ; ytrain <- y m <- 5 Xtest <- X[1:m, ] ; ytest <- y[1:m] nlv <- 5 pars <- mpars(nlv = 1:nlv, prior = c("unif", "prop")) pars gridscore( Xtrain, ytrain, Xtest, ytest, score = err, fun = plslda, pars = pars, verb = TRUE ) fm <- plslda(Xtrain, ytrain, nlv = nlv) pred <- predict(fm, Xtest)$pred err(pred, ytest) pars <- mpars(prior = c("unif", "prop")) pars gridscorelv( Xtrain, ytrain, Xtest, ytest, score = err, fun = plslda, nlv = 1:nlv, pars = pars, verb = TRUE ) ## EXAMPLE WITH RR n <- 50 ; p <- 8 Xtrain <- matrix(rnorm(n * p), ncol = p, byrow = TRUE) ytrain <- rnorm(n) Ytrain <- cbind(ytrain, 10 * rnorm(n)) m <- 3 Xtest <- Xtrain[1:m, ] Ytest <- Ytrain[1:m, ] ; ytest <- Ytest[, 1] lb <- c(.1, 1) pars <- mpars(lb = lb) pars gridscore( Xtrain, Ytrain, Xtest, Ytest, score = msep, fun = rr, pars = pars, verb = TRUE ) gridscorelb( Xtrain, Ytrain, Xtest, Ytest, score = msep, fun = rr, lb = lb, verb = TRUE ) ## EXAMPLE WITH KRR n <- 50 ; p <- 8 Xtrain <- matrix(rnorm(n * p), ncol = p, byrow = TRUE) ytrain <- rnorm(n) Ytrain <- cbind(ytrain, 10 * rnorm(n)) m <- 3 Xtest <- Xtrain[1:m, ] Ytest <- Ytrain[1:m, ] ; ytest <- Ytest[, 1] lb <- c(.1, 1) gamma <- 10^(-1:1) pars <- mpars(lb = lb, gamma = gamma) pars gridscore( Xtrain, Ytrain, Xtest, Ytest, score = msep, fun = krr, pars = pars, verb = TRUE ) pars <- mpars(gamma = gamma) gridscorelb( Xtrain, Ytrain, Xtest, Ytest, score = msep, fun = krr, lb = lb, pars = pars, verb = TRUE ) ## EXAMPLE WITH LWPLSR n <- 50 ; p <- 8 Xtrain <- matrix(rnorm(n * p), ncol = p, byrow = TRUE) ytrain <- rnorm(n) Ytrain <- cbind(ytrain, 10 * rnorm(n)) m <- 3 Xtest <- Xtrain[1:m, ] Ytest <- Ytrain[1:m, ] ; ytest <- Ytest[, 1] nlvdis <- 5 h <- c(1, Inf) k <- c(10, 20) nlv <- 5 pars <- mpars(nlvdis = nlvdis, diss = "mahal", h = h, k = k) pars res <- gridscorelv( Xtrain, Ytrain, Xtest, Ytest, score = msep, fun = lwplsr, nlv = 0:nlv, pars = pars, verb = TRUE ) res
Function headm
displays the first part and the dimension of a data set.
headm(X)
headm(X)
X |
A matrix or dataframe. |
first 6 rows and columns of a dataset, number of rows, number of columns, dataset class.
n <- 1000 p <- 200 X <- matrix(rnorm(n * p), nrow = n) headm(X)
n <- 1000 p <- 200 X <- matrix(rnorm(n * p), nrow = n) headm(X)
Resampling of signals by interpolation methods, including linear, spline, and cubic interpolation.
The function uses function interp1
of package signal
available on CRAN.
interpl(X, w, meth = "cubic", ...)
interpl(X, w, meth = "cubic", ...)
X |
X-data ( |
w |
A vector of the values where to interpolate (typically within the range of |
meth |
The method of interpolation. See |
... |
Optional arguments to pass in function |
A matrix of the interpolated signals.
data(cassav) X <- cassav$Xtest headm(X) w <- seq(500, 2400, length = 10) zX <- interpl(X, w, meth = "spline") headm(zX) plotsp(zX)
data(cassav) X <- cassav$Xtest headm(X) w <- seq(500, 2400, length = 10) zX <- interpl(X, w, meth = "spline") headm(zX) plotsp(zX)
KNN weighted discrimination. For each new observation to predict, a number of nearest neighbors is selected and the prediction is calculated as the most frequent class in this neighborhood.
knnda(X, y, nlvdis, diss = c("eucl", "mahal"), h, k) ## S3 method for class 'Knnda' predict(object, X, ...)
knnda(X, y, nlvdis, diss = c("eucl", "mahal"), h, k) ## S3 method for class 'Knnda' predict(object, X, ...)
X |
For the main function: Training X-data ( |
y |
Training class membership ( |
nlvdis |
The number of LVs to consider in the global PLS used for the dimension reduction before calculating the dissimilarities. If |
diss |
The type of dissimilarity used for defining the neighbors. Possible values are "eucl" (default; Euclidean distance), "mahal" (Mahalanobis distance), or "correlation". Correlation dissimilarities are calculated by sqrt(.5 * (1 - rho)). |
h |
A scale scalar defining the shape of the weight function. Lower is |
k |
The number of nearest neighbors to select for each observation to predict. |
object |
For the auxiliary functions: A fitted model, output of a call to the main function. |
... |
For the auxiliary functions: Optional arguments. Not used. |
In function knnda
, the dissimilarities used for computing the neighborhood and the weights can be calculated from the original X-data or after a dimension reduction (argument nlvdis
). In the latter case, global PLS scores are computed from the X-data and the dissimilarities are calculated on these scores. For high-dimensional X-data, the dimension reduction is in general required for using the Mahalanobis distance.
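For intuition, here is a minimal sketch of a weighted class vote within a neighborhood, with hypothetical distances, weights and classes (in practice, rchemo selects the neighbors and computes the weights internally, e.g. via getknn and wdist):

d <- c(0.2, 0.5, 1.1, 1.4)                # distances to the k = 4 nearest neighbors (hypothetical)
w <- exp(-d / 2)                          # decreasing weights (illustrative choice, not wdist itself)
cl <- c("A", "A", "B", "B")               # classes of the neighbors
tapply(w, cl, sum)                        # summed weight per class
names(which.max(tapply(w, cl, sum)))      # predicted class = class with the highest summed weight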
For knnda
:list with input arguments.
For predict.Knnda
:
pred |
prediction calculated for each observation by the most frequent class in |
listnn |
list with the neighbors used for each observation to be predicted |
listd |
list with the distances to the neighbors used for each observation to be predicted |
listw |
list with the weights attributed to the neighbors used for each observation to be predicted |
Venables, W. N. and Ripley, B. D. (2002) Modern Applied Statistics with S. Fourth edition. Springer.
n <- 50 ; p <- 8 Xtrain <- matrix(rnorm(n * p), ncol = p) ytrain <- sample(c(1, 4, 10), size = n, replace = TRUE) m <- 5 Xtest <- Xtrain[1:m, ] ; ytest <- ytrain[1:m] nlvdis <- 5 ; diss <- "mahal" h <- 2 ; k <- 10 fm <- knnda( Xtrain, ytrain, nlvdis = nlvdis, diss = diss, h = h, k = k ) res <- predict(fm, Xtest) names(res) res$pred err(res$pred, ytest)
n <- 50 ; p <- 8 Xtrain <- matrix(rnorm(n * p), ncol = p) ytrain <- sample(c(1, 4, 10), size = n, replace = TRUE) m <- 5 Xtest <- Xtrain[1:m, ] ; ytest <- ytrain[1:m] nlvdis <- 5 ; diss <- "mahal" h <- 2 ; k <- 10 fm <- knnda( Xtrain, ytrain, nlvdis = nlvdis, diss = diss, h = h, k = k ) res <- predict(fm, Xtest) names(res) res$pred err(res$pred, ytest)
KNN weighted regression. For each new observation to predict, a number of nearest neighbors is selected and the prediction is calculated as the average (possibly weighted) of the response over this neighborhood.
knnr(X, Y, nlvdis, diss = c("eucl", "mahal"), h, k) ## S3 method for class 'Knnr' predict(object, X, ...)
knnr(X, Y, nlvdis, diss = c("eucl", "mahal"), h, k) ## S3 method for class 'Knnr' predict(object, X, ...)
X |
For the main function: Training X-data ( |
Y |
Training Y-data ( |
nlvdis |
The number of LVs to consider in the global PLS used for the dimension reduction before calculating the dissimilarities. If |
diss |
The type of dissimilarity used for defining the neighbors. Possible values are "eucl" (default; Euclidean distance), "mahal" (Mahalanobis distance), or "correlation". Correlation dissimilarities are calculated by sqrt(.5 * (1 - rho)). |
h |
A scale scalar defining the shape of the weight function. Lower is |
k |
The number of nearest neighbors to select for each observation to predict. |
object |
— For the auxiliary functions: A fitted model, output of a call to the main function. |
... |
— For the auxiliary functions: Optional arguments. Not used. |
In function knnr
, the dissimilarities used for computing the neighborhood and the weights can be calculated from the original X-data or after a dimension reduction (argument nlvdis
). In the latter case, global PLS scores are computed from the X-data and the dissimilarities are calculated on these scores. For high-dimensional X-data, the dimension reduction is in general required for using the Mahalanobis distance.
For knnr
:list with input arguments.
For predict.Knnr
:
pred |
prediction calculated for each observation as the average (possibly weighted) of the response |
listnn |
list with the neighbors used for each observation to be predicted |
listd |
list with the distances to the neighbors used for each observation to be predicted |
listw |
list with the weights attributed to the neighbors used for each observation to be predicted |
Venables, W. N. and Ripley, B. D. (2002) Modern Applied Statistics with S. Fourth edition. Springer.
n <- 30 ; p <- 10 Xtrain <- matrix(rnorm(n * p), ncol = p) ytrain <- rnorm(n) Ytrain <- cbind(ytrain, 100 * ytrain) m <- 4 Xtest <- matrix(rnorm(m * p), ncol = p) ytest <- rnorm(m) Ytest <- cbind(ytest, 10 * ytest) nlvdis <- 5 ; diss <- "mahal" h <- 2 ; k <- 10 fm <- knnr( Xtrain, Ytrain, nlvdis = nlvdis, diss = diss, h = h, k = k) res <- predict(fm, Xtest) names(res) res$pred msep(res$pred, Ytest)
n <- 30 ; p <- 10 Xtrain <- matrix(rnorm(n * p), ncol = p) ytrain <- rnorm(n) Ytrain <- cbind(ytrain, 100 * ytrain) m <- 4 Xtest <- matrix(rnorm(m * p), ncol = p) ytest <- rnorm(m) Ytest <- cbind(ytest, 10 * ytest) nlvdis <- 5 ; diss <- "mahal" h <- 2 ; k <- 10 fm <- knnr( Xtrain, Ytrain, nlvdis = nlvdis, diss = diss, h = h, k = k) res <- predict(fm, Xtest) names(res) res$pred msep(res$pred, Ytest)
Kernel PCA (Scholkopf et al. 1997, Scholkopf & Smola 2002, Tipping 2001) by SVD factorization of the weighted Gram matrix, where the observation weights (one per row of the X-data) form a diagonal (n, n) matrix.
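For intuition, a minimal textbook sketch of unweighted kernel PCA with an RBF kernel (eigendecomposition of the double-centered Gram matrix); rchemo's kpca additionally handles observation weights and uses its own kernel parameterization:

n <- 6 ; p <- 4
X <- matrix(rnorm(n * p), ncol = p)
gamma <- 1
K <- exp(-gamma * as.matrix(dist(X))^2)       # RBF Gram matrix (textbook parameterization)
J <- diag(n) - matrix(1 / n, n, n)
Kc <- J %*% K %*% J                           # double centering
e <- eigen(Kc, symmetric = TRUE)
T <- e$vectors[, 1:2] %*% diag(sqrt(pmax(e$values[1:2], 0)))   # first 2 kernel PC scores
T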
kpca(X, weights = NULL, nlv, kern = "krbf", ...) ## S3 method for class 'Kpca' transform(object, X, ..., nlv = NULL) ## S3 method for class 'Kpca' summary(object, ...)
kpca(X, weights = NULL, nlv, kern = "krbf", ...) ## S3 method for class 'Kpca' transform(object, X, ..., nlv = NULL) ## S3 method for class 'Kpca' summary(object, ...)
X |
For the main function: Training X-data ( |
weights |
Weights ( |
nlv |
The number of PCs to calculate. |
kern |
Name of the function defining the considered kernel for building the Gram matrix. See |
... |
Optional arguments to pass in the kernel function defined in |
object |
— For the auxiliary functions: A fitted model, output of a call to the main functions. |
For kpca
:
X |
Training X-data ( |
Kt |
Gram matrix |
T |
X-scores matrix. |
P |
X-loadings matrix. |
sv |
vector of singular values |
eig |
vector of eigenvalues. |
weights |
vector of observation weights. |
kern |
kern function. |
dots |
Optional arguments. |
For transform.Kpca
: X-scores matrix for new X-data.
For summary.Kpca
:
explvar |
explained variance matrix. |
Scholkopf, B., Smola, A., Muller, K.-R., 1997. Kernel principal component analysis, in: Gerstner, W., Germond, A., Hasler, M., Nicoud, J.-D. (Eds.), Artificial Neural Networks - ICANN 97, Lecture Notes in Computer Science. Springer, Berlin, Heidelberg, pp. 583-588. https://doi.org/10.1007/BFb0020217
Scholkopf, B., Smola, A.J., 2002. Learning with kernels: support vector machines, regularization, optimization, and beyond, Adaptive computation and machine learning. MIT Press, Cambridge, Mass.
Tipping, M.E., 2001. Sparse kernel principal component analysis. Advances in neural information processing systems, MIT Press. http://papers.nips.cc/paper/1791-sparse-kernel-principal-component-analysis.pdf
## EXAMPLE 1 n <- 5 ; p <- 4 X <- matrix(rnorm(n * p), ncol = p) nlv <- 3 kpca(X, nlv = nlv, kern = "krbf") fm <- kpca(X, nlv = nlv, kern = "krbf", gamma = .6) fm$T transform(fm, X[1:2, ]) transform(fm, X[1:2, ], nlv = 1) summary(fm) ## EXAMPLE 2 n <- 5 ; p <- 4 X <- matrix(rnorm(n * p), ncol = p) nlv <- 3 pcasvd(X, nlv = nlv)$T kpca(X, nlv = nlv, kern = "kpol")$T
## EXAMPLE 1 n <- 5 ; p <- 4 X <- matrix(rnorm(n * p), ncol = p) nlv <- 3 kpca(X, nlv = nlv, kern = "krbf") fm <- kpca(X, nlv = nlv, kern = "krbf", gamma = .6) fm$T transform(fm, X[1:2, ]) transform(fm, X[1:2, ], nlv = 1) summary(fm) ## EXAMPLE 2 n <- 5 ; p <- 4 X <- matrix(rnorm(n * p), ncol = p) nlv <- 3 pcasvd(X, nlv = nlv)$T kpca(X, nlv = nlv, kern = "kpol")$T
NIPALS Kernel PLSR algorithm described in Rosipal & Trejo (2001).
The algorithm is slow for large n.
kplsr(X, Y, weights = NULL, nlv, kern = "krbf", tol = .Machine$double.eps^0.5, maxit = 100, ...) ## S3 method for class 'Kplsr' transform(object, X, ..., nlv = NULL) ## S3 method for class 'Kplsr' coef(object, ..., nlv = NULL) ## S3 method for class 'Kplsr' predict(object, X, ..., nlv = NULL)
kplsr(X, Y, weights = NULL, nlv, kern = "krbf", tol = .Machine$double.eps^0.5, maxit = 100, ...) ## S3 method for class 'Kplsr' transform(object, X, ..., nlv = NULL) ## S3 method for class 'Kplsr' coef(object, ..., nlv = NULL) ## S3 method for class 'Kplsr' predict(object, X, ..., nlv = NULL)
X |
For the main function: Training X-data ( |
Y |
Training Y-data ( |
weights |
Weights ( |
nlv |
The number(s) of LVs to calculate. — For the auxiliary functions: The number(s) of LVs to consider. |
kern |
Name of the function defining the considered kernel for building the Gram matrix. See |
tol |
Tolerance level for stopping the NIPALS iterations. |
maxit |
Maximum number of NIPALS iterations. |
... |
Optional arguments to pass in the kernel function defined in |
object |
For the auxiliary functions: A fitted model, output of a call to the main function. |
For kplsr
:
X |
Training X-data ( |
Kt |
Gram matrix |
T |
X-scores matrix. |
C |
The Y-loading weights matrix. |
U |
intermediate output. |
R |
The PLS projection matrix (p,nlv). |
ymeans |
the centering vector of Y (q,1). |
weights |
vector of observation weights. |
kern |
kern function. |
dots |
Optional arguments. |
For transform.Kplsr
: X-scores matrix for new X-data.
For coef.Kplsr
:
int |
intercept values matrix. |
beta |
beta coefficient matrix. |
For predict.Kplsr
:
pred |
predicted values matrix for new X-data. |
The second example concerns the fitting of the function sinc(x) described in Rosipal & Trejo 2001 p. 105-106.
Rosipal, R., Trejo, L.J., 2001. Kernel Partial Least Squares Regression in Reproducing Kernel Hilbert Space. Journal of Machine Learning Research 2, 97-123.
## EXAMPLE 1 n <- 6 ; p <- 4 Xtrain <- matrix(rnorm(n * p), ncol = p) ytrain <- rnorm(n) Ytrain <- cbind(y1 = ytrain, y2 = 100 * ytrain) m <- 3 Xtest <- Xtrain[1:m, , drop = FALSE] Ytest <- Ytrain[1:m, , drop = FALSE] ; ytest <- Ytest[1:m, 1] nlv <- 2 fm <- kplsr(Xtrain, Ytrain, nlv = nlv, kern = "krbf", gamma = .8) transform(fm, Xtest) transform(fm, Xtest, nlv = 1) coef(fm) coef(fm, nlv = 1) predict(fm, Xtest) predict(fm, Xtest, nlv = 0:nlv)$pred pred <- predict(fm, Xtest)$pred msep(pred, Ytest) nlv <- 2 fm <- kplsr(Xtrain, Ytrain, nlv = nlv, kern = "kpol", degree = 2, coef0 = 10) predict(fm, Xtest, nlv = nlv) ## EXAMPLE 2 x <- seq(-10, 10, by = .2) x[x == 0] <- 1e-5 n <- length(x) zy <- sin(abs(x)) / abs(x) y <- zy + rnorm(n, 0, .2) plot(x, y, type = "p") lines(x, zy, lty = 2) X <- matrix(x, ncol = 1) nlv <- 2 fm <- kplsr(X, y, nlv = nlv) pred <- predict(fm, X)$pred plot(X, y, type = "p") lines(X, zy, lty = 2) lines(X, pred, col = "red")
## EXAMPLE 1 n <- 6 ; p <- 4 Xtrain <- matrix(rnorm(n * p), ncol = p) ytrain <- rnorm(n) Ytrain <- cbind(y1 = ytrain, y2 = 100 * ytrain) m <- 3 Xtest <- Xtrain[1:m, , drop = FALSE] Ytest <- Ytrain[1:m, , drop = FALSE] ; ytest <- Ytest[1:m, 1] nlv <- 2 fm <- kplsr(Xtrain, Ytrain, nlv = nlv, kern = "krbf", gamma = .8) transform(fm, Xtest) transform(fm, Xtest, nlv = 1) coef(fm) coef(fm, nlv = 1) predict(fm, Xtest) predict(fm, Xtest, nlv = 0:nlv)$pred pred <- predict(fm, Xtest)$pred msep(pred, Ytest) nlv <- 2 fm <- kplsr(Xtrain, Ytrain, nlv = nlv, kern = "kpol", degree = 2, coef0 = 10) predict(fm, Xtest, nlv = nlv) ## EXAMPLE 2 x <- seq(-10, 10, by = .2) x[x == 0] <- 1e-5 n <- length(x) zy <- sin(abs(x)) / abs(x) y <- zy + rnorm(n, 0, .2) plot(x, y, type = "p") lines(x, zy, lty = 2) X <- matrix(x, ncol = 1) nlv <- 2 fm <- kplsr(X, y, nlv = nlv) pred <- predict(fm, X)$pred plot(X, y, type = "p") lines(X, zy, lty = 2) lines(X, pred, col = "red")
Discrimination (DA) based on kernel PLSR (KPLSR).
kplsrda(X, y, weights = NULL, nlv, kern = "krbf", ...) ## S3 method for class 'Kplsrda' predict(object, X, ..., nlv = NULL)
kplsrda(X, y, weights = NULL, nlv, kern = "krbf", ...) ## S3 method for class 'Kplsrda' predict(object, X, ..., nlv = NULL)
X |
For main function: Training X-data ( |
y |
Training class membership ( |
weights |
Weights ( |
nlv |
For main function: The number(s) of LVs to calculate. — For auxiliary function: The number(s) of LVs to consider. |
kern |
Name of the function defining the considered kernel for building the Gram matrix. See |
... |
Optional arguments to pass in the kernel function defined in |
object |
For auxiliary function: A fitted model, output of a call to the main functions. |
The training variable y (univariate class membership) is transformed to a dummy table containing nclas columns, where nclas is the number of classes present in y. Each column is a dummy variable (0/1). Then, a kernel PLSR (KPLSR) is run on the X-data and the dummy table, returning predictions of the dummy variables. For a given observation, the final prediction is the class corresponding to the dummy variable for which the prediction is the highest.
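A minimal base-R sketch of this dummy coding and of the final class assignment, with hypothetical continuous predictions of the dummy columns (illustrative only; kplsrda handles these steps internally):

y <- c("a", "b", "a", "c")
lev <- sort(unique(y))
Ydummy <- outer(y, lev, "==") * 1             # (n, nclas) table of 0/1 dummy variables
colnames(Ydummy) <- lev
Ydummy
pred_dummy <- matrix(c(.8, .1, .1, .2, .3, .5), ncol = 3, byrow = TRUE,
    dimnames = list(NULL, lev))               # hypothetical predictions for 2 new observations
lev[apply(pred_dummy, 1, which.max)]          # final prediction = class with the highest value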
For kplsrda
:
fm |
list with the kplsrda model: ( |
lev |
y levels |
ni |
number of observations by level of y |
For predict.Kplsrda
:
pred |
predicted class for each observation |
posterior |
calculated probability of belonging to a class for each observation |
n <- 50 ; p <- 8 Xtrain <- matrix(rnorm(n * p), ncol = p) ytrain <- sample(c(1, 4, 10), size = n, replace = TRUE) m <- 5 Xtest <- Xtrain[1:m, ] ; ytest <- ytrain[1:m] nlv <- 2 fm <- kplsrda(Xtrain, ytrain, nlv = nlv) names(fm) predict(fm, Xtest) pred <- predict(fm, Xtest)$pred err(pred, ytest) predict(fm, Xtest, nlv = 0:nlv)$posterior predict(fm, Xtest, nlv = 0)$posterior predict(fm, Xtest, nlv = 0:nlv)$pred predict(fm, Xtest, nlv = 0)$pred
n <- 50 ; p <- 8 Xtrain <- matrix(rnorm(n * p), ncol = p) ytrain <- sample(c(1, 4, 10), size = n, replace = TRUE) m <- 5 Xtest <- Xtrain[1:m, ] ; ytest <- ytrain[1:m] nlv <- 2 fm <- kplsrda(Xtrain, ytrain, nlv = nlv) names(fm) predict(fm, Xtest) pred <- predict(fm, Xtest)$pred err(pred, ytest) predict(fm, Xtest, nlv = 0:nlv)$posterior predict(fm, Xtest, nlv = 0)$posterior predict(fm, Xtest, nlv = 0:nlv)$pred predict(fm, Xtest, nlv = 0)$pred
Building Gram matrices for different kernels (e.g. Scholkopf & Smola 2002).
- radial basis: exp(-gamma * |x - y|^2)
- polynomial: (gamma * x' * y + coef0)^degree
- sigmoid: tanh(gamma * x' * y + coef0)
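A quick numerical evaluation of these three formulas for two small vectors, with arbitrary values of gamma, degree and coef0:

x <- c(1, 2, 3) ; y <- c(0, 1, 1)
gamma <- .5 ; degree <- 2 ; coef0 <- 1
exp(-gamma * sum((x - y)^2))                  # radial basis
(gamma * sum(x * y) + coef0)^degree           # polynomial
tanh(gamma * sum(x * y) + coef0)              # sigmoid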
krbf(X, Y = NULL, gamma = 1) kpol(X, Y = NULL, degree = 1, gamma = 1, coef0 = 0) ktanh(X, Y = NULL, gamma = 1, coef0 = 0)
krbf(X, Y = NULL, gamma = 1) kpol(X, Y = NULL, degree = 1, gamma = 1, coef0 = 0) ktanh(X, Y = NULL, gamma = 1, coef0 = 0)
X |
Dataset ( |
Y |
Dataset ( |
gamma |
value of the gamma parameter in the kernel calculation. |
degree |
For |
coef0 |
For |
Gram matrix
Scholkopf, B., Smola, A.J., 2002. Learning with kernels: support vector machines, regularization, optimization, and beyond, Adaptive computation and machine learning. MIT Press, Cambridge, Mass.
n <- 5 ; p <- 3 Xtrain <- matrix(rnorm(n * p), ncol = p) Xtest <- Xtrain[1:2, , drop = FALSE] gamma <- .8 krbf(Xtrain, gamma = gamma) krbf(Xtest, Xtrain, gamma = gamma) exp(-.5 * euclsq(Xtest, Xtrain) / gamma^2) kpol(Xtrain, degree = 2, gamma = .5, coef0 = 1)
n <- 5 ; p <- 3 Xtrain <- matrix(rnorm(n * p), ncol = p) Xtest <- Xtrain[1:2, , drop = FALSE] gamma <- .8 krbf(Xtrain, gamma = gamma) krbf(Xtest, Xtrain, gamma = gamma) exp(-.5 * euclsq(Xtest, Xtrain) / gamma^2) kpol(Xtrain, degree = 2, gamma = .5, coef0 = 1)
Kernel ridge regression models (KRR = LS-SVMR) (Suykens et al. 2000, Bennett & Embrechts 2003, Krell 2018).
krr(X, Y, weights = NULL, lb = 1e-2, kern = "krbf", ...) ## S3 method for class 'Krr' coef(object, ..., lb = NULL) ## S3 method for class 'Krr' predict(object, X, ..., lb = NULL)
krr(X, Y, weights = NULL, lb = 1e-2, kern = "krbf", ...) ## S3 method for class 'Krr' coef(object, ..., lb = NULL) ## S3 method for class 'Krr' predict(object, X, ..., lb = NULL)
X |
For main function: Training X-data ( |
Y |
Training Y-data ( |
weights |
Weights ( |
lb |
A value of regularization parameter |
kern |
Name of the function defining the considered kernel for building the Gram matrix. See |
... |
Optional arguments to pass in the kernel function defined in |
object |
— For auxiliary function: A fitted model, output of a call to the main function. |
For krr
:
X |
Training X-data ( |
K |
Gram matrix |
Kt |
Gram matrix |
U |
intermediate output. |
UtDY |
intermediate output. |
sv |
singular values of the matrix (1,n) |
lb |
value of regularization parameter |
ymeans |
the centering vector of Y (q,1) |
weights |
the weights vector of X-variables (p,1) |
kern |
kern function. |
dots |
Optional arguments. |
For coef.Krr
:
int |
matrix (1,nlv) with the intercepts |
alpha |
matrix (n,nlv) with the coefficients |
df |
model complexity (number of degrees of freedom) |
For predict.Krr
:
pred |
A list of matrices ( |
KRR is close to the particular SVMR obtained by setting the epsilon coefficient to zero (no margin excluding observations). The difference is that an L2-norm optimization is done, instead of the L1-norm used in SVM.
The second example concerns the fitting of the function sinc(x) described in Rosipal & Trejo 2001 p. 105-106.
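For intuition, a minimal textbook sketch of dual (kernel) ridge regression with an RBF kernel, assuming a centered response and no observation weights (this is not the rchemo implementation, which handles centering, weights and its own kernel parameterization internally):

n <- 20 ; p <- 3
X <- matrix(rnorm(n * p), ncol = p)
y <- rnorm(n)
gamma <- 1 ; lambda <- .1
K <- exp(-gamma * as.matrix(dist(X))^2)                     # training Gram matrix
alpha <- solve(K + lambda * diag(n), y - mean(y))           # dual coefficients
Xnew <- X[1:2, ]
D2new <- outer(rowSums(Xnew^2), rowSums(X^2), "+") - 2 * Xnew %*% t(X)
Knew <- exp(-gamma * D2new)                                 # Gram matrix between new and training data
mean(y) + Knew %*% alpha                                    # predictions for the 2 new observations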
Bennett, K.P., Embrechts, M.J., 2003. An optimization perspective on kernel partial least squares regression, in: Advances in Learning Theory: Methods, Models and Applications, NATO Science Series III: Computer & Systems Sciences. IOS Press Amsterdam, pp. 227-250.
Cawley, G.C., Talbot, N.L.C., 2002. Reduced Rank Kernel Ridge Regression. Neural Processing Letters 16, 293-302. https://doi.org/10.1023/A:1021798002258
Krell, M.M., 2018. Generalizing, Decoding, and Optimizing Support Vector Machine Classification. arXiv:1801.04929.
Saunders, C., Gammerman, A., Vovk, V., 1998. Ridge Regression Learning Algorithm in Dual Variables, in: In Proceedings of the 15th International Conference on Machine Learning. Morgan Kaufmann, pp. 515-521.
Suykens, J.A.K., Lukas, L., Vandewalle, J., 2000. Sparse approximation using least squares support vector machines. 2000 IEEE International Symposium on Circuits and Systems. Emerging Technologies for the 21st Century. Proceedings (IEEE Cat No.00CH36353). https://doi.org/10.1109/ISCAS.2000.856439
Welling, M., n.d. Kernel ridge regression. Department of Computer Science, University of Toronto, Toronto, Canada. https://www.ics.uci.edu/~welling/classnotes/papers_class/Kernel-Ridge.pdf
## EXAMPLE 1 n <- 6 ; p <- 4 Xtrain <- matrix(rnorm(n * p), ncol = p) ytrain <- rnorm(n) Ytrain <- cbind(y1 = ytrain, y2 = 100 * ytrain) m <- 3 Xtest <- Xtrain[1:m, , drop = FALSE] Ytest <- Ytrain[1:m, , drop = FALSE] ; ytest <- Ytest[1:m, 1] lb <- 2 fm <- krr(Xtrain, Ytrain, lb = lb, kern = "krbf", gamma = .8) coef(fm) coef(fm, lb = .6) predict(fm, Xtest) predict(fm, Xtest, lb = c(0.1, .6)) pred <- predict(fm, Xtest)$pred msep(pred, Ytest) lb <- 2 fm <- krr(Xtrain, Ytrain, lb = lb, kern = "kpol", degree = 2, coef0 = 10) predict(fm, Xtest) ## EXAMPLE 2 x <- seq(-10, 10, by = .2) x[x == 0] <- 1e-5 n <- length(x) zy <- sin(abs(x)) / abs(x) y <- zy + rnorm(n, 0, .2) plot(x, y, type = "p") lines(x, zy, lty = 2) X <- matrix(x, ncol = 1) fm <- krr(X, y, lb = .1, gamma = .5) pred <- predict(fm, X)$pred plot(X, y, type = "p") lines(X, zy, lty = 2) lines(X, pred, col = "red")
## EXAMPLE 1 n <- 6 ; p <- 4 Xtrain <- matrix(rnorm(n * p), ncol = p) ytrain <- rnorm(n) Ytrain <- cbind(y1 = ytrain, y2 = 100 * ytrain) m <- 3 Xtest <- Xtrain[1:m, , drop = FALSE] Ytest <- Ytrain[1:m, , drop = FALSE] ; ytest <- Ytest[1:m, 1] lb <- 2 fm <- krr(Xtrain, Ytrain, lb = lb, kern = "krbf", gamma = .8) coef(fm) coef(fm, lb = .6) predict(fm, Xtest) predict(fm, Xtest, lb = c(0.1, .6)) pred <- predict(fm, Xtest)$pred msep(pred, Ytest) lb <- 2 fm <- krr(Xtrain, Ytrain, lb = lb, kern = "kpol", degree = 2, coef0 = 10) predict(fm, Xtest) ## EXAMPLE 2 x <- seq(-10, 10, by = .2) x[x == 0] <- 1e-5 n <- length(x) zy <- sin(abs(x)) / abs(x) y <- zy + rnorm(n, 0, .2) plot(x, y, type = "p") lines(x, zy, lty = 2) X <- matrix(x, ncol = 1) fm <- krr(X, y, lb = .1, gamma = .5) pred <- predict(fm, X)$pred plot(X, y, type = "p") lines(X, zy, lty = 2) lines(X, pred, col = "red")
Discrimination (DA) based on kernel ridge regression (KRR).
krrda(X, y, weights = NULL, lb = 1e-5, kern = "krbf", ...) ## S3 method for class 'Krrda' predict(object, X, ..., lb = NULL)
krrda(X, y, weights = NULL, lb = 1e-5, kern = "krbf", ...) ## S3 method for class 'Krrda' predict(object, X, ..., lb = NULL)
X |
For main function: Training X-data ( |
y |
Training class membership ( |
weights |
Weights ( |
lb |
A value of regularization parameter |
kern |
Name of the function defining the considered kernel for building the Gram matrix. See |
... |
Optional arguments to pass in the kernel function defined in |
object |
— For auxiliary function: A fitted model, output of a call to the main functions. |
The training variable y (univariate class membership) is transformed to a dummy table containing nclas columns, where nclas is the number of classes present in y. Each column is a dummy variable (0/1). Then, a kernel ridge regression (KRR) is run on the X-data and the dummy table, returning predictions of the dummy variables. For a given observation, the final prediction is the class corresponding to the dummy variable for which the prediction is the highest.
For krrda
:
fm |
List with the outputs of the RR (( |
lev |
y levels |
ni |
number of observations by level of y |
For predict.Krrda
:
pred |
matrix or list of matrices (if lb is a vector), with predicted class for each observation |
posterior |
matrix or list of matrices (if lb is a vector), calculated probability of belonging to a class for each observation |
n <- 50 ; p <- 8 Xtrain <- matrix(rnorm(n * p), ncol = p) ytrain <- sample(c(1, 4, 10), size = n, replace = TRUE) m <- 5 Xtest <- Xtrain[1:m, ] ; ytest <- ytrain[1:m] lb <- 1 fm <- krrda(Xtrain, ytrain, lb = lb) names(fm) predict(fm, Xtest) pred <- predict(fm, Xtest)$pred err(pred, ytest) predict(fm, Xtest, lb = 0:2) predict(fm, Xtest, lb = 0)
n <- 50 ; p <- 8 Xtrain <- matrix(rnorm(n * p), ncol = p) ytrain <- sample(c(1, 4, 10), size = n, replace = TRUE) m <- 5 Xtest <- Xtrain[1:m, ] ; ytest <- ytrain[1:m] lb <- 1 fm <- krrda(Xtrain, ytrain, lb = lb) names(fm) predict(fm, Xtest) pred <- predict(fm, Xtest)$pred err(pred, ytest) predict(fm, Xtest, lb = 0:2) predict(fm, Xtest, lb = 0)
Probabilistic (parametric) linear and quadratic discriminant analysis.
lda(X, y, prior = c("unif", "prop")) qda(X, y, prior = c("unif", "prop")) ## S3 method for class 'Lda' predict(object, X, ...) ## S3 method for class 'Qda' predict(object, X, ...)
lda(X, y, prior = c("unif", "prop")) qda(X, y, prior = c("unif", "prop")) ## S3 method for class 'Lda' predict(object, X, ...) ## S3 method for class 'Qda' predict(object, X, ...)
X |
For the main functions: Training X-data ( |
y |
Training class membership ( |
prior |
The prior probabilities of the classes. Possible values are "unif" (default; probabilities are set equal for all the classes) or "prop" (probabilities are set equal to the observed proportions of the classes in |
object |
For the auxiliary functions: A fitted model, output of a call to the main functions. |
... |
For the auxiliary functions: Optional arguments. Not used. |
For each observation to predict, the posterior probability of belonging to a given class is estimated using Bayes' formula, assuming priors (proportional or uniform) and a multivariate Normal distribution for the X-variables. The prediction is the class with the highest posterior probability.
LDA assumes homogeneous covariance matrices for the classes while QDA assumes different covariance matrices. The functions use
dmnorm
for estimating the multivariate Normal densities.
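For intuition, a minimal sketch of the Bayes rule used for the posteriors, with hypothetical Normal densities for one observation (in lda and qda, the densities are computed with dmnorm):

ds <- c(A = .12, B = .03, C = .20)            # hypothetical Normal densities for one observation
prior <- c(A = 1, B = 1, C = 1) / 3           # uniform priors ("unif")
post <- prior * ds / sum(prior * ds)          # posterior probabilities (Bayes' formula)
post
names(which.max(post))                        # predicted class = highest posterior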
For lda
and qda
:
ct |
centers (column-wise means) for classes of observations. |
W |
unbiased within covariance matrices for classes of observations. |
wprior |
prior probabilities of the classes. |
lev |
y levels. |
ni |
number of observations by level of y. |
For predict.Lda
and predict.Qda
:
pred |
predicted classes of observations. |
ds |
Prediction of the normal probability density. |
posterior |
posterior probabilities of the classes. |
Saporta, G., 2011. Probabilités analyse des données et statistique. Editions Technip, Paris, France.
Venables, W. N. and Ripley, B. D. (2002) Modern Applied Statistics with S. Fourth edition. Springer.
## EXAMPLE 1 data(iris) X <- iris[, 1:4] y <- iris[, 5] N <- nrow(X) nTest <- round(.25 * N) nTraining <- N - nTest s <- sample(1:N, nTest) Xtrain <- X[-s, ] ytrain <- y[-s] Xtest <- X[s, ] ytest <- y[s] prior <- "unif" fm <- lda(Xtrain, ytrain, prior = prior) res <- predict(fm, Xtest) names(res) headm(res$pred) headm(res$ds) headm(res$posterior) err(res$pred, ytest) ## EXAMPLE 2 data(iris) X <- iris[, 1:4] y <- iris[, 5] N <- nrow(X) nTest <- round(.25 * N) nTraining <- N - nTest s <- sample(1:N, nTest) Xtrain <- X[-s, ] ytrain <- y[-s] Xtest <- X[s, ] ytest <- y[s] prior <- "prop" fm <- lda(Xtrain, ytrain, prior = prior) res <- predict(fm, Xtest) names(res) headm(res$pred) headm(res$ds) headm(res$posterior) err(res$pred, ytest)
## EXAMPLE 1 data(iris) X <- iris[, 1:4] y <- iris[, 5] N <- nrow(X) nTest <- round(.25 * N) nTraining <- N - nTest s <- sample(1:N, nTest) Xtrain <- X[-s, ] ytrain <- y[-s] Xtest <- X[s, ] ytest <- y[s] prior <- "unif" fm <- lda(Xtrain, ytrain, prior = prior) res <- predict(fm, Xtest) names(res) headm(res$pred) headm(res$ds) headm(res$posterior) err(res$pred, ytest) ## EXAMPLE 2 data(iris) X <- iris[, 1:4] y <- iris[, 5] N <- nrow(X) nTest <- round(.25 * N) nTraining <- N - nTest s <- sample(1:N, nTest) Xtrain <- X[-s, ] ytrain <- y[-s] Xtest <- X[s, ] ytest <- y[s] prior <- "prop" fm <- lda(Xtrain, ytrain, prior = prior) res <- predict(fm, Xtest) names(res) headm(res$pred) headm(res$ds) headm(res$posterior) err(res$pred, ytest)
Linear regression models (uses function lm
).
lmr(X, Y, weights = NULL) ## S3 method for class 'Lmr' coef(object, ...) ## S3 method for class 'Lmr' predict(object, X, ...)
lmr(X, Y, weights = NULL) ## S3 method for class 'Lmr' coef(object, ...) ## S3 method for class 'Lmr' predict(object, X, ...)
X |
For the main function: Training X-data ( |
Y |
Training Y-data ( |
weights |
Weights ( |
object |
For the auxiliary functions:A fitted model, output of a call to the main functions. |
... |
For the auxiliary functions: Optional arguments. Not used. |
For lmr
:
coefficients |
coefficient matrix. |
residuals |
residual matrix. |
effects |
component relating to the linear fit, for use by extractor functions. |
rank |
the numeric rank of the fitted linear model. |
fitted.values |
the fitted mean values. |
assign |
component relating to the linear fit, for use by extractor functions. |
qr |
component relating to the linear fit, for use by extractor functions. |
df.residual |
the residual degrees of freedom. |
xlevels |
(only where relevant) a record of the levels of the factors used in fitting. |
call |
the matched call. |
terms |
the terms object used. |
model |
the model frame used. |
For coef.Lmr
:
int |
matrix (1,nlv) with the intercepts |
B |
matrix (n,nlv) with the coefficients |
For predict.Lmr
:
pred |
A list of matrices ( |
n <- 8 ; p <- 3 X <- matrix(rnorm(n * p, mean = 10), ncol = p, byrow = TRUE) y <- rnorm(n) Y <- cbind(y, rnorm(n)) Xtrain <- X[1:6, ] ; Ytrain <- Y[1:6, ] Xtest <- X[7:8, ] ; Ytest <- Y[7:8, ] fm <- lmr(Xtrain, Ytrain) coef(fm) predict(fm, Xtest) pred <- predict(fm, Xtest)$pred msep(pred, Ytest)
n <- 8 ; p <- 3 X <- matrix(rnorm(n * p, mean = 10), ncol = p, byrow = TRUE) y <- rnorm(n) Y <- cbind(y, rnorm(n)) Xtrain <- X[1:6, ] ; Ytrain <- Y[1:6, ] Xtest <- X[7:8, ] ; Ytest <- Y[7:8, ] fm <- lmr(Xtrain, Ytrain) coef(fm) predict(fm, Xtest) pred <- predict(fm, Xtest)$pred msep(pred, Ytest)
Discrimination (DA) based on linear regression (LMR).
lmrda(X, y, weights = NULL) ## S3 method for class 'Lmrda' predict(object, X, ...)
lmrda(X, y, weights = NULL) ## S3 method for class 'Lmrda' predict(object, X, ...)
X |
For the main function: Training X-data ( |
y |
Training class membership ( |
weights |
Weights ( |
object |
For the auxiliary function: A fitted model, output of a call to the main functions. |
... |
For the auxiliary function: Optional arguments. Not used. |
The training variable y (univariate class membership) is transformed to a dummy table containing nclas columns, where nclas is the number of classes present in y. Each column is a dummy variable (0/1). Then, a linear regression model (LMR) is run on the X-data and the dummy table, returning predictions of the dummy variables. For a given observation, the final prediction is the class corresponding to the dummy variable for which the prediction is the highest.
For lmrda
:
fm |
List with the outputs(( |
lev |
y levels. |
ni |
number of observations by level of y. |
For predict.Lmrda
:
pred |
predicted classes of observations. |
posterior |
posterior probability of belonging to a class for each observation. |
n <- 50 ; p <- 8 Xtrain <- matrix(rnorm(n * p), ncol = p) ytrain <- sample(c(1, 4, 10), size = n, replace = TRUE) m <- 5 Xtest <- Xtrain[1:m, ] ; ytest <- ytrain[1:m] fm <- lmrda(Xtrain, ytrain) names(fm) predict(fm, Xtest) coef(fm$fm) pred <- predict(fm, Xtest)$pred err(pred, ytest)
n <- 50 ; p <- 8 Xtrain <- matrix(rnorm(n * p), ncol = p) ytrain <- sample(c(1, 4, 10), size = n, replace = TRUE) m <- 5 Xtest <- Xtrain[1:m, ] ; ytest <- ytrain[1:m] fm <- lmrda(Xtrain, ytrain) names(fm) predict(fm, Xtest) coef(fm$fm) pred <- predict(fm, Xtest)$pred err(pred, ytest)
locw
and locwlv
are generic working functions returning predictions of KNN locally weighted (LW) models. One specific (= local) model is fitted for each observation to predict, and a prediction is returned. See the wrapper lwplsr
(KNN-LWPLSR) for an example of use.
In KNN-LW models, the prediction is built from two sequential steps, thereafter referred to as weighting "1" and weighting "2", respectively. For each new observation to predict, the two steps are as follows:
- Weighting "1". The k nearest neighbors (in the training data set) are selected and the prediction model is fitted (in the next step) only on this neighborhood. It is equivalent to giving a weight = 1 to the neighbors and a weight = 0 to the other training observations, which corresponds to a binary weighting.
- Weighting "2". Each of the k nearest neighbors possibly receives a weight (different from the usual weight of 1) before fitting the model. The weight depends on the dissimilarity (preliminarily calculated) between the observation and the neighbor. This corresponds to a within-neighborhood weighting.
The prediction model used in step "2" has to be defined in a function specified in argument fun. If there are m new observations to predict, a list of m vectors defining the m neighborhoods has to be provided (argument listnn). Each of these vectors contains the indexes of the nearest neighbors in the training set. The m vectors are not necessarily of the same length, i.e. the neighborhood size can vary between observations to predict. If there is a weighting in step "2", a list of m vectors of weights has to be provided (argument listw). Then locw fits the model successively for each of the m neighborhoods, and returns the corresponding m predictions.
Function locwlv
is dedicated to prediction models based on latent variable (LV) calculations, such as PLSR. It is much faster and is therefore recommended.
locw(Xtrain, Ytrain, X, listnn, listw = NULL, fun, verb = FALSE, ...) locwlv(Xtrain, Ytrain, X, listnn, listw = NULL, fun, nlv, verb = FALSE, ...)
locw(Xtrain, Ytrain, X, listnn, listw = NULL, fun, verb = FALSE, ...) locwlv(Xtrain, Ytrain, X, listnn, listw = NULL, fun, nlv, verb = FALSE, ...)
Xtrain |
Training X-data ( |
Ytrain |
Training Y-data ( |
X |
New X-data ( |
listnn |
A list of |
listw |
A list of |
fun |
A function corresponding to the prediction model to fit on the |
nlv |
For |
verb |
Logical. If |
... |
Optional arguments to pass in function |
pred |
matrix or list of matrices (if |
Lesnoff M, Metz M, Roger J-M. Comparison of locally weighted PLS strategies for regression and discrimination on agronomic NIR data. Journal of Chemometrics. 2020;n/a(n/a):e3209. doi:10.1002/cem.3209.
n <- 50 ; p <- 30 Xtrain <- matrix(rnorm(n * p), ncol = p, byrow = TRUE) ytrain <- rnorm(n) Ytrain <- cbind(ytrain, 100 * ytrain) m <- 4 Xtest <- matrix(rnorm(m * p), ncol = p, byrow = TRUE) ytest <- rnorm(m) Ytest <- cbind(ytest, 10 * ytest) k <- 5 z <- getknn(Xtrain, Xtest, k = k) listnn <- z$listnn listd <- z$listd listnn listd listw <- lapply(listd, wdist, h = 2) listw nlv <- 2 locw(Xtrain, Ytrain, Xtest, listnn = listnn, fun = plskern, nlv = nlv) locw(Xtrain, Ytrain, Xtest, listnn = listnn, listw = listw, fun = plskern, nlv = nlv) locwlv(Xtrain, Ytrain, Xtest, listnn = listnn, listw = listw, fun = plskern, nlv = nlv) locwlv(Xtrain, Ytrain, Xtest, listnn = listnn, listw = listw, fun = plskern, nlv = 0:nlv)
n <- 50 ; p <- 30 Xtrain <- matrix(rnorm(n * p), ncol = p, byrow = TRUE) ytrain <- rnorm(n) Ytrain <- cbind(ytrain, 100 * ytrain) m <- 4 Xtest <- matrix(rnorm(m * p), ncol = p, byrow = TRUE) ytest <- rnorm(m) Ytest <- cbind(ytest, 10 * ytest) k <- 5 z <- getknn(Xtrain, Xtest, k = k) listnn <- z$listnn listd <- z$listd listnn listd listw <- lapply(listd, wdist, h = 2) listw nlv <- 2 locw(Xtrain, Ytrain, Xtest, listnn = listnn, fun = plskern, nlv = nlv) locw(Xtrain, Ytrain, Xtest, listnn = listnn, listw = listw, fun = plskern, nlv = nlv) locwlv(Xtrain, Ytrain, Xtest, listnn = listnn, listw = listw, fun = plskern, nlv = nlv) locwlv(Xtrain, Ytrain, Xtest, listnn = listnn, listw = listw, fun = plskern, nlv = 0:nlv)
Function lwplsr
fits KNN-LWPLSR models described in Lesnoff et al. (2020). The function uses functions getknn
, locw
and PLSR functions. See the code for details. Many variants of such pipelines can be build using locw
.
lwplsr(X, Y, nlvdis, diss = c("eucl", "mahal"), h, k, nlv, cri = 4, verb = FALSE) ## S3 method for class 'Lwplsr' predict(object, X, ..., nlv = NULL)
lwplsr(X, Y, nlvdis, diss = c("eucl", "mahal"), h, k, nlv, cri = 4, verb = FALSE) ## S3 method for class 'Lwplsr' predict(object, X, ..., nlv = NULL)
X |
— For the main function: Training X-data ( |
Y |
Training Y-data ( |
nlvdis |
The number of LVs to consider in the global PLS used for the dimension reduction before calculating the dissimilarities (see details). If |
diss |
The type of dissimilarity used for defining the neighbors. Possible values are "eucl" (default; Euclidean distance), "mahal" (Mahalanobis distance), or "correlation". Correlation dissimilarities are calculated by sqrt(.5 * (1 - rho)). |
h |
A scale scalar defining the shape of the weight function. Lower is |
k |
The number of nearest neighbors to select for each observation to predict. |
nlv |
The number(s) of LVs to calculate in the local PLSR models. |
cri |
Argument |
verb |
Logical. If |
object |
— For the auxiliary function: A fitted model, output of a call to the main function. |
... |
— For the auxiliary function: Optional arguments. |
- LWPLSR: This is a particular case of "weighted PLSR" (WPLSR) (e.g. Schaal et al. 2002). In WPLSR, a priori weights, different from the usual 1/n (standard PLSR), are given to the n
training observations. These weights are used for calculating (i) the PLS scores and loadings and (ii) the regression model of the response(s) over the scores (by weighted least squares). LWPLSR is a particular case of WPLSR. "L" comes from "localized": the weights are defined from dissimilarities (e.g. distances) between the new observation to predict and the training observations. By definition of LWPLSR, the weights, and therefore the fitted WPLSR model, change for each new observation to predict.
- KNN-LWPLSR: Basic versions of LWPLSR (e.g. Sicard & Sabatier 2006, Kim et al. 2011) use, for each observation to predict, all the training observations. This can be very time consuming, in particular for large n. A faster and often more efficient strategy is to preliminarily select, in the training set, a number of k nearest neighbors to the observation to predict (this is referred to as weighting "1" in function locw) and then to apply LWPLSR only to this pre-selected neighborhood (this is referred to as weighting "2" in locw). This strategy corresponds to KNN-LWPLSR.
In function lwplsr
, the dissimilarities used for computing the weights can be calculated from the original X-data or after a dimension reduction (argument nlvdis
). In the latter case, global PLS scores are computed from the X-data and the dissimilarities are calculated on these scores. For high-dimensional X-data, the dimension reduction is in general required for using the Mahalanobis distance.
For lwplsr
: object of class Lwplsr
For predict.Lwplsr
:
pred |
prediction calculated for each observation |
listnn |
list with the neighbors used for each observation to be predicted |
listd |
list with the distances to the neighbors used for each observation to be predicted |
listw |
list with the weights attributed to the neighbors used for each observation to be predicted |
Kim, S., Kano, M., Nakagawa, H., Hasebe, S., 2011. Estimation of active pharmaceutical ingredients content using locally weighted partial least squares and statistical wavelength selection. Int. J. Pharm., 421, 269-274.
Lesnoff, M., Metz, M., Roger, J.-M., 2020. Comparison of locally weighted PLS strategies for regression and discrimination on agronomic NIR data. Journal of Chemometrics, e3209. https://doi.org/10.1002/cem.3209
Schaal, S., Atkeson, C., Vijayakumar, S., 2002. Scalable techniques from nonparametric statistics for real time robot learning. Applied Intell., 17, 49-60.
Sicard, E. Sabatier, R., 2006. Theoretical framework for local PLS1 regression and application to a rainfall data set. Comput. Stat. Data Anal., 51, 1393-1410.
n <- 30 ; p <- 10 Xtrain <- matrix(rnorm(n * p), ncol = p) ytrain <- rnorm(n) Ytrain <- cbind(ytrain, 100 * ytrain) m <- 4 Xtest <- matrix(rnorm(m * p), ncol = p) ytest <- rnorm(m) Ytest <- cbind(ytest, 10 * ytest) nlvdis <- 5 ; diss <- "mahal" h <- 2 ; k <- 10 nlv <- 2 fm <- lwplsr( Xtrain, Ytrain, nlvdis = nlvdis, diss = diss, h = h, k = k, nlv = nlv) res <- predict(fm, Xtest) names(res) res$pred msep(res$pred, Ytest) res <- predict(fm, Xtest, nlv = 0:2) res$pred
n <- 30 ; p <- 10 Xtrain <- matrix(rnorm(n * p), ncol = p) ytrain <- rnorm(n) Ytrain <- cbind(ytrain, 100 * ytrain) m <- 4 Xtest <- matrix(rnorm(m * p), ncol = p) ytest <- rnorm(m) Ytest <- cbind(ytest, 10 * ytest) nlvdis <- 5 ; diss <- "mahal" h <- 2 ; k <- 10 nlv <- 2 fm <- lwplsr( Xtrain, Ytrain, nlvdis = nlvdis, diss = diss, h = h, k = k, nlv = nlv) res <- predict(fm, Xtest) names(res) res$pred msep(res$pred, Ytest) res <- predict(fm, Xtest, nlv = 0:2) res$pred
Ensemblist method where the predictions are calculated by averaging the predictions of KNN-LWPLSR models (lwplsr
) built with different numbers of latent variables (LVs).
For instance, if argument nlv
is set to nlv = "5:10"
, the prediction for a new observation is the simple average of the predictions returned by the models with 5 LVs, 6 LVs, ..., 10 LVs, respectively.
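A minimal sketch of this aggregation for one new observation, with hypothetical predictions returned by the models with 5 to 10 LVs:

pred_by_nlv <- c(2.1, 2.3, 2.2, 2.4, 2.5, 2.3)   # hypothetical predictions (5 to 10 LVs)
mean(pred_by_nlv)                                # aggregated prediction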
lwplsr_agg( X, Y, nlvdis, diss = c("eucl", "mahal"), h, k, nlv, cri = 4, verb = FALSE ) ## S3 method for class 'Lwplsr_agg' predict(object, X, ...)
lwplsr_agg( X, Y, nlvdis, diss = c("eucl", "mahal"), h, k, nlv, cri = 4, verb = FALSE ) ## S3 method for class 'Lwplsr_agg' predict(object, X, ...)
X |
For the main function: Training X-data ( |
Y |
Training Y-data ( |
nlvdis |
The number of LVs to consider in the global PLS used for the dimension reduction before calculating the dissimilarities. If |
diss |
The type of dissimilarity used for defining the neighbors. Possible values are "eucl" (default; Euclidean distance), "mahal" (Mahalanobis distance), or "correlation". Correlation dissimilarities are calculated by sqrt(.5 * (1 - rho)). |
h |
A scale scalar defining the shape of the weight function. Lower is |
k |
The number of nearest neighbors to select for each observation to predict. |
nlv |
A character string such as "5:20" defining the range of the numbers of LVs to consider (here: the models with 5, 6, ..., 20 LVs are averaged). Syntax such as "10" is also allowed (here: corresponds to the single model with 10 LVs). |
cri |
Argument |
verb |
Logical. If |
object |
For the auxiliary function: A fitted model, output of a call to the main function. |
... |
For the auxiliary function: Optional arguments. Not used. |
For lwplsr_agg
: object of class Lwplsr_agg
For predict.Lwplsr_agg
:
pred |
prediction calculated for each observation, which is the average of the predictions returned by the models with the different numbers of LVs |
listnn |
list with the neighbors used for each observation to be predicted |
listd |
list with the distances to the neighbors used for each observation to be predicted |
listw |
list with the weights attributed to the neighbors used for each observation to be predicted |
In the examples, gridscore and gridcv have been used, since it does not make sense to use gridscorelv and gridcvlv here.
## EXAMPLE 1 n <- 30 ; p <- 10 Xtrain <- matrix(rnorm(n * p), ncol = p) ytrain <- rnorm(n) Ytrain <- cbind(ytrain, 100 * ytrain) m <- 4 Xtest <- matrix(rnorm(m * p), ncol = p) ytest <- rnorm(m) Ytest <- cbind(ytest, 10 * ytest) nlvdis <- 5 ; diss <- "mahal" h <- 2 ; k <- 10 nlv <- "2:6" fm <- lwplsr_agg( Xtrain, Ytrain, nlvdis = nlvdis, diss = diss, h = h, k = k, nlv = nlv) names(fm) res <- predict(fm, Xtest) names(res) res$pred msep(res$pred, Ytest) ## EXAMPLE 2 n <- 30 ; p <- 10 Xtrain <- matrix(rnorm(n * p), ncol = p) ytrain <- rnorm(n) Ytrain <- cbind(ytrain, 100 * ytrain) m <- 4 Xtest <- matrix(rnorm(m * p), ncol = p) ytest <- rnorm(m) Ytest <- cbind(ytest, 10 * ytest) nlvdis <- 5 ; diss <- "mahal" h <- c(2, Inf) k <- c(10, 20) nlv <- c("1:3", "2:5") pars <- mpars(nlvdis = nlvdis, diss = diss, h = h, k = k, nlv = nlv) pars res <- gridscore( Xtrain, Ytrain, Xtest, Ytest, score = msep, fun = lwplsr_agg, pars = pars) res ## EXAMPLE 3 n <- 30 ; p <- 10 Xtrain <- matrix(rnorm(n * p), ncol = p) ytrain <- rnorm(n) Ytrain <- cbind(ytrain, 100 * ytrain) m <- 4 Xtest <- matrix(rnorm(m * p), ncol = p) ytest <- rnorm(m) Ytest <- cbind(ytest, 10 * ytest) K = 3 segm <- segmkf(n = n, K = K, nrep = 1) segm res <- gridcv( Xtrain, Ytrain, segm, score = msep, fun = lwplsr_agg, pars = pars, verb = TRUE) res
## EXAMPLE 1 n <- 30 ; p <- 10 Xtrain <- matrix(rnorm(n * p), ncol = p) ytrain <- rnorm(n) Ytrain <- cbind(ytrain, 100 * ytrain) m <- 4 Xtest <- matrix(rnorm(m * p), ncol = p) ytest <- rnorm(m) Ytest <- cbind(ytest, 10 * ytest) nlvdis <- 5 ; diss <- "mahal" h <- 2 ; k <- 10 nlv <- "2:6" fm <- lwplsr_agg( Xtrain, Ytrain, nlvdis = nlvdis, diss = diss, h = h, k = k, nlv = nlv) names(fm) res <- predict(fm, Xtest) names(res) res$pred msep(res$pred, Ytest) ## EXAMPLE 2 n <- 30 ; p <- 10 Xtrain <- matrix(rnorm(n * p), ncol = p) ytrain <- rnorm(n) Ytrain <- cbind(ytrain, 100 * ytrain) m <- 4 Xtest <- matrix(rnorm(m * p), ncol = p) ytest <- rnorm(m) Ytest <- cbind(ytest, 10 * ytest) nlvdis <- 5 ; diss <- "mahal" h <- c(2, Inf) k <- c(10, 20) nlv <- c("1:3", "2:5") pars <- mpars(nlvdis = nlvdis, diss = diss, h = h, k = k, nlv = nlv) pars res <- gridscore( Xtrain, Ytrain, Xtest, Ytest, score = msep, fun = lwplsr_agg, pars = pars) res ## EXAMPLE 3 n <- 30 ; p <- 10 Xtrain <- matrix(rnorm(n * p), ncol = p) ytrain <- rnorm(n) Ytrain <- cbind(ytrain, 100 * ytrain) m <- 4 Xtest <- matrix(rnorm(m * p), ncol = p) ytest <- rnorm(m) Ytest <- cbind(ytest, 10 * ytest) K = 3 segm <- segmkf(n = n, K = K, nrep = 1) segm res <- gridcv( Xtrain, Ytrain, segm, score = msep, fun = lwplsr_agg, pars = pars, verb = TRUE) res
- lwplsrda
: KNN-LWPLSRDA models. This is the same methodology as for lwplsr
except that PLSR is replaced by PLSRDA (plsrda
). See the help page of lwplsr
for details.
- lwplslda
and lwplsqda
: Same as above, but PLSRDA is replaced by either PLSLDA (plslda
) or PLSQDA ((plsqda
), respecively.
lwplsrda( X, y, nlvdis, diss = c("eucl", "mahal"), h, k, nlv, cri = 4, verb = FALSE ) lwplslda( X, y, nlvdis, diss = c("eucl", "mahal"), h, k, nlv, prior = c("unif", "prop"), cri = 4, verb = FALSE ) lwplsqda( X, y, nlvdis, diss = c("eucl", "mahal"), h, k, nlv, prior = c("unif", "prop"), cri = 4, verb = FALSE ) ## S3 method for class 'Lwplsrda' predict(object, X, ..., nlv = NULL) ## S3 method for class 'Lwplsprobda' predict(object, X, ..., nlv = NULL)
lwplsrda( X, y, nlvdis, diss = c("eucl", "mahal"), h, k, nlv, cri = 4, verb = FALSE ) lwplslda( X, y, nlvdis, diss = c("eucl", "mahal"), h, k, nlv, prior = c("unif", "prop"), cri = 4, verb = FALSE ) lwplsqda( X, y, nlvdis, diss = c("eucl", "mahal"), h, k, nlv, prior = c("unif", "prop"), cri = 4, verb = FALSE ) ## S3 method for class 'Lwplsrda' predict(object, X, ..., nlv = NULL) ## S3 method for class 'Lwplsprobda' predict(object, X, ..., nlv = NULL)
X |
For the main functions: Training X-data ( |
y |
Training class membership ( |
nlvdis |
The number of LVs to consider in the global PLS used for the dimension reduction before calculating the dissimilarities. If |
diss |
The type of dissimilarity used for defining the neighbors. Possible values are "eucl" (default; Euclidean distance), "mahal" (Mahalanobis distance), or "correlation". Correlation dissimilarities are calculated by sqrt(.5 * (1 - rho)). |
h |
A scale scalar defining the shape of the weight function. Lower is |
k |
The number of nearest neighbors to select for each observation to predict. |
nlv |
The number(s) of LVs to calculate in the local PLSDA models. |
prior |
The prior probabilities of the classes. Possible values are "unif" (default; probabilities are set equal for all the classes) or "prop" (probabilities are set equal to the observed proportions of the classes in |
cri |
Argument |
verb |
Logical. If |
object |
For the auxiliary functions: A fitted model, output of a call to the main function. |
... |
For the auxiliary functions: Optional arguments. Not used. |
For lwplsrda
, lwplslda
, lwplsqda
: object of class Lwplsrda
or Lwplsprobda
,
For predict.Lwplsrda
, predict.Lwplsprobda
:
pred |
class predicted for each observation |
listnn |
list with the neighbors used for each observation to be predicted |
listd |
list with the distances to the neighbors used for each observation to be predicted |
listw |
list with the weights attributed to the neighbors used for each observation to be predicted |
n <- 50 ; p <- 7 Xtrain <- matrix(rnorm(n * p), ncol = p) ytrain <- sample(c(1, 4, 10), size = n, replace = TRUE) m <- 4 Xtest <- matrix(rnorm(m * p), ncol = p) ytest <- sample(c(1, 4, 10), size = m, replace = TRUE) nlvdis <- 5 ; diss <- "mahal" h <- 2 ; k <- 10 nlv <- 2 fm <- lwplsrda( Xtrain, ytrain, nlvdis = nlvdis, diss = diss, h = h, k = k, nlv = nlv ) res <- predict(fm, Xtest) res$pred res$listnn err(res$pred, ytest) res <- predict(fm, Xtest, nlv = 0:2) res$pred
n <- 50 ; p <- 7 Xtrain <- matrix(rnorm(n * p), ncol = p) ytrain <- sample(c(1, 4, 10), size = n, replace = TRUE) m <- 4 Xtest <- matrix(rnorm(m * p), ncol = p) ytest <- sample(c(1, 4, 10), size = m, replace = TRUE) nlvdis <- 5 ; diss <- "mahal" h <- 2 ; k <- 10 nlv <- 2 fm <- lwplsrda( Xtrain, ytrain, nlvdis = nlvdis, diss = diss, h = h, k = k, nlv = nlv ) res <- predict(fm, Xtest) res$pred res$listnn err(res$pred, ytest) res <- predict(fm, Xtest, nlv = 0:2) res$pred
Ensemblist method where the predictions are calculated by "averaging" the predictions of KNN-LWPLSDA models built with different numbers of latent variables (LVs).
For instance, if argument nlv
is set to nlv = "5:10"
, the prediction for a new observation is the most frequent level (vote) over the predictions returned by the models with 5 LVs, 6 LVs, ..., 10 LVs, respectively (see the short sketch after the list below).
- lwplsrda_agg
: use plsrda
.
- lwplslda_agg
: use plslda
.
- lwplsqda_agg
: use plsqda
.
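A minimal sketch of the vote aggregation for one new observation, with hypothetical class predictions returned by the models with 5 to 10 LVs:

pred_by_nlv <- c("A", "B", "A", "A", "C", "A")   # hypothetical class predictions (5 to 10 LVs)
names(which.max(table(pred_by_nlv)))             # most frequent class = aggregated prediction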
lwplsrda_agg( X, y, nlvdis, diss = c("eucl", "mahal"), h, k, nlv, cri = 4, verb = FALSE ) lwplslda_agg( X, y, nlvdis, diss = c("eucl", "mahal"), h, k, nlv, prior = c("unif", "prop"), cri = 4, verb = FALSE ) lwplsqda_agg( X, y, nlvdis, diss = c("eucl", "mahal"), h, k, nlv, prior = c("unif", "prop"), cri = 4, verb = FALSE ) ## S3 method for class 'Lwplsrda_agg' predict(object, X, ...) ## S3 method for class 'Lwplsprobda_agg' predict(object, X, ...)
lwplsrda_agg( X, y, nlvdis, diss = c("eucl", "mahal"), h, k, nlv, cri = 4, verb = FALSE ) lwplslda_agg( X, y, nlvdis, diss = c("eucl", "mahal"), h, k, nlv, prior = c("unif", "prop"), cri = 4, verb = FALSE ) lwplsqda_agg( X, y, nlvdis, diss = c("eucl", "mahal"), h, k, nlv, prior = c("unif", "prop"), cri = 4, verb = FALSE ) ## S3 method for class 'Lwplsrda_agg' predict(object, X, ...) ## S3 method for class 'Lwplsprobda_agg' predict(object, X, ...)
X |
For the main functions: Training X-data ( |
y |
Training class membership ( |
nlvdis |
The number of LVs to consider in the global PLS used for the dimension reduction before calculating the dissimilarities. If |
diss |
The type of dissimilarity used for defining the neighbors. Possible values are "eucl" (default; Euclidean distance), "mahal" (Mahalanobis distance), or "correlation". Correlation dissimilarities are calculated by sqrt(.5 * (1 - rho)). |
h |
A scale scalar defining the shape of the weight function. Lower is |
k |
The number of nearest neighbors to select for each observation to predict. |
nlv |
A character string such as "5:20" defining the range of the numbers of LVs to consider (here: the models with 5, 6, ..., 20 LVs are averaged). Syntax such as "10" is also allowed (here: corresponds to the single model with 10 LVs). |
prior |
For |
cri |
Argument |
verb |
Logical. If |
object |
For the auxiliary functions: A fitted model, output of a call to the main function. |
... |
For the auxiliary functions: Optional arguments. Not used. |
For lwplsrda_agg
, lwplslda_agg
and lwplsqda_agg
: object of class lwplsrda_agg
, lwplslda_agg
or lwplsqda_agg
For predict.Lwplsrda_agg
and predict.Lwplsprobda_agg
:
pred |
prediction calculated for each observation, which is the most frequent level (vote) over the predictions returned by the models with the different numbers of LVs |
listnn |
list with the neighbors used for each observation to be predicted |
listd |
list with the distances to the neighbors used for each observation to be predicted |
listw |
list with the weights attributed to the neighbors used for each observation to be predicted |
The first example concerns KNN-LWPLSRDA-AGG. The second example concerns KNN-LWPLSLDA-AGG.
## KNN-LWPLSRDA-AGG n <- 40 ; p <- 7 X <- matrix(rnorm(n * p), ncol = p, byrow = TRUE) y <- sample(c(1, 4, 10), size = n, replace = TRUE) Xtrain <- X ; ytrain <- y m <- 5 Xtest <- X[1:m, ] ; ytest <- y[1:m] nlvdis <- 5 ; diss <- "mahal" h <- 2 ; k <- 10 nlv <- "2:4" fm <- lwplsrda_agg( Xtrain, ytrain, nlvdis = nlvdis, diss = diss, h = h, k = k, nlv = nlv) res <- predict(fm, Xtest) res$pred res$listnn nlvdis <- 5 ; diss <- "mahal" h <- c(2, Inf) k <- c(10, 15) nlv <- c("1:3", "2:4") pars <- mpars(nlvdis = nlvdis, diss = diss, h = h, k = k, nlv = nlv) pars res <- gridscore( Xtrain, ytrain, Xtest, ytest, score = err, fun = lwplsrda_agg, pars = pars) res segm <- segmkf(n = n, K = 3, nrep = 1) res <- gridcv( Xtrain, ytrain, segm, score = err, fun = lwplsrda_agg, pars = pars, verb = TRUE) names(res) res$val ## KNN-LWPLSLDA-AGG n <- 40 ; p <- 7 X <- matrix(rnorm(n * p), ncol = p, byrow = TRUE) y <- sample(c(1, 4, 10), size = n, replace = TRUE) Xtrain <- X ; ytrain <- y m <- 5 Xtest <- X[1:m, ] ; ytest <- y[1:m] nlvdis <- 5 ; diss <- "mahal" h <- 2 ; k <- 10 nlv <- "2:4" fm <- lwplslda_agg( Xtrain, ytrain, nlvdis = nlvdis, diss = diss, h = h, k = k, nlv = nlv, prior = "prop") res <- predict(fm, Xtest) res$pred res$listnn nlvdis <- 5 ; diss <- "mahal" h <- c(2, Inf) k <- c(10, 15) nlv <- c("1:3", "2:4") pars <- mpars(nlvdis = nlvdis, diss = diss, h = h, k = k, nlv = nlv, prior = c("unif", "prop")) pars res <- gridscore( Xtrain, ytrain, Xtest, ytest, score = err, fun = lwplslda_agg, pars = pars) res segm <- segmkf(n = n, K = 3, nrep = 1) res <- gridcv( Xtrain, ytrain, segm, score = err, fun = lwplslda_agg, pars = pars, verb = TRUE) names(res) res$val
## KNN-LWPLSRDA-AGG n <- 40 ; p <- 7 X <- matrix(rnorm(n * p), ncol = p, byrow = TRUE) y <- sample(c(1, 4, 10), size = n, replace = TRUE) Xtrain <- X ; ytrain <- y m <- 5 Xtest <- X[1:m, ] ; ytest <- y[1:m] nlvdis <- 5 ; diss <- "mahal" h <- 2 ; k <- 10 nlv <- "2:4" fm <- lwplsrda_agg( Xtrain, ytrain, nlvdis = nlvdis, diss = diss, h = h, k = k, nlv = nlv) res <- predict(fm, Xtest) res$pred res$listnn nlvdis <- 5 ; diss <- "mahal" h <- c(2, Inf) k <- c(10, 15) nlv <- c("1:3", "2:4") pars <- mpars(nlvdis = nlvdis, diss = diss, h = h, k = k, nlv = nlv) pars res <- gridscore( Xtrain, ytrain, Xtest, ytest, score = err, fun = lwplsrda_agg, pars = pars) res segm <- segmkf(n = n, K = 3, nrep = 1) res <- gridcv( Xtrain, ytrain, segm, score = err, fun = lwplsrda_agg, pars = pars, verb = TRUE) names(res) res$val ## KNN-LWPLSLDA-AGG n <- 40 ; p <- 7 X <- matrix(rnorm(n * p), ncol = p, byrow = TRUE) y <- sample(c(1, 4, 10), size = n, replace = TRUE) Xtrain <- X ; ytrain <- y m <- 5 Xtest <- X[1:m, ] ; ytest <- y[1:m] nlvdis <- 5 ; diss <- "mahal" h <- 2 ; k <- 10 nlv <- "2:4" fm <- lwplslda_agg( Xtrain, ytrain, nlvdis = nlvdis, diss = diss, h = h, k = k, nlv = nlv, prior = "prop") res <- predict(fm, Xtest) res$pred res$listnn nlvdis <- 5 ; diss <- "mahal" h <- c(2, Inf) k <- c(10, 15) nlv <- c("1:3", "2:4") pars <- mpars(nlvdis = nlvdis, diss = diss, h = h, k = k, nlv = nlv, prior = c("unif", "prop")) pars res <- gridscore( Xtrain, ytrain, Xtest, ytest, score = err, fun = lwplslda_agg, pars = pars) res segm <- segmkf(n = n, K = 3, nrep = 1) res <- gridcv( Xtrain, ytrain, segm, score = err, fun = lwplslda_agg, pars = pars, verb = TRUE) names(res) res$val
Calculation of within (matW
) and between (matB
) covariance matrices for classes of observations.
matW(X, y) matB(X, y)
matW(X, y) matB(X, y)
X |
Data ( |
y |
Class membership ( |
The denominator in the variance calculations is n.
For (matW
):
W |
within covariance matrix. |
Wi |
list of covariance matrices for each class. |
lev |
classes |
ni |
number of observations in each class |
For (matB
):
B |
between covariance matrix. |
ct |
matrix of class centers. |
lev |
classes |
ni |
number of observations in each class |
n <- 8 ; p <- 3 X <- matrix(rnorm(n * p), ncol = p) y <- sample(1:2, size = n, replace = TRUE) X y matW(X, y) matB(X, y) matW(X, y)$W + matB(X, y)$B (n - 1) / n * cov(X)
n <- 8 ; p <- 3 X <- matrix(rnorm(n * p), ncol = p) y <- sample(1:2, size = n, replace = TRUE) X y matW(X, y) matB(X, y) matW(X, y)$W + matB(X, y)$B (n - 1) / n * cov(X)
Smoothing, by moving average, of the row observations (e.g. spectra) of a dataset.
mavg(X, n = 5)
mavg(X, n = 5)
X |
X-data ( |
n |
The number of points (i.e. columns of |
A matrix of the transformed data.
data(cassav) X <- cassav$Xtest headm(X) Xp <- mavg(X, n = 11) headm(Xp) oldpar <- par(mfrow = c(1, 1)) par(mfrow = c(1, 2)) plotsp(X, main = "Signal") plotsp(Xp, main = "Corrected signal") abline(h = 0, lty = 2, col = "grey") par(oldpar)
data(cassav) X <- cassav$Xtest headm(X) Xp <- mavg(X, n = 11) headm(Xp) oldpar <- par(mfrow = c(1, 1)) par(mfrow = c(1, 2)) plotsp(X, main = "Signal") plotsp(Xp, main = "Corrected signal") abline(h = 0, lty = 2, col = "grey") par(oldpar)
Algorithm fitting a multi-block PLS1 or PLS2 model between dependent variables X (given as a list of blocks) and responses Y, based on the "Improved kernel algorithm #1" proposed by Dayal and MacGregor (1997).
For weighted versions, see for instance Schaal et al. 2002, Sicard & Sabatier 2006, Kim et al. 2011 and Lesnoff et al. 2020.
Auxiliary functions
transform
Calculates the LVs for any new matrix from the model.
summary
returns summary information for the model.
coef
Calculates the b-coefficients from the model, adjusted for the raw data.
predict
Calculates the predictions for any new matrix from the model.
mbplsr(Xlist, Y, blockscaling = TRUE, weights = NULL, nlv, Xscaling = c("none", "pareto", "sd")[1], Yscaling = c("none", "pareto", "sd")[1]) ## S3 method for class 'Mbplsr' transform(object, X, ..., nlv = NULL) ## S3 method for class 'Mbplsr' summary(object, X, ...) ## S3 method for class 'Mbplsr' coef(object, ..., nlv = NULL) ## S3 method for class 'Mbplsr' predict(object, X, ..., nlv = NULL)
Xlist |
For the main function: list of training X-data ( |
X |
For the auxiliary functions: list of new X-data, with the same variables as the training X-data. |
Y |
Training Y-data ( |
blockscaling |
logical. If TRUE, the scaling factor (computed on the training) is the "norm" of the block, i.e. the square root of the sum of the variances of each column of the block. |
weights |
Weights ( |
nlv |
For the main functions: The number(s) of LVs to calculate. — For the auxiliary functions: The number(s) of LVs to consider. |
Xscaling |
vector (of length Xlist) of variable scaling for each datablock, among "none" (mean-centering only), "pareto" (mean-centering and pareto scaling), "sd" (mean-centering and unit variance scaling). If "pareto" or "sd", uncorrected standard deviation is used. |
Yscaling |
character. variable scaling for the Y-block, among "none" (mean-centering only), "pareto" (mean-centering and pareto scaling), "sd" (mean-centering and unit variance scaling). If "pareto" or "sd", uncorrected standard deviation is used. |
object |
For the auxiliary functions: A fitted model, output of a call to the main functions. |
... |
For the auxiliary functions: Optional arguments. Not used. |
A list of outputs, such as
T |
The X-score matrix ( |
P |
The X-loadings matrix ( |
W |
The X-loading weights matrix ( |
C |
The Y-loading weights matrix (C = t(Beta), where Beta is the scores regression coefficients matrix). |
R |
The PLS projection matrix ( |
xmeans |
The list of centering vectors of |
ymeans |
The centering vector of |
xscales |
The list of |
yscales |
The vector of |
weights |
Weights applied to the training observations. |
TT |
the X-score normalization factor. |
blockscaling |
block scaling. |
Xnorms |
"norm" of each block, i.e. the square root of the sum of the variances of each column of each block, computed on the training, and used as scaling factor |
.
U |
intermediate output. |
For transform.Mbplsr
: X-scores matrix for new Xlist-data.
For summary.Mbplsr
:
explvarx |
matrix of explained variances. |
For coef.Mbplsr
:
int |
matrix (1,nlv) with the intercepts |
B |
matrix (n,nlv) with the coefficients |
For predict.Mbplsr
:
pred |
A list of matrices ( |
Andersson, M., 2009. A comparison of nine PLS1 algorithms. Journal of Chemometrics 23, 518-529.
Dayal, B.S., MacGregor, J.F., 1997. Improved PLS algorithms. Journal of Chemometrics 11, 73-85.
Hoskuldsson, A., 1988. PLS regression methods. Journal of Chemometrics 2, 211-228. https://doi.org/10.1002/cem.1180020306
Kim, S., Kano, M., Nakagawa, H., Hasebe, S., 2011. Estimation of active pharmaceutical ingredients content using locally weighted partial least squares and statistical wavelength selection. Int. J. Pharm., 421, 269-274.
Lesnoff, M., Metz, M., Roger, J.M., 2020. Comparison of locally weighted PLS strategies for regression and discrimination on agronomic NIR Data. Journal of Chemometrics. e3209. https://onlinelibrary.wiley.com/doi/abs/10.1002/cem.3209
Rannar, S., Lindgren, F., Geladi, P., Wold, S., 1994. A PLS kernel algorithm for data sets with many variables and fewer objects. Part 1: Theory and algorithm. Journal of Chemometrics 8, 111-125. https://doi.org/10.1002/cem.1180080204
Schaal, S., Atkeson, C., Vijayamakumar, S. 2002. Scalable techniques from nonparametric statistics for the real time robot learning. Applied Intell., 17, 49-60.
Sicard, E. Sabatier, R., 2006. Theoretical framework for local PLS1 regression and application to a rainfall data set. Comput. Stat. Data Anal., 51, 1393-1410.
Tenenhaus, M., 1998. La régression PLS: théorie et pratique. Editions Technip, Paris, France.
Wold, S., Sjostrom, M., Eriksson, l., 2001. PLS-regression: a basic tool for chemometrics. Chem. Int. Lab. Syst., 58, 109-130.
mbplsr_mbplsda_allsteps
function to help determine the optimal number of latent variables, perform a permutation test, calculate model parameters and predict new observations.
n <- 10 ; p <- 10 Xtrain <- matrix(rnorm(n * p), ncol = p) ytrain <- rnorm(n) m <- 2 Xtest <- matrix(rnorm(m * p), ncol = p) colnames(Xtrain) <- colnames(Xtest) <- paste("v", 1:p, sep = "") Xtrain Xtest blocks <- list(1:2, 4, 6:8) X1 <- mblocks(Xtrain, blocks = blocks) X2 <- mblocks(Xtest, blocks = blocks) nlv <- 3 fm <- mbplsr(Xlist = X1, Y = ytrain, Xscaling = c("sd","none","none"), blockscaling = TRUE, weights = NULL, nlv = nlv) summary(fm, X1) coef(fm) transform(fm, X2) predict(fm, X2)
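As a sketch of the block scaling described above, the block "norm" can be recomputed by hand and compared with the value stored in the fitted model. Two assumptions are made here: Xnorms stores one scaling factor per block, and the column variances are the uncorrected ones (if the corrected variance is used instead, the hand-computed value differs by a factor sqrt((n-1)/n)).

library(rchemo)
set.seed(1)
n <- 10 ; p <- 6
X <- matrix(rnorm(n * p), ncol = p)
y <- rnorm(n)
Xlist <- mblocks(X, blocks = list(1:3, 4:6))
fm <- mbplsr(Xlist, y, blockscaling = TRUE, nlv = 2)
## hand computation of the "norm" of block 1: sqrt of the sum of the
## (uncorrected) variances of its centered columns
v <- apply(Xlist[[1]], 2, function(u) mean((u - mean(u))^2))
sqrt(sum(v))
fm$Xnorms      # scaling factors stored in the model (expected: one value per block)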
Help determine the optimal number of latent variables by cross-validation, perform a permutation test, calculate model parameters and predict new observations, for mbplsr (mbplsr
), mbplsrda (mbplsrda
), mbplslda (mbplslda
) or mbplsqda (mbplsqda
) models.
mbplsr_mbplsda_allsteps(Xlist, Xnames = NULL, Xscaling = c("none","pareto","sd")[1], Y, Yscaling = c("none","pareto","sd")[1], weights = NULL, newXlist = NULL, newXnames = NULL, method = c("mbplsr", "mbplsrda","mbplslda","mbplsqda")[1], prior = c("unif", "prop")[1], step = c("nlvtest","permutation","model","prediction")[1], nlv, modeloutput = c("scores","loadings","coef","vip"), cvmethod = c("kfolds","loo")[1], nbrep = 30, seed = 123, samplingk = NULL, nfolds = 10, npermut = 30, criterion = c("err","rmse")[1], selection = c("localmin","globalmin","1std")[1], import = c("R","ChemFlow","W4M")[1], outputfilename = NULL)
Xlist |
list of training X-data ( |
Xnames |
names of the X-matrices |
Xscaling |
vector of Xlist length. X variable scaling among "none" (mean-centering only), "pareto" (mean-centering and pareto scaling), "sd" (mean-centering and unit variance scaling). If "pareto" or "sd", uncorrected standard deviation is used. |
Y |
Training Y-data ( |
Yscaling |
Y variable scaling among "none" (mean-centering only), "pareto" (mean-centering and pareto scaling), "sd" (mean-centering and unit variance scaling). If "pareto" or "sd", uncorrected standard deviation is used. |
weights |
Weights ( |
newXlist |
list of new X-data ( |
newXnames |
names of the newX-matrices |
method |
method to apply among "mbplsr", "mbplsrda", "mbplslda", "mbplsqda" |
prior |
for mbplslda or mbplsqda models: The prior probabilities of the classes. Possible values are "unif" (default; probabilities are set equal for all the classes) or "prop" (probabilities are set equal to the observed proportions of the classes in |
step |
step of the analysis among "nlvtest" (cross-validation to help determine the optimal number of latent variables), "permutation" (permutation test), "model" (model calculation), "prediction" (prediction of newX-data, or X-data if any) |
nlv |
number of latent variables to test if step is "nlvtest"; number of latent variables of the model if step is not "nlvtest". |
modeloutput |
if step is "model": outputs among "scores", "loadings", "coef" (regression coefficients), "vip" (Variable Importance in Projection; the VIP calculation being based on the proportion of Y-variance explained by the components, as proposed by Mehmood et al (2012, 2020).) |
cvmethod |
if step is "nlvtest" or "permutation": "kfolds" for k-folds cross-validation, or "loo" for leave-one-out. |
nbrep |
if step is "nlvtest" and cvmethod is "kfolds": An integer, setting the number of CV repetitions. Default value is 30. Must me set to 1 if cvmethod is "loo" |
seed |
if step is "nlvtest" and cvmethod is "kfolds", or if step is "permutation: a numeric. Seed used for the repeated resampling |
samplingk |
A vector of length n. The elements are the values of a qualitative variable used for stratified partition creation. If NULL, the first observation is set in the first fold, the second observation in the second fold, etc... |
nfolds |
if cvmethod is "kfolds". An integer, setting the number of partitions to create. Default value is 10. |
npermut |
if step is "permutation": An integer, setting the number of Y-Block with permutated responses to create. Default value is 30. |
criterion |
if step is "nlvtest" or "permutation" and method is "plsrda", "plslda" or "plsqda": optimisation criterion among "rmse" and "err" (for classification error rate))) |
selection |
if step is "nlvtest": a character indicating the selection method to use to choose the optimal combination of components, among "localmin","globalmin","1std". If "localmin": the optimal combination corresponds to the first local minimum of the mean CV rmse or error rate. If "globalmin" : the optimal combination corresponds to the minimum mean CV rmse or error rate. If "1std" (one standard error rule) : it corresponds to the first combination after which the mean cross-validated rmse or error rate does not decrease significantly. |
import |
If "R", X and Y are in the global environment, and the observation names are in rownames. If "ChemFlow", X and Y are tabulated tables (.txt), and the observation names are in the first column. If "W4M", X and Y are tabulated tables (.txt), and the observation names are in the headers of X, and in the first column of Y. |
outputfilename |
character: If not NULL, name of the tabular file in which the function outputs have to be written. |
If step is "nlvtest": table with rmsecv or cross-validated classification error rates. The suggested optimal number of latent variables is indicated by the binary "optimum" variable.
If step is "permutation": table with the dissimilarity between the original and the permutated Y-block, and the rmsecv or cross-validated classification error rates obtained with the permutated Y-block by the model and the given number of latent variables.
If step is "model": tables of scores, loadings, regression coefficients, and vip values, depending of the "modeloutput" parameter.
If step is "prediction": table of predicted scores and predicted classes or values.
n <- 50 ; p <- 8 Xtrain <- matrix(rnorm(n * p), ncol = p) colnames(Xtrain) <- paste0("V",1:p) ytrain <- sample(c(1, 4, 10), size = n, replace = TRUE) Xtest <- Xtrain[1:5, ] ; ytest <- ytrain[1:5] Xtrainlist <- list(Xtrain[,1:3], Xtrain[,4:8]) Xtestlist <- list(Xtest[,1:3], Xtest[,4:8]) nlv <- 5 resnlvtestmbplsrda <- mbplsr_mbplsda_allsteps(Xlist = Xtrainlist, Xnames = NULL, Xscaling = c("none","pareto","sd")[1], Y = ytrain, Yscaling = "none", weights = NULL, newXlist = Xtestlist, newXnames = NULL, method = c("mbplsr", "mbplsrda","mbplslda","mbplsqda")[2], prior = c("unif", "prop")[1], step = c("nlvtest","permutation","model","prediction")[1], nlv = 5, modeloutput = c("scores","loadings","coef","vip"), cvmethod = c("kfolds","loo")[2], nbrep = 1, seed = 123, samplingk = NULL, nfolds = 10, npermut = 5, criterion = c("err","rmse")[1], selection = c("localmin","globalmin","1std")[1], outputfilename = NULL) respermutationmbplsrda <- mbplsr_mbplsda_allsteps(Xlist = Xtrainlist, Xnames = NULL, Xscaling = c("none","pareto","sd")[1], Y = ytrain, Yscaling = "none", weights = NULL, newXlist = Xtestlist, newXnames = NULL, method = c("mbplsr", "mbplsrda","mbplslda","mbplsqda")[2], prior = c("unif", "prop")[1], step = c("nlvtest","permutation","model","prediction")[2], nlv = 1, modeloutput = c("scores","loadings","coef","vip"), cvmethod = c("kfolds","loo")[2], nbrep = 1, seed = 123, samplingk = NULL, nfolds = 10, npermut = 5, criterion = c("err","rmse")[1], selection = c("localmin","globalmin","1std")[1], outputfilename = NULL) plotxy(respermutationmbplsrda, pch=16) abline (h = respermutationmbplsrda[respermutationmbplsrda[,"permut_dyssimilarity"]==0,"res_permut"]) resmodelmbplsrda <- mbplsr_mbplsda_allsteps(Xlist = Xtrainlist, Xnames = NULL, Xscaling = c("none","pareto","sd")[1], Y = ytrain, Yscaling = "none", weights = NULL, newXlist = Xtestlist, newXnames = NULL, method = c("mbplsr", "mbplsrda","mbplslda","mbplsqda")[2], prior = c("unif", "prop")[1], step = c("nlvtest","permutation","model","prediction")[3], nlv = 1, modeloutput = c("scores","loadings","coef","vip"), cvmethod = c("kfolds","loo")[2], nbrep = 1, seed = 123, samplingk = NULL, nfolds = 10, npermut = 5, criterion = c("err","rmse")[1], selection = c("localmin","globalmin","1std")[1], outputfilename = NULL) resmodelmbplsrda$scores resmodelmbplsrda$loadings resmodelmbplsrda$coef resmodelmbplsrda$vip respredictionmbplsrda <- mbplsr_mbplsda_allsteps(Xlist = Xtrainlist, Xnames = NULL, Xscaling = c("none","pareto","sd")[1], Y = ytrain, Yscaling = "none", weights = NULL, newXlist = Xtestlist, newXnames = NULL, method = c("mbplsr", "mbplsrda","mbplslda","mbplsqda")[2], prior = c("unif", "prop")[1], step = c("nlvtest","permutation","model","prediction")[4], nlv = 1, modeloutput = c("scores","loadings","coef","vip"), cvmethod = c("kfolds","loo")[2], nbrep = 1, seed = 123, samplingk = NULL, nfolds = 10, npermut = 5, criterion = c("err","rmse")[1], selection = c("localmin","globalmin","1std")[1], outputfilename = NULL)
Multi-block discrimination (DA) based on PLS.
The training variable y (univariate class membership) is firstly transformed to a dummy table containing nclas columns, where nclas is the number of classes present in y. Each column is a dummy variable (0/1). Then, a PLS2 is implemented on the X-data and the dummy table, returning latent variables (LVs) that are used as dependent variables in a DA model.
- mbplsrda
: Usual "PLSDA". A linear regression model predicts the Y-dummy table from the PLS2 LVs. This corresponds to the PLSR2 of the X-data and the Y-dummy table. For a given observation, the final prediction is the class corresponding to the dummy variable for which the prediction is the highest.
- mbplslda
and mbplsqda
: Probabilistic LDA and QDA are run over the PLS2 LVs, respectively.
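The dummy coding described above can be sketched in a few lines of base R (rchemo has its own internal coding; the small reconstruction below is only illustrative):

y <- c("a", "b", "a", "c", "b")
lev <- sort(unique(y))                 # classes
Ydummy <- 1 * outer(y, lev, "==")      # one 0/1 column per class
colnames(Ydummy) <- lev
Ydummy
## An MB-PLS2 is then fitted on the X-blocks and Ydummy; for a new observation,
## mbplsrda predicts the class whose dummy column gets the highest predicted value.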
mbplsrda(Xlist, y, blockscaling = TRUE, weights = NULL, nlv, Xscaling = c("none", "pareto", "sd")[1], Yscaling = c("none", "pareto", "sd")[1]) mbplslda(Xlist, y, blockscaling = TRUE, weights = NULL, nlv, prior = c("unif", "prop"), Xscaling = c("none", "pareto", "sd")[1], Yscaling = c("none", "pareto", "sd")[1]) mbplsqda(Xlist, y, blockscaling = TRUE, weights = NULL, nlv, prior = c("unif", "prop"), Xscaling = c("none", "pareto", "sd")[1], Yscaling = c("none", "pareto", "sd")[1]) ## S3 method for class 'Mbplsrda' predict(object, X, ..., nlv = NULL) ## S3 method for class 'Mbplsprobda' predict(object, X, ..., nlv = NULL)
Xlist |
For the main functions: list of training X-data ( |
X |
For the auxiliary functions: list of new X-data ( |
y |
Training class membership ( |
blockscaling |
logical. If TRUE, the scaling factor (computed on the training) is the "norm" of the block, i.e. the square root of the sum of the variances of each column of the block. |
weights |
Weights ( |
nlv |
The number(s) of LVs to calculate. |
prior |
The prior probabilities of the classes. Possible values are "unif" (default; probabilities are set equal for all the classes) or "prop" (probabilities are set equal to the observed proportions of the classes in |
Xscaling |
vector (of length Xlist) of variable scaling for each datablock, among "none" (mean-centering only), "pareto" (mean-centering and pareto scaling), "sd" (mean-centering and unit variance scaling). If "pareto" or "sd", uncorrected standard deviation is used. |
Yscaling |
character. variable scaling for the Y-block after binary transformation, among "none" (mean-centering only), "pareto" (mean-centering and pareto scaling), "sd" (mean-centering and unit variance scaling). If "pareto" or "sd", uncorrected standard deviation is used. |
object |
For the auxiliary functions: A fitted model, output of a call to the main functions. |
... |
For the auxiliary functions: Optional arguments. Not used. |
For mbplsrda
:
fm |
list with the MB-PLS model: ( |
lev |
classes |
ni |
number of observations in each class |
For mbplslda
, mbplsqda
:
fm |
list with
[[1]] the MB-PLS model: ( |
lev |
classes |
ni |
number of observations in each class |
For predict.Mbplsrda
, predict.Mbplsprobda
:
pred |
predicted class for each observation |
posterior |
calculated probability of belonging to a class for each observation |
The first example concerns MB-PLSDA, the second MB-PLS LDA, and the third MB-PLS QDA.
In each example, fm is the fitted discrimination model, and zfm is the underlying multi-block PLS2 model fitted on the Y-dummy table.
mbplsr_mbplsda_allsteps
function to help determine the optimal number of latent variables, perform a permutation test, calculate model parameters and predict new observations.
## EXAMPLE OF MB-PLSDA n <- 50 ; p <- 8 Xtrain <- matrix(rnorm(n * p), ncol = p) Xtrainlist <- list(Xtrain[,1:3], Xtrain[,4:8]) ytrain <- sample(c(1, 4, 10), size = n, replace = TRUE) Xtest <- Xtrain[1:5, ] ; ytest <- ytrain[1:5] Xtestlist <- list(Xtest[,1:3], Xtest[,4:8]) nlv <- 5 fm <- mbplsrda(Xtrainlist, ytrain, Xscaling = "sd", nlv = nlv) names(fm) predict(fm, Xtestlist) predict(fm, Xtestlist, nlv = 0:2)$pred pred <- predict(fm, Xtestlist)$pred err(pred, ytest) zfm <- fm$fm transform(zfm, Xtestlist) transform(zfm, Xtestlist, nlv = 1) summary(zfm, Xtrainlist) coef(zfm) coef(zfm, nlv = 0) coef(zfm, nlv = 2) ## EXAMPLE OF MB-PLS LDA n <- 50 ; p <- 8 Xtrain <- matrix(rnorm(n * p), ncol = p) Xtrainlist <- list(Xtrain[,1:3], Xtrain[,4:8]) ytrain <- sample(c(1, 4, 10), size = n, replace = TRUE) Xtest <- Xtrain[1:5, ] ; ytest <- ytrain[1:5] Xtestlist <- list(Xtest[,1:3], Xtest[,4:8]) nlv <- 5 fm <- mbplslda(Xtrainlist, ytrain, Xscaling = "none", nlv = nlv) predict(fm, Xtestlist) predict(fm, Xtestlist, nlv = 1:2)$pred zfm <- fm[[1]][[1]] class(zfm) names(zfm) summary(zfm, Xtrainlist) transform(zfm, Xtestlist) coef(zfm) ## EXAMPLE OF MB-PLS QDA n <- 50 ; p <- 8 Xtrain <- matrix(rnorm(n * p), ncol = p) Xtrainlist <- list(Xtrain[,1:3], Xtrain[,4:8]) ytrain <- sample(c(1, 4, 10), size = n, replace = TRUE) Xtest <- Xtrain[1:5, ] ; ytest <- ytrain[1:5] Xtestlist <- list(Xtest[,1:3], Xtest[,4:8]) nlv <- 5 fm <- mbplsqda(Xtrainlist, ytrain, Xscaling = "none", nlv = nlv) predict(fm, Xtestlist) predict(fm, Xtestlist, nlv = 1:2)$pred zfm <- fm[[1]][[1]] class(zfm) names(zfm) summary(zfm, Xtrainlist) transform(zfm, Xtestlist) coef(zfm)
Residuals and prediction error rates (MSEP, SEP, etc. or classification error rate) for models with quantitative or qualitative responses.
residreg(pred, Y) residcla(pred, y) msep(pred, Y) rmsep(pred, Y) sep(pred, Y) bias(pred, Y) cor2(pred, Y) r2(pred, Y) rpd(pred, Y) rpdr(pred, Y) mse(pred, Y, digits = 3) err(pred, y)
pred |
Prediction ( |
Y |
Observed response ( |
y |
Observed response ( |
digits |
Number of digits for the numerical outputs. |
The rate R2 is calculated by R2 = 1 - MSEP(current model) / MSEP(null model), where the "null model" is the model predicting the overall mean of the observed response. For predictions over CV or Test sets, and/or for non-linear models, it can be different from the square of the correlation coefficient (returned by function cor2) between the observed values and the predictions.
Function sep
computes the SEP, referred to as the "corrected SEP" (SEP_c) in Bellon-Maurel et al. 2010. SEP is the standard deviation of the residuals. There is the relation: MSEP = SEP^2 + bias^2.
Function rpd
computes the ratio of the "deviation" (sqrt of the mean of the squared residuals for the null model, when the null model is defined by the simple average) to the "performance" (sqrt of the mean of the squared residuals for the current model, i.e. RMSEP), i.e. RPD = RMSEP(null model) / RMSEP(current model) (see e.g. Bellon-Maurel et al. 2010).
Function rpdr
computes a robust RPD.
Residuals or prediction error rates.
Bellon-Maurel, V., Fernandez-Ahumada, E., Palagos, B., Roger, J.-M., McBratney, A., 2010. Critical review of chemometric indicators commonly used for assessing the quality of the prediction of soil attributes by NIR spectroscopy. TrAC Trends in Analytical Chemistry 29, 1073-1081. https://doi.org/10.1016/j.trac.2010.05.006
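The relations stated above can be checked numerically with a short sketch (assuming the rchemo error functions accept a vector response, and that sep uses the standard deviation consistent with MSEP = SEP^2 + bias^2):

library(rchemo)
set.seed(1)
n <- 30 ; p <- 5
X <- matrix(rnorm(n * p), ncol = p)
y <- rnorm(n)
fm <- plskern(X, y, nlv = 2)
pred <- predict(fm, X)$pred
msep_null <- mean((y - mean(y))^2)        # null model = overall mean of y
1 - msep(pred, y) / msep_null             # expected to match r2(pred, y)
r2(pred, y)
sep(pred, y)^2 + bias(pred, y)^2          # expected to match msep(pred, y)
msep(pred, y)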
## EXAMPLE 1 n <- 6 ; p <- 4 Xtrain <- matrix(rnorm(n * p), ncol = p) ytrain <- rnorm(n) Ytrain <- cbind(y1 = ytrain, y2 = 100 * ytrain) m <- 3 Xtest <- Xtrain[1:m, , drop = FALSE] Ytest <- Ytrain[1:m, , drop = FALSE] ytest <- Ytest[1:m, 1] nlv <- 3 fm <- plskern(Xtrain, Ytrain, nlv = nlv) pred <- predict(fm, Xtest)$pred residreg(pred, Ytest) msep(pred, Ytest) rmsep(pred, Ytest) sep(pred, Ytest) bias(pred, Ytest) cor2(pred, Ytest) r2(pred, Ytest) rpd(pred, Ytest) rpdr(pred, Ytest) mse(pred, Ytest, digits = 3) ## EXAMPLE 2 n <- 50 ; p <- 8 Xtrain <- matrix(rnorm(n * p), ncol = p) ytrain <- sample(c(1, 4, 10), size = n, replace = TRUE) Xtest <- Xtrain[1:5, ] ytest <- ytrain[1:5] nlv <- 5 fm <- plsrda(Xtrain, ytrain, nlv = nlv) pred <- predict(fm, Xtest)$pred residcla(pred, ytest) err(pred, ytest)
Octane dataset.
Near infrared (NIR) spectra (absorbance) of n = 39 gasoline samples over p = 226 wavelengths (1102 nm to 1552 nm, step = 2 nm).
Samples 25, 26, and 36-39 contain added alcohol (outliers).
data(octane)
A list with 1 component: the matrix X with 39 samples and 226 variables.
Esbensen, K.H., Schoenkopf, S., Midtgaard, T., 1994. Multivariate Analysis in Practice. Camo, Trondheim, Norway.
Todorov, V. 2020. rrcov: Robust Location and Scatter Estimation and Robust Multivariate Analysis with High Breakdown. R Package version 1.5-5. https://cran.r-project.org/.
M. Hubert, P. J. Rousseeuw, K. Vanden Branden (2005), ROBPCA: a new approach to robust principal components analysis, Technometrics, 47, 64-79.
P. J. Rousseeuw, M. Debruyne, S. Engelen and M. Hubert (2006), Robustness and Outlier Detection in Chemometrics, Critical Reviews in Analytical Chemistry, 36(3-4), 221-242.
data(octane) X <- octane$X headm(X) plotsp(X, xlab = "Wavelength", ylab = "Absorbance") plotsp(X[c(25:26, 36:39), ], add = TRUE, col = "red")
odis
calculates the orthogonal distances (OD = "X-residuals") for a PCA or PLS model. OD is the Euclidean distance of a row observation to its projection onto the score plane (see e.g. Hubert et al. 2005, Vanden Branden & Hubert 2005, p. 66; Varmuza & Filzmoser, 2009, p. 79).
A distance cutoff is computed using a moment estimation of the parameters of a Chi-squared distribution for OD^2 (see Nomikos & MacGregor 1995, and Pomerantzev 2008). In the function output, column dstand
is a standardized distance defined as OD / cutoff. A value
dstand > 1
can be considered as extreme.
The cutoff for detecting extreme OD values is computed using a moment estimation of a Chi-squared distribution for the squared distance.
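A minimal sketch of this standardization (the column name "d" for the raw distance is an assumption; dstand and cutoff are described in the outputs below):

library(rchemo)
set.seed(1)
Xtrain <- matrix(rnorm(20 * 5), ncol = 5)
fm <- pcasvd(Xtrain, nlv = 2)
res <- odis(fm, Xtrain)
head(res$res.train)
res$res.train[, "d"] / res$cutoff          # expected to reproduce the dstand column
which(res$res.train[, "dstand"] > 1)       # observations flagged as extreme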
odis( object, Xtrain, X = NULL, nlv = NULL, rob = TRUE, alpha = .01 )
object |
A fitted model, output of a call to a fitting function. |
Xtrain |
Training X-data that was used to fit the model. |
X |
New X-data. |
nlv |
Number of components (PCs or LVs) to consider. |
rob |
Logical. If |
alpha |
Risk- |
res.train |
matrix with distance and a standardized distance calculated for Xtrain. |
res |
matrix with distance and a standardized distance calculated for X. |
cutoff |
distance cutoff computed using a moment estimation of the parameters of a Chi-squared distribution for OD^2. |
M. Hubert, P. J. Rousseeuw, K. Vanden Branden (2005). ROBPCA: a new approach to robust principal components analysis. Technometrics, 47, 64-79.
Nomikos, P., MacGregor, J.F., 1995. Multivariate SPC Charts for Monitoring Batch Processes. Technometrics 37, 41-59. https://doi.org/10.1080/00401706.1995.10485888
Pomerantsev, A.L., 2008. Acceptance areas for multivariate classification derived by projection methods. Journal of Chemometrics 22, 601-609. https://doi.org/10.1002/cem.1147
K. Vanden Branden, M. Hubert (2005). Robust classification in high dimension based on the SIMCA method. Chem. Lab. Int. Syst, 79, 10-21.
K. Varmuza, P. Filzmoser (2009). Introduction to multivariate statistical analysis in chemometrics. CRC Press, Boca Raton.
n <- 6 ; p <- 4 Xtrain <- matrix(rnorm(n * p), ncol = p) ytrain <- rnorm(n) Xtest <- Xtrain[1:3, , drop = FALSE] nlv <- 3 fm <- pcasvd(Xtrain, nlv = nlv) odis(fm, Xtrain) odis(fm, Xtrain, nlv = 2) odis(fm, Xtrain, X = Xtest, nlv = 2)
Function orthog
orthogonalizes a matrix Y to a matrix X. The row observations can be weighted.
The function uses function lm
.
orthog(X, Y, weights = NULL)
X |
A |
Y |
A |
weights |
A vector of length |
Y |
The |
b |
The regression coefficients used for orthogonalization. |
n <- 8 ; p <- 3 set.seed(1) X <- matrix(rnorm(n * p, mean = 10), ncol = p, byrow = TRUE) Y <- matrix(rnorm(n * 2, mean = 10), ncol = 2, byrow = TRUE) colnames(Y) <- c("y1", "y2") set.seed(NULL) X Y res <- orthog(X, Y) res$Y crossprod(res$Y, X) res$b # Same as: fm <- lm(Y ~ X) Y - fm$fitted.values fm$coef #### WITH WEIGHTS w <- 1:n fm <- lm(Y ~ X, weights = w) Y - fm$fitted.values fm$coef res <- orthog(X, Y, weights = w) res$Y t(res$Y) res$b
Los Angeles ozone pollution data in 1976 (sources: Breiman & Friedman 1985, Leisch & Dimitriadou 2020).
data(ozone)
A list with 1 component: the matrix X
with 366 observations, 13 variables. The variable to predict is V4.
V1
Month: 1 = January, ..., 12 = December
V2
Day of month
V3
Day of week: 1 = Monday, ..., 7 = Sunday
V4
Daily maximum one-hour-average ozone reading
V5
500 millibar pressure height (m) measured at Vandenberg AFB
V6
Wind speed (mph) at Los Angeles International Airport (LAX)
V7
Humidity (%) at LAX
V8
Temperature (degrees F) measured at Sandburg, CA
V9
Temperature (degrees F) measured at El Monte, CA
V10
Inversion base height (feet) at LAX
V11
Pressure gradient (mm Hg) from LAX to Daggett, CA
V12
Inversion base temperature (degrees F) at LAX
V13
Visibility (miles) measured at LAX
Breiman L., Friedman J.H. 1985. Estimating optimal transformations for multiple regression and correlation, JASA, 80, pp. 580-598.
Leisch, F. and Dimitriadou, E. (2010). mlbench: Machine Learning Benchmark Problems. R package version 1.1-6. https://cran.r-project.org/.
data(ozone) z <- ozone$X head(z) plotxna(z)
Algorithms fitting a centered weighted PCA of a matrix X.
Noting D the (n, n) diagonal matrix of the weights of the observations (rows of X), the functions consist in:
- pcasvd: SVD factorization of D^(1/2) X, using function svd.
- pcaeigen: Eigen factorization of X' D X, using function eigen.
- pcaeigenk: Eigen factorization of D^(1/2) X X' D^(1/2), using function eigen. This is the "kernel cross-product trick" version of the PCA algorithm (Wu et al. 1997). For wide matrices (n << p) and n not too large, this algorithm can be much faster than the others.
- pcanipals: Eigen factorization of X' D X using NIPALS.
- pcanipalsna: Eigen factorization of X' X using NIPALS, allowing missing data in X.
- pcasph: Robust spherical PCA (Locantore et al. 1999, Maronna 2005, Daszykowski et al. 2007).
Function pcanipalsna
accepts missing data (NA
s) in X, unlike the other functions. The part of
pcanipalsna
accounting specifically for missing data is based on the efficient code of K. Wright in the R package nipals
(https://cran.r-project.org/web/packages/nipals/index.html).
Gram-Schmidt orthogonalization in the NIPALS algorithm
The PCA NIPALS is known to generate a loss of orthogonality of the PCs (due to the accumulation of rounding errors in the successive iterations), particularly for large matrices or with high degrees of column collinearity.
With missing data, the orthogonality of the loadings is not satisfied either.
An approach for coming back to orthogonality (PCs and loadings) is the iterative classical Gram-Schmidt orthogonalization (Lingen 2000, Andrecut 2009, and vignette of R package nipals
), referred to as the iterative CGS. It consists in adding a CGS orthogonalization step in each iteration of the PCs and loadings calculations.
For the case with missing data, the iterative CGS does not ensure that the orthogonalized PCs are centered.
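A small sketch of the orthogonality check discussed above (assuming rchemo is loaded): with gs = TRUE, the cross-product of the scores is expected to be (near) diagonal, and the NIPALS PCs should match the SVD PCs up to sign.

library(rchemo)
set.seed(1)
X <- matrix(rnorm(20 * 50), ncol = 50)     # wide matrix
fm <- pcanipals(X, nlv = 4, gs = TRUE)
round(crossprod(fm$T), 6)                  # off-diagonal terms expected ~0
fm2 <- pcasvd(X, nlv = 4)
round(abs(cor(fm$T, fm2$T)), 3)            # expected ~identity (PCs equal up to sign)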
Auxiliary function
transform
Calculates the PCs for any new matrix from the model.
summary
returns summary information for the model.
pcasvd(X, weights = NULL, nlv) pcaeigen(X, weights = NULL, nlv) pcaeigenk(X,weights = NULL, nlv) pcanipals(X, weights = NULL, nlv, gs = TRUE, tol = .Machine$double.eps^0.5, maxit = 200) pcanipalsna(X, nlv, gs = TRUE, tol = .Machine$double.eps^0.5, maxit = 200) pcasph(X, weights = NULL, nlv) ## S3 method for class 'Pca' transform(object, X, ..., nlv = NULL) ## S3 method for class 'Pca' summary(object, X, ...)
X |
For the main functions and auxiliary function |
weights |
Weights ( |
nlv |
The number of PCs to calculate. |
object |
A fitted model, output of a call to the main functions. |
... |
Optional arguments. |
Specific for the NIPALS algorithm
gs |
Logical indicating if a Gram-Schmidt orthogonalization is implemented or not (default to |
tol |
Tolerance for testing convergence of the NIPALS iterations for each PC. |
maxit |
Maximum number of NIPALS iterations for each PC. |
A list of outputs, such as:
T |
The score matrix ( |
P |
The loadings matrix ( |
R |
The projection matrix (= |
sv |
The singular values ( |
eig |
The eigenvalues ( |
xmeans |
The centering vector of |
niter |
Numbers of iterations of the NIPALS. |
conv |
Logical indicating if the NIPALS converged before reaching the maximal number of iterations. |
Andrecut, M., 2009. Parallel GPU Implementation of Iterative PCA Algorithms. Journal of Computational Biology 16, 1593-1599. https://doi.org/10.1089/cmb.2008.0221
Gabriel, R. K., 2002. Le biplot - Outil d'exploration de données multidimensionnelles. Journal de la Société Française de la Statistique, 143, 5-55.
Lingen, F.J., 2000. Efficient Gram-Schmidt orthonormalisation on parallel computers. Communications in Numerical Methods in Engineering 16, 57-66. https://doi.org/10.1002/(SICI)1099-0887(200001)16:1<57::AID-CNM320>3.0.CO;2-I
Tenenhaus, M., 1998. La régression PLS: théorie et pratique. Editions Technip, Paris, France.
Wright, K., 2018. Package nipals: Principal Components Analysis using NIPALS with Gram-Schmidt Orthogonalization. https://cran.r-project.org/
Wu, W., Massart, D.L., de Jong, S., 1997. The kernel PCA algorithms for wide data. Part I: Theory and algorithms. Chemometrics and Intelligent Laboratory Systems 36, 165-172. https://doi.org/10.1016/S0169-7439(97)00010-5
For Spherical PCA:
Daszykowski, M., Kaczmarek, K., Vander Heyden, Y., Walczak, B., 2007. Robust statistics in data analysis - A review. Chemometrics and Intelligent Laboratory Systems 85, 203-219. https://doi.org/10.1016/j.chemolab.2006.06.016
Locantore N., Marron J.S., Simpson D.G., Tripoli N., Zhang J.T., Cohen K.L. Robust principal component analysis for functional data, Test 8 (1999) 1-7
Maronna, R., 2005. Principal components and orthogonal regression based on robust scales, Technometrics, 47:3, 264-273, DOI: 10.1198/004017005000000166
n <- 6 ; p <- 4 Xtrain <- matrix(rnorm(n * p), nrow = n) s <- c(3, 4, 7, 10, 11, 15, 21:24) zX <- replace(Xtrain, s, NA) Xtrain zX m <- 2 Xtest <- matrix(rnorm(m * p), nrow = m) pcasvd(Xtrain, nlv = 3) pcaeigen(Xtrain, nlv = 3) pcaeigenk(Xtrain, nlv = 3) pcanipals(Xtrain, nlv = 3) pcanipalsna(Xtrain, nlv = 3) pcanipalsna(zX, nlv = 3) fm <- pcaeigen(Xtrain, nlv = 3) fm$T transform(fm, Xtest) transform(fm, Xtest, nlv = 2) pcaeigen(Xtrain, nlv = 3)$T pcaeigen(Xtrain, nlv = 3, weights = 1:n)$T Ttrain <- fm$T Ttest <- transform(fm, Xtest) T <- rbind(Ttrain, Ttest) group <- c(rep("Training", nrow(Ttrain)), rep("Test", nrow(Ttest))) i <- 1 plotxy(T[, i:(i+1)], group = group, pch = 16, zeroes = TRUE, cex = 1.3, main = "scores") plotxy(fm$P, zeroes = TRUE, label = TRUE, cex = 2, col = "red3", main ="loadings") summary(fm, Xtrain) res <- summary(fm, Xtrain) plotxy(res$cor.circle, zeroes = TRUE, label = TRUE, cex = 2, col = "red3", circle = TRUE, ylim = c(-1, 1))
Calculation of the Moore-Penrose (MP) pseudo-inverse of a matrix X.
pinv(X, tol = sqrt(.Machine$double.eps))
X |
X-data ( |
tol |
A relative tolerance to detect zero singular values. |
Xplus |
The MP pseudo-inverse. |
sv |
singular values. |
n <- 7 ; p <- 4 X <- matrix(rnorm(n * p), ncol = p) y <- rnorm(n) pinv(X) tcrossprod(pinv(X)$Xplus, t(y)) lm(y ~ X - 1)
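A minimal check of the defining Moore-Penrose property X %*% Xplus %*% X = X (sketch, assuming rchemo is loaded):

library(rchemo)
set.seed(1)
X <- matrix(rnorm(7 * 4), ncol = 4)
Xplus <- pinv(X)$Xplus
max(abs(X %*% Xplus %*% X - X))    # expected to be ~0 (numerical precision)
dim(Xplus)                         # p x n, i.e. 4 x 7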
Plot comparing classes with jittered points (random noise is added to the x-axis values to avoid overplotting).
plotjit(x, y, group = NULL, jit = 1, col = NULL, alpha.f = .8, legend = TRUE, legend.title = NULL, ncol = 1, med = TRUE, ...)
x |
A vector of length |
y |
A vector of length |
group |
A vector of length |
jit |
Scalar defining the jittering magnitude. Default to 1. |
alpha.f |
Scalar modifying the opacity of the points in the graphics; typically in [0,1]. See |
col |
A color, or a vector of colors (of length equal to the number of classes or groups), defining the color(s) of the points. |
legend |
Only if there are groups. Logical indicating whether a legend is drawn for the groups (Default to |
legend.title |
Character string indicating a title for the legend. |
ncol |
Number of columns drawn in the legend box. |
med |
Logical. If |
... |
Other arguments to pass in |
Jittered plot.
n <- 500 x <- c(rep("A", n), rep("B", n)) y <- c(rnorm(n), rnorm(n, mean = 5, sd = 3)) group <- sample(1:2, size = 2 * n, replace = TRUE) plotjit(x, y, pch = 16, jit = .5, alpha.f = .5) plotjit(x, y, pch = 16, jit = .5, alpha.f = .5, group = group)
Plotting scores of prediction errors (error rates).
plotscore(x, y, group = NULL, col = NULL, steplab = 2, legend = TRUE, legend.title = NULL, ncol = 1, ...)
x |
Horizontal axis vector ( |
y |
Vertical axis vector ( |
group |
Groups of data ( |
col |
A color, or a vector of colors (of length equal to the number of groups), defining the color(s) of the groups. |
steplab |
A step for the horizontal axis. Can be |
legend |
Only if there are groups. Logical indicating whether a legend is drawn for the groups (Default to |
legend.title |
Character string indicating a title for the legend. |
ncol |
Number of columns drawn in the legend box. |
... |
Other arguments to pass in function |
A plot.
n <- 50 ; p <- 20 Xtrain <- matrix(rnorm(n * p), ncol = p, byrow = TRUE) ytrain <- rnorm(n) Ytrain <- cbind(ytrain, 10 * rnorm(n)) m <- 3 Xtest <- Xtrain[1:m, ] Ytest <- Ytrain[1:m, ] ; ytest <- Ytest[, 1] nlv <- 15 res <- gridscorelv( Xtrain, ytrain, Xtest, ytest, score = msep, fun = plskern, nlv = 0:nlv, verb = TRUE ) plotscore(res$nlv, res$y1, main = "MSEP", xlab = "Nb. LVs", ylab = "Value") nlvdis <- 5 h <- c(1, Inf) k <- c(10, 20) nlv <- 15 pars <- mpars(nlvdis = nlvdis, diss = "mahal", h = h, k = k) res <- gridscorelv( Xtrain, Ytrain, Xtest, Ytest, score = msep, fun = lwplsr, nlv = 0:nlv, pars = pars, verb = TRUE) headm(res) group <- paste("h=", res$h, " k=", res$k, sep = "") plotscore(res$nlv, res$y1, group = group, main = "MSEP", xlab = "Nb. LVs", ylab = "Value")
plotsp
plots lines corresponding to the row observations (e.g. spectra) of a data set.
plotsp1
plots only one observation per plot (e.g. spectrum by spectrum) by scrolling the rows. After running a plotsp1
command, the plots are printed successively by pressing the Enter key in the R console, and are stopped by entering any character in the R console.
plotsp(X, type = "l", col = NULL, zeroes = FALSE, labels = FALSE, add = FALSE, ...) plotsp1(X, col = NULL, zeroes = FALSE, ...)
X |
Data ( |
type |
1-character string giving the type of plot desired. Default value to |
col |
A color, or a vector of colors (of length n), defining the color(s) of the lines representing the rows. |
zeroes |
Logical indicating whether a horizontal line is drawn at coordinates (0, 0) (Default to |
labels |
Logical indicating if the row names of |
add |
Logical defining if the frame of the plot is plotted ( |
... |
.
A plot (see examples).
For the first example, see ?hcl.colors and ?hcl.pals, and try with col <- hcl.colors(n = n, alpha = 1, rev = FALSE, palette = "Green-Orange") col <- terrain.colors(n, rev = FALSE) col <- rainbow(n, rev = FALSE, alpha = .2)
The second example is with plotsp1 (Scrolling plot of PCA loadings). After running the code, press Enter in the R console to start the scrolling, and type any character in the R console to stop it.
## EXAMPLE 1 data(cassav) X <- cassav$Xtest n <- nrow(X) plotsp(X) plotsp(X, col = "grey") plotsp(X, col = "lightblue", xlim = c(500, 1500), xlab = "Wavelength (nm)", ylab = "Absorbance") col <- hcl.colors(n = n, alpha = 1, rev = FALSE, palette = "Grays") plotsp(X, col = col) plotsp(X, col = "grey") plotsp(X[23, , drop = FALSE], lwd = 2, add = TRUE) plotsp(X[c(23, 16), ], lwd = 2, add = TRUE) plotsp(X[5, , drop = FALSE], labels = TRUE) plotsp(X[c(5, 61), ], labels = TRUE) col <- hcl.colors(n = n, alpha = 1, rev = FALSE, palette = "Grays") plotsp(X, col = col) plotsp(X[5, , drop = FALSE], col = "red", lwd = 2, add = TRUE, labels = TRUE) ## EXAMPLE 2 (Scrolling plot of PCA loadings) data(cassav) X <- cassav$Xtest fm <- pcaeigenk(X, nlv = 20) P <- fm$P plotsp1(t(P), ylab = "Value")
Plot the location of missing data in a matrix.
plotxna(X, pch = 16, col = "red", grid = FALSE, asp = 0, ...)
X |
A data set ( |
pch |
Type of point. See |
col |
A color defining the color of the points. |
grid |
Logical. If |
asp |
Scalar giving the aspect ratio y/x. The value |
... |
Other arguments to pass in functions |
A plot.
data(octane) X <- octane$X n <- nrow(X) p <- ncol(X) N <- n * p s <- sample(1:N, size = 50) zX <- replace(X, s, NA) plotxna(zX) plotxna(zX, grid = TRUE, asp = 0)
2-dimension scatter plot.
plotxy(X, group = NULL, asp = 0, col = NULL, alpha.f = .8, zeroes = FALSE, circle = FALSE, ellipse = FALSE, labels = FALSE, legend = TRUE, legend.title = NULL, ncol = 1, ...)
X |
Data ( |
group |
Groups of observations ( |
asp |
Scalar giving the aspect ratio y/x. The value |
col |
A color, or a vector of colors (of length equal to the number of groups), defining the color(s) of the groups. |
alpha.f |
Scalar modifying the opacity of the points in the graphics; typically in [0,1]. See |
zeroes |
Logical indicating whether horizontal and vertical lines are drawn at coordinates (0, 0) (Default to |
circle |
Not still working. Logical indicating if a correlation circle is plotted (default to |
ellipse |
Logical indicating if a Gaussian ellipse is plotted (default to |
labels |
Logical indicating if the row names of |
legend |
Only if there are groups. Logical indicating whether a legend is drawn for the groups (Default to |
legend.title |
Character string indicating a title for the legend. |
ncol |
Number of columns drawn in the legend box. |
... |
Other arguments to pass in functions |
A plot.
n <- 50 ; p <- 10 Xtrain <- matrix(rnorm(n * p), ncol = p) Xtest <- Xtrain[1:5, ] + .4 fm <- pcaeigen(Xtrain, nlv = 5) Ttrain <- fm$T Ttest <- transform(fm, Xtest) T <- rbind(Ttrain, Ttest) group <- c(rep("Training", nrow(Ttrain)), rep("Test", nrow(Ttest))) i <- 1 plotxy(T[, i:(i+1)], group = group, pch = 16, zeroes = TRUE, main = "PCA") plotxy(T[, i:(i+1)], group = group, pch = 16, zeroes = TRUE, asp = 1, main = "PCA")
Algorithms fitting a PLS1 or PLS2 model between dependent variables X and responses Y.
- plskern
: "Improved kernel algorithm #1" proposed by Dayal and MacGregor (1997). This algorithm is stable and fast (Andersson 2009), and returns the same results as the NIPALS.
- plsnipals
: NIPALS algorithm (e.g. Tenenhaus 1998, Wold 2002). In the function, the usual iterative PLS2 NIPALS loop is replaced by a direct calculation of the weights vector w by SVD decomposition of the matrix X'Y (Hoskuldsson 1988, p. 213).
- plsrannar
: Kernel algorithm proposed by Rannar et al. (1994) for "wide" matrices, i.e. with low number of rows and very large number of columns (p >> n; e.g. p = 20000). In such a situation, this algorithm is faster than the others (but it becomes much slower in other situations). If the algorithm converges, it returns the same results as the NIPALS (Note: discrepancies can be observed if too many PLS components are requested compared to the low number of observations).
For weighted versions, see for instance Schaal et al. 2002, Sicard & Sabatier 2006, Kim et al. 2011 and Lesnoff et al. 2020.
Auxiliary functions
transform
Calculates the LVs for any new matrix from the model.
summary
returns summary information for the model.
coef
Calculates b-coefficients from the model.
predict
Calculates the predictions for any new matrix from the model.
plskern(X, Y, weights = NULL, nlv, Xscaling = c("none", "pareto", "sd")[1], Yscaling = c("none", "pareto", "sd")[1]) plsnipals(X, Y, weights = NULL, nlv, Xscaling = c("none", "pareto", "sd")[1], Yscaling = c("none", "pareto", "sd")[1]) plsrannar(X, Y, weights = NULL, nlv, Xscaling = c("none", "pareto", "sd")[1], Yscaling = c("none", "pareto", "sd")[1]) ## S3 method for class 'Plsr' transform(object, X, ..., nlv = NULL) ## S3 method for class 'Plsr' summary(object, X, ...) ## S3 method for class 'Plsr' coef(object, ..., nlv = NULL) ## S3 method for class 'Plsr' predict(object, X, ..., nlv = NULL)
X |
For the main functions: Training X-data ( |
Y |
Training Y-data ( |
weights |
Weights ( |
nlv |
For the main functions: The number(s) of LVs to calculate. — For the auxiliary functions: The number(s) of LVs to consider. |
Xscaling |
X variable scaling among "none" (mean-centering only), "pareto" (mean-centering and pareto scaling), "sd" (mean-centering and unit variance scaling). If "pareto" or "sd", uncorrected standard deviation is used. |
Yscaling |
Y variable scaling among "none" (mean-centering only), "pareto" (mean-centering and pareto scaling), "sd" (mean-centering and unit variance scaling). If "pareto" or "sd", uncorrected standard deviation is used. |
object |
For the auxiliary functions: A fitted model, output of a call to the main functions. |
... |
For the auxiliary functions: Optional arguments. Not used. |
For plskern
, plsnipals
, plsrannar
: A list of outputs, such as
T |
The X-score matrix ( |
P |
The X-loadings matrix ( |
W |
The X-loading weights matrix ( |
C |
The Y-loading weights matrix (C = t(Beta), where Beta is the scores regression coefficients matrix). |
R |
The PLS projection matrix ( |
xmeans |
The centering vector of |
ymeans |
The centering vector of |
xscales |
The vector of |
yscales |
The vector of |
weights |
Weights applied to the training observations. |
TT |
the X-score normalization factor. |
U |
intermediate output. |
For transform.Plsr
: X-scores matrix for new X-data.
For summary.Plsr
:
explvarx |
matrix of explained variances. |
For coef.Plsr
:
int |
matrix (1,nlv) with the intercepts |
B |
matrix (n,nlv) with the coefficients |
For predict.Plsr
:
pred |
A list of matrices ( |
Andersson, M., 2009. A comparison of nine PLS1 algorithms. Journal of Chemometrics 23, 518-529.
Dayal, B.S., MacGregor, J.F., 1997. Improved PLS algorithms. Journal of Chemometrics 11, 73-85.
Hoskuldsson, A., 1988. PLS regression methods. Journal of Chemometrics 2, 211-228. https://doi.org/10.1002/cem.1180020306
Kim, S., Kano, M., Nakagawa, H., Hasebe, S., 2011. Estimation of active pharmaceutical ingredients content using locally weighted partial least squares and statistical wavelength selection. Int. J. Pharm., 421, 269-274.
Lesnoff, M., Metz, M., Roger, J.M., 2020. Comparison of locally weighted PLS strategies for regression and discrimination on agronomic NIR Data. Journal of Chemometrics. e3209. https://onlinelibrary.wiley.com/doi/abs/10.1002/cem.3209
Rannar, S., Lindgren, F., Geladi, P., Wold, S., 1994. A PLS kernel algorithm for data sets with many variables and fewer objects. Part 1: Theory and algorithm. Journal of Chemometrics 8, 111-125. https://doi.org/10.1002/cem.1180080204
Schaal, S., Atkeson, C., Vijayakumar, S. 2002. Scalable techniques from nonparametric statistics for real time robot learning. Applied Intell., 17, 49-60.
Sicard, E. Sabatier, R., 2006. Theoretical framework for local PLS1 regression and application to a rainfall data set. Comput. Stat. Data Anal., 51, 1393-1410.
Tenenhaus, M., 1998. La régression PLS: théorie et pratique. Editions Technip, Paris, France.
Wold, S., Sjostrom, M., Eriksson, l., 2001. PLS-regression: a basic tool for chemometrics. Chem. Int. Lab. Syst., 58, 109-130.
plsr_plsda_allsteps
function to help determine the optimal number of latent variables, perform a permutation test, calculate model parameters and predict new observations.
n <- 6 ; p <- 4 Xtrain <- matrix(rnorm(n * p), ncol = p) ytrain <- rnorm(n) Ytrain <- cbind(y1 = ytrain, y2 = 100 * ytrain) m <- 3 Xtest <- Xtrain[1:m, , drop = FALSE] Ytest <- Ytrain[1:m, , drop = FALSE] ; ytest <- Ytest[1:m, 1] nlv <- 3 plskern(Xtrain, Ytrain, Xscaling = "sd", nlv = nlv) plsnipals(Xtrain, Ytrain, Xscaling = "sd", nlv = nlv) plsrannar(Xtrain, Ytrain, Xscaling = "sd", nlv = nlv) plskern(Xtrain, Ytrain, Xscaling = "none", nlv = nlv) plskern(Xtrain, Ytrain, nlv = nlv)$T plskern(Xtrain, Ytrain, nlv = nlv, weights = 1:n)$T fm <- plskern(Xtrain, Ytrain, nlv = nlv) coef(fm) coef(fm, nlv = 0) coef(fm, nlv = 1) fm$T transform(fm, Xtest) transform(fm, Xtest, nlv = 1) summary(fm, Xtrain) predict(fm, Xtest) predict(fm, Xtest, nlv = 0:3) pred <- predict(fm, Xtest)$pred msep(pred, Ytest)
Ensemble approach where the predictions are calculated by averaging the predictions of PLSR models (plskern
) built with different numbers of latent variables (LVs).
For instance, if argument nlv
is set to nlv = "5:10"
, the prediction for a new observation is the average (without weighting) of the predictions returned by the models with 5 LVs, 6 LVs, ... 10 LVs.
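As an illustration only (not the internal implementation), the unweighted averaging can be reproduced by hand from a single PLSR model, assuming that predict() returns one prediction matrix per requested number of LVs:

## Hand-made aggregation sketch; plsr_agg does this internally
n <- 30 ; p <- 15
Xtrain <- matrix(rnorm(n * p), ncol = p)
ytrain <- rnorm(n)
Xtest <- Xtrain[1:3, , drop = FALSE]
fm <- plskern(Xtrain, ytrain, nlv = 10)
predlist <- predict(fm, Xtest, nlv = 5:10)$pred   # one prediction matrix per nb of LVs
pred_agg <- Reduce("+", predlist) / length(predlist)
pred_agg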
plsr_agg(X, Y, weights = NULL, nlv) ## S3 method for class 'Plsr_agg' predict(object, X, ...)
For plsr_agg
:
X |
For the main function: Training X-data ( |
Y |
Training Y-data ( |
weights |
Weights ( |
nlv |
A character string such as "5:20" defining the range of the numbers of LVs to consider (here: the models with nb LVS = 5, 6, ..., 20 are averaged). Syntax such as "10" is also allowed (here: correponds to the single model with 10 LVs). |
object |
For the auxiliary function: A fitted model, output of a call to the main functions. |
... |
For the auxiliary function: Optional arguments. Not used. |
For plsr_agg
:
fm |
list containing the model: ( |
nlv |
range of the numbers of LVs considered |
For predict.Plsr_agg
:
pred |
Final predictions (after aggregation) |
predlv |
Intermediate predictions (Per nb. LVs) |
In the example, zfm
is the maximal PLSR model; it makes no sense to use gridscorelv or gridcvlv here instead of gridscore or gridcv.
n <- 20 ; p <- 4 Xtrain <- matrix(rnorm(n * p), ncol = p) ytrain <- rnorm(n) Ytrain <- cbind(y1 = ytrain, y2 = 100 * ytrain) m <- 3 Xtest <- Xtrain[1:m, , drop = FALSE] Ytest <- Ytrain[1:m, , drop = FALSE] ; ytest <- Ytest[1:m, 1] nlv <- "1:3" fm <- plsr_agg(Xtrain, ytrain, nlv = nlv) names(fm) zfm <- fm$fm class(zfm) names(zfm) summary(zfm, Xtrain) res <- predict(fm, Xtest) names(res) res$pred msep(res$pred, ytest) res$predlv pars <- mpars(nlv = c("1:3", "2:5")) pars res <- gridscore( Xtrain, Ytrain, Xtest, Ytest, score = msep, fun = plsr_agg, pars = pars) res K = 3 segm <- segmkf(n = n, K = K, nrep = 1) segm res <- gridcv( Xtrain, Ytrain, segm, score = msep, fun = plsr_agg, pars = pars, verb = TRUE) res
Help determine the optimal number of latent variables by cross-validation, perform a permutation test, calculate model parameters and predict new observations, for plsr (plskern
), plsrda (plsrda
), plslda (plslda
) or plsqda (plsqda
) models.
plsr_plsda_allsteps(X, Xname = NULL, Xscaling = c("none","pareto","sd")[1], Y, Yscaling = c("none","pareto","sd")[1], weights = NULL, newX = NULL, newXname = NULL, method = c("plsr", "plsrda","plslda","plsqda")[1], prior = c("unif", "prop")[1], step = c("nlvtest","permutation","model","prediction")[1], nlv, modeloutput = c("scores","loadings","coef","vip"), cvmethod = c("kfolds","loo")[1], nbrep = 30, seed = 123, samplingk = NULL, nfolds = 10, npermut = 30, criterion = c("err","rmse")[1], selection = c("localmin","globalmin","1std")[1], import = c("R","ChemFlow","W4M")[1], outputfilename = NULL)
X |
Training X-data ( |
Xname |
name of the X-matrix |
Xscaling |
X variable scaling among "none" (mean-centering only), "pareto" (mean-centering and pareto scaling), "sd" (mean-centering and unit variance scaling). If "pareto" or "sd", uncorrected standard deviation is used. |
Y |
Training Y-data ( |
Yscaling |
Y variable scaling among "none" (mean-centering only), "pareto" (mean-centering and pareto scaling), "sd" (mean-centering and unit variance scaling). If "pareto" or "sd", uncorrected standard deviation is used. |
weights |
Weights ( |
newX |
New X-data ( |
newXname |
name of the newX-matrix |
method |
method to apply among "plsr", "plsrda","plslda","plsqda" |
prior |
for plslda or plsqda models : The prior probabilities of the classes. Possible values are "unif" (default; probabilities are set equal for all the classes) or "prop" (probabilities are set equal to the observed proportions of the classes in |
step |
step of the analysis among "nlvtest" (cross-validation to help determine the optimal number of latent variables), "permutation" (permutation test), "model" (model calculation), "prediction" (prediction of the newX-data, or of the X-data if no newX-data are provided) |
nlv |
number of latent variables to test if step is "nlvtest"; number of latent variables of the model if step is not "nlvtest". |
modeloutput |
if step is "model": outputs among "scores", "loadings", "coef" (regression coefficients), "vip" (Variable Importance in Projection; the VIP calculation being based on the proportion of Y-variance explained by the components, as proposed by Mehmood et al (2012, 2020).) |
cvmethod |
if step is "nlvtest" or "permutation": "kfolds" for k-folds cross-validation, or "loo" for leave-one-out. |
nbrep |
if step is "nlvtest" and cvmethod is "kfolds": An integer, setting the number of CV repetitions. Default value is 30. Must me set to 1 if cvmethod is "loo" |
seed |
if step is "nlvtest" and cvmethod is "kfolds", or if step is "permutation: a numeric. Seed used for the repeated resampling |
samplingk |
A vector of length n. The elements are the values of a qualitative variable used for stratified partition creation. If NULL, the first observation is set in the first fold, the second observation in the second fold, etc... |
nfolds |
if cvmethod is "kfolds". An integer, setting the number of partitions to create. Default value is 10. |
npermut |
if step is "permutation": An integer, setting the number of Y-Block with permutated responses to create. Default value is 30. |
criterion |
if step is "nlvtest" or "permutation" and method is "plsrda", "plslda" or "plsqda": optimisation criterion among "rmse" and "err" (for classification error rate))) |
selection |
if step is "nlvtest": a character indicating the selection method to use to choose the optimal combination of components, among "localmin","globalmin","1std". If "localmin": the optimal combination corresponds to the first local minimum of the mean CV rmse or error rate. If "globalmin" : the optimal combination corresponds to the minimum mean CV rmse or error rate. If "1std" (one standard error rule) : it corresponds to the first combination after which the mean cross-validated rmse or error rate does not decrease significantly. |
import |
If "R", X and Y are in the global environment, and the observation names are in rownames. If "ChemFlow", X and Y are tabulated tables (.txt), and the observation names are in the first column. If "W4M", X and Y are tabulated tables (.txt), and the observation names are in the headers of X, and in the first column of Y. |
outputfilename |
character: If not NULL, name of the tabular file in which the function outputs have to be written. |
If step is "nlvtest": table with rmsecv or cross-validated classification error rates. The suggested optimal number of latent variables is indicated by the binary "optimum" variable.
If step is "permutation": table with the dissimilarity between the original and the permutated Y-block, and the rmsecv or cross-validated classification error rates obtained with the permutated Y-block by the model and the given number of latent variables.
If step is "model": tables of scores, loadings, regression coefficients, and vip values, depending of the "modeloutput" parameter.
If step is "prediction": table of predicted scores and predicted classes or values.
n <- 50 ; p <- 8 Xtrain <- matrix(rnorm(n * p), ncol = p) colnames(Xtrain) <- paste0("V",1:p) ytrain <- sample(c(1, 4, 10), size = n, replace = TRUE) Xtest <- Xtrain[1:5, ] ; ytest <- ytrain[1:5] resnlvtestplsrda <- plsr_plsda_allsteps(X = Xtrain, Xname = NULL, Xscaling = c("none","pareto","sd")[1], Y = ytrain, Yscaling = "none", weights = NULL, newX = Xtest, newXname = NULL, method = c("plsr", "plsrda","plslda","plsqda")[2], prior = c("unif", "prop")[1], step = c("nlvtest","permutation","model","prediction")[1], nlv = 5, modeloutput = c("scores","loadings","coef","vip"), cvmethod = c("kfolds","loo")[2], nbrep = 1, seed = 123, samplingk = NULL, nfolds = 10, npermut = 5, criterion = c("err","rmse")[1], selection = c("localmin","globalmin","1std")[1], outputfilename = NULL) respermutationplsrda <- plsr_plsda_allsteps(X = Xtrain, Xname = NULL, Xscaling = c("none","pareto","sd")[1], Y = ytrain, Yscaling = "none", weights = NULL, newX = Xtest, newXname = NULL, method = c("plsr", "plsrda","plslda","plsqda")[2], prior = c("unif", "prop")[1], step = c("nlvtest","permutation","model","prediction")[2], nlv = 2, modeloutput = c("scores","loadings","coef","vip"), cvmethod = c("kfolds","loo")[2], nbrep = 1, seed = 123, samplingk = NULL, nfolds = 10, npermut = 5, criterion = c("err","rmse")[1], selection = c("localmin","globalmin","1std")[1], outputfilename = NULL) plotxy(respermutationplsrda, pch=16) abline (h = respermutationplsrda[respermutationplsrda[,"permut_dyssimilarity"]==0,"res_permut"]) resmodelplsrda <- plsr_plsda_allsteps(X = Xtrain, Xname = NULL, Xscaling = c("none","pareto","sd")[1], Y = ytrain, Yscaling = "none", weights = NULL, newX = Xtest, newXname = NULL, method = c("plsr", "plsrda","plslda","plsqda")[2], prior = c("unif", "prop")[1], step = c("nlvtest","permutation","model","prediction")[3], nlv = 2, modeloutput = c("scores","loadings","coef","vip"), cvmethod = c("kfolds","loo")[2], nbrep = 1, seed = 123, samplingk = NULL, nfolds = 10, npermut = 5, criterion = c("err","rmse")[1], selection = c("localmin","globalmin","1std")[1], outputfilename = NULL) resmodelplsrda$scores resmodelplsrda$loadings resmodelplsrda$coef resmodelplsrda$vip respredictionplsrda <- plsr_plsda_allsteps(X = Xtrain, Xname = NULL, Xscaling = c("none","pareto","sd")[1], Y = ytrain, Yscaling = "none", weights = NULL, newX = Xtest, newXname = NULL, method = c("plsr", "plsrda","plslda","plsqda")[2], prior = c("unif", "prop")[1], step = c("nlvtest","permutation","model","prediction")[4], nlv = 2, modeloutput = c("scores","loadings","coef","vip"), cvmethod = c("kfolds","loo")[2], nbrep = 1, seed = 123, samplingk = NULL, nfolds = 10, npermut = 5, criterion = c("err","rmse")[1], selection = c("localmin","globalmin","1std")[1], outputfilename = NULL)
Discrimination (DA) based on PLS.
The training variable y (univariate class membership) is firstly transformed to a dummy table containing nclas columns, where nclas is the number of classes present in y. Each column is a dummy variable (0/1). Then, a PLS2 is implemented on the X-data and the dummy table, returning latent variables (LVs) that are used as dependent variables in a DA model.
- plsrda
: Usual "PLSDA". A linear regression model predicts the Y-dummy table from the PLS2 LVs. This corresponds to the PLSR2 of the X-data and the Y-dummy table. For a given observation, the final prediction is the class corresponding to the dummy variable for which the prediction is the highest.
- plslda
and plsqda
: Probabilistic LDA and QDA are run over the PLS2 LVs, respectively.
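The dummy (0/1) coding step can be sketched with base R (illustration only; the functions build this table internally):

y <- c("a", "b", "a", "c", "b")
Ydummy <- model.matrix(~ factor(y) - 1)    # one 0/1 column per class
colnames(Ydummy) <- levels(factor(y))
Ydummy
## After the PLS2 step, the predicted class of an observation is the class
## whose dummy variable gets the highest predicted value (argmax).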
plsrda(X, y, weights = NULL, nlv, Xscaling = c("none","pareto","sd")[1], Yscaling = c("none","pareto","sd")[1]) plslda(X, y, weights = NULL, nlv, prior = c("unif", "prop"), Xscaling = c("none","pareto","sd")[1], Yscaling = c("none","pareto","sd")[1]) plsqda(X, y, weights = NULL, nlv, prior = c("unif", "prop"), Xscaling = c("none","pareto","sd")[1], Yscaling = c("none","pareto","sd")[1]) ## S3 method for class 'Plsrda' predict(object, X, ..., nlv = NULL) ## S3 method for class 'Plsprobda' predict(object, X, ..., nlv = NULL)
X |
For the main functions: Training X-data ( |
y |
Training class membership ( |
weights |
Weights ( |
nlv |
The number(s) of LVs to calculate. |
prior |
The prior probabilities of the classes. Possible values are "unif" (default; probabilities are set equal for all the classes) or "prop" (probabilities are set equal to the observed proportions of the classes in |
Xscaling |
X variable scaling among "none" (mean-centering only), "pareto" (mean-centering and pareto scaling), "sd" (mean-centering and unit variance scaling). If "pareto" or "sd", uncorrected standard deviation is used. |
Yscaling |
Y variable scaling, once converted to binary variables, among "none" (mean-centering only), "pareto" (mean-centering and pareto scaling), "sd" (mean-centering and unit variance scaling). If "pareto" or "sd", uncorrected standard deviation is used. |
object |
For the auxiliary functions: A fitted model, output of a call to the main functions. |
... |
For the auxiliary functions: Optional arguments. Not used. |
For plsrda
, plslda
, plsqda
:
fm |
list with the model: ( |
lev |
classes |
ni |
number of observations in each class |
For predict.Plsrda
, predict.Plsprobda
:
pred |
predicted class for each observation |
posterior |
calculated probability of belonging to a class for each observation |
The first example concerns PLSDA, and the second one concerns PLS LDA.
fm
are PLS1 models, and zfm
are PLS2 models to predict the disjunctive matrix.
plsr_plsda_allsteps
function to help determine the optimal number of latent variables, perform a permutation test, calculate model parameters and predict new observations.
## EXAMPLE OF PLSDA n <- 50 ; p <- 8 Xtrain <- matrix(rnorm(n * p), ncol = p) ytrain <- sample(c(1, 4, 10), size = n, replace = TRUE) Xtest <- Xtrain[1:5, ] ; ytest <- ytrain[1:5] nlv <- 5 fm <- plsrda(Xtrain, ytrain, Xscaling = "sd", nlv = nlv) names(fm) predict(fm, Xtest) predict(fm, Xtest, nlv = 0:2)$pred pred <- predict(fm, Xtest)$pred err(pred, ytest) zfm <- fm$fm transform(zfm, Xtest) transform(zfm, Xtest, nlv = 1) summary(zfm, Xtrain) coef(zfm) coef(zfm, nlv = 0) coef(zfm, nlv = 2) ## EXAMPLE OF PLS LDA n <- 50 ; p <- 8 Xtrain <- matrix(rnorm(n * p), ncol = p) ytrain <- sample(c(1, 4, 10), size = n, replace = TRUE) Xtest <- Xtrain[1:5, ] ; ytest <- ytrain[1:5] nlv <- 5 fm <- plslda(Xtrain, ytrain, Xscaling = "sd", nlv = nlv) predict(fm, Xtest) predict(fm, Xtest, nlv = 1:2)$pred zfm <- fm$fm[[1]] class(zfm) names(zfm) summary(zfm, Xtrain) transform(zfm, Xtest[1:2, ]) coef(zfm)
Ensemblist approach where the predictions are calculated by "averaging" the predictions of PLSDA models built with different numbers of latent variables (LVs).
For instance, if argument nlv
is set to nlv = "5:10"
, the prediction for a new observation is the most frequent level (majority vote) over the predictions returned by the models with 5 LVs, 6 LVs, ... 10 LVs.
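The vote itself can be sketched as follows, assuming a small matrix of per-model class predictions (one column per number of LVs):

predlv <- cbind(nlv5 = c("a", "b", "b"),
                nlv6 = c("a", "b", "a"),
                nlv7 = c("a", "a", "a"))
apply(predlv, 1, function(z) names(which.max(table(z))))   # majority vote per observation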
- plsrda_agg
: use plsrda
.
- plslda_agg
: use plslda
.
- plsqda_agg
: use plsqda
.
plsrda_agg(X, y, weights = NULL, nlv) plslda_agg(X, y, weights = NULL, nlv, prior = c("unif", "prop")) plsqda_agg(X, y, weights = NULL, nlv, prior = c("unif", "prop")) ## S3 method for class 'Plsda_agg' predict(object, X, ...)
X |
For the main functions: Training X-data ( |
y |
Training class membership ( |
weights |
Weights ( |
nlv |
A character string such as "5:20" defining the range of the numbers of LVs to consider (here: the models with nb LVS = 5, 6, ..., 20 are averaged). Syntax such as "10" is also allowed (here: correponds to the single model with 10 LVs). |
prior |
The prior probabilities of the classes. Possible values are "unif" (default; probabilities are set equal for all the classes) or "prop" (probabilities are set equal to the observed proportions of the classes in |
object |
For the auxiliary function: A fitted model, output of a call to the main functions. |
... |
For the auxiliary function: Optional arguments. Not used. |
For plsrda_agg
, plslda_agg
and plsqda_agg
:
fm |
list containing the model: ( |
nlv |
range of the numbers of LVs considered |
For predict.Plsda_agg
:
pred |
Final predictions (after aggregation) |
predlv |
Intermediate predictions (Per nb. LVs) |
The first example concerns PLSRDA-AGG, and the second one concerns PLSLDA-AGG.
## EXAMPLE OF PLSRDA-AGG n <- 50 ; p <- 8 Xtrain <- matrix(rnorm(n * p), ncol = p) ytrain <- sample(c(1, 4, 10, 2), size = n, replace = TRUE) m <- 5 Xtest <- Xtrain[1:m, ] ; ytest <- ytrain[1:m] nlv <- "2:5" fm <- plsrda_agg(Xtrain, ytrain, nlv = nlv) names(fm) res <- predict(fm, Xtest) names(res) res$pred err(res$pred, ytest) res$predlv pars <- mpars(nlv = c("1:3", "2:5")) pars res <- gridscore( Xtrain, ytrain, Xtest, ytest, score = err, fun = plsrda_agg, pars = pars) res segm <- segmkf(n = n, K = 3, nrep = 1) res <- gridcv( Xtrain, ytrain, segm, score = err, fun = plslda_agg, pars = pars, verb = TRUE) res ## EXAMPLE OF PLSLDA-AGG n <- 50 ; p <- 8 Xtrain <- matrix(rnorm(n * p), ncol = p) ytrain <- sample(c(1, 4, 10, 2), size = n, replace = TRUE) #ytrain <- sample(c("a", "10", "d"), size = n, replace = TRUE) m <- 5 Xtest <- Xtrain[1:m, ] ; ytest <- ytrain[1:m] nlv <- "2:5" fm <- plslda_agg(Xtrain, ytrain, nlv = nlv, prior = "unif") names(fm) res <- predict(fm, Xtest) names(res) res$pred err(res$pred, ytest) res$predlv pars <- mpars(nlv = c("1:3", "2:5"), prior = c("unif", "prop")) pars res <- gridscore( Xtrain, ytrain, Xtest, ytest, score = err, fun = plslda_agg, pars = pars) res segm <- segmkf(n = n, K = 3, nrep = 1) res <- gridcv( Xtrain, ytrain, segm, score = err, fun = plslda_agg, pars = pars, verb = TRUE) res
Remove the vertical gaps in spectra (rows of matrix X), e.g. for ASD data. This is done by extrapolation from simple linear regressions computed on the left side of the gaps.
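A minimal sketch of this left-side extrapolation idea is given below for a single gap located between columns j and j + 1 (one possible reading of the description; 'j' and 'k' are assumed here, and rmgap itself should be used in practice):

correct_one_gap <- function(X, j, k = 5) {
  idx <- (j - k + 1):j                      # k columns on the left side of the gap
  for (i in seq_len(nrow(X))) {
    fit <- lm(y ~ x, data = data.frame(x = idx, y = X[i, idx]))
    extrapolated <- predict(fit, newdata = data.frame(x = j + 1))
    ## shift the right part of the spectrum so that it joins the extrapolated value
    X[i, (j + 1):ncol(X)] <- X[i, (j + 1):ncol(X)] - X[i, j + 1] + extrapolated
  }
  X
}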
rmgap(X, indexcol, k = 5)
X |
A dataset. |
indexcol |
The column indexes corresponding to the gaps. For instance, if two gaps are observed between indexes 651-652 and between indexes 1451-1452, respectively, then |
k |
The number of columns used on the left side of the gaps for fitting the linear regressions. |
The corrected data matrix.
In the example, two gaps are at wavelengths 1000-1001 nm and 1800-1801 nm.
data(asdgap) X <- asdgap$X indexcol <- which(colnames(X) == "1000" | colnames(X) == "1800") indexcol plotsp(X, lwd = 1.5) abline(v = as.numeric(colnames(X)[1]) + indexcol - 1, col = "lightgrey", lty = 3) zX <- rmgap(X, indexcol = indexcol) plotsp(zX, lwd = 1.5) abline(v = as.numeric(colnames(zX)[1]) + indexcol - 1, col = "lightgrey", lty = 3)
Fitting linear ridge regression models (RR) (Hoerl & Kennard 1970, Hastie & Tibshirani 2004, Hastie et al 2009, Cule & De Iorio 2012) by SVD factorization.
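For reference, the classical ridge solution can be written from the SVD of the centered X-data, as in the sketch below (an illustration of the factorization only; it does not necessarily reproduce the exact parameterization of lb used in rr):

n <- 20 ; p <- 5
X <- matrix(rnorm(n * p), ncol = p)
y <- rnorm(n)
lambda <- 0.1
Xc <- scale(X, center = TRUE, scale = FALSE)
yc <- y - mean(y)
s <- svd(Xc)                                          # Xc = U D V'
b <- s$v %*% ((s$d / (s$d^2 + lambda)) * crossprod(s$u, yc))   # ridge coefficients
int <- mean(y) - colMeans(X) %*% b                    # intercept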
rr(X, Y, weights = NULL, lb = 1e-2) ## S3 method for class 'Rr' coef(object, ..., lb = NULL) ## S3 method for class 'Rr' predict(object, X, ..., lb = NULL)
X |
For the main function: Training X-data ( |
Y |
Training Y-data ( |
weights |
Weights ( |
lb |
A value of regularization parameter |
object |
For the auxiliary functions: A fitted model, output of a call to the main function. |
... |
For the auxiliary functions: Optional arguments. Not used. |
For rr
:
V |
eigenvector matrix of the correlation matrix (n,n). |
TtDY |
intermediate output. |
sv |
singular values of the matrix |
lb |
value of regularization parameter |
xmeans |
the centering vector of X |
ymeans |
the centering vector of Y |
weights |
the weights vector of X-variables |
For coef.Rr
:
int |
matrix (1,nlv) with the intercepts |
B |
matrix (n,nlv) with the coefficients |
df |
model complexity (number of degrees of freedom) |
For predict.Rr
:
pred |
A list of matrices ( |
Cule, E., De Iorio, M., 2012. A semi-automatic method to guide the choice of ridge parameter in ridge regression. arXiv:1205.0686.
Hastie, T., Tibshirani, R., 2004. Efficient quadratic regularization for expression arrays. Biostatistics 5, 329-340. https://doi.org/10.1093/biostatistics/kxh010
Hastie, T., Tibshirani, R., Friedman, J., 2009. The elements of statistical learning: data mining, inference, and prediction, 2nd ed. Springer, New York.
Hoerl, A.E., Kennard, R.W., 1970. Ridge Regression: Biased Estimation for Nonorthogonal Problems. Technometrics 12, 55-67. https://doi.org/10.1080/00401706.1970.10488634
Wu, W., Massart, D.L., de Jong, S., 1997. The kernel PCA algorithms for wide data. Part I: Theory and algorithms. Chemometrics and Intelligent Laboratory Systems 36, 165-172. https://doi.org/10.1016/S0169-7439(97)00010-5
n <- 6 ; p <- 4 Xtrain <- matrix(rnorm(n * p), ncol = p) ytrain <- rnorm(n) Ytrain <- cbind(y1 = ytrain, y2 = 100 * ytrain) m <- 3 Xtest <- Xtrain[1:m, , drop = FALSE] Ytest <- Ytrain[1:m, , drop = FALSE] ; ytest <- Ytest[1:m, 1] lb <- .1 fm <- rr(Xtrain, Ytrain, lb = lb) coef(fm) coef(fm, lb = .8) predict(fm, Xtest) predict(fm, Xtest, lb = c(0.1, .8)) pred <- predict(fm, Xtest)$pred msep(pred, Ytest)
Discrimination (DA) based on ridge regression (RR).
rrda(X, y, weights = NULL, lb = 1e-5) ## S3 method for class 'Rrda' predict(object, X, ..., lb = NULL)
For rrda
:
X |
For the main function: Training X-data ( |
y |
Training class membership ( |
weights |
Weights ( |
lb |
A value of regularization parameter |
object |
For the auxiliary function: A fitted model, output of a call to the main functions. |
... |
For the auxiliary function: Optional arguments. Not used. |
The training variable y (univariate class membership) is transformed to a dummy table containing nclas columns, where nclas is the number of classes present in y. Each column is a dummy variable (0/1). Then, a ridge regression (RR) is run on the X-data and the dummy table, returning predictions of the dummy variables. For a given observation, the final prediction is the class corresponding to the dummy variable for which the prediction is the highest.
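The final class-assignment step can be sketched as follows, assuming a small matrix of predicted dummy values returned by the underlying RR model:

pred_dummy <- rbind(c(0.7, 0.2, 0.1),
                    c(0.1, 0.1, 0.8))
colnames(pred_dummy) <- c("1", "4", "10")               # class levels
colnames(pred_dummy)[apply(pred_dummy, 1, which.max)]   # predicted class per observation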
For rrda
:
fm |
List with the outputs of the RR (( |
lev |
classes |
ni |
number of observations in each class |
For predict.Rrda
:
pred |
matrix or list of matrices (if lb is a vector), with predicted class for each observation |
posterior |
matrix or list of matrices (if lb is a vector), calculated probability of belonging to a class for each observation |
n <- 50 ; p <- 8 Xtrain <- matrix(rnorm(n * p), ncol = p) ytrain <- sample(c(1, 4, 10), size = n, replace = TRUE) m <- 5 Xtest <- Xtrain[1:m, ] ; ytest <- ytrain[1:m] lb <- 1 fm <- rrda(Xtrain, ytrain, lb = lb) predict(fm, Xtest) pred <- predict(fm, Xtest)$pred err(pred, ytest) predict(fm, Xtest, lb = 0:2) predict(fm, Xtest, lb = 0)
The function divides a dataset into two sets, "train" vs "test", using stratified sampling on defined classes.
If argument y = NULL
(default), the sampling is random within each class. If not, the sampling is systematic (regular grid) within each class over the quantitative variable y.
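The within-class systematic selection over y can be sketched as follows for a single class (an illustration of the principle only; sampcla handles all classes at once):

y_class <- rnorm(12)                        # y values of the observations of one class
m <- 3                                      # nb of test observations to select in this class
ord <- order(y_class)
grid <- round(seq(1, length(y_class), length.out = m))
test_index <- ord[grid]                     # indexes taken on a regular grid over sorted y
test_index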
sampcla(x, y = NULL, m)
x |
A vector (length |
y |
A vector (length |
m |
Either an integer defining the equal number of test observation(s) to select per class, or a vector of integers defining the numbers to select for each class. In the last case, vector |
train |
Indexes (i.e. position in |
test |
Indexes (i.e. position in |
lev |
classes |
ni |
number of observations in each class |
The second example is a representative stratified sampling from an unsupervised clustering.
Naes, T., 1987. The design of calibration in near infra-red reflectance analysis by clustering. Journal of Chemometrics 1, 121-134.
## EXAMPLE 1 x <- sample(c(1, 3, 4), size = 20, replace = TRUE) table(x) sampcla(x, m = 2) s <- sampcla(x, m = 2)$test x[s] sampcla(x, m = c(1, 2, 1)) s <- sampcla(x, m = c(1, 2, 1))$test x[s] y <- rnorm(length(x)) sampcla(x, y, m = 2) s <- sampcla(x, y, m = 2)$test x[s] ## EXAMPLE 2 data(cassav) X <- cassav$Xtrain y <- cassav$ytrain N <- nrow(X) fm <- pcaeigenk(X, nlv = 10) z <- stats::kmeans(x = fm$T, centers = 3, nstart = 25, iter.max = 50) x <- z$cluster z <- table(x) z p <- c(z) / N p psamp <- .20 m <- round(psamp * N * p) m random_sampling <- sampcla(x, m = m) s <- random_sampling$test table(x[s]) Systematic_sampling_for_y <- sampcla(x, y, m = m) s <- Systematic_sampling_for_y$test table(x[s])
The function divides the data in two sets, "train" vs "test", using the Duplex algorithm (Snee, 1977). The two sets are of equal size. If needed, the user can add
the eventual remaining observations (not in "train" nor "test") to "train".
sampdp(X, k, diss = c("eucl", "mahal"))
X |
X-data ( |
k |
An integer defining the number of training observations to select. Must be <= |
diss |
The type of dissimilarity used for selecting the observations in the algorithm. Possible values are "eucl" (default; Euclidean distance) or "mahal" (Mahalanobis distance). |
train |
Indexes (i.e. row numbers in |
test |
Indexes (i.e. row numbers in |
remain |
Indexes (i.e., row numbers in |
Kennard, R.W., Stone, L.A., 1969. Computer aided design of experiments. Technometrics, 11(1), 137-148.
Snee, R.D., 1977. Validation of Regression Models: Methods and Examples. Technometrics 19, 415-428. https://doi.org/10.1080/00401706.1977.10489581
n <- 10 ; p <- 3 X <- matrix(rnorm(n * p), ncol = p) k <- 4 sampdp(X, k = k) sampdp(X, k = k, diss = "mahal")
The function divides the data in two sets, "train" vs "test", using the Kennard-Stone (KS) algorithm (Kennard & Stone, 1969). The two sets correspond to two different underlying probability distributions: set "train" has higher dispersion than set "test".
sampks(X, k, diss = c("eucl", "mahal"))
X |
X-data ( |
k |
An integer defining the number of training observations to select. |
diss |
The type of dissimilarity used for selecting the observations in the algorithm. Possible values are "eucl" (default; Euclidean distance) or "mahal" (Mahalanobis distance). |
train |
Indexes (i.e. row numbers in |
test |
Indexes (i.e. row numbers in |
Kennard, R.W., Stone, L.A., 1969. Computer aided design of experiments. Technometrics, 11(1), 137-148.
n <- 10 ; p <- 3 X <- matrix(rnorm(n * p), ncol = p) k <- 7 sampks(X, k = k) n <- 10 ; k <- 25 X <- expand.grid(1:n, 1:n) X <- X + rnorm(nrow(X) * ncol(X), 0, .1) s <- sampks(X, k)$train plot(X) points(X[s, ], pch = 19, col = 2, cex = 1.5)
Smoothing by derivation, with a Savitzky-Golay filter, of the row observations (e.g. spectra) of a data set.
The function uses function sgolayfilt
of package signal
available on the CRAN.
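Note that sgolayfilt uses a different argument order (p = polynomial order, n = filter length, m = derivation order), so a row-wise call such as the sketch below (assuming package signal is installed) should give essentially the same result as savgol(X, m, n, p):

X <- matrix(rnorm(5 * 50), ncol = 50)
m <- 1 ; n <- 11 ; p <- 2
Xp <- t(apply(X, 1, function(x) signal::sgolayfilt(x, p = p, n = n, m = m)))
dim(Xp)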
savgol(X, m, n, p, ts = 1)
X |
X-data. |
m |
Derivation order. |
n |
Filter length (must be odd), i.e. the number of columns in |
p |
Polynomial order. |
ts |
Scaling factor (e.g. the absolute step between two columns in matrix |
A matrix of the transformed data.
data(cassav) X <- cassav$Xtest m <- 1 ; n <- 11 ; p <- 2 Xp <- savgol(X, m, n, p) oldpar <- par(mfrow = c(1, 1)) par(mfrow = c(1, 2)) plotsp(X, main = "Signal") plotsp(Xp, main = "Corrected signal") abline(h = 0, lty = 2, col = "grey") par(oldpar)
scordis
calculates the score distances (SD) for a PCA or PLS model. SD is the Mahalanobis distance of the projection of a row observation on the score plane to the center of the score space.
A distance cutoff is computed using a moment estimation of the parameters of a Chi-squared distribution for SD^2 (see e.g. Pomerantsev 2008). In the function output, column dstand
is a standardized distance defined as SD / cutoff. A value
dstand > 1
can be considered as extreme.
The Winisi "GH" is also provided (usually considered as extreme if GH > 3).
scordis( object, X = NULL, nlv = NULL, rob = TRUE, alpha = .01 )
object |
A fitted model, output of a call to a fitting function (for example from |
X |
New X-data. |
nlv |
Number of components (PCs or LVs) to consider. |
rob |
Logical. If |
alpha |
Risk- |
res.train |
matrix with distances, standardized distances and Winisi "GH", for the training set. |
res |
matrix with distances, standardized distances and Winisi "GH", for new X-data if any. |
cutoff |
cutoff value |
M. Hubert, P. J. Rousseeuw, K. Vanden Branden (2005). ROBPCA: a new approach to robust principal components analysis. Technometrics, 47, 64-79.
Pomerantsev, A.L., 2008. Acceptance areas for multivariate classification derived by projection methods. Journal of Chemometrics 22, 601-609. https://doi.org/10.1002/cem.1147
n <- 6 ; p <- 4 Xtrain <- matrix(rnorm(n * p), ncol = p) ytrain <- rnorm(n) Xtest <- Xtrain[1:3, , drop = FALSE] nlv <- 3 fm <- pcasvd(Xtrain, nlv = nlv) scordis(fm) scordis(fm, nlv = 2) scordis(fm, Xtest, nlv = 2)
Build segments of observations for K-Fold or "test-set" cross-validation (CV).
The CV can optionally be randomly repeated. For each repetition:
- K-fold CV - Function segmkf
returns the segments.
- Test-set CV - Function segmts
returns a segment (of a given length) randomly sampled in the dataset.
CV of blocks
Argument y
allows sampling blocks of observations instead of observations. This can be required when there are repetitions in the data. In such a situation, CV should account for the repetition level (if not, the error rates are in general highly underestimated). For implementing such a CV, object y
must be a vector (of length n) defining the blocks, in the same order as in the data.
In all cases (y = NULL
or not), the functions return a list of vector(s). Each vector contains the indexes of the observations defining the segment.
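Each segment can then be used as a validation fold in a hand-made CV loop, as sketched below (gridcv and gridcvlv automate this):

n <- 20 ; p <- 5
X <- matrix(rnorm(n * p), ncol = p)
y <- rnorm(n)
segm <- segmkf(n = n, K = 4, nrep = 1)
res <- sapply(segm$rep1, function(s) {
  fm <- plskern(X[-s, , drop = FALSE], y[-s], nlv = 2)
  pred <- predict(fm, X[s, , drop = FALSE])$pred
  mean((pred - y[s])^2)                     # MSEP on the held-out segment
})
mean(res)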
segmkf(n, y = NULL, K = 5, type = c("random", "consecutive", "interleaved"), nrep = 1) segmts(n, y = NULL, m, nrep)
n |
The total number of row observations in the dataset. If |
y |
A vector ( |
K |
For |
type |
For |
m |
For |
nrep |
The number of replications of the repeated CV. Default to |
The segments (lists of indexes).
Kfold <- segmkf(n = 10, K = 3) interleavedKfold <- segmkf(n = 10, K = 3, type = "interleaved") LeaveOneOut <- segmkf(n = 10, K = 10) RepeatedKfold <- segmkf(n = 10, K = 3, nrep = 2) repeatedTestSet <- segmts(n = 10, m = 3, nrep = 5) n <- 10 y <- rep(LETTERS[1:5], 2) y Kfold_withBlocks <- segmkf(n = n, y = y, K = 3, nrep = 1) z <- Kfold_withBlocks z y[z$rep1$segm1] y[z$rep1$segm2] y[z$rep1$segm3] TestSet_withBlocks <- segmts(n = n, y = y, m = 3, nrep = 1) z <- TestSet_withBlocks z y[z$rep1$segm1]
The function helps selecting the dimensionality of latent variable (LV) models (e.g. PLSR) using the "Wold criterion".
The criterion is the "precision gain ratio" R = 1 - r(a+1) / r(a), where r is an observed error rate quantifying the model performance (msep, classification error rate, etc.) and a is the model dimensionality (= nb. LVs). It can also represent other indicators such as the eigenvalues of a PCA.
R is the relative gain in efficiency after a new LV is added to the model. The iterations continue until R becomes lower than a threshold value alpha. By default and only as an indication, alpha = .05 is set in the function, but the user should set any other value depending on the data and the parsimony objective.
In the original article, Wold (1978; see also Bro et al. 2008) used the ratio of cross-validated over training residual sums of squares, i.e. PRESS over SSR. Instead, selwold
compares values of consistent nature (the successive values in the input vector r), e.g. PRESS only. For instance, r was set to PRESS values in Li et al. (2002) and Andries et al. (2011), which is equivalent to the "punish factor" described in Westad & Martens (2000).
The ratio R is often erratic, making the dimensionality selection difficult. Function selwold proposes to calculate a smoothing of R (argument smooth).
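The criterion can be computed by hand from a vector of error rates, as in the minimal sketch below (selwold adds smoothing, the alpha-based selection and plotting on top of this):

r <- c(2.0, 1.2, 0.9, 0.85, 0.84, 0.86)      # e.g. MSEP or PRESS for 1, 2, ... LVs
R <- 1 - r[-1] / r[-length(r)]               # R[a] = 1 - r(a+1) / r(a)
alpha <- .05
which(R < alpha)[1]                          # first dimension where the gain drops below alpha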
selwold( r, indx = seq(length(r)), smooth = TRUE, f = 1/3, alpha = .05, digits = 3, plot = TRUE, xlab = "Index", ylab = "Value", main = "r", ... )
r |
Vector of a given error rate ( |
indx |
Vector of indexes ( |
smooth |
Logical. If |
f |
Window for smoothing |
alpha |
Proportion |
digits |
Number of digits for |
plot |
Logical. If |
xlab |
x-axis label of the plot of |
ylab |
y-axis label of the plot of |
main |
Title of the plot of |
... |
Other arguments to pass in function |
res |
matrix with, for each number of LVs: |
opt |
The index of the minimum for |
sel |
The index of the selection from the |
Andries, J.P.M., Vander Heyden, Y., Buydens, L.M.C., 2011. Improved variable reduction in partial least squares modelling based on Predictive-Property-Ranked Variables and adaptation of partial least squares complexity. Analytica Chimica Acta 705, 292-305. https://doi.org/10.1016/j.aca.2011.06.037
Bro, R., Kjeldahl, K., Smilde, A.K., Kiers, H.A.L., 2008. Cross-validation of component models: A critical look at current methods. Anal Bioanal Chem 390, 1241-1251. https://doi.org/10.1007/s00216-007-1790-1
Li, B., Morris, J., Martin, E.B., 2002. Model selection for partial least squares regression. Chemometrics and Intelligent Laboratory Systems 64, 79-89. https://doi.org/10.1016/S0169-7439(02)00051-5
Westad, F., Martens, H., 2000. Variable Selection in near Infrared Spectroscopy Based on Significance Testing in Partial Least Squares Regression. J. Near Infrared Spectrosc., JNIRS 8, 117-124.
Wold S. Cross-Validatory Estimation of the Number of Components in Factor and Principal Components Models. Technometrics. 1978;20(4):397-405
data(cassav) Xtrain <- cassav$Xtrain ytrain <- cassav$ytrain X <- cassav$Xtest y <- cassav$ytest nlv <- 20 res <- gridscorelv( Xtrain, ytrain, X, y, score = msep, fun = plskern, nlv = 0:nlv ) selwold(res$y1, res$nlv, f = 2/3)
SNV transformation of the row observations (e.g. spectra) of a dataset. By default, each observation is centered on its mean and divided by its standard deviation.
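With the default settings, the transformation amounts to scaling each row, as in the one-line sketch below (up to the standard-deviation convention, this should match snv(X)):

X <- matrix(rnorm(5 * 20), ncol = 20)
Xp <- t(scale(t(X), center = TRUE, scale = TRUE))   # row-wise centering and scaling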
snv(X, center = TRUE, scale = TRUE)
X |
X-data ( |
center |
Logical. If |
scale |
Logical. If |
A matrix of the transformed data.
data(cassav) X <- cassav$Xtest Xp <- snv(X) oldpar <- par(mfrow = c(1, 1)) par(mfrow = c(1, 2)) plotsp(X, main = "Signal") plotsp(Xp, main = "Corrected signal") abline(h = 0, lty = 2, col = "grey") par(oldpar)
Function soplsr
implements dimension reductions of pre-selected blocks of variables (= sets of columns) of a reference (= training) matrix, by sequential orthogonalization-PLS (so-called "SO-PLS").
Function soplsrcv
performs repeated cross-validation of an SO-PLS model in order to choose the optimal LV combination over the different blocks.
SO-PLS is described for instance in Menichelli et al. (2014), Biancolillo et al. (2015) and Biancolillo (2016).
The block reduction consists in calculating latent variables (= scores) for each block, each block being sequentially orthogonalized to the information computed from the previous blocks.
The function allows giving a priori weights to the rows of the reference matrix in the calculations.
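The orthogonalization step can be sketched for two blocks as follows (an illustration only, not the full SO-PLS algorithm): the second block is deflated on the scores of the first block before its own PLS reduction.

n <- 10
X1 <- matrix(rnorm(n * 4), ncol = 4)
X2 <- matrix(rnorm(n * 3), ncol = 3)
y  <- rnorm(n)
T1 <- plskern(X1, y, nlv = 2)$T                     # scores of the first block
X2_orth <- X2 - T1 %*% solve(crossprod(T1), crossprod(T1, X2))
crossprod(T1, X2_orth)                              # ~ 0: X2_orth is orthogonal to T1
## X2_orth is then reduced by PLS in the next SO-PLS step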
Auxiliary functions
transform
Calculates the LVs for any new matrices list from the model.
predict
Calculates the predictions for any new matrices list from the model.
soplsr(Xlist, Y, Xscaling = c("none", "pareto", "sd")[1], Yscaling = c("none", "pareto", "sd")[1], weights = NULL, nlv) soplsrcv(Xlist, Y, Xscaling = c("none", "pareto", "sd")[1], Yscaling = c("none", "pareto", "sd")[1], weights = NULL, nlvlist = list(), nbrep = 30, cvmethod = "kfolds", seed = 123, samplingk = NULL, nfolds = 7, optimisation = c("global","sequential")[1], selection = c("localmin","globalmin","1std")[1], majorityvote = FALSE) ## S3 method for class 'Soplsr' transform(object, X, ...) ## S3 method for class 'Soplsr' predict(object, X, ...)
Xlist |
A list of matrices or data frames of reference (= training) observations. |
X |
For the auxiliary functions: list of new X-data, with the same variables than the training X-data. |
Y |
A |
Xscaling |
vector (of length Xlist) of variable scaling for each datablock, among "none" (mean-centering only), "pareto" (mean-centering and pareto scaling), "sd" (mean-centering and unit variance scaling). If "pareto" or "sd", uncorrected standard deviation is used. |
Yscaling |
variable scaling for the Y-block, among "none" (mean-centering only), "pareto" (mean-centering and pareto scaling), "sd" (mean-centering and unit variance scaling). If "pareto" or "sd", uncorrected standard deviation is used. |
weights |
a priori weights to the rows of the reference matrix in the calculations. |
nlv |
A vector of same length as the number of blocks defining the number of scores to calculate for each block, or a single number. In this last case, the same number of scores is used for all the blocks. |
nlvlist |
A list of same length as the number of X-blocks. Each component of the list gives the number of PLS components of the corresponding X-block to test. |
nbrep |
An integer, setting the number of CV repetitions. Default value is 30. |
cvmethod |
"kfolds" for k-folds cross-validation, or "loo" for leave-one-out. |
seed |
a numeric. Seed used for the repeated resampling, and if cvmethod is "kfolds" and samplingk is not NULL. |
samplingk |
A vector of length n. The elements are the values of a qualitative variable used for stratified partition creation. If NULL, the first observation is set in the first fold, the second observation in the second fold, etc... |
nfolds |
An integer, setting the number of partitions to create. Default value is 7. |
optimisation |
"global" or "sequential" optimisation of the number of components. If "sequential", the optimal lv number is found for the first X-block, then for the 2nd one, etc... |
selection |
a character indicating the selection method to use to choose the optimal combination of components, among "localmin", "globalmin", "1std". If "localmin": the optimal combination corresponds to the first local minimum of the mean CV rmse. If "globalmin": the optimal combination corresponds to the minimum mean CV rmse. If "1std" (one standard error rule): it corresponds to the first combination after which the mean cross-validated rmse does not decrease significantly. |
majorityvote |
only if optimisation is "global" or one X-block. If majorityvote is TRUE, the optimal combination is chosen for each Y variable, with the chosen selection, before a majority vote. If majorityvote is "FALSE, the optimal combination is simply chosen with the chosen selection. |
object |
For the auxiliary functions: A fitted model, output of a call to the main functions. |
... |
For the auxiliary functions: Optional arguments. Not used. |
For soplsr
:
fm |
A list of the plsr models. |
T |
A matrix with the concatenated scores calculated from the X-blocks. |
pred |
A matrix |
xmeans |
list of vectors of X-mean values. |
ymeans |
vector of Y-mean values. |
xscales |
list of vectors of X-scaling values. |
yscales |
vector of Y-scaling values. |
b |
A list of X-loading weights, used in the orthogonalization step. |
weights |
Weights applied to the training observations. |
nlv |
vector of numbers of latent variables from each X-block. |
For transform.Soplsr
: the LVs calculated for the new matrices list from the model.
For predict.Soplsr
: predicted values for each observation
For soplsrcv
:
lvcombi |
matrix or list of matrices, of tested component combinations. |
optimcombi |
the number of PLS components of each X-block allowing the optimisation of the mean rmseCV. |
rmseCV_byY |
matrix or list of matrices of mean and sd of cross-validated RMSE in the model for each combination and each response variable. |
ExplVarCV_byY |
matrix or list of matrices of mean and sd of cross-validated explained variances in the model for each combination and each response variable. |
rmseCV |
matrix or list of matrices of mean and sd of cross-validated RMSE in the model for each combination and response variables. |
ExplVarCV |
matrix or list of matrices of mean and sd of cross-validated explained variances in the model for each combination and response variables. |
- Biancolillo et al., 2015. Combining SO-PLS and linear discriminant analysis for multi-block classification. Chemometrics and Intelligent Laboratory Systems, 141, 58-67.
- Biancolillo, A. 2016. Method development in the area of multi-block analysis focused on food analysis. PhD. University of Copenhagen.
- Menichelli et al., 2014. SO-PLS as an exploratory tool for path modelling. Food Quality and Preference, 36, 122-134.
- Tenenhaus, M., 1998. La régression PLS: théorie et pratique. Editions Technip, Paris, France.
soplsr_soplsda_allsteps
function to help determine the optimal number of latent variables, perform a permutation test, calculate model parameters and predict new observations.
N <- 10 ; p <- 12 set.seed(1) X <- matrix(rnorm(N * p, mean = 10), ncol = p, byrow = TRUE) Y <- matrix(rnorm(N * 2, mean = 10), ncol = 2, byrow = TRUE) colnames(X) <- paste("varx", 1:ncol(X), sep = "") colnames(Y) <- paste("vary", 1:ncol(Y), sep = "") rownames(X) <- rownames(Y) <- paste("obs", 1:nrow(X), sep = "") set.seed(NULL) X Y n <- nrow(X) X_list <- list(X[,1:4], X[,5:7], X[,9:ncol(X)]) X_list_2 <- list(X[1:2,1:4], X[1:2,5:7], X[1:2,9:ncol(X)]) soplsrcv(X_list, Y, Xscaling = c("none", "pareto", "sd")[1], Yscaling = c("none", "pareto", "sd")[1], weights = NULL, nlvlist=list(0:1, 1:2, 0:1), nbrep=1, cvmethod="loo", seed = 123, samplingk=NULL, optimisation="global", selection="localmin", majorityvote=FALSE) ncomp <- 2 fm <- soplsr(X_list, Y, nlv = ncomp) transform(fm, X_list_2) predict(fm, X_list_2) mse(predict(fm, X_list), Y) # VIP calculation based on the proportion of Y-variance explained by the components vip(fm$fm[[1]], X_list[[1]], Y = NULL, nlv = ncomp) vip(fm$fm[[2]], X_list[[2]], Y = NULL, nlv = ncomp) vip(fm$fm[[3]], X_list[[3]], Y = NULL, nlv = ncomp) ncomp <- c(2, 0, 3) fm <- soplsr(X_list, Y, nlv = ncomp) transform(fm, X_list_2) predict(fm, X_list_2) mse(predict(fm, X_list), Y) ncomp <- 0 fm <- soplsr(X_list, Y, nlv = ncomp) transform(fm, X_list_2) predict(fm, X_list_2) ncomp <- 2 weights <- rep(1 / n, n) #w <- 1:n fm <- soplsr(X_list, Y, Xscaling = c("sd","pareto","none"), nlv = ncomp, weights = weights) transform(fm, X_list_2) predict(fm, X_list_2)
Help determine the optimal number of latent variables by cross-validation, perform a permutation test, calculate model parameters and predict new observations, for soplsr (soplsr
), soplsrda (soplsrda
), soplslda (soplslda
) or soplsqda (soplsqda
) models.
soplsr_soplsda_allsteps(Xlist, Xnames = NULL, Xscaling = c("none","pareto","sd")[1], Y, Yscaling = c("none","pareto","sd")[1], weights = NULL, newXlist = NULL, newXnames = NULL, method = c("soplsr", "soplsrda","soplslda","soplsqda")[1], prior = c("unif", "prop")[1], step = c("nlvtest","permutation","model","prediction")[1], nlv = c(), nlvlist = list(), modeloutput = c("scores","loadings","coef","vip"), cvmethod = c("kfolds","loo")[1], nbrep = 30, seed = 123, samplingk = NULL, nfolds = 10, npermut = 30, optimisation = c("global","sequential")[1], criterion = c("err","rmse")[1], selection = c("localmin","globalmin","1std")[1], import = c("R","ChemFlow","W4M")[1], outputfilename = NULL)
Xlist |
list of training X-data ( |
Xnames |
names of the X-matrices |
Xscaling |
vector of Xlist length. X variable scaling among "none" (mean-centering only), "pareto" (mean-centering and pareto scaling), "sd" (mean-centering and unit variance scaling). If "pareto" or "sd", uncorrected standard deviation is used. |
Y |
Training Y-data ( |
Yscaling |
Y variable scaling among "none" (mean-centering only), "pareto" (mean-centering and pareto scaling), "sd" (mean-centering and unit variance scaling). If "pareto" or "sd", uncorrected standard deviation is used. |
weights |
Weights ( |
newXlist |
list of new X-data ( |
newXnames |
names of the newX-matrices |
method |
method to apply among "soplsr", "soplsrda", "soplslda", "soplsqda" |
prior |
for soplslda or soplsqda models: the prior probabilities of the classes. Possible values are "unif" (default; probabilities are set equal for all the classes) or "prop" (probabilities are set equal to the observed proportions of the classes in Y). |
step |
step of the analysis among "nlvtest" (cross-validation to help determine the optimal number of latent variables), "permutation" (permutation test), "model" (model calculation), "prediction" (prediction of the newX-data, or of the X-data if no new data are provided) |
nlv |
if step is not "nlvtest". A vector of same length as the number of blocks defining the number of scores to calculate for each block, or a single number. In this last case, the same number of scores is used for all the blocks. |
nlvlist |
if step is not "nlvtest". A list of same length as the number of X-blocks. Each component of the list gives the number of PLS components of the corresponding X-block to test. |
modeloutput |
if step is "model": outputs among "scores", "loadings", "coef" (regression coefficients), "vip" (Variable Importance in Projection; the VIP calculation being based on the proportion of Y-variance explained by the components, as proposed by Mehmood et al (2012, 2020).) |
cvmethod |
if step is "nlvtest" or "permutation": "kfolds" for k-folds cross-validation, or "loo" for leave-one-out. |
nbrep |
if step is "nlvtest" and cvmethod is "kfolds": An integer, setting the number of CV repetitions. Default value is 30. Must me set to 1 if cvmethod is "loo" |
seed |
if step is "nlvtest" and cvmethod is "kfolds", or if step is "permutation: a numeric. Seed used for the repeated resampling |
samplingk |
A vector of length n. The elements are the values of a qualitative variable used for stratified partition creation. If NULL, the first observation is set in the first fold, the second observation in the second fold, etc... |
nfolds |
if cvmethod is "kfolds". An integer, setting the number of partitions to create. Default value is 10. |
npermut |
if step is "permutation": An integer, setting the number of Y-Block with permutated responses to create. Default value is 30. |
optimisation |
if step is "nlvtest", method for the error optimisation among "global" (the optimal number of latent variables is determined after all the ordred block combination have been computed), or "sequential" (the optimal number of latent variables is determined after each addition of X-block) |
criterion |
if step is "nlvtest" or "permutation" and method is "plsrda", "plslda" or "plsqda": optimisation criterion among "rmse" and "err" (for classification error rate))) |
selection |
if step is "nlvtest": a character indicating the selection method to use to choose the optimal combination of components, among "localmin","globalmin","1std". If "localmin": the optimal combination corresponds to the first local minimum of the mean CV rmse or error rate. If "globalmin" : the optimal combination corresponds to the minimum mean CV rmse or error rate. If "1std" (one standard error rule) : it corresponds to the first combination after which the mean cross-validated rmse or error rate does not decrease significantly. |
import |
If "R", X and Y are in the global environment, and the observation names are in rownames. If "ChemFlow", X and Y are tabulated tables (.txt), and the observation names are in the first column. If "W4M", X and Y are tabulated tables (.txt), and the observation names are in the headers of X, and in the first column of Y. |
outputfilename |
character: if not NULL, name of the tabular file in which the function outputs have to be written. |
If step is "nlvtest": table with rmsecv or cross-validated classification error rates. The suggested optimal number of latent variable combination is indicated by the binary "optimum" variable.
If step is "permutation": table with the dissimilarity between the original and the permutated Y-block, and the rmsecv or cross-validated classification error rates obtained with the permutated Y-block by the model and the given number of latent variables.
If step is "model": list of tables of scores, loadings, regression coefficients, and vip values by X-Block, depending of the "modeloutput" parameter.
If step is "prediction": table of predicted scores and predicted classes or values.
n <- 50 ; p <- 8 Xtrain <- matrix(rnorm(n * p), ncol = p) colnames(Xtrain) <- paste0("V",1:p) ytrain <- sample(c(1, 4, 10), size = n, replace = TRUE) Xtest <- Xtrain[1:5, ] ; ytest <- ytrain[1:5] Xtrainlist <- list(Xtrain[,1:3], Xtrain[,4:8]) Xtestlist <- list(Xtest[,1:3], Xtest[,4:8]) nlv <- 5 resnlvtestsoplsrda <- soplsr_soplsda_allsteps(Xlist = Xtrainlist, Xnames = NULL, Xscaling = c("none","pareto","sd")[1], Y = ytrain, Yscaling = "none", weights = NULL, newXlist = Xtestlist, newXnames = NULL, method = c("soplsr", "soplsrda","soplslda","soplsqda")[2], prior = c("unif", "prop")[1], step = c("nlvtest","permutation","model","prediction")[1], nlvlist = list(1:2, 1:2), modeloutput = c("scores","loadings","coef","vip"), cvmethod = c("kfolds","loo")[2], nbrep = 1, seed = 123, samplingk = NULL, nfolds = 10, npermut = 5, optimisation = "global", criterion = c("err","rmse")[1], selection = c("localmin","globalmin","1std")[1], outputfilename = NULL) respermutationsoplsrda <- soplsr_soplsda_allsteps(Xlist = Xtrainlist, Xnames = NULL, Xscaling = c("none","pareto","sd")[1], Y = ytrain, Yscaling = "none", weights = NULL, newXlist = Xtestlist, newXnames = NULL, method = c("soplsr", "soplsrda","soplslda","soplsqda")[2], prior = c("unif", "prop")[1], step = c("nlvtest","permutation","model","prediction")[2], nlv = c(2,1), modeloutput = c("scores","loadings","coef","vip"), cvmethod = c("kfolds","loo")[2], nbrep = 1, seed = 123, samplingk = NULL, nfolds = 10, npermut = 5, criterion = c("err","rmse")[1], selection = c("localmin","globalmin","1std")[1], outputfilename = NULL) plotxy(respermutationsoplsrda, pch=16) abline (h = respermutationsoplsrda[respermutationsoplsrda[,"permut_dyssimilarity"]==0,"res_permut"]) resmodelsoplsrda <- soplsr_soplsda_allsteps(Xlist = Xtrainlist, Xnames = NULL, Xscaling = c("none","pareto","sd")[1], Y = ytrain, Yscaling = "none", weights = NULL, newXlist = Xtestlist, newXnames = NULL, method = c("soplsr", "soplsrda","soplslda","soplsqda")[2], prior = c("unif", "prop")[1], step = c("nlvtest","permutation","model","prediction")[3], nlv = c(2,1), modeloutput = c("scores","loadings","coef","vip"), cvmethod = c("kfolds","loo")[2], nbrep = 1, seed = 123, samplingk = NULL, nfolds = 10, npermut = 5, criterion = c("err","rmse")[1], selection = c("localmin","globalmin","1std")[1], outputfilename = NULL) resmodelsoplsrda$scores resmodelsoplsrda$loadings resmodelsoplsrda$coef resmodelsoplsrda$vip respredictionsoplsrda <- soplsr_soplsda_allsteps(Xlist = Xtrainlist, Xnames = NULL, Xscaling = c("none","pareto","sd")[1], Y = ytrain, Yscaling = "none", weights = NULL, newXlist = Xtestlist, newXnames = NULL, method = c("soplsr", "soplsrda","soplslda","soplsqda")[2], prior = c("unif", "prop")[1], step = c("nlvtest","permutation","model","prediction")[4], nlv = c(2,1), modeloutput = c("scores","loadings","coef","vip"), cvmethod = c("kfolds","loo")[2], nbrep = 1, seed = 123, samplingk = NULL, nfolds = 10, npermut = 5, criterion = c("err","rmse")[1], selection = c("localmin","globalmin","1std")[1], outputfilename = NULL)
Function soplsrda
implements a dimension reduction of pre-selected blocks of variables (= sets of columns) of a reference (= training) matrix by sequentially orthogonalized PLS (so-called "SO-PLS") in a context of discrimination.
Function soplsrdacv
performs repeated cross-validation of an SO-PLS-RDA model in order to choose the optimal combination of latent variables from the different blocks.
The block reduction consists in calculating latent variables (= scores) for each block, each block being sequentially orthogonalized to the information computed from the previous blocks.
The function allows giving a priori weights to the rows of the reference matrix in the calculations.
In soplslda
and soplsqda
, probabilistic LDA and QDA are run over the PLS2 LVs, respectively.
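The sequential orthogonalization described above can be sketched in a few lines. This is only an illustration of the principle with two blocks and a continuous response, not the internal code of the functions (which additionally handle the class coding, scaling and row weighting); plskern and the $T score component are taken from this documentation, while the explicit projector-based deflation is the illustrative part.
## Toy data: two X-blocks and a continuous response
set.seed(1)
n <- 20
X1 <- matrix(rnorm(n * 5), ncol = 5)
X2 <- matrix(rnorm(n * 4), ncol = 4)
Y <- rnorm(n)
## Block 1: ordinary PLS, keep its scores
fm1 <- plskern(X1, Y, nlv = 2)
T1 <- fm1$T
## Block 2: orthogonalize X2 (and deflate Y) with respect to the scores of block 1
P1 <- T1 %*% solve(crossprod(T1)) %*% t(T1)   # projector on the space spanned by T1
X2orth <- X2 - P1 %*% X2
Yres <- Y - P1 %*% Y
fm2 <- plskern(X2orth, Yres, nlv = 2)
## The global score matrix concatenates the block-wise scores
Tall <- cbind(T1, fm2$T)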
soplsrda(Xlist, y, Xscaling = c("none", "pareto", "sd")[1], Yscaling = c("none", "pareto", "sd")[1], weights = NULL, nlv) soplslda(Xlist, y, Xscaling = c("none", "pareto", "sd")[1], Yscaling = c("none", "pareto", "sd")[1], weights = NULL, nlv, prior = c("unif", "prop")) soplsqda(Xlist, y, Xscaling = c("none", "pareto", "sd")[1], Yscaling = c("none", "pareto", "sd")[1], weights = NULL, nlv, prior = c("unif", "prop")) soplsrdacv(Xlist, y, Xscaling = c("none", "pareto", "sd")[1], Yscaling = c("none", "pareto", "sd")[1], weights = NULL, nlvlist=list(), nbrep=30, cvmethod="kfolds", seed = 123, samplingk = NULL, nfolds = 7, optimisation = c("global","sequential")[1], criterion = c("err","rmse")[1], selection = c("localmin","globalmin","1std")[1]) soplsldacv(Xlist, y, Xscaling = c("none", "pareto", "sd")[1], Yscaling = c("none", "pareto", "sd")[1], weights = NULL, nlvlist=list(), prior = c("unif", "prop"), nbrep = 30, cvmethod = "kfolds", seed = 123, samplingk = NULL, nfolds = 7, optimisation = c("global","sequential")[1], criterion = c("err","rmse")[1], selection = c("localmin","globalmin","1std")[1]) soplsqdacv(Xlist, y, Xscaling = c("none", "pareto", "sd")[1], Yscaling = c("none", "pareto", "sd")[1], weights = NULL, nlvlist = list(), prior = c("unif", "prop"), nbrep = 30, cvmethod = "kfolds", seed = 123, samplingk = NULL, nfolds = 7, optimisation = c("global","sequential")[1], criterion = c("err","rmse")[1], selection = c("localmin","globalmin","1std")[1]) ## S3 method for class 'Soplsrda' transform(object, X, ...) ## S3 method for class 'Soplsprobda' transform(object, X, ...) ## S3 method for class 'Soplsrda' predict(object, X, ...) ## S3 method for class 'Soplsprobda' predict(object, X, ...)
Xlist |
For the main functions: A list of matrices or data frames of reference (= training) observations. |
X |
For the auxiliary functions: list of new X-data, with the same variables as the training X-data. |
y |
Training class membership ( |
Xscaling |
vector of the same length as Xlist giving the variable scaling for each data block, among "none" (mean-centering only), "pareto" (mean-centering and pareto scaling), "sd" (mean-centering and unit variance scaling). If "pareto" or "sd", uncorrected standard deviation is used. |
Yscaling |
variable scaling for the Y-block, among "none" (mean-centering only), "pareto" (mean-centering and pareto scaling), "sd" (mean-centering and unit variance scaling). If "pareto" or "sd", uncorrected standard deviation is used. |
weights |
a priori weights to the rows of the reference matrix in the calculations. |
nlv |
A vector of same length as the number of blocks defining the number of scores to calculate for each block, or a single number. In this last case, the same number of scores is used for all the blocks. |
nlvlist |
A list of same length as the number of X-blocks. Each component of the list gives the number of PLS components of the corresponding X-block to test. |
nbrep |
An integer, setting the number of CV repetitions. Default value is 30. |
cvmethod |
"kfolds" for k-folds cross-validation, or "loo" for leave-one-out. |
seed |
a numeric. Seed used for the repeated resampling, and if cvmethod is "kfolds" and samplingk is not NULL. |
samplingk |
Optional. A vector of length n. The elements are the values of a qualitative variable used for stratified partition creation. |
nfolds |
An integer, setting the number of partitions to create. Default value is 7. |
optimisation |
"global" or "sequential" optimisation of the number of components. If "sequential", the optimal lv number is found for the first X-block, then for the 2nd one, etc... |
criterion |
optimisation criterion among "rmse" and "err" (for classification error rate) |
selection |
a character indicating the selection method to use to choose the optimal combination of components, among "localmin","globalmin","1std". If "localmin": the optimal combination corresponds to the first local minimum of the mean CV rmse or error rate. If "globalmin" : the optimal combination corresponds to the minimum mean CV rmse or error rate. If "1std" (one standard error rule) : it corresponds to the first combination after which the mean cross-validated rmse or error rate does not decrease significantly. |
prior |
The prior probabilities of the classes. Possible values are "unif" (default; probabilities are set equal for all the classes) or "prop" (probabilities are set equal to the observed proportions of the classes in y). |
object |
For the auxiliary functions: A fitted model, output of a call to the main functions. |
... |
For the auxiliary functions: Optional arguments. Not used. |
For soplsrda
, soplslda
, soplsqda
:
fm |
list with the PLS models: ( |
lev |
classes |
ni |
number of observations in each class |
For transform.Soplsrda
, transform.Soplsprobda
: the LVs calculated for the list of new matrices, from the model.
For predict.Soplsrda
, predict.Soplsprobda
:
pred |
predicted class for each observation |
posterior |
calculated probability of belonging to a class for each observation |
For soplsrdacv
, soplsldacv
, soplsqdacv
:
lvcombi |
matrix or list of matrices, of tested component combinations. |
optimCombiLine |
number of the combination line corresponding to the optimal one. In the case of a sequential optimisation, it is the number of the combination line in the model with all the X-blocks. |
optimcombi |
the number of PLS components of each X-block allowing the optimisation of the mean rmseCV. |
optimExplVarCV |
cross-validated explained variance for the optimal soplsda model. |
rmseCV |
matrix or list of matrices of mean and sd of cross-validated rmse in the model for each combination and response variables. |
ExplVarCV |
matrix or list of matrices of mean and sd of cross-validated explained variances in the model for each combination and response variables. |
errCV |
matrix or list of matrices of mean and sd of cross-validated classification error rates in the model for each combination and response variables. |
- Biancolillo et al., 2015. Combining SO-PLS and linear discriminant analysis for multi-block classification. Chemometrics and Intelligent Laboratory Systems, 141, 58-67.
- Biancolillo, A., 2016. Method development in the area of multi-block analysis focused on food analysis. PhD thesis. University of Copenhagen.
- Menichelli et al., 2014. SO-PLS as an exploratory tool for path modelling. Food Quality and Preference, 36, 122-134.
- Tenenhaus, M., 1998. La régression PLS: théorie et pratique. Editions Technip, Paris, France.
soplsr_soplsda_allsteps
function to help determine the optimal number of latent variables, perform a permutation test, calculate model parameters and predict new observations.
N <- 10 ; p <- 12 set.seed(1) X <- matrix(rnorm(N * p, mean = 10), ncol = p, byrow = TRUE) y <- matrix(sample(c("1", "4", "10"), size = N, replace = TRUE), ncol=1) colnames(X) <- paste("x", 1:ncol(X), sep = "") set.seed(NULL) n <- nrow(X) X_list <- list(X[,1:4], X[,5:7], X[,9:ncol(X)]) X_list_2 <- list(X[1:2,1:4], X[1:2,5:7], X[1:2,9:ncol(X)]) # EXEMPLE WITH SO-PLS-RDA soplsrdacv(X_list, y, Xscaling = c("none", "pareto", "sd")[1], Yscaling = c("none", "pareto", "sd")[1], weights = NULL, nlvlist=list(0:1, 1:2, 0:1), nbrep=1, cvmethod="loo", seed = 123, samplingk = NULL, nfolds = 3, optimisation = "global", criterion = c("err","rmse")[1], selection = "localmin") ncomp <- 2 fm <- soplsrda(X_list, y, nlv = ncomp) predict(fm,X_list_2) transform(fm,X_list_2) ncomp <- c(2, 0, 3) fm <- soplsrda(X_list, y, nlv = ncomp) predict(fm,X_list_2) transform(fm,X_list_2) ncomp <- 0 fm <- soplsrda(X_list, y, nlv = ncomp) predict(fm,X_list_2) transform(fm,X_list_2) # EXEMPLE WITH SO-PLS-LDA ncomp <- 2 weights <- rep(1 / n, n) #w <- 1:n soplslda(X_list, y, Xscaling = "none", nlv = ncomp, weights = weights) soplslda(X_list, y, Xscaling = "pareto", nlv = ncomp, weights = weights) soplslda(X_list, y, Xscaling = "sd", nlv = ncomp, weights = weights) fm <- soplslda(X_list, y, Xscaling = c("none","pareto","sd"), nlv = ncomp, weights = weights) predict(fm,X_list_2) transform(fm,X_list_2)
Source all the R functions contained in a directory.
sourcedir(path, trace = TRUE, ...)
path |
A character vector of full path names; the default corresponds to the working directory, getwd(). |
trace |
Logical. Default to TRUE. If TRUE, the name of each sourced file is printed. |
... |
Additional arguments to pass to function source. |
Sourcing.
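As an indication only, the behaviour of sourcedir essentially amounts to the loop below; the ".R" file pattern and the message printed when trace = TRUE are assumptions made for this sketch, not a description of the actual implementation.
path <- getwd()                                    # any directory containing .R files
files <- list.files(path, pattern = "\\.[Rr]$", full.names = TRUE)
for (f in files) {
  message("Sourcing: ", basename(f))               # information printed when trace = TRUE
  source(f)
}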
path <- "D:/Users/Fun" sourcedir(path, FALSE)
path <- "D:/Users/Fun" sourcedir(path, FALSE)
Displays summary statistics for each quantitative column of the data set.
summ(X, nam = NULL, digits = 3)
X |
A matrix or data frame containing the variables to summarize. |
nam |
Names of the variables to summarize (vector of character strings). Default to NULL (all the columns are considered). |
digits |
Number of digits for the numerical outputs. |
tab |
A dataframe of summary statistics : |
ntot |
number of observations |
dat <- data.frame(
  v1 = rnorm(10),
  v2 = c(NA, rnorm(8), NA),
  v3 = c(NA, NA, NA, rnorm(7)))
dat
summ(dat)
summ(dat, nam = c("v1", "v3"))
SVM models with Gaussian (RBF) kernel.
svmr
: SVM regression (SVMR).
svmda
: SVM discrimination (SVMC).
The SVM models are fitted with the 'C' (cost) parameterization, not the 'nu' parameterization.
The RBF kernel is defined by: exp(-gamma * |x - y|^2).
For tuning the model, usual preliminary ranges are for instance (see the screening sketch further below):
- cost = 10^(-5:15)
- epsilon = seq(.1, .3, by = .1)
- gamma = 10^(-6:3)
The functions use function svm
of package e1071
(Meyer et al. 2021) available on CRAN (e1071 uses the LIBSVM toolbox; Chang & Lin, http://www.csie.ntu.edu.tw/~cjlin/libsvm).
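The preliminary ranges mentioned above can be screened on a validation set with a plain loop such as the hedged sketch below (a reduced grid is used to keep the example short, and the validation MSEP is computed by hand; the package's generic tuning functions or msep() could be used instead). Only functions documented on this page (svmr, predict) are assumed.
## Sketch: grid screening of (cost, epsilon, gamma) for svmr on a validation split
n <- 100 ; p <- 5
X <- matrix(rnorm(n * p), ncol = p)
y <- rnorm(n)
itrain <- 1:70 ; itest <- 71:n
pars <- expand.grid(cost = 10^(-2:2), epsilon = c(.1, .2), gamma = 10^(-2:0))
res <- numeric(nrow(pars))
for (i in seq_len(nrow(pars))) {
  fm <- svmr(X[itrain, ], y[itrain], cost = pars$cost[i],
    epsilon = pars$epsilon[i], gamma = pars$gamma[i])
  pred <- predict(fm, X[itest, ])$pred
  res[i] <- mean((pred - y[itest])^2)              # validation MSEP
}
pars[which.min(res), ]                             # best parameter combination on this split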
svmr(X, y, cost = 1, epsilon = .1, gamma = 1, scale = FALSE) svmda(X, y, cost = 1, epsilon = .1, gamma = 1, scale = FALSE) ## S3 method for class 'Svm' predict(object, X, ...) ## S3 method for class 'Svm' summary(object, ...)
X |
For the main functions: Training X-data ( |
y |
Training Y-data ( |
cost |
The cost of constraints violation (the C constant of the regularization term). |
epsilon |
The epsilon in the insensitive-loss function used for SVM regression. |
gamma |
The gamma parameter of the RBF kernel (see the kernel definition above). |
scale |
Logical. If TRUE, X is scaled internally (see the scale argument of function svm in e1071). Default to FALSE. |
object |
For the auxiliary functions: A fitted model, output of a call to the main function. |
... |
For the auxiliary functions: Optional arguments. |
For svmr
and svmda
:
fm |
list of outputs such as:
|
For predict.Svm
:
pred |
predictions for each observation. |
For summary.Svm
: display of the call, the parameters, and the number of support vectors.
The first example illustrates SVMR. The second one is the example of fitting the function sinc(x) described in Rosipal & Trejo 2001 p. 105-106. The third one illustrates SVMC.
Meyer, M., 2021. Support Vector Machines - The Interface to libsvm in package e1071. FH Technikum Wien, Austria, [email protected]. https://cran.r-project.org/web/packages/e1071/vignettes/svmdoc.pdf
Chang, C.-C. & Lin, C.-J., 2001. LIBSVM: a library for support vector machines. Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm. Detailed documentation (algorithms, formulae, ...) can be found in http://www.csie.ntu.edu.tw/~cjlin/papers/libsvm.ps.gz
## EXAMPLE 1 (SVMR) n <- 50 ; p <- 4 Xtrain <- matrix(rnorm(n * p), ncol = p) ytrain <- rnorm(n) m <- 3 Xtest <- Xtrain[1:m, , drop = FALSE] ytest <- ytrain[1:m] fm <- svmr(Xtrain, ytrain) predict(fm, Xtest) pred <- predict(fm, Xtest)$pred msep(pred, ytest) summary(fm) ## EXAMPLE 2 x <- seq(-10, 10, by = .2) x[x == 0] <- 1e-5 n <- length(x) zy <- sin(abs(x)) / abs(x) y <- zy + rnorm(n, 0, .2) plot(x, y, type = "p") lines(x, zy, lty = 2) X <- matrix(x, ncol = 1) fm <- svmr(X, y, gamma = .5) pred <- predict(fm, X)$pred plot(X, y, type = "p") lines(X, zy, lty = 2) lines(X, pred, col = "red") ## EXAMPLE 3 (SVMC) n <- 50 ; p <- 8 Xtrain <- matrix(rnorm(n * p), ncol = p) ytrain <- sample(c("a", "10", "d"), size = n, replace = TRUE) m <- 5 Xtest <- Xtrain[1:m, ] ; ytest <- ytrain[1:m] cost <- 100 ; epsilon <- .1 ; gamma <- 1 fm <- svmda(Xtrain, ytrain, cost = cost, epsilon = epsilon, gamma = gamma) predict(fm, Xtest) pred <- predict(fm, Xtest)$pred err(pred, ytest) summary(fm)
Transformation of the X-data by a fitted model.
transform(object, X, ...)
object |
A fitted model, output of a call to a fitting function. |
X |
New X-data to consider. |
... |
Optional arguments. |
the transformed X-data
## EXAMPLE 1
n <- 6 ; p <- 4
X <- matrix(rnorm(n * p), ncol = p)
y <- rnorm(n)
fm <- pcaeigen(X, nlv = 3)
fm$T
transform(fm, X[1:2, ], nlv = 2)
## EXAMPLE 2
n <- 6 ; p <- 4
X <- matrix(rnorm(n * p), ncol = p)
y <- rnorm(n)
fm <- plskern(X, y, nlv = 3)
fm$T
transform(fm, X[1:2, ], nlv = 2)
vip
calculates the Variable Importance in Projection (VIP) for a PLS model.
vip(object, X, Y = NULL, nlv = NULL)
object |
A fitted model, output of a call to a fitting function among |
X |
X-data involved in the fitted model |
Y |
Y-data involved in the fitted model.
If NULL (default), the VIP calculation is based on the proportion of Y-variance explained by the components. |
nlv |
Number of components (LVs) to consider. |
matrix (p, nlv) with VIP values, for models with 1 to nlv latent variables.
Mehmood, T.,Liland, K.H.,Snipen, L.,Sæbø, S., 2012. A review of variable selection methods in Partial Least Squares Regression. Chemometrics and Intelligent Laboratory Systems, 118, 62-69.
Mehmood, T., Sæbø, S.,Liland, K.H., 2020. Comparison of variable selection methods in partial least squares regression. Journal of Chemometrics, 34, e3226.
Tenenhaus, M., 1998. La régression PLS: théorie et pratique. Editions Technip, Paris, France.
## EXAMPLE OF PLS n <- 50 ; p <- 4 Xtrain <- matrix(rnorm(n * p), ncol = p) ytrain <- rnorm(n) Ytrain <- cbind(y1 = ytrain, y2 = 100 * ytrain) m <- 3 Xtest <- Xtrain[1:m, , drop = FALSE] Ytest <- Ytrain[1:m, , drop = FALSE] ; ytest <- Ytest[1:m, 1] nlv <- 3 fm <- plskern(Xtrain, Ytrain, nlv = nlv) vip(fm, Xtrain, Ytrain, nlv = nlv) vip(fm, Xtrain, nlv = nlv) fm <- plskern(Xtrain, ytrain, nlv = nlv) vip(fm, Xtrain, ytrain, nlv = nlv) vip(fm, Xtrain, nlv = nlv) ## EXAMPLE OF PLSDA n <- 50 ; p <- 8 Xtrain <- matrix(rnorm(n * p), ncol = p) ytrain <- sample(c("1", "4", "10"), size = n, replace = TRUE) Xtest <- Xtrain[1:5, ] ; ytest <- ytrain[1:5] nlv <- 5 fm <- plsrda(Xtrain, ytrain, nlv = nlv) vip(fm, Xtrain, ytrain, nlv = nlv)
Calculation of weights from a vector of distances using a decreasing inverse exponential function.
Let d be a vector of distances.
1- Preliminary weights are calculated by w = exp(-d / (h * mad(d))), where h is a scalar > 0 (scale factor).
2- The weights corresponding to distances higher than median(d) + cri * mad(d), where cri is a scalar > 0, are set to zero. This step is used for removing outliers.
3- Finally, the weights are "normalized" between 0 and 1 by w = w / max(w).
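A manual reproduction of these three steps is sketched below as an illustration; it should closely match the output of wdist if the implementation follows the formulas given above (small differences could come from implementation details such as the squared argument).
d <- sqrt(rchisq(n = 100, df = 10))
h <- 2 ; cri <- 3
w <- exp(-d / (h * mad(d)))                 # step 1: inverse exponential decrease
w[d > median(d) + cri * mad(d)] <- 0        # step 2: outlying distances get a weight of 0
w <- w / max(w)                             # step 3: scale the weights between 0 and 1
## Comparison with the function output (expected to be close to 0 if the sketch matches)
max(abs(w - wdist(d, h = h, cri = cri)))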
wdist(d, h, cri = 4, squared = FALSE)
d |
A vector of distances. |
h |
A scaling factor (positive scalar). The lower h, the sharper the decrease of the weights with the distance. |
cri |
A positive scalar used for defining outliers in the distances vector. |
squared |
Logical. If TRUE, the squared distances are used in the calculation. Default to FALSE. |
A vector of weights.
x1 <- sqrt(rchisq(n = 100, df = 10)) x2 <- sqrt(rchisq(n = 10, df = 40)) d <- c(x1, x2) h <- 2 ; cri <- 3 w <- wdist(d, h = h, cri = cri) oldpar <- par(mfrow = c(1, 1)) par(mfrow = c(2, 2)) plot(d) hist(d, n = 50) plot(w, ylim = c(0, 1)) ; abline(h = 1, lty = 2) plot(d, w, ylim = c(0, 1)) ; abline(h = 1, lty = 2) par(oldpar) d <- seq(0, 15, by = .5) h <- c(.5, 1, 1.5, 2.5, 5, 10, Inf) for(i in 1:length(h)) { w <- wdist(d, h = h[i]) z <- data.frame(d = d, w = w, h = rep(h[i], length(d))) if(i == 1) res <- z else res <- rbind(res, z) } res$h <- as.factor(res$h) headm(res) plotxy(res[, c("d", "w")], asp = 0, group = res$h, pch = 16)
Function xfit
calculates an approximation of the matrix X (denoted X_fit below) from a PCA or PLS model fitted on X.
Function xresid
calculates the residual matrix E = X - X_fit.
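For a PCA, the fitted matrix is the projection of the centered data on the first components, with the column means added back. The sketch below illustrates this; it assumes that the object returned by pcasvd stores the scores in $T (as used elsewhere in this documentation) and the loadings in $P, which is an assumption of the example rather than a statement about the internal code.
n <- 6 ; p <- 4
X <- matrix(rnorm(n * p), ncol = p)
fm <- pcasvd(X, nlv = 3)
nlv <- 2
## Rebuild the fitted matrix from scores and loadings, then add back the column means
Xfit_manual <- fm$T[, 1:nlv] %*% t(fm$P[, 1:nlv])
Xfit_manual <- sweep(Xfit_manual, 2, colMeans(X), "+")
## Expected to be close to xfit(fm, X, nlv = 2); the residuals X - Xfit_manual
## correspond to xresid(fm, X, nlv = 2)
max(abs(Xfit_manual - xfit(fm, X, nlv = nlv)))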
xfit(object, X, ...) ## S3 method for class 'Pca' xfit(object, X, ..., nlv = NULL) ## S3 method for class 'Plsr' xfit(object, X, ..., nlv = NULL) xresid(object, X, ..., nlv = NULL)
object |
A fitted model, output of a call to a fitting function. |
X |
The X-data that was used to fit the model |
nlv |
Number of components (PCs or LVs) to consider. |
... |
Optional arguments. |
For xfit
:matrix of fitted values.
For xresid
:matrix of residuals.
n <- 6 ; p <- 4
X <- matrix(rnorm(n * p), ncol = p)
y <- rnorm(n)
nlv <- 3
fm <- pcasvd(X, nlv = nlv)
xfit(fm, X)
xfit(fm, X, nlv = 1)
xfit(fm, X, nlv = 0)
X - xfit(fm, X)
xresid(fm, X)
X - xfit(fm, X, nlv = 1)
xresid(fm, X, nlv = 1)