Multivariate Multi-Task LASSO
Fits a multi-task LASSO model for multivariate isoform prediction. This method encourages joint feature selection across isoforms using L21 regularization (row sparsity on the coefficient matrix).
Arguments
- X
matrix, design matrix of SNP dosages (n x p)
- Y
matrix, matrix of isoform expression across columns (n x q)
- regularization
character, type of multi-task regularization:
"L21": L2,1 norm (row sparsity) - same SNPs selected across isoforms
"Trace": trace norm (low-rank) - isoforms share latent structure (requires backend = "rmtl")
"Lasso": standard lasso applied jointly (requires backend = "rmtl")
- lambda
numeric or NULL, regularization parameter. If NULL, selected by CV.
- lambda_seq
numeric vector, sequence of lambda values to try in CV. If NULL, automatically generates a sequence.
- nlambda
int, number of lambda values if lambda_seq is NULL
- lambda_min_ratio
numeric, ratio of the smallest to the largest lambda in the automatically generated sequence
- Lam2
numeric, ridge penalty parameter (for RMTL backend). Default 0.
- nfolds
int, number of CV folds
- standardize
logical, standardize X before fitting. Default FALSE.
- verbose
logical, print progress
- seed
int, random seed
- par
logical, use parallel processing for CV folds
- n.cores
int, number of cores for parallel processing
- backend
character, implementation to use:
"fast": Custom fast implementation optimized for shared X (default)
"rmtl": Use RMTL package (slower, but supports Trace/Lasso regularization)
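As a rough illustration of how lambda_seq, nlambda and lambda_min_ratio usually interact, the base R sketch below builds a log-spaced grid. It assumes the L21 objective given in Details, under which every row of W is zero once lambda reaches max_j ||X_j'Y||_2 / n. The helper name make_lambda_seq and its default values are hypothetical, and the package may construct its grid differently.

```r
# Hypothetical helper (not part of the package): log-spaced lambda grid
# for the objective (1/2n) * ||Y - XW||_F^2 + lambda * ||W||_{2,1}.
make_lambda_seq <- function(X, Y, nlambda = 50, lambda_min_ratio = 0.01) {
  n <- nrow(X)
  XtY <- crossprod(X, Y)                        # p x q matrix X'Y
  # Smallest lambda at which all rows of W are zero
  lambda_max <- max(sqrt(rowSums(XtY^2))) / n
  # Decreasing, evenly spaced on the log scale
  exp(seq(log(lambda_max),
          log(lambda_max * lambda_min_ratio),
          length.out = nlambda))
}

set.seed(1)
X <- matrix(rnorm(100 * 20), 100, 20)
Y <- matrix(rnorm(100 * 3), 100, 3)
lambda_seq <- make_lambda_seq(X, Y, nlambda = 10, lambda_min_ratio = 0.01)
```

A vector built this way could be passed as lambda_seq, or a single value from it as lambda.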
Value
isotwas_model object containing:
transcripts: list of transcript_model objects with weights, R2, pvalues
best_lambda: optimal lambda from CV
regularization: type of regularization used
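A minimal usage sketch on simulated data follows. It assumes the function is exported as multivariate_mtlasso from a package named isotwas, that the arguments not shown have usable defaults, and that the returned object can be indexed like a list. None of these are confirmed by this page, so the call is guarded and only runs when the function is found.

```r
# Simulated data: 200 samples, 50 SNPs, 3 isoforms sharing 5 causal SNPs.
set.seed(42)
n <- 200; p <- 50; q <- 3
X <- matrix(rbinom(n * p, size = 2, prob = 0.3), n, p)  # dosages in {0, 1, 2}
W_true <- matrix(0, p, q)
W_true[1:5, ] <- rnorm(5 * q)
Y <- X %*% W_true + matrix(rnorm(n * q), n, q)
colnames(X) <- paste0("snp", seq_len(p))
colnames(Y) <- paste0("isoform", seq_len(q))

# Assumption: package name and export are guesses; skip the fit if absent.
if (requireNamespace("isotwas", quietly = TRUE) &&
    "multivariate_mtlasso" %in% getNamespaceExports("isotwas")) {
  fit <- isotwas::multivariate_mtlasso(
    X = X, Y = Y,
    regularization = "L21",
    lambda = NULL,        # NULL: lambda chosen by cross-validation
    nfolds = 5,
    seed = 42,
    backend = "fast"
  )
  print(fit$best_lambda)      # assumed list-style access to documented fields
  print(fit$regularization)
}
```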
Details
The L21 penalty enforces that the same SNPs are selected or excluded across all isoforms, which is appropriate when SNPs are expected to have shared effects on multiple isoforms of the same gene.
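With L21 regularization, the model solves: $$\min_W \frac{1}{2n}\|Y - XW\|_F^2 + \lambda \|W\|_{2,1}$$
where W is the p x q coefficient matrix (SNPs in rows, isoforms in columns) and \(\|W\|_{2,1} = \sum_{j=1}^{p} \|W_{j\cdot}\|_2\) is the sum of the L2 norms of its rows, which encourages entire rows (SNPs) to be zero or non-zero together.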
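To make the penalty concrete, the short base R sketch below evaluates the L2,1 norm and the penalized objective from the formula above. The function names l21_norm and l21_objective are illustrative only and are not taken from the package.

```r
# L2,1 norm: Euclidean norm of each row, summed over rows.
l21_norm <- function(W) sum(sqrt(rowSums(W^2)))

# Penalized objective: (1/2n) * ||Y - XW||_F^2 + lambda * ||W||_{2,1}
l21_objective <- function(X, Y, W, lambda) {
  n <- nrow(X)
  sum((Y - X %*% W)^2) / (2 * n) + lambda * l21_norm(W)
}

# Three SNPs (rows) by two isoforms (columns); the second SNP is excluded.
W <- rbind(c(3, 4),
           c(0, 0),
           c(1, 0))
l21_norm(W)  # row norms are 5, 0 and 1, so the result is 6
```

A row contributes nothing to the penalty only when all of its entries are zero, which is why the penalty favors dropping a SNP for every isoform at once.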
The fast backend exploits the fact that all isoforms share the same design matrix X: it precomputes X'X once, which makes it much faster than general-purpose MTL solvers.
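The idea of precomputing X'X can be sketched with a plain proximal gradient loop in base R. This is an illustration under stated assumptions, not the package's actual solver: the names prox_l21 and mtlasso_pg, the fixed step size, and the iteration count are all hypothetical. Because X is shared, X'X and X'Y are formed once and every iteration works only with p x p and p x q matrices.

```r
# Row-wise (group) soft-thresholding: proximal operator of t * ||W||_{2,1}.
prox_l21 <- function(W, t) {
  row_norms <- sqrt(rowSums(W^2))
  shrink <- pmax(0, 1 - t / pmax(row_norms, .Machine$double.eps))
  W * shrink  # column-major recycling scales row i by shrink[i]
}

# Proximal gradient descent for the L21 objective with a precomputed Gram matrix.
mtlasso_pg <- function(X, Y, lambda, n_iter = 500) {
  n <- nrow(X)
  G <- crossprod(X) / n        # X'X / n, computed once
  C <- crossprod(X, Y) / n     # X'Y / n, computed once
  # Step size 1 / L, where L is the largest eigenvalue of G
  step <- 1 / max(eigen(G, symmetric = TRUE, only.values = TRUE)$values)
  W <- matrix(0, ncol(X), ncol(Y))
  for (k in seq_len(n_iter)) {
    grad <- G %*% W - C        # gradient of the squared-error term
    W <- prox_l21(W - step * grad, step * lambda)
  }
  W
}

set.seed(1)
n <- 100; p <- 20; q <- 3
X <- matrix(rnorm(n * p), n, p)
W_true <- matrix(0, p, q)
W_true[1:3, ] <- 1
Y <- X %*% W_true + matrix(rnorm(n * q), n, q)

W_hat <- mtlasso_pg(X, Y, lambda = 0.2)
selected <- which(rowSums(W_hat != 0) > 0)  # SNPs kept jointly across isoforms
```

Each selected row of W_hat is non-zero for all isoforms, and each excluded row is zero for all of them, which is the row-sparsity pattern described above.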