SPiP: Splicing Prediction Pipeline, a machine learning tool for massive detection of exonic and intronic variant effects on mRNA splicing
Research output: Contribution to journal › Journal article › Research › peer-review
Standard
SPiP: Splicing Prediction Pipeline, a machine learning tool for massive detection of exonic and intronic variant effects on mRNA splicing. / Leman, Raphaël; Parfait, Béatrice; Vidaud, Dominique; Girodon, Emmanuelle; Pacot, Laurence; Le Gac, Gérald; Ka, Chandran; Ferec, Claude; Fichou, Yann; Quesnelle, Céline; Aucouturier, Camille; Muller, Etienne; Vaur, Dominique; Castera, Laurent; Boulouard, Flavie; Ricou, Agathe; Tubeuf, Hélène; Soukarieh, Omar; Gaildrat, Pascaline; Riant, Florence; Guillaud-Bataille, Marine; Caputo, Sandrine M.; Caux-Moncoutier, Virginie; Boutry-Kryza, Nadia; Bonnet-Dorion, Françoise; Schultz, Ines; Rossing, Maria; Quenez, Olivier; Goldenberg, Louis; Harter, Valentin; Parsons, Michael T.; Spurdle, Amanda B.; Frébourg, Thierry; Martins, Alexandra; Houdayer, Claude; Krieger, Sophie.
In: Human Mutation, Vol. 43, No. 12, 2022, p. 2308-2323.
RIS
TY - JOUR
T1 - SPiP
T2 - Splicing Prediction Pipeline, a machine learning tool for massive detection of exonic and intronic variant effects on mRNA splicing
AU - Leman, Raphaël
AU - Parfait, Béatrice
AU - Vidaud, Dominique
AU - Girodon, Emmanuelle
AU - Pacot, Laurence
AU - Le Gac, Gérald
AU - Ka, Chandran
AU - Ferec, Claude
AU - Fichou, Yann
AU - Quesnelle, Céline
AU - Aucouturier, Camille
AU - Muller, Etienne
AU - Vaur, Dominique
AU - Castera, Laurent
AU - Boulouard, Flavie
AU - Ricou, Agathe
AU - Tubeuf, Hélène
AU - Soukarieh, Omar
AU - Gaildrat, Pascaline
AU - Riant, Florence
AU - Guillaud-Bataille, Marine
AU - Caputo, Sandrine M.
AU - Caux-Moncoutier, Virginie
AU - Boutry-Kryza, Nadia
AU - Bonnet-Dorion, Françoise
AU - Schultz, Ines
AU - Rossing, Maria
AU - Quenez, Olivier
AU - Goldenberg, Louis
AU - Harter, Valentin
AU - Parsons, Michael T.
AU - Spurdle, Amanda B.
AU - Frébourg, Thierry
AU - Martins, Alexandra
AU - Houdayer, Claude
AU - Krieger, Sophie
N1 - Publisher Copyright: © 2022 The Authors. Human Mutation published by Wiley Periodicals LLC.
PY - 2022
Y1 - 2022
AB - Modeling splicing is essential for tackling the challenge of variant interpretation as each nucleotide variation can be pathogenic by affecting pre-mRNA splicing via disruption/creation of splicing motifs such as 5′/3′ splice sites, branch sites, or splicing regulatory elements. Unfortunately, most in silico tools focus on a specific type of splicing motif, which is why we developed the Splicing Prediction Pipeline (SPiP) to perform, in one single bioinformatic analysis based on a machine learning approach, a comprehensive assessment of the variant effect on different splicing motifs. We gathered a curated set of 4616 variants scattered all along the sequence of 227 genes, with their corresponding splicing studies. The Bayesian analysis provided us with the number of control variants, that is, variants without impact on splicing, to mimic the deluge of variants from high-throughput sequencing data. Results show that SPiP can deal with the diversity of splicing alterations, with 83.13% sensitivity and 99% specificity to detect spliceogenic variants. Overall performance as measured by area under the receiving operator curve was 0.986, better than SpliceAI and SQUIRLS (0.965 and 0.766) for the same data set. SPiP lends itself to a unique suite for comprehensive prediction of spliceogenicity in the genomic medicine era. SPiP is available at: https://sourceforge.net/projects/splicing-prediction-pipeline/.
KW - machine learning
KW - RNA
KW - sequence variants
KW - SPiP
KW - splicing predictions
UR - http://www.scopus.com/inward/record.url?scp=85142228279&partnerID=8YFLogxK
U2 - 10.1002/humu.24491
DO - 10.1002/humu.24491
M3 - Journal article
C2 - 36273432
AN - SCOPUS:85142228279
VL - 43
SP - 2308
EP - 2323
JO - Human Mutation
JF - Human Mutation
SN - 1059-7794
IS - 12
ER -