SPiP: Splicing Prediction Pipeline, a machine learning tool for massive detection of exonic and intronic variant effects on mRNA splicing

Research output: Contribution to journal › Journal article › Research › peer-review

Standard

SPiP: Splicing Prediction Pipeline, a machine learning tool for massive detection of exonic and intronic variant effects on mRNA splicing. / Leman, Raphaël; Parfait, Béatrice; Vidaud, Dominique; Girodon, Emmanuelle; Pacot, Laurence; Le Gac, Gérald; Ka, Chandran; Ferec, Claude; Fichou, Yann; Quesnelle, Céline; Aucouturier, Camille; Muller, Etienne; Vaur, Dominique; Castera, Laurent; Boulouard, Flavie; Ricou, Agathe; Tubeuf, Hélène; Soukarieh, Omar; Gaildrat, Pascaline; Riant, Florence; Guillaud-Bataille, Marine; Caputo, Sandrine M.; Caux-Moncoutier, Virginie; Boutry-Kryza, Nadia; Bonnet-Dorion, Françoise; Schultz, Ines; Rossing, Maria; Quenez, Olivier; Goldenberg, Louis; Harter, Valentin; Parsons, Michael T.; Spurdle, Amanda B.; Frébourg, Thierry; Martins, Alexandra; Houdayer, Claude; Krieger, Sophie.

In: Human Mutation, Vol. 43, No. 12, 2022, p. 2308-2323.

Harvard

Leman, R, Parfait, B, Vidaud, D, Girodon, E, Pacot, L, Le Gac, G, Ka, C, Ferec, C, Fichou, Y, Quesnelle, C, Aucouturier, C, Muller, E, Vaur, D, Castera, L, Boulouard, F, Ricou, A, Tubeuf, H, Soukarieh, O, Gaildrat, P, Riant, F, Guillaud-Bataille, M, Caputo, SM, Caux-Moncoutier, V, Boutry-Kryza, N, Bonnet-Dorion, F, Schultz, I, Rossing, M, Quenez, O, Goldenberg, L, Harter, V, Parsons, MT, Spurdle, AB, Frébourg, T, Martins, A, Houdayer, C & Krieger, S 2022, 'SPiP: Splicing Prediction Pipeline, a machine learning tool for massive detection of exonic and intronic variant effects on mRNA splicing', Human Mutation, vol. 43, no. 12, pp. 2308-2323. https://doi.org/10.1002/humu.24491

APA

Leman, R., Parfait, B., Vidaud, D., Girodon, E., Pacot, L., Le Gac, G., Ka, C., Ferec, C., Fichou, Y., Quesnelle, C., Aucouturier, C., Muller, E., Vaur, D., Castera, L., Boulouard, F., Ricou, A., Tubeuf, H., Soukarieh, O., Gaildrat, P., ... Krieger, S. (2022). SPiP: Splicing Prediction Pipeline, a machine learning tool for massive detection of exonic and intronic variant effects on mRNA splicing. Human Mutation, 43(12), 2308-2323. https://doi.org/10.1002/humu.24491

Vancouver

Leman R, Parfait B, Vidaud D, Girodon E, Pacot L, Le Gac G et al. SPiP: Splicing Prediction Pipeline, a machine learning tool for massive detection of exonic and intronic variant effects on mRNA splicing. Human Mutation. 2022;43(12):2308-2323. https://doi.org/10.1002/humu.24491

Author

Leman, Raphaël ; Parfait, Béatrice ; Vidaud, Dominique ; Girodon, Emmanuelle ; Pacot, Laurence ; Le Gac, Gérald ; Ka, Chandran ; Ferec, Claude ; Fichou, Yann ; Quesnelle, Céline ; Aucouturier, Camille ; Muller, Etienne ; Vaur, Dominique ; Castera, Laurent ; Boulouard, Flavie ; Ricou, Agathe ; Tubeuf, Hélène ; Soukarieh, Omar ; Gaildrat, Pascaline ; Riant, Florence ; Guillaud-Bataille, Marine ; Caputo, Sandrine M. ; Caux-Moncoutier, Virginie ; Boutry-Kryza, Nadia ; Bonnet-Dorion, Françoise ; Schultz, Ines ; Rossing, Maria ; Quenez, Olivier ; Goldenberg, Louis ; Harter, Valentin ; Parsons, Michael T. ; Spurdle, Amanda B. ; Frébourg, Thierry ; Martins, Alexandra ; Houdayer, Claude ; Krieger, Sophie. / SPiP: Splicing Prediction Pipeline, a machine learning tool for massive detection of exonic and intronic variant effects on mRNA splicing. In: Human Mutation. 2022 ; Vol. 43, No. 12. pp. 2308-2323.

Bibtex

@article{e72ead5880fe409e81b402d4c38dafd6,
title = "SPiP: Splicing Prediction Pipeline, a machine learning tool for massive detection of exonic and intronic variant effects on mRNA splicing",
abstract = "Modeling splicing is essential for tackling the challenge of variant interpretation as each nucleotide variation can be pathogenic by affecting pre-mRNA splicing via disruption/creation of splicing motifs such as 5′/3′ splice sites, branch sites, or splicing regulatory elements. Unfortunately, most in silico tools focus on a specific type of splicing motif, which is why we developed the Splicing Prediction Pipeline (SPiP) to perform, in one single bioinformatic analysis based on a machine learning approach, a comprehensive assessment of the variant effect on different splicing motifs. We gathered a curated set of 4616 variants scattered all along the sequence of 227 genes, with their corresponding splicing studies. The Bayesian analysis provided us with the number of control variants, that is, variants without impact on splicing, to mimic the deluge of variants from high-throughput sequencing data. Results show that SPiP can deal with the diversity of splicing alterations, with 83.13% sensitivity and 99% specificity to detect spliceogenic variants. Overall performance as measured by area under the receiving operator curve was 0.986, better than SpliceAI and SQUIRLS (0.965 and 0.766) for the same data set. SPiP lends itself to a unique suite for comprehensive prediction of spliceogenicity in the genomic medicine era. SPiP is available at: https://sourceforge.net/projects/splicing-prediction-pipeline/.",
keywords = "machine learning, RNA, sequence variants, SPiP, splicing predictions",
author = "Rapha{\"e}l Leman and B{\'e}atrice Parfait and Dominique Vidaud and Emmanuelle Girodon and Laurence Pacot and {Le Gac}, G{\'e}rald and Chandran Ka and Claude Ferec and Yann Fichou and C{\'e}line Quesnelle and Camille Aucouturier and Etienne Muller and Dominique Vaur and Laurent Castera and Flavie Boulouard and Agathe Ricou and H{\'e}l{\`e}ne Tubeuf and Omar Soukarieh and Pascaline Gaildrat and Florence Riant and Marine Guillaud-Bataille and Caputo, {Sandrine M.} and Virginie Caux-Moncoutier and Nadia Boutry-Kryza and Fran{\c c}oise Bonnet-Dorion and Ines Schultz and Maria Rossing and Olivier Quenez and Louis Goldenberg and Valentin Harter and Parsons, {Michael T.} and Spurdle, {Amanda B.} and Thierry Fr{\'e}bourg and Alexandra Martins and Claude Houdayer and Sophie Krieger",
note = "Publisher Copyright: {\textcopyright} 2022 The Authors. Human Mutation published by Wiley Periodicals LLC.",
year = "2022",
doi = "10.1002/humu.24491",
language = "English",
volume = "43",
pages = "2308--2323",
journal = "Human Mutation",
issn = "1059-7794",
publisher = "John Wiley \& Sons, Inc.",
number = "12",

}

RIS

TY - JOUR

T1 - SPiP

T2 - Splicing Prediction Pipeline, a machine learning tool for massive detection of exonic and intronic variant effects on mRNA splicing

AU - Leman, Raphaël

AU - Parfait, Béatrice

AU - Vidaud, Dominique

AU - Girodon, Emmanuelle

AU - Pacot, Laurence

AU - Le Gac, Gérald

AU - Ka, Chandran

AU - Ferec, Claude

AU - Fichou, Yann

AU - Quesnelle, Céline

AU - Aucouturier, Camille

AU - Muller, Etienne

AU - Vaur, Dominique

AU - Castera, Laurent

AU - Boulouard, Flavie

AU - Ricou, Agathe

AU - Tubeuf, Hélène

AU - Soukarieh, Omar

AU - Gaildrat, Pascaline

AU - Riant, Florence

AU - Guillaud-Bataille, Marine

AU - Caputo, Sandrine M.

AU - Caux-Moncoutier, Virginie

AU - Boutry-Kryza, Nadia

AU - Bonnet-Dorion, Françoise

AU - Schultz, Ines

AU - Rossing, Maria

AU - Quenez, Olivier

AU - Goldenberg, Louis

AU - Harter, Valentin

AU - Parsons, Michael T.

AU - Spurdle, Amanda B.

AU - Frébourg, Thierry

AU - Martins, Alexandra

AU - Houdayer, Claude

AU - Krieger, Sophie

N1 - Publisher Copyright: © 2022 The Authors. Human Mutation published by Wiley Periodicals LLC.

PY - 2022

Y1 - 2022

N2 - Modeling splicing is essential for tackling the challenge of variant interpretation as each nucleotide variation can be pathogenic by affecting pre-mRNA splicing via disruption/creation of splicing motifs such as 5′/3′ splice sites, branch sites, or splicing regulatory elements. Unfortunately, most in silico tools focus on a specific type of splicing motif, which is why we developed the Splicing Prediction Pipeline (SPiP) to perform, in one single bioinformatic analysis based on a machine learning approach, a comprehensive assessment of the variant effect on different splicing motifs. We gathered a curated set of 4616 variants scattered all along the sequence of 227 genes, with their corresponding splicing studies. The Bayesian analysis provided us with the number of control variants, that is, variants without impact on splicing, to mimic the deluge of variants from high-throughput sequencing data. Results show that SPiP can deal with the diversity of splicing alterations, with 83.13% sensitivity and 99% specificity to detect spliceogenic variants. Overall performance as measured by area under the receiving operator curve was 0.986, better than SpliceAI and SQUIRLS (0.965 and 0.766) for the same data set. SPiP lends itself to a unique suite for comprehensive prediction of spliceogenicity in the genomic medicine era. SPiP is available at: https://sourceforge.net/projects/splicing-prediction-pipeline/.

AB - Modeling splicing is essential for tackling the challenge of variant interpretation as each nucleotide variation can be pathogenic by affecting pre-mRNA splicing via disruption/creation of splicing motifs such as 5′/3′ splice sites, branch sites, or splicing regulatory elements. Unfortunately, most in silico tools focus on a specific type of splicing motif, which is why we developed the Splicing Prediction Pipeline (SPiP) to perform, in one single bioinformatic analysis based on a machine learning approach, a comprehensive assessment of the variant effect on different splicing motifs. We gathered a curated set of 4616 variants scattered all along the sequence of 227 genes, with their corresponding splicing studies. The Bayesian analysis provided us with the number of control variants, that is, variants without impact on splicing, to mimic the deluge of variants from high-throughput sequencing data. Results show that SPiP can deal with the diversity of splicing alterations, with 83.13% sensitivity and 99% specificity to detect spliceogenic variants. Overall performance as measured by area under the receiving operator curve was 0.986, better than SpliceAI and SQUIRLS (0.965 and 0.766) for the same data set. SPiP lends itself to a unique suite for comprehensive prediction of spliceogenicity in the genomic medicine era. SPiP is available at: https://sourceforge.net/projects/splicing-prediction-pipeline/.

KW - machine learning

KW - RNA

KW - sequence variants

KW - SPiP

KW - splicing predictions

UR - http://www.scopus.com/inward/record.url?scp=85142228279&partnerID=8YFLogxK

U2 - 10.1002/humu.24491

DO - 10.1002/humu.24491

M3 - Journal article

C2 - 36273432

AN - SCOPUS:85142228279

VL - 43

SP - 2308

EP - 2323

JO - Human Mutation

JF - Human Mutation

SN - 1059-7794

IS - 12

ER -
