Mass spectrometry-based proteomics data from thousands of HeLa control samples

Publication: Contribution to journal, journal article, research, peer-reviewed

Here we provide a curated, large-scale, label-free mass spectrometry-based proteomics data set derived from HeLa cell lines for general-purpose machine learning and analysis. Accessing and filtering data is tedious and takes up a considerable amount of researchers' time. We therefore provide machine-based metadata for easy selection and overview, alongside the 7,444 raw files and the MaxQuant search output. For convenience, we provide three filtered and aggregated development data sets at the protein group, peptide and precursor level. In addition to easy-to-access training data, we provide an SDRF file annotating each raw file with instrument settings, which allows automated reprocessing. By providing our workflows and analysis scripts, we encourage others to enlarge this data set with instrument runs of further HeLa samples from different machine types.
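As a rough illustration of how a per-file SDRF annotation can support selection and automated reprocessing, the sketch below groups raw files by instrument. It assumes a tab-separated SDRF with the columns `comment[data file]` and `comment[instrument]`, as commonly used in SDRF-Proteomics; the actual columns in the published SDRF file may differ, and the file names and instrument labels shown here are hypothetical.

```python
import csv
import io
from collections import defaultdict

# Hypothetical SDRF excerpt (tab-separated). File names and instrument
# labels are made up for illustration; they are not taken from the data set.
SDRF_TEXT = (
    "source name\tcomment[data file]\tcomment[instrument]\n"
    "HeLa 1\trun_0001.raw\tQ Exactive HF\n"
    "HeLa 2\trun_0002.raw\tOrbitrap Exploris 480\n"
    "HeLa 3\trun_0003.raw\tQ Exactive HF\n"
)


def files_by_instrument(sdrf_text):
    """Group raw file names by the instrument recorded for each run."""
    reader = csv.DictReader(io.StringIO(sdrf_text), delimiter="\t")
    groups = defaultdict(list)
    for row in reader:
        # Column names are assumptions; adjust them to the real SDRF header.
        groups[row["comment[instrument]"]].append(row["comment[data file]"])
    return dict(groups)


groups = files_by_instrument(SDRF_TEXT)
for instrument, files in sorted(groups.items()):
    print(instrument, len(files), files)
```

To use this with a real SDRF file, the in-memory string would be replaced by the file's contents, and the grouping key could be any other annotated instrument setting.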
Original language: English
Article number: 112
Journal: Scientific Data
Volume: 11
Number of pages: 7
ISSN: 2052-4463
DOI
Status: Published - 2024

Bibliographical note

Funding Information:
We would like to acknowledge Rebeca Quinones, Mario Oroshi and John Damm Sørensen for their huge effort to retrieve all QC and MNT HeLa raw files for the creation of our development data set. Furthermore, we would like to thank Jeppe Madsen and Martin Rykær for their time and expertise. Lastly, we would like to thank the proteomics groups that have kindly shared their MNT and QC HeLa files, both at the Novo Nordisk Foundation Center for Protein Research and at the Proteomics and Signal Transduction department of the Max Planck Institute of Biochemistry. H.W. was supported by the Novo Nordisk Foundation (grant NNF19SA0035440). H.W. and S.R. were supported by the Novo Nordisk Foundation (grants NNF14CC0001 and NNF21SA0072102). A.B.N. was supported by the Novo Nordisk Foundation (grant NNF14CC0001). Y.P.-R. acknowledges funding from EMBL core funding, Wellcome grants (208391/Z/17/Z, 223745/Z/21/Z), and the EU H2020 project EPIC-XS [823839]. Finally, this work would not have been possible without many researchers doing experiments and running files.

Publisher Copyright:
© 2024, The Author(s).
