Accurate continuous geographic assignment from low- to high-density SNP data

Publication: Contribution to journal › Journal article › Research › Peer-reviewed

Standard

Accurate continuous geographic assignment from low- to high-density SNP data. / Guillot, Gilles; Jónsson, Hákon; Hinge, Antoine; Manchih, Nabil; Orlando, Ludovic Antoine Alexandre.

In: Bioinformatics, Vol. 32, No. 7, 2016, p. 1106-1108.

Harvard

Guillot, G, Jónsson, H, Hinge, A, Manchih, N & Orlando, LAA 2016, 'Accurate continuous geographic assignment from low- to high-density SNP data', Bioinformatics, vol. 32, no. 7, pp. 1106-1108. https://doi.org/10.1093/bioinformatics/btv703

APA

Guillot, G., Jónsson, H., Hinge, A., Manchih, N., & Orlando, L. A. A. (2016). Accurate continuous geographic assignment from low- to high-density SNP data. Bioinformatics, 32(7), 1106-1108. https://doi.org/10.1093/bioinformatics/btv703

Vancouver

Guillot G, Jónsson H, Hinge A, Manchih N, Orlando LAA. Accurate continuous geographic assignment from low- to high-density SNP data. Bioinformatics. 2016;32(7):1106-1108. https://doi.org/10.1093/bioinformatics/btv703

Author

Guillot, Gilles ; Jónsson, Hákon ; Hinge, Antoine ; Manchih, Nabil ; Orlando, Ludovic Antoine Alexandre. / Accurate continuous geographic assignment from low- to high-density SNP data. In: Bioinformatics. 2016 ; Vol. 32, No. 7. pp. 1106-1108.

Bibtex

@article{bff59e2f993d4f36b1a69785b521cf52,
title = "Accurate continuous geographic assignment from low- to high-density SNP data",
abstract = "MOTIVATION: Large-scale genotype datasets can help track the dispersal patterns of epidemiological outbreaks and predict the geographic origins of individuals. Such genetically-based geographic assignments also show a range of possible applications in forensics for profiling both victims and criminals, and in wildlife management, where poaching hotspot areas can be located. They, however, require fast and accurate statistical methods to handle the growing amount of genetic information made available from genotype arrays and next-generation sequencing technologies. RESULTS: We introduce a novel statistical method for geopositioning individuals of unknown origin from genotypes. Our method is based on a geostatistical model trained with a dataset of georeferenced genotypes. Statistical inference under this model can be implemented within the theoretical framework of Integrated Nested Laplace Approximation, which represents one of the major recent breakthroughs in statistics, as it does not require Monte Carlo simulations. We compare the performance of our method and an alternative method for geospatial inference, SPA, in a simulation framework. We highlight the accuracy and limits of continuous spatial assignment methods at various scales by analyzing genotype datasets from a diversity of species, including Florida Scrub-jay birds (Aphelocoma coerulescens), Arabidopsis thaliana and humans, representing 41-197,146 SNPs. Our method appears to be best suited for the analysis of medium-sized datasets (a few tens of thousands of loci), such as reduced-representation sequencing data that become increasingly available in ecology. AVAILABILITY AND IMPLEMENTATION: http://www2.imm.dtu.dk/{\textasciitilde}gigu/Spasiba/ CONTACT: gilles.b.guillot@gmail.com SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.",
author = "Gilles Guillot and H{\'a}kon J{\'o}nsson and Antoine Hinge and Nabil Manchih and Orlando, {Ludovic Antoine Alexandre}",
note = "{\textcopyright} The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.",
year = "2016",
doi = "10.1093/bioinformatics/btv703",
language = "English",
volume = "32",
pages = "1106--1108",
journal = "Bioinformatics",
issn = "1367-4803",
publisher = "Oxford University Press",
number = "7",

}

RIS

TY - JOUR

T1 - Accurate continuous geographic assignment from low- to high-density SNP data

AU - Guillot, Gilles

AU - Jónsson, Hákon

AU - Hinge, Antoine

AU - Manchih, Nabil

AU - Orlando, Ludovic Antoine Alexandre

N1 - © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

PY - 2016

Y1 - 2016

N2 - MOTIVATION: Large-scale genotype datasets can help track the dispersal patterns of epidemiological outbreaks and predict the geographic origins of individuals. Such genetically-based geographic assignments also show a range of possible applications in forensics for profiling both victims and criminals, and in wildlife management, where poaching hotspot areas can be located. They, however, require fast and accurate statistical methods to handle the growing amount of genetic information made available from genotype arrays and next-generation sequencing technologies. RESULTS: We introduce a novel statistical method for geopositioning individuals of unknown origin from genotypes. Our method is based on a geostatistical model trained with a dataset of georeferenced genotypes. Statistical inference under this model can be implemented within the theoretical framework of Integrated Nested Laplace Approximation, which represents one of the major recent breakthroughs in statistics, as it does not require Monte Carlo simulations. We compare the performance of our method and an alternative method for geospatial inference, SPA, in a simulation framework. We highlight the accuracy and limits of continuous spatial assignment methods at various scales by analyzing genotype datasets from a diversity of species, including Florida Scrub-jay birds (Aphelocoma coerulescens), Arabidopsis thaliana and humans, representing 41-197,146 SNPs. Our method appears to be best suited for the analysis of medium-sized datasets (a few tens of thousands of loci), such as reduced-representation sequencing data that become increasingly available in ecology. AVAILABILITY AND IMPLEMENTATION: http://www2.imm.dtu.dk/~gigu/Spasiba/ CONTACT: gilles.b.guillot@gmail.com SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

AB - MOTIVATION: Large-scale genotype datasets can help track the dispersal patterns of epidemiological outbreaks and predict the geographic origins of individuals. Such genetically-based geographic assignments also show a range of possible applications in forensics for profiling both victims and criminals, and in wildlife management, where poaching hotspot areas can be located. They, however, require fast and accurate statistical methods to handle the growing amount of genetic information made available from genotype arrays and next-generation sequencing technologies. RESULTS: We introduce a novel statistical method for geopositioning individuals of unknown origin from genotypes. Our method is based on a geostatistical model trained with a dataset of georeferenced genotypes. Statistical inference under this model can be implemented within the theoretical framework of Integrated Nested Laplace Approximation, which represents one of the major recent breakthroughs in statistics, as it does not require Monte Carlo simulations. We compare the performance of our method and an alternative method for geospatial inference, SPA, in a simulation framework. We highlight the accuracy and limits of continuous spatial assignment methods at various scales by analyzing genotype datasets from a diversity of species, including Florida Scrub-jay birds (Aphelocoma coerulescens), Arabidopsis thaliana and humans, representing 41-197,146 SNPs. Our method appears to be best suited for the analysis of medium-sized datasets (a few tens of thousands of loci), such as reduced-representation sequencing data that become increasingly available in ecology. AVAILABILITY AND IMPLEMENTATION: http://www2.imm.dtu.dk/~gigu/Spasiba/ CONTACT: gilles.b.guillot@gmail.com SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

U2 - 10.1093/bioinformatics/btv703

DO - 10.1093/bioinformatics/btv703

M3 - Journal article

C2 - 26615214

VL - 32

SP - 1106

EP - 1108

JO - Bioinformatics

JF - Bioinformatics

SN - 1367-4803

IS - 7

ER -

ID: 160577155