GISA: using Gauss Integrals to identify rare conformations in protein structures

Publication: Contribution to journal; Journal article; Research; peer-reviewed

Standard

GISA: using Gauss Integrals to identify rare conformations in protein structures. / Grønbæk, Christian; Hamelryck, Thomas; Røgen, Peter.

In: PeerJ, Vol. 8, e9159, 2020.

Harvard

Grønbæk, C, Hamelryck, T & Røgen, P 2020, 'GISA: using Gauss Integrals to identify rare conformations in protein structures', PeerJ, vol. 8, e9159. https://doi.org/10.7717/peerj.9159

APA

Grønbæk, C., Hamelryck, T., & Røgen, P. (2020). GISA: using Gauss Integrals to identify rare conformations in protein structures. PeerJ, 8, [e9159]. https://doi.org/10.7717/peerj.9159

Vancouver

Grønbæk C, Hamelryck T, Røgen P. GISA: using Gauss Integrals to identify rare conformations in protein structures. PeerJ. 2020;8. e9159. https://doi.org/10.7717/peerj.9159

Author

Grønbæk, Christian; Hamelryck, Thomas; Røgen, Peter. / GISA: using Gauss Integrals to identify rare conformations in protein structures. In: PeerJ. 2020; Vol. 8.

BibTeX

@article{4dbfaa8e785c4e519408dae052cb1cee,
title = "GISA: using Gauss Integrals to identify rare conformations in protein structures",
abstract = "The native structure of a protein is important for its function, and therefore methods for exploring protein structures have attracted much research. However, rather few methods are sensitive to topologic-geometric features, the examples being knots, slipknots, lassos, links, and pokes, and with each method aimed only for a specific set of such configurations. We here propose a general method which transforms a structure into a {"}fingerprint of topological-geometric values{"} consisting in a series of real-valued descriptors from mathematical Knot Theory. The extent to which a structure contains unusual configurations can then be judged from this fingerprint. The method is not confined to a particular pre-defined topology or geometry (like a knot or a poke), and so, unlike existing methods, it is general. To achieve this our new algorithm, GISA, as a key novelty produces the descriptors, so called Gauss integrals, not only for the full chains of a protein but for all its sub-chains. This allows fingerprinting on any scale from local to global. The Gauss integrals are known to be effective descriptors of global protein folds. Applying GISA to sets of several thousand high resolution structures, we first show how the most basic Gauss integral, the writhe, enables swift identification of pre-defined geometries such as pokes and links. We then apply GISA with no restrictions on geometry, to show how it allows identifying rare conformations by finding rare invariant values only. In this unrestricted search, pokes and links are still found, but also knotted conformations, as well as more highly entangled configurations not previously described. Thus, an application of the basic scan method in GISA's tool-box revealed 10 known cases of knots as the top positive writhe cases, while placing at the top of the negative writhe 14 cases in cis-trans isomerases sharing a spatial motif of little secondary structure content, which possibly has gone unnoticed. 
Possible general applications of GISA are fold classification and structural alignment based on local Gauss integrals. Others include finding errors in protein models and identifying unusual conformations that might be important for protein folding and function. By its broad potential, we believe that GISA will be of general benefit to the structural bioinformatics community. GISA is coded in C and comes as a command line tool. Source and compiled code for GISA plus read-me and examples are publicly available at GitHub (https://github.com).",
keywords = "Protein structure analysis, Knots and links, Gauss integrals, Rare conformations, Sub-chains, Fast algorithm, Database scan, Topology, Geometry",
author = "Christian Gr{\o}nb{\ae}k and Thomas Hamelryck and Peter R{\o}gen",
year = "2020",
doi = "10.7717/peerj.9159",
language = "English",
volume = "8",
journal = "PeerJ",
issn = "2167-8359",
publisher = "PeerJ",

}

RIS

TY - JOUR

T1 - GISA

T2 - using Gauss Integrals to identify rare conformations in protein structures

AU - Grønbæk, Christian

AU - Hamelryck, Thomas

AU - Røgen, Peter

PY - 2020

Y1 - 2020

N2 - The native structure of a protein is important for its function, and therefore methods for exploring protein structures have attracted much research. However, rather few methods are sensitive to topologic-geometric features, the examples being knots, slipknots, lassos, links, and pokes, and with each method aimed only for a specific set of such configurations. We here propose a general method which transforms a structure into a "fingerprint of topological-geometric values" consisting in a series of real-valued descriptors from mathematical Knot Theory. The extent to which a structure contains unusual configurations can then be judged from this fingerprint. The method is not confined to a particular pre-defined topology or geometry (like a knot or a poke), and so, unlike existing methods, it is general. To achieve this our new algorithm, GISA, as a key novelty produces the descriptors, so called Gauss integrals, not only for the full chains of a protein but for all its sub-chains. This allows fingerprinting on any scale from local to global. The Gauss integrals are known to be effective descriptors of global protein folds. Applying GISA to sets of several thousand high resolution structures, we first show how the most basic Gauss integral, the writhe, enables swift identification of pre-defined geometries such as pokes and links. We then apply GISA with no restrictions on geometry, to show how it allows identifying rare conformations by finding rare invariant values only. In this unrestricted search, pokes and links are still found, but also knotted conformations, as well as more highly entangled configurations not previously described. Thus, an application of the basic scan method in GISA's tool-box revealed 10 known cases of knots as the top positive writhe cases, while placing at the top of the negative writhe 14 cases in cis-trans isomerases sharing a spatial motif of little secondary structure content, which possibly has gone unnoticed. 
Possible general applications of GISA are fold classification and structural alignment based on local Gauss integrals. Others include finding errors in protein models and identifying unusual conformations that might be important for protein folding and function. By its broad potential, we believe that GISA will be of general benefit to the structural bioinformatics community. GISA is coded in C and comes as a command line tool. Source and compiled code for GISA plus read-me and examples are publicly available at GitHub (https://github.com).

AB - The native structure of a protein is important for its function, and therefore methods for exploring protein structures have attracted much research. However, rather few methods are sensitive to topologic-geometric features, the examples being knots, slipknots, lassos, links, and pokes, and with each method aimed only for a specific set of such configurations. We here propose a general method which transforms a structure into a "fingerprint of topological-geometric values" consisting in a series of real-valued descriptors from mathematical Knot Theory. The extent to which a structure contains unusual configurations can then be judged from this fingerprint. The method is not confined to a particular pre-defined topology or geometry (like a knot or a poke), and so, unlike existing methods, it is general. To achieve this our new algorithm, GISA, as a key novelty produces the descriptors, so called Gauss integrals, not only for the full chains of a protein but for all its sub-chains. This allows fingerprinting on any scale from local to global. The Gauss integrals are known to be effective descriptors of global protein folds. Applying GISA to sets of several thousand high resolution structures, we first show how the most basic Gauss integral, the writhe, enables swift identification of pre-defined geometries such as pokes and links. We then apply GISA with no restrictions on geometry, to show how it allows identifying rare conformations by finding rare invariant values only. In this unrestricted search, pokes and links are still found, but also knotted conformations, as well as more highly entangled configurations not previously described. Thus, an application of the basic scan method in GISA's tool-box revealed 10 known cases of knots as the top positive writhe cases, while placing at the top of the negative writhe 14 cases in cis-trans isomerases sharing a spatial motif of little secondary structure content, which possibly has gone unnoticed. 
Possible general applications of GISA are fold classification and structural alignment based on local Gauss integrals. Others include finding errors in protein models and identifying unusual conformations that might be important for protein folding and function. By its broad potential, we believe that GISA will be of general benefit to the structural bioinformatics community. GISA is coded in C and comes as a command line tool. Source and compiled code for GISA plus read-me and examples are publicly available at GitHub (https://github.com).

KW - Protein structure analysis

KW - Knots and links

KW - Gauss integrals

KW - Rare conformations

KW - Sub-chains

KW - Fast algorithm

KW - Database scan

KW - Topology

KW - Geometry

U2 - 10.7717/peerj.9159

DO - 10.7717/peerj.9159

M3 - Journal article

C2 - 32566389

VL - 8

JO - PeerJ

JF - PeerJ

SN - 2167-8359

M1 - e9159

ER -
