Global distance test

The global distance test (GDT), also written as GDT_TS to represent "total score", is a measure of similarity between two protein structures with known amino acid correspondences (e.g. identical amino acid sequences) but different tertiary structures. It is most commonly used to compare the results of protein structure prediction to the experimentally determined structure as measured by X-ray crystallography or protein NMR. The GDT metric, as described by its author Adam Zemla,[1] is intended as a more accurate measurement than the more common RMSD metric, which is sensitive to outlier regions created, for example, by poor modeling of individual loop regions in a structure that is otherwise reasonably accurate. GDT_TS measurements are used as major assessment criteria in the production of results from the Critical Assessment of Structure Prediction (CASP), a large-scale experiment in the structure prediction community dedicated to assessing current modeling techniques and identifying their primary deficiencies.[1][2][3] In general, the higher the GDT_TS, the better a given model is in comparison to the reference structure.

The GDT score is calculated as the largest set of amino acid residues' alpha carbon atoms in the model structure falling within a defined distance cutoff of their position in the experimental structure, after superimposing the two structures. By the original design (Patent No.: US 8,024,127 B2), the GDT algorithm calculates 20 GDT scores, one for each of 20 consecutive distance cutoffs (0.5 Å, 1.0 Å, 1.5 Å, ... 10.0 Å). Structure similarity assessment is intended to use the GDT scores from several cutoff distances, and scores generally increase with increasing cutoff. A plateau in this increase may indicate an extreme divergence between the experimental and predicted structures, such that no additional atoms are included at any cutoff of a reasonable distance (see GDT plots). The conventional GDT_TS total score in CASP is the average of the scores at cutoffs of 1, 2, 4, and 8 Å.[1][4]
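The averaging described above can be illustrated with a minimal Python sketch. It makes several simplifying assumptions: the two structures are taken to be already superimposed, whereas the LGA program searches over many superpositions to find the largest residue set for each cutoff, so a single fixed superposition can only give a lower bound on the true score; each score is expressed as a percentage of residues; and the function names and toy coordinates are hypothetical, chosen only for illustration.

```python
import math

def gdt_score(model_ca, ref_ca, cutoff):
    """Percentage of C-alpha atoms within `cutoff` (in Å) of their reference
    position. Assumes the coordinates are ALREADY superimposed; no search
    over superpositions is performed here."""
    if len(model_ca) != len(ref_ca) or not ref_ca:
        raise ValueError("structures must have the same non-zero length")
    within = sum(1 for m, r in zip(model_ca, ref_ca) if math.dist(m, r) <= cutoff)
    return 100.0 * within / len(ref_ca)

def gdt_curve(model_ca, ref_ca):
    """Scores at the 20 consecutive cutoffs 0.5, 1.0, ..., 10.0 Å."""
    cutoffs = [0.5 * i for i in range(1, 21)]
    return [(c, gdt_score(model_ca, ref_ca, c)) for c in cutoffs]

def gdt_ts(model_ca, ref_ca, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Average of the scores at the 1, 2, 4 and 8 Å cutoffs."""
    return sum(gdt_score(model_ca, ref_ca, c) for c in cutoffs) / len(cutoffs)

# Toy four-residue example (made-up coordinates, not real data):
# the model atoms deviate from the reference by 0.4, 1.6, 3.2 and 9.2 Å.
ref = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
model = [(0.0, 0.4, 0.0), (3.8, 1.6, 0.0), (7.6, 3.2, 0.0), (11.4, 9.2, 0.0)]

print(gdt_ts(model, ref))  # (25 + 50 + 75 + 75) / 4 = 56.25
```

In this toy case the curve stays at 75% from the 3.5 Å cutoff up to 9.0 Å and reaches 100% only at 9.5 Å, a small-scale version of the plateau behaviour described above.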

The original GDT_TS is calculated based on the superimpositions and GDT scores produced by the Local Global Alignment (LGA) program.[1] A high-accuracy version called GDT_HA uses smaller cutoff distances (half those of GDT_TS) and is thus more rigorous. It was used in the high accuracy category of CASP7.[5] CASP8 defined a new "TR score", which is GDT_TS minus a penalty for residues clustered too close together, meant to represent steric clashes invented by the predictor, sometimes to game the cutoff measure of GDT.[6][7] The primary GDT assessment uses only the Cα atoms. To apply superposition-based scoring to the functional ends of protein sidechains, a GDT-like score called global distance calculation for sidechains (GDC_sc) was designed and implemented within the LGA program in 2008.[1][8] Instead of comparing residue positions on the basis of Cα atoms, GDC_sc uses a characteristic atom near the end of each sidechain type to evaluate residue–residue distance deviations. An "all atoms" variant of the GDC score (GDC_all) is calculated using full-model information, and is one of the standard measures used by CASP's organizers and assessors to evaluate the accuracy of predicted structural models.[8][9]
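The effect of halving the cutoffs can be seen in a short Python sketch that starts from per-residue Cα deviations. This is only an illustration under stated assumptions: the deviations are taken as given from one fixed superposition (LGA's search over superpositions is not reproduced), the function name is hypothetical, and the numbers are made up.

```python
def average_gdt(distances, cutoffs):
    """Average, over `cutoffs`, of the percentage of residues whose
    C-alpha deviation (in Å) is within each cutoff."""
    if not distances:
        raise ValueError("need at least one residue")
    scores = [
        100.0 * sum(1 for d in distances if d <= c) / len(distances)
        for c in cutoffs
    ]
    return sum(scores) / len(scores)

GDT_TS_CUTOFFS = (1.0, 2.0, 4.0, 8.0)
GDT_HA_CUTOFFS = (0.5, 1.0, 2.0, 4.0)  # half of each GDT_TS cutoff

# Hypothetical per-residue deviations in Å for a four-residue model.
deviations = [0.4, 1.6, 3.2, 9.2]

print(average_gdt(deviations, GDT_TS_CUTOFFS))  # 56.25
print(average_gdt(deviations, GDT_HA_CUTOFFS))  # 43.75
```

For the same model the stricter cutoffs can never give a higher value, which is why GDT_HA discriminates better among models that are already close to the reference.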

See also

  • Root mean square deviation (bioinformatics): a different structure comparison measure.
  • TM-score: a different structure comparison measure.
  • Longest continuous segment (LCS): a different structure comparison measure implemented within the LGA program.
  • Global distance calculation (GDC_sc, GDC_all): structure comparison measures that use full-model information (not just α-carbon) to assess similarity.
  • Local global alignment (LGA): structure alignment program and structure similarity measures GDT, GDC, LCS, and LGA_S.

References

  1. Zemla A (2003). "LGA: A method for finding 3D similarities in protein structures". Nucleic Acids Research. 31 (13): 3370–3374. doi:10.1093/nar/gkg571. PMC 168977. PMID 12824330.
  2. Zemla A, Venclovas C, Moult J, Fidelis K (1999). "Processing and analysis of CASP3 protein structure predictions". Proteins. S3: 22–29. doi:10.1002/(SICI)1097-0134(1999)37:3+<22::AID-PROT5>3.0.CO;2-W. PMID 10526349.
  3. Zemla A, Venclovas C, Moult J, Fidelis K (2001). "Processing and evaluation of predictions in CASP4". Proteins. 45 (S5): 13–21. doi:10.1002/prot.10052. PMID 11835478.
  4. Kryshtafovych A, Prlic A, Dmytriv Z, Daniluk P, Milostan M, Eyrich V, Hubbard T, Fidelis K (2007). "New tools and expanded data analysis capabilities at the Protein Structure Prediction Center". Proteins. 69 (S8): 19–26. doi:10.1002/prot.21653. PMC 2656758. PMID 17705273.
  5. Read RJ, Chavali G (2007). "Assessment of CASP7 predictions in the high accuracy template-based modeling category". Proteins. 69 (S8): 27–37. doi:10.1002/prot.21662. PMID 17894351.
  6. Shi S, Pei J, Sadreyev RI, Kinch LN, Majumdar I, Tong J, Cheng H, Kim BH, Grishin NV (2009). "Analysis of CASP8 targets, predictions and assessment methods". Database: The Journal of Biological Databases and Curation. 2009: bap003. doi:10.1093/database/bap003. PMC 2794793. PMID 20157476.
  7. Sadreyev RI, Shi S, Baker D, Grishin NV (2009). "Structure similarity measure with penalty for close non-equivalent residues". Bioinformatics. 25 (10): 1259–1263. doi:10.1093/bioinformatics/btp148. PMC 2677741. PMID 19321733.
  8. Keedy DA, Williams CJ, Headd JJ, Arendall WB, Chen VB, Kapral GJ, Gillespie RA, Block JN, Zemla A, Richardson DC, Richardson JS (2009). "The other 90% of the protein: Assessment beyond the α-carbon for CASP8 template-based and high-accuracy models". Proteins. 77 (S9): 29–49. doi:10.1002/prot.22551. PMC 2877634. PMID 19731372.
  9. Modi V, Xu QF, Adhikari S, Dunbrack RL (2016). "Assessment of template‐based modeling of protein structure in CASP11". Proteins. 84: 200–220. doi:10.1002/prot.25049. PMC 5030193. PMID 27081927.
This article is issued from Wikipedia. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.