ClusCo: clustering and comparison of protein models (1302.4000v2)
Abstract: Background: The development, optimization and validation of protein modeling methods require efficient tools for structural comparison. Frequently, a large number of models need to be compared with the target native structure. ClusCo was developed primarily as a high-throughput tool for all-versus-all comparison, because calculating the similarity matrix is one of the bottlenecks in the protein modeling pipeline. Results: ClusCo is fast, easy-to-use software for high-throughput comparison of protein models using different similarity measures (cRMSD, dRMSD, GDT_TS, TM-Score, MaxSub, Contact Map Overlap) and for clustering the comparison results with standard methods: K-means Clustering or Hierarchical Agglomerative Clustering. Conclusions: The application is written in C/C++ and highly optimized, including code for parallel execution on the CPU and a GPU version of cRMSD, which results in a significant speedup over similar clustering and scoring programs.
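
To make the all-versus-all comparison concrete, the following is a minimal sketch of a dRMSD similarity matrix in plain Python. It is an illustration only, not ClusCo's C/C++ implementation: the function names (pairwise_distances, drmsd, drmsd_matrix) and the toy coordinates are assumptions introduced here. dRMSD is used because it compares intra-model distances and therefore needs no superposition step, unlike cRMSD.

```python
import math
from itertools import combinations


def pairwise_distances(coords):
    """Return all intra-model distances d_ij (i < j) in a fixed order."""
    return [math.dist(a, b) for a, b in combinations(coords, 2)]


def drmsd(dists_a, dists_b):
    """Distance RMSD between two models, given their precomputed distance lists."""
    if len(dists_a) != len(dists_b):
        raise ValueError("models must have the same number of atoms")
    if not dists_a:
        return 0.0
    squared = sum((x - y) ** 2 for x, y in zip(dists_a, dists_b))
    return math.sqrt(squared / len(dists_a))


def drmsd_matrix(models):
    """Symmetric all-versus-all dRMSD matrix for a list of models."""
    # Intra-model distances are computed once per model and reused for every pair.
    dists = [pairwise_distances(m) for m in models]
    n = len(models)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            matrix[i][j] = matrix[j][i] = drmsd(dists[i], dists[j])
    return matrix


# Three toy "models" of three atoms each (hypothetical coordinates).
models = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)],
    [(5.0, 5.0, 5.0), (6.0, 5.0, 5.0), (5.0, 6.0, 5.0)],  # translated copy of model 0
    [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (0.0, 2.0, 0.0)],  # model 0 scaled by 2
]
matrix = drmsd_matrix(models)
```

The number of pairs grows quadratically with the number of models, which is why the similarity matrix becomes a bottleneck for large model sets. A matrix of this kind is the typical input to a hierarchical clustering step.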