Dataset details
United States
A study of quality measures for protein threading models
Background Prediction of protein structures is one of the fundamental challenges in biology today. To fully understand how well different prediction methods perform, it is necessary to use measures that evaluate their performance. Every two years, starting in 1994, the CASP (Critical Assessment of protein Structure Prediction) process has been organized to evaluate the ability of different predictors to blindly predict the structure of proteins. To capture different features of the models, several measures have been developed during the CASP processes. However, these measures have not been examined in detail before. In an attempt to develop fully automatic measures that can be used in CASP, as well as in other types of benchmarking experiments, we have compared twenty-one measures. These include the measures used in CASP3 and CASP2 as well as measures introduced later. We have studied their ability to distinguish between the better and worse models submitted to CASP3, and the correlation between them.
Results Using a small set of 1340 models for 23 different targets, we show that most methods correlate with each other. Most pairs of measures show a correlation coefficient of about 0.5. The correlation is slightly higher for measures of similar types. We found that a significant problem when developing automatic measures is how to deal with proteins of different lengths. Comparison between different measures is also complicated because many measures depend on the size of the target. We show that the manual assessment can be reproduced to about 70% using automatic measures. Alignment-independent measures detect slightly more of the models with the correct fold, while alignment-dependent measures agree better when selecting the best models for each target. Finally, we show that using automatic measures would, to a large extent, reproduce the assessors' ranking of the predictors at CASP3.
Conclusions We show that, given a sufficient number of targets, the manual and automatic measures would have given almost identical results at CASP3. If the intent is to reproduce the type of scoring done by the manual assessor in CASP3, the best approach might be to use a combination of alignment-independent and alignment-dependent measures, as used in several recent studies.
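The abstract reports correlation coefficients between pairs of quality measures. As a minimal sketch of that kind of comparison, the snippet below computes a Pearson correlation for every pair of measures over the same set of models. The measure names and score values are hypothetical placeholders, not data from the study, and the abstract does not state which correlation coefficient the authors used.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / sqrt(var_x * var_y)


def pairwise_correlations(scores):
    """Map each unordered pair of measure names to their correlation."""
    return {
        (a, b): pearson(scores[a], scores[b])
        for a, b in combinations(sorted(scores), 2)
    }


# Hypothetical scores: one value per model, same model order in every list.
scores = {
    "measure_a": [0.10, 0.35, 0.50, 0.80],
    "measure_b": [0.20, 0.30, 0.60, 0.70],
    "measure_c": [0.90, 0.40, 0.45, 0.10],
}

for pair, r in pairwise_correlations(scores).items():
    print(pair, round(r, 2))
```

Because many measures depend on target length, as the abstract notes, a real comparison would probably need to normalize scores or compute correlations per target before pooling.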
Dataset information
Related datasets
Cross-species molecular docking method to support predictions of species susceptibility to chemical effects
Public Data Portal
The advancement of protein structural prediction tools, exemplified by AlphaFold and Iterative Threading ASSEmbly Refinement, has enabled the prediction of protein structures across species based on available protein sequence and structural data. In this study, we introduce an innovative molecular docking method that capitalizes on this wealth of structural data to enhance predictions of chemical susceptibility across species. We demonstrated this method using the androgen receptor as a pertinent modulator of endocrine function. By using protein structures, this method contextualizes species susceptibility within a functional framework and helps to integrate molecular docking into the repertoire of New Approach Methodologies (NAMs) that support the Next-Generation Risk Assessment (NGRA) paradigm through the novel integration of various open-source tools. This dataset is associated with the following publication: Schumann, P., D. Chang, S. Mayasich, S. Vliet, T. Brown, and C. LaLone. Cross-species molecular docking method to support predictions of species susceptibility to chemical effects. Computational Toxicology. Elsevier B.V., Amsterdam, NETHERLANDS, 30(4): 100319, (2024).
Within the fold: assessing differential expression measures and reproducibility in microarray assays
Public Data Portal
'Fold-change' cutoffs have been widely used in microarray assays to identify genes that are differentially expressed. More accurate measures are required to identify high-confidence sets of genes with biologically meaningful changes in transcription. A general procedure for analyzing cDNA microarray data is proposed and validated. It is shown that pooled reference samples should be based not only on the expression of individual genes in each cell line but also on the expression levels of genes within cell lines.
Demonstration of the Sequence Alignment to Predict Across Species Susceptibility Tool for Rapid Assessment of Protein Conservation
Public Data Portal
Data file for "Vliet SMF, Hazemi M, Blatz D, Jensen M, Mayasich S, Transue TR, Simmons C, Wilkinson A, LaLone CA. Demonstration of the Sequence Alignment to Predict Across Species Susceptibility Tool for Rapid Assessment of Protein Conservation. J Vis Exp. 2023 Feb 10;(192). doi: 10.3791/63970. PMID: 36847398.". This dataset is associated with the following publication: Vliet, S., M. Hazemi, D. Blatz, M. Jensen, S. Mayasich, T. Transue, C. Simmons, A. Wilkinson, and C. LaLone. Demonstration of the Sequence Alignment to Predict Across Species Susceptibility Tool for Rapid Assessment of Protein Conservation. Journal of Visualized Experiments. JoVE, Somerville, MA, USA, 192, (2023).
Structure - Molecular Modeling Database (MMDB)
Public Data Portal
Three dimensional structures provide a wealth of information on the biological function and the evolutionary history of macromolecules. They can be used to examine sequence-structure-function relationships, interactions, active sites, and more.
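Gwangju Metropolitan City: status of HACCP-certified dairy processing establishments
Public Data Portal
Data on HACCP (Hazard Analysis and Critical Control Points, the food safety management certification standard) certified dairy processing establishments within Gwangju Metropolitan City, covering district name, address, certification number, and date of first certification.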
Protein
Public Data Portal
The Protein database is a collection of sequences from several sources, including translations from annotated coding regions in GenBank, RefSeq and TPA, as well as records from SwissProt, PIR, PRF, and PDB. Protein sequences are the fundamental determinants of biological structure and function.
Vector Alignment Search Tool (VAST)
Public Data Portal
A computer algorithm that identifies similar protein 3-dimensional structures. Structure neighbors for every structure in MMDB are pre-computed and accessible via links on the MMDB Structure Summary pages.