Data detail
| Data name | Cluster based on sequence comparison of homologous proteins of 95 organism species |
| DOI | 10.18908/lsdba.nbdc00464-002 |
| Description of data contents | Clustering is based on the E-values and overlap scores from an all-against-all (round-robin) BLASTP search of the amino acid sequence data listed under "Data acquisition method", with a similarity threshold for the homologs of each protein estimated heuristically by the entropy-optimized organism count method (Bioinformatics 2009 Mar 1;25(5):599-605). The data are provided as a CSV-format text file. |
| Data file | File name: gclust_cluster.zip; File size: 8.72 MB |
| Simple search URL | http://togodb.biosciencedbc.jp/togodb/view/gclust_cluster#en |
| Data acquisition method | Sequence data described in "Amino acid sequences of predicted proteins and their annotation for 95 organism species". |
| Data analysis method | All-against-all BLASTP search of the above amino acid sequence data, followed by heuristic estimation of a similarity threshold for the homologs of each protein by the entropy-optimized organism count method (Bioinformatics 2009 Mar 1;25(5):599-605). |
| Number of data entries | 206,764 entries |
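
Because the cluster data are distributed as a CSV text file inside a zip archive, a short loader may help when inspecting a downloaded copy. The following is a minimal Python sketch, not an official tool. The function name `read_csv_rows` is hypothetical, and the sketch assumes that the archive contains at least one member ending in `.csv` and that the file is UTF-8 encoded. Neither assumption is stated in the record above, and no column names are assumed.

```python
import csv
import io
import zipfile


def read_csv_rows(zip_path, encoding="utf-8"):
    """Return all rows of the first .csv member found in a zip archive.

    Assumptions (not confirmed by the data record): the archive holds
    at least one member whose name ends in ".csv", and that member is
    text in the given encoding.
    """
    with zipfile.ZipFile(zip_path) as archive:
        # Pick CSV members by extension; the real member name is unknown.
        csv_names = [n for n in archive.namelist() if n.lower().endswith(".csv")]
        if not csv_names:
            raise ValueError("no .csv member found in archive")
        with archive.open(csv_names[0]) as raw:
            # Wrap the binary stream so csv.reader receives text lines.
            text = io.TextIOWrapper(raw, encoding=encoding, newline="")
            return list(csv.reader(text))
```

Calling `read_csv_rows("gclust_cluster.zip")` on a local copy and taking `len()` of the result gives a row count that can be compared with the 206,764 entries listed above, bearing in mind that the file may or may not include a header row.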