Bioinformatics. 2014 May 1;30(9):1236-40.
doi: 10.1093/bioinformatics/btu031. Epub 2014 Jan 21.

InterProScan 5: genome-scale protein function classification

Philip Jones et al. Bioinformatics.

Abstract

Robust large-scale sequence analysis is a major challenge in modern genomic science, where biologists are frequently trying to characterize many millions of sequences. Here, we describe a new Java-based architecture for the widely used protein function prediction software package InterProScan. Developments include improvements and additions to the outputs of the software and the complete reimplementation of the software framework, resulting in a flexible and stable system that can use multiprocessor machines, conventional clusters, or both to achieve scalable distributed data analysis. InterProScan is freely available for download from the EMBL-EBI FTP site and the open source code is hosted at Google Code.

Figures

Fig. 1. Comparison of the processing steps used by two different member database applications, TMHMM and Pfam.
Fig. 2. Overall system architecture of InterProScan 5.
Fig. 3. Use of JMS to manage allocation of jobs across a compute resource. This figure shows the primary tier of Master JVM-spawned workers. Jobs are added to a RequestQueue by the Master JVM, and any available worker JVMs will poll this queue to request work.
Fig. 4. Portion of the graphical output from InterProScan 5. This view of a protein’s match data is the same in both the HTML and SVG formats.
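
The queue-polling pattern described in the Fig. 3 caption can be illustrated with a minimal sketch. This is not InterProScan's code and does not use JMS: it assumes an in-process `java.util.concurrent.BlockingQueue` in place of the JMS RequestQueue, and threads in place of worker JVMs. The class name `RequestQueueSketch`, the method `runJobs`, and the job names are hypothetical, chosen only for illustration.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration of a master filling a request queue that workers poll.
public class RequestQueueSketch {

    /** Runs every job once and returns a map from job name to the worker that took it. */
    static Map<String, String> runJobs(List<String> jobs, int workerCount) {
        // The "master" places all jobs on the request queue before workers start.
        BlockingQueue<String> requestQueue = new LinkedBlockingQueue<>(jobs);
        Map<String, String> handledBy = new ConcurrentHashMap<>();

        // Threads stand in for worker JVMs (an assumption of this sketch).
        ExecutorService workers = Executors.newFixedThreadPool(workerCount);
        for (int i = 0; i < workerCount; i++) {
            String workerName = "worker-" + i;
            workers.submit(() -> {
                String job;
                // poll() returns null once the queue is empty, which ends this worker.
                while ((job = requestQueue.poll()) != null) {
                    handledBy.put(job, workerName);
                }
            });
        }

        workers.shutdown();
        try {
            if (!workers.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return handledBy;
    }

    public static void main(String[] args) {
        Map<String, String> result = runJobs(List.of("seq-1", "seq-2", "seq-3", "seq-4"), 2);
        result.forEach((job, worker) -> System.out.println(job + " handled by " + worker));
    }
}
```

Because each worker pulls work only when it is free, faster workers naturally take more jobs, so which worker handles which job varies between runs. Job names are assumed to be unique here, since they serve as map keys.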
