IUBIO ARCHIVE FOR BIOLOGY

IUBio Archive is an archive of biology data and software. The archive includes items to browse, search and fetch public software, molecular data, biology news and documents. It is a public archive that you will find at the Internet address iubio.bio.indiana.edu. This archive, maintained at the Indiana University Biology department since 1989, has moved recently from one computer to another.

Access to the archive is via HTTP (World Wide Web), Internet Gopher, anonymous FTP (file transfer) and e-mail programs that connect to computers on the Internet.

  ftp:, gopher:, http://iubio.bio.indiana.edu/
  mailto: archive@iubio.bio.indiana.edu

Molecular biology is the area of concentration, and the archive is also a home for Drosophila research data. The archive maintains software for all computer systems important to biology. Public software categories served here include

  biology, chemistry, science, utilities,
  molecular biology sections including
    alignment, codon, autoseq, browsing, consensus, evolve, pattern,
    primer, restrict-enzymes, rna-fold, search,
    ibmpc, mac, mswin, unix, vax

See also the IUBio Bio-Mirror archive of large data sets: ftp to iubio.bio.indiana.edu, cd /biomirror. This includes GenBank, EMBL, DDBJ and other biosequence data.

Search services include

  GenBank nucleic databank (WAIS query of full release and weekly updates)
  Swiss-Prot and PIR protein databanks
  Bionet news (full, searchable archive since Dec 1990)
  SRS (Sequence Retrieval System) - including Genbank, protein and others, for both simple and sophisticated text/annotation queries of databanks
  SRS-FastA - for sequence similarity searches against any user-defined subset of Genbank & protein data, including subsets for popular species: Arabidopsis, Caenorhabditis, Drosophila, Oryza, Poaceae, Rattus, Murinae, and Saccharomyces

This is a user- and developer-supported archive. Authors of software are invited to send their work to the Incoming folder by FTP.
E-mail access -- for most IUBio services, including software, Genbank, Swiss-protein, FlyBase and other databank searches, Bionet news searches, and software archives, send a mail message to archive@iubio.bio.indiana.edu

FlyBase, a Drosophila Genome Database service, is now accessible separately at

  ftp:, http://flybase.bio.indiana.edu/
  mailto: flybase-server@flybase.bio.indiana.edu

These IP addresses and computer names can and do change, but the service name (iubio or flybase) will be maintained.

ACCESS TO IUBio ARCHIVE
-----------------------

This IUBio Archive is on the Internet network of computers with the name IUBio.bio.indiana.edu. The actual host computer and Internet number for this archive may change. An alternate Internet name that has been in use for several years, and will continue to work, is FTP.Bio.Indiana.Edu.

The best way to use the resources of this archive now is thru either Internet Gopher or World Wide Web. These have several advantages over FTP:

  Gopher and WWW are easier to use than FTP;
  Gopher and WWW allow a richer set of information services, including searching thru indexed data, and retrieving "typed" files, such as pictures and formatted documents;
  Gopher and WWW cause less network traffic and load on the archive computer.

From a gopher program, connect to iubio.bio.indiana.edu as the gopher server. From a WWW program, connect to this URL: http://iubio.bio.indiana.edu/

If your computer system is linked to the Internet, it probably has an FTP program even if it doesn't have an Internet Gopher program. Each FTP program has its own peculiarities, but most follow a general syntax:

  ftp ftp.bio.indiana.edu   -- connect to archive computer
  user: anonymous           -- log on to archive computer
  password: your e-mail address
  ? or help                 -- general help for ftp
  cd subdirectory           -- change to subdirectory
  cd ..                     -- change to parent directory
  binary                    -- use full binary transfer
  ascii                     -- use text transfer
  get any.file              -- fetch a file from the archive
  put my.file               -- put a file to the archive
                               (only for Incoming/ directory)
  bye                       -- close the connection

This archive uses Unix conventions and software. *PLEASE NOTE*, all FTP commands and file names are CASE-Sensitive. Most are all lowercase, but you will need to use your shift key in some places, like "Incoming" and "Readme". See below for a detailed example session.

For all IUBio services, including Genbank, Swiss-protein, FlyBase and other databank searches, Bionet news searches, and software archives, send a mail message to gopher@iubio.bio.indiana.edu

For only FlyBase Drosophila information, send the mail message to gopher@flybase.bio.indiana.edu

In both cases, you start by sending mail with any or no subject, and any or no message body. GopherMail will reply by sending you its main gopher menu. The return mail provides you a menu of selections; you mark your choices and return that mail. Repeat this process to select many items. Use the "Subject:" of your mail message for queries to search items like Genbank or FlyBase Genes. Send the word "help" in the subject line to get help on this service. GopherMail software was written by Fred Bremmer.
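The GopherMail exchange described above starts with an ordinary mail message. As a rough illustration only, the following Python sketch builds (but does not send) such a request using the standard email library. The function name and the sender address are made up for the example; the two server addresses are the ones given in the text, and whether they still answer mail is not something this sketch can promise.

```python
from email.message import EmailMessage

# Server addresses as given in the text above.
IUBIO_GOPHERMAIL = "gopher@iubio.bio.indiana.edu"
FLYBASE_GOPHERMAIL = "gopher@flybase.bio.indiana.edu"


def gophermail_request(sender, subject="", body="", flybase_only=False):
    """Build a GopherMail request message without sending it.

    Hypothetical helper: an empty subject and body ask for the main menu,
    the subject "help" asks for help, and any other subject is meant as a
    query when replying to a searchable menu item.
    """
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = FLYBASE_GOPHERMAIL if flybase_only else IUBIO_GOPHERMAIL
    if subject:
        msg["Subject"] = subject
    if body:
        # When replying, the body would hold the returned menu with the
        # chosen items marked.
        msg.set_content(body)
    return msg


if __name__ == "__main__":
    request = gophermail_request("you@example.edu", subject="help")
    print(request["To"], "/", request["Subject"])
```

Sending the message is left out on purpose, since that depends on your local mail setup.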
CURRENT CONTENTS OF THE ARCHIVE
-------------------------------

An abbreviated directory of the archive

  Archive.doc     About this archive (this document)
  ls-lR.Z         Full list of archive files, in unix compressed format
                  (used for mirroring, use 'get ls-lR' for readable list)
  biology/        General biology
  chemistry/      Chemistry
  flybase/        Drosophila genome data, stocks, people and documents
  help/           Help documents
  molbio/         Molecular biology
  science/        General sciences
  usenet/         Archive of biology news from Usenet
  util/           Computer and archive utilities
  Incoming/       Contributions go here

  ./molbio:
  align/          Sequence alignment
  codon/          Codon tables
  data/           Molecular data
  evolve/         Evolution and phylogeny
  ibmpc/          MSDOS software
  journals/       Table of contents of journals
  mac/            Macintosh software
  primer/         PCR and primer calculation software
  restrict-enz/   Restriction enzyme software and data
  rnafold/        RNA secondary structure
  search/         Databank searching
  unix/           Unix specific software
  vax/            VMS-Vax specific software

The folders bin/, dev/, etc/, and usr/ are for archive housekeeping; please ignore them.

Thru Internet Gopher, there are several additional items of general interest to biologists. GenBank (the databank of all gene sequences), BIOSCI Network News, the Prosite database, the Genome of Drosophila book, and other biology data are indexed for key word searching. You can also find links to all other BioGophers in this archive, with easy access to even more biology data, software and other information.

Much of the software includes a short abstract or "readme" file describing it and its requirements. This is normally named the same as the software file, but with the suffix ".readme". These readme files, and other help and information files in the archive, can be searched via Gopher to let you find software based on key words you may know (e.g., find software for signal sequence prediction).

Most of the software is as received from the authors or contributors.
In the case of software source, it may or may not be ready to compile and run on a given operating system. Even those programs that do run may require adjustments to read the data formats that you use. I have edited and recompiled many of the programs in source form for different platforms to one extent or another, and can say that most will run but may require the talents of a programmer to install them.

The archive of Bionet newsgroups, Sci.Bio and Info-GCG from the Usenet electronic news media was added starting with news from about 1 Dec 91. See the /usenet/Readme file for details. This news archive can be searched for specific items via Gopher.

If you have suggestions, questions or comments, please let me know. Addresses are listed below.

HISTORY
-------

This archive was first started in October 1989 on the computer called IUBio.Bio.Indiana.Edu. At that time, the archive was my personal reference collection of public molecular biology software and data. I made the archive available to others because the only similar archive available at the time, at BioNet, closed its public software operation. It seemed little extra effort for me to make my collection available to others (my mistake...).

During the summer of 1991, the archive moved to Cricket.Bio.Indiana.Edu, my desktop Macintosh running Apple Unix. This computer has served well, handling the traffic of 300 callers/week and 1000 files/week transferred to people around the world. Somewhere around this time, the California Education and Research Federation Network selected this archive as the 1991 Biological Sciences winner of its CERFnet award for excellence in networked applications.

As of Nov 1991, the archive moved to a computer called Fly.Bio.Indiana.Edu.
Through the Bloomington Drosophila stock center, NSF's Division of Instrumentation and Resources has provided the funds for the purchase of a Sun Sparcstation 2 to function as the fly community's database server, and also to host the public IUBio archive.

By 1992, the archive through FTP and Gopher had been serving out data to the biology scientific community at high rates. In 1992, roughly 10,000 files per month were FTP'ed from the archive, and roughly 20,00 gopher transactions per month were recorded. By 1993-1994, this had grown to over 100,000 gopher transactions per month, and around 18,000 FTP transfers per month.

ACKNOWLEDGEMENTS
----------------

I would like to thank all of the hard working and often under-acknowledged authors who have contributed their software and data collections to this and other public archives. It is they who deserve credit for what this and other public archives mean to you and your ability to do your work.

Please keep in mind that the authors of these works frequently receive little or no credit from their peers. The authors represented in this archive for the most part receive no money for their efforts, which are commonly done on weekends, evenings and vacations. If you would, please remember to mention the authors who make their software and databases available to you as it aids your work.

Any program or database that is available publicly is published. If the author does not have a paper publication that you can cite, I recommend this form:

  Doe, John, 1991*. (Insert title of software or database). Published electronically on the Internet, available at ftp://iubio.bio.indiana.edu/ .%

  * If no date is given explicitly, use the file dates.
  % Substitute the universal resource locator (URL) for the given package.

You need not cite this archive for files you obtain here; however, you may consider it equivalent to a paper journal or book in some ways. It is certainly an information resource.
The proper citation for this public archive is:

  Gilbert, D.G., 1989. IUBio archive of molecular and general biology software and data. An Internet resource available at ftp,gopher,http://iubio.bio.indiana.edu.

I am very pleased to acknowledge the many people and organizations that have helped support IUBio archive. Much of the current hardware is made available by co-operation with FlyBase, supported by a grant from the US National Institutes of Health. Current and past equipment support has come from Indiana University, the National Science Foundation, and many of you who use the archive (see http://iubio.bio.indiana.edu/contributors ).

In early 1993, this archive faced a space-crunch. Its only 1-gigabyte disk was full. I was faced with throwing out valuable software & data, and spending a long time searching for government grant funds. When the community of users learned of this predicament, many of them quickly and cheerfully sent money to expand the disk space of this archive. Through the contributions of these many fine people, companies and institutions, we were able to triple the storage space for this archive. The archive now has a home of its own, provided by several of you who use and benefit from it, and no longer has to rely exclusively on extra resources from other projects.

See the "Contributors" document for a list of the many who have helped keep this archive a growing resource for you.

CONTRIBUTING TO THE ARCHIVE
---------------------------

Contributions of broad interest in any area of biology, and related areas of chemistry and other sciences, are welcome. These may be software or data. Contributions of interest over several computer platforms should either be plain text files or .ZIP or .TAR.Z archives. You may put your contribution in the "Incoming" directory, using your FTP put command.
You may also send e-mail compatible files (usually .UUE or .HQX encoded files or plain text) to

  Archive@Bio.Indiana.Edu             (preferred Internet address)

  Don Gilbert, BioComputing Office    (land mail)
  Biology Department, Indiana University
  Bloomington, IN 47405 USA

Any general mail about the archive should be addressed here also.

USING THE ARCHIVE VIA FTP
-------------------------

Using Internet Gopher is generally very simple, either a point'n'click operation on computers with graphic user interfaces, or simple menu operations on character terminals. FTP is still primarily a command-line service. You need to know and remember the commands to use it. Here is a brief primer.

To change from the main directory to a subdirectory, use the command:
  ftp> cd subname
To move up one level,
  ftp> cd ..
To move over to another subdirectory at the same level,
  ftp> cd ../anothersub
Some examples,
  ftp> cd molbio/align
  ftp> cd /molbio/evolve

Some of the programs are in source form, or include data or documentation that is useful on many platforms. Since most of the programs require many files, and to save space and transmission time, we store many of these programs in some archived format. In general there are different preferred archive formats for each computer platform. The ZIP archive format is perhaps most widely used, but it still is mainly found on MS-DOS computers (Unix and VMS ZIP utilities are available, but up-to-date ZIP for Macintosh is lacking at this time). The TAR.Z format is common on Unix computers, and utilities for using TAR.Z are available on MS-DOS, Macintosh and VMS. The Stuffit-BINHEX format is mainly Macintosh, but utilities exist on Unix and MS-DOS for using this format.

Right now, there is quite a range of formats at this archive. You pretty much need to refer to the list of file suffixes below to determine which is which. Look in the /util subdirectory for programs which will decode these archived files.
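The cd commands in the primer earlier in this section follow the usual Unix path rules: a leading "/" starts from the top of the archive, and ".." moves up one level. As a small illustration, the Python sketch below works out where a given cd would land, using only the standard posixpath module. The function name is invented for this example, and the sketch only models the path arithmetic; the FTP server itself decides whether a directory actually exists.

```python
import posixpath


def resolve_cd(current_dir, target):
    """Return the directory that `cd target` would lead to from current_dir.

    Hypothetical helper that mirrors Unix path rules. Names are kept
    exactly as typed, because the archive is case-sensitive.
    """
    # posixpath.join discards current_dir when target starts with "/".
    joined = posixpath.join(current_dir, target)
    # normpath collapses "." and ".." components.
    return posixpath.normpath(joined)


if __name__ == "__main__":
    here = "/"
    for command in ("molbio/align", "..", "../biology", "/molbio/evolve"):
        here = resolve_cd(here, command)
        print("cd", command, "->", here)
```

Following the loop by hand, the four commands lead to /molbio/align, /molbio, /biology and /molbio/evolve in turn.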
In some cases, the archive files and programs are stored here only as BINARY files. This means for FTP you must set your transmission software to transfer a full 8-bit byte, with the ftp command "binary". See the list of file suffixes below to tell which are in binary format. Internet Gopher users can ignore this, as the Gopher software handles text versus binary by itself.

Most or all Macintosh files have been converted to .HQX format, or BinHex. This is an ascii (text) encoded form that you can fetch with the normal FTP setting of text (or ascii) transfer. You need Stuffit, BinHex 4.0 or other programs to decode this format.

Many of the archives and any of the .DOC, .TXT or .README files are in Ascii (plain text) format, suitable for transfer with a default Ascii method, or via e-mail programs.

The encoded programs (.HQX or .UUE suffixes) require that you have a decoder on your computer. These are:

  Macintosh   .HQX   [Archive.Util.Mac]BinHex.*
  MSDOS       .UUE   [Archive.Util.IBMPC]UUDecode.Bas, *.C
  VMS         .UUE   [Archive.Util.VMS.UUEncode]*.*
  Unix        .UUE   use MS-DOS or VMS source code

File name suffixes
------------------

TEXT (ascii) File formats
----
  .DOC      Plain text documentation
  .TXT      Plain text documentation
  .README   Plain text documentation
  .HQX      Macintosh BinHex encoded file
  .UUE      MS-DOS, VMS, Unix UUEncode encoded file
  .C        C source text
  .F        ForTran source text
  .FOR      ForTran source text
  .P        Pascal source text
  .PAS      Pascal source text
  .COM      VMS command text
  .MAKE     Unix command text
  .DAT      general data file (usually text)
  .SEQ      sequence data, usually nucleic acids (text)
  .PEP      amino acid sequence data (text)
  .AA       amino acid sequence data (text)

BINARY File formats (requires ftp binary command before transfer)
------
  .ARC      Archived files (MS-DOS, Macintosh, Unix, VMS)
  .ZIP      Archived files (MS-DOS, Unix, VMS, others)
  .ZOO      Archived files (MS-DOS, Unix, VMS, others)
  .TAR      Unix archived files
  .Z        Unix compressed file
  .TAR.Z    Unix archived + compressed file
  .BCK      VMS Backup archive
  .SIT      Macintosh archive
  .EXE      VMS or MSDOS program
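The suffix lists above amount to a simple rule for choosing between the ftp "binary" and "ascii" commands. The Python sketch below is one possible way to apply that rule with the standard ftplib module. The helper names are invented for this example, the binary suffix list is copied from the table above and may be incomplete, and the session at the end is an untested illustration with a placeholder e-mail address as the password.

```python
from ftplib import FTP

# Suffixes listed above under BINARY file formats. Anything else is
# treated as text here, which is an assumption of this sketch.
BINARY_SUFFIXES = (".arc", ".zip", ".zoo", ".tar", ".z", ".bck", ".sit", ".exe")


def needs_binary(filename):
    """Return True if the file name ends in a suffix listed as binary.

    Only the comparison ignores case; the name itself must still be sent
    to the server exactly as spelled, since the archive is case-sensitive.
    """
    return filename.lower().endswith(BINARY_SUFFIXES)


def fetch(ftp, remote_name, local_name):
    """Fetch one file over an open FTP connection in the matching mode."""
    if needs_binary(remote_name):
        # Equivalent to the ftp commands "binary" then "get".
        with open(local_name, "wb") as out:
            ftp.retrbinary("RETR " + remote_name, out.write)
    else:
        # Equivalent to "ascii" then "get"; retrlines strips line endings,
        # so a newline is added back for each line.
        with open(local_name, "w") as out:
            ftp.retrlines("RETR " + remote_name,
                          lambda line: out.write(line + "\n"))


def example_session():
    """Illustrative anonymous session; not called automatically."""
    ftp = FTP("ftp.bio.indiana.edu")            # host name from the text
    ftp.login("anonymous", "you@example.edu")   # password: your e-mail address
    fetch(ftp, "Archive.doc", "Archive.doc")    # text file from the top level
    ftp.quit()
```

With this rule, a name such as ls-lR.Z is fetched in binary mode, while Archive.doc and .HQX or .UUE encoded files are fetched as text.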