Release Information: SILVA 89

Version 89 of the SSU and LSU databases was released on 06. and 11. February 2007

SSURef was updated on 25.02.2007

LSUParc was updated on 26.02.2007

        SSU        LSU
Ref     137,788    5,660
Parc    353,366    46,979

Information about former releases can be found here.

Sequence Retrieval and Processing

490,549 SSU and 116,307 LSU sequences were retrieved from EMBL Release 89 (December 2006) using a complex keyword search procedure. Cross-checks with RDP II did not indicate a significant loss of primary data. For the SSU db, the subsequent initial quality check removed 72,449 short rRNA sequences (below 300 bases) and a further 6,268, 14,119 and 8,365 rRNA sequences because of excessive ambiguities (>2%), homopolymers (>2%) and vector contamination (>5%), respectively. For the LSU db, it removed 53,834 short rRNA sequences (below 300 bases) and 1,114, 1,274 and 1,298 rRNA sequences for the same three reasons. In the alignment process, 33,557 SSU and 4,134 LSU rRNA sequences were rejected due to a lack of relatives. Most of these sequences were classified as non-ribosomal RNA sequences by manual inspection.
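The initial quality check described above can be sketched roughly as follows. This is a minimal Python illustration, not the SILVA pipeline itself. The thresholds are the ones quoted in the text, but the function names, the definition of a homopolymer (a run of five or more identical bases), the order of the checks and the assumption that the vector contamination percentage comes from an external screening step are all hypothetical.

```python
import re
from typing import Optional

# Thresholds quoted in the release notes.
MIN_LENGTH = 300            # bases
MAX_AMBIGUITY_PCT = 2.0     # percent of ambiguous bases
MAX_HOMOPOLYMER_PCT = 2.0   # percent of bases inside homopolymer runs
MAX_VECTOR_PCT = 5.0        # percent of vector contamination

# ASSUMPTION: the notes do not say how long a run must be to count
# as a homopolymer; five identical bases is an illustrative choice.
HOMOPOLYMER_MIN_RUN = 5


def percent_ambiguous(seq: str) -> float:
    """Percentage of characters that are not an unambiguous base."""
    seq = seq.upper()
    if not seq:
        return 0.0
    ambiguous = sum(1 for base in seq if base not in "ACGTU")
    return 100.0 * ambiguous / len(seq)


def percent_homopolymer(seq: str, min_run: int = HOMOPOLYMER_MIN_RUN) -> float:
    """Percentage of bases that lie in runs of >= min_run identical bases."""
    seq = seq.upper()
    if not seq:
        return 0.0
    pattern = re.compile(r"([ACGTU])\1{%d,}" % (min_run - 1))
    in_runs = sum(len(match.group(0)) for match in pattern.finditer(seq))
    return 100.0 * in_runs / len(seq)


def rejection_reason(seq: str, vector_pct: float = 0.0) -> Optional[str]:
    """Return why a sequence would be rejected, or None if it passes.

    vector_pct is assumed to be computed elsewhere (hypothetical input).
    """
    if len(seq) < MIN_LENGTH:
        return "short"
    if percent_ambiguous(seq) > MAX_AMBIGUITY_PCT:
        return "ambiguities"
    if percent_homopolymer(seq) > MAX_HOMOPOLYMER_PCT:
        return "homopolymers"
    if vector_pct > MAX_VECTOR_PCT:
        return "vector"
    return None
```

For example, `rejection_reason("ACGT" * 10)` reports a short sequence, while a 400-base sequence without ambiguities or long runs passes all four checks.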

Basic statistics for the SILVA databases, release 89

              SSUParc    SSURef     LSUParc    LSURef
Version       89         89.1       89.1       89
Total         353,366    137,788    46,979     5,562
Bacteria      272,450    105,354    5,371      3,156
Archaea       17,270     6,517      126        120
Eukarya       61,290     25,926     41,482     2,384
Cultured #    129,257    73,683     44,313     5,562
Uncultured    224,109    65,073     2,988      98

# Counted as all entries whose full name does not match *uncult*, *unident* or *clone*. Contains false positives!
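The name-based counting rule in the footnote can be illustrated with a short Python sketch. The function name is hypothetical, and case-insensitive matching is an assumption; the release notes only give the three wildcard patterns.

```python
# Substrings taken from the footnote patterns *uncult*, *unident*, *clone*.
UNCULTURED_MARKERS = ("uncult", "unident", "clone")


def is_counted_as_cultured(full_name: str) -> bool:
    """True if the full name matches none of the three patterns.

    ASSUMPTION: matching is case-insensitive; the notes do not say.
    """
    name = full_name.lower()
    return not any(marker in name for marker in UNCULTURED_MARKERS)
```

A name such as "marine metagenome" contains none of the three markers and is therefore counted as cultured, which shows how the false positives mentioned in the footnote can arise.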

Growth of the ribosomal RNA Databases since 1992

Blue: RDP II, Orange: SILVA SSUParc based on the EMBL Release 89

Length distribution in the databases

Red: raw data, Black: the quality checked & aligned SSUParc sequences
Red: raw data, Black: the quality checked & aligned LSUParc sequences

Known bugs

  • The "select complete results" button in the List is not working properly.
  • Do not sort the Search List when searching within the LSU db - you will get strange effects.
  • The cutoff head and tail values in the exported ARB Dbs are wrong and have been removed from SSUParc.
  • The term "Sequences" needs to be replaced by "accession numbers" in all pop-ups since it is misleading.
  • On the Search page you have to reload the page after each request.
  • Reloading can be done by clicking again on Search in the menu bar.

Future

Extended search functionalities, including a similarity-based search, are planned for the beginning of 2007. The known bugs have to be fixed to improve the speed and reliability of the web page. The thresholds of the quality value system are still under discussion. The SEED and REF dbs require further extension and curation.

Small Subunit rRNA Database

Sequences with an alignment quality value below 30 have been removed from the SSUParc database. For SSURef, all sequences shorter than 1,200 bases (Bacteria and Eukarya) or 900 bases (Archaea), or with an alignment quality value below 50, have been removed. A guide tree was calculated by adding all sequences to the tree_1000 from the ssu_jan04 release. For tree calculation, highly variable positions were removed for Bacteria, Archaea and Eukarya with the respective position variability filters. Phyla and most of the classes for Bacteria and Archaea have been organized according to Bergey's taxonomic outline. Around 400 sequences have been removed after manually inspecting the tree for long branches. Position variability filters for Bacteria, Archaea and Eukarya have been calculated and added to the dataset. Please note that sequences with an alignment quality value below 70 might also need further attention. All sequences with a Pintail value < 50 or an alignment quality value < 70 have been assigned to color group 1 in ARB. Before using the alignment for extensive phylogenetic reconstructions, all sequences should be checked carefully.
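The SSU thresholds above can be summarized in a small Python sketch. The record type, its field names and the function names are hypothetical and do not correspond to the actual ARB db fields; only the numeric cutoffs come from the text.

```python
from dataclasses import dataclass


@dataclass
class SsuRecord:
    # Hypothetical field names, not the real ARB field names.
    domain: str            # "Bacteria", "Archaea" or "Eukarya"
    length: int            # sequence length in bases
    align_quality: float   # alignment quality value
    pintail: float         # Pintail value


# Minimum lengths for SSURef, as quoted in the text.
REF_MIN_LENGTH = {"Bacteria": 1200, "Eukarya": 1200, "Archaea": 900}


def keep_in_ssu_parc(rec: SsuRecord) -> bool:
    """SSUParc drops sequences with an alignment quality below 30."""
    return rec.align_quality >= 30


def keep_in_ssu_ref(rec: SsuRecord) -> bool:
    """SSURef drops sequences that are too short or aligned below 50."""
    return rec.length >= REF_MIN_LENGTH[rec.domain] and rec.align_quality >= 50


def assigned_to_color_group_1(rec: SsuRecord) -> bool:
    """Pintail < 50 or alignment quality < 70 marks a sequence for review."""
    return rec.pintail < 50 or rec.align_quality < 70
```

Under these rules, an archaeal sequence of 950 bases with an alignment quality of 55 would stay in SSURef but would still be assigned to color group 1.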

Large Subunit rRNA Databases

Please take into account that the SEED consisted of only around 2,800 sequences and there is no guarantee that well-aligned close relatives were always available. We recommend additional manual curation before using it for extensive phylogenetic reconstructions. For LSUParc, all sequences with a quality value below 30 (7,352 sequences) have been removed. Additionally, in LSURef all sequences below 1,900 bases have been removed. A guide tree was calculated for both dbs and basic filters have been added.

Alternative Names

All names of validly described species in the SSU and LSU databases have been checked for changes (basonyms, synonyms and orthographical corrections) against the DSMZ "Nomenclature up to date" catalogue (http://www.dsmz.de/download/bactnom/names.txt) released in December 2006.

Quality values

The flashlight system gives a first indication of the sequence and alignment quality as well as the risk of sequence anomalies based on Pintail analysis. After downloading the sequences as an ARB file, sequences that need attention can be selected by searching for low quality (alignment, sequence) or Pintail values in the corresponding ARB db fields. A full description of all db fields available in the ARB files can be found in the FAQ section. Given the rich annotation information that comes with every SILVA sequence, user-designed ARB databases can be easily generated.

SEED

All remaining rRNA sequences have been aligned based on a completely manually re-checked SEED alignment of 49,697 rRNA sequences for SSU and 2,868 rRNA sequences for LSU. The SSU alignment is based on the official ssu_jan04 release of the ARB Project. The SSU SEED alignment has been considerably improved for Archaea by manual addition of more than 1,000 sequences by Katrin Knittel. All SSU Eukaryotic sequences (18S) have been cross-checked by Wolfgang Ludwig before their addition to the SEED. Most of the bacterial sequences have also undergone a curation process carried out by the SILVA Team. We would rate our SSU SEED alignment for all Bacteria and Archaea as good and for Eukarya as reasonable.

The LSU alignment was provided by Wolfgang Ludwig and has not been released before. It was cross-checked by the SILVA Team before being used as the SEED for automatic alignment. Bacteria and Archaea can be rated as good. The Eukaryotes definitely need further attention.
