The Human Phenotype Ontology (HPO) intends to offer a tool that will allow large-scale computational analysis of the human phenome. The HPO currently contains over 11,000 terms, each of which describes an individual phenotypic anomaly. The terms are arranged in a directed acyclic graph and are connected by is-a (subclass-of) edges, such that a term represents a more specific or limited instance of its parent term(s). All relationships in the HPO are is-a relationships, i.e. simple class-subclass relationships. For instance, Abnormality of the feet is-a Abnormality of the lower limbs. The relationships are transitive, meaning that they are inherited up all paths to the root. Phenotypic abnormality is the main subontology of the HPO and contains descriptions of clinical abnormalities. Additional subontologies are provided to describe inheritance patterns, onset/clinical course and modifiers of abnormalities.
What is an Ontology and What Use is an Ontology for Medical Genetics Research?
The word ontology is derived from Greek words meaning the study of existence and being. More recently, the word has been used in computer science to describe systems that represent concepts within some domain and the relationships between those concepts. Over the last decade, the Gene Ontology consortium has developed an extensive ontology describing molecular functions, biological processes, and cellular locations, and a number of groups have used GO terms to annotate gene products of many organisms. The study of human phenotypes in the context of hereditary and common disease has the potential to lead to great insight into the function of genes and genetic networks. The HPO intends to offer a computational tool that will allow large-scale analysis of the human phenome.
What is the Medical Focus of the Human Phenotype Ontology?
Currently, the most useful and comprehensive database of human hereditary disorders is the Online Mendelian Inheritance in Man (OMIM) database. OMIM contains clinical data on several thousand primarily monogenic hereditary disorders. Because OMIM and its predecessor, Mendelian Inheritance in Man (McKusick), were initially developed over 30 years ago, OMIM does not use a controlled vocabulary to describe clinical features, and it uses a more or less simple hierarchical scheme to assign clinical features to organ systems. Despite these drawbacks, OMIM was and will continue to be an incredibly useful resource for clinicians and researchers. We have therefore mapped nearly all clinical descriptions in OMIM to terms of the HPO. We have embedded these mappings into an ontological, multi-tiered structure that is described below, and we intend to continue refining and improving the HPO. We have also annotated all Orphanet entries and over 60 recurrent syndromes in DECIPHER.
Terms in the Human Phenotype Ontology
Each term in the HPO describes a clinical abnormality. Terms may be general, such as Abnormality of the musculoskeletal system, or very specific, such as Chorioretinal atrophy. Each term is also assigned to one of the subontologies listed in the table below: Phenotypic abnormality, Mode of inheritance, Clinical modifier, Mortality/Aging, or Frequency. Most of the terms of the HPO belong to the Phenotypic abnormality ontology. Each term has a unique ID such as HP:0001140 and a label such as Epibulbar dermoids. Most terms have textual definitions, such as "An epibulbar dermoid is a benign tumor typically found at the junction of the cornea and sclera (limbal epibulbar dermoid)." The source of the definition must be indicated. Many terms have synonyms. For instance, Epibulbar dermoid is taken to be a synonym of Epibulbar dermoids.
| Subontology | Description |
|---|---|
| Phenotypic abnormality | This is the main ontology of the HPO and contains descriptions of clinical abnormalities. The level 1 children of this class are formed by terms such as Abnormality of the musculoskeletal system and Hematological abnormality. |
| Mode of Inheritance | This relatively small ontology is intended to describe the mode of inheritance and contains terms such as Autosomal dominant. |
| Clinical modifier | This ontology contains classes that describe typical modifiers of clinical symptoms, for example the speed of progression, the variability, or the onset. It contains terms such as Onset in childhood, Rapidly progressive, or Incomplete penetrance. |
| Mortality/Aging | This sub-ontology describes Time of death and contains classes such as Neonatal death or Sudden death. |
| Frequency | The frequency with which patients show a particular clinical feature. Examples are Obligate, Frequent, and Occasional. These terms are defined in the same way as Orphanet defines them. |
The Structure of the Human Phenotype Ontology
Most ontologies are structured as directed acyclic graphs (DAG), which are similar to hierarchies but differ in that a more specialized term (child) can be related to more than one less specialized term (parent). Cycles (cyclic paths in the graph) are not allowed. The relationship of the terms of the HPO to one another is displayed in the DAG. For instance, the term Aplasia/Hypoplasia of metatarsal bones is a child of both Aplasia/Hypoplasia involving bones of the feet and Abnormalities of the metatarsal bones. The ability to encode multiple parents in a DAG adds to the flexibility and descriptiveness of the ontology. For instance, it is possible to search for all terms involving Aplasia/Hypoplasia of the skeleton as well as to search for all terms involving Abnormalities of the foot. This would not be possible with a simple hierarchical system. The is-a relationship is transitive, meaning that annotations are inherited up all paths to the root. For instance, Abnormality of the lower limbs is-a Abnormality of the extremities, and thus Abnormality of the feet also is-a Abnormality of the extremities.
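The multiple-parent structure and the transitivity of is-a can be illustrated with a small sketch. It uses only the term labels mentioned above; the dictionary layout and the function name `ancestors` are illustrative assumptions, not part of any HPO software.

```python
from typing import Dict, List, Set

# Child -> parents (is-a edges). Labels come from the text above; this is a
# tiny hand-written fragment, not the real ontology file.
IS_A: Dict[str, List[str]] = {
    "Abnormality of the feet": ["Abnormality of the lower limbs"],
    "Abnormality of the lower limbs": ["Abnormality of the extremities"],
    "Aplasia/Hypoplasia of metatarsal bones": [
        "Aplasia/Hypoplasia involving bones of the feet",
        "Abnormalities of the metatarsal bones",
    ],
}


def ancestors(term: str, is_a: Dict[str, List[str]]) -> Set[str]:
    """Return every term reachable from `term` by following is-a edges."""
    seen: Set[str] = set()
    stack = list(is_a.get(term, []))
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            # Transitivity: the parents of a parent are ancestors too.
            stack.extend(is_a.get(parent, []))
    return seen


if __name__ == "__main__":
    print(sorted(ancestors("Abnormality of the feet", IS_A)))
    print(sorted(ancestors("Aplasia/Hypoplasia of metatarsal bones", IS_A)))
```

Because every path to the root is followed, a search for one ancestor class finds a term no matter which of its parents leads there.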
We represent clinical annotations using a simple tab-delimited format that was designed to be as similar as possible to the format used by the Gene Ontology consortium. This document describes the process of assigning HPO terms to disease entities such as Mendelian disorders from OMIM or Orphanet. Each line in the annotation file represents a link between a disease entity, such as Noonan syndrome, and one of the clinical features characteristically seen in that disease. Each feature of a disease is listed on a separate line. Note that this file (and format) is intended for the annotation of disease entities (e.g. Noonan syndrome) and not of individuals (such as a person who has been diagnosed with Noonan syndrome). If you want to annotate the clinical findings of individuals with hereditary diseases, you may want to look at PhenoTips.
The flat file format comprises 14 tab-delimited fields:
| Column | Content | Required | Example |
|---|---|---|---|
| 1 | DB | required | MIM |
| 2 | DB_Object_ID | required | 154700 |
| 3 | DB_Name | required | Achondrogenesis, type IB |
| 4 | Qualifier | optional | NOT |
| 5 | HPO ID | required | HP:0002487 |
| 6 | DB:Reference | required | OMIM:154700 or PMID:15517394 |
| 7 | Evidence code | required | IEA |
| 8 | Onset modifier | optional | HP:0003577 |
| 9 | Frequency modifier | optional | Usually from the subontology Frequency or "70%" or "12 of 30" |
| 10 | With | optional | |
| 11 | Aspect | required | O |
| 12 | Synonym | optional | ACG1B\|Achondrogenesis, Fraccaro type |
| 13 | Date | required | YYYY.MM.DD |
| 14 | Assigned by | required | HPO |
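As a rough illustration of how one line in this format could be read, the sketch below splits a line on tabs and checks the required columns from the table above. The snake_case field names, the function `parse_annotation_line`, and the sample date are assumptions made for this example. The other sample values are copied from the Example column.

```python
from typing import Dict, List

# Column order follows the table above; the snake_case names are our own.
FIELDS: List[str] = [
    "db", "db_object_id", "db_name", "qualifier", "hpo_id", "db_reference",
    "evidence_code", "onset_modifier", "frequency_modifier", "with_",
    "aspect", "synonym", "date", "assigned_by",
]

# Columns marked "required" in the table above.
REQUIRED = {
    "db", "db_object_id", "db_name", "hpo_id", "db_reference",
    "evidence_code", "aspect", "date", "assigned_by",
}


def parse_annotation_line(line: str) -> Dict[str, str]:
    """Split one tab-delimited annotation line into a field -> value dict."""
    values = line.rstrip("\r\n").split("\t")
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(values)}")
    record = dict(zip(FIELDS, values))
    missing = sorted(name for name in REQUIRED if not record[name].strip())
    if missing:
        raise ValueError(f"missing required fields: {missing}")
    return record


# Sample line built from the Example column; the date is a made-up placeholder.
EXAMPLE_LINE = "\t".join([
    "MIM", "154700", "Achondrogenesis, type IB", "", "HP:0002487",
    "OMIM:154700", "IEA", "HP:0003577", "70%", "", "O",
    "ACG1B|Achondrogenesis, Fraccaro type", "2012.01.31", "HPO",
])

if __name__ == "__main__":
    record = parse_annotation_line(EXAMPLE_LINE)
    print(record["db"], record["db_object_id"], record["hpo_id"])
```

Optional columns that are left empty still occupy a position, so every line must have exactly 14 fields.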
Unless we instruct the computer otherwise, there is no implication that an HPO term such as Cerebral calcification is somehow related to the human brain. The HPO term can be related to other terms in the ontology by subclass relations, but it is not explicitly related to concepts from anatomy, histology, pathology, biochemistry, and cellular physiology. For this reason, a consortium has joined forces to develop computer-readable logical definitions of HPO terms that will allow human phenotypic abnormalities to be related to entities from anatomy, pathology, physiology, biochemistry, and other areas. The consortium includes Sandra Dölken, Sebastian Köhler, and Peter N Robinson from the Institute of Human Genetics and Medical Genetics of the Charité Berlin; Chris Mungall and Suzi Lewis from the Berkeley Bioinformatics Open Source Projects group; Barbara Ruef and Monte Westerfield from the Zebrafish Model Organism Database (ZFIN); and Melissa Haendel, Nicole Vasilevsky and Mark Engelstad from the Oregon Health & Science University.
We are creating the definitions using the Phenotypic Quality Ontology (PATO). We logically define phenotypes by stating that classes in the HPO are logically equivalent to Entity/Quality descriptions, with each such description consisting of the following elements: Q, the type of quality (characteristic) that the genotype affects; E, the type of entity that bears the quality; E2, an additional optional entity type, for relational qualities; M, a modifier. The basic methodology has been described in this paper.
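The four elements of an Entity/Quality description can be pictured as a small record type. The sketch below is only an illustration of that structure: the class `EQDescription`, its field names, and the sample values are hypothetical and do not reproduce any actual HPO logical definition or ontology identifier.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class EQDescription:
    """One Entity/Quality description (hypothetical container for illustration)."""
    quality: str                           # Q: the type of quality affected
    entity: str                            # E: the entity that bears the quality
    related_entity: Optional[str] = None   # E2: optional, for relational qualities
    modifier: Optional[str] = None         # M: optional modifier

    def render(self) -> str:
        """Format the description, leaving out the optional parts when unset."""
        parts = [f"Q={self.quality}", f"E={self.entity}"]
        if self.related_entity is not None:
            parts.append(f"E2={self.related_entity}")
        if self.modifier is not None:
            parts.append(f"M={self.modifier}")
        return ", ".join(parts)


# Placeholder labels only: NOT the real logical definition of this HPO term.
LOGICAL_DEFINITIONS = {
    "Cerebral calcification": EQDescription(quality="calcified", entity="brain"),
}

if __name__ == "__main__":
    for label, eq in LOGICAL_DEFINITIONS.items():
        print(f"{label}: {eq.render()}")
```

In a real definition, the quality would come from PATO and the entity from the appropriate anatomy or other ontology, each identified by its ontology ID rather than by a free-text label.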