Showing posts with label reuse.

Friday, December 15, 2017

Referencing Reused Classes and Properties When Working with Other Ontologies

While I am still toiling away on re-working OntoGraph to support diagramming RDF/RDFS (yes, it seems to be a major undertaking!), I thought that I would post a question that I received. Here it is ... "When reusing a bunch of different ontologies in a new ontology, how should reused classes and properties be referenced?" Should each of the reused ontologies be included "en masse", should individual entities be reused directly, should entities be redefined in the new ontology but using their original namespace, or should the entities be recreated? Unfortunately, this is a question that has no right answer, but I have some preferences.

First, let me explain the alternatives:
  • Included "en masse" means using import statements for each re-used ontology, and then referencing the specific entities (classes and properties) that are actually needed. Everything is referenced in the namespace where it was defined, and nothing is redefined or recreated.
  • Reusing a class or property directly means referencing that class or property without importing the entire ontology. Everything is referenced using the namespace where it was defined, and nothing is redefined or recreated. But, you might end up with a triple that looks like this: myNamespace:someKindOfDate a owl:DatatypeProperty ; rdfs:subPropertyOf dcterms:date . (Note that the subproperty relation comes from RDFS, not OWL.) And, it is up to the infrastructure to resolve the "dcterms" (Dublin Core) namespace to get the details of the date property.
  • Redefining entities means that you take the classes or properties that should be reused and include their definitions in your ontology. So, if you are using the Dublin Core "creator" concept, you would include a definition for dcterms:creator. You might even add more information, as new predicates/objects for the entity, or maybe just copy over the existing predicates. Why might you do this? One reason is to have all the necessary details in one place. But, just as having multiple copies of the same code is considered bad practice in programming, I believe that copying and pasting another ontology's definition (using the same IRI/URI) is also wrong. You could end up with duplicated or (worse) divergent or out-of-date declarations.
  • Recreating entities is similar to redefining them, but different in some important ways. In this case, you create a semantically equivalent entity. Using the example above, a myNamespace:author entity might be created and the relevant details defined for it. In addition, you define an equivalentClass/equivalentProperty declaration, linking it to its source (in this case, dcterms:creator). Taking this approach, if dcterms:creator means something different in a future version, the equivalentProperty statement can be removed. Or, if a new metadata standard is dictated by your company or customer, you simply add another newMetadataNamespace:author equivalentProperty declaration. (A short sketch of this approach follows the list.)
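
To make the last approach concrete, here is a minimal Turtle sketch. The "my" namespace and the details on my:author are hypothetical, and it assumes dcterms:creator is treated as an object property (tools differ on how they type the Dublin Core terms):

    @prefix owl:     <http://www.w3.org/2002/07/owl#> .
    @prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#> .
    @prefix dcterms: <http://purl.org/dc/terms/> .
    @prefix my:      <http://example.org/myNamespace#> .

    # Recreate the concept in the new ontology's namespace, with its own details ...
    my:author a owl:ObjectProperty ;
        rdfs:label   "author" ;
        rdfs:comment "The agent that authored or created the work." .

    # ... and link it back to its source. If dcterms:creator drifts in meaning,
    # or a new metadata standard is mandated, only this axiom changes.
    my:author owl:equivalentProperty dcterms:creator .
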
Next, I will try to give all the pros and cons of using the same vs different namespaces, and recreating entities from one namespace in another.

A namespace exists to establish the provenance of the entities defined within it, and to identify that the entities are related. Ontologies should have loose coupling and tight cohesion, just like code - and the namespace can (should?) indicate the purpose/domain-space of the ontology. You can certainly group everything under the umbrella of a namespace that represents "my overall application space" - but that seems a bit too broad. Also, you might have another application in the future where you re-use one or more of your own ontologies - and then, one might question the "my overall application space" namespace, or question which entities in that namespace are relevant to the new application.

Also, a namespace helps to disambiguate entities that might have the same name - but not necessarily the same semantics (or detail of semantics) - across different ontologies. For example, a Location entity in an Event ontology (or more correctly, ontology design pattern, ODP) should not go into detail about Locations (that is not the purpose of the ontology). Defining locations would be better served by other ontologies that specifically deal with network, spatial-temporal, latitude-longitude-altitude and/or other kinds of locations. So, an under-defined Location in an Event ODP can then link - as an equivalent class - to the more detailed location declarations in other "Location"-specific ODPs. In this way, you get loose coupling and tight cohesion. You can pull out one network location ODP and replace it by a better one - without affecting the Event ODP. In this case, you would only change the equivalentClass definition. :-)

As for re-creating entities in the ODP namespace, that is really done for convenience. I can actually argue both sides of this issue (keeping the entities with their namespaces/provenance versus recreating them). But, erring on the side of simplicity, I recommend recreating entities in the new ontology's namespace (the last bullet above). This is especially relevant if only a portion of several existing ontologies/namespaces will be re-used. Why import large ontologies when you only need a handful of classes and properties? This can confuse your users and developers as to what is really relevant. Plus, you will have new entities/properties/axioms being defined in your new ontology. If you do not recreate entities, you end up with lots of different namespaces, and this translates to lots of different namespaces in your individuals. Your users and developers can become overwhelmed keeping track of which concept comes from which namespace.

For example, you may take document details from the SPAR DoCO ontology (http://www.sparontologies.net/ontologies/doco/source.ttl) and augment them with data from the Dublin Core (http://dublincore.org/2012/06/14/dcterms.rdf) and PRISM (http://prismstandard.org/namespaces/basic/2.0/) vocabularies, and then add details from the PROV-O ontology (http://www.w3.org/ns/prov-o). All these classes and properties use different namespaces, and it gets hard to remember which is which. For example, "foo" is an instance of the doco:Document class and uses the dcterms:publisher and prism:doi properties, but is linked to a revision using a prov:wasDerivedFrom property. This could lead to errors in creating and querying the instances. It seems easier to say that "foo" is an instance of the myData:Document class, and uses the predicates myData:author, myData:publisher, myData:doi and myData:derivedFrom (where "myData" is the namespace of the ODP for tracking document details). Both styles are sketched below.
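
Here is a rough Turtle sketch of the two styles. The ex: and myData: namespaces are hypothetical, and the class/property IRIs simply follow the example above (check the source vocabularies for the exact names and casing):

    @prefix doco:    <http://purl.org/spar/doco/> .
    @prefix dcterms: <http://purl.org/dc/terms/> .
    @prefix prism:   <http://prismstandard.org/namespaces/basic/2.0/> .
    @prefix prov:    <http://www.w3.org/ns/prov#> .
    @prefix myData:  <http://example.org/myData#> .
    @prefix ex:      <http://example.org/instances#> .

    # Style 1: every predicate keeps the namespace of its source vocabulary
    ex:foo a doco:Document ;
        dcterms:publisher   ex:acmePress ;
        prism:doi           "10.1000/foo" ;
        prov:wasDerivedFrom ex:fooRevision1 .

    # Style 2: one namespace, from the document-tracking ODP
    ex:foo a myData:Document ;
        myData:publisher   ex:acmePress ;
        myData:doi         "10.1000/foo" ;
        myData:derivedFrom ex:fooRevision1 .
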

I know that some might disagree (or might agree!). If so, let me know.

Andrea

Monday, January 25, 2016

Ontologies for Reuse Versus Integration

There is ongoing email discussion in preparation for this year's Ontology Summit on "semantic integration". I thought that I would share one of my recent posts to that discussion here, on my blog. The issue is reuse versus integration ...

For me, designing for general reuse is a valid and valuable goal (if you have the time, which is not always true). (It was also the subject of the Summit two years ago, and of many of my posts from that time - March-May 2014!) But reusing an ontology or design pattern in multiple places is not semantic integration. Reuse and integration are different beasts, although they are complementary.

I have designed ontologies for both uses (reuse and integration), but my approach to the two is different. Designing for reuse is usually focused on a small domain that is well understood. There are general problem areas (such as creating ontologies/design patterns for events, or supporting Allen's time interval algebra) that apply across many domains. In these areas, general design and reuse make sense.

Over the years, however, I have been much more focused on designing for integration (especially in the commercial space). In my experience, companies are always trying to combine different systems together - whether these systems are legacy vs new, systems that come into the mix due to acquisition, internal (company-centric) vs external (customer-driven), dictated by the problem space (combining systems from different vendors or different parts of an organization to solve a business problem), ...

It is ok to try to be forward-thinking in designing these integration ontologies ... anticipating areas of integration. But, I have been wrong in my guesses (of what was needed in the "future" ontology) probably more than I have been right - unless it was indeed in general problem domains.

So, my integration "rules of thumb" are:
  • Get the SMEs in a particular domain to define the problem space and their solution (don't ever ask the SMEs about integrating their domains)
  • Don't ever favor one domain over another in influencing the ontology (otherwise you are sure not to be future-proof)
  • Focus on the biggest problem areas first, and find the commonalities/general concepts (superclasses)
  • Place the domain details "under" these superclasses
  • Never try to change the vocabulary of a domain, just map to/from the domains to the "integration" ontology
  • Never map everything in a domain, just what needs to be integrated
  • Look for smaller areas of "general patterns" that can be broadly reused
  • Have new work start from the integrating ontology instead of creating a totally new model
  • Update the integrating ontology based on mapping problems and new work (never claim that the ontology is immutable)
  • Utilize OWL's equivalentClass/disjointFrom/intersectionOf/unionOf/... (for classes), sameAs/differentFrom (for individuals) and class and property restrictions to tie concepts together in the "mapped" space (a short sketch follows this list)
  • Focus communication on concept diagrams, descriptions and documented mapping details, ... and not on the fact that you are using an ontology
  • Clearly document ontology/concept, relationship, ... evolution
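
As a hypothetical sketch of several of these rules working together - common superclasses, domain details placed under them, and OWL axioms tying the mapped concepts together - all names and namespaces below are invented for illustration:

    @prefix owl:  <http://www.w3.org/2002/07/owl#> .
    @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
    @prefix int:  <http://example.org/integration#> .   # the "integration" ontology
    @prefix crm:  <http://example.org/crmDomain#> .     # one mapped domain
    @prefix erp:  <http://example.org/erpDomain#> .     # another mapped domain

    # General concepts (superclasses) found across the domains
    int:Customer a owl:Class .
    int:Supplier a owl:Class .
    int:Customer owl:disjointWith int:Supplier .

    # Domain details sit "under" the superclasses; the domain
    # vocabularies themselves are left unchanged
    crm:Account a owl:Class ; rdfs:subClassOf int:Customer .
    erp:Client  a owl:Class ; rdfs:subClassOf int:Customer .

    # Where two domains mean exactly the same thing, say so
    crm:Account owl:equivalentClass erp:Client .
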
Let me know if this resonates with you or if you have different "rules of thumb".

Andrea

Monday, May 12, 2014

Ontology Summit 2014 and the communique

Ontology Summit 2014 officially concluded with the symposium on April 28-29. There were some great keynotes, summary presentations and discussions. You can see most of the slides on the Day 1 and Day 2 links, and can also check out the online, unedited Day 1 chat and Day 2 chat.

The main "output" of each Ontology Summit is a communique. This year's communique is titled Semantic Web and Big Data Meets Applied Ontology, consistent with the Summit theme. Follow the previous link to get the full document, and consider endorsing it (if you are so inclined). To endorse the communique, send an email to with the subject line: "I hereby confirm my endorsement of the OntologySummit2014 Communique" and include (at least) your name in the body of the email. Other remarks or feedback can also be included. And, I would encourage you to add your thoughts.

I want to provide a quick list of the high points of the communique (for me):
  • In the world of big data, ontologies can help with semantic integration and mapping, reduction of semantic mismatches, normalization of terms, and inference and insertion of metadata and other annotations.
  • Development approaches that involve a heavy-weight, complete analysis of "the world" are evolving to lighter weight approaches. This can be seen in the development of ontology design patterns, the use of ontologies in Watson, and the bottom-up annotation and interlinking approaches of web/RESTful services (as "Linked Services").
  • There are some best practices that can be applied for sharing and reuse to succeed (and since I drafted most of these best practices, I am just copying them directly below :-)):
    • Wise reuse possibilities follow from knowing your project requirements. Competency questions should be used to formulate and structure the ontology requirements, as part of an agile approach. The questions help contextualize and frame areas of potential content reuse.
    • Be tactical in your formalization. Reuse content based on your needs, represent it in a way that meets your objectives, and then consider how it might be improved and reused. Clearly document your objectives so that others understand why you made the choices that you did.
    • Small ontology design patterns provide more possibilities for reuse because they have low barriers for creation and potential applicability, and offer greater focus and cohesiveness. They are likely less dependent on the original context in which they were developed.
    • Use "integrating" modules to merge the semantics of reused, individual content and design patterns.
    • Separately consider the reuse of classes/concepts, properties, individuals and axioms. By separating these semantics (whether for linked data or ontologies) and allowing their specific reuse, it is easier to target specific content and reduce the amount of transformation and cleaning that is necessary.
    • RDF provides a basis for semantic extension (for example, by OWL and RIF). But, RDF triples without these extensions may be underspecified bits of knowledge. They can help with the vocabulary aspects of work, but formalization with languages like OWL can more formally define and constrain meaning. This allows intended queries to be answerable and supports reasoning.
    • Provide metadata (definitions, history and any available mapping documentation) for your ontologies and schemas. Also, it is valuable to distinguish constraints or concepts that are definitive (mandatory to capture the semantics of the content) from ones that are specific to a domain. Domain-specific usage, and "how-to" details for use in reasoning applications or data analytics, are also valuable. Some work in this area, such as Linked Open Vocabularies and several efforts in the Summit's Hackathon, is underway and should be supported.
    • Use a governance process for your ontologies (and it would be even better if enforced by your tooling). The process should include open consideration, comment, revision and acceptance of revisions by a community.
  • Lastly, what are some of the interesting areas of investigation? One area, certainly, is the need for tooling to better support modular ontology development, integration, and reuse. Another is support for hybrid reasoning capabilities - supporting both description logic and first-order logic reasoning, and both logical and probabilistic reasoning. Third, tooling that combines data analytic and ontological processing would be valuable to make sense of "big data", and aid in the dissemination of the resulting knowledge to users and for decision support. To truly address this last area, it may be necessary to create specialized hardware and processing algorithms to combine and process data using the graph-structured representations of ontologies.
That's it for me, but please take a look at the communique, draw your own conclusions, and determine your own highlights.

Andrea

Wednesday, May 7, 2014

Updated metadata ontology file (V0.6.0) and new metadata-properties ontology (V0.2.0) on GitHub

I've spent some time doing more work on the general metadata ontologies (metadata-annotations and metadata-properties). Metadata-annotations is now at version 0.6.0. In this release, I mainly corrected the SPARQL queries that were defined as the competency queries. SPARQL is straightforward, but it is easy to make mistakes. I made a few in my previous version (because I just wrote the queries by hand, without testing them - my bad). Anyway, that is all fixed now and the queries are correct. My apologies for the errors.

You can also see that there is a new addition to the metadata directory with the metadata-properties ontology. Metadata-properties takes some of the concepts from metadata-annotations, and redefines them as data and object properties. In addition, a few supporting classes are defined (specifically, Actor and Modification), where required to fully specify the semantics.

Actor is used as the subject of the object properties, contributedTo and created. Modification is designed to collect all the information related to a change or update to an individual. This is important when one wants to track the specifics of each change as a set of related data. This may not be important - for example, if one only wants to track the date of last modification or only track a description of each change. In these cases, the data property, dateLastModified, or the annotation property, changeNote, can be the predicate of a triple involving the updated individual directly.
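
A small Turtle sketch of the two styles follows. The Modification class and the dateLastModified/changeNote properties come from the ontology as described here; the linking property, the detail properties on Modification, and the namespace IRI are my assumptions:

    @prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .
    @prefix meta: <http://purl.org/ninepts/metadata#> .   # namespace IRI assumed
    @prefix ex:   <http://example.org/data#> .

    # Full change tracking: collect the details of one update as a Modification
    ex:myDocument meta:modification ex:change42 .        # linking property assumed
    ex:change42 a meta:Modification ;
        meta:dateModified "2014-05-07T10:30:00Z"^^xsd:dateTime ;   # detail properties assumed
        meta:description  "Corrected the competency queries." .

    # Lightweight alternative: annotate the updated individual directly
    ex:myDocument
        meta:dateLastModified "2014-05-07T10:30:00Z"^^xsd:dateTime ;
        meta:changeNote       "Corrected the competency queries." .
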

It is important to understand that only a minimum amount of information is provided for Actor and Modification. They are defined, but are purposefully underspecified to allow application- or domain-specific details to be provided in another ontology. (In which case, the IRIs of the corresponding classes in the other ontology would be related to Actor and Modification using an owl:equivalentClass axiom. This was discussed in the post on modular ontologies, and tying together the pieces.)

Also in the metadata-properties ontology, an identifier property is defined. It is similar to the identifier property from Dublin Core, but is not equivalent since the metadata-properties' identifier is defined as a functional data property. (The Dublin Core property is "officially" defined as an annotation property.)
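
In Turtle, such a declaration might look like this (the exact IRI is an assumption):

    @prefix owl:  <http://www.w3.org/2002/07/owl#> .
    @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
    @prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .
    @prefix meta: <http://purl.org/ninepts/metadata#> .   # namespace IRI assumed

    # A functional data property: each individual has at most one identifier
    meta:identifier a owl:DatatypeProperty , owl:FunctionalProperty ;
        rdfs:range xsd:string .
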

To download the files, there is information in the blog post from Apr 17th.

Please let me know if you have any feedback or issues.

Andrea

Monday, April 28, 2014

General, Reusable Metadata Ontology - V0.2

This is just a short post that a newer version of the general metadata ontology is available. The ontology was originally discussed in a blog post on April 16th. And, if you have trouble downloading the files, there is help in the blog post from Apr 17th.

I have taken all the feedback, and reworked and simplified the ontology (I hope). All the changes are documented in the ontology's changeNote.

Important sidebar: I strongly recommend using something like a changeNote to track the evolution of every ontology and model.

As noted in the Apr 16th post, most of the concepts in the ontology are taken from the Dublin Core Elements vocabulary and the SKOS data model. In this version, the well-established properties from Dublin Core and SKOS use the namespaces/IRIs from those sources (http://purl.org/dc/elements/1.1/ and http://www.w3.org/2004/02/skos/core#, respectively). Some examples are dc:contributor, dc:description and skos:prefLabel. Where the semantics differ, or where more obvious names are defined (for example, names that make the "direction" of the skos:broader and skos:narrower relations explicit), the purl.org/ninepts namespace is used.

This release is getting much closer to a "finished" ontology. All of the properties have descriptions and examples, and most have scope/usage notes. The ontology's scope note describes what is not mapped from Dublin Core and SKOS, and why.

In addition, I have added two unique properties for the ontology. One is competencyQuestions and the other is competencyQuery. The concept of competency questions was originally defined in a 1995 paper by Gruninger and Fox as "requirements that are in the form of questions that [the] ontology must be able to answer." The questions help to define the scope of the ontology, and are [should be] translated to queries to validate the ontology. These queries are captured in the metadata ontology as SPARQL queries (and the corresponding competency question is included as a comment in the query, so that it can be tracked). This is a start at test-driven development for ontologies. :-)
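
For instance, a competency question and its validating query might be captured along these lines (the competencyQuery property name is from this post; the ontology IRI and the query itself are illustrative):

    @prefix meta: <http://purl.org/ninepts/metadata#> .   # namespace IRI assumed

    <http://purl.org/ninepts/metadata> meta:competencyQuery """
        # Competency question: Which individuals record a last-modified date?
        PREFIX meta: <http://purl.org/ninepts/metadata#>
        SELECT ?entity ?date
        WHERE { ?entity meta:dateLastModified ?date }
    """ .
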

Please take a look at the ontology (even if you did before since it has evolved), and feel free to comment or (even better) contribute.

Andrea

Thursday, April 17, 2014

Downloading the Metadata Ontology Files from GitHub

Since I posted my ontology files to GitHub, and got some emails that the downloads were corrupted, I thought that I should clarify the download process.

You are certainly free to fork the repository and get a local copy. Or, you can just download the file(s) by following these instructions:
  • LEFT click on the file in the directory on GitHub
  • The file is displayed with several tabs across the top. Select the Raw tab.
  • The file is now displayed in your browser window as text. Save the file to your local disk using the "Save Page As ..." drop-down option, under File.
After you download the file(s), you can then load one of them into something like Protege. (It is only necessary to load one since they are all the same.) Note that there are NO classes, data or object properties defined in the ontology. There are only annotation properties that can be used on classes, data and object properties. Since I need this all to be usable in reasoning applications, I started with defining and documenting annotation properties.

I try to note this in a short comment on the ontology (but given the confusion, I should probably expand the comment). I am also working on a metadata-properties ontology which defines some of the annotation properties as data and object properties. This will allow (for example) validating dateTime values and referencing objects/individuals in relations (as opposed to using literal values). It is important to note, however, that you can only use data and object properties with individuals - not with class or property declarations - or you end up in OWL Full, with no computational guarantees and no reasoning.
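
A minimal illustration of that distinction, with hypothetical names:

    @prefix owl: <http://www.w3.org/2002/07/owl#> .
    @prefix ex:  <http://example.org/example#> .

    ex:Document    a owl:Class .
    ex:doc0        a ex:Document .
    ex:doc1        a ex:Document .
    ex:changeNote  a owl:AnnotationProperty .
    ex:derivedFrom a owl:ObjectProperty .

    # Fine in OWL DL: annotation properties may be used on a class declaration
    ex:Document ex:changeNote "Simplified in version 0.2." .

    # Fine in OWL DL: object properties relate individuals
    ex:doc1 ex:derivedFrom ex:doc0 .

    # NOT OWL DL: using an object property with a class as its subject
    # pushes the ontology into OWL Full.
    # ex:Document ex:derivedFrom ex:SomeOtherClass .
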

Lastly, for anyone who objects to using annotation properties for mappings (for example, where I map SKOS' exactMatch in the metadata-annotations ontology), no worries ... more is coming. As a place to start, I defined exactMatch, moreGeneralThan, moreSpecificThan, ... annotation properties for documentation and human consumption. (I have to start somewhere. :-) And, I tried to be more precise in my naming than SKOS, which names the latter two relations "broader" and "narrower", with no indication of whether the subject or the object is the broader or narrower one. (I always get this mixed up if I am away from the spec for more than a week. :-)
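
For reference, the direction in SKOS is that the object of skos:broader is the more general concept. The sketch below contrasts that with the more explicit names (assuming their natural reading, and an assumed ninepts namespace IRI):

    @prefix skos: <http://www.w3.org/2004/02/skos/core#> .
    @prefix meta: <http://purl.org/ninepts/metadata#> .   # namespace IRI assumed
    @prefix ex:   <http://example.org/concepts#> .

    # SKOS: "cat has the broader concept mammal" - the object is the more
    # general one, which is easy to forget from the bare property name.
    ex:cat skos:broader ex:mammal .

    # The direction is carried by the property name itself:
    ex:mammal meta:moreGeneralThan  ex:cat .
    ex:cat    meta:moreSpecificThan ex:mammal .
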

I want to unequivocally state that annotation properties are totally inadequate to do anything significant. But, they are a start, and something that another tool could query and use. Separately, I am working on a more formal approach to mapping but starting with documentation is where I am.

Obviously, there is a lot more work in the pipeline. I just wish I had more time (like everyone).

In the meantime, please let me know if you have more questions about the ontologies or any of my blog entries.

Andrea

Wednesday, April 16, 2014

General, Reusable, Metadata Ontology

I recently created a new ontology, following the principles discussed in Ontology Summit 2014's Track A. (If you are not familiar with the Summit, please check out some of my earlier posts.) My goal was to create a small, focused, general, reusable ontology (with usage and scope information, examples of each concept, and more). I must admit that it was a lot more time-consuming than I anticipated. It definitely takes time to create the documentation, validate and spell-check it, make sure that all the possible information is present, etc., etc.

I started with something relatively easy (I thought), which was a consolidation of basic Dublin Core and SKOS concepts into an OWL 2 ontology. The work is not yet finished (I have only been playing with the definition over the last few days). The "finished" pieces are the ontology metadata/documentation (including what I didn't map and why), and several of the properties (contributor, coverage, creator, date, language, mimeType, rights and their sub-properties). The rest is all still a work-in-progress.

It has been interesting creating and dog-fooding the ontology. I can definitely say that it was updated based on my experiences in using it!

You can check out the ontology definition on github (http://purl.org/ninepts/metadata). My "master" definition is in the .ofn file (OWL functional syntax), and I used Protege to generate a Turtle encoding from it. My goals are to maintain the master definition in a version-control-friendly format (ofn), and also to provide a somewhat human-readable format (ttl). I also want to experiment with different natural language renderings that are more readable than Turtle (but I am getting ahead of myself).

I would appreciate feedback on this metadata work, and suggestions for other reusable ontologies (that would help to support industry and refine the development methodology). Some of the ontologies that I am contemplating are ontologies for collections, events (evaluating and bringing together concepts from several, existing event ontologies), actors, actions, policies, and a few others.

Please let me know what you think.

Andrea

Saturday, April 5, 2014

Ontology Reuse and Ontology Summit 2014

I've been doing a lot of thinking about ontology and vocabulary reuse (given my role as co-champion of Track A in Ontology Summit 2014). We are finally in our "synthesis" phase of the Summit, and I just updated our track's synthesis draft yesterday.

So, while this is all fresh in my mind, I want to highlight a few key take-aways ... For an ontology to be reused, it must provide something "that is commonly needed"; and then, the ontology must be found by someone looking to reuse it, understood by that person, and trusted as regards its quality. (Sam Adams made all these points in 1993 in a panel discussion on software reuse.) To be understood and trusted, it must be documented far more completely than is (usually) currently done.

Here are some of the suggestions for documentation:
  • Fully describe and define each of the concepts, relationships, axioms and rules that make up the ontology (or fragment)
  • Explain why the ontology was developed
  • Explain how the ontology is to be used (and perhaps how the uses may vary with different triple stores or tools)
  • Explain how the ontology was/is being used (history) and how it was tested in those environment(s)
    • Explain differences, if it is possible to use the ontology in different ways in different domains and/or for different purposes
  • Provide valid encoding(s) of the ontology
    • These encodings should discuss how each has evolved over time
    • "Valid" means that there are no consistency errors when a reasoner is run against the ontology
    • It is also valuable to create a few individuals, run a reasoner, and make sure that each individual's subsumption hierarchy is correct (e.g., an individual that is supposed to only be of type "ABC" is not also of type "DEF" or "XYZ"; see the sketch after this list)
    • Multiple encodings may exist due to the use of different syntaxes (Turtle and OWL Functional Syntax, for example, to provide better readability, and better version control, respectively) and to specifically separate the content to provide:
      • A "basic" version of the ontology with only the definitive concepts, axioms and properties
      • Other ontologies that add properties and axioms, perhaps to address particular domains
      • Rules that apply to the ontology, in general or for particular domains
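
As a sketch of the individual-based check mentioned above (the classes are hypothetical), a disjointness axiom lets the reasoner catch an individual that falls into an unintended type:

    @prefix owl: <http://www.w3.org/2002/07/owl#> .
    @prefix ex:  <http://example.org/test#> .

    ex:ABC a owl:Class .
    ex:DEF a owl:Class .
    ex:ABC owl:disjointWith ex:DEF .

    # A test individual that should only be of type ABC
    ex:test1 a ex:ABC .

    # If axioms elsewhere also entail "ex:test1 a ex:DEF", the disjointness
    # axiom makes the ontology inconsistent, so the reasoner reports the
    # problem instead of silently accepting the mis-classification.
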
Defining much of this information is a goal of the VOCREF (Vocabulary and Ontology Characteristics Related to Evaluation of Fitness) Ontology, which was a Hackathon event in this year's Ontology Summit. I participated in that event on March 29th and learned a lot. A summary of our experiences and learnings is posted on the Summit wiki.

VOCREF is a good start at specifying characteristics for an ontology. I will certainly continue to contribute to it. But, I also feel that too much content is contained in the vocref-top ontology (I did create an issue to address this). That makes it too top-heavy and not as reusable as I would like. Some of the content needs to be split into separate ontologies that can be reused independently of characterizing an ontology. Also, the VOCREF ontology needs to "dog-food" its own concepts, relationships, ... VOCREF itself needs to be more fully documented.

To try to help with ontology development and reuse, I decided to start a small catalog of content (I won't go so far as to call it a "repository"). The content in the catalog will vary from annotation properties that can provide basic documentation, to general concepts applicable to many domains (for example, a small event ontology), to content specific to a domain. The catalog may directly reference, document and (possibly) extend ontologies like VOCREF (with correct attribution), or may include content that is newly developed. For example, right now, I am working on some general patterns and a high level network management ontology. I will post my current work, and then drill-down to specific semantics.

All of the content will be posted on the Nine Points github page. The content will be fully documented, and licensed under the MIT License (unless prohibited by the author and the licensing of the original content). In addition, for much of the content, I will also try to discuss the ontology here, on my blog.

Let me know if you have feedback on this approach and if there is some specific content that you would like to see!

Andrea

Wednesday, February 5, 2014

More on modular ontologies and tying them together

There is a short email dialog on this topic on the Ontology Summit 2014 mail list. I thought that I would add the reply from Amanda Vizedom as a blog post (to keep everything in one place).

Amanda added:

The style of modularity you mention, with what another summit poster (forgive me for forgetting who at the moment) referred to as 'placeholder' concepts within modules, can be very effective. The most effective technique I've found to date, for some cases.

Two additional points are worth making about how to execute this for maximum effectiveness (they may match what you've done, in fact, but are sometimes missed and so are worth calling out for others).

Point 1: lots of annotation on the placeholders. The location and connection of the well-defined concepts to link them to is often being saved for later, and possibly for someone else. In order to make sure the right external concept is connected, whatever is known or desired of the underspecified concept should be captured. In the location case, for example, it may be that the concept needs to support enough granularity to be used for the location at which a person can be contacted at the current time, or must be the kind of location that has a shipping address, or is only intended to be the place of business of the enterprise to which the Person is assigned and out of which they operate (e.g., embassy, business office, base, campus). That's often known or easily elicitable without leaving the focus of a specialized module, and can be captured in an annotation for use in finding existing, well-defined ontology content and mapping.

Point 2: the advantages of modules, as you described, are best maintained when the import and mapping are done *not* in the specialized module, but in a "lower" mapping module that inherits the specialized module and the mapping-target ontologies. Spindles of ontologies, which can be more or less intricate, allow for independent development and reuse of specialized modules, with lower mapping and integration modules, and with a spindle-bottom that imports everything in the spindle and effectively acts as the integrated query, testing, and application module for all the modules contained in that spindle, providing a simplified and integrated interface to a more complex and highly modular system of ontologies. Meanwhile, specialized modules can be developed with SMEs who don't know, care, or have time to think about the stuff they aren't experts about, like distinguishing kinds of location or temporal relations or the weather. Using placeholders and doing your mapping elsewhere may sound like extra work, but considering what it can enable, it can be an incredibly effective approach.

Indeed, the second point is exactly my "integrating" ontology, which imports the target ontologies and does the mapping. As to the first point, that is very much worth highlighting. I err on the side of over-documenting and use various different kinds of notes and annotation. For a good example, take a look at the annotation properties in the FIBO Foundations ontology. It includes comment, description, directSource, keyword, definition, various kinds of notes, and much more.

Another set of annotation properties that I use (which I have not seen documented before, but that I think is valuable for future mapping exercises) are WordNet synset references - as direct references or designating them as hyponyms or hypernyms. (For those not familiar with WordNet, check out this page and a previous blog post.)

Andrea

Sunday, February 2, 2014

Creating a modular ontology and then tying the pieces together

In my previous post, I talked about creating small, focused "modules" of cohesive semantic content. And, since these modules have to be small, they can't (and shouldn't) completely define everything that might be referenced. Some concepts will be under-specified.

So, how do we tie the modules together in an application?

In a recent project, I used OWL's equivalentClass axiom to do this. For example, in a Person ontology, I defined the Person concept with its relevant properties. When it came to the Person's Location - that was just an under-specified (i.e., empty) Location class. I then found a Location ontology, developed by another group, and opted to use that. Lastly, I defined an "integrating" ontology that imported the Person and Location ontologies, and specified an equivalence between the relevant concepts. So, PersonNamespace:Location was defined as an equivalentClass to LocationNamespace:Location. Obviously, the application covered up all this for the users, and my triple store (with reasoner) handled the rest.
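
A minimal sketch of that arrangement (the person, location and integration namespaces are hypothetical):

    @prefix owl:  <http://www.w3.org/2002/07/owl#> .
    @prefix pers: <http://example.org/person#> .
    @prefix loc:  <http://example.org/location#> .

    <http://example.org/integration> a owl:Ontology ;
        owl:imports <http://example.org/person> ,
                    <http://example.org/location> .

    # Tie the under-specified placeholder to the fully defined class
    pers:Location owl:equivalentClass loc:Location .
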

This approach left me with a lot of flexibility for reuse and ontology evolution, and didn't force imports except in my "integrating" ontology. And, a different application could bring in its own definition of Location and create its own "integrating" ontology.

But, what happens if you can't find a Location ontology that does everything that you need? You can still integrate/reuse other work, perhaps defined in your integrating ontology as subclasses of the (under-specified) PersonNamespace:Location concept.

This approach also works well when developing and reusing ontologies across groups. Different groups may use different names for the same semantic, may need to expand on some concept, or want to incorporate different semantics. If you have a monolithic ontology, these differences will be impossible to overcome. But, if you can say things like "my concept X is equivalent to your concept Y" or "my concept X is a kind of your Y with some additional restrictions" - that is very valuable. Now you get reuse instead of redefinition.

Andrea

Wednesday, January 29, 2014

Reuse of ontology and model concepts

Reuse is a big topic in this year's Ontology Summit. In a Summit session last week, I discussed some experiences related to my recent work on a network management ontology. The complete presentation is available from the Summit wiki. And, I would encourage you to look at all the talks given that day since they were all very interesting! (The agenda, slides, chat transcript, etc. are accessible from the conference call page.)

But ... I know that you are busy. So, here are some take-aways from my talk:

  • What were the candidates for reuse? There were actually several ontologies and models that were looked at (and I will talk about them in later posts), but this talk was about two specific standards: ISO 15926 for the process industry, and FIBO for the financial industry.
  • Why did we reuse these standards, given that there was not perfect overlap between the chosen domain models/ontologies and network management? Because there was good thought and insight put into the standards, and there also was tooling developed that we wanted to reuse. Besides that, we have limited time and money - so jump-starting the development was "a good thing".
  • Did we find valuable concepts to reuse? Definitely. Details are in the talk but two examples are:
    • Defining individuals as possible versus actual. For anyone that worries about network and capacity planning, inventory management, or staging of new equipment, the distinction between what you have now, what you will have, and what you might have is really important. (A rough sketch of this distinction follows the list.)
    • Ontology annotation properties. Documentation of definitions, sources of information, keywords, notes, etc. are extremely valuable to understand semantics. I have rarely seen good documentation in an ontology itself (it might be done in a specification that goes with the ontology). The properties defined and used in FIBO were impressive.
  • Was reuse easy? Not really. It was difficult to pull apart sets of distinct concepts in ISO 15926, although we should have done (and will do) more with templates in the future. Also, its use of OWL was a mapping from the original definition, which made it far less "natural"/native. FIBO was much more modular and defined in OWL. But due to ontology imports, we pretty much ended up loading and working through the complete foundational ontology.
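
As a rough sketch of the possible-versus-actual distinction mentioned above (the names are invented for illustration, not the ISO 15926 terms):

    @prefix owl:  <http://www.w3.org/2002/07/owl#> .
    @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
    @prefix nm:   <http://example.org/netmgmt#> .

    nm:Router a owl:Class .

    # Partition routers by lifecycle status
    nm:ActualRouter   a owl:Class ; rdfs:subClassOf nm:Router .
    nm:PossibleRouter a owl:Class ; rdfs:subClassOf nm:Router .
    nm:ActualRouter owl:disjointWith nm:PossibleRouter .

    # What you have now, versus what is planned or staged
    nm:coreRouter7  a nm:ActualRouter .
    nm:plannedEdge3 a nm:PossibleRouter .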

Given all this, what are some suggestions for getting more reuse?

  1. Create and publish more discrete, easily understood "modules" that:
    • Define a maximum of 12-15 core entities with their relationships (12-15 items is about the limit of what people can visually retain)
    • Document the assumptions made in the development (where perhaps short cuts were made, or could be made)
    • Capture the axioms (rules) that apply separately from the core entities (this could allow adjustments to the axioms or assumptions for different domains or problem spaces, without invalidating the core concepts and their semantics)
    • Encourage evolution and different renderings of the entities and relationships (for example, with and without short cuts)
  2. Focus on "necessary and sufficient" semantics when defining the core entities in a module and leave some things under-specified
    • Don't completely define everything just because it touches your semantics (admittedly, you have to bring all the necessary semantics together to create a complete model or ontology, but more on that in the next post)
    • A contrived example is that physical hardware is located somewhere in time and space, but it is unlikely that everyone's requirements for spatial and temporal information will be consistent. So, relate your Hardware entity to a Location and leave it at that. Let another module (or set of modules) handle the idiosyncrasies of Location.
In my next post, I promise to talk more about how to combine discrete "modules" with under-specified concepts to create a complete solution.

Andrea

