Abstract

HTML microdata [MICRODATA ] is an extension to HTML used to embed machine-readable data into HTML documents. Whereas the microdata specification describes a means of markup, the output format is JSON. This specification describes processing rules that may be used to extract RDF [RDF11-CONCEPTS ] from an HTML document containing microdata.

Status of This Document

This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.

This document is an experimental work in progress. The concepts described herein are intended to provide guidance for a possible future Working Group chartered to provide a Recommendation for this transformation. As a consequence, implementers of this specification, either producers or consumers, should note that it may change prior to any possible publication as a Recommendation.

This Working Draft is an update of the W3C Interest Group Note, published in October 2012. This update simplifies processing using the following mechanisms:

The intention is to publish this draft as a new version of the Interest Group Note after gathering and incorporating community input.

This document was published by the Semantic Web Interest Group as an Interest Group Note. If you wish to make comments regarding this document, please send them to semantic-web@w3.org (subscribe, archives). All comments are welcome.

Publication as an Interest Group Note does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.

The disclosure obligations of the Participants of this group are described in the charter.

This document is governed by the 1 August 2014 W3C Process Document.

1. Introduction

This section is non-normative.

This document describes a means of transforming HTML containing microdata into RDF. HTML Microdata [MICRODATA ] is an extension to HTML used to embed machine-readable data into HTML documents. This specification describes transformation directly to RDF [RDF11-CONCEPTS ].

Note

There are a variety of ways in which a mapping from microdata to RDF might be configured to give a result that is closer to the required result for a particular vocabulary. This specification defines terms that can be used as hooks for vocabulary-specific behavior, which could be defined within a registry or on an implementation-defined basis.

For background on the trade-offs between these options, see http://www.w3.org/wiki/Mapping_Microdata_to_RDF and GitHub Issues.

1.1 Background

This section is non-normative.

Microdata [MICRODATA ] is a way of embedding data in HTML documents using attributes. The HTML DOM is extended to provide an API for accessing microdata information, and the microdata specification defines how to generate a JSON representation from microdata markup.

Mapping microdata to RDF enables consumers to merge data expressed in other RDF-based formats with microdata. It facilitates the use of RDF vocabularies within microdata, and enables microdata to be used with the full RDF toolchain. Some use cases for this mapping are described in Section 1.2 below.

Microdata's data model does not align neatly with RDF.

  • Non-URL microdata properties are disambiguated based on microdata item type; an item with the type http://example.org/Cat can have both the property color and the property http://example.org/color, and these properties are semantically distinct under microdata. In RDF, all properties have IRIs.
  • When an item has multiple properties with the same name, the values are always ordered; in RDF, property values are unordered unless they are explicitly listed in an RDF Collection.
  • Except for some specific element values, a value in microdata is always a simple string which is interpreted by the consuming application. In RDF, values can be tagged with a datatype or a language. According to the microdata specification, the HTML context of microdata markup should not change how microdata is interpreted, so although element names and HTML @lang attributes could be used to provide datatype and language information for RDF data, this would be contrary to the microdata specification.

Thus, in some places the needs of RDF consumers violate requirements of the microdata specification. This specification highlights where such violations occur and the reasons for them.

This specification allows for vocabulary-specific rules that affect the generation of property URIs and value serializations. This is facilitated by a registry that associates URIs with specific rules: itemtype values are matched against registered URI prefixes to determine a vocabulary and, potentially, vocabulary-specific processing rules.

This specification also assumes that consumers of RDF generated from microdata may have to process the results in order to, for example, assign appropriate datatypes to property values.

1.2 Use Cases

This section is non-normative.

During the period of the task force, a number of use cases were put forth for the use of microdata in generating RDF:

  • Semantic search engines such as Sindice use RDF as their backend data model. They want to gather information expressed using microdata alongside information expressed in RDF-based formats and make it available to others to use, as a service. In these cases, the ultimate consumer, who will need to understand the vocabularies used within the microdata, is the program or person who pulls out data from Sindice. Sindice needs to retain the distinctions in the original microdata (e.g. ordering of items) and might not have built-in knowledge about the vocabulary of interest to the ultimate consumer. In this case, the ultimate consumer is likely to have to map/validate/handle errors in the data they get from Sindice.
  • A consumer such as openelectiondata.org wants to support microdata-based markup of their vocabulary as well as RDFa-based markup, both going into an RDF-based data store. They want to use an off-the-shelf tool to extract the microdata. They want to configure the tool to give them the RDF that is appropriate for their known vocabulary.
  • A browser plugin that captures data for the user uses an RDF model as its backend store. Any time it encounters microdata on a page, it wants to pull that microdata into the store on the fly.
  • GoodRelations properties do not take rdf:List values; when they take multiple values they are unordered. The rdfs:range of a GoodRelations property indicates the datatype of the expected value, and GoodRelations processors will expect values to be cast to that type. Language information from the HTML needs to be captured as it is common that multiple values will be used to specify the same information in different languages.
  • Schema.org has an extension mechanism to allow authors to express information that is more detailed than the pre-defined types, properties and enumerations. Property URIs are all in the same flat namespace as types, but authors can add detail by appending a '/' and a further term to the type or property. For example, schema.org defines a musicGroupMember property having a URI of http://schema.org/musicGroupMember, and an author might express more detail through an ad-hoc sub-property musicGroupMember/leadVocalist, having the URI http://schema.org/musicGroupMember/leadVocalist.

1.3 Issues

This section is non-normative.

Decisions or open issues in the specification are tracked on the GitHub Issue Tracker. These include the following:

Experimental Feature

Experimental support for itemprop-reverse. This attribute is not part of [MICRODATA ] and is included as an experimental feature. Specific feedback from the community is requested. Based on adoption, the attribute may be considered for inclusion in forthcoming versions of [MICRODATA ] and this note.

The purpose of this specification is to provide input to a future working group that can make decisions about the need for a registry and the details of processing. Among the options investigated by the Task Force are the following:

  • Property URI generation using the original microdata specification with a base URI and fragment made up of the in-scope item type and item properties.
  • Vocabulary-based URI generation, where the vocabulary is determined from the in-scope item type, either through an algorithmic modification of the type URL or by matching the URL against a registry. The vocabulary URI is then used to generate property URIs in a namespace parallel to the type URI.
  • When there are multiple top-level items in a document, place items in an RDF Collection. Alternatively, simply list the items as multiple values, or do not generate an http://www.w3.org/ns/md#item mapping at all.
  • When an item has multiple values for a given property, place the values in an RDF Collection. Alternatively, do not use collections, use an alternative such as rdf:Seq, or place all values, whether or not multiple, into some form of collection.

2. Attributes and Syntax

The microdata specification [MICRODATA ] defines a number of attributes and the way in which those attributes are to be interpreted. The microdata DOM API provides methods and attributes for retrieving microdata from the HTML DOM.

For reference, attributes used for specifying and retrieving HTML microdata are referenced here:

itemid
An attribute containing a URL used to identify the subject of triples associated with this item. (See itemid in [MICRODATA ]).
itemprop
An attribute used to identify one or more names of an item. An itemprop contains a space-separated list of names which may either be absolute URLs or terms associated with the type of the item as defined by the referencing item's item type. (See itemprop in [MICRODATA ]).
itemref
An additional attribute on an element that references additional elements containing property definitions to be applied to the referencing item. (See itemref in [MICRODATA ]).
itemscope
A boolean attribute identifying an element as an item. (See itemscope in [MICRODATA ]).
itemtype
An additional attribute on an element used to specify one or more types of an item. The item type of an item is the first value returned from element.itemType on the element. The item type is also used to resolve non-URL names to absolute URLs. Available through the Microdata DOM API as element.itemType . (See itemtype in [MICRODATA ]).

In RDF, it is common for people to shorten vocabulary terms via abbreviated URIs that use a 'prefix' and a 'reference'. Throughout this document, assume that the following vocabulary prefixes have been defined:

dc: http://purl.org/dc/terms/
md: http://www.w3.org/ns/md#
rdf: http://www.w3.org/1999/02/22-rdf-syntax-ns#
rdfs: http://www.w3.org/2000/01/rdf-schema#
rdfa: http://www.w3.org/ns/rdfa#
xsd: http://www.w3.org/2001/XMLSchema#

3. Vocabulary Registry

This section is non-normative.

In a perfect world, all processors would be able to generate the same output for a given input without regard to the requirements of a particular vocabulary. However, microdata doesn't provide sufficient syntactic help in making these decisions. Different vocabularies have different needs.

The registry is available in a variety of formats at the namespace defined for microdata: http://www.w3.org/ns/md. Under control of a runtime option, a processor should instead use another registry, provided by reference, to affect processing.

The registry associates a URI prefix with one or more key-value pairs denoting processor behavior. A hypothetical JSON representation of such a registry might be the following:

Example 1
{
 "http://schema.org/": {
  "properties": {
   "additionalType": {"subPropertyOf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"}
  }
 },
 "http://microformats.org/profile/hcard": {}
}

This structure associates mappings for two URIs: http://schema.org/ and http://microformats.org/profile/hcard. Items having an item type with a URI prefix from this registry use the rules described for that prefix within the scope of that item type. For http://schema.org/, this mapping currently defines a single property, additionalType, with a value to indicate specific behavior. The registry also allows overrides on a per-property basis; the properties key associates an individual name with overrides for default behavior. The interpretation of these rules is defined in the following sections. If an item has no current type or the registry contains no URI prefix matching current type, a conforming processor MUST use the default values defined for these rules.
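The prefix matching described above can be sketched as follows. This is a minimal illustration, not part of the specification: the REGISTRY dictionary mirrors the hypothetical two-prefix registry, and the function names find_vocabulary and property_rules are assumptions introduced here.

```python
from typing import Optional

# Hypothetical in-memory copy of the two-prefix registry described above.
REGISTRY = {
    "http://schema.org/": {
        "properties": {
            "additionalType": {
                "subPropertyOf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
            }
        }
    },
    "http://microformats.org/profile/hcard": {},
}


def find_vocabulary(item_type: Optional[str], registry: dict = REGISTRY) -> Optional[str]:
    """Return the registered URI prefix that item_type starts with, or None."""
    if not item_type:
        return None
    for prefix in registry:
        # Character-for-character match up to the length of the prefix.
        if item_type.startswith(prefix):
            return prefix
    return None


def property_rules(item_type: Optional[str], name: str, registry: dict = REGISTRY) -> dict:
    """Return the per-property overrides for name, or {} to signal default behavior."""
    prefix = find_vocabulary(item_type, registry)
    if prefix is None:
        return {}
    return registry[prefix].get("properties", {}).get(name, {})
```

An empty result from property_rules stands for "use the default values", which covers both an item with no current type and a type with no matching prefix.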

3.1 Property URI Generation

This section is non-normative.

For names which are not absolute URLs, this section defines the algorithm for generating an absolute URL given an evaluation context including a current type and current vocabulary.

The procedure for generating property URIs is defined in Generate Predicate URI.

The URI generation scheme appends names that are not absolute URLs to the URI prefix. When generating property URIs, if the URI prefix does not end with a '/' or '#', a '#' is appended to the URI prefix. (See Step 3 in Generate Predicate URI.)

The URI is created by appending the name to the URI prefix selected for the in-scope type. Consider the following example:

Example 2
<span itemscope itemtype="http://microformats.org/profile/hcard">
 <span itemprop="n" itemscope>
 <span itemprop="given-name">
 Princeton
 </span>
 </span>
</span>

Given the URI prefix http://microformats.org/profile/hcard, this would generate http://microformats.org/profile/hcard#n and http://microformats.org/profile/hcard#given-name. Note that the '#' is automatically added as a separator.

Looking at another example:

Example 3
<div itemscope itemtype="http://schema.org/Person">
 <h2 itemprop="name">Jeni</h2>
</div>

Given the URI prefix http://schema.org/, this would generate http://schema.org/name. Note that if the itemtype were http://schema.org/Person/Teacher, this would generate the same property URI.

If the registry contains no match for current type, implementations MUST act as if there is a URI prefix made from the first itemtype value by stripping either the fragment content or, if the value has no fragment, the last path segment (See [RFC3986 ]).

Note

Deconstructing the itemtype URL to create or identify a vocabulary URI is a violation of the microdata specification which is necessary to support the use of existing vocabularies designed for use with RDF, and shared or inherited properties within all vocabularies.

Example 4
<div itemscope itemtype="http://example.org/Book">
 <h2 itemprop="title">Just a Geek</h2>
</div>

In this example, assuming no matching entry in the registry, the URI prefix is constructed by removing the last path segment, leaving the URI http://example.org/. The resulting property URI would be http://example.org/title.
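The fallback used when the registry has no match can be sketched in a few lines. This is an illustrative approximation only: the function name default_vocabulary is an assumption, and the sketch uses plain string handling rather than a full [RFC3986 ] parser, so unusual URLs (for example, ones without any path) are not handled.

```python
def default_vocabulary(item_type: str) -> str:
    """Derive a URI prefix from an itemtype value that has no registry match.

    Strips the fragment content if there is a fragment; otherwise strips
    the last path segment. The trailing '#' or '/' is kept.
    """
    if "#" in item_type:
        # Keep everything up to and including the '#'.
        return item_type[: item_type.index("#") + 1]
    slash = item_type.rfind("/")
    if slash == -1:
        # No separator at all: nothing sensible to strip in this sketch.
        return item_type
    return item_type[: slash + 1]
```

For the Book example, default_vocabulary("http://example.org/Book") yields http://example.org/, to which the name title is then appended.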

If there is no in-scope itemtype, property URIs are generated using the base URI of the document and the name as a fragment. Consider the following example:

Example 5
<div itemscope>
<p itemscope itemprop='bar'>
 <span itemprop='baz'>Baz</span>
</p>
</div>

If the document is located at http://example/author, the name bar generates the URI http://example/author#bar. The name baz belongs to the nested item, which is also untyped, so its property URI is likewise generated from the document base: http://example/author#baz.

This scheme is compatible with the needs of other RDF serialization formats such as RDF/XML [RDF-SYNTAX-GRAMMAR ], which rely on QNames for expressing properties. For example, the generated property URIs can be split as follows:

Example 6
<rdf:Description xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
 xmlns:base="http://example/author#"
 rdf:type="http://microformats.org/profile/hcard">
<base:bar>
 <rdf:Description>
 <base:baz>Baz</base:baz>
 </rdf:Description>
</base:bar>
</rdf:Description>

3.2 Value Typing

This section is non-normative.

In microdata, all values are strings. In RDF, values may be resources or may be typed with an appropriate datatype.

In some cases, the type of a microdata value can be determined from the element on which it is specified. In particular:

  • URL property elements provide URLs
  • time elements provide dates, times and durations
  • data and meter elements provide doubles and integers
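As a rough sketch of how a processor might pick a datatype from such element values, the following uses regular expressions. The patterns are deliberately simplified approximations of the XSD lexical forms (for instance, they do not check month or day ranges), and the names time_datatype and number_datatype are assumptions made for this illustration.

```python
import re

XSD = "http://www.w3.org/2001/XMLSchema#"
_TZ = r"(Z|[+-]\d{2}:\d{2})?"

# Simplified approximations of XSD lexical forms for time element values.
_TIME_FORMS = [
    ("dateTime", re.compile(r"^\d{4,}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d+)?" + _TZ + "$")),
    ("date", re.compile(r"^\d{4,}-\d{2}-\d{2}$")),
    ("time", re.compile(r"^\d{2}:\d{2}:\d{2}(\.\d+)?" + _TZ + "$")),
    ("gYearMonth", re.compile(r"^\d{4,}-\d{2}$")),
    ("gYear", re.compile(r"^\d{4,}$")),
    ("duration", re.compile(r"^P(?!$)(\d+D)?(T(?=\d)(\d+H)?(\d+M)?(\d+(\.\d+)?S)?)?$")),
]

# For data and meter values, integer is tried first because every
# integer string would also match the double pattern.
_NUMBER_FORMS = [
    ("integer", re.compile(r"^[+-]?\d+$")),
    ("double", re.compile(r"^[+-]?(\d+(\.\d*)?|\.\d+)([eE][+-]?\d+)?$")),
]


def _datatype(value, forms):
    """Return the datatype IRI of the first matching form, or None."""
    for local_name, pattern in forms:
        if pattern.match(value):
            return XSD + local_name
    return None


def time_datatype(value):
    """Datatype IRI for a time element value, or None for a plain literal."""
    return _datatype(value, _TIME_FORMS)


def number_datatype(value):
    """Datatype IRI for a data or meter element value, or None."""
    return _datatype(value, _NUMBER_FORMS)
```

A result of None means the value stays a simple (or language-tagged) literal.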

4. Vocabulary Expansion

Microdata requires that all values of itemtype come from the same vocabulary. This is required as itemprop values are resolved relative to that vocabulary. However, it is often useful to define an item to have types from multiple different vocabularies.

Vocabulary expansion uses simple rules to generate additional triples based on rules and property relationships described in the registry. Within the registry, a property definition may have either equivalentProperty or subPropertyOf keys having an IRI value (or array of IRI values) of the associated property. Such an entry causes the processor to generate triples associating the source property IRI with the target property IRI using either rdfs:subPropertyOf or owl:equivalentProperty predicates.

For example, the registry definition for the additionalType property within schema.org defines additionalType to have an rdfs:subPropertyOf relationship with rdf:type.

Example 7
<div itemscope itemtype="http://schema.org/Person">
 <link itemprop="additionalType" href="http://xmlns.com/foaf/0.1/Person"/>
 <a itemprop="email http://xmlns.com/foaf/0.1/mbox" href="mailto:mail@gmail.com">
 mail@gmail.com
 </a>
</div>

In the previous example, the registry rule causes the processor to emit an extra triple when it encounters the additionalType itemprop. Before vocabulary expansion, the item yields the following triples:

Example 8
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfa: <http://www.w3.org/ns/rdfa#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix schema: <http://schema.org/> .
[ a schema:Person;
 schema:additionalType foaf:Person;
 schema:email <mailto:mail@gmail.com>;
 foaf:mbox <mailto:mail@gmail.com>
] .

After performing vocabulary expansion, an additional rdf:type triple is generated:

Example 9
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfa: <http://www.w3.org/ns/rdfa#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix schema: <http://schema.org/> .
<> rdfa:usesVocabulary schema: .
[ a schema:Person, foaf:Person;
 schema:additionalType foaf:Person;
 schema:email <mailto:mail@gmail.com>;
 foaf:mbox <mailto:mail@gmail.com>
] .

Note

The owl:equivalentProperty rule is more powerful than rdfs:subPropertyOf, in that if any equivalent property matches, then the source property would also cause a triple to be generated. For example, if the registry stated that name was equivalent to rdfs:label, then any use of name in an itemprop would cause a triple using rdfs:label to be emitted, as with rdfs:subPropertyOf. However, logically, any use of label where the current vocabulary were rdfs: could also cause a triple using schema:name to be emitted. To simplify processing, this specification requires that all values of an owl:equivalentProperty registry entry have their own rules with those values as keys within the property section of their respective vocabularies.
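The expansion step can be sketched as a small function that, given one generated triple, returns the extra triples implied by the registry. This is an illustration under assumptions: triples are plain tuples of strings, the registry is a dictionary shaped like the hypothetical JSON registry, and the name expansion_triples is introduced here.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical registry fragment carrying the additionalType rule.
REGISTRY = {
    "http://schema.org/": {
        "properties": {"additionalType": {"subPropertyOf": RDF_TYPE}}
    }
}


def expansion_triples(subject, name, value, vocab, registry=REGISTRY):
    """Return extra (subject, predicate, object) triples for one property use.

    name is the itemprop token and vocab the current vocabulary prefix.
    A rule value may be a single IRI or a list of IRIs.
    """
    rules = registry.get(vocab, {}).get("properties", {}).get(name, {})
    extra = []
    for key in ("subPropertyOf", "equivalentProperty"):
        targets = rules.get(key, [])
        if isinstance(targets, str):
            targets = [targets]
        for target in targets:
            extra.append((subject, target, value))
    return extra
```

Applied to the additionalType value of the Person item, this produces the additional rdf:type triple with foaf:Person as its object; properties without a rule, such as email, produce nothing extra.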

5. Control of Microdata Processors

The external registry may be controlled by the registry option passed to the microdata processor. If specified, the registry must be loaded from the location indicated as the option value. Otherwise, the processor MUST load the default registry from http://www.w3.org/ns/md.

Setting registry is performed in a processor-specific way.

When accessed as a web service using HTTP GET, POST or a similar action, processors SHOULD use the registry query parameter. The acceptable value for registry is a URI-encoded URL. Web service processors SHOULD return the resulting RDF graph in the format selected by HTTP Content Negotiation for an acceptable content type. Web service processors MUST support [N-TRIPLES ].
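For a web service, the registry value has to be URI-encoded before it is placed in the query string. The sketch below shows one way a client might build such a request URL. The endpoint address and the url parameter naming the document to process are hypothetical; only the registry parameter comes from this section.

```python
from urllib.parse import urlencode


def service_url(endpoint, document_url, registry_url=None):
    """Build a GET URL for a hypothetical microdata-to-RDF web service."""
    # "url" is an assumed parameter name for the document to process.
    params = {"url": document_url}
    if registry_url is not None:
        # urlencode percent-encodes the value, giving a URI-encoded URL.
        params["registry"] = registry_url
    return endpoint + "?" + urlencode(params)


request_url = service_url(
    "http://processor.example/convert",   # hypothetical endpoint
    "http://example.org/doc",
    "http://example.org/registry.json",   # hypothetical alternative registry
)
```

A client would typically send this request with an Accept header naming the desired RDF serialization, so that content negotiation can select the response format.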

6. Algorithm

Transformation of Microdata to RDF makes use of general processing rules described in [MICRODATA ] for the treatment of items.

6.1 Algorithm Terms

absolute URL
The term absolute URL is defined in [HTML5 ].
blank node
A blank node is a node in a graph that is neither a URI reference nor a literal. Items without a global identifier have a blank node allocated to them. (See [RDF11-CONCEPTS ]).
canonicalized fragment
The term canonicalized fragment is defined in [URL ]. This involves transforming elements added to URLs to ensure that the result remains a valid URL. Non-Unicode characters, and characters less than U+0020 SPACE character (" ") are subject to percent escaping.
document base
The base address of the document being processed, as defined in Resolving URLs in [HTML5 ].
evaluation context
A data structure including the following elements:
memory
a mapping of items to subjects, initially empty;
current type
an absolute URL for the current type, used when an item does not contain an item type;
current vocabulary
an absolute URL for the current vocabulary, from the registry.
item
An item is described by an element containing an itemscope attribute. The list of top-level microdata items may be retrieved using the Microdata DOM API document.getItems method.
item properties
The mechanism for finding the properties of an item. The list of item properties may be retrieved using the Microdata DOM API element.properties attribute.
global identifier
The value of an item's itemid attribute, if it has one, resolved relative to the element on which the attribute is specified. (See itemscope in [MICRODATA ]).
literal
Literals are values such as strings and dates. These include typed literals, language-tagged strings and simple literals, as defined in [RDF11-CONCEPTS ].
property
Each name identifies a property of an item. An item may have multiple elements sharing the same name, creating a multi-valued property.
property names
The tokens of an element's itemprop attribute. Each token is a name. (See property names in [MICRODATA ]).
property value
The property value of a name-value pair added by an element with an itemprop attribute depends on the element.
If the element has no itemprop attribute
The value is null and no triple should be generated.
If the element creates an item (by having an itemscope attribute)
The value is the URI reference or blank node returned from generate the triples for that item.
If the element is a URL property element (a, area, audio, embed, iframe, img, link, object, source, track or video)
The value is a URI reference created from element.itemValue . (See relevant attribute descriptions in [HTML5 ]).
If the element is a meter or data element.
The value is a literal made from element.itemValue .
If the value is a valid integer having the lexical form of xsd:integer [XMLSCHEMA11-2 ]
The value is a typed literal composed of the value and http://www.w3.org/2001/XMLSchema#integer.
If the value is a valid floating-point number having the lexical form of xsd:double [XMLSCHEMA11-2 ]
The value is a typed literal composed of the value and http://www.w3.org/2001/XMLSchema#double.
Otherwise
The value is a simple literal.
If the element is a meta element with a @content attribute.
If the element has a non-empty language , the value is a language-tagged string created from the value of the @content attribute with language information set from the language of the property element. Otherwise, the value is a simple literal created from the value of the @content attribute.
If the element is a time element.
The value is a literal made from element.itemValue .
If the value is a valid date string having the lexical form of xsd:date [XMLSCHEMA11-2 ].
The value is a typed literal composed of the value and http://www.w3.org/2001/XMLSchema#date.
If the value is a valid time string having the lexical form of xsd:time [XMLSCHEMA11-2 ].
The value is a typed literal composed of the value and http://www.w3.org/2001/XMLSchema#time.
If the value is a valid local date and time string or valid global date and time string having the lexical form of xsd:dateTime [XMLSCHEMA11-2 ].
The value is a typed literal composed of the value and http://www.w3.org/2001/XMLSchema#dateTime.
If the value is a valid month string having the lexical form of xsd:gYearMonth [XMLSCHEMA11-2 ].
The value is a typed literal composed of the value and http://www.w3.org/2001/XMLSchema#gYearMonth.
If the value is a valid non-negative integer having the lexical form of xsd:gYear [XMLSCHEMA11-2 ].
The value is a typed literal composed of the value and http://www.w3.org/2001/XMLSchema#gYear.
If the value is a valid duration string having the lexical form of xsd:duration [XMLSCHEMA11-2 ].
The value is a typed literal composed of the value and http://www.w3.org/2001/XMLSchema#duration.
Otherwise
If the element has a non-empty language , the value is a language-tagged string created from the value with language information set from the language of the property element. Otherwise, the value is a simple literal created from the value.
Note

The HTML valid yearless date string is similar to xsd:gMonthDay , but the lexical forms differ, so it is not included in this conversion.

See The time element in [HTML5 ].

Otherwise
If the element has a non-empty language , the value is a language-tagged string created from the value with language information set from the language of the property element. Otherwise, the value is a simple literal created from the value.

See The lang and xml:lang attributes in [HTML5 ] for determining the language of a node.

top-level item
An item which does not contain an itemprop attribute. Available through the Microdata DOM API as document.getItems. (See top-level microdata item in [MICRODATA ]).
URI reference
URI references are suitable to be used in subject, predicate or object positions within an RDF triple, as opposed to a literal value that may contain a string representation of a URI. (See [RDF11-CONCEPTS ]).
Issue

The HTML5/microdata content model for @href, @src, @data, itemtype, itemprop and itemid is that of a URL, not a URI or IRI.

A proposed mechanism for specifying the range of property values to be URI reference or IRI could allow these to be specified as subject or object using a @content attribute.

vocabulary
A vocabulary is a collection of URIs, suitable for use as an itemtype or itemprop value, that share a common URI prefix. That prefix is the vocabulary URI. A vocabulary URI is not allowed to be a prefix of another vocabulary URI.
Note
This definition differs from the language in the HTML spec and is just for the purpose of this document. In HTML, a vocabulary is a specification, and doesn't have a URI. In our view, if one specification defines ten itemtypes, then these could be treated as one vocabulary or as ten distinct vocabularies; it is entirely up to the vocabulary creator.

6.2 RDF Conversion Algorithm

An HTML document containing microdata MAY be converted to any other RDF-compatible document format using the algorithm specified in this section.

A conforming microdata processor implementing RDF conversion MUST implement a processing algorithm that results in the equivalent triples to those that the following algorithm generates:

  1. For each element that is also a top-level item, Generate the triples for that item using the evaluation context.

6.3 Generate the triples

When the user agent is to Generate triples for an item item, given evaluation context, it must run the following steps:

Note

This algorithm has undergone substantial change from the original microdata specification [MICRODATA ].

  1. If there is an entry for item in memory, then let subject be the subject of that entry. Otherwise, if item has a global identifier and that global identifier is an absolute URL, let subject be that global identifier. Otherwise, let subject be a new blank node.
  2. Add a mapping from item to subject in memory
  3. For each type returned from element.itemType of the element defining the item.
    1. If type is an absolute URL, generate the following triple:
      subject
      subject
      predicate
      http://www.w3.org/1999/02/22-rdf-syntax-ns#type
      object
      type (as a URI reference)
  4. If element.itemType returns at least one value, set type to the first value returned from element.itemType of the element defining the item.
  5. Otherwise, set type to current type from evaluation context if not empty.
  6. If the registry contains a URI prefix that is a character for character match of type up to the length of the URI prefix, set vocab as that URI prefix.
  7. Otherwise, if type is not empty, construct vocab by removing everything following the last SOLIDUS U+002F ("/") or NUMBER SIGN U+0023 ("#") from the path component of type.
  8. Update evaluation context setting current vocabulary to vocab.
  9. For each element element that has one or more property names and is one of the properties of the item item run the following substep:
    1. For each name in the element's property names, run the following substeps:
      1. Let context be a copy of evaluation context with current type set to type.
      2. Let predicate be the result of generate predicate URI using context and name.
      3. Let value be the property value of element.
      4. If value is an item, then generate the triples for value using context. Replace value by the subject returned from those steps.
      5. Generate the following triple:
        subject
        subject
        predicate
        predicate
        object
        value
      6. If an entry exists in the registry for name in the vocabulary associated with vocab having the key subPropertyOf or equivalentProperty, for each such value equiv, generate the following triple:
        subject
        subject
        predicate
        equiv
        object
        value
  10. Return subject
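The steps above can be sketched over a simplified item model. Everything about the model is an assumption made for illustration: an item is a dictionary with optional "id", "types" and "properties" keys, each property is an (itemprop string, value) pair, nested items are dictionaries, and all other values are treated as plain strings. The absolute-URL test and fragment handling are intentionally crude.

```python
import itertools

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
_blank_ids = itertools.count()


def vocab_for(item_type, registry):
    """Steps 6-7: registry prefix match, else strip after the last '/' or '#'."""
    for prefix in registry:
        if item_type.startswith(prefix):
            return prefix
    cut = max(item_type.rfind("/"), item_type.rfind("#"))
    return item_type[: cut + 1]


def predicate_uri(name, current_type, vocab, base):
    """Simplified predicate URI generation (no fragment canonicalization)."""
    if ":" in name:  # crude stand-in for the absolute URL test
        return name
    if current_type is None:
        return base + "#" + name
    separator = "" if vocab.endswith(("/", "#")) else "#"
    return vocab + separator + name


def generate_triples(item, context, registry, base, triples):
    memory = context["memory"]
    # Steps 1-2: reuse a known subject, else the global identifier, else a blank node.
    subject = memory.get(id(item)) or item.get("id") or "_:b%d" % next(_blank_ids)
    memory[id(item)] = subject
    # Step 3: one rdf:type triple per item type.
    types = item.get("types", [])
    for item_type in types:
        triples.append((subject, RDF_TYPE, item_type))
    # Steps 4-8: choose the current type and vocabulary.
    current_type = types[0] if types else context.get("current_type")
    vocab = vocab_for(current_type, registry) if current_type else None
    inner = dict(context, current_type=current_type, current_vocabulary=vocab)
    # Step 9: one triple per name, plus registry-driven expansion triples.
    for names, value in item.get("properties", []):
        for name in names.split():
            predicate = predicate_uri(name, current_type, vocab, base)
            obj = value
            if isinstance(value, dict):
                obj = generate_triples(value, inner, registry, base, triples)
            triples.append((subject, predicate, obj))
            rules = registry.get(vocab, {}).get("properties", {}).get(name, {})
            for key in ("subPropertyOf", "equivalentProperty"):
                targets = rules.get(key, [])
                for equiv in ([targets] if isinstance(targets, str) else targets):
                    triples.append((subject, equiv, obj))
    # Step 10.
    return subject
```

The copied context shares the same memory mapping, so an item reached twice keeps a single subject. Triples are collected in a list, so a nested item referenced under several names may contribute duplicates; an RDF graph would treat these as one triple.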

6.4 Generate Predicate URI

Predicate URI generation makes use of current type and current vocabulary from an evaluation context context along with name.

  1. If name is an absolute URL, return name as a URI reference.
  2. If current type from context is null, there can be no current vocabulary. Return the URI reference that is the document base with its fragment set to the canonicalized fragment value of name.
    Note
    This rule is intended to allow for the case where no type is set, and therefore there is no vocabulary from which to extract rules. For example, if there is a document base of http://example.org/doc and an itemprop of 'title', a URI will be constructed to be http://example.org/doc#title.
  3. Set expandedURI to the URI reference constructed by appending the canonicalized fragment value of name to current vocabulary, separated by a U+0023 NUMBER SIGN character ("#") unless the current vocabulary ends with either a U+0023 NUMBER SIGN character ("#") or SOLIDUS U+002F ("/").
  4. Return expandedURI.
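A compact sketch of these steps follows. It is an approximation under stated assumptions: the absolute URL check is a simple scheme test rather than the [HTML5 ] definition, and percent-encoding with urllib.parse.quote stands in for the canonicalized fragment defined in [URL ]. The function and parameter names are introduced here for illustration.

```python
import re
from urllib.parse import quote

_SCHEME = re.compile(r"^[A-Za-z][A-Za-z0-9+.-]*:")


def generate_predicate_uri(name, current_type, current_vocabulary, document_base):
    """Return the predicate URI for name in the given evaluation context."""
    # Step 1: absolute URLs are used as-is (approximate scheme-based check).
    if _SCHEME.match(name):
        return name
    # Stand-in for the canonicalized fragment value of name.
    fragment = quote(name, safe="")
    # Step 2: no type means no vocabulary; use the document base.
    if current_type is None:
        return document_base.split("#", 1)[0] + "#" + fragment
    # Step 3: add '#' unless the vocabulary already ends in '#' or '/'.
    if current_vocabulary.endswith(("#", "/")):
        return current_vocabulary + fragment
    return current_vocabulary + "#" + fragment
```

With the hcard prefix, the name given-name becomes http://microformats.org/profile/hcard#given-name, while with http://schema.org/ the name name becomes http://schema.org/name.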

A. Reverse itemprop

This section is non-normative.

The WebSchemas community has proposed the use of a new Microdata attribute: itemprop-reverse. Although not present in [MICRODATA ] at this time, the attribute can be very useful in many markup examples where items are related using the reverse of a common property; this saves creating new properties which exist solely for the purpose of describing such reverse relationships. Evidence for the utility of such a feature can be seen in the RDFa @rev attribute [RDFA-CORE ] and the JSON-LD @reverse property [JSON-LD ].

Note

See issue 5 for further reference.

This feature adds the following attribute:

itemprop-reverse
An attribute used to identify one or more names of an item, reversing the sense of itemprop. An itemprop-reverse contains a space-separated list of names which may either be absolute URLs or terms associated with the type of the item as defined by the referencing item's item type.

The Algorithm is extended accordingly:

A.1 Algorithm Terms

reverse properties
The mechanism for finding the reverse properties of an item. The list of reverse properties is the result of transforming each space-separated value of an item's itemprop-reverse to a URL as defined in Property URI Generation.
reverse property names
The tokens of an element's itemprop-reverse attribute. Each token is a name.

A.2 Generate the triples

The Triples generation algorithm is extended with the following step to take place immediately after Step 9:

  1. For each element element that has one or more reverse property names and is one of the reverse properties of the item item, run the following substep:
    1. For each name in the element's reverse property names, run the following substeps:
      1. Let context be a copy of evaluation context with current type set to type and current vocabulary set to vocab.
      2. Let predicate be the result of generate predicate URI using context and name.
      3. Let value be the property value of element.
      4. If value is an item, then generate the triples for value using context. Replace value by the subject returned from those steps.
      5. Otherwise, if value is a literal, ignore the value and continue to the next name; it is an error for the value of itemprop-reverse to be a literal.
      6. Generate the following triple:
        subject
        value
        predicate
        predicate
        object
        subject

Simple use of itemprop-reverse:

Example 10
<div itemscope itemtype="http://schema.org/Person">
 <span itemprop="name">William Shakespeare</span>
 <link itemprop-reverse="creator" href="http://www.freebase.com/m/0yq9mqd">
</div>

Results in the following Turtle:

Example 11
@prefix schema: <http://schema.org/> .
<http://www.freebase.com/m/0yq9mqd> schema:creator [
 a schema:Person;
 schema:name "William Shakespeare"
] .
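
The reverse step of the extended algorithm can be sketched as follows. This is a minimal illustration, assuming that predicates have already been expanded to URIs and that literals are represented by a hypothetical Literal class; the function name reverse_triples and the tuple representation of triples are likewise assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Literal:
    """Hypothetical stand-in for a literal property value."""
    text: str


def reverse_triples(subject, predicates, value):
    """Return (subject, predicate, object) tuples for reverse properties.

    subject    -- subject of the item carrying itemprop-reverse
    predicates -- already-expanded predicate URIs, one per reverse name
    value      -- a URI or blank node identifier (str), or a Literal
    """
    triples = []
    for predicate in predicates:
        if isinstance(value, Literal):
            # A literal cannot be the subject of a triple: skip this name.
            continue
        # The value becomes the subject; the item becomes the object.
        triples.append((value, predicate, subject))
    return triples


# Data taken from the itemprop-reverse example; "_:person" is a made-up label.
print(reverse_triples(
    "_:person",
    ["http://schema.org/creator"],
    "http://www.freebase.com/m/0yq9mqd",
))
```

Passing a Literal as the value yields no triples, reflecting the rule that a literal value for itemprop-reverse is an error and is ignored.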

B. Testing

This section is non-normative.

A test suite [MICRODATA-RDF-TESTS] is under development to help processor developers verify conformance to this specification.

C. Markup Examples

This section is non-normative.

The microdata example below expresses book information as an FRBR Work item.

Example 12
<dl itemscope
 itemtype="http://purl.org/vocab/frbr/core#Work"
 itemid="http://books.example.com/works/45U8QJGZSQKDH8N"
 lang="en">
 <dt>Title</dt>
 <dd><cite itemprop="http://purl.org/dc/terms/title">Just a Geek</cite></dd>
 <dt>By</dt>
 <dd><span itemprop="http://purl.org/dc/terms/creator">Wil Wheaton</span></dd>
 <dt>Format</dt>
 <dd itemprop="http://purl.org/vocab/frbr/core#realization"
 itemscope
 itemtype="http://purl.org/vocab/frbr/core#Expression"
 itemid="http://books.example.com/products/9780596007683.BOOK">
 <link itemprop="http://purl.org/dc/terms/type" href="http://books.example.com/product-types/BOOK">
 Print
 </dd>
 <dd itemprop="http://purl.org/vocab/frbr/core#realization"
 itemscope
 itemtype="http://purl.org/vocab/frbr/core#Expression"
 itemid="http://books.example.com/products/9780596802189.EBOOK">
 <link itemprop="http://purl.org/dc/terms/type" href="http://books.example.com/product-types/EBOOK">
 Ebook
 </dd>
</dl>

Assuming that the registry contains an entry for http://purl.org/vocab/frbr/core#, this is equivalent to the following Turtle:

Example 13
@prefix dc: <http://purl.org/dc/terms/> .
@prefix frbr: <http://purl.org/vocab/frbr/core#> .
@prefix rdfa: <http://www.w3.org/ns/rdfa#> .
<> rdfa:usesVocabulary frbr: .
<http://books.example.com/works/45U8QJGZSQKDH8N> a frbr:Work ;
 dc:creator "Wil Wheaton"@en ;
 dc:title "Just a Geek"@en ;
 frbr:realization <http://books.example.com/products/9780596007683.BOOK>,
 <http://books.example.com/products/9780596802189.EBOOK> .
<http://books.example.com/products/9780596007683.BOOK> a frbr:Expression ;
 dc:type <http://books.example.com/product-types/BOOK> .
<http://books.example.com/products/9780596802189.EBOOK> a frbr:Expression ;
 dc:type <http://books.example.com/product-types/EBOOK> .

The following snippet of HTML has microdata for two people with the same address. This illustrates two items referencing a third item, and how only a single RDF resource definition is created for that third item.

Example 14
<p>
 Both
 <span itemscope itemtype="http://microformats.org/profile/hcard" itemref="home">
 <span itemprop="fn"
 ><span itemprop="n" itemscope
 ><span itemprop="given-name">Princeton</span></span></span>
 </span>
 and
 <span itemscope itemtype="http://microformats.org/profile/hcard" itemref="home">
 <span itemprop="fn"
 ><span itemprop="n" itemscope
 ><span itemprop="given-name">Trekkie</span></span></span>
 </span>
 live at
 <span id="home" itemprop="adr" itemscope>
 <span itemprop="street-address">Avenue Q</span>.
 </span>
</p>

Assuming that the registry contains an entry for http://microformats.org/profile/hcard, it generates these triples expressed in Turtle:

Example 15
@prefix hcard: <http://microformats.org/profile/hcard#> .
@prefix rdfa: <http://www.w3.org/ns/rdfa#> .
[ a <http://microformats.org/profile/hcard>;
 hcard:fn "Princeton";
 hcard:n [ hcard:given-name "Princeton" ];
 hcard:adr _:a
] .
[ a <http://microformats.org/profile/hcard>;
 hcard:fn "Trekkie";
 hcard:n [ hcard:given-name "Trekkie" ];
 hcard:adr _:a
] .
_:a hcard:street-address "Avenue Q" .
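
One way a processor might arrive at the single shared node for the address item is to remember the subject assigned to each item and reuse it whenever the same item is reached again through itemref. The sketch below is illustrative only: the class SubjectMemory, its method subject_for, the use of a string key to identify an item, and the blank node labels are all assumptions, not part of this specification.

```python
import itertools


class SubjectMemory:
    """Remembers the subject already assigned to each item (hypothetical helper)."""

    def __init__(self):
        self._subjects = {}
        self._ids = itertools.count()

    def subject_for(self, item_key, itemid=None):
        # Reuse the subject if this item has been processed before.
        if item_key in self._subjects:
            return self._subjects[item_key]
        # Otherwise use the item's global identifier, or mint a blank node.
        subject = itemid if itemid is not None else f"_:b{next(self._ids)}"
        self._subjects[item_key] = subject
        return subject


memory = SubjectMemory()
first = memory.subject_for("home")   # reached from the first hcard item
second = memory.subject_for("home")  # reached again from the second hcard item
print(first == second)
```

Because both hcard items resolve the address to the same subject, the street-address triple only needs to be emitted once.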

The following snippet of HTML has microdata for a playlist and illustrates the use of the schema:additionalType property to relate recordings to the Music Ontology:

Example 16
<div itemscope itemtype="http://schema.org/MusicPlaylist">
 <span itemprop="name">Classic Rock Playlist</span>
 <meta itemprop="numTracks" content="2"/>
 <p>Including works by
 <span itemprop="byArtist">Lynard Skynard</span> and
 <span itemprop="byArtist">AC/DC</span></p>.
 <div itemprop="tracks" itemscope itemtype="http://schema.org/MusicRecording">
 <link itemprop="additionalType" href="http://purl.org/ontology/mo/MusicalManifestation"/>
 1.<span itemprop="name">Sweet Home Alabama</span> -
 <span itemprop="byArtist">Lynard Skynard</span>
 <link href="sweet-home-alabama" itemprop="url" />
 </div>
 <div itemprop="tracks" itemscope itemtype="http://schema.org/MusicRecording">
 <link itemprop="additionalType" href="http://purl.org/ontology/mo/MusicalManifestation"/>
 2.<span itemprop="name">Shook you all Night Long</span> -
 <span itemprop="byArtist">AC/DC</span>
 <link href="shook-you-all-night-long" itemprop="url" />
 </div>
</div>

Assuming that the registry contains an entry for http://schema.org/, it generates these triples expressed in Turtle:

Example 17
@prefix mo: <http://purl.org/ontology/mo/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfa: <http://www.w3.org/ns/rdfa#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix schema: <http://schema.org/> .
[ a schema:MusicPlaylist;
 schema:name "Classic Rock Playlist";
 schema:byArtist "Lynard Skynard", "AC/DC";
 schema:numTracks "2";
 schema:tracks [
 a schema:MusicRecording, mo:MusicalManifestation;
 schema:additionalType mo:MusicalManifestation;
 schema:byArtist "Lynard Skynard";
 schema:name "Sweet Home Alabama";
 schema:url <sweet-home-alabama>
 ], [
 a schema:MusicRecording, mo:MusicalManifestation;
 schema:additionalType mo:MusicalManifestation;
 schema:byArtist "AC/DC";
 schema:name "Shook you all Night Long";
 schema:url <shook-you-all-night-long>
 ]
] .

D. Default registry

This section is non-normative.

The following is the default registry in JSON format, as of the time of publication.

Example 18
{
 "http://schema.org/": {
 "properties": {
 "additionalType": {"subPropertyOf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"}
 }
 },
 "http://microformats.org/profile/hcard": {}
}
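
As an illustration of how this registry could drive the generation of additional triples for subPropertyOf and equivalentProperty entries, the following sketch looks up a property name and emits one triple per registered value. It is a minimal example under stated assumptions: the function name equivalent_triples and the tuple representation of triples are hypothetical, and the code accepts either a single string or a list of strings as the registered value.

```python
DEFAULT_REGISTRY = {
    "http://schema.org/": {
        "properties": {
            "additionalType": {
                "subPropertyOf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
            }
        }
    },
    "http://microformats.org/profile/hcard": {},
}


def equivalent_triples(registry, vocab, name, subject, value):
    """Return extra (subject, equiv, value) triples for a property name."""
    entry = registry.get(vocab, {}).get("properties", {}).get(name, {})
    triples = []
    for key in ("subPropertyOf", "equivalentProperty"):
        equivs = entry.get(key, [])
        if isinstance(equivs, str):
            # Assumption: a bare string is treated as a one-element list.
            equivs = [equivs]
        for equiv in equivs:
            triples.append((subject, equiv, value))
    return triples


# "_:track" is a made-up blank node label for a MusicRecording item.
print(equivalent_triples(
    DEFAULT_REGISTRY,
    "http://schema.org/",
    "additionalType",
    "_:track",
    "http://purl.org/ontology/mo/MusicalManifestation",
))
```

This mirrors the playlist example, where each schema:additionalType value also appears as an rdf:type of the recording.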

E. Acknowledgements

This section is non-normative.

Thanks to Richard Cyganiak for property URI and vocabulary terminology and the generally excellent consideration of practical problems in generating RDF from microdata.

F. References

F.1 Normative references

[HTML5]
Ian Hickson; Robin Berjon; Steve Faulkner; Travis Leithead; Erika Doyle Navara; Theresa O'Connor; Silvia Pfeiffer. HTML5. 28 October 2014. W3C Recommendation. URL: http://www.w3.org/TR/html5/
[MICRODATA]
Ian Hickson. HTML Microdata. 29 October 2013. W3C Note. URL: http://www.w3.org/TR/microdata/
[N-TRIPLES]
Gavin Carothers; Andy Seaborne. RDF 1.1 N-Triples. 25 February 2014. W3C Recommendation. URL: http://www.w3.org/TR/n-triples/
[RDF11-CONCEPTS]
Richard Cyganiak; David Wood; Markus Lanthaler. RDF 1.1 Concepts and Abstract Syntax. 25 February 2014. W3C Recommendation. URL: http://www.w3.org/TR/rdf11-concepts/
[RFC3986]
T. Berners-Lee; R. Fielding; L. Masinter. Uniform Resource Identifier (URI): Generic Syntax. January 2005. Internet Standard. URL: https://tools.ietf.org/html/rfc3986
[URL]
Anne van Kesteren; Sam Ruby. URL. 9 December 2014. W3C Working Draft. URL: http://www.w3.org/TR/url-1/
[XMLSCHEMA11-2]
David Peterson; Sandy Gao; Ashok Malhotra; Michael Sperberg-McQueen; Henry Thompson; Paul V. Biron et al. W3C XML Schema Definition Language (XSD) 1.1 Part 2: Datatypes. 5 April 2012. W3C Recommendation. URL: http://www.w3.org/TR/xmlschema11-2/

F.2 Informative references

[JSON-LD]
Manu Sporny; Gregg Kellogg; Markus Lanthaler. JSON-LD 1.0. 16 January 2014. W3C Recommendation. URL: http://www.w3.org/TR/json-ld/
[MICRODATA-RDF-TESTS]
Gregg Kellogg; Ivan Herman. Microdata to RDF Tests. unofficial. URL: http://w3c.github.io/microdata-rdf/tests/
[RDF-SYNTAX-GRAMMAR]
Fabien Gandon; Guus Schreiber. RDF 1.1 XML Syntax. 25 February 2014. W3C Recommendation. URL: http://www.w3.org/TR/rdf-syntax-grammar/
[RDFA-CORE]
Ben Adida; Mark Birbeck; Shane McCarron; Ivan Herman et al. RDFa Core 1.1 - Second Edition. 22 August 2013. W3C Recommendation. URL: http://www.w3.org/TR/rdfa-core/
