
Semantic Web URIs

Tim Berners-Lee

Decentralized Information Group
MIT Computer Science and Artificial Intelligence Laboratory

Overview

How the Semantic Web Works

Choosing URIs

Supporting URIs

How the Web Works

A URI identifies a document

HTTP allows a client to get a representation of the document

How the Semantic Web Works

A URI identifies any thing

That URI is related to the URI of the document

The document is data about the thing

Tree vs Web

Tree	Web
OO systems	Semantic Web
Information hiding	Information sharing
Social Hierarchies	Social web
Top-down elaboration	Top-down, Bottom-up & middle-out
Fixed slots	Anyone can say anything about anything
Can't be a web	Can be a tree
Info stored with subject	Can also be stored anywhere else
One ID per object	Things can have many URIs

Put aside Object Oriented and XML paradigms.

URI of Thing to URI of document

Two alternatives

The URI has a hash:

E.g.: http://example.com/products#sku105

The part on the left of the # is the document URI

The URI has no hash:

E.g.: http://example.com/products/sku105

A GET on this URI must return an HTTP 303 See Other redirect to the document URI

Do NOT confuse the URI of a thing and the URI of a web page about it.

Each URI identifies one thing only
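The no-hash alternative can be sketched as a small routing function. This is a minimal sketch under stated assumptions: the path layout (`/products/<sku>` for the thing, `/data/products/<sku>` for the document about it) and the name `respond` are hypothetical and do not come from the talk. It only shows that a GET on the thing's URI answers with 303 See Other and a `Location` header, while the document URI answers 200.

```python
from http import HTTPStatus

# Hypothetical URI layout (an assumption, not from the talk):
#   /products/<sku>       identifies the product itself (a thing)
#   /data/products/<sku>  identifies the document describing it
THING_PREFIX = "/products/"
DOC_PREFIX = "/data/products/"


def respond(path):
    """Return (status, headers) for a GET on `path`."""
    if path.startswith(THING_PREFIX) and len(path) > len(THING_PREFIX):
        sku = path[len(THING_PREFIX):]
        # A thing is not a document, so redirect rather than answer 200.
        return HTTPStatus.SEE_OTHER, {"Location": DOC_PREFIX + sku}
    if path.startswith(DOC_PREFIX) and len(path) > len(DOC_PREFIX):
        # The document itself can be served directly.
        return HTTPStatus.OK, {"Content-Type": "application/rdf+xml"}
    return HTTPStatus.NOT_FOUND, {}


status, headers = respond("/products/sku105")
print(int(status), status.phrase, headers["Location"])
# prints: 303 See Other /data/products/sku105
```

Keeping the two prefixes distinct is what stops the URI of the thing from being confused with the URI of the page about it.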

Hash vs Slash

Hash allows document URI to be generated instantly

Hash works on local file system too

Hash does not work when document would be huge

Useful idiom: http://example.org/product/sku105#it
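For hash URIs, the document URI can be computed by the client without contacting any server. The sketch below uses the standard library's `urldefrag`; the helper name `document_uri` is an assumption made for illustration.

```python
from urllib.parse import urldefrag


def document_uri(thing_uri):
    """Return the URI of the document to fetch for a hash URI."""
    doc, fragment = urldefrag(thing_uri)
    if not fragment:
        # Without a hash, the document URI has to be discovered over HTTP.
        raise ValueError("no hash: document URI must come from a 303 redirect")
    return doc


print(document_uri("http://example.com/products#sku105"))
# prints: http://example.com/products
print(document_uri("http://example.org/product/sku105#it"))
# prints: http://example.org/product/sku105
```

The second call shows the `#it` idiom: each product gets its own small document, which sidesteps the problem of one huge document holding every hash URI.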

Criteria for Choosing URIs

Rational classification of the subjects of the data.

Persistence

Persistence

Persistence

Persistence tips

Institutional commitment.

Domain name controlled by those ensuring persistence.

Cleanliness of URI names.

No mechanisms: /cgi-bin/foo.asp?id=x

No passing attributes: /~harriet/latest/

No internals of file type: /foo.rdf

Technique: Date space

Technique: Apache rewrite rules

See "Cool URIs don't change"

URNs, DOIs, etc. don't help... you need exactly the same...

Institutional commitment. (Try asking W3C)

See also: W3C TAG finding Metadata in URIs, W3C Persistence Policy
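The idea behind rewrite rules is that the public URI stays clean while the mechanism behind it is free to change. The talk names Apache rewrite rules; the sketch below only illustrates the same mapping in Python and is not Apache configuration. The public path `/2007/products/<sku>` (a date-space style prefix) and the table `REWRITES` are hypothetical; the internal target reuses the `/cgi-bin/foo.asp?id=` form that the slide lists as something to keep out of public URIs.

```python
import re

# Hypothetical rewrite table: clean public pattern -> internal location.
# Only the right-hand side changes when the implementation changes.
REWRITES = [
    (re.compile(r"^/2007/products/(?P<sku>[A-Za-z0-9]+)$"),
     "/cgi-bin/foo.asp?id={sku}"),
]


def internal_location(public_path):
    """Map a persistent public path to the current internal one, or None."""
    for pattern, template in REWRITES:
        match = pattern.match(public_path)
        if match:
            return template.format(**match.groupdict())
    return None


print(internal_location("/2007/products/sku105"))
# prints: /cgi-bin/foo.asp?id=sku105
```

If the site later moves off the script, only the template string needs editing; every published URI keeps working.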

Multiple URIs for the same thing

Not the end of the world

Advisable when many agents have data on same thing.

E.g. http://www.w3.org/People/Berners-Lee/card#i and http://www4.wiwiss.fu-berlin.de/dblp/resource/person/100007

Connect using owl:sameAs ('=' in N3)

Store and serve other people's URIs: link!

Be part of the Semantic Web
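Connecting two URIs for the same thing comes down to publishing one statement. The sketch below formats that statement both in the N3 shorthand mentioned above and as a full triple using the `owl:sameAs` property URI. The function names are assumptions made for illustration, and the code does no escaping or validation of the URIs it is given.

```python
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def same_as_n3(a, b):
    """N3 shorthand: '=' stands for owl:sameAs."""
    return "<{}> = <{}> .".format(a, b)


def same_as_triple(a, b):
    """The same statement with the property URI written out in full."""
    return "<{}> <{}> <{}> .".format(a, OWL_SAME_AS, b)


card = "http://www.w3.org/People/Berners-Lee/card#i"
dblp = "http://www4.wiwiss.fu-berlin.de/dblp/resource/person/100007"

print(same_as_n3(card, dblp))
print(same_as_triple(card, dblp))
```

Serving this statement from your own document is one way to store and serve other people's URIs.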

Persistence - examples: discuss

http://www.amazon.com/HP-450Ci-Mobile-DeskJet-Printer/dp/B00006LLJ7

http://www.shopping.hp.com/product/category/photosmart_printers/1/storefronts/Q7091A%2523ABA

What to serve: Linked data

Use URIs as names for things

Use HTTP URIs so that people can look up those names.

When someone looks up a URI, provide useful information.

Include links to other URIs, so that they can discover more things.

Ideally: all links out and in

Reasonable size of data returned

Not knowledge bases in zip/tar files!

Link: The use in document d of a URI which is in document d'

(See DesignIssues note: Linked Data)
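The definition of a link above can be turned into a small check: a URI used in document d counts as a link when it belongs to some other document d'. The sketch below approximates "belongs to" by stripping the fragment, which is exact for hash URIs; for slash URIs the real document is only known after following the 303 redirect, so treating them as foreign is an assumption. The function name `outgoing_links` is hypothetical.

```python
from urllib.parse import urldefrag


def outgoing_links(document_uri, uris_used):
    """URIs used in `document_uri` that live in some other document."""
    return sorted(
        uri for uri in set(uris_used)
        if urldefrag(uri)[0] != document_uri
    )


doc = "http://www.w3.org/People/Berners-Lee/card"
used = [
    "http://www.w3.org/People/Berners-Lee/card#i",  # local: same document
    "http://www.w3.org/2002/07/owl#sameAs",
    "http://www4.wiwiss.fu-berlin.de/dblp/resource/person/100007",
]
for link in outgoing_links(doc, used):
    print(link)
```

Here the first URI is local to the document and is dropped, while the other two are links that a client can follow to discover more.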

Examples of linked data

Most ontologies

Virtual RDF data generated from SQL by dbview

New: D2R Server provides linked data as well as SPARQL

FOAF (home brew, LiveJournal, Opera Community etc)

Semantic Wikipedia

Place-names ... etc

The biggest challenge is links to other systems

Lots of data? SPARQL service

SPARQL is the RDF query language

In final stages of standardization

Many implementations

Graph match, optional match

More concise than equivalent SQL or XQuery

Higher conceptual level

More robust against implementation changes

See Jim Melton's W3C Tech Plenary talk (slides and XTech paper).
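To make "graph match, optional match" concrete, the sketch below holds a small SPARQL query and builds the GET URL that would carry it to a SPARQL service in the `query` parameter. The endpoint `http://example.org/sparql` is hypothetical, and the choice of FOAF properties is an illustration, not something taken from the talk. Nothing is sent over the network.

```python
from urllib.parse import urlencode

ENDPOINT = "http://example.org/sparql"  # hypothetical service URL

QUERY = """\
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?person ?name ?homepage
WHERE {
  ?person foaf:name ?name .
  OPTIONAL { ?person foaf:homepage ?homepage . }
}
"""


def sparql_get_url(endpoint, query):
    """Build a GET URL that passes the query in the 'query' parameter."""
    return endpoint + "?" + urlencode({"query": query})


print(sparql_get_url(ENDPOINT, QUERY))
```

The first pattern is the graph match: every result must have a name. The `OPTIONAL` block is the optional match: people without a homepage still appear, with `?homepage` left unbound.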

Self-describing Web

"Follow your nose" from URI to what it means.

Ladder of authority

Domain owner controls IP address of server

HTTP URIs have owners who say what they mean

Serve links in both directions

Remember web architecture

Serve files with the correct Content-Type

e.g. application/rdf+xml, text/rdf+n3

Not text/plain or application/binary

Serve links in both directions
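A server can pick the Content-Type from an explicit table instead of falling back to a generic type. This is a minimal sketch: the mapping covers only the two media types named on the slide, and the file extensions `.rdf` and `.n3` and the function name are assumptions.

```python
import os

# Media types as named on the slide; the extensions are an assumption.
RDF_CONTENT_TYPES = {
    ".rdf": "application/rdf+xml",
    ".n3": "text/rdf+n3",
}


def content_type_for(filename):
    """Return the RDF media type for `filename`, or None if unknown."""
    extension = os.path.splitext(filename)[1].lower()
    # Returning None forces the caller to decide, instead of silently
    # labelling RDF as text/plain.
    return RDF_CONTENT_TYPES.get(extension)


print(content_type_for("card.rdf"))
# prints: application/rdf+xml
print(content_type_for("card.n3"))
# prints: text/rdf+n3
```

The extension stays an internal detail of the server; the public URI does not need to expose it.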

Summary

URIs for things, URIs for documents

Pick URIs to be persistent

Serve data about the thing in response to HTTP GET

Use SPARQL for large-scale access.

More Information

This presentation: http://dig.csail.mit.edu/2007/Talks/0108-swuri-tbl/

The deliverables of the (now closed) Semantic Web Best Practices and Deployment Working Group, including

Best Practice Recipes for Publishing RDF Vocabularies

The deliverables of the new Semantic Web Deployment Working Group

This work is licensed under a Creative Commons Attribution - Non-Commercial - No Derivatives 2.5 License.
