W3C WD-P3P-19981109/vocab
P3P Harmonized Vocabulary Specification
W3C Working Draft 9-November-1998
-
-
This Version:
-
http://www.w3.org/TR/1998/WD-P3P-19981109/vocab
-
Latest Version:
-
http://www.w3.org/TR/WD-P3P/vocab
-
Previous Version:
-
http://www.w3.org/TR/1998/WD-P3P10-harmonization-19980330
-
Editor:
-
Joseph Reagle (W3C) reagle@w3.org
Status of This Document
This is a subspecification of the P3P1.0
specification for review by W3C members and other interested parties.
This document has been produced as part of the P3P Activity, and will eventually
be advanced toward W3C Recommendation status. It is inappropriate to use
W3C Working Drafts as reference material or to cite them as other than "work
in progress." The underlying concepts of the draft are fairly stable and
we encourage the development of experimental implementations and prototypes
so as to provide feedback on the specification. However, this Working Group
will not allow early implementations to affect its ability to make changes
to future versions of this document.
This draft document will be considered by W3C and its members according to
W3C process. This document is made public for the purpose of receiving comments
that inform the W3C membership and staff on issues likely to affect the
implementation, acceptance, and adoption of P3P.
Send comments to www-p3p-public-comments@w3.org (archived at
http://lists.w3.org/Archives/Public/www-p3p-public-comments).
Table of Contents
-
Introduction
-
Compliance Requirements
-
Definitions
-
Data Categories: a type, or quality of specific
data element such as last_name.
-
Data Collection Purposes: the purpose of the
data collection
-
Qualifications on Purposes: additional information
on how the purpose is realized
-
General Disclosures: describe the user's capabilities
to further understand a service provider's practices
-
References
-
Acknowledgements
The P3P Specification [P3P]
specifies an [XML] / [RDF] application
that defines the structure, or grammar, of a P3P
proposal. This
document, the harmonized vocabulary, describes the terms that fit
into the P3P grammar; this process is technically called the "semantic definition
of an XML/RDF schema or vocabulary." For example, the P3P specification states
that P3P statements must declare the purposes for which
data are collected; this document specifies a list of six such purposes and
their meanings.
P3P can support multiple schemas. However, P3P is likely to be most effective
when a single base vocabulary is widely used since information practice
statements are most useful when they can be readily understood by users and
their computer agents. Complementary vocabularies may develop to
cater to jurisdiction-specific concerns not addressed by the base
vocabulary. This can be easily accomplished
through the XML-namespace
[XML-names] facility, which allows tags from different
XML schemas to be intermixed. However, the semantics of this specification
always dominate those of an external namespace. For instance, someone cannot
place an attribute within a proposal that says "and this proposal is
void on Tuesdays" and argue this excuses them from the semantics defined
in the P3P specification.
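As a rough illustration of how tags from two vocabularies can be intermixed through XML namespaces, the sketch below parses a small fragment and groups its child elements by namespace. The namespace URIs, element names, and attribute names are all hypothetical placeholders chosen for this example; they are not the identifiers defined by the P3P specification.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URIs: one for a base vocabulary and one for a
# complementary, jurisdiction-specific vocabulary.
BASE_NS = "http://example.org/base-vocab"
LOCAL_NS = "http://example.org/jurisdiction-vocab"

# Hypothetical fragment mixing elements from both vocabularies.
doc = f"""
<statement xmlns:base="{BASE_NS}" xmlns:local="{LOCAL_NS}">
  <base:purpose value="0"/>
  <base:recipients value="0"/>
  <local:retention_period months="6"/>
</statement>
"""

def split_by_namespace(element):
    """Group the local names of child elements by namespace URI."""
    groups = {}
    for child in element:
        if child.tag.startswith("{"):
            # ElementTree writes qualified names as "{namespace-uri}localname".
            uri, _, local = child.tag[1:].partition("}")
        else:
            uri, local = "", child.tag
        groups.setdefault(uri, []).append(local)
    return groups

groups = split_by_namespace(ET.fromstring(doc))
for uri, names in groups.items():
    print(uri, names)
```

An agent built along these lines could apply the base vocabulary's semantics to the elements it recognizes and treat elements from other namespaces as supplementary.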
Therefore, this document includes a base set of vocabulary elements useful
for expressing privacy policies reflective of a diversity of privacy
laws, self-regulatory norms, and cultural notions
about privacy. This vocabulary can be used to express policies ranging from
anonymous browsing to the provision of personalized Web content and services.
However, P3P implementations need not restrict themselves solely to vocabularies
defined within this document.
Note that, in addition to the terms specified in the harmonized vocabulary, P3P
requires services to specify in their proposals the
service provider's identity, an experience space to which their practices
apply (e.g., realm: http://www.w3.org), the location at
which users can find a human-readable explanation of the service's privacy
policies (discURI) and an optional human-readable description
of the result (e.g., consequence: "to offer customized sports
updates").
Security issues and protocols are not addressed by
this document. Information about the characteristics and strength of those
protocols is critical to a user's decision regarding the transmission of
information. However, an assumption of P3P is that communication and storage
security is achieved through means other than P3P itself (such as SSL).
Legal issues regarding law enforcement demands for information are not addressed
by this document. It is possible that a service provider that otherwise abides
by its proposal of not redistributing data to others may be required to do
so by force of law.
Comment: Much of the work done on this schema was conducted
under significant time pressure. Accordingly, there is interest from members
of the working group to have some of these issues revisited in the future
by the W3C or other entities as appropriate.
This specification is a representation of a rough, inclusive consensus from
the Harmonization WG -- meaning that which is specified is recommended as
a minimal set of terms. The recommendations and requirements are offset in
a colored table. Requirements are expressed over variables for which the
WG thinks values must be defined in order for a proposal to be a valid P3P proposal.
Products must support the ability to parse and act upon all the variables
defined, though we do not specify the way such values need to be acted upon
or presented in a graphical user interface; these are left to implementations
and user configuration -- which is addressed in the P3P Implementation Guide.
To simplify practice declaration, service providers may
aggregate any of the disclosures (purposes, recipients,
and identifiable use) within a statement over data elements. Service providers
MUST make such aggregations as an additive operation. For instance,
a site that distributes your age to (0: ourselves), but distributes your
zip code to (3: the public), MAY say they distribute your {age and zip code}
to {0, 3: ourselves and the public}. Such a statement appears to distribute
more data than actually happens. It is up to the service provider to determine
if their disclosure deserves specificity or brevity.
Also, one must always disclose all options that apply. Consider a site with
the sole purpose of collecting information for the purposes of {4: Contacting
Visitors for Marketing of Services or Products}. Even though this is considered
to be for the {0: Completion and Support of Current Activity}, the site must
state {0,4}. Consider a site which distributes information to {0: Ourselves
and our agents}in order to redistribute it to {3: Unrelated third parties
or public fora}, the site must state {0,3}.
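The additive aggregation rule described above can be sketched as a set union over per-element disclosures. This is only an illustration: the element names and the dictionary layout are assumptions made for the example, and the numeric codes are the Recipients values listed later in this document.

```python
# Hypothetical per-element disclosures, keyed by data element name.
# Values are Recipients codes (0: ourselves ... 3: unrelated third parties).
per_element_recipients = {
    "age": {0},
    "zip_code": {3},
}

def aggregate(disclosures):
    """Combine per-element disclosure sets into one aggregated statement.

    Aggregation is additive: the combined statement carries the union of
    the values declared for every element it covers, never fewer.
    """
    elements = sorted(disclosures)
    values = set()
    for element_values in disclosures.values():
        values |= element_values
    return elements, sorted(values)

elements, recipients = aggregate(per_element_recipients)
print(elements, recipients)
```

The aggregated statement is shorter but less specific, since it suggests that every listed element may go to every listed recipient.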
-
Personally Identifiable Data
-
Data that is used to identify, contact, or locate a person. This includes
data from which other personally identifying data can easily be derived.
This definition focuses on use because it is difficult to determine whether
certain data or combinations of data are personally identifiable without
information about the context. For example, whether an IP address is static
or randomly generated will influence whether it can be used to identify a
person -- see Identifiable Use for more of
an explanation.
-
Purpose
-
The reason(s) for data collection and use.
-
Practice / Statement
-
The set of disclosures and (optional) solicitations regarding data usage,
including purpose, identifiable use, recipients and other disclosures.
-
Equable Practice
-
A practice that is very similar to another in that the purpose, recipients,
and identifiable use are the same or more constrained than the original (a
lower value), and the other disclosures are not substantially different.
For example, two sites with otherwise similar practices may follow different
-- but similar -- sets of industry guidelines.
-
Service Provider (Data Controller, Legal Entity)
-
The person or organization which offers information, products or services
from a Web site, collects information, and is responsible for the representations
made in a practice statement.
A data category is a quality of a data element or class that may be used
by the user's agent to determine what type of element is under discussion.
Status Optional: Service providers MAY use data categories to
describe data elements or data sets. If a service provider requires a
representation of data that is not otherwise referenceable in an easily
understood way, we recommend the following terms be used according to their
corresponding definitions.
0
-
Physical Contact Information
-
Information that allows an individual to be contacted or located in the physical
world -- such as phone number or address.
1
-
Online Contact Information
-
Information that allows an individual to be contacted or located on the Internet
-- such as email. Often, this information is independent of the specific
computer used to access the network. (See
Computer Information)
2
-
Unique Identifiers
-
Non-financial identifiers issued for purposes of consistently identifying
the individual -- such as SSN or Web site IDs.
3
-
Financial Account Identifiers
-
Identifiers that tie an individual to a financial instrument, account, or
payment system -- such as a credit card or bank account number.
4*
-
Computer Information
-
Information about the computer system that the individual is using to access
the network -- such as the IP number, domain name, browser type or operating
system.
5*
-
Navigation and Click-stream Data
-
Data passively generated by browsing the Web site -- such
as which pages are visited, and how long users stay on each page.
6*
-
Interactive Data
-
Data actively generated from or reflecting explicit
interactions with a service provider through its site -- such as queries
to a search engine, logs of account activity, or purchases made on the
Web.
7
-
Demographic and Socio-economic Data
-
Data about an individual's characteristics -- such as gender, age, and income.
8
-
Preference Data
-
Data about an individual's likes and dislikes -- such as favorite color or
musical tastes.
9*
-
Content
-
The words and expressions contained in the body of a communication -- such
as the text of email, bulletin board postings, or chat room communications.
* Note: The Computer, Navigation, Interactive and Content categories can
be distinguished as follows. The Computer category includes information
about the user's computer including IP address and software configuration.
Navigation data describes actual user behavior related to browsing. When
an IP address is stored in a log file with information related to browsing
activity, both the Computer category and the Navigation category should be
used. Interactive Data is data actively solicited to provide some useful
service at a site beyond browsing. Content is information exchanged on a
site for the purposes of communication.
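One way an implementation might represent the ten data categories is as an enumeration keyed by the numeric values above. The sketch below is illustrative only: the Python identifiers and the sample data element names are assumptions, not names defined by this specification.

```python
from enum import IntEnum

class DataCategory(IntEnum):
    """Data categories, numbered as in the list above (names are illustrative)."""
    PHYSICAL_CONTACT = 0
    ONLINE_CONTACT = 1
    UNIQUE_IDENTIFIERS = 2
    FINANCIAL_ACCOUNT = 3
    COMPUTER = 4
    NAVIGATION = 5
    INTERACTIVE = 6
    DEMOGRAPHIC = 7
    PREFERENCE = 8
    CONTENT = 9

# Hypothetical data elements mapped to the categories that describe them.
element_categories = {
    "postal_address": {DataCategory.PHYSICAL_CONTACT},
    "email_address": {DataCategory.ONLINE_CONTACT},
    "search_query": {DataCategory.INTERACTIVE},
    # An IP address stored in a log with browsing activity takes both
    # the Computer and the Navigation category, as the note above says.
    "access_log_entry": {DataCategory.COMPUTER, DataCategory.NAVIGATION},
}

def categories_for(element):
    """Return the sorted categories recorded for a data element."""
    return sorted(element_categories.get(element, set()))

print([c.name for c in categories_for("access_log_entry")])
```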
The following specifies and defines a set of six purposes for data processing
relevant to the Web.
Status Required: Service providers MUST use the following terms
to explain the purpose of data collection. Service providers MUST disclose
all that apply. If a service provider does not disclose that a data
element will be used for a given purpose, that is a representation that data
will not be used for that purpose. Service providers that disclose that they
use data for "other" purposes MUST provide human readable explanations of
those purposes.
0
-
Completion and Support of Current Activity
-
The use of information by the service provider to complete the activity
for which it was provided, such as the provision of information, communications,
or interactive services -- for example to return the results from a Web search,
to forward email, or place an order.
1
-
Web Site and System Administration
-
The use of information solely for the technical support of the Web site and
its computer system. This would include processing computer account information,
and information used in the course of securing and maintaining the site.
2
-
Customization of Site to Individuals
-
The use of information to tailor or modify the content or design of the site
to the particular individual.
3
-
Research and Development
-
The use of information to enhance, evaluate, or otherwise review the site,
service, product, or market. This does not include personal information used
to tailor or modify the content to the specific individual nor information
used to evaluate, target, profile or contact the individual.
4
-
Contacting Visitors for Marketing of Services or Products
-
The use of information to contact the individual for the promotion of a product
or service. This includes notifying visitors about updates to the Web site.
5
-
Other Uses
-
The use of information not captured by the above definitions. (A human readable
explanation should be provided in these instances.)
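The requirement that "other" purposes carry a human-readable explanation lends itself to a simple check. The following is a minimal sketch under stated assumptions: the identifier names, the function, and its message strings are hypothetical, and the rule that at least one purpose be present is this example's reading of the requirement that statements declare their purposes.

```python
from enum import IntEnum

class Purpose(IntEnum):
    """The six purposes, numbered as in the list above (names are illustrative)."""
    CURRENT_ACTIVITY = 0
    SITE_ADMINISTRATION = 1
    CUSTOMIZATION = 2
    RESEARCH_AND_DEVELOPMENT = 3
    MARKETING_CONTACT = 4
    OTHER = 5

def check_purposes(purposes, other_explanation=""):
    """Return a list of problems with a purpose disclosure (empty if none)."""
    problems = []
    if not purposes:
        # Assumption of this sketch: a statement declares at least one purpose.
        problems.append("at least one purpose must be disclosed")
    if Purpose.OTHER in purposes and not other_explanation.strip():
        problems.append("'Other Uses' needs a human-readable explanation")
    return problems

# A marketing-only site still discloses {0, 4}, per the compliance rules.
print(check_purposes({Purpose.CURRENT_ACTIVITY, Purpose.MARKETING_CONTACT}))
print(check_purposes({Purpose.OTHER}))
```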
Qualifiers are appended to a purpose to provide additional information on
how the purpose is realized with respect to a data element or set of data
elements.
-
Identifiable Use
-
Is data used in a way that is personally identifiable -- including linking
it with identifiable information about you from other sources? While some
data is obviously identifiable, such as (full_name), other data, such as
(zip_code, salary, birth_date), could allow a person to be identified. Also,
a technically astute person in some circumstances could determine the identity
of a user from the IP number in an HTTP log. This requires a specific effort
and is based on how that IP number is registered, whether it is used by more
than one person on a computer, or if it is dynamically allocated by an internet
service provider. Consequently, we refrain from defining any particular data
or set of data as identifiable and focus on whether it is used in an identifiable
way.
Status Required: Services MUST disclose one of the values
of the Identifiable qualifier.
-
0 No
1 Yes
-
Recipients (Domain of Use)
-
The Recipients qualifier defines an organizational area, or domain,
beyond the service provider and its agents where data may be distributed.
Status Required: Services must disclose all the Recipients that apply.
Comment: Creating a set of values which are simple, informative to
the user, and accurate for service provider representations is very challenging
and the WG is not completely satisfied with the results. For instance, the
issue of transaction facilitators, such as shipping or payment processors,
who are necessary for the completion and support of the activity but may
follow different practices was problematic. As it stands, such organizations
should be represented in whichever category most accurately reflects their
practices with respect to the original service provider.
0
-
Ourselves and/or our agents
-
Ourselves and our agents. We define an agent in this instance as a third
party that processes data only on behalf of the service provider for the
completion of the stated purposes. (e.g., The service provider and its printing
bureau which prints address labels and does nothing further with the
information.)
1
-
Organizations following our practices
-
Organizations who use the data on their own behalf under
equable practices. (e.g., Consider a service
provider that grants the user access to collected personal information and
also provides that information to a partner who uses it once and then discards it. Since the
recipient, who has otherwise similar practices, cannot grant the user access
to information that it discarded, the two are considered to have equable practices.)
2
-
Organizations following different practices
-
Organizations that are constrained by and accountable to the original service
provider, but may use the data in a way not specified in the service provider's
practices. (e.g. The service provider collects data that is shared with a
partner who may use it for other purposes. However, it is in the service
provider's interest to ensure that the data is not used in a way that would
be considered abusive to the users' and its own interests.)
3
-
Unrelated third parties or public fora
-
Organizations or fora whose data usage practices are not known by the original
service provider. (e.g. data is provided as part of a commercial CD-ROM
directory, or it is posted on a public on-line Web directory.)
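The two qualifiers above can be modelled as a small record with a validity check. This is a sketch rather than a definitive implementation: the class, field, and message names are hypothetical, and treating an empty recipient set as a problem is an assumption drawn from the requirement to disclose all recipients that apply.

```python
from dataclasses import dataclass, field
from enum import IntEnum

class Recipient(IntEnum):
    """Recipients values, numbered as in the list above (names are illustrative)."""
    OURSELVES = 0
    SAME_PRACTICES = 1
    DIFFERENT_PRACTICES = 2
    UNRELATED_OR_PUBLIC = 3

VALID_RECIPIENTS = {int(r) for r in Recipient}

@dataclass
class Qualifiers:
    identifiable: int                      # 0 = No, 1 = Yes
    recipients: set = field(default_factory=set)

    def problems(self):
        """Return a list of problems with these qualifiers (empty if none)."""
        found = []
        if self.identifiable not in (0, 1):
            found.append("identifiable must be 0 (No) or 1 (Yes)")
        if not self.recipients:
            # Assumption of this sketch: at least one recipient always applies.
            found.append("at least one recipient must be disclosed")
        unknown = sorted(r for r in self.recipients if int(r) not in VALID_RECIPIENTS)
        if unknown:
            found.append(f"unknown recipient values: {unknown}")
        return found

# A site that passes data to its agents for redistribution to the public
# discloses both values, {0, 3}.
print(Qualifiers(identifiable=1, recipients={0, 3}).problems())
```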
The following are general disclosures about the policies of the service provider.
Further information on the policies would be found at the discURI.
-
Access to Identifiable Information
-
the ability of the individual to view identifiable information and address
questions or concerns to the service provider.
Status Required: Service providers must disclose all Access capabilities
that apply. The method of access is
not specified. This disclosure applies to the identifiable use disclosure.
Any disclosure is not meant to imply that access to all data is possible,
but that some of the data may be accessible and that the user should communicate
further with the service provider to determine what capabilities they have.
Comment: Service providers may also wish to provide capabilities
to access information collected through means other than the Web at the
discURI. However, the scope of P3P statements is
limited to data collected through HTTP or other Web transport protocols.
Also, if access is provided through the Web, we recommend the use of strong
authentication and security mechanisms for such access; however, security
issues are outside the scope of this document.
-
0 Identifiable Data is Not Used
-
[this should be consistent with the use of the identifiable qualifier].
-
1 Identifiable Contact Information
-
access is given to identifiable online and physical contact information (e.g.,
users can access things such as a postal address).
-
2 Other Identifiable Information
-
access is given to other information linked to an identifiable person. (e.g.,
users can access things such as their online account charges).
-
3 None
-
no access to identifiable information is given.
-
Assurance (Accountability)
-
Does the site have an assuring party that attests that the
service will abide by its proposal or follow guidelines in the processing
of data, or that makes other relevant assertions? Assurance may come from the service
provider or an independent assuring party.
Status Required (but specified elsewhere): A required version
of this disclosure is implemented through the assurance field, defined in
the P3P1.0 specification.
Comment: We expect this field can be used in a number of
ways, from representing that one's privacy practices are self assured, audited
by a third party, or under the jurisdiction of a regulatory authority.
-
Other_Disclosures
-
Are Disclosures Made with respect to the following:
Status Optional: If a site wishes to signify in a proposal that
it makes a disclosure about change_agreement or retention, it may do so
with the following. No disclosure means that the service provider makes no
representation of a policy on that topic.
Comment: Some members of the working group felt that 1)
disclosures could be made about other topics such as security (see the
purpose section), 2) more specific values should
be provided, and 3) that such disclosures should be required. However, a
strong consensus for this could not be reached in the available time.
-
0 Change_Agreement
-
Does the service provider make a disclosure regarding the capability for
the user to cancel or renegotiate the existing agreement at a future time?
-
1 Retention
-
Does the service provider make a disclosure on how long data is retained?
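The note that Access value 0 (Identifiable Data is Not Used) should be consistent with the Identifiable Use qualifier can be sketched as a cross-check between the two disclosures. The function and constant names below are hypothetical, and the rule encodes one plausible reading of that note: a provider should not claim value 0 while any statement declares identifiable use.

```python
ACCESS_NOT_USED = 0    # Access value 0: "Identifiable Data is Not Used"
IDENTIFIABLE_YES = 1   # Identifiable Use value 1: "Yes"

def access_is_consistent(access_values, identifiable_values):
    """Check Access value 0 against the Identifiable Use qualifiers.

    access_values: set of Access codes disclosed by the service provider.
    identifiable_values: the Identifiable Use value (0 or 1) of each statement.
    """
    claims_not_used = ACCESS_NOT_USED in access_values
    uses_identifiable = IDENTIFIABLE_YES in identifiable_values
    # Inconsistent only when both hold at once.
    return not (claims_not_used and uses_identifiable)

print(access_is_consistent({0}, [0, 0]))
print(access_is_consistent({0}, [0, 1]))
```

A user agent could run a check of this kind before presenting a proposal and flag inconsistent disclosures to the user.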
-
[P3P]
-
M. Marchiori, D. Jaye. "Platform for Privacy Preferences (P3P) Syntax Specification," World Wide Web Consortium. 09-November-1998. (Working Draft)
-
[RDF]
-
O. Lassila, R. Swick. "Resource Description Framework (RDF) Model and Syntax Specification," World Wide Web Consortium. 29-July-1998. (Working Draft)
-
[XML-names]
-
T. Bray, D. Hollander, A. Layman. "Namespaces in XML," World Wide Web Consortium. 02-August-1998. (Working Draft)
-
[XML]
-
T. Bray, J. Paoli, C. M. Sperberg-McQueen. "Extensible Markup Language (XML) 1.0 Specification," World Wide Web Consortium. 10-February-1998. (Recommendation)
-
Liz Blumenfeld, America Online
-
Ann Cavoukian, Information and Privacy Commission/Ontario
-
Scott Chalfant, Matchlogic
-
Lorrie Cranor, AT&T
-
Jim Crowe, Direct Marketing Association
-
Josef Dietl, World Wide Web Consortium
-
David Duncan, Information and Privacy Commission/Ontario
-
Melissa Dunn, Microsoft
-
Patricia Faley, Direct Marketing Association
-
Marit Köhntopp, Privacy Commissioner of Schleswig-Holstein, Germany
-
Tony LAM, Hong Kong Privacy Commissioner's Office
-
Tara Lemmey, Narrowline
-
Jill Lesser, America Online
-
Steve Lucas, Matchlogic
-
Deirdre Mulligan, Center for Democracy and Technology
-
Nick Platten, Data Protection Consultant (formerly of DG XV, European Commission)
-
Joseph Reagle, World Wide Web Consortium
-
Ari Schwartz, Center for Democracy and Technology
-
Jonathan Stark, TRUSTe
_________
Copyright © 1998 W3C (MIT, INRIA, Keio), All Rights Reserved. W3C liability,
trademark, document use and software licensing rules apply.